The FMA instruction set is a planned extension to the 128-bit SIMD instructions of the x86 microprocessor instruction set that performs fused multiply-add operations. Two variants of FMA are planned:
- FMA3 will be supported in Intel processors from 2011.
- FMA4 will be supported in AMD processors from 2011.
New instructions
The FMA3 and FMA4 instruction sets have almost identical functionality but are not mutually compatible. Both contain fused multiply-add instructions for floating-point scalar and SIMD operations.
Compatibility issue
The difference between FMA3 and FMA4 is whether the instruction can have three or four different operands. The fused multiply-add operation has the form:

$$d = a \cdot b + c$$
The 4-operand form (FMA4) allows a, b, c and d to be four different registers, while the 3-operand form (FMA3) requires that d be the same register as one of a, b or c. The 3-operand form makes the code shorter and the hardware implementation slightly simpler, while the 4-operand form provides more programming flexibility.
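The practical effect of the operand count can be sketched in C by modelling registers as variables. This is an illustrative model only, assuming the operation is d = a*b + c. The function names are hypothetical, the choice of c as the overwritten source is one of the possibilities described above, and plain arithmetic stands in for the fused operation because only register usage is modelled, not rounding.

```c
/* Hypothetical model of the 4-operand form: the result goes to a
 * separate destination d and all three sources are left untouched. */
void fma4_model(double *d, const double *a, const double *b, const double *c)
{
    *d = (*a) * (*b) + (*c);
}

/* Hypothetical model of the 3-operand form: the destination is also
 * one of the sources (here the addend c), so its old value is lost. */
void fma3_model(double *c, const double *a, const double *b)
{
    *c = (*a) * (*b) + (*c);
}

/* With the 3-operand form, keeping all three inputs costs an extra copy,
 * which corresponds to an additional register move instruction. */
double fma3_keep_sources(const double *a, const double *b, const double *c)
{
    double tmp = *c;          /* extra move to preserve c */
    fma3_model(&tmp, a, b);   /* tmp is overwritten with a*b + c */
    return tmp;
}
```

In this model the 4-operand call needs no copy when all inputs must survive, while the 3-operand call is shorter when one input is no longer needed afterwards.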
See XOP instruction set for more discussion of compatibility issues between Intel and AMD.
CPUs with FMA3
- Intel
- AMD
  - AMD will support FMA3 in the future for compatibility reasons if Intel sticks to FMA3 only [2].
CPUs with FMA4
- AMD
- Intel
  - It is uncertain whether future Intel processors will support FMA4, due to Intel's announced change to FMA3.
History
The incompatibility between Intel's FMA3 and AMD's FMA4 is due to both companies changing plans without coordinating coding details with each other. AMD changed their plans from FMA3 to FMA4 while Intel changed their plans from FMA4 to FMA3 almost at the same time. The history can be summarized as follows:
- August 2007: AMD announces the SSE5 instruction set, which includes 3-operand fused multiply-add instructions. A new coding scheme (DREX) is introduced for allowing instructions to have three operands [4].
- April 2008: Intel announces their AVX and FMA instruction sets, including 4-operand fused multiply-add instructions. The coding of these instructions uses the new VEX coding scheme which is more flexible than AMD's DREX scheme [5].
- December 2008: Intel changes the specification for their FMA instructions from 4-operand to 3-operand instructions. The VEX coding scheme is still used [6].
- May 2009: AMD changes the specification of their FMA instructions from the 3-operand DREX form to the 4-operand VEX form, compatible with the April 2008 Intel specification rather than the December 2008 Intel specification[7].
It is currently uncertain whether the 3-operand VEX-coded form (here called FMA3) or the 4-operand form (FMA4) will become the dominant standard. Future processors may also support both forms.
References
- [1] (Japanese) PC Watch - Intel Roadmap: a break away from x86 (April 7, 2008)
- [2] "Striking a balance". Dave Christie, AMD Developer blogs. May 7, 2009. http://forums.amd.com/devblog/blogpost.cfm?threadid=112934&catid=208. Retrieved 2009-05-08.
- [3] "AMD64 Architecture Programmer’s Manual Volume 6: 128-Bit and 256-Bit XOP, FMA4 and CVT16 Instructions". AMD. May 1, 2009. http://support.amd.com/us/Processor_TechDocs/43479.pdf.
- [4] "128-Bit SSE5 Instruction Set". AMD Developer Central. http://developer.amd.com/SSE5. Retrieved 2008-01-28.
- [5] "Intel Advanced Vector Extensions Programming Reference". Intel. http://softwarecommunity.intel.com/isn/downloads/intelavx/Intel-AVX-Programming-Reference-31943302.pdf. Retrieved 2008-04-05.
- [6] "Intel Advanced Vector Extensions Programming Reference". Intel. http://software.intel.com/en-us/avx/. Retrieved 2009-05-06.
- [7] "Striking a balance". Dave Christie, AMD Developer blogs. May 7, 2009. http://forums.amd.com/devblog/blogpost.cfm?threadid=112934&catid=208. Retrieved 2009-05-08.
This entry is from Wikipedia, the leading user-contributed encyclopedia.




