The XOP instruction set, announced by AMD on May 1, 2009, is an extension to the 128-bit SSE core instructions in the x86 and AMD64 instruction sets for the Bulldozer processor core, due to begin production in 2011.[1]
XOP is a revision of the SSE5 instruction set proposal announced on August 30, 2007. The revision makes the binary coding of the proposed new instructions more compatible with Intel's AVX instruction extensions, while leaving the functionality of the instructions unchanged.[2]
The XOP instructions include:
- Integer vector multiply-accumulate instructions
- Integer vector horizontal addition
- Integer vector compare
- Integer vector shift and rotate instructions
- Vector byte permutation
- Vector conditional move instructions
- Floating point fraction extraction
The XOP instruction set is supplemented by the FMA4 (floating-point vector multiply-accumulate) and CVT16 (half-precision floating-point conversion) instruction sets, which were also included in SSE5.
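As a rough illustration of what some of the listed instruction categories compute, the following Python sketch models three of them (multiply-accumulate, rotate, and conditional move) on lists of unsigned 32-bit lanes. The function names, the 32-bit lane width, and the truncating (non-saturating) arithmetic are assumptions chosen for the example; the actual instructions come in several widths and variants, for which AMD's manual[1] is the authority.

```python
MASK32 = 0xFFFFFFFF  # assumed lane width of 32 bits, for illustration only


def multiply_accumulate(a, b, c):
    """Per-lane a*b + c, keeping only the low 32 bits of each result."""
    return [((x * y) + z) & MASK32 for x, y, z in zip(a, b, c)]


def rotate_left(a, counts):
    """Rotate each 32-bit lane left by its own count (taken modulo 32)."""
    out = []
    for x, n in zip(a, counts):
        n %= 32
        out.append(((x << n) | (x >> (32 - n))) & MASK32)
    return out


def conditional_move(a, b, selector):
    """Bitwise select: bits of a where selector is 1, bits of b where it is 0."""
    return [(x & s) | (y & ~s & MASK32) for x, y, s in zip(a, b, selector)]


if __name__ == "__main__":
    print(multiply_accumulate([1, 2, 3, 4], [5, 6, 7, 8], [1, 1, 1, 1]))
    print([hex(v) for v in rotate_left([0x80000001], [1])])
    print([hex(v) for v in conditional_move([0xAAAAAAAA], [0x55555555], [0xFFFF0000])])
```

The sketch models only the arithmetic in each lane; it says nothing about operand encoding, register width, or performance.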
Compatibility issues
AMD have changed the encoding from the original SSE5 specification in order to improve compatibility with Intel's AVX instruction set and the new VEX coding scheme.
All SSE5 instructions that were equivalent or similar to instructions in the AVX and FMA4 instruction sets announced by Intel have been changed to use the coding proposed by Intel. Instructions without equivalents in the AVX framework were classified as part of the XOP extension.[3] The XOP instructions do not use the VEX coding scheme itself but an almost identical scheme beginning with the byte value 8F (hexadecimal), which plays the same role as the 3-byte VEX prefix beginning with C4 (hexadecimal).
Commentators have seen this as evidence that Intel have not allowed AMD to use any part of the large VEX coding space. On this reading, AMD were forced to use different codes in order to avoid any code combination that Intel might already have reserved in their development pipeline for something else, and the XOP coding scheme is as close to the VEX scheme as technically possible without risking that the AMD codes overlap with any future Intel codes. This inference is speculative, since no public information is available about negotiations between the two companies on this issue.
The use of the 8F byte requires that the m-bits (see VEX coding scheme) have a value greater than or equal to 8 in order to avoid overlap with existing instructions. The C4 byte used in the VEX scheme has no such restriction. This may prevent the m-bits from being used for other purposes in the future in the XOP scheme, but not in the VEX scheme. Another possible problem is that the pp bits have the value 00 in the XOP scheme, while they have the value 01 in the VEX scheme for instructions that have no legacy equivalent. This may complicate the use of the pp bits for other purposes in the future.
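The prefix rules described above can be sketched in Python as follows. This is a minimal illustration, not a real decoder: the function name `classify_prefix` and the returned dictionary are hypothetical, the field layout (m-bits in the low five bits of the second byte; W, L and pp in the third byte) follows the published 3-byte VEX layout, and mode-dependent checks that a real decoder would need are ignored.

```python
def classify_prefix(code: bytes):
    """Classify the first three bytes as a 3-byte VEX or XOP prefix.

    Returns (scheme, fields), or None when the bytes are neither.
    Hypothetical helper for illustration only.
    """
    if len(code) < 3:
        return None
    b0, b1, b2 = code[0], code[1], code[2]
    mmmmm = b1 & 0x1F  # the m-bits: low five bits of the second byte

    if b0 == 0xC4:
        scheme = "VEX"  # no restriction on the m-bits in this sketch
    elif b0 == 0x8F and mmmmm >= 8:
        scheme = "XOP"  # m-bits >= 8 keeps clear of the older use of 8F
    else:
        return None     # e.g. 8F with m-bits < 8 is the pre-existing instruction

    fields = {
        "mmmmm": mmmmm,
        "W": b2 >> 7,        # top bit of the third byte
        "L": (b2 >> 2) & 1,  # vector length bit
        "pp": b2 & 0b11,     # the pp bits discussed above
    }
    return scheme, fields


if __name__ == "__main__":
    print(classify_prefix(bytes([0x8F, 0xE8, 0x78])))  # m-bits = 8, pp = 00
    print(classify_prefix(bytes([0xC4, 0xE2, 0x79])))  # m-bits = 2, pp = 01
    print(classify_prefix(bytes([0x8F, 0xC0, 0x00])))  # m-bits = 0: not XOP
```

The example byte sequences are constructed to exercise each branch and are not claimed to correspond to particular instructions.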
A similar compatibility issue is the difference between the FMA3 and FMA4 instruction sets. Intel has dropped its plans to support FMA4,[citation needed] and updated the AVX specification to use FMA3 instructions.[4][5] However, AMD had already adopted FMA4 and will support these instructions in any case, even if no Intel CPU supports FMA4. Because these instructions are taken from Intel's first AVX specification, AMD uses the VEX prefix, not the XOP prefix, for them.
References
- ^ "AMD64 Architecture Programmer’s Manual Volume 6: 128-Bit and 256-Bit XOP, FMA4 and CVT16 Instructions". AMD. May 1, 2009. http://support.amd.com/us/Processor_TechDocs/43479.pdf.
- ^ "Striking a balance". Dave Christie, AMD Developer blogs. May 7, 2009. http://forums.amd.com/devblog/blogpost.cfm?threadid=112934&catid=208.
- ^ "Striking a balance". Dave Christie, AMD Developer blogs. May 7, 2009. http://forums.amd.com/devblog/blogpost.cfm?threadid=112934&catid=208.
- ^ Intel AVX Programming Reference, March 2008
- ^ Intel Advanced Vector Extensions Programming Reference, January 2009




