x86 instruction listings

The x86 instruction set refers to the set of instructions that x86-compatible microprocessors support. The instructions are usually part of an executable program, often stored as a computer file and executed on the processor.

The x86 instruction set has been extended several times, introducing wider registers and datatypes as well as new functionality.^[1]

x86 integer instructions

Below is the full Intel 8086/8088 instruction set (81 instructions total). Most, if not all, of these instructions are also available in 32-bit mode, where they operate on 32-bit registers (eax, ebx, etc.) and values instead of their 16-bit counterparts (ax, bx, etc.). See also x86 assembly language for a quick tutorial on this processor family. The updated instruction set is also grouped by architecture (i386, i486, i686) and is more generally referred to as x86 (32-bit) and x86-64 (64-bit, also known as AMD64).

Original 8086/8088 instructions

Original 8086/8088 instruction set
Instruction	Meaning	Notes	Opcode
AAA	ASCII adjust AL after addition	used with unpacked binary-coded decimal	0x37
AAD	ASCII adjust AX before division	The 8086/8088 datasheet documents only the base 10 version of the AAD instruction (opcode 0xD5 0x0A), but any other base will work. Later Intel documentation describes the generic form too. NEC V20 and V30 (and possibly other NEC V-series CPUs) always use base 10 and ignore the argument, causing a number of incompatibilities.	0xD5
AAM	ASCII adjust AX after multiplication	Only the base 10 version (operand 0x0A) is documented; see notes for AAD	0xD4
AAS	ASCII adjust AL after subtraction		0x3F
ADC	Add with carry	`destination = destination + source + carry_flag`	0x10…0x15, 0x80…0x81/2, 0x82…0x83/2 (since 80186)
ADD	Add	(1) `r/m += r/imm;` (2) `r += m/imm;`	0x00…0x05, 0x80/0…0x81/0, 0x82/0…0x83/0 (since 80186)
AND	Logical AND	(1) `r/m &= r/imm;` (2) `r &= m/imm;`	0x20…0x25, 0x80…0x81/4, 0x82…0x83/4 (since 80186)
CALL	Call procedure	`push eip; eip points to the instruction directly after the call`	0x9A, 0xE8, 0xFF/2, 0xFF/3
CBW	Convert byte to word		0x98
CLC	Clear carry flag	`CF = 0;`	0xF8
CLD	Clear direction flag	`DF = 0;`	0xFC
CLI	Clear interrupt flag	`IF = 0;`	0xFA
CMC	Complement carry flag		0xF5
CMP	Compare operands		0x38…0x3D, 0x80…0x81/7, 0x82…0x83/7 (since 80186)
CMPSB	Compare bytes in memory. May be used with a REP prefix to repeat the instruction CX times.		0xA6
CMPSW	Compare words. May be used with a REP prefix to repeat the instruction CX times.		0xA7
CWD	Convert word to doubleword		0x99
DAA	Decimal adjust AL after addition	(used with packed binary-coded decimal)	0x27
DAS	Decimal adjust AL after subtraction		0x2F
DEC	Decrement by 1		0x48…0x4F, 0xFE/1, 0xFF/1
DIV	Unsigned divide	(1) `AX = DX:AX / r/m;` resulting `DX = remainder` (2) `AL = AX / r/m;` resulting `AH = remainder`	0xF7/6, 0xF6/6
ESC	Used with floating-point unit		0xD8..0xDF
HLT	Enter halt state		0xF4
IDIV	Signed divide	(1) `AX = DX:AX / r/m;` resulting `DX = remainder` (2) `AL = AX / r/m;` resulting `AH = remainder`	0xF7/7, 0xF6/7
IMUL	Signed multiply in One-operand form	(1) `DX:AX = AX * r/m;` (2) `AX = AL * r/m`	0x69, 0x6B (both since 80186), 0xF7/5, 0xF6/5, 0x0FAF (since 80386)
IN	Input from port	(1) `AL = port[imm];` (2) `AL = port[DX];` (3) `AX = port[imm];` (4) `AX = port[DX];`	0xE4, 0xE5, 0xEC, 0xED
INC	Increment by 1		0x40…0x47, 0xFE/0, 0xFF/0
INT	Call to interrupt		0xCC, 0xCD
INTO	Call to interrupt if overflow		0xCE
IRET	Return from interrupt		0xCF
Jcc	Jump if condition	(JA, JAE, JB, JBE, JC, JE, JG, JGE, JL, JLE, JNA, JNAE, JNB, JNBE, JNC, JNE, JNG, JNGE, JNL, JNLE, JNO, JNP, JNS, JNZ, JO, JP, JPE, JPO, JS, JZ)	0x70…0x7F, 0x0F80…0x0F8F (since 80386)
JCXZ	Jump if CX is zero		0xE3
JMP	Jump		0xE9…0xEB, 0xFF/4, 0xFF/5
LAHF	Load FLAGS into AH register		0x9F
LDS	Load pointer using DS		0xC5
LEA	Load Effective Address		0x8D
LES	Load ES with pointer		0xC4
LOCK	Assert BUS LOCK# signal	(for multiprocessing)	0xF0
LODSB	Load string byte. May be used with a REP prefix to repeat the instruction CX times.	`if (DF==0) AL = *SI++; else AL = *SI--;`	0xAC
LODSW	Load string word. May be used with a REP prefix to repeat the instruction CX times.	`if (DF==0) AX = *SI++; else AX = *SI--;`	0xAD
LOOP/LOOPx	Loop control	(LOOPE, LOOPNE, LOOPNZ, LOOPZ) `if (--CX && x) goto lbl;` (CX is always decremented; x is the ZF condition and is always true for plain LOOP)	0xE0…0xE2
MOV	Move	copies data from one location to another, (1) `r/m = r;` (2) `r = r/m;`	0xA0...0xA3
MOVSB	Move byte from string to string. May be used with a REP prefix to repeat the instruction CX times.	`if (DF==0) *(byte*)DI++ = *(byte*)SI++; else *(byte*)DI-- = *(byte*)SI--;`	0xA4
MOVSW	Move word from string to string. May be used with a REP prefix to repeat the instruction CX times.	`if (DF==0) *(word*)DI++ = *(word*)SI++; else *(word*)DI-- = *(word*)SI--;`	0xA5
MUL	Unsigned multiply	(1) `DX:AX = AX * r/m;` (2) `AX = AL * r/m;`	0xF7/4, 0xF6/4
NEG	Two's complement negation	`r/m *= -1;`	0xF6/3…0xF7/3
NOP	No operation	opcode equivalent to `XCHG EAX, EAX`	0x90
NOT	Negate the operand, logical NOT	`r/m ^= -1;`	0xF6/2…0xF7/2
OR	Logical OR	(1) `r/m |= r/imm;` (2) `r |= m/imm;`	0x08…0x0D, 0x80…0x81/1, 0x82…0x83/1 (since 80186)
OUT	Output to port	(1) `port[imm] = AL;` (2) `port[DX] = AL;` (3) `port[imm] = AX;` (4) `port[DX] = AX;`	0xE6, 0xE7, 0xEE, 0xEF
POP	Pop data from stack	`r/m = *SP++;` POP CS (opcode 0x0F) works only on 8086/8088. Later CPUs use 0x0F as a prefix for newer instructions.	0x07, 0x0F(8086/8088 only), 0x17, 0x1F, 0x58…0x5F, 0x8F/0
POPF	Pop FLAGS register from stack	`FLAGS = *SP++;`	0x9D
PUSH	Push data onto stack	`*--SP = r/m;`	0x06, 0x0E, 0x16, 0x1E, 0x50…0x57, 0x68, 0x6A (both since 80186), 0xFF/6
PUSHF	Push FLAGS onto stack	`*--SP = FLAGS;`	0x9C
RCL	Rotate left (with carry)		0xC0…0xC1/2 (since 80186), 0xD0…0xD3/2
RCR	Rotate right (with carry)		0xC0…0xC1/3 (since 80186), 0xD0…0xD3/3
REPxx	Repeat MOVS/STOS/CMPS/LODS/SCAS	(REP, REPE, REPNE, REPNZ, REPZ)	0xF2, 0xF3
RET	Return from procedure	Not a real instruction. The assembler will translate these to a RETN or a RETF depending on the memory model of the target system.
RETN	Return from near procedure		0xC2, 0xC3
RETF	Return from far procedure		0xCA, 0xCB
ROL	Rotate left		0xC0…0xC1/0 (since 80186), 0xD0…0xD3/0
ROR	Rotate right		0xC0…0xC1/1 (since 80186), 0xD0…0xD3/1
SAHF	Store AH into FLAGS		0x9E
SAL	Shift Arithmetically left (signed shift left)	(1) `r/m <<= 1;` (2) `r/m <<= CL;`	0xC0…0xC1/4 (since 80186), 0xD0…0xD3/4
SAR	Shift Arithmetically right (signed shift right)	(1) `(signed) r/m >>= 1;` (2) `(signed) r/m >>= CL;`	0xC0…0xC1/7 (since 80186), 0xD0…0xD3/7
SBB	Subtraction with borrow	alternative 1-byte encoding of `SBB AL, AL` is available via undocumented SALC instruction	0x18…0x1D, 0x80…0x81/3, 0x82…0x83/3 (since 80186)
SCASB	Compare byte string. May be used with a REP prefix to repeat the instruction CX times.		0xAE
SCASW	Compare word string. May be used with a REP prefix to repeat the instruction CX times.		0xAF
SHL	Shift left (unsigned shift left)		0xC0…0xC1/4 (since 80186), 0xD0…0xD3/4
SHR	Shift right (unsigned shift right)		0xC0…0xC1/5 (since 80186), 0xD0…0xD3/5
STC	Set carry flag	`CF = 1;`	0xF9
STD	Set direction flag	`DF = 1;`	0xFD
STI	Set interrupt flag	`IF = 1;`	0xFB
STOSB	Store byte in string. May be used with a REP prefix to repeat the instruction CX times.	`if (DF==0) *ES:DI++ = AL; else *ES:DI-- = AL;`	0xAA
STOSW	Store word in string. May be used with a REP prefix to repeat the instruction CX times.	`if (DF==0) *ES:DI++ = AX; else *ES:DI-- = AX;`	0xAB
SUB	Subtraction	(1) `r/m -= r/imm;` (2) `r -= m/imm;`	0x28…0x2D, 0x80…0x81/5, 0x82…0x83/5 (since 80186)
TEST	Logical compare (AND)	(1) `r/m & r/imm;` (2) `r & m/imm;`	0x84, 0x85, 0xA8, 0xA9, 0xF6/0, 0xF7/0
WAIT	Wait until not busy	Waits until BUSY# pin is inactive (used with floating-point unit)	0x9B
XCHG	Exchange data	`r :=: r/m;` A spinlock typically uses XCHG as an atomic operation (see also the Cyrix coma bug).	0x86, 0x87, 0x91…0x97
XLAT	Table look-up translation	behaves like `MOV AL, [BX+AL]`	0xD7
XOR	Exclusive OR	(1) `r/m ^= r/imm;` (2) `r ^= m/imm;`	0x30…0x35, 0x80…0x81/6, 0x82…0x83/6 (since 80186)
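
The pseudocode in the Notes column can be made concrete with a small software model. The following Python sketch is illustrative only: the function names (`adc16`, `add32_via_adc`, `div16`, `aam`, `aad`, `rep_movsb`) and the `DivideError` class are hypothetical, memory is modelled as a flat `bytearray` without segmentation, and flags other than the carry are ignored.

```python
MASK16 = 0xFFFF


class DivideError(Exception):
    """Stands in for the divide-error exception raised by DIV (assumed name)."""


def adc16(dst, src, carry):
    """ADC on 16-bit operands: returns (result, carry_out)."""
    total = dst + src + carry
    return total & MASK16, total >> 16


def add32_via_adc(a, b):
    """Add two 32-bit values with 16-bit registers: ADD low words, ADC high words."""
    lo, cf = adc16(a & MASK16, b & MASK16, 0)   # ADD on the low words (carry-in 0)
    hi, cf = adc16(a >> 16, b >> 16, cf)        # ADC on the high words
    return (hi << 16) | lo, cf


def div16(dx, ax, divisor):
    """DIV r/m16: divides DX:AX; returns (AX = quotient, DX = remainder)."""
    dividend = (dx << 16) | ax
    if divisor == 0 or dividend // divisor > MASK16:
        raise DivideError("quotient does not fit in AX")
    return dividend // divisor, dividend % divisor


def aam(al, base=10):
    """AAM with an explicit base: returns (AH, AL) = (AL / base, AL % base)."""
    return al // base, al % base


def aad(ah, al, base=10):
    """AAD with an explicit base: returns (AH, AL) = (0, (AL + AH * base) & 0xFF)."""
    return 0, (al + ah * base) & 0xFF


def rep_movsb(mem, si, di, cx, df=0):
    """REP MOVSB: copy CX bytes from mem[SI] to mem[DI], stepping according to DF."""
    step = -1 if df else 1
    while cx:
        mem[di] = mem[si]
        si += step
        di += step
        cx -= 1
    return si, di, cx
```

For example, `aam(37)` splits the value into the unpacked decimal digits 3 and 7, while `aam(37, 16)` uses the undocumented base operand to split it into the hexadecimal digits 2 and 5.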

Added in specific Intel processors

Added with 80186/80188

Instruction	Opcode	Meaning	Notes
BOUND	62 /r	Check array index against bounds	raises software interrupt 5 if test fails
ENTER	C8 iw ib	Enter stack frame	Modifies stack for entry to procedure for high level language. Takes two operands: the amount of storage to be allocated on the stack and the nesting level of the procedure.
INSB/INSW	6C	Input from port to string	equivalent to: IN AX, DX MOV ES:[DI], AX ; adjust DI according to operand size and DF
INSB/INSW	6D	Input from port to string
LEAVE	C9	Leave stack frame	Releases the local stack storage created by the previous ENTER instruction.
OUTSB/OUTSW	6E	Output string to port	equivalent to: MOV AX, DS:[SI] OUT DX, AX ; adjust SI according to operand size and DF
OUTSB/OUTSW	6F	Output string to port
POPA	61	Pop all general purpose registers from stack	equivalent to: POP DI POP SI POP BP POP AX ; no POP SP here, all it does is ADD SP, 2 (since AX will be overwritten later) POP BX POP DX POP CX POP AX
PUSHA	60	Push all general purpose registers onto stack	equivalent to: PUSH AX PUSH CX PUSH DX PUSH BX PUSH SP ; The value stored is the initial SP value PUSH BP PUSH SI PUSH DI
PUSH immediate	6A ib	Push an immediate byte/word value onto the stack	example: PUSH 12h PUSH 1200h
PUSH immediate	68 iw	Push an immediate byte/word value onto the stack	example: PUSH 12h PUSH 1200h
IMUL immediate	6B /r ib	Signed and unsigned multiplication of immediate byte/word value	example: IMUL BX,12h IMUL DX,1200h IMUL CX, DX, 12h IMUL BX, SI, 1200h IMUL DI, word ptr [BX+SI], 12h IMUL SI, word ptr [BP-4], 1200h Note that since the lower half is the same for unsigned and signed multiplication, this version of the instruction can be used for unsigned multiplication as well.
IMUL immediate	69 /r iw
SHL/SHR/SAL/SAR/ROL/ROR/RCL/RCR immediate	C0	Rotate/shift bits with an immediate value greater than 1	example: ROL AX,3 SHR BL,3
SHL/SHR/SAL/SAR/ROL/ROR/RCL/RCR immediate	C1	Rotate/shift bits with an immediate value greater than 1	example: ROL AX,3 SHR BL,3
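
The PUSHA/POPA equivalences in the table can be checked with a small model. This is a minimal Python sketch, assuming a register file held in a dict and a stack held in a dict keyed by address; the function names `pusha` and `popa` are hypothetical, and only the 16-bit forms are modelled.

```python
def pusha(regs, stack):
    """PUSHA: push AX, CX, DX, BX, the original SP, BP, SI, DI (in that order)."""
    original_sp = regs["SP"]
    for name in ("AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"):
        # The value stored for SP is the one it had before PUSHA started.
        value = original_sp if name == "SP" else regs[name]
        regs["SP"] = (regs["SP"] - 2) & 0xFFFF
        stack[regs["SP"]] = value


def popa(regs, stack):
    """POPA: pop DI, SI, BP, skip the saved SP, then pop BX, DX, CX, AX."""
    for name in ("DI", "SI", "BP", "SP", "BX", "DX", "CX", "AX"):
        value = stack[regs["SP"]]
        regs["SP"] = (regs["SP"] + 2) & 0xFFFF
        if name != "SP":
            # The saved SP slot is discarded; SP only advances past it.
            regs[name] = value
```

After a matching `pusha`/`popa` pair, every register, including SP, holds the value it had before the `pusha`.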

Added with 80286

Instruction	Opcode	Meaning	Notes
ARPL r/m16, r16	63 /r	Adjust RPL field of selector	Available in 16/32-bit protected mode only. Causes #UD in Real mode and Virtual 8086 Mode - Windows 95 and OS/2 2.x are known to make extensive use of this #UD to use the 63 opcode as a one-byte breakpoint to transition from Virtual 8086 Mode to kernel mode.^[2]^[3]
CLTS	0F 06	Clear task-switched flag in Machine Status Word.
LAR r,r/m16	0F 02 /r	Load access rights byte from the specified segment descriptor	Sets ZF=1 if the descriptor could be loaded, ZF=0 otherwise. 32-bit variant of LAR instruction is documented to load undefined data into bits 19:16 of destination register on Intel CPUs.
LSL r,r/m16	0F 03 /r	Load segment limit from the specified segment descriptor	Sets ZF=1 if the descriptor could be loaded, ZF=0 otherwise.
LGDT m16&32	0F 01 /2	Load Global Descriptor Table Register	Each of these instructions loads a 2-part table descriptor. The first part is a 16-bit value, specifying table size in bytes minus 1. The second part is a 32-bit value (64-bit value in 64-bit mode), specifying the linear start address for the table. This address is ANDed with 00FFFFFFh for the 16-bit variants of these instructions. LIDT can relocate the Interrupt Vector Table in Real Mode as well. LGDT and LIDT are serializing instructions.
LIDT m16&32	0F 01 /3	Load Interrupt Descriptor Table Register
LLDT r/m16	0F 00 /2	Load Local Descriptor Table Register	LLDT and LTR are serializing instructions.
LTR r/m16	0F 00 /3	Load Task Register	LLDT and LTR are serializing instructions.
LMSW r/m16	0F 01 /6	Load Machine Status Word	On 80386 and later, the "Machine Status Word" is the same as the CR0 register, however LMSW can only modify the bottom 4 bits of this register. LMSW can be used to enter but not leave x86 Protected Mode. On the 80286, it is not possible to leave Protected Mode at all without a CPU reset - on 80386 and later, it is possible to leave Protected Mode, but this requires the use of the 80386-and-later MOV to CR0 instruction. LMSW is a serializing instruction.
SGDT m16&32	0F 01 /0	Store Global Descriptor Table Register	The SGDT, SIDT, SLDT, SMSW and STR instructions were unprivileged on all x86 CPUs from the 80286 onwards until the introduction of UMIP in 2017.^[4] This has been a significant security problem for software-based virtualization, since it enables a VM guest to use these instructions to detect that it is running inside a VM.^[5]^[6] The 16-bit variants of the SGDT and SIDT instructions also show a difference between Intel documentation and the behavior observed on Intel CPUs: as of Intel SDM revision 076 (December 2021), the last 8 bits of the descriptor are documented as being written as 0, whereas the observed behavior is that bits 31:24 of the descriptor table address are written instead.^[7] SLDT and SMSW (but not STR) with a 32-bit register argument are documented to set the top 16 bits of the specified register to an undefined value on Intel CPUs.
SIDT m16&32	0F 01 /1	Store Interrupt Descriptor Table Register
SLDT r/m16	0F 00 /0	Store Local Descriptor Table Register
SMSW r/m16	0F 01 /4	Store Machine Status Word
STR r/m16	0F 00 /1	Store Task Register
VERR r/m16	0F 00 /4	Verify a segment for reading	Sets ZF=1 if segment can be read, ZF=0 otherwise.
VERW r/m16	0F 00 /5	Verify a segment for writing	Sets ZF=1 if segment can be written, ZF=0 otherwise. On some Intel CPU/microcode combinations from 2019 onwards, the VERW instruction also flushes microarchitectural data buffers. This enables it to be used as part of workarounds for Microarchitectural Data Sampling security vulnerabilities.^[8]^[9]
LOADALL	0F 05	Load all CPU registers, including internal ones such as GDT	Undocumented, 80286 only. (A different variant of LOADALL with a different opcode and memory layout exists on 80386.)
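
The two-part table descriptor used by LGDT and LIDT (outside 64-bit mode) can be sketched with Python's `struct` module. This is an illustration under stated assumptions: the helper names are hypothetical, and the code only models the memory layout and the 24-bit address masking of the 16-bit variants described above, not the privileged instruction itself.

```python
import struct


def pack_table_descriptor(size_in_bytes, base):
    """Build the 6-byte m16&32 operand: 16-bit (size - 1), then 32-bit linear base."""
    return struct.pack("<HI", size_in_bytes - 1, base)


def load_table_descriptor(operand, operand_size=32):
    """Model what LGDT/LIDT would load: returns (limit, base)."""
    limit, base = struct.unpack("<HI", operand[:6])
    if operand_size == 16:
        # The 16-bit variants AND the address with 00FFFFFFh.
        base &= 0x00FFFFFF
    return limit, base
```

A table of three 8-byte descriptors therefore has a limit field of 23 (24 bytes minus 1).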

Added with 80386

Instruction	Meaning	Notes
BSF	Bit scan forward	BSF and BSR produce undefined results if the source argument is all-0s.
BSR	Bit scan reverse
BT	Bit test
BTC	Bit test and complement	Instructions atomic only if LOCK prefix present.
BTR	Bit test and reset
BTS	Bit test and set
CDQ	Convert double-word to quad-word	Sign-extends EAX into EDX, forming the quad-word EDX:EAX. Since (I)DIV uses EDX:EAX as its input, CDQ must be called after setting EAX if EDX is not manually initialized (as in 64/32 division) before (I)DIV.
CMPSD	Compare string double-word	Compares ES:[(E)DI] with DS:[(E)SI] and increments or decrements both (E)DI and (E)SI, depending on DF; can be prefixed with REP
CWDE	Convert word to double-word	Unlike CWD, CWDE sign-extends AX to EAX instead of AX to DX:AX
IBTS	Insert Bit String	Discontinued with B1 step of 80386.
IMUL	Two-operand form of IMUL: Signed and Unsigned	Allows two registers to be multiplied directly, storing the truncated lower half of the result. Since the lower half is the same for unsigned and signed multiplication, this version of the instruction can be used for unsigned multiplication as well.
INSD	Input from port to string double-word	`*(long*)ES:EDI±± = port[DX];` (±± depends on DF, ES: cannot be overridden). Can be prefixed with REP.
IRETx	Interrupt return; D suffix means 32-bit return, F suffix means do not generate epilogue code (i.e. LEAVE instruction)	Use IRETD rather than IRET in 32-bit situations
Jxx (near)	Jump conditionally	Conditional near jump instructions for all 8086 Jxx short jump instructions
JECXZ	Jump if ECX is zero
LFS, LGS	Load far pointer
LSS	Load stack segment and register	Normally used to update both SS and SP at the same time.
LODSD	Load string double-word	`EAX = *DS:(E)SI±±;` (±± depends on DF, DS: can be overridden); can be prefixed with REP
LOOPW, LOOPccW	Loop, conditional loop	Same as LOOP, LOOPcc for earlier processors
LOOPD, LOOPccD	Loop, conditional loop (using ECX as the counter)	`if (--ECX && cc) goto lbl;`, cc = Z(ero), E(qual), N(on)Z(ero), N(on)E(qual)
MOV to/from CR/DR/TR	Move to/from special registers	CR=control registers, DR=debug registers, TR=test registers (up to 80486)
MOVSD	Move string double-word	`*(dword*)ES:EDI±± = *(dword*)ESI±±;` (±± depends on DF); can be prefixed with REP
MOVSX	Move with sign-extension	`(long)r = (signed char) r/m;` and similar
MOVZX	Move with zero-extension	`(long)r = (unsigned char) r/m;` and similar
OUTSD	Output to port from string double-word	`port[DX] = *(long*)DS:ESI±±;` (±± depends on DF, DS: can be overridden); can be prefixed with REP.
POPAD	Pop all double-word (32-bit) registers from stack	Does not pop register ESP off of stack
POPFD	Pop data into EFLAGS register
PUSHAD	Push all double-word (32-bit) registers onto stack
PUSHFD	Push EFLAGS register onto stack
PUSHD	Push a double-word (32-bit) value onto stack
SCASD	Scan string data double-word	Compares ES:[(E)DI] with EAX and increments or decrements (E)DI, depending on DF; can be prefixed with REP
SETcc	Set byte to one on condition, zero otherwise	(SETA, SETAE, SETB, SETBE, SETC, SETE, SETG, SETGE, SETL, SETLE, SETNA, SETNAE, SETNB, SETNBE, SETNC, SETNE, SETNG, SETNGE, SETNL, SETNLE, SETNO, SETNP, SETNS, SETNZ, SETO, SETP, SETPE, SETPO, SETS, SETZ)
SHLD	Shift left double	`r1 = r1<<CL | r2>>(register_width - CL);` Instead of CL, an 8-bit immediate can be used.
SHRD	Shift right double	`r1 = r1>>CL | r2<<(register_width - CL);` Instead of CL, an 8-bit immediate can be used. SHLD and SHRD with 16-bit arguments and a shift-amount greater than 16 produce undefined results. (Actual results differ between different Intel CPUs, with at least three different behaviors known.^[10])
STOSD	Store string double-word	`*ES:EDI±± = EAX;` (±± depends on DF, ES cannot be overridden); can be prefixed with REP
XBTS	Extract Bit String	Discontinued with B1 step of 80386. Used by software mainly for detection of the buggy^[11] B0 stepping of the 80386. Microsoft Windows (v2.01 and later) will attempt to run the XBTS instruction as part of its CPU detection if CPUID is not present, and will refuse to boot if XBTS is found to be working.^[12]
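
Several of the 80386 additions are easiest to understand as bit operations. The sketch below models SHLD/SHRD, BSF/BSR and MOVSX in Python. All function names are hypothetical; the undefined cases mentioned in the table (a zero source for BSF/BSR, a shift amount larger than the operand width) are represented by `None` and an exception respectively, which is a modelling choice rather than actual CPU behavior.

```python
def _double_shift_count(count, width):
    count &= 31  # assumption of this sketch: count masked to 5 bits, as for 16/32-bit operands
    if count > width:
        raise ValueError("result is undefined when count exceeds the operand width")
    return count


def shld(r1, r2, count, width=32):
    """SHLD: shift r1 left, filling vacated low bits from the top of r2."""
    mask = (1 << width) - 1
    count = _double_shift_count(count, width)
    if count == 0:
        return r1 & mask
    return ((r1 << count) | (r2 >> (width - count))) & mask


def shrd(r1, r2, count, width=32):
    """SHRD: shift r1 right, filling vacated high bits from the bottom of r2."""
    mask = (1 << width) - 1
    count = _double_shift_count(count, width)
    if count == 0:
        return r1 & mask
    return ((r1 >> count) | (r2 << (width - count))) & mask


def bsf(value):
    """BSF: index of the lowest set bit; None models the undefined all-zero case."""
    return None if value == 0 else (value & -value).bit_length() - 1


def bsr(value):
    """BSR: index of the highest set bit; None models the undefined all-zero case."""
    return None if value == 0 else value.bit_length() - 1


def movsx(value, src_bits, dst_bits):
    """MOVSX: sign-extend a src_bits-wide value to dst_bits."""
    sign = 1 << (src_bits - 1)
    value &= (1 << src_bits) - 1
    return ((value ^ sign) - sign) & ((1 << dst_bits) - 1)
```

MOVZX needs no helper in this model, since zero-extension leaves an unsigned Python integer unchanged.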

Compared to earlier sets, the 80386 instruction set also adds opcodes with different parameter combinations for the following instructions: BOUND, IMUL, LDS, LES, MOV, POP, PUSH and prefix opcodes for FS and GS segment overrides.

Added with 80486

Instruction	Opcode	Meaning	Notes
BSWAP r32	0F C8+r	Byte Swap	`r = r<<24 | r<<8&0x00FF0000 | r>>8&0x0000FF00 | r>>24;` Only defined for 32-bit registers. Usually used to change between little endian and big endian representations. When used with 16-bit registers produces various different results on 486,^[13] 586, and Bochs/QEMU.^[14]
CMPXCHG r/m8, r8	0F A6 /r^[15]	Compare and Exchange	0F A6/A7 encodings only available on 80486 stepping A.^[16] 0F B0/B1 encodings available on 80486 stepping B and later x86 CPUs. Instruction atomic only if used with LOCK prefix.
CMPXCHG r/m8, r8	0F B0 /r^[17]
CMPXCHG r/m, r16/32	0F A7 /r
CMPXCHG r/m, r16/32	0F B1 /r
INVD	0F 08	Invalidate Internal Caches	Flush internal caches. Modified data present in the cache are not written back to memory, potentially causing data loss.
INVLPG m8	0F 01 /7	Invalidate TLB Entry	Invalidate TLB Entry for page that contains data specified.
WBINVD	0F 09	Write Back and Invalidate Cache	Writes back all modified cache lines in the processor's internal cache to main memory and invalidates the internal caches.
XADD r/m,r8	0F C0 /r	eXchange and ADD	Exchanges the first operand with the second operand, then loads the sum of the two values into the destination operand. Instruction atomic only if used with LOCK prefix.
XADD r/m,r16/32	0F C1 /r	eXchange and ADD
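
The semantics of BSWAP, CMPXCHG and XADD can be sketched as follows. This is a single-threaded Python model with hypothetical function names, so it says nothing about atomicity (which, as noted above, requires the LOCK prefix on real hardware); memory is modelled as a dict, operands are assumed to be 32 bits wide, and only ZF is tracked.

```python
MASK32 = 0xFFFFFFFF


def bswap32(r):
    """BSWAP r32: reverse the byte order of a 32-bit value."""
    return ((r << 24) | ((r << 8) & 0x00FF0000)
            | ((r >> 8) & 0x0000FF00) | (r >> 24)) & MASK32


def cmpxchg(mem, addr, accumulator, src):
    """CMPXCHG r/m, r: returns (ZF, new accumulator value)."""
    if mem[addr] == accumulator:
        mem[addr] = src             # equal: store the source operand
        return 1, accumulator
    return 0, mem[addr]             # not equal: accumulator receives the memory value


def xadd(mem, addr, reg):
    """XADD r/m, r: memory receives the sum; returns the new register value (old memory)."""
    old = mem[addr]
    mem[addr] = (old + reg) & MASK32
    return old
```

A typical use of the XADD pattern is a fetch-and-add counter: the register ends up holding the value the counter had before the addition.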

Added with Pentium

Instruction	Opcode	Meaning	Notes
CPUID	0F A2	CPU IDentification	Returns processor identification and feature data in the EAX, EBX, ECX, and EDX registers. The function to perform is selected by the value in the EAX register.^[1] This instruction was also added to later 80486 processors.
CMPXCHG8B m64	0F C7 /1	CoMPare and eXCHanGe 8 bytes	Compare EDX:EAX with m64. If equal, set ZF and load ECX:EBX into m64. Else, clear ZF and load m64 into EDX:EAX. Instruction atomic only if used with LOCK prefix. LOCK CMPXCHG8B with a register operand (which is an invalid encoding) can cause hangs on some Intel Pentium CPUs (Pentium F00F bug).
RDMSR	0F 32	ReaD from Model-specific register	Load MSR specified by ECX into EDX:EAX
RDTSC	0F 31	ReaD Time Stamp Counter	Returns, in EDX:EAX, the number of processor ticks since the processor came "online" (that is, since the system was last powered on)
WRMSR	0F 30	WRite to Model-Specific Register	Write the value in EDX:EAX to MSR specified by ECX
RSM^[18]	0F AA	Resume from System Management Mode	This was introduced by the i386SL and later and is also in the i486SL and later, as well as Cyrix 486SLC/e^[19] and later. Resumes from System Management Mode (SMM)
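
The CMPXCHG8B description above translates directly into code. The following Python sketch uses a hypothetical function name and a dict as memory, and models only the data movement and ZF, not atomicity or the invalid register-operand encoding.

```python
MASK32 = 0xFFFFFFFF


def cmpxchg8b(mem, addr, edx, eax, ecx, ebx):
    """CMPXCHG8B m64: returns (ZF, new EDX, new EAX)."""
    expected = (edx << 32) | eax
    if mem[addr] == expected:
        mem[addr] = (ecx << 32) | ebx       # equal: store ECX:EBX, set ZF
        return 1, edx, eax
    current = mem[addr]
    return 0, current >> 32, current & MASK32   # not equal: load m64 into EDX:EAX
```

The same EDX:EAX split (high half in EDX, low half in EAX) is used by RDMSR, WRMSR and RDTSC for their 64-bit values.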

Added with Pentium MMX

Instruction	Opcode	Meaning	Notes
RDPMC	0F 33	Read the PMC [Performance Monitoring Counter]	Reads the counter specified in the ECX register into registers EDX:EAX

MMX registers and MMX support instructions were also added. They are usable for both integer and floating-point operations; see below.

Added with Pentium Pro

Instruction	Opcode	Meaning	Notes
CMOVcc r16,r/m CMOVcc r32,r/m	0F 4x /r	Conditional move	(CMOVA, CMOVAE, CMOVB, CMOVBE, CMOVC, CMOVE, CMOVG, CMOVGE, CMOVL, CMOVLE, CMOVNA, CMOVNAE, CMOVNB, CMOVNBE, CMOVNC, CMOVNE, CMOVNG, CMOVNGE, CMOVNL, CMOVNLE, CMOVNO, CMOVNP, CMOVNS, CMOVNZ, CMOVO, CMOVP, CMOVPE, CMOVPO, CMOVS, CMOVZ)
UD2	0F 0B	Undefined Instruction	Generates an invalid opcode exception. This instruction is provided for software testing to explicitly generate an invalid opcode. The opcode for this instruction is reserved for this purpose.
NOP r/m	0F 1F /0	Official long NOP	Introduced in the Pentium Pro, but undocumented until 2006.^[20] The whole 0F 18..1F opcode range was NOP in Pentium Pro. However, except for 0F 1F /0, Intel does not guarantee that these opcodes will remain NOP in future processors, and have indeed assigned some of these opcodes to other instructions in at least some processors.^[21]

Added with Pentium II

Instruction	Opcode	Meaning	Notes
SYSENTER	0F 34	SYStem call ENTER	Sometimes called the Fast System Call instruction, this instruction was intended to increase the performance of operating system calls. On the Pentium Pro, the CPUID instruction reports these instructions as available. This is considered incorrect, as the instructions are not officially supported on the Pentium Pro. (Third party testing indicates that the instructions are present but too defective to be usable on the Pentium Pro.^[22])
SYSEXIT	0F 35	SYStem call EXIT

Added in specific non-Intel processors

Added with AMD K6

These instructions were added with AMD-K6, and are present in all later AMD x86 CPUs. They were also made an integral part of x86-64, and are therefore supported in the 64-bit "Long Mode" operation mode of all 64-bit x86 processors, including processors from Intel and VIA.

Instruction	Opcode	Meaning	Notes
SYSCALL	0F 05	Fast System Call	functionally equivalent to SYSENTER
SYSRET	0F 07	Fast System Return	functionally equivalent to SYSEXIT

AMD changed the CPUID detection bit for this feature from the K6-II on.

Added as instruction set extensions

SSE instructions (non-SIMD)

Added with SSE

Instruction	Opcode	Meaning	Notes
PREFETCHT0	0F 18 /1	Prefetch Data from Address	Prefetch into all cache levels
PREFETCHT1	0F 18 /2	Prefetch Data from Address	Prefetch into all cache levels EXCEPT^[23]^[24] L1
PREFETCHT2	0F 18 /3	Prefetch Data from Address	Prefetch into all cache levels EXCEPT L1 and L2
PREFETCHNTA	0F 18 /0	Prefetch Data from Address	Prefetch to non-temporal cache structure, minimizing cache pollution.
SFENCE	0F AE F8	Store Fence	Processor hint to make sure all store operations that took place prior to the SFENCE call are globally visible

Added with SSE2

Instruction	Opcode	Meaning	Notes
CLFLUSH m8	0F AE /7	Cache Line Flush	Invalidates the cache line that contains the linear address specified with the source operand from all levels of the processor cache hierarchy
LFENCE	0F AE E8	Load Fence	Serializes load operations.
MFENCE	0F AE F0	Memory Fence	Performs a serializing operation on all load and store instructions that were issued prior to the MFENCE instruction.
MOVNTI m32, r32	0F C3 /r	Move Doubleword Non-Temporal	Move doubleword from r32 to m32, minimizing pollution in the cache hierarchy.
PAUSE	F3 90	Hint To Suspend Execution	Provides a hint to the processor that the following code is a spin loop. Suspends execution of the thread for a number of cycles to free resources for the sibling SMT thread to proceed.

Added with SSE3

Instruction	Opcode	Meaning	Notes
`MONITOR EAX, ECX, EDX`	0F 01 C8	Setup Monitor Address	Sets up a linear address range to be monitored by hardware and activates the monitor.
`MWAIT EAX, ECX`	0F 01 C9	Monitor Wait	Processor hint to stop instruction execution and enter an implementation-dependent optimized state until occurrence of a class of events.

Added with SSE4.2

Instruction	Opcode	Meaning	Notes
CRC32 r32, r/m8	F2 0F 38 F0 /r	Accumulate CRC32	Computes CRC value using the CRC-32C (Castagnoli) polynomial 0x11EDC6F41 (normal form 0x1EDC6F41). This is the polynomial used in iSCSI. In contrast to the more popular one used in Ethernet, its parity is even, and it can thus detect any error with an odd number of changed bits.
CRC32 r32, r/m8	F2 REX 0F 38 F0 /r
CRC32 r32, r/m16	F2 0F 38 F1 /r
CRC32 r32, r/m32	F2 0F 38 F1 /r
CRC32 r64, r/m8	F2 REX.W 0F 38 F0 /r
CRC32 r64, r/m64	F2 REX.W 0F 38 F1 /r
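
The accumulate step can be emulated in software. The sketch below is a bit-at-a-time Python model of CRC-32C using the bit-reflected form of the polynomial 0x1EDC6F41; the names `crc32_u8` and `crc32c` are hypothetical, and the initial value and final inversion in `crc32c` are the usual CRC-32C convention applied by software around the instruction, not something the instruction does itself.

```python
CRC32C_POLY_REFLECTED = 0x82F63B78  # 0x1EDC6F41 with its 32 bits reversed


def crc32_u8(crc, byte):
    """One accumulate step over a single byte (software stand-in for CRC32 r32, r/m8)."""
    crc ^= byte
    for _ in range(8):
        # Shift out one bit; fold in the polynomial when the dropped bit was 1.
        crc = (crc >> 1) ^ (CRC32C_POLY_REFLECTED if crc & 1 else 0)
    return crc


def crc32c(data):
    """Full CRC-32C of a byte string: all-ones initial value, inverted result."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = crc32_u8(crc, byte)
    return crc ^ 0xFFFFFFFF
```

With this convention, `crc32c(b"123456789")` yields the standard CRC-32C check value 0xE3069283.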

Added with x86-64

Instruction	Meaning	Notes
CDQE	Sign extend EAX into RAX
CQO	Sign extend RAX into RDX:RAX
CMPSQ	CoMPare String Quadword
CMPXCHG16B	CoMPare and eXCHanGe 16 Bytes
IRETQ	64-bit Return from Interrupt
JRCXZ	Jump if RCX is zero
LODSQ	LoaD String Quadword
MOVSXD	MOV with Sign Extend 32-bit to 64-bit
POPFQ	POP RFLAGS Register
PUSHFQ	PUSH RFLAGS Register
RDTSCP	ReaD Time Stamp Counter and Processor ID
SCASQ	SCAn String Quadword
STOSQ	STOre String Quadword
SWAPGS	Exchange GS base with KernelGSBase MSR
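
The sign-extension instructions in this table differ only in where the extended value ends up. A minimal Python sketch, with hypothetical function names and registers passed and returned as unsigned integers:

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def cdqe(eax):
    """CDQE: sign-extend EAX into RAX; returns the new RAX."""
    eax &= 0xFFFFFFFF
    return ((eax ^ 0x80000000) - 0x80000000) & MASK64


def cqo(rax):
    """CQO: sign-extend RAX into RDX:RAX; returns (RDX, RAX)."""
    rax &= MASK64
    rdx = MASK64 if rax >> 63 else 0   # RDX is filled with copies of the sign bit
    return rdx, rax
```

MOVSXD performs the same 32-to-64-bit extension as `cdqe`, but between an arbitrary 32-bit source and a 64-bit destination register.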

Bit manipulation extensions

Added with ABM

LZCNT, POPCNT (POPulation CouNT) – advanced bit manipulation

Added with BMI1

ANDN, BEXTR, BLSI, BLSMSK, BLSR, TZCNT

Added with BMI2

BZHI, MULX, PDEP, PEXT, RORX, SARX, SHRX, SHLX

Added with CLMUL instruction set

Instruction	Opcode	Description
PCLMULQDQ xmmreg,xmmrm,imm	66 0f 3a 44 /r ib	Perform a carry-less multiplication of two 64-bit polynomials over the finite field GF(2^k).
PCLMULLQLQDQ xmmreg,xmmrm	66 0f 3a 44 /r 00	Multiply the low halves of the two registers.
PCLMULHQLQDQ xmmreg,xmmrm	66 0f 3a 44 /r 01	Multiply the high half of the destination register by the low half of the source register.
PCLMULLQHQDQ xmmreg,xmmrm	66 0f 3a 44 /r 10	Multiply the low half of the destination register by the high half of the source register.
PCLMULHQHQDQ xmmreg,xmmrm	66 0f 3a 44 /r 11	Multiply the high halves of the two registers.
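
Carry-less multiplication is ordinary shift-and-add multiplication with XOR in place of addition. The Python sketch below models PCLMULQDQ on 128-bit integers; the function names are hypothetical, and the immediate is decoded as the table above implies (bit 0 selects the half of the destination, bit 4 the half of the source).

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def clmul64(a, b):
    """Carry-less product of two 64-bit values (up to 127 bits wide)."""
    result = 0
    for i in range(64):
        if (b >> i) & 1:
            result ^= a << i    # XOR instead of ADD: no carries propagate
    return result


def pclmulqdq(dst, src, imm8):
    """PCLMULQDQ model: dst and src are 128-bit integers; returns the 128-bit result."""
    a = (dst >> 64) & MASK64 if imm8 & 0x01 else dst & MASK64
    b = (src >> 64) & MASK64 if imm8 & 0x10 else src & MASK64
    return clmul64(a, b)
```

For instance, `clmul64(0b11, 0b11)` is `0b101`, corresponding to (x + 1)(x + 1) = x² + 1 over GF(2), whereas the ordinary integer product would be 9.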

Added with Intel ADX

Instruction	Description
ADCX	Adds two unsigned integers plus carry, reading the carry from the carry flag and if necessary setting it there. Does not affect other flags than the carry.
ADOX	Adds two unsigned integers plus carry, reading the carry from the overflow flag and if necessary setting it there. Does not affect other flags than the overflow.
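
Because ADCX touches only the carry flag and ADOX only the overflow flag, two independent carry chains can be interleaved. A minimal Python sketch, with hypothetical function names and the flags held in a dict:

```python
def adcx(dst, src, flags, width=64):
    """ADCX: dst + src + CF; updates CF only. Returns the new destination value."""
    total = dst + src + flags["CF"]
    flags["CF"] = total >> width
    return total & ((1 << width) - 1)


def adox(dst, src, flags, width=64):
    """ADOX: dst + src + OF, using OF as a second carry flag; updates OF only."""
    total = dst + src + flags["OF"]
    flags["OF"] = total >> width
    return total & ((1 << width) - 1)
```

In this model an `adox` executed between two `adcx` steps leaves the CF chain undisturbed, which is the property large-integer multiplication routines rely on.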

Added with Intel TSX

Instruction	Opcode	Description
XBEGIN rel16/32	C7 F8 cw/cd	Start transaction. If transaction fails, perform a branch to the given relative offset.
XEND	0F 01 D5	End transaction.
XABORT imm8	C6 F8 ib	Abort transaction with 8-bit immediate as error code.
XACQUIRE	F2	Instruction prefix to indicate start of hardware lock elision, used with memory atomic instructions only (for other instructions, the F2 prefix may have other meanings). When used with such instructions, may start a transaction instead of performing the memory atomic operation.
XRELEASE	F3	Instruction prefix to indicate end of hardware lock elision, used with memory atomic/store instructions only (for other instructions, the F3 prefix may have other meanings). When used with such instructions during hardware lock elision, will end the associated transaction instead of performing the store/atomic.

Added with Intel CET

CET adds two distinct features to help protect against security exploits such as return-oriented programming: a shadow stack (CET_SS), and indirect branch tracking (CET_IBT).

Instruction	Opcode	Description	Notes
INCSSPD r32	F3 0F AE /5	Increment shadow stack pointer	Shadow stack (CET_SS). When shadow stacks are enabled, return addresses are pushed on both the regular stack and the shadow stack when a function call is made. They are then both popped on return from the function call - if they do not match, then the stack is assumed to be corrupted, and a #CP exception is issued. The shadow stack is additionally required to be stored in specially marked memory pages which cannot be modified by normal memory store instructions.
INCSSPQ r64	F3 REX.W 0F AE /5	Increment shadow stack pointer
RDSSPD r32	F3 0F 1E /1	Read shadow stack pointer into register (low 32 bits)
RDSSPQ r64	F3 REX.W 0F 1E /1	Read shadow stack pointer into register (full 64 bits)
SAVEPREVSSP	F3 0F 01 EA	Save previous shadow stack pointer
RSTORSSP m64	F3 0F 01 /5	Restore saved shadow stack pointer
WRSSD m32,r32	0F 38 F6 /r	Write 4 bytes to shadow stack
WRSSQ m64,r64	REX.W 0F 38 F6 /r	Write 8 bytes to shadow stack
WRUSSD m32,r32	66 0F 38 F5 /r	Write 4 bytes to user shadow stack
WRUSSQ m64,r64	66 REX.W 0F 38 F5 /r	Write 8 bytes to user shadow stack
SETSSBSY	F3 0F 01 E8	Mark shadow stack busy
CLRSSBSY m64	F3 0F AE /6	Clear shadow stack busy flag
ENDBR32	F3 0F 1E FB	Terminate indirect branch in 32-bit mode	Indirect Branch Tracking (CET_IBT). When IBT is enabled, an indirect branch (jump, call, return) to any instruction that is not an ENDBR32/64 instruction will cause a #CP exception.
ENDBR64	F3 0F 1E FA	Terminate indirect branch in 64-bit mode
(no mnemonic)	3E	Prefix used with indirect CALL/JMP near instructions (opcodes FF /2 and FF /4) to indicate that the branch target is not required to start with an ENDBR32/64 instruction. Prefix only honored when NO_TRACK_EN flag is set. This prefix has the same encoding as the DS: segment override prefix - as of April 2022, Intel documentation does not appear to specify whether this prefix also retains its old segment-override function when used as a no-track prefix, nor does it provide an official mnemonic for this prefix.^[25]^[26] (GNU binutils use "notrack"^[27])
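
The shadow-stack check described above can be illustrated with a toy model. In this Python sketch the class and exception names are hypothetical, both stacks are plain lists, and the protection of shadow-stack pages against ordinary stores is not modelled; the test simply overwrites the regular stack to play the role of an attacker.

```python
class ControlProtectionFault(Exception):
    """Stands in for the #CP exception (assumed name)."""


class ShadowStackCpu:
    def __init__(self):
        self.stack = []    # regular stack: writable by ordinary stores
        self.shadow = []   # shadow stack: only updated by call/ret in this model

    def call(self, return_address):
        # The return address is pushed on both stacks.
        self.stack.append(return_address)
        self.shadow.append(return_address)

    def ret(self):
        # Both copies are popped and compared; a mismatch means corruption.
        address = self.stack.pop()
        shadow_address = self.shadow.pop()
        if address != shadow_address:
            raise ControlProtectionFault(hex(address))
        return address
```

Overwriting a return address on the regular stack, as a return-oriented programming exploit would, makes the next `ret` raise the fault instead of transferring control.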

x87 floating-point instructions

Original 8087 instructions

Instruction	Meaning	Notes
F2XM1	$2^{x}-1$	More precise than $2^{x}$ for $x$ close to zero. On 8087, only supported for $0\leq x\leq {\frac {1}{2}}$. On 80387 and later, supported for $-1\leq x\leq 1$.
FABS	Absolute value
FADD	Add
FADDP	Add and pop
FBLD	Load BCD	Undefined result for non-BCD input.
FBSTP	Store BCD and pop
FCHS	Change sign
FCLEX	Clear exceptions
FCOM	Compare
FCOMP	Compare and pop
FCOMPP	Compare and pop twice
FDECSTP	Decrement floating point stack pointer
FDISI	Disable interrupts	8087 only, otherwise FNOP
FDIV	Divide	Pentium FDIV bug
FDIVP	Divide and pop
FDIVR	Divide reversed
FDIVRP	Divide reversed and pop
FENI	Enable interrupts	8087 only, otherwise FNOP
FFREE	Free register
FIADD	Integer add
FICOM	Integer compare
FICOMP	Integer compare and pop
FIDIV	Integer divide
FIDIVR	Integer divide reversed
FILD	Load integer
FIMUL	Integer multiply
FINCSTP	Increment floating point stack pointer
FINIT	Initialize floating point processor
FIST	Store integer
FISTP	Store integer and pop
FISUB	Integer subtract
FISUBR	Integer subtract reversed
FLD	Floating point load	FLD m80 and FLD st(i) variants will, with an sNaN argument, cause an invalid-operation exception on AMD but not Intel FPUs.
FLD1	Load 1.0 onto stack
FLDCW	Load control word
FLDENV	Load environment state
FLDENVW	Load environment state, 16-bit
FLDL2E	Load $\log_{2}(e)$ onto stack	Uses round-to-nearest rounding on the 8087. On the 80387 and later, rounding follows the rounding control.
FLDL2T	Load $\log_{2}(10)$ onto stack
FLDLG2	Load $\log_{10}(2)$ onto stack
FLDLN2	Load $\ln(2)$ onto stack
FLDPI	Load $\pi$ onto stack
FLDZ	Load 0.0 onto stack
FMUL	Multiply
FMULP	Multiply and pop
FNCLEX	Clear exceptions, no wait
FNDISI	Disable interrupts, no wait	8087 only, otherwise FNOP
FNENI	Enable interrupts, no wait	8087 only, otherwise FNOP
FNINIT	Initialize floating point processor, no wait
FNOP	No operation
FNSAVE	Save FPU state, no wait, 8-bit
FNSAVEW	Save FPU state, no wait, 16-bit
FNSTCW	Store control word, no wait
FNSTENV	Store FPU environment, no wait
FNSTENVW	Store FPU environment, no wait, 16-bit
FNSTSW	Store status word, no wait
FPATAN	Partial arctangent	Computes $\arctan {\frac {st(1)}{st(0)}}$, with adjustment for quadrant similar to C's atan2() function. On 8087, only supported for $|st(0)|\leq |st(1)|$. This restriction was removed on the 80387.
FPREM	Partial remainder	Computes remainder with same sign as dividend, which is not IEEE-compliant. May compute a partial remainder, in which case it must be run again (signalled by C2 flag register).
FPTAN	Partial tangent	On 8087, only supported for $|x|\leq {\frac {\pi }{4}}$. On 80387 and later, supported for $|x|<2^{63}$.
FRNDINT	Round to integer
FRSTOR	Restore saved state
FRSTORW	Restore saved state	Perhaps not actually available in 8087
FSAVE	Save FPU state
FSAVEW	Save FPU state, 16-bit
FSCALE	Scale by factor of 2	On 8087, only supported for scale factors in range $-2^{15}\leq st(1)<2^{15}$ and produces undefined behavior if $0<|st(1)|<1$. These restrictions were removed on the 80387.
FSQRT	Square root
FST	Floating point store
FSTCW	Store control word
FSTENV	Store FPU environment
FSTENVW	Store FPU environment, 16-bit
FSTP	Store and pop	FSTP m80 and FSTP st(i) variants will, with an sNaN argument, cause an invalid-operation exception on AMD but not Intel FPUs.
FSTSW	Store status word
FSUB	Subtract
FSUBP	Subtract and pop
FSUBR	Reverse subtract
FSUBRP	Reverse subtract and pop
FTST	Test for zero
FWAIT	Wait while FPU is executing
FXAM	Examine condition flags
FXCH	Exchange registers
FXTRACT	Extract exponent and significand
FYL2X	$y\cdot \log_{2}x$	if $y=\log_{b}2$, then the base-$b$ logarithm is computed
FYL2XP1	$y\cdot \log_{2}(x+1)$	More precise than $\log_{2}z$ if x is close to zero. Only supported for $|x|\leq \left(1-{\sqrt {\frac {1}{2}}}\right)\approx 0.2929$
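
Some of the notes in this table can be illustrated numerically. The following Python sketch uses double precision rather than the x87 80-bit format, so it demonstrates only the mathematical identities, not x87 rounding; the function names are hypothetical stand-ins for the instructions.

```python
import math


def fyl2x(x, y):
    """FYL2X: y * log2(x)."""
    return y * math.log2(x)


def log_base(b, x):
    """Base-b logarithm via FYL2X, choosing y = log_b(2) = 1 / log2(b)."""
    return fyl2x(x, 1.0 / math.log2(b))


def f2xm1(x):
    """F2XM1: 2**x - 1, computed without cancellation for x close to zero."""
    return math.expm1(x * math.log(2.0))


def fprem_style(dividend, divisor):
    """Remainder with the sign of the dividend, as FPREM computes it."""
    return math.fmod(dividend, divisor)
```

In double precision the naive expression `2.0 ** 1e-20 - 1.0` evaluates to exactly 0.0, whereas `f2xm1(1e-20)` returns a small positive value, which is why a dedicated instruction is more precise near zero. Likewise, `fprem_style(5.0, 3.0)` gives 2.0, while the IEEE-style `math.remainder(5.0, 3.0)` gives -1.0.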