
Last update Mar 6, 2002


D x86 Inline Assembler

D, being a systems programming language, provides an inline assembler. The inline assembler is standardized for D implementations across the same CPU family. For example, the Intel Pentium inline assembler for a Win32 D compiler will be syntax compatible with the inline assembler for Linux running on an Intel Pentium.

Differing D implementations, however, are free to innovate upon the memory model, function call/return conventions, argument passing conventions, etc.

This document describes the x86 implementation of the inline assembler.

	AsmInstruction:
		Identifier : AsmInstruction
		align IntegerExpression
		even
		naked
		db Operands
		ds Operands
		di Operands
		dl Operands
		df Operands
		dd Operands
		de Operands
		Opcode
		Opcode Operands

	Operands:
		Operand
		Operand , Operands
	

Labels

Assembler instructions can be labeled just like other statements. They can be the target of goto statements. For example:
	void *pc;
	asm
	{
	    call L1		;
	 L1:			;
	    pop	EBX		;
	    mov	pc[EBP],EBX	;	// pc now points to code at L1
	}
	

align IntegerExpression

Causes the assembler to emit NOP instructions to align the next assembler instruction on an IntegerExpression boundary. IntegerExpression must evaluate to an integer that is a power of 2.

Aligning the start of a loop body can sometimes have a dramatic effect on the execution speed.
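
As a rough illustration of the arithmetic behind align (and behind even, which amounts to an alignment of 2), the following D sketch computes how many filler bytes are needed to reach the next boundary. The function name paddingFor is hypothetical and is not part of any D library; how the assembler actually chooses its filler instructions is not specified here.

```d
// Hypothetical helper, for illustration only.
// Returns the number of filler bytes needed so that `offset` lands on
// the next multiple of `alignment`. `alignment` must be a power of 2.
size_t paddingFor(size_t offset, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0,
           "alignment must be a power of 2");
    // For a power of 2, (alignment - 1) is a mask of the low bits.
    return (alignment - (offset & (alignment - 1))) & (alignment - 1);
}
```

For instance, code that currently ends at offset 13 needs 3 filler bytes before an align 16, and none if it already ends at offset 16.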

even

Causes the assembler to emit NOP instructions to align the next assembler instruction on an even boundary.

naked

Causes the compiler to not generate the function prolog and epilog sequences. Generating them then becomes the responsibility of the inline assembly programmer. It is normally used when the entire function is to be written in assembler.
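
A minimal sketch of a function written entirely in assembler follows. It rests on several assumptions that are not stated in this document: that the compiler predefines the D_InlineAsm_X86 version identifier when this assembler is available, that naked is written as a statement inside the asm block, and that an int result is returned in EAX. The function name answer is made up for the example, and the else branch exists only so the sketch still compiles on other targets.

```d
// Hypothetical example function.
int answer()
{
    version (D_InlineAsm_X86)
    {
        asm
        {
            naked;          // no prolog or epilog is generated
            mov EAX, 42;    // assumed: int results are returned in EAX
            ret;            // we must return ourselves, there is no epilog
        }
    }
    else
    {
        // Fallback for targets without the x86 inline assembler.
        return 42;
    }
}
```

Because no epilog is generated, the explicit ret is required; leaving it out would let execution run past the end of the function.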

db, ds, di, dl, df, dd, de

These pseudo ops are for inserting raw data directly into the code. db is for bytes, ds is for 16 bit words, di is for 32 bit words, dl is for 64 bit words, df is for 32 bit floats, dd is for 64 bit doubles, and de is for 80 bit extended reals. Each can have multiple operands. A string literal operand is treated as if each of its characters were a separate operand, so the string contributes as many operands as it has characters. For example:
	asm
	{
	    db 5,6,0x83;   // insert bytes 0x05, 0x06, and 0x83 into code
	    ds 0x1234;     // insert bytes 0x34, 0x12
	    di 0x1234;     // insert bytes 0x34, 0x12, 0x00, 0x00
	    dl 0x1234;     // insert bytes 0x34, 0x12, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
	    df 1.234;      // insert float 1.234
	    dd 1.234;      // insert double 1.234
	    de 1.234;      // insert extended 1.234
	    db "abc";      // insert bytes 0x61, 0x62, and 0x63
	    ds "abc";      // insert bytes 0x61, 0x00, 0x62, 0x00, 0x63, 0x00
	}
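
The byte sequences in the comments above follow from the x86 little-endian layout, where the least significant byte is stored first. The following D sketch reproduces them for the integer pseudo ops; littleEndianBytes is a hypothetical helper written for this illustration, not something the compiler provides.

```d
// Hypothetical helper, for illustration only.
// Returns the `width` bytes of `value` in little-endian order
// (least significant byte first). `width` is expected to be 1..8.
ubyte[] littleEndianBytes(ulong value, size_t width)
{
    auto bytes = new ubyte[](width);
    foreach (i; 0 .. width)
        bytes[i] = cast(ubyte)(value >> (8 * i));
    return bytes;
}
```

Under this model, ds 0x1234 corresponds to littleEndianBytes(0x1234, 2), and ds "abc" corresponds to calling the helper with width 2 once per character.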
	

Opcodes

A list of supported opcodes is at the end.

The following registers are supported. Register names are always in upper case.

AL, AH, AX, EAX
BL, BH, BX, EBX
CL, CH, CX, ECX
DL, DH, DX, EDX
BP, EBP
SP, ESP
DI, EDI
SI, ESI
ES, CS, SS, DS, GS, FS
CR0, CR2, CR3, CR4
DR0, DR1, DR2, DR3, DR6, DR7
TR3, TR4, TR5, TR6, TR7
ST
ST(0), ST(1), ST(2), ST(3), ST(4), ST(5), ST(6), ST(7)
MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7

Special Cases

lock, rep, repe, repne, repnz, repz
These prefix instructions do not appear in the same statement as the instructions they prefix; they appear in their own statement. For example:
	asm
	{
	    rep   ;
	    movsb ;
	}
	
pause
This opcode is not supported by the assembler. Instead, use
	asm
	{
	    rep  ;
	    nop  ;
	}
	
which produces the same result: the sequence assembles to the bytes 0xF3, 0x90, which is the encoding of pause.
floating point ops
Use the two operand form of the instruction:
	fdiv ST(1);	// wrong
	fmul ST;        // wrong
	fdiv ST,ST(1);	// right
	fmul ST,ST(0);	// right
	

Operands

	Operand:
	    AsmExp

	AsmExp:
	    AsmLogOrExp
	    AsmLogOrExp ? AsmExp : AsmExp

	AsmLogOrExp:
	    AsmLogAndExp
	    AsmLogAndExp || AsmLogAndExp

	AsmLogAndExp:
	    AsmOrExp
	    AsmOrExp && AsmOrExp

	AsmOrExp:
	    AsmXorExp
	    AsmXorExp | AsmXorExp

	AsmXorExp:
	    AsmAndExp
	    AsmAndExp ^ AsmAndExp

	AsmAndExp:
	    AsmEqualExp
	    AsmEqualExp & AsmEqualExp

	AsmEqualExp:
	    AsmRelExp
	    AsmRelExp == AsmRelExp
	    AsmRelExp != AsmRelExp

	AsmRelExp:
	    AsmShiftExp
	    AsmShiftExp < AsmShiftExp
	    AsmShiftExp <= AsmShiftExp
	    AsmShiftExp > AsmShiftExp
	    AsmShiftExp >= AsmShiftExp

	AsmShiftExp:
	    AsmAddExp
	    AsmAddExp << AsmAddExp
	    AsmAddExp >> AsmAddExp
	    AsmAddExp >>> AsmAddExp

	AsmAddExp:
	    AsmMulExp
	    AsmMulExp + AsmMulExp
	    AsmMulExp - AsmMulExp

	AsmMulExp:
	    AsmBrExp
	    AsmBrExp * AsmBrExp
	    AsmBrExp / AsmBrExp
	    AsmBrExp % AsmBrExp

	AsmBrExp:
	    AsmUnaExp
	    AsmBrExp [ AsmExp ]

	AsmUnaExp:
	    AsmTypePrefix AsmExp
	    offset AsmExp
	    seg AsmExp
	    + AsmUnaExp
	    - AsmUnaExp
	    ! AsmUnaExp
	    ~ AsmUnaExp
	    AsmPrimaryExp

	AsmPrimaryExp:
	    IntegerConstant
	    FloatConstant
	    __LOCAL_SIZE
	    $
	    Register
	    DotIdentifier

	DotIdentifier:
	    Identifier
	    Identifier . DotIdentifier
	
The operand syntax more or less follows the Intel CPU documentation conventions. In particular, for two operand instructions the destination is the left operand and the source is the right operand. The syntax differs from Intel's in order to be compatible with the D language tokenizer and to simplify parsing.

Operand Types

	AsmTypePrefix:
		near ptr
		far ptr
		byte ptr
		short ptr
		int ptr
		word ptr
		dword ptr
		float ptr
		double ptr
		extended ptr
	
In cases where the operand size is ambiguous, as in:
	add	[EAX],3		;
	
it can be disambiguated by using an AsmTypePrefix:
	add	byte ptr [EAX],3	;
	add	int ptr [EAX],7		;
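
To show why the operand size matters, the following D sketch models the two forms in plain D. The helper names addAsByte and addAsInt are hypothetical, and the byte version assumes a little-endian target such as x86, where the lowest address holds the least significant byte.

```d
// Hypothetical helpers, for illustration only.

// Models `add byte ptr [p],n`: only the byte at the lowest address
// changes, and a carry out of that byte is discarded.
uint addAsByte(uint value, ubyte n)
{
    auto p = cast(ubyte*) &value;
    *p += n;
    return value;
}

// Models `add int ptr [p],n`: all four bytes take part, so a carry
// propagates into the upper bytes.
uint addAsInt(uint value, uint n)
{
    auto p = &value;
    *p += n;
    return value;
}
```

Starting from 0x000000FF and adding 1, the byte form wraps the low byte around to give 0x00000000, while the int form carries into the next byte and gives 0x00000100.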
	

Struct/Union/Class Member Offsets

To access members of an aggregate when a pointer to the aggregate is in a register, use the qualified name of the member:
	struct Foo { int a,b,c; }
	int bar(Foo *f)
	{
	    asm
	    {	mov	EBX,f		;
		mov	EAX,Foo.b[EBX]	;	// EAX holds the int return value
	    }
	}
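
In an operand such as Foo.b[EBX], the qualified name stands for the byte offset of the member within the aggregate. The same offsets can be inspected from ordinary D code with the built-in .offsetof field property, as in the sketch below. The function readB is a hypothetical pure D counterpart of the assembler fragment, added only for comparison.

```d
struct Foo { int a, b, c; }

// .offsetof gives the byte offset of a member from the start of its
// aggregate. int is 4 bytes in D, so the members sit at 0, 4 and 8.
static assert(Foo.a.offsetof == 0);
static assert(Foo.b.offsetof == 4);
static assert(Foo.c.offsetof == 8);

// Hypothetical helper: reads the int located Foo.b.offsetof bytes past
// the pointer f, which is what `mov EAX,Foo.b[EBX]` does with EBX = f.
int readB(Foo* f)
{
    return *cast(int*)(cast(ubyte*) f + Foo.b.offsetof);
}
```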
	

Special Symbols

$
Represents the program counter of the start of the next instruction. So,
	jmp	$  ;
branches to the instruction following the jmp instruction.

__LOCAL_SIZE
This gets replaced by the number of local bytes in the local stack frame. It is most useful when naked is used and a custom stack frame is programmed.

Opcodes Supported

aaa aad aam aas adc
add addpd addps addsd addss
and andnpd andnps andpd andps
arpl bound bsf bsr bswap
bt btc btr bts call
cbw cdq clc cld clflush
cli clts cmc cmova cmovae
cmovb cmovbe cmovc cmove cmovg
cmovge cmovl cmovle cmovna cmovnae
cmovnb cmovnbe cmovnc cmovne cmovng
cmovnge cmovnl cmovnle cmovno cmovnp
cmovns cmovnz cmovo cmovp cmovpe
cmovpo cmovs cmovz cmp cmppd
cmpps cmps cmpsb cmpsd cmpss
cmpsw cmpxch8b cmpxchg comisd comiss
cpuid cvtdq2pd cvtdq2ps cvtpd2dq cvtpd2pi
cvtpd2ps cvtpi2pd cvtpi2ps cvtps2dq cvtps2pd
cvtps2pi cvtsd2si cvtsd2ss cvtsi2sd cvtsi2ss
cvtss2sd cvtss2si cvttpd2dq cvttpd2pi cvttps2dq
cvttps2pi cvttsd2si cvttss2si cwd cwde
da daa das db dd
de dec df di div
divpd divps divsd divss dl
dq ds dt dw emms
enter f2xm1 fabs fadd faddp
fbld fbstp fchs fclex fcmovb
fcmovbe fcmove fcmovnb fcmovnbe fcmovne
fcmovnu fcmovu fcom fcomi fcomip
fcomp fcompp fcos fdecstp fdiv
fdivp fdivr fdivrp ffree fiadd
ficom ficomp fidiv fidivr fild
fimul fincstp finit fist fistp
fisub fisubr fld fld1 fldcw
fldenv fldl2e fldl2t fldlg2 fldln2
fldpi fldz fmul fmulp fnclex
fninit fnop fnsave fnstcw fnstenv
fnstsw fpatan fprem fprem1 fptan
frndint frstor fsave fscale fsetpm
fsin fsincos fsqrt fst fstcw
fstenv fstp fstsw fsub fsubp
fsubr fsubrp ftst fucom fucomi
fucomip fucomp fucompp fwait fxam
fxch fxrstor fxsave fxtract fyl2x
fyl2xp1 hlt idiv imul in
inc ins insb insd insw
int into invd invlpg iret
iretd ja jae jb jbe
jc jcxz je jecxz jg
jge jl jle jmp jna
jnae jnb jnbe jnc jne
jng jnge jnl jnle jno
jnp jns jnz jo jp
jpe jpo js jz lahf
lar ldmxcsr lds lea leave
les lfence lfs lgdt lgs
lidt lldt lmsw lock lods
lodsb lodsd lodsw loop loope
loopne loopnz loopz lsl lss
ltr maskmovdqu maskmovq maxpd maxps
maxsd maxss mfence minpd minps
minsd minss mov movapd movaps
movd movdq2q movdqa movdqu movhlps
movhpd movhps movlhps movlpd movlps
movmskpd movmskps movntdq movnti movntpd
movntps movntq movq movq2dq movs
movsb movsd movss movsw movsx
movupd movups movzx mul mulpd
mulps mulsd mulss neg nop
not or orpd orps out
outs outsb outsd outsw packssdw
packsswb packuswb paddb paddd paddq
paddsb paddsw paddusb paddusw paddw
pand pandn pavgb pavgw pcmpeqb
pcmpeqd pcmpeqw pcmpgtb pcmpgtd pcmpgtw
pextrw pinsrw pmaddwd pmaxsw pmaxub
pminsw pminub pmovmskb pmulhuw pmulhw
pmullw pmuludq pop popa popad
popf popfd por prefetchnta prefetcht0
prefetcht1 prefetcht2 psadbw pshufd pshufhw
pshuflw pshufw pslld pslldq psllq
psllw psrad psraw psrld psrldq
psrlq psrlw psubb psubd psubq
psubsb psubsw psubusb psubusw psubw
punpckhbw punpckhdq punpckhqdq punpckhwd punpcklbw
punpckldq punpcklqdq punpcklwd push pusha
pushad pushf pushfd pxor rcl
rcpps rcpss rcr rdmsr rdpmc
rdtsc rep repe repne repnz
repz ret retf rol ror
rsm rsqrtps rsqrtss sahf sal
sar sbb scas scasb scasd
scasw seta setae setb setbe
setc sete setg setge setl
setle setna setnae setnb setnbe
setnc setne setng setnge setnl
setnle setno setnp setns setnz
seto setp setpe setpo sets
setz sfence sgdt shl shld
shr shrd shufpd shufps sidt
sldt smsw sqrtpd sqrtps sqrtsd
sqrtss stc std sti stmxcsr
stos stosb stosd stosw str
sub subpd subps subsd subss
sysenter sysexit test ucomisd ucomiss
ud2 unpckhpd unpckhps unpcklpd unpcklps
verr verw wait wbinvd wrmsr
xadd xchg xlat xlatb xor
xorpd xorps

AMD Opcodes Supported

pavgusb pf2id pfacc pfadd pfcmpeq
pfcmpge pfcmpgt pfmax pfmin pfmul
pfnacc pfpnacc pfrcp pfrcpit1 pfrcpit2
pfrsqit1 pfrsqrt pfsub pfsubr pi2fd
pmulhrw pswapd

Copyright (c) 1999-2002 by Digital Mars, All Rights Reserved