<html><head><title>NASM Manual</title></head> <body><h1 align=center>The Netwide Assembler: NASM</h1> <p align=center><a href="nasmdoc6.html">Next Chapter</a> | <a href="nasmdoc4.html">Previous Chapter</a> | <a href="nasmdoc0.html">Contents</a> | <a href="nasmdoci.html">Index</a> <h2><a name="chapter-5">Chapter 5: Assembler Directives</a></h2> <p>NASM, though it attempts to avoid the bureaucracy of assemblers like MASM and TASM, is nevertheless forced to support a <em>few</em> directives. These are described in this chapter. <p>NASM's directives come in two types: <em>user-level</em> directives and <em>primitive</em> directives. Typically, each directive has a user-level form and a primitive form. In almost all cases, we recommend that users use the user-level forms of the directives, which are implemented as macros which call the primitive forms. <p>Primitive directives are enclosed in square brackets; user-level directives are not. <p>In addition to the universal directives described in this chapter, each object file format can optionally supply extra directives in order to control particular features of that file format. These <em>format-specific</em> directives are documented along with the formats that implement them, in <a href="nasmdoc6.html">chapter 6</a>. <h3><a name="section-5.1">5.1 <code><nobr>BITS</nobr></code>: Specifying Target Processor Mode</a></h3> <p>The <code><nobr>BITS</nobr></code> directive specifies whether NASM should generate code designed to run on a processor operating in 16-bit mode, or code designed to run on a processor operating in 32-bit mode. The syntax is <code><nobr>BITS 16</nobr></code> or <code><nobr>BITS 32</nobr></code>. <p>In most cases, you should not need to use <code><nobr>BITS</nobr></code> explicitly. 
The <code><nobr>aout</nobr></code>, <code><nobr>coff</nobr></code>, <code><nobr>elf</nobr></code> and <code><nobr>win32</nobr></code> object formats, which are designed for use in 32-bit operating systems, all cause NASM to select 32-bit mode by default. The <code><nobr>obj</nobr></code> object format allows you to specify each segment you define as either <code><nobr>USE16</nobr></code> or <code><nobr>USE32</nobr></code>, and NASM will set its operating mode accordingly, so the use of the <code><nobr>BITS</nobr></code> directive is once again unnecessary. <p>The most likely reason for using the <code><nobr>BITS</nobr></code> directive is to write 32-bit code in a flat binary file; this is because the <code><nobr>bin</nobr></code> output format defaults to 16-bit mode in anticipation of it being used most frequently to write DOS <code><nobr>.COM</nobr></code> programs, DOS <code><nobr>.SYS</nobr></code> device drivers and boot loader software. <p>You do <em>not</em> need to specify <code><nobr>BITS 32</nobr></code> merely in order to use 32-bit instructions in a 16-bit DOS program; if you do, the assembler will generate incorrect code because it will be writing code targeted at a 32-bit platform, to be run on a 16-bit one. <p>When NASM is in <code><nobr>BITS 16</nobr></code> state, instructions which use 32-bit data are prefixed with an 0x66 byte, and those referring to 32-bit addresses have an 0x67 prefix. In <code><nobr>BITS 32</nobr></code> state, the reverse is true: 32-bit instructions require no prefixes, whereas instructions using 16-bit data need an 0x66 and those working on 16-bit addresses need an 0x67. <p>The <code><nobr>BITS</nobr></code> directive has an exactly equivalent primitive form, <code><nobr>[BITS 16]</nobr></code> and <code><nobr>[BITS 32]</nobr></code>. The user-level form is a macro which has no function other than to call the primitive form. 
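<p>As an illustration of the prefix behaviour described above, the following sketch (a made-up fragment for a flat <code><nobr>bin</nobr></code> file, not taken from any particular program) assembles similar instructions in both states. The comments note which prefix NASM would be expected to emit in each case:
<p><pre>
        bits 16
        mov     ax,bx           ; 16-bit data in 16-bit state: no prefix
        mov     eax,ebx         ; 32-bit data: 0x66 prefix
        mov     ax,[ebx]        ; 32-bit address: 0x67 prefix

        bits 32
        mov     eax,ebx         ; 32-bit data in 32-bit state: no prefix
        mov     ax,bx           ; 16-bit data: 0x66 prefix
        mov     eax,[bx]        ; 16-bit address: 0x67 prefix
</pre>
<p>Note that <code><nobr>BITS</nobr></code> only changes the encoding NASM produces. Putting the processor into the matching mode before the code runs remains the program's own responsibility.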
<h4><a name="section-5.1.1">5.1.1 <code><nobr>USE16</nobr></code> &amp; <code><nobr>USE32</nobr></code>: Aliases for BITS</a></h4>
<p>The <code><nobr>USE16</nobr></code> and <code><nobr>USE32</nobr></code> directives can be used in place of <code><nobr>BITS 16</nobr></code> and <code><nobr>BITS 32</nobr></code>, for compatibility with other assemblers.
<h3><a name="section-5.2">5.2 <code><nobr>SECTION</nobr></code> or <code><nobr>SEGMENT</nobr></code>: Changing and Defining Sections</a></h3>
<p>The <code><nobr>SECTION</nobr></code> directive (<code><nobr>SEGMENT</nobr></code> is an exactly equivalent synonym) changes which section of the output file the code you write will be assembled into. In some object file formats, the number and names of sections are fixed; in others, the user may make up as many as they wish. Hence <code><nobr>SECTION</nobr></code> may sometimes give an error message, or may define a new section, if you try to switch to a section that does not (yet) exist.
<p>The Unix object formats, and the <code><nobr>bin</nobr></code> object format, all support the standardised section names <code><nobr>.text</nobr></code>, <code><nobr>.data</nobr></code> and <code><nobr>.bss</nobr></code> for the code, data and uninitialised-data sections. The <code><nobr>obj</nobr></code> format, by contrast, does not recognise these section names as being special, and indeed will strip off the leading period of any section name that has one.
<h4><a name="section-5.2.1">5.2.1 The <code><nobr>__SECT__</nobr></code> Macro</a></h4>
<p>The <code><nobr>SECTION</nobr></code> directive is unusual in that its user-level form functions differently from its primitive form. The primitive form, <code><nobr>[SECTION xyz]</nobr></code>, simply switches the current target section to the one given.
The user-level form, <code><nobr>SECTION xyz</nobr></code>, however, first defines the single-line macro <code><nobr>__SECT__</nobr></code> to be the primitive <code><nobr>[SECTION]</nobr></code> directive which it is about to issue, and then issues it. So the user-level directive
<p><pre>
        SECTION .text
</pre>
<p>expands to the two lines
<p><pre>
%define __SECT__        [SECTION .text]
        [SECTION .text]
</pre>
<p>Users may find it useful to make use of this in their own macros. For example, the <code><nobr>writefile</nobr></code> macro defined in <a href="nasmdoc4.html#section-4.3.3">section 4.3.3</a> can be usefully rewritten in the following more sophisticated form:
<p><pre>
%macro  writefile 2+

        [section .data]

  %%str:        db      %2
  %%endstr:

        __SECT__

        mov     dx,%%str
        mov     cx,%%endstr-%%str
        mov     bx,%1
        mov     ah,0x40
        int     0x21

%endmacro
</pre>
<p>This form of the macro, once passed a string to output, first switches temporarily to the data section of the file, using the primitive form of the <code><nobr>SECTION</nobr></code> directive so as not to modify <code><nobr>__SECT__</nobr></code>. It then declares its string in the data section, and then invokes <code><nobr>__SECT__</nobr></code> to switch back to <em>whichever</em> section the user was previously working in. It thus avoids the <code><nobr>JMP</nobr></code> instruction that the previous version of the macro needed in order to jump over the data. It also keeps working in a complicated <code><nobr>OBJ</nobr></code> format module, where the user could be assembling the code in any of several separate code sections.
<h3><a name="section-5.3">5.3 <code><nobr>ABSOLUTE</nobr></code>: Defining Absolute Labels</a></h3>
<p>The <code><nobr>ABSOLUTE</nobr></code> directive can be thought of as an alternative form of <code><nobr>SECTION</nobr></code>: it causes the subsequent code to be directed at no physical section, but at the hypothetical section starting at the given absolute address.
The only instructions you can use in this mode are the <code><nobr>RESB</nobr></code> family.
<p><code><nobr>ABSOLUTE</nobr></code> is used as follows:
<p><pre>
absolute 0x1A

    kbuf_chr    resw    1
    kbuf_free   resw    1
    kbuf        resw    16
</pre>
<p>This example describes a section of the PC BIOS data area, at segment address 0x40: the above code defines <code><nobr>kbuf_chr</nobr></code> to be 0x1A, <code><nobr>kbuf_free</nobr></code> to be 0x1C, and <code><nobr>kbuf</nobr></code> to be 0x1E.
<p>The user-level form of <code><nobr>ABSOLUTE</nobr></code>, like that of <code><nobr>SECTION</nobr></code>, redefines the <code><nobr>__SECT__</nobr></code> macro when it is invoked.
<p><code><nobr>STRUC</nobr></code> and <code><nobr>ENDSTRUC</nobr></code> are defined as macros which use <code><nobr>ABSOLUTE</nobr></code> (and also <code><nobr>__SECT__</nobr></code>).
<p><code><nobr>ABSOLUTE</nobr></code> doesn't have to take an absolute constant as an argument: it can take an expression (actually, a critical expression: see <a href="nasmdoc3.html#section-3.8">section 3.8</a>) and it can be a value in a segment. For example, a TSR can re-use its setup code as run-time BSS like this:
<p><pre>
        org     100h               ; it's a .COM program

        jmp     setup              ; setup code comes last

        ; the resident part of the TSR goes here
setup:
        ; now write the code that installs the TSR here

absolute setup

runtimevar1     resw    1
runtimevar2     resd    20

tsr_end:
</pre>
<p>This defines some variables "on top of" the setup code, so that after the setup has finished running, the space it took up can be re-used as data storage for the running TSR. The symbol <code><nobr>tsr_end</nobr></code> can be used to calculate the total size of the part of the TSR that needs to be made resident.
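<p>As a sketch of how <code><nobr>tsr_end</nobr></code> might be used, the installation code could finish as shown below. This assumes the DOS "terminate and stay resident" service (<code><nobr>INT 0x21</nobr></code> with <code><nobr>AH=0x31</nobr></code>), which expects the amount of memory to keep, in 16-byte paragraphs, in <code><nobr>DX</nobr></code>. Because of the <code><nobr>org 100h</nobr></code>, the offset of <code><nobr>tsr_end</nobr></code> already includes the 256-byte PSP. The size is rounded up at run time here, simply to keep the operand a plain label reference:
<p><pre>
        org     100h               ; it's a .COM program

        jmp     setup              ; setup code comes last

        ; the resident part of the TSR goes here
setup:
        ; (hypothetical) installation work would go here

        mov     dx,tsr_end         ; offset just past the resident part
        add     dx,15              ; round up to a whole paragraph
        mov     cl,4
        shr     dx,cl              ; convert bytes to 16-byte paragraphs
        mov     ax,0x3100          ; AH=0x31: stay resident, AL=0: exit code
        int     0x21

absolute setup

runtimevar1     resw    1
runtimevar2     resd    20

tsr_end:
</pre>
<p>The variables only overlay the setup code in the assembler's view of the address space, so the installation code must not touch them until it no longer needs the instructions they overlap.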
<h3><a name="section-5.4">5.4 <code><nobr>EXTERN</nobr></code>: Importing Symbols from Other Modules</a></h3>
<p><code><nobr>EXTERN</nobr></code> is similar to the MASM directive <code><nobr>EXTRN</nobr></code> and the C keyword <code><nobr>extern</nobr></code>: it is used to declare a symbol which is not defined anywhere in the module being assembled, but is assumed to be defined in some other module and needs to be referred to by this one. Not every object-file format can support external variables: the <code><nobr>bin</nobr></code> format cannot.
<p>The <code><nobr>EXTERN</nobr></code> directive takes as many arguments as you like. Each argument is the name of a symbol:
<p><pre>
extern  _printf
extern  _sscanf,_fscanf
</pre>
<p>Some object-file formats provide extra features to the <code><nobr>EXTERN</nobr></code> directive. In all cases, the extra features are used by suffixing a colon to the symbol name followed by object-format specific text. For example, the <code><nobr>obj</nobr></code> format allows you to declare that the default segment base of an external should be the group <code><nobr>dgroup</nobr></code> by means of the directive
<p><pre>
extern  _variable:wrt dgroup
</pre>
<p>The primitive form of <code><nobr>EXTERN</nobr></code> differs from the user-level form only in that it can take only one argument at a time: the support for multiple arguments is implemented at the preprocessor level.
<p>You can declare the same variable as <code><nobr>EXTERN</nobr></code> more than once: NASM will quietly ignore the second and later redeclarations. You can't declare a variable as <code><nobr>EXTERN</nobr></code> as well as something else, though.
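<p>To show an imported symbol in use, here is a minimal sketch of a module that calls the C library's <code><nobr>printf</nobr></code>. It assumes a 32-bit object format, a C toolchain that prefixes C symbols with an underscore (as the names above do), and the usual C convention of pushing arguments right to left with the caller cleaning up the stack. The label <code><nobr>fmt</nobr></code> and the value printed are made up for illustration:
<p><pre>
        extern  _printf         ; defined in the C library, not in this module
        global  _main           ; exported so the C startup code can find it

        section .data
fmt     db      'value = %d',10,0

        section .text
_main:
        push    dword 42        ; second argument: the number to print
        push    dword fmt       ; first argument: address of the format string
        call    _printf
        add     esp,8           ; caller removes the two arguments
        xor     eax,eax         ; return 0 from main
        ret
</pre>
<p>On a platform whose C compiler does not add the leading underscore, the names would be <code><nobr>printf</nobr></code> and <code><nobr>main</nobr></code> instead.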
<h3><a name="section-5.5">5.5 <code><nobr>GLOBAL</nobr></code>: Exporting Symbols to Other Modules</a></h3>
<p><code><nobr>GLOBAL</nobr></code> is the other end of <code><nobr>EXTERN</nobr></code>: if one module declares a symbol as <code><nobr>EXTERN</nobr></code> and refers to it, then in order to prevent linker errors, some other module must actually <em>define</em> the symbol and declare it as <code><nobr>GLOBAL</nobr></code>. Some assemblers use the name <code><nobr>PUBLIC</nobr></code> for this purpose.
<p>The <code><nobr>GLOBAL</nobr></code> directive applying to a symbol must appear <em>before</em> the definition of the symbol.
<p><code><nobr>GLOBAL</nobr></code> uses the same syntax as <code><nobr>EXTERN</nobr></code>, except that it must refer to symbols which <em>are</em> defined in the same module as the <code><nobr>GLOBAL</nobr></code> directive. For example:
<p><pre>
global _main
_main:
        ; some code
</pre>
<p><code><nobr>GLOBAL</nobr></code>, like <code><nobr>EXTERN</nobr></code>, allows object formats to define private extensions by means of a colon. The <code><nobr>elf</nobr></code> object format, for example, lets you specify whether global data items are functions or data:
<p><pre>
global  hashlookup:function, hashtable:data
</pre>
<p>Like <code><nobr>EXTERN</nobr></code>, the primitive form of <code><nobr>GLOBAL</nobr></code> differs from the user-level form only in that it can take only one argument at a time.
<h3><a name="section-5.6">5.6 <code><nobr>COMMON</nobr></code>: Defining Common Data Areas</a></h3>
<p>The <code><nobr>COMMON</nobr></code> directive is used to declare <em>common variables</em>.
A common variable is much like a global variable declared in the uninitialised data section, so that
<p><pre>
common  intvar  4
</pre>
<p>is similar in function to
<p><pre>
global  intvar
section .bss

intvar  resd    1
</pre>
<p>The difference is that if more than one module defines the same common variable, then at link time those variables will be <em>merged</em>, and references to <code><nobr>intvar</nobr></code> in all modules will point at the same piece of memory.
<p>Like <code><nobr>GLOBAL</nobr></code> and <code><nobr>EXTERN</nobr></code>, <code><nobr>COMMON</nobr></code> supports object-format specific extensions. For example, the <code><nobr>obj</nobr></code> format allows common variables to be NEAR or FAR, and the <code><nobr>elf</nobr></code> format allows you to specify the alignment requirements of a common variable:
<p><pre>
common  commvar  4:near  ; works in OBJ
common  intarray 100:4   ; works in ELF: 4 byte aligned
</pre>
<p>Once again, like <code><nobr>EXTERN</nobr></code> and <code><nobr>GLOBAL</nobr></code>, the primitive form of <code><nobr>COMMON</nobr></code> differs from the user-level form only in that it can take only one argument at a time.
<h3><a name="section-5.7">5.7 <code><nobr>CPU</nobr></code>: Defining CPU Dependencies</a></h3>
<p>The <code><nobr>CPU</nobr></code> directive restricts assembly to those instructions which are available on the specified CPU.
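<p>A minimal sketch of how such a restriction might be used: shifting by an immediate count other than 1 was introduced with the 80186, so code that has to run on an 8086 loads the count into <code><nobr>CL</nobr></code> instead. The fragment is an illustration only, not taken from a real program:
<p><pre>
        cpu 8086
        mov     cl,4
        shl     ax,cl           ; shift by CL exists on the 8086
        ; "shl ax,4" would be refused at this point

        cpu 186
        shl     ax,4            ; immediate shift count: 186 or later
</pre>
<p>Placing a <code><nobr>CPU</nobr></code> line at the top of a source file is a cheap way to have the assembler catch instructions the target machine lacks.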
<p>Options are:
<ul>
<li><code><nobr>CPU 8086</nobr></code> Assemble only 8086 instruction set
<li><code><nobr>CPU 186</nobr></code> Assemble instructions up to the 80186 instruction set
<li><code><nobr>CPU 286</nobr></code> Assemble instructions up to the 286 instruction set
<li><code><nobr>CPU 386</nobr></code> Assemble instructions up to the 386 instruction set
<li><code><nobr>CPU 486</nobr></code> 486 instruction set
<li><code><nobr>CPU 586</nobr></code> Pentium instruction set
<li><code><nobr>CPU PENTIUM</nobr></code> Same as 586
<li><code><nobr>CPU 686</nobr></code> P6 instruction set
<li><code><nobr>CPU PPRO</nobr></code> Same as 686
<li><code><nobr>CPU P2</nobr></code> Same as 686
<li><code><nobr>CPU P3</nobr></code> Pentium III and Katmai instruction sets
<li><code><nobr>CPU KATMAI</nobr></code> Same as P3
<li><code><nobr>CPU P4</nobr></code> Pentium 4 (Willamette) instruction set
<li><code><nobr>CPU WILLAMETTE</nobr></code> Same as P4
<li><code><nobr>CPU IA64</nobr></code> IA64 CPU (in x86 mode) instruction set
</ul>
<p>All options are case insensitive. An instruction is assembled only if it is available on the selected CPU or an earlier one. By default, all instructions are available.
</body></html>