BBC BASIC Reference Manual

Appendix J - ARM assembler

Assembly language is a programming language in which each statement translates directly into a single machine code instruction or piece of data. An assembler is a piece of software which converts these statements into their machine code counterparts.

Writing in assembly language has its disadvantages. The code is more verbose than the equivalent high-level language statements and more difficult to understand, and is therefore harder to debug. High-level languages were invented so that programs could be written to look more like English, letting us talk to computers in our language rather than directly in theirs.

There are two reasons why, in certain circumstances, assembly language is used in preference to high-level languages. The first reason is that the machine code program produced by it executes more quickly than its high-level counterparts, particularly those in languages such as BASIC which are interpreted. The second reason is that assembly language offers greater flexibility. It allows certain operating system routines to be called or replaced by new pieces of code, and it allows greater access to the hardware devices and controllers.

Finding out more

For more details of writing in assembly language see the Acorn Assembler Release 2 manual.

For more details of RISC OS see the Programmer's Reference Manual.

For more details of the ARM3 processor, see the Acorn RISC Machine family Data Manual. VLSI Technology Inc. (1990) Prentice-Hall, Englewood Cliffs, NJ, USA: ISBN 0-13-781618-9.

Using the BASIC assembler

The assembler is part of the BBC BASIC language. Square brackets '[' and ']' are used to enclose all the assembly language instructions and directives and hence to inform BASIC that the enclosed instructions are intended for its assembler. However, there are several operations which must be performed from BASIC itself to ensure that a subsequent assembly language routine is assembled correctly.

Initialising external variables

The assembler allows the use of BASIC variables as addresses or data in instructions and assembler directives. For example variables can be set up in BASIC giving the numbers of any SWI routines which will be called:

OS_WriteI = &100
...
[
...
SWI OS_WriteI+ASC">"
...

Reserving memory space for the machine code

The machine code generated by the assembler is stored in memory. However, the assembler does not automatically set memory aside for this purpose. You must reserve sufficient memory to hold your assembled machine code by using the DIM statement. For example:

DIM code% 100

The start address of the memory area reserved is assigned to the variable code%. The address of the last memory location is code%+100. Hence, this example reserves a total of 101 bytes of memory. In future examples, the size of memory reserved is shown as required_size, to emphasise that you must substitute a value appropriate to the size of your code.

Memory pointers

You need to tell the assembler the start address of the area of memory you have reserved. The simplest way to do this is to assign P% to point to the start of this area. For example:

DIM code% required_size
...
P% = code%

P% is then used as the program counter. The assembler places the first assembler instruction at the address P% and automatically increments the value of P% by four so that it points to the next free location. When the assembler has finished assembling the code, P% points to the byte following the final location used. Therefore, the number of bytes of machine code generated is given by:

P% - code%

This method assumes that you wish subsequently to execute the code at the same location.

The position in memory at which you load a machine code program may be significant. For example, it might refer directly to data embedded within itself, or expect to find routines at fixed addresses. Such a program only works if it is loaded in the correct place in memory. However, it is often inconvenient to assemble the program directly into the place where it will eventually be executed. This memory may well be used for something else whilst you are assembling the program. The solution to this problem is to use a technique called 'offset assembly' where code is assembled as if it is to run at a certain address but is actually placed at another.

To do this, set O% to point to the place where the first machine code instruction is to be placed and P% to point to the address where the code is to be run.

To notify the assembler that this method of generating code is to be used, the directive OPT, which is described in more detail below, must have bit 2 set.
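
As a rough sketch of offset assembly, the fragment below assembles a two-instruction routine into a buffer while giving its labels the values they would have at a different run address. The names buffer%, run_address% and start, and the address &8000, are made up for illustration. OPT values of 4 and 6 are assumed so that bit 2 is set on both passes and errors are reported only on the second.

```basic
DIM buffer% 255            : REM where the code is stored while assembling
run_address% = &8000       : REM hypothetical address the code will later run at
FOR pass% = 4 TO 6 STEP 2  : REM bit 2 set on both passes, errors on in pass 2
  O% = buffer%             : REM assembled bytes are written here
  P% = run_address%        : REM labels take values as if the code sat here
  [
  OPT pass%
  .start
  MOV R0, #0
  MOV PC, R14
  ]
NEXT pass%
PRINT ~start               : REM label value is based on run_address%
PRINT O% - buffer%         : REM number of bytes generated
```

The code in buffer% would then need to be saved, or copied to run_address%, before it could be run.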

It is usually easy, and always preferable, to write ARM code that is position independent.

Implementing passes

Normally, when the processor is executing a machine code program, it executes one instruction and then moves on automatically to the one following it in memory. You can, however, make the processor move to a different location and start processing from there instead by using one of the 'branch' instructions. For example:

.result_was_0
        ...
        BEQ result_was_0

The full stop in front of the name result_was_0 identifies this string as the name of a 'label'. This is a directive to the assembler which tells it to assign the current value of the program counter (P%) to the variable whose name follows the full stop.

BEQ means 'branch if the result of the last calculation that updated the PSR was zero'. The location to be branched to is given by the value previously assigned to the label result_was_0.

The label can, however, occur after the branch instruction. This causes a slight problem for the assembler since when it reaches the branch instruction, it hasn't yet assigned a value to the variable, so it doesn't know which value to replace it with.

You can get around this problem by assembling the source code twice. This is known as two-pass assembly. During the first pass the assembler assigns values to all the label variables. In the second pass it is able to replace references to these variables by their values.

A single pass is sufficient only when the text contains no forward references to labels.

These two passes may be performed by a FOR...NEXT loop as follows:

DIM code% required_size
FOR pass% = 0 TO 3 STEP 3
  P% = code%
  [
  OPT pass%
  ... further assembly language statements and assembler directives
  ]
NEXT pass%

Note that the pointer(s), in this case just P%, must be set at the start of both passes.
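
A complete, if artificial, instance of this loop might look like the sketch below. The routine name is_zero and the label was_zero are invented for the example. It assumes that BASIC passes A% in R0, returns R0 as the value of USR, and leaves the return address in R14. Passes 0 and 2 are used rather than 0 and 3 simply to suppress the listing.

```basic
DIM code% 31
FOR pass% = 0 TO 2 STEP 2
  P% = code%
  [
  OPT pass%
  .is_zero
  CMP R0, #0
  BEQ was_zero            ; forward reference, unknown during the first pass
  MOV R0, #0              ; non-zero argument, so return 0
  MOV PC, R14
  .was_zero
  MOV R0, #1              ; zero argument, so return 1
  MOV PC, R14
  ]
NEXT pass%
A% = 0
PRINT USR is_zero
```

If the loop were reduced to the second pass alone, the BEQ would be expected to stop the assembly with an error, because was_zero has not yet been given a value when the branch is reached.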

The OPT directive

OPT is an assembler directive that takes a number whose bits have the following meanings:

Bit	Meaning
0	Assembly listing enabled if set
1	Assembler errors enabled
2	Assembled code placed in memory at O% instead of P%
3	Check that assembled code does not exceed memory limit L%

Bit 0 controls whether a listing is produced. Whether you have one is up to you.

Bit 1 determines whether or not assembler errors are to be flagged or suppressed. For the first pass, bit 1 should be zero since otherwise any forward-referenced labels will cause the error 'Unknown or missing variable' and hence stop the assembly. During the second pass, this bit should be set to one, since by this stage all the labels defined are known, so the only errors it catches are 'real ones' - such as labels which have been used but not defined.

Bit 2 allows 'offset assembly', i.e. the program may be assembled into one area of memory, pointed to by O%, whilst being set up to run at the address pointed to by P%.

Bit 3 checks that the assembled code does not exceed the area of memory that has been reserved (i.e. none of it is held in an address greater than the value held in L%). When reserving space, L% might be set as follows:

DIM code% required_size
L% = code% + required_size
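
The following sketch shows one way the limit check might be combined with two-pass assembly. The variable size% and the label tiny are placeholders. OPT values of 8 and 10 are assumed so that bit 3 is set on both passes, with errors reported on the second pass and no listing.

```basic
size% = 64                    : REM hypothetical size, substitute your own
DIM code% size%
L% = code% + size%            : REM limit used when OPT bit 3 is set
FOR pass% = 8 TO 10 STEP 2
  P% = code%
  [
  OPT pass%
  .tiny
  MOV R0, #0
  MOV PC, R14
  ]
NEXT pass%
```

With bit 3 set, code that would run past L% should be reported as an error rather than silently overwriting whatever follows the reserved area.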

Saving machine code to file

Once an assembly language routine has been successfully assembled, you can then save it to file. To do so, you can use the *Save command. In our above examples, code% points to the start of the code; after assembly, P% points to the byte after the code. So we could use this BASIC command:

OSCLI "Save "+outfile$+" "+STR$~(code%)+" "+STR$~(P%)

after the above example to save the code in the file named by outfile$.

Executing a machine code program

From memory

From memory, the resulting machine code can be executed in a variety of ways:

CALL address
USR address

These may be used from inside BASIC to run the machine code at a given address.
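
As a small illustration, the sketch below assembles a routine and runs it with USR. It relies on BASIC placing A% in R0 on entry, leaving the return address in R14, and returning the final value of R0 as the result of USR. The name add_one is invented for the example.

```basic
DIM code% 15
P% = code%
[
OPT 2
.add_one
ADD R0, R0, #1      ; R0 arrives holding A%
MOV PC, R14         ; return to BASIC
]
A% = 41
PRINT USR add_one
```

CALL add_one would run the same code but discard the value left in R0.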

From file

The commands below will load and run the named file, using either its filetype (such as &FF8 for absolute code) and the associated Alias$@LoadType_XXX and Alias$@RunType_XXX system variables, or the load and execution addresses defined when it was saved.

*name
*RUN name
*/name

We strongly advise you to use file types in preference to load and execution addresses.

Format of assembly language statements

The assembly language statements and assembler directives should be between the square brackets.

There are very few rules about the format of assembly language statements; those which exist are given below:

Each assembly language statement comprises an assembler mnemonic of one or more letters followed by a varying number of operands.
Instructions should be separated from each other by colons or newlines.
Any text following a full stop '.' is treated as a label name.
Any text following a semicolon ';', or backslash '\', or 'REM' is treated as a comment and so ignored (until the next end of line or ':').
Spaces between the mnemonic and the first operand, and between the operands themselves are ignored.
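
The rules above can be seen together in the short sketch below. The label tidy is invented, and the routine does nothing useful beyond adding two constants. Note that the comments avoid the ':' character, since it would end the comment and start a new statement.

```basic
DIM code% 31
P% = code%
[
OPT 2
.tidy                        ; a label, introduced by a full stop
MOV R0, #1 : MOV R1, #2      ; two statements separated by a colon
ADD R0, R0, R1               \ a backslash also starts a comment
REM and so does REM
MOV PC, R14
]
```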

The BASIC assembler contains the following directives:

EQUB int	Define 1 byte of memory from LSB of int (DCB, =)
EQUW int	Define 2 bytes of memory from int (DCW)
EQUD int	Define 4 bytes of memory from int (DCD)
EQUS str	Define 0 - 255 bytes as required by string expression str (DCS)
ALIGN	Align P% (and O%) to the next word (4 byte) boundary
ADR reg,addr	Assemble instruction to load addr into reg

The first four operations initialise the reserved memory to the values specified by the operand. In the case of EQUS the operand field must be a string expression. In all other cases it must be a numeric expression. DCB (and =), DCW, DCD and DCS are synonyms for these directives.
The ALIGN directive ensures that the next P% (and O%) that is used lies on a word boundary. It is used after, for example, an EQUS to ensure that the next instruction is word-aligned.
ADR assembles a single instruction - typically but not necessarily an ADD or SUB - with reg as the destination register. It obtains addr in that register. It does so in a PC-relative (i.e. position independent) manner where possible.
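
A sketch of how these directives might be combined is given below. It embeds a zero-terminated string after the code and prints it with the OS_Write0 SWI, which writes the string whose address is in R0. The labels print_message and message are invented for the example, and two passes are used because message is referred to before it is defined.

```basic
DIM code% 63
FOR pass% = 0 TO 2 STEP 2
  P% = code%
  [
  OPT pass%
  .print_message
  ADR R0, message            ; R0 = address of the string, PC-relative
  SWI "OS_Write0"            ; print the zero-terminated string at R0
  MOV PC, R14                ; return to BASIC
  .message
  EQUS "Hello"               ; 5 bytes of text
  EQUB 0                     ; terminator
  ALIGN                      ; realign P% to a word boundary
  ]
NEXT pass%
CALL print_message
```

The ALIGN is not strictly needed here, since nothing follows the string, but it would be required if further instructions were placed after it.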

Registers

At any particular time there are sixteen 32-bit registers available for use, R0 to R15. However, R15 is special since it contains the program counter and the processor status register.

R15 is split up with 24 bits used as the program counter (PC) to hold the word address of the next instruction. 8 bits are used as the processor status register (PSR) to hold information about the current values of flags and the current mode/register bank. The PSR occupies the top six bits and the bottom two bits of R15, with the PC in the 24 bits between them.

The top six bits hold the following information:

Bit	Flag	Meaning
31	N	Negative flag
30	Z	Zero flag
29	C	Carry flag
28	V	Overflow flag
27	I	Interrupt request disable
26	F	Fast interrupt request disable

The bottom two bits can hold one of four different values:

M	Meaning
0	User mode
1	Fast interrupt processing mode (FIQ mode)
2	Interrupt processing mode (IRQ mode)
3	Supervisor mode (SVC mode)

User mode is the normal program execution state. SVC mode is a special mode which is entered when calls to the supervisor are made using software interrupts (SWIs) or when an exception occurs. From within SVC mode certain operations can be performed which are not permitted in user mode, such as writing to hardware devices and peripherals. SVC mode has its own private registers R13 and R14. So after changing to SVC mode, the registers R0 - R12 are the same, but new versions of R13 and R14 are available. The values contained by these registers in user mode are not overwritten or corrupted.

Similarly, IRQ and FIQ modes have their own private registers (R13 - R14 and R8 - R14 respectively).

Although only 16 registers are available at any one time, the processor actually contains a total of 27 registers.

For a more complete description of the registers, see the chapter entitled ARM Hardware in the Programmer's Reference Manual.

Condition codes

All the machine code instructions can be performed conditionally according to the status of one or more of the following flags: N, Z, C, V. The sixteen available condition codes are:

AL	Always	This is the default
CC	Carry clear	C clear
CS	Carry set	C set
EQ	Equal	Z set
GE	Greater than or equal	(N set and V set) or (N clear and V clear)
GT	Greater than	((N set and V set) or (N clear and V clear)) and Z clear
HI	Higher (unsigned)	C set and Z clear
LE	Less than or equal	(N set and V clear) or (N clear and V set) or Z set
LS	Lower or same (unsigned)	C clear or Z set
LT	Less than	(N set and V clear) or (N clear and V set)
MI	Negative	N set
NE	Not equal	Z clear
NV	Never
PL	Positive	N clear
VC	Overflow clear	V clear
VS	Overflow set	V set

Two of these may be given alternative names as follows:

LO Lower unsigned is equivalent to CC
HS Higher / same unsigned is equivalent to CS

You should not use the NV (never) condition code.
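
To illustrate conditional execution, the sketch below computes an absolute value without any branch. The label absolute is invented, and the routine assumes the usual USR convention of A% arriving in R0 and R0 being returned.

```basic
DIM code% 15
P% = code%
[
OPT 2
.absolute
CMP   R0, #0          ; set the flags from R0 - 0
RSBLT R0, R0, #0      ; only if R0 < 0 (signed), R0 = 0 - R0
MOV   PC, R14         ; return, with the result in R0
]
A% = -5
PRINT USR absolute
```

The RSB is always fetched, but only takes effect when the LT condition holds, so non-negative values pass through unchanged.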

The instruction set

The available instructions are introduced below in categories indicating the type of action they perform and their syntax. The description of the syntax obeys the following standards:

« » indicates that the contents of the brackets are optional (unlike all other chapters, where we have been using [ ] instead)
(x|y) indicates that either x or y but not both may be given
#exp indicates that a BASIC expression is to be used which evaluates to an immediate constant. An error is given if the value cannot be stored in the instruction.
Rn indicates that an expression evaluating to a register number (in the range 0 - 15) should be used, or just a register name, e.g. R0. PC may be used for R15.
shift indicates that one of the following shift options should be used:
ASL (Rn|#exp)	Arithmetic shift left by contents of Rn or expression
LSL (Rn|#exp)	Logical shift left
ASR (Rn|#exp)	Arithmetic shift right
LSR (Rn|#exp)	Logical shift right
ROR (Rn|#exp)	Rotate right
RRX	Rotate right one bit with extend

In fact ASL and LSL are synonyms for the same operation (because the ARM does not handle overflow for signed arithmetic shifts). LSL is the preferred form, as it indicates the functionality.
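
As a brief example of a shifted operand, the sketch below multiplies a number by five using a single ADD, by adding the value to itself shifted left two places. The label times_five is invented, and the routine assumes that A% arrives in R0 and that USR returns R0.

```basic
DIM code% 15
P% = code%
[
OPT 2
.times_five
ADD R0, R0, R0, LSL #2     ; R0 = R0 + (R0 * 4)
MOV PC, R14
]
A% = 7
PRINT USR times_five
```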

Moves

Syntax:

opcode«cond»«S» Rd, (#exp|Rm)«,shift»

There are two move instructions. 'Op2' means '(#exp|Rm)«,shift»':

Instruction	Calculation performed
MOV	Move	Rd = Op2
MVN	Move NOT	Rd = NOT Op2

Each of these instructions produces a result which it places in a destination register (Rd). The instructions do not affect bytes in memory directly.

Again, all of these instructions can be performed conditionally. In addition, if the 'S' is present, they can cause the condition codes to be set or cleared. These instructions set N and Z from the ALU, C from the shifter (but only if it is used), and do not affect V.

Examples:

MOV R0, #10 ; Load R0 with the value 10.

Special actions are taken if the source register is R15; the action is as follows:

If Rm=R15 all 32 bits of R15 are used in the operation, i.e. the PC + PSR.

If the destination register is R15, then the action depends on whether the optional 'S' has been used:

If S is not present only the 24 bits of the PC are set.
If S is present the whole result is written to R15, and the flags are updated from the result. (However, the mode, I and F bits can only be changed when in non-user modes.)

Arithmetic and logical instructions

Syntax:

opcode«cond»«S» Rd, Rn, (#exp|Rm)«,shift»

The instructions available are given below; again, 'Op2' means '(#exp|Rm)«,shift»':

Instruction	Calculation performed
ADC	Add with carry	Rd = Rn + Op2 + C
ADD	Add without carry	Rd = Rn + Op2
SBC	Subtract with carry	Rd = Rn - Op2 - (1 - C)
SUB	Subtract without carry	Rd = Rn - Op2
RSC	Reverse subtract with carry	Rd = Op2 - Rn - (1 - C)
RSB	Reverse subtract without carry	Rd = Op2 - Rn
AND	Bitwise AND	Rd = Rn AND Op2
BIC	Bitwise AND NOT	Rd = Rn AND NOT (Op2)
ORR	Bitwise OR	Rd = Rn OR Op2
EOR	Bitwise EOR	Rd = Rn EOR Op2

Each of these instructions produces a result which it places in a destination register (Rd). The instructions do not affect bytes in memory directly.

As was seen above, all of these instructions can be performed conditionally. In addition, if the 'S' is present, they can cause the condition codes to be set or cleared. The condition codes N, Z, C and V are set by the arithmetic logic unit (ALU) in the arithmetic operations. The logical (bitwise) operations set N and Z from the ALU, C from the shifter (but only if it is used), and do not affect V.

Examples:

ADDEQ   R1, R1, #7      ; If the zero flag is set then add 7
                        ; to the contents of register R1.
SBCS R2, R3, R4         ; Subtract with carry the contents of register R4 from
                        ; the contents of register R3 and place the result in
                        ; register R2. The flags will be updated.
AND R3, R1, R2, LSR #2  ; Perform a logical AND on the contents of register R1
                        ; and the contents of register R2 / 4, and place the
                        ; result in register R3.

Special actions are taken if any of the source registers are R15; the action is as follows:

If Rm=R15 all 32 bits of R15 are used in the operation i.e. the PC + PSR.
If Rn=R15 only the 24 bits of the PC are used in the operation.

If the destination register is R15, then the action depends on whether the optional 'S' has been used:

If S is not present only the 24 bits of the PC are set.
If S is present the whole result is written to R15, and the flags are updated from the result. (However, the mode, I and F bits can only be changed when in non-user modes.)
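
The effect of the 'S' suffix and of the carry flag can be seen in the sketch below, which returns the carry out of a 32-bit addition. The label carry_out is invented, and the routine assumes that A% and B% arrive in R0 and R1 and that USR returns R0.

```basic
DIM code% 31
P% = code%
[
OPT 2
.carry_out
ADDS R0, R0, R1       ; R0 = R0 + R1, and the S updates N, Z, C and V
MOV  R2, #0           ; no S, so the flags are left alone
ADC  R0, R2, #0       ; R0 = 0 + 0 + C, the carry out of the addition
MOV  PC, R14
]
A% = -1 : B% = 1      : REM &FFFFFFFF + 1 carries out of bit 31
PRINT USR carry_out
```

The same ADDS and ADC pairing is the usual basis for adding numbers wider than 32 bits, with ADC applied to the upper words.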

Comparisons

Syntax:

opcode«cond»«S|P» Rn, (#exp|Rm)«,shift»

There are four comparison instructions; again, 'Op2' means '(#exp|Rm)«,shift»':

Instruction	Calculation performed
CMN	Compare negated	Rn + Op2
CMP	Compare	Rn - Op2
TEQ	Test equal	Rn EOR Op2
TST	Test	Rn AND Op2

These are similar to the arithmetic and logical instructions listed above except that they do not take a destination register since they do not return a result. Also, they automatically set the condition flags (since they would perform no useful purpose if they didn't). Hence, the 'S' of the arithmetic instructions is implied. You can put an 'S' after the instruction to make this clearer.

These instructions have an additional function, which is to set the whole of the PSR to a given value. This is done by using a 'P' after the opcode, for example TEQP.

Normally the flags are set depending on the value of the comparison. The I and F bits and the mode and register bits are unaltered. The 'P' option allows the corresponding eight bits of the result of the calculation performed by the comparison to overwrite those in the PSR (or just the flag bits in user mode).

Example

        TEQP    PC, #&80000000      ; Set N flag, clear all others. Also enable
                                    ; IRQs, FIQs, select User mode if privileged

The above example (as well as setting the N flag and clearing the others) will alter the IRQ, FIQ and mode bits of the PSR - but only if you are in a privileged mode.

The 'P' option is also useful in user mode, for example to collect errors:

        STMFD   sp!, {r0, r1, r14}
        ...
        BL      routine1
        STRVS   r0, [sp, #0]                    ; save error block ptr in return r0
                                                ; in stack frame if error
        MOV     r1, pc                          ; save psr flags in r1
        BL      routine2                        ; called even if error from routine1
        STRVS   r0, [sp, #0]                    ; to do some tidy up action etc.
        TEQVCP  r1, #0                          ; if routine2 didn't give error,
        LDMFD   sp!, {r0, r1, pc}               ; restore error indication from r1

Multiply instructions

Syntax:

MUL«cond»«S» Rd,Rm,Rs
MLA«cond»«S» Rd,Rm,Rs,Rn

There are two multiply instructions:

Instruction	Calculation performed
MUL	Multiply	Rd = Rm × Rs
MLA	Multiply-accumulate	Rd = Rm × Rs + Rn

The multiply instructions perform integer multiplication, giving the least significant 32 bits of the product of two 32-bit operands.

The destination register must not be R15 or the same as Rm. Any other register combinations can be used.

If the 'S' is given in the instruction, the N and Z flags are set on the result, and the C and V flags are undefined.

Examples:

        MUL     R1,R2,R3
        MLAEQS  R1,R2,R3,R4
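
The sketch below shows one way of multiplying two values passed from BASIC while respecting the restriction that Rd must differ from Rm. The label product is invented, and the routine assumes that A% and B% arrive in R0 and R1 and that USR returns R0.

```basic
DIM code% 15
P% = code%
[
OPT 2
.product
MUL R0, R1, R0     ; Rd = R0, Rm = R1, Rs = R0, so Rd differs from Rm
MOV PC, R14
]
A% = 6 : B% = 7
PRINT USR product
```

Writing MUL R0, R0, R1 instead would make Rd the same as Rm, which is the combination the text above rules out.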

Branching instructions

Syntax:

B«cond» expression
BL«cond» expression

There are essentially only two branch instructions but in each case the branch can take place as a result of any of the 15 usable condition codes:

Instruction
B	Branch
BL	Branch and link

The branch instruction causes the execution of the code to jump to the instruction given at the address to be branched to. This address is held relative to the current location.

Example:

        BEQ label1  ; branch if zero flag set
        BMI minus   ; branch if negative flag set

The branch and link instruction performs the additional action of copying the address of the instruction following the branch, and the current flags, into register R14. R14 is known as the 'link register'. This means that the routine branched to can be returned from by transferring the contents of R14 into the program counter and can restore the flags from this register on return. Hence instead of being a simple branch the instruction acts like a subroutine call.

Example:

        BLEQ equal
        .........               ; address of this instruction
        .........               ; moved to R14 automatically
.equal  .........               ; start of subroutine
        .........
        MOVS R15,R14            ; end of subroutine

Single register load/save instructions

Syntax:

opcode«cond»«B»«T» Rd, address

The single register load/save instructions are as follows:

Instruction
LDR	Load register
STR	Store register

These instructions allow a single register to load a value from memory or save a value to memory at a given address.

The instruction has two possible forms:

the address is specified by register(s), whose names are enclosed in square brackets
the address is specified by an expression

Address given by registers

The simplest form of address is a register number, in which case the contents of the register are used as the address to load from or save to. There are two other alternatives:

pre-indexed addressing (with optional write back)
post-indexed addressing (always with write back)

With pre-indexed addressing the contents of another register, or an immediate value, are added to the contents of the first register. This sum is then used as the address. It is known as pre-indexed addressing because the address being used is calculated before the load/save takes place. The first register (Rn below) can be optionally updated to contain the address which was actually used by adding a '!' after the closing square bracket.

Address syntax	Address
[Rn]	Contents of Rn
[Rn,#m]«!»	Contents of Rn + m
[Rn,«-»Rm]«!»	Contents of Rn ± contents of Rm
[Rn,«-»Rm,shift #s]«!»	Contents of Rn ± (contents of Rm shifted by s places)

With post-indexed addressing the address being used is given solely by the contents of the register Rn. The rest of the instruction determines what value is written back into Rn. This write back is performed automatically; no '!' is needed. Post-indexing gets its name from the fact that the address that is written back to Rn is calculated after the load/save takes place.

Address syntax	Value written back
[Rn],#m	Contents of Rn + m
[Rn],«-»Rm	Contents of Rn ± contents of Rm
[Rn],«-»Rm,shift #s	Contents of Rn ± (contents of Rm shifted by s places)
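
Post-indexed addressing is convenient for stepping through a table, as in the sketch below, which adds up four words stored after the code. The labels sum_words, loop and table and the table contents are invented for the example, and USR is assumed to return R0.

```basic
DIM code% 63
FOR pass% = 0 TO 2 STEP 2
  P% = code%
  [
  OPT pass%
  .sum_words
  ADR   R1, table           ; R1 = address of the first word
  MOV   R0, #0              ; running total
  MOV   R2, #4              ; number of words to add
  .loop
  LDR   R3, [R1], #4        ; load from R1, then R1 = R1 + 4
  ADD   R0, R0, R3
  SUBS  R2, R2, #1          ; count down, setting Z when the count reaches zero
  BNE   loop
  MOV   PC, R14             ; return the total in R0
  .table
  EQUD 10 : EQUD 20 : EQUD 30 : EQUD 40
  ]
NEXT pass%
PRINT USR sum_words
```

A pre-indexed form such as LDR R3, [R1, #4]! would instead advance R1 before each load, so R1 would have to start one word below the table.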

Address given as an expression

If the address is given as a simple expression, the assembler will generate a pre-indexed instruction using R15 (the PC) as the base register. If the address is out of the range of the instruction (±4095 bytes), an error is given.

Options

If the 'B' option is specified after the condition, only a single byte is transferred, instead of a whole word. The top 3 bytes of the destination register are cleared by an LDRB instruction.

If the 'T' option is specified after the condition, then the TRANs pin on the ARM processor will be active during the transfer, forcing an address translation. This allows you to access User mode memory from a privileged mode. This option is invalid for pre-indexed addressing.

Using the program counter

If you use the program counter (PC, or R15) as one of the registers, a number of special cases apply:

the PSR is never modified, even when Rd or Rn is the PC
the PSR flags are not used when the PC is used as Rn, and (because of pipelining) it will be advanced by eight bytes from the current instruction
the PSR flags are used when the PC is used as Rm, the offset register.

Multiple load/save instructions

Syntax:

opcode«cond»type Rn«!», {Rlist}«^»

These instructions allow the loading or saving of several registers:

Instruction
LDM	Load multiple registers
STM	Store multiple registers

The contents of register Rn give the base address from/to which the value(s) are loaded or saved. This base address is effectively updated during the transfer, but is only written back to if you follow it with a '!'.

Rlist provides a list of registers which are to be loaded or saved. The order in which the registers are given in the list is irrelevant, since the lowest numbered register is loaded/saved first, and the highest numbered one last. For example, a list comprising {R5,R3,R1,R8} is loaded/saved in the order R1, R3, R5, R8, with R1 occupying the lowest address in memory. You can specify consecutive registers as a range; so {R0-R3} and {R0,R1,R2,R3} are equivalent.

The type is a two-character mnemonic specifying either how Rn is updated, or what sort of a stack results:

Mnemonic	Meaning
DA	Decrement Rn After each store/load
DB	Decrement Rn Before each store/load
IA	Increment Rn After each store/load
IB	Increment Rn Before each store/load

EA	Empty Ascending stack is used
ED	Empty Descending stack is used
FA	Full Ascending stack is used
FD	Full Descending stack is used

an empty stack is one in which the stack pointer points to the first free slot in it
a full stack is one in which the stack pointer points to the last data item written to it
an ascending stack is one which grows from low memory addresses to high ones
a descending stack is one which grows from high memory addresses to low ones

In fact these are just different ways of looking at the situation - the way Rn is updated governs what sort of stack results, and vice versa. So, for each type of instruction in the first group there is an equivalent in the second:

LDMEA is the same as LDMDB
LDMED is the same as LDMIB
LDMFA is the same as LDMDA
LDMFD is the same as LDMIA

STMEA is the same as STMIA
STMED is the same as STMDA
STMFA is the same as STMIB
STMFD is the same as STMDB

All Acorn software uses an FD (full, descending) stack. If you are writing code for SVC mode you should try to use a full descending stack as well - although you can use any type you like.

A '^' at the end of the register list has two possible meanings:

For a load with R15 in the list, the '^' forces update of the PSR.
Otherwise the '^' forces the load/store to access the User mode registers. The base is still taken from the current bank though, and if you try to write back the base it will be put in the User bank - probably not what you would have intended.

Examples:

        LDMIA R5, {R0,R1,R2}    ; where R5 contains the value &1484
                                ; This will load R0 from &1484
                                ;                R1 from &1488
                                ;                R2 from &148C
        LDMDB R5, {R0-R2}       ; where R5 contains the value &1484
                                ; This will load R0 from &1478
                                ;                R1 from &147C
                                ;                R2 from &1480

If there were a '!' after R5, so that it were written back to, then this would leave R5 containing &1490 and &1478 after the first and second examples respectively.

The examples below show directly equivalent ways of implementing a full descending stack. The first uses mnemonics describing how the stack pointer is handled:

        STMDB Stackpointer!, {R0-R3}    ; push onto stack
        ...
        LDMIA Stackpointer!, {R0-R3}    ; pull from stack

and the second uses mnemonics describing how the stack behaves:

        STMFD Stackpointer!, {R0,R1,R2,R3}      ; push onto stack
        ...
        LDMFD Stackpointer!, {R0,R1,R2,R3}      ; pull from stack
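
A common use of these instructions is to preserve the link register across nested subroutine calls. The sketch below assumes that, when a routine is entered from BASIC with CALL or USR, R13 points to a full descending stack that the routine may use. The labels outer and double are invented for the example.

```basic
DIM code% 63
FOR pass% = 0 TO 2 STEP 2
  P% = code%
  [
  OPT pass%
  .outer
  STMFD R13!, {R14}        ; push the return address, since BL will overwrite R14
  BL    double             ; forward reference, hence the two passes
  BL    double
  LDMFD R13!, {PC}         ; pop the return address straight into the PC
  .double
  ADD   R0, R0, R0         ; R0 = R0 * 2
  MOV   PC, R14            ; return to the caller
  ]
NEXT pass%
A% = 3
PRINT USR outer
```

Without the STMFD and LDMFD pair, the first BL would overwrite the address needed to return to BASIC.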

Using the base register

You can always load the base register without any side effects on the rest of the LDM operation, because the ARM uses an internal copy of the base, and so will not be aware that it has been loaded with a new value.
However, you should see Appendix B: Warnings on the use of ARM assembler in the Programmers' Reference Manual for notes on using writeback when doing so.
You can store the base register as well. If you are not using write back then no problem will occur. If you are, then this is the order in which the ARM does the STM:
- write the lowest numbered register to memory
- do the write back
- write the other registers to memory in ascending order.
So, if the base register is the lowest-numbered one in the list, its original value is stored:
```
        STMIA   R2!, {R2-R6}              ; R2 stored is value before write back
```
Otherwise its written back value is stored:
```
        STMIA   R2!, {R1-R5}              ; R2 stored is value after write back
```
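
The ordering rule can be modelled in a few lines of Python. This is a sketch of the sequence as described above, not a statement about any particular ARM implementation; the function name and the register values are invented for the illustration.

```python
def stmia_writeback(regs, base, reg_list):
    """Model STMIA Rbase!, {reg_list} using the order given above.

    regs maps register number -> value. Returns the words stored,
    lowest address first.
    """
    reg_list = sorted(reg_list)
    address = regs[base]
    stored = [regs[reg_list[0]]]                   # 1. lowest-numbered register
    regs[base] = address + 4 * len(reg_list)       # 2. write back
    stored += [regs[r] for r in reg_list[1:]]      # 3. the rest, ascending
    return stored


# STMIA R2!, {R2-R6} with R2 = &1000
regs_a = {n: 100 + n for n in range(16)}
regs_a[2] = 0x1000
first = stmia_writeback(regs_a, 2, range(2, 7))

# STMIA R2!, {R1-R5} with R2 = &1000
regs_b = {n: 100 + n for n in range(16)}
regs_b[2] = 0x1000
second = stmia_writeback(regs_b, 2, range(1, 6))
```

In the first case the word stored for R2 is &1000; in the second it is &1014, the value after write back.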

Using the program counter

If you use the program counter (PC, or R15) in the list of registers:

- the PSR is saved with the PC, and (because of pipelining) the saved PC value will be advanced by twelve bytes from the current position
- the PSR is only loaded if you follow the register list with a '^', and even then only the bits you can modify in the ARM's current mode are loaded.

It is generally not sensible to use the PC as the base register. If you do, the PSR bits are used as part of the address, which will give an address exception unless all the flags are clear and all interrupts are enabled.

SWI instruction

Syntax:

SWI«cond» expression

SWI«cond» "SWIname" (BBC BASIC assembler)

The SWI mnemonic stands for SoftWare Interrupt. On encountering a SWI, the ARM processor changes into SVC mode and stores the address of the next location in R14_svc - so the User mode value of R14 is not corrupted. The ARM then goes to the SWI routine handler via the hardware SWI vector containing its address.

The first thing that this routine does is to discover which SWI was requested. It finds this out by using the location addressed by (R14_svc - 4) to read the current SWI instruction. The opcode for a SWI is 32 bits long; 4 bits identify the opcode as being for a SWI, 4 bits hold the condition code and the bottom 24 bits identify which SWI it is. Hence 2^24 (16,777,216) different SWI routines can be distinguished.
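
The field layout can be illustrated with a short Python sketch. The bit positions used here (condition code in the top four bits, the value &F in the next four marking a SWI) follow the standard ARM encoding; the function name `decode_swi` and the sample opcode are made up for the example.

```python
def decode_swi(instruction):
    """Split a 32-bit SWI opcode into its condition code and SWI number."""
    condition = (instruction >> 28) & 0xF     # top 4 bits: condition code
    opcode = (instruction >> 24) & 0xF        # next 4 bits: &F marks a SWI
    number = instruction & 0xFFFFFF           # bottom 24 bits: which SWI
    if opcode != 0xF:
        raise ValueError("not a SWI instruction")
    return condition, number


cond, number = decode_swi(0xEF000002)   # condition &E is AL (always)
swi_count = 1 << 24                     # how many SWIs 24 bits can distinguish
```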

When it has found which particular SWI it is, the routine executes the appropriate code to deal with it and then returns by placing the contents of R14_svc back into the PC, which restores the mode the caller was in.

This means that R14_svc will be corrupted if you execute a SWI in SVC mode - which can have disastrous consequences unless you take precautions.

The most common way to call this instruction is by using the SWI name, and letting the assembler translate this to a SWI number. The BBC BASIC assembler can do this translation directly:

        SWINE   "OS_WriteC"

See the chapter entitled An Introduction to SWIs in the Programmers' Reference Manual for a full description of how RISC OS handles SWIs, and the index of SWIs for a full list of the operating system SWIs.


`EQUB int`	Define 1 byte of memory from LSB of `int`	(DCB, =)
`EQUW int`	Define 2 bytes of memory from `int`	(DCW)
`EQUD int`	Define 4 bytes of memory from `int`	(DCD)
`EQUS str`	Define 0 - 255 bytes as required by string expression `str`	(DCS)
`ALIGN`	Align P% (and O%) to the next word (4 byte) boundary
`ADR reg,addr`	Assemble instruction to load `addr` into `reg`

AL	Always	This is the default
CC	Carry clear	C clear
CS	Carry set	C set
EQ	Equal	Z set
GE	Greater than or equal	(N set and V set) or (N clear and V clear)
GT	Greater than	((N set and V set) or (N clear and V clear)) and Z clear
HI	Higher (unsigned)	C set and Z clear
LE	Less than or equal	(N set and V clear) or (N clear and V set) or Z set
LS	Lower or same (unsigned)	C clear or Z set
LT	Less than	(N set and V clear) or (N clear and V set)
MI	Negative	N set
NE	Not equal	Z clear
NV	Never
PL	Positive	N clear
VC	Overflow clear	V clear
VS	Overflow set	V set

LO	Lower unsigned	is equivalent to CC
HS	Higher / same unsigned	is equivalent to CS
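
The flag tests in the table above can be written out as a lookup. This Python sketch is a model for checking your understanding, not part of the assembler; the function name `condition_passes` is an assumption, and the four flags are passed in as booleans.

```python
def condition_passes(code, n, z, c, v):
    """Return True if condition `code` is satisfied by flags N, Z, C, V."""
    tests = {
        "AL": True,
        "NV": False,
        "EQ": z,
        "NE": not z,
        "CS": c, "HS": c,              # HS is equivalent to CS
        "CC": not c, "LO": not c,      # LO is equivalent to CC
        "MI": n,
        "PL": not n,
        "VS": v,
        "VC": not v,
        "HI": c and not z,
        "LS": (not c) or z,
        "GE": n == v,                  # N and V both set or both clear
        "LT": n != v,
        "GT": (n == v) and not z,
        "LE": (n != v) or z,
    }
    return bool(tests[code])
```

Note how the signed tests (GE, LT, GT, LE) compare N with V, while the unsigned ones (HI, LS, HS, LO) use C.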

« »	indicates that the contents of the brackets are optional (unlike all other chapters, where we have been using [ ] instead)
(x|y)	indicates that either x or y but not both may be given
#exp	indicates that a BASIC expression is to be used which evaluates to an immediate constant. An error is given if the value cannot be stored in the instruction.
Rn	indicates that an expression evaluating to a register number (in the range 0 - 15) should be used, or just a register name, e.g. R0. PC may be used for R15.
shift	indicates that one of the following shift options should be used:
	ASL	(Rn|#exp)	Arithmetic shift left by contents of Rn or expression
	LSL	(Rn|#exp)	Logical shift left
	ASR	(Rn|#exp)	Arithmetic shift right
	LSR	(Rn|#exp)	Logical shift right
	ROR	(Rn|#exp)	Rotate right
	RRX		Rotate right one bit with extend
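
As a rough guide to what each shift option does to a 32-bit value, here is a Python model. It is a sketch under stated assumptions: shift amounts are taken to be in the range 1 to 31, carry handling is shown only for RRX, and the function names are illustrative.

```python
MASK = 0xFFFFFFFF                 # keep results to 32 bits


def lsl(value, n):
    """Logical shift left; ASL produces the same result."""
    return (value << n) & MASK


def lsr(value, n):
    """Logical shift right: zeros enter at the top."""
    return (value & MASK) >> n


def asr(value, n):
    """Arithmetic shift right: copies of the sign bit enter at the top."""
    value &= MASK
    if value & 0x80000000:
        value -= 1 << 32          # reinterpret as a negative number
    return (value >> n) & MASK


def ror(value, n):
    """Rotate right: bits leaving the bottom re-enter at the top."""
    value &= MASK
    return ((value >> n) | (value << (32 - n))) & MASK


def rrx(value, carry):
    """Rotate right one bit through carry; returns (result, new_carry)."""
    value &= MASK
    return (value >> 1) | (carry << 31), value & 1
```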

LDMEA	is the same as	LDMDB
LDMED	is the same as	LDMIB
LDMFA	is the same as	LDMDA
LDMFD	is the same as	LDMIA

STMEA	is the same as	STMIA
STMED	is the same as	STMDA
STMFA	is the same as	STMIB
STMFD	is the same as	STMDB
