RISCOS.com
www.riscos.com Technical Support:
Acorn Assembler

CPU instruction set

This chapter describes the CPU instructions available in ObjAsm. It includes instruction formats, assembler syntax, and a synopsis of each instruction.

The condition field

All ARM instructions are conditionally executed, which means that they will only be executed if the N, Z, C and V flags in the PSR are in the correct state at the end of the preceding instruction. The condition is encoded in a four bit condition field, held in bits 28 - 31 of an instruction. By default ObjAsm encodes the 'always execute' condition; other conditions can be requested by appending a two-character condition mnemonic to ObjAsm's mnemonic for an instruction.

The figure below shows the condition codes, their mnemonics, and the corresponding conditions under which the instruction is executed:

The condition field

Note that ObjAsm implements HS (Higher or Same) and LO (LOwer than) as synonymous with CS and CC respectively, giving it a total of 17 condition mnemonics.

For example, suppose you had a CMP (compare) instruction followed by an instruction with the EQ condition (so it is executed only if the Z flag is set):

If the CMP instruction's two operands were equal, it would set the Z flag, and so your conditional instruction would be executed.
If the CMP instruction's two operands were different, it would clear the Z flag, and so your conditional instruction would not be executed.

Conditional instruction sequence

Branches which are taken cause breaks in the pipeline. For this reason they often waste time, and can sometimes be replaced by a suitable conditional instruction sequence.

As an example, the coding of IF A=4 THEN B:=A ELSE C:=D+E might be conventionally achieved using five ARM instructions:

        CMP     R5,#4           ; test "A=4"
        BNE     LABEL           ; if not equal goto LABEL
        MOV     R6,R5           ; do "B:=A"
        B       LAB2            ; jump around the ELSE clause
LABEL   ADD     R0,R1,R2        ; do "C:=D+E"
LAB2                            ; finish

whereas, using the condition testing instructions, the same effect may be achieved using three instructions:

        CMP     5,#4            ; test "A=4"
        MOVEQ   R6,R5           ; if so do "B:=A"
        ADDNE   R0,R1,R2        ; else do "C:=D+E".

If the condition tested is true, the instruction is performed. If it is false, the instruction is skipped and the PC is advanced to the next memory word, which takes little processor time. The first of the examples above takes about twice as long as the second.

After the instruction is obeyed, the arithmetic logic unit (ALU) will output appropriate signals on the flag lines. On certain instructions, the flags set the condition code bits in the PSR; for other instructions, the flags in the PSR are only altered if the programmer permits them to be updated.

Instruction timings

Instruction timings can vary between versions of the ARM processor, and so we do not detail them here. For code that is timing dependent, we advise that you consult the datasheets for all ARM versions on which your code may run.

The barrel shifter

The arithmetic logic unit has a 32-bit barrel shifter capable of various shift and rotate operations. Data involved in the data processing group of instructions (detailed in the Data processing) may pass through the barrel shifter, either as a direct consequence of the programmer's actions, or as a result of the internal computations of ObjAsm. The barrel shifter also affects the index for the single data transfer instructions (detailed in the Single data transfer (LDR, STR)).

The barrel shifter has a carry in, which takes its input from the C flag of the PSR; and a carry out, which may be latched back into the C bit of the PSR for logical data operations (see The S bit).

The shift mechanism can produce the following types of operand:

Unshifted register

Syntax: register
For example: RO

Register shifted by a constant amount

A register shifted by a constant amount, in the range 0-31, 1-31 or 1-32 (depending on shift type).

Syntax: register, shift-type #amount
For example: R0,LSR #1

Value resulting from rotating register and carry bit one bit right

A value which is the result of rotating a register and the carry bit one bit right. Because the carry is included in the shift, 33 bits (rather than 32 bits) are affected. The shift type is known as rotate right extended.

Syntax: register,RRX
For example: R0,RRX

Register shifted by n bits

A register shifted by n bits, where n is the least significant byte of a register. This form is not valid as an index in a single register transfer.

Syntax: register,shift-type register
For example: R1,LSL R2

8-bit constant rotated right by 2n bits

A constant constructed by rotating an 8-bit constant right by 2n bits, where n is a 4-bit constant. The shift type is always rotate right. This form is not valid as an index in a single register transfer.

Syntax: #expression
For example: #&3FC

Note that the rotation is invisible to the programmer, who should merely supply an immediate value for the data processing instruction to use.

ObjAsm will evaluate the expression and reject any number which cannot be expressed as a rotation by an even amount of a number in the range 0-255. If possible, ObjAsm always constructs it as an unrotated value, even if there are other possibilities.

Examples of valid immediate constants are:

#1
#&FF
#&3FC       This is &FF rotated right by 30
#&80000000  This is 2 rotated right by 2
#&FC000003  This is &FF rotated right by 6.

Examples of invalid constants are:

#&101       cannot be obtained by rotating an 8-bit value
#&1FE       an 8-bit value rotated by an odd amount - but not an
            8-bit value rotated by an even amount.

8-bit constant rotated right by 2n bits and specified explicitly

A constant constructed as in the point above, but specified explicitly. This form is not valid as an index in a single register transfer.

Syntax: #constant, rotate amount
For example: #4,2

The shift amount should be an even number in the range 0-30. This can be important for setting the carry flag on an operation which would otherwise not update it.

For example:

MOVS    R0, #4,2        produces the same result as
MOVS    R0, #1

but because the first instruction does a rotate right of two bits the carry flag is cleared, whereas it is not altered by the second instruction.

Shift types

Various instructions use the barrel shifter to shift register operands. The effects of such shifts are detailed in this section, rather than being repeated for each instruction.

Mnemonics

There are six assembler mnemonics for shift types, used to control the barrel shifter. These are:

`LSL`	Logical Shift Left
`ASL`	Arithmetic Shift Left
`LSR`	Logical Shift Right
`ASR`	Arithmetic Shift Right
`ROR`	Rotate Right
`RRX`	Rotate Right with Extend

The mnemonic ASL (arithmetic shift left) may be freely interchanged with LSL (logical shift left).

Specification of the shift amount

The shift amount may either be specified in the instruction, or in a register specified by the instruction.

Instruction specified shift amount

When the shift amount is specified in the instruction, it is contained in a 5 bit field which may take any value from 0 to 31.

Register specified shift amount

Only the least significant byte of the contents of Rs is used to determine the shift amount.

If this byte is zero, the unchanged contents of Rm will be used as the second operand, and the old value of the PSR C flag will be passed on as the shifter carry output.

If the byte has a value between 1 and 31, the shifted result will exactly match that of an instruction specified shift with the same value and shift operation.

If the value in the byte is 32 or more, the result will be a logical extension of the shifting process. This is detailed for each mnemonic described below.

Logical shift left, or arithmetic shift left

`Rm,LSL #n` or `Rm,ASL #n`	Shift contents of Rm left by n bits, where n is 0 to 31.
`Rm,LSL Rs` or `Rm,ASL Rs`	Shift contents of Rm left by the least significant byte of Rs.

A logical shift left (LSL) takes the contents of Rm and moves each bit by the specified amount to a more significant position. The least significant bits of the result are filled with zeroes. The high bits of Rm which do not map into the result are discarded - except that the least significant discarded bit becomes the barrel shifter's carry out.

For example, the effect of LSL #5 is:

A logical or arithmetic shift left by 5

Special cases

LSL #0 or ASL #0, and LSL Rs or ASL Rs where Rs is 0:
The barrel shifter's result is the unchanged contents of Rm, and its carry out is the old value of the PSR C flag.
LSL Rs or ASL Rs where Rs is 32:
The result is zero, and the carry out is bit 0 of Rm.
LSL Rs or ASL Rs where Rs is greater than 32:
Both the result and the carry out are zero.

Logical shift right

`Rm,LSR #n`	Shift contents of Rm right by n bits, where n is 1 to 32.
`Rm,LSR Rs`	Shift contents of Rm right by the least significant byte of Rs.

A logical shift right (LSR) is similar to a logical shift left, but the contents of Rm are moved to less significant positions in the result. LSR #5 has this effect:

A logical shift right by 5

Logical shift right zero is redundant as it is the same as logical shift left zero. The form of the shift field which might be expected to correspond to LSR #0 is therefore used to encode LSR #32. ObjAsm assembles LSR #0 (and ASR #0 and ROR #0) as LSL #0, and allows you to specify LSR #32.

Special cases

LSR #0:
This is assembled as LSL #0 (see Logical shift left, or arithmetic shift left), which has the same effect as LSR #0.
LSR Rs where Rs is 0:
The barrel shifter's result is the unchanged contents of Rm, and its carry out is the old value of the PSR C flag.
LSR #32, or LSR Rs where Rs is 32:
The result is zero, and the carry out is bit 31 of Rm. (LSR #32 is encoded in the format you would expect to correspond to LSR #0.)
LSR Rs where Rs is greater than 32:

Both the result and the carry out are zero.

Arithmetic shift right

`Rm,ASR #n`	Shift contents of Rm right by n bits, where n is 1 to 32.
`Rm,ASR Rs`	Shift contents of Rm right by the least significant byte of Rs.

An arithmetic shift right (ASR) is similar to a logical shift right, except that the high bits are filled with bit 31 of Rm instead of zeroes. This preserves the sign in 2's complement notation. For example, ASR #5:

An arithmetic shift right by 5

Arithmetic shift right zero is redundant as it is the same as logical shift left zero. The form of the shift field which might be expected to correspond to ASR #0 is therefore used to encode ASR #32. ObjAsm assembles ASR #0 (and LSR #0 and ROR #0) as LSL #0, and allows you to specify ASR #32.

Special cases

ASR #0:
This is assembled as LSL #0 (see Logical shift left, or arithmetic shift left), which has the same effect as ASR #0.
ASR Rs where Rs is 0:
The barrel shifter's result is the unchanged contents of Rm, and its carry out is the old value of the PSR C flag.
ASR #32, or ASR Rs where Rs is 32 or more:
Each bit of the result is equal to bit 31 of Rm; the result is therefore all ones or all zeroes. The carry out is also bit 31 of Rm. (ASR #32 is encoded in the format you would expect to correspond to ASR #0.)

Rotate right

`Rm,ROR #n`	Rotate contents of Rm right by n bits, where n is 1 to 31.
`Rm,ROR Rs`	Rotate contents of Rm right by the least significant byte of Rs.

Rotate right (ROR) operations reuse the bits which 'overshoot' in a logical shift right operation by reintroducing them at the high end of the result, in place of the zeroes used to fill the high end in logical right operations. For example, ROR #5:

A rotate right by 5

Rotate right zero is redundant as it is the same as logical shift left zero. The form of the shift field which might be expected to correspond to ROR #0 is therefore used to encode rotate right extended (see the next section). ObjAsm assembles ROR #0 (and LSR #0 and ASR #0) as LSL #0.

Special cases

ROR #0:
This is assembled as LSL #0 (see Logical shift left, or arithmetic shift left), which has the same effect as ROR #0.
ROR Rs where Rs is 0:
The barrel shifter's result is the unchanged contents of Rm, and its carry out is the old value of the PSR C flag.
ROR Rs where Rs is 32:
The result is equal to Rm, and the carry out is bit 31 of Rm.
ROR Rs where Rs is greater than 32:
The result and carry out are the same as for ROR ((Rs - 1) MOD 32 + 1); therefore repeatedly subtract 32 from Rs until its value is in the range 1 to 32, and then see above.

Rotate right with extend

Rm,RRX Rotate contents of Rm and the carry flag right by 1 bit only.

The form of the shift field which might be expected to give ROR #0 is used to encode a special function of the barrel shifter, rotate right extended (RRX). This is a rotate right by one bit position of the 33 bit quantity formed by appending the PSR C flag to the most significant end of the contents of Rm:

A rotate right with extend

Coprocessor instructions

The ARM can work with up to 16 external coprocessors, which (if present) will execute the instructions listed below. If the requested coprocessor is absent, these instructions will be regarded as undefined. The undefined instruction trap can then take appropriate action (for example emulating the requested instruction in software or telling the user that the program won't run in a machine without the coprocessor.)

The floating point coprocessor uses coprocessor numbers 1 and 2. If it's absent, the floating point emulator traps the resulting undefined instructions and emulates them. The coprocessor 15 instructions are used by ARM as instructions to control its own operation (such as cache control, and 26/32 bit configuration).

ObjAsm provides support for coprocessors at two levels. Firstly, it provides a set of generic coprocessor instructions, detailed below. Secondly, it recognises a standard set of floating point instructions and translates them into the appropriate coprocessor instructions; see the chapter entitled Floating point instructions for details.

All the generic coprocessor operations include a coprocessor number symbol and one or more coprocessor register symbols. These should be defined using the CP and CN directives respectively. (See the chapter entitled Directives.)

All coprocessor instructions are conditional. Whether they are executed depends on the ARM's condition flags, not on any coprocessor status register.

Branch, Branch with Link (B, BL)

Instructions for branching to an instruction other than the next one

Instruction format

Assembler syntax

B«L»«cond» expression

where:

is a two-character condition mnemonic; see the chapter entitled The condition field.

`«L»`	requests the Branch with Link form of the instruction (see The link bit below). If absent, R14 will not be affected by the instruction.
`«cond»`
`expression`	is a program-relative expression describing the branch destination, from which ObjAsm calculates the offset.

Synopsis

These instructions branch to an instruction other than the next one, by altering the value of the program counter (R15). The Branch with Link form of the instruction also stores a return address in the link register (R14), so that program flow can branch to a subroutine, and then return to the instruction immediately following the Branch with Link instruction; for more details see The link bit below.

All branches take a signed 2's complement 24 bit word offset. This is shifted left two bits, and added to the program counter, with any overflow being ignored, giving an offset of ±32Mbytes. The branch can therefore reach any word aligned address within a 26 bit address space, since the calculation 'wraps round' between the top and bottom of memory.

When using this instruction with ObjAsm you should provide a label, from which ObjAsm will calculate the 24 bit offset.

The encoded offset must take account of the effects of pipelining and prefetching within the CPU, which causes the PC to be two words ahead of the current instruction. ObjAsm automatically handles this for you. For example, the calculated jump offset in the following piece of code is 000000, even though the jump is to a label two PC locations ahead.

Code generated	Label	Mnemonic	Destination
`EA000000`	`L1`	`BEQ`	`L2`
`xxxxxxxx`		`xxx`
`xxxxxxxx`	`L2`	`xxx`

The instruction is only executed if the condition specified in the condition field is true (see the chapter entitled The condition field).

The link bit

Branch with Link works in the same way as Branch, but it also writes the old PC and PSR into the link register (R14) of the current bank. The PC value written is first adjusted to allow for the prefetch, and contains the address of the instruction following the branch and link instruction.

This form of the instruction is often used for branching to subroutines. At the end of the subroutine the program flow can return to the instruction immediately following the Branch with Link instruction by writing the link register (R14) value back into the program counter (R15). To do so, the subroutine should end with:

MOV PC,R14

if the link register has not been saved on a stack, or:

LDMxx Rn,{PC}

if the link register has been saved on a stack addressed by Rn. (xx is the stack type; see the chapter entitled Block data transfer (LDM, STM).)

These methods of returning do not restore the original PSR. If the PSR does need to be restored then

MOV PC,R14

can be replaced by

MOVS PC,R14

LDMxx Rn,{PC}

LDMxx Rn,{PC}^

However, care should be taken when using these methods in modes other than user mode, as they will also restore the mode and the interrupt bits. In particular, restoring the interrupt bits may interfere unintentionally with the interrupt system.

32 bit operation

Calculating the offset

In 32 bit operation, the offset is sign extended to 32 bits before it is added to the program counter.

Branches beyond ±32Mbytes must use an offset or an absolute destination which has previously been loaded into a register. In this case you should manually save the PC in R14 if you require a Branch with Link type operation.

The link bit

Branch with Link does not save the CPSR with the PC. If you need to preserve the CPSR over a subroutine, it is your responsibility to explicitly save and restore it, either on entry to and exit from (respectively) the subroutine, or in the calling part of the program.

Examples

here    BAL     here            ; Assembles to EAFFFFFE
                                ; (note effect of PC offset)

        B       there           ; ALways condition used as default

        CMP     R1,#0           ; Compare register 1 with zero
        BEQ     fred            ; Branch to fred if register 1 was zero,
                                ; otherwise continue to next instruction

        BL      sub + ROM       ; Unconditionally call subroutine at
                                ; computed address

        ADDS    R1,#1           ; Add 1 to register 1, setting PSR flags on
                                ; the result
        BLCC    sub             ; Call subroutine if the C flag is clear,
                                ; which will be the case unless R1 contained
                                ; FFFFFFFFH
                                ; Otherwise continue to next instruction

Data processing

Instructions for performing arithmetic or logical operation on one or two operands

Instruction format

Assembler syntax

The data processing instructions use three different types of syntax, depending on whether the opcode being used takes one or two operands, and whether or not it writes the result into a destination register:

MOV and MVN - single operand

opcode«cond»«S»Rd,op2

CMN, CMP, TEQ and TST - no result written

opcode«cond»«P»Rn,op2

ADC, ADD, AND, BIC, OR, ORR, RSB, RSC, SBC, SUB - two operands

opcode«cond»«S»Rd,Rn,op2

Parameters

`opcode`	is a mnemonic for the data processing operation to be performed; see Opcodes below
`«cond»`	is a two-character condition mnemonic; see the chapter entitled The condition field.
`«S»`	means to set the PSR's condition codes from the operation. ObjAsm forces this for CMN, CMP, TEQ and TST, provided the P flag is not specified. See Opcodes below for a summary of the flags affected by each opcode, and The S bit for more detail.
`«P»`	means to take the result of a CMN, CMP, TEQ or TST operation, and move it to the bits of R15 that hold the PSR - even though the instruction has no destination register. Bits corresponding to the PC are masked out, as are (in User mode) the I, F, and mode bits.
`Rd`, `Rn` & `Rm`	are expressions evaluating to a valid ARM register number.
`op2`	may be any of the operands that the barrel shifter can produce. The syntax is `Rm«,shift»` or `#expression` If `#expression` is used, ObjAsm will attempt to match the expression by generating a shifted immediate 8-bit field. If this is impossible, it will give an error. `shift` is `shiftname Rs` or `shiftname #expression`, or `RRX` (rotate right one bit with extend). `shiftname`s are: `ASL`, `LSL`, `LSR`, `ASR`, and `ROR`. (ASL is a synonym for LSL, and the two assemble to the same code.) See Shift types.

Opcodes

The opcodes supported are:

Assembler Mnemonic	Meaning	Operation	Flags affected
ADC	Add with Carry	Rd:=Rn + op2 + C flag	N,Z,C,V
ADD	Add	Rd:=Rn + op2	N,Z,C,V
AND	And	Rd:=Rn AND op2	N,Z,C
BIC	Bit Clear	Rd:=Rn AND (NOT(op2))	N,Z,C
CMN	Compare Negated	Rn + op2	N,Z,C,V
CMP	Compare	Rn - op2	N,Z,C,V
EOR	Exclusive Or	Rd:=Rn EOR op2	N,Z,C
MOV	Move	Rd:=op2	N,Z,C
MVN	Move Not	Rd:=NOT op2	N,Z,C
ORR	Logical Or	Rd:=Rn OR op2	N,Z,C
RSB	Reverse Subtract	Rd:=op2 - Rn	N,Z,C,V
RSC	Reverse Subtract with Carry	Rd:=op2 - Rn - NOT(C flag)	N,Z,C,V
SBC	Subtract with Carry	Rd:=Rn - op2 - NOT(C flag)	N,Z,C,V
SUB	Subtract	Rd:=Rn - op2	N,Z,C,V
TEQ	Test Equivalence	Rn EOR op2	N,Z,C
TST	TeST and mask	Rn AND op2	N,Z,C

Synopsis

These instructions produce a result by performing a specified arithmetic or logical operation on one or two operands.

The operation is performed between a source register Rn and an operand op2 - except for MOV and MVN, where only the operand is needed (and for which the assembler sets Rn to R0). The source register can be any one of the 16 registers. The operand can be any operand that the barrel shifter can produce: i.e. a shifted register Rm, or a rotated 8 bit immediate value Imm, according to the value of the I bit in the instruction. (See The barrel shifter and Shift types.) Note that any shifting is done before the operation is performed.

The logical operations (AND, BIC, EOR, MOV, MVN, ORR, TEQ, TST) perform the logical action on all corresponding bits of the operand or operands to produce the result. The arithmetic operations (ADC, ADD, CMP, CMN, RSB, RSC, SBC, SUB) treat each operand as a 32 bit integer (either unsigned or 2's complement signed, the two are equivalent). Some add the bit held in the ALU's carry flag into the operation.

The result of the operation is placed in the destination register Rd - except for CMN, CMP, TEQ and TST, which are used only to perform tests and to set the condition codes on the result (and for which the assembler sets Rd to R0). The destination register may be any one of the 16 registers.

The condition codes in the PSR may be preserved or updated as a result of this instruction, according to the value of the S bit; see The S bit below.

The instruction is only executed if the condition is true. The various conditions are defined in the The condition field.

The S bit

The instruction contains a one bit field called the S bit, standing for 'set condition codes'. The result of the operation in the ALU affects its N and Z flags, and may also affect its C and V flags. However, the ALU doesn't copy its flags to the relevant parts of the PSR unless the S bit is set. ObjAsm always sets the S bit for the four instructions CMN, CMP, TEQ and TST, since they would be meaningless unless their results were copied to the PSR. In the case of the remaining 12 instructions, you may request that the S bit be set by appending the letter S to the instruction mnemonic.

The way the PSR flags are altered differs for logical and arithmetic operations:

Logical operations (AND, BIC, EOR, MOV, MVN, ORR, TEQ, TST)

The V flag in the PSR will be unaffected.
The C flag will be set to the last bit shifted out by the barrel shifter, or is unchanged if no shifting took place.
The Z flag will be set if and only if the result is all zeroes.
The N flag will be set to the logical value of bit 31 of the result.

Arithmetic operations (ADC, ADD, CMP, CMN, RSB, RSC, SBC, SUB)

The V flag in the PSR will be set if signed overflow occurs (i.e. if you regard the operands as signed 32 bit integers, the signed result does not fit in a 32 bit integer); this may be ignored if the operands were considered unsigned, but warns of a possible error if the operands were 2's complement signed (the destination register is set to the bottom 32 bits of the correct unsigned result).
The C flag will be set to the carry out of bit 31 of the ALU, which for addition indicates that 32 bit overflow occurred, and for subtraction indicates that 32 bit underflow did not occur.
The Z flag will be set if and only if the result was zero.
The N flag will be set to the value of bit 31 of the result, indicating a negative result if the operands are considered to be 2's complement signed.

The P flag

The P flag invokes a special form of the CMN, CMP, TEQ and TST operations, used to update the PSR. The operation is carried out, and then the PSR is overwritten by the corresponding bits in the ALU result: so bit 31 of the result goes to the N flag, bit 30 to the Z flag, bit 29 to the C flag, and bit 28 to the V flag. In user mode the other flags (I, F, M1, M0) are protected from direct change, but in non-user modes these will also be affected, accepting copies of bits 27, 26, 1 and 0 of the result respectively.

This is typically used to change modes. For example:

TEQP R15, #0 ; Change to user mode.

Note the treatment of R15 as the first operand, described in Using R15 as an operand.

This form is encoded by setting the S bit, and setting the destination register to R15.

Shifts

When the second operand is specified to be a shifted register, the operation of the barrel shifter is controlled by the Shift field in the instruction. This field indicates the type of shift to be performed (logical left or right, arithmetic right or rotate right). The amount by which the register should be shifted may be contained in an immediate field in the instruction, or in the bottom byte of another register:

Shifts

Shifts are detailed in the Shift types.

Note that the zero in bit 7 of an instruction with a register controlled shift is compulsory; a one in this bit will cause the instruction to be a multiply or an undefined instruction.

Immediate operand rotates

The immediate operand rotate field is a 4 bit unsigned integer which specifies a shift operation on the 8 bit immediate value. The immediate value is zero extended to 32 bits, and then subject to a rotate right by twice the value in the rotate field. This enables many common constants to be generated, for example all powers of 2. Another example is that the 8 bit constant may be aligned with the PSR flags (bits 0, 1, and 26 to 31). All the flags can thereby be initialised in one TEQP instruction.

Immediate operand rotates are detailed in the The barrel shifter.

Using R15 as the destination or operand

Note that the CPU takes certain actions whenever the destination or any operand is R15. These are as follows:

Using R15 as the destination

If R15 is the destination register, and the S bit is not set, the PC is overwritten, but not the PSR.

If the S bit is set, then the PC is overwritten, and also all bits of the PSR that are unprotected in the current mode; thus in User mode the N, Z, C and V flags are overwritten, whereas in other modes the entire PSR is overwritten.

Using R15 as an operand

R15 will always contain the value of the PC, which will be the address of the instruction, plus 8 or 12 bytes due to instruction prefetching. If the shift amount is specified in the instruction, the PC will be 8 bytes ahead. If a register is used to specify the shift amount, the PC will be 8 bytes ahead when used as Rs, and 12 bytes ahead when used as Rn or Rm.

R15 may or may not contain the values of the PSR flags as they were at the completion of the previous instruction, depending on which operand position it occupies:

If R15 is the first operand in a two operand instruction, it is presented to the arithmetic logic unit (ALU) with the PSR bits set to zero.
If the second or only operand is R15 (possibly shifted), it is presented to the barrel shifter or ALU with the PSR bits unchanged.
If R15 is the shift register, it is presented to the barrel shifter with the PSR bits set to zero.

32 bit operation

TEQP, TSTP, CMPP and CMNP

These opcodes should not be used in 32 bit modes. You should instead use the new PSR transfer functions. When used in a privileged mode, TEQP moves the SPSR for the current mode to the CPSR.

Using R15 as the shift register

You must not use R15 as the shift register.

Using R15 as the destination

If R15 is the destination register, and the S bit is not set, the PC is overwritten, but not the CPSR. This is what you would expect as an extension of the 26 bit behaviour.

If the destination register is R15 and the S bit is set, then as well as writing the result to the PC, the SPSR for the current mode is moved to the CPSR. This is again what you would expect as an extension of the 26 bit behaviour.

Examples

; Simple use of a one operand instruction:
        MVN     R2,R3           ; R2 is set to the bitwise inverse of the
                                ; contents of R3.

; Simple uses of instructions that does not write a result:
        CMP     R0,R1           ; Compare the contents of R0 with R1

        CMP     R0,#&80         ; Compare the contents of R0 with &80

        TEQS    R4,#3           ; Test R4 for equality with 3
                                ; (The S is in fact redundant as the assembler
                                ; inserts it automatically)

; Simple use of a two operand instruction:
        ADD     R0,R1,R2        ; R0=R1+R2

; Conditional execution of an instruction:
        ADDEQ   R2,R4,R5        ; If the Z flag is set make R2:=R4+R5

; Use of the S bit to alter the PSR:
        ADDS    R0,R1,#1        ; R0=R1+1, and set N,Z,C,V

; Use of a register specified shift:
        SUB     R4,R5,R7,LSR R2 ; Logical right shift R7 by the number in
                                ; the bottom byte of R2, subtract the result
                                ; from R5, and put the answer into R4
                                ; Use of an immediate shift:
        MOV     R0,R1,LSL#2     ; The contents of R1 are shifted left by
                                ; 2 bits and transferred to R0.

; Using ADC to implement multi-word additions. For example a 64 bit ADD:
        ADDS    R4,R2,R0        ; Add least significant 32 bits updating carry
        ADC     R5,R3,R1        ; Add most significant 32 bits and carry
                                ; from previous

; Using SBC to implement multi-word subtractions. For example:
        SUBS    R4,R2,R0        ; Do least significant word of subtraction
        SBC     R5,R3,R1        ; Do most significant word, taking account
                                ; of the borrow. This does the 64 bit
                                ; subtraction (R5,R4)=(R3,R2)-(R1,R0)

; Changing to user mode and returning from a subroutine:
        ; Assume non-user mode here
        TEQP    R15,#0          ; Change to user mode and clear N,Z,C,V,I,F
                                ; NB R15 is here in the Rn position,
                                ; so it comes without the PSR flags
        MOV     R0,R0           ; No-op to avoid mode change hazard
        MOV     PC,R14          ; Return from subroutine
                                ; (R14 is a banked register)

; Returning from a subroutine and restoring the PSR:
        MOVS    PC,R14          ; return from subroutine and restore the PSR

PSR transfer (MRS, MSR)

Instructions for accessing the CPSR and SPSR registers

These instructions are not available on ARM2 and ARM3 series processors

Instruction format

Assembler syntax

MRS«cond» Rd,psr MSR«cond» psr,Rm MSR«cond» psrf,Rm MSR«cond» psrf,#expression

where:

`«cond»`	is a two-character condition mnemonic; see the chapter entitled The condition field.
`Rd` & `Rm`	are expressions evaluating to a valid ARM register number other than R15.
`psr`	is `CPSR`, `CPSR_all`, `SPSR` or `SPSR_all`. (`CPSR` and `CPSR_all` are synonyms, as are `SPSR` and `SPSR_all`.)
`psrf`	is `CPSR_flg` or `SPSR_flg`. The most significant four bits of `Rm` or `#expression` are written to the N, Z, C and V flags respectively.
`#expression`	is an expression symbolising a 32 bit value. If `#expression` is used, ObjAsm will attempt to match the expression by generating a shifted immediate 8-bit field. If this is impossible, it will give an error.

Synopsis

These instructions allow access to the CPSR and SPSR registers:

The MRS instruction moves the contents of the CPSR or SPSR_current mode register to a general register.
The MSR instruction moves the contents of a general register to the CPSR or SPSR_current mode register.

Alternatively, the MSR instruction can write to the condition code flags of the CPSR or SPSR_current mode register without affecting its control bits:

In this case the source may be either the contents of a general register or an immediate value, and only its top four bits are used.

The instructions are encoded using the CMN, CMP, TEQ and TST instructions without the S flag set.

The instruction is only executed if the condition is true. The various conditions are defined in the The condition field.

These instructions are not available on ARM2 and ARM3 series processors.

On ARM6 series processors and later, they are available in all modes and configurations. However, we recommend that you avoid using these instructions, as you will lose backwards compatibility with older ARMs. Indeed, in the 26 bit modes used by RISC OS (except when handling FIQs), you can access the PSR just as you always have - for example, with TEQP.

Operand restrictions

In user mode, the control bits of the CPSR are protected from change, so only the condition code flags of the CPSR can be changed. In other (privileged) modes the entire CPSR can be changed.

R15 must not be specified as the source or destination register.

You must not attempt to access the SPSR in user mode, as no such register exists.

Reserved bits

Not all bits of the PSR are defined (e.g. only N, Z, C, V, I, F and M[4:0] are defined for the ARM 6 and 7 series). The remaining ones (bits 27-8 and 5 in the ARM 6 and 7 series) are reserved for use in future versions of the ARM. The ensure future compatibility, the following rules should be observed:

You must preserve the reserved bits when changing the value in a PSR.
When you are checking the PSR status, you must not rely on specific values from the reserved bits, since they may read as one or zero in future processors.

You should therefore use a read-modify-write strategy when altering the control bits of any PSR register. This involves transferring the appropriate PSR register to a general register using the MRS instruction, changing only the relevant bits, and then transferring the modified value back to the PSR register using the MSR instruction.

For example, to perform a mode change:

        MRS     R0,CPSR                 ; Take a copy of the PSR
        BIC     R0,R0,#0x1F             ; Clear the mode bits
        ORR     R0,R0,#new_mode         ; Set bits for new mode
        MSR     CPSR,R0                 ; Write back the modified CPSR,
                                        ; changing mode

When you wish simply to change the condition flags in a PSR, you can write an immediate value directly to the flag bits without disturbing the control bits. For example, the following instruction sets the N, Z, C and V flags:

        MSR     CPSR_flg,#0xF0000000    ; Set all the flags regardless
                                        ; of their previous state
                                        ; (does not affect any control bits)

You must not attempt to write an 8 bit immediate value into the whole PSR, since such an operation cannot preserve the reserved bits.

Examples

In user mode the instructions behave as follows:

        MSR     CPSR_all,Rm             ; CPSR[31:28] <- Rm[31:28]
        MSR     CPSR_flg,Rm             ; CPSR[31:28] <- Rm[31:28]

        MSR     CPSR_flg,#0xA0000000    ; CPSR[31:28] <- 0xA
                                        ;  (i.e. set N,C; clear Z,V)

        MRS     Rd,CPSR                 ; Rd[31:0] <- CPSR[31:0]

In privileged modes the instructions behave as follows:

        MSR     CPSR_all,Rm             ; CPSR[31:0]  <- Rm[31:0]
        MSR     CPSR_flg,Rm             ; CPSR[31:28] <- Rm[31:28]

        MSR     CPSR_flg,#0x50000000    ; CPSR[31:28] <- 0x5
                                        ;  (i.e. set Z,V; clear N,C)

        MRS     Rd,SPSR                 ; Rd[31:0] <- SPSR[31:0]
        MSR     SPSR_all,Rm             ; SPSR_<mode>[31:0]  <- Rm[31:0]
        MSR     SPSR_flg,Rm             ; SPSR_<mode>[31:28] <- Rm[31:28]

        MSR     SPSR_flg,#0xC0000000    ; SPSR_<mode>[31:28] <- 0xC
                                        ;  (i.e. set N,Z; clear C,V)
        MRS     Rd,SPSR                 ; Rd[31:0] <- SPSR_<mode>[31:0]

Multiply and Multiply-Accumulate (MUL, MLA)

Instructions for performing integer multiplication, giving a 32 bit result

Instruction format

Assembler syntax

MUL«cond»«S» Rd,Rm,Rs MLA«cond»«S» Rd,Rm,Rs,Rn

`«cond»`	is a two-character condition mnemonic; see the chapter entitled The condition field.
`«S»`	means to set the PSR's condition codes from the operation.
`Rd`, `Rm`, `Rs` & `Rn`	are expressions evaluating to a valid ARM register number. (Rd must not be R15 and must not be the same as Rm.)

Synopsis

The multiply and multiply-accumulate instructions use a 2 bit Booth's algorithm to perform integer multiplication. They give the least significant 32 bits of the product of two 32 bit operands, and may be used to synthesize higher precision multiplications.

The multiply form of the instruction gives Rd:=Rm×Rs. Rn is ignored, and should be set to zero for compatibility with possible future upgrades to the instruction set.

The multiply-accumulate form gives Rd:=Rm×Rs+Rn, which can save an explicit ADD instruction in some circumstances.

The results of a signed multiply and of an unsigned multiply of 32 bit operands differ only in the upper 32 bits; the low 32 bits are identical. As these instructions only produce those low 32 bits, they can be used with operands which may be considered as either signed (2's complement) or unsigned integers.

The instruction is only executed if the condition is true. The various conditions are defined in the The condition field.

PSR flags

Setting the PSR flags is optional, and is controlled by the S bit in the instruction. The N and Z flags are set correctly on the result (N is equal to bit 31 of the result, Z is set if and only if the result is zero), the V flag is unaffected by the instruction, and the C flag is set to a meaningless value.

Operand restrictions

Because of the way the Booth's algorithm has been implemented, you should avoid certain combinations of operand registers. (ObjAsm will issue a warning if you overlook these restrictions.)

The destination register Rd must not be the same as the Rm operand register, as Rd is used to hold intermediate values, and Rm is used repeatedly during the multiply.

The destination register Rd must also not be R15.

All other register combinations will give correct results, and Rd, Rn and Rs may use the same register when required.

32 bit operation

R15 must not be used as any of Rd, Rm, Rn or Rs.

Examples

        MUL     R1,R2,R3                ; R1:=R2*R3
        
        MLAEQS  R1,R2,R3,R4             ; conditionally R1:=R2*R3+R4,
                                        ; setting condition codes

The multiply instruction may be used to synthesize higher precision multiplications, for instance to multiply two 32 bit integers and generate a 64 bit result:

mul64
        MOV     a1,A,LSR #16            ; a1:=top half of A
        MOV     D,B,LSR #16             ; D:=top half of B
        BIC     A,A,a1,LSL #16          ; A:=bottom half of A
        BIC     B,B,D,LSL #16           ; B:=bottom half of B
        MUL     C,A,B                   ; Low section of result
        MUL     B,a1,B                  ; ) Middle sections
        MUL     A,D,A                   ; ) of result
        MUL     D,a1,D                  ; High section of result
        ADDS    A,B,A                   ; Add middle sections (couldn't use
                                        ;  MLA as we need C correct)
        ADDCS   D,D,#&10000             ; Carry from above add
        ADDS    C,C,A,LSL #16           ; C is now bottom 32 bits of product
        ADC     D,D,A,LSR #16           ; D is top 32 bits

(A, B are registers containing the 32 bit integers; C, D are registers for the 64 bit result; a1 is a temporary register. A and B are overwritten during the multiply.)

Note that more recent ARM processors have a single instruction to do just this; see the next section.

Multiply Long and Multiply-Accumulate Long (UMULL, SMULL, UMLAL, SMLAL)

Instructions for performing integer multiplication, giving a 64 bit result

This instruction is only available in 32 bit mode on the ARM7M series or later

Instruction format

Assembler syntax

UMULL«cond»«S» RdLo,RdHi,Rm,Rs SMULL«cond»«S» RdLo,RdHi,Rm,Rs UMLAL«cond»«S» RdLo,RdHi,Rm,Rs SMLAL«cond»«S» RdLo,RdHi,Rm,Rs

`«cond»`	is a two-character condition mnemonic; see the chapter entitled The condition field.
`«S»`	means to set the PSR's condition codes from the operation.
`RdLo`, `RdHi`, `Rm` & `Rs`	are expressions evaluating to a valid ARM register number other than R15.

Synopsis

The multiply long instructions perform integer multiplication on two 32 bit operands, and produce a 64 bit result. The multiplication can be signed or unsigned, which - with optional accumulate - gives rise to four variations.

The multiply forms of the instruction (UMULL and SMULL) give a 64 bit result of the form RdHi,RdLo:=Rm×Rs.

The multiply-accumulate forms (UMLAL and SMLAL) give Rd:=Rm×Rs+Rn, which can save an explicit ADD instruction in some circumstances.

The lower 32 bits of the result and of the accumulator (where used) are held in RdLo, and the upper 32 bits in RdHi.

The unsigned forms of the instruction (UMULL and UMLAL) treat all four registers as unsigned numbers. The signed forms (SMULL and SMLAL) treat the two operand registers as 2's complement signed 32 bit numbers, and the two destination registers as a 2's complement signed 64 bit number.

The instruction is only executed if the condition is true. The various conditions are defined in the The condition field.

This instruction was first introduced on the ARM7M series of processor, and is only available in 32 bit modes. This instruction is therefore unlikely to be of use under RISC OS.

PSR flags

Setting the PSR flags is optional, and is controlled by the S bit in the instruction. The N and Z flags are set correctly on the result (N is equal to bit 31 of the result, Z is set if and only if the result is zero), and the V and C flags are set to a meaningless value.

Operand restrictions

R15 must not be used as any of RdHi, RdLo, Rm or Rs.

RdHi, RdLo and Rm must all specify different registers.

Examples

        UMULL   R1,R4,R2,R3          ; R1,R4:=R2*R3

        UMLALS  R1,R5,R2,R3          ; R1,R5:=R2*R3+R1,R5,
                                     ; also setting condition codes

Single data transfer (LDR, STR)

Instructions for loading or storing single bytes or words of data

Instruction format

Assembler syntax

LDR|STR«cond»«B»«T» Rd,address

LDR loads from memory into a register.

STR stores from a register into memory.

«cond» is a two-character condition mnemonic; see the chapter entitled The condition field.

«B» means to transfer a byte, otherwise a word is transferred.

«T» means to set the W bit in a post-indexed instruction, forcing non-privileged mode for the transfer cycle. T is not allowed when a pre-indexed addressing mode is specified or implied.

Rd is an expression evaluating to a valid ARM register number.

address

can be:

An expression which generates an address:
expression

ObjAsm will attempt to generate an instruction using the PC as a base and a corrected immediate offset to address the location given by evaluating the expression. This will be a PC relative, pre-indexed address. If the address is out of range, an error will be generated.
A pre-indexed addressing specification:

[Rn] offset of zero
[Rn,#expression]«!» offset of expression bytes
[Rn,«+|-»Rm«,shift»]«!» offset of ±contents of index register, shifted by shift.
A post-indexed addressing specification:

[Rn],#expression offset of expression bytes
[Rn],«+|-»Rm«,shift» offset of ±contents of index register, shifted by shift.

Rn and Rm are expressions evaluating to a valid ARM register number. Note if Rn is R15 then ObjAsm will subtract 8 from the offset value to allow for ARM pipelining.

shift is a general shift operation (see the chapter entitled Shift types), but note that the shift amount may not be specified by a register.

«!» if present sets the W bit to write-back the base register.

Synopsis

The single data transfer instructions are used to load or store single bytes or words of data. The memory address used in the transfer is calculated by adding an offset to or subtracting an offset from a base register. The result of this calculation may be written back into the base register if 'auto-indexing' is required. If the contents of the base are not destroyed by other instructions, the continued use of LDR (or STR) with write back will continually move the base register through memory in steps given by the index value. Note that ! is invalid for post-indexed addressing, as write back is automatic in this case.

The instruction is only executed if the condition is true. The various conditions are defined in the The condition field.

For register to register transfers, see the chapter entitled Data processing, particularly the MOV instruction.

Offsets and auto-indexing

The offset from the base may be either a 12 bit unsigned binary immediate value in the instruction, or a second register (possibly shifted in some way). The offset may be added to (U=1) or subtracted from (U=0) the base register Rn. The offset modification may be performed either before (pre-indexed, P=1) or after (post-indexed, P=0) the base is used as the transfer address.

The W bit gives optional auto increment and decrement addressing modes. The modified base value may be written back into the base (W=1), or the old base value may be kept (W=0). In the case of post-indexed addressing, the write back bit is redundant, since the old base value can be retained by setting the offset to zero. Therefore post-indexed data transfers always write back the modified base. The only use of the W bit in a post-indexed data transfer is in privileged mode code; depending on the processor, setting the W bit either forces the TRANS pin to go LOW or forces non-privileged mode for the transfer, allowing the operating system to generate a user address in a system where the memory management hardware makes suitable use of this hardware.

Shifted register offset

The 8 shift control bits are described in the Data processing, but the register specified shift amounts are not available in this instruction class.

Bytes and words

This instruction class may be used to transfer a byte (B=1) or a word (B=0) between an ARM register and memory.

A byte load (LDRB) expects the data on bits 0 to 7 if the supplied address is on a word boundary, on bits 8 to 15 if it is a word address plus one byte, and so on. The selected byte is placed in the bottom 8 bits of the destination register, and the remaining bits of the register are filled with zeroes.

A byte store (STRB) repeats the bottom 8 bits of the source register four times across the data bus. The external memory system should activate the appropriate byte subsystem to store the data.

A word load (LDR) or word store (STR) should generate a word aligned address. Using a non-word-aligned addresses has non-obvious and unspecified results.

Use of R15

These instructions will never cause the PSR to be modified, even when Rd or Rn is R15.

If R15 is specified as the base register (Rn), the PC is used without the PSR flags. When using the PC as the base register you must remember that it contains an address 8 bytes on from the address of the current instruction.

If R15 is specified as the register offset (Rm), the value presented will be the PC together with the PSR.

When R15 is the source register (Rd) of a register store (STR) instruction, the value stored will be the PC together with the PSR. The stored value of the PC will be 12 bytes on from the address of the instruction. A load register (LDR) with R15 as Rd will change only the PC, and the PSR will be unchanged.

Address exceptions

On an ARM2 or ARM3 processor, if the address used for the transfer (ie the unmodified contents of the base register for post-indexed addressing, or the base modified by the offset for pre-indexed addressing) has a logic one in any of the bits 26 to 31, the transfer will not take place and the address exception trap will be taken.

Later versions of the ARM do not generate address exceptions when in a 32 bit configuration (as used by RISC OS from very soon after reset), even when running in 26 bit modes.

Note that it is only the address actually used for the transfer which is checked. A base containing an address outside the legal range may be used in a pre-indexed transfer if the offset brings the address within the legal range, and likewise a base within the legal range may be modified by post-indexing to outside the legal range without causing an address exception.

Data Aborts

A transfer to or from a legal address may still cause problems for a memory management system. For instance, in a system which uses virtual memory the required data may be absent from main memory. The memory manager can signal a problem by taking the processor ABORT pin HIGH, whereupon the data transfer instruction will be prevented from changing the processor state and the Data Abort trap will be taken. It is up to the system software to resolve the cause of the problem, then the instruction can be restarted and the original program continued.

32 bit operation

R15 must not be used as the register offset (Rm).

If R15 is specified as the base register (Rn), you must not use write-back - including post indexing.

For a post-indexed LDR or STR, Rm and Rn must not be the same register.

When using write-back - including post indexing -Rd and Rn must not be the same register.

Examples

        STR     R1,[BASE,INDEX]!        ; store R1 at BASE+INDEX (both of
                                        ; which are registers) and write
                                        ; back address to BASE

        STR     R1,[BASE],INDEX         ; store R1 at BASE and writeback
                                        ; BASE+INDEX to BASE

        LDR     R1,[BASE,#16]           ; load R1 from contents of BASE+16.
                                        ; Don't write back

        LDR     R1,[BASE,INDEX,LSL #2]  ; load R1 from contents of
                                        ; BASE+INDEX*4

        LDREQB  R1,[BASE,#5]            ; conditionally load byte at BASE+5
                                        ; into R1 bits 0 to 7, filling bits
                                        ; 8 to 31 with zeroes

        STR     R1,PLACE                ; generate PC relative offset to
                                        ; address PLACE
        More instructions

PLACE

Block data transfer (LDM, STM)

Instructions for loading or storing any subset of the currently visible registers

Instruction format

Assembler syntax

LDM|STM«cond»FD|ED|FA|EA|IA|IB|DA|DB Rn«!»,Rlist«^»

`LDM`	loads from memory into register(s).
`STR`	stores from register(s) into memory.
`«cond»`	is a two-character condition mnemonic; see the chapter entitled The condition field.
`Rn`	is an expression evaluating to a valid ARM register number.
`Rlist`	is either a comma-separated list of registers and/or of register ranges indicated by hyphens, all enclosed in `{}` (e.g. `{R0,R2-R7,R10}`); or an expression evaluating to the 16 bit operand.
`«!»`	if present sets the W bit to write-back the base register.
`«^»`	if present sets the S bit to load the PSR with the PC, or forces storing of user bank registers when in a non-user mode.

Addressing mode names

There are different assembler mnemonics for each of the addressing modes. There are alternative forms for each mnemonic: one form is intended for use with stacks, and describes the type of stack the addressing mode supports; the other form merely describes the instructions functionality. The equivalencies between the names and the values of the bits in the instruction are:

Name	Stack	Other	L bit	P bit	U bit
pre-increment load	LDMED	LDMIB	1	1	1
post-increment load	LDMFD	LDMIA	1	0	1
pre-decrement load	LDMEA	LDMDB	1	1	0
post-decrement load	LDMFA	LDMDA	1	0	0
pre-increment store	STMFA	STMIB	0	1	1
post-increment store	STMEA	STMIA	0	0	1
pre-decrement store	STMFD	STMDB	0	1	0
post-decrement store	STMED	STMDA	0	0	0

In the stacking forms of the mnemonics (FD, ED, FA and EA), the F and E refer to a full or empty stack, and the A and D refer to an ascending or descending stack:

A full stack is one in which the stack pointer points to the last data item written, whereas an empty stack is one in which the stack pointer points to the first free slot.
A descending stack is one which grows from high memory addresses to low ones, whereas an ascending stack is one which grows from low memory addresses to high ones.

The other forms of the mnemonics (IA, IB, DA and DB) simply mean Increment After, Increment Before, Decrement After, and Decrement Before.

Synopsis

Block data transfer instructions are used to load (LDM) or store (STM) any subset of the currently visible registers from or to memory. They support all possible stacking modes, maintaining full or empty stacks which can grow up or down memory, and are very efficient instructions for saving or restoring context, or for moving large blocks of data around main memory.

The instruction is only executed if the condition is true. The various conditions are defined in the The condition field.

The register list

The instruction can cause the transfer of any registers in the current bank (and non-user mode programs can also transfer to and from the user bank, see below). The register list is a 16 bit field in the instruction, with each bit corresponding to a register. A 1 in bit 0 of the register field will cause R0 to be transferred, a 0 will cause it not to be transferred; similarly bit 1 controls the transfer of R1, and so on.

Any subset of the registers, or all the registers, may be specified. The only restriction is that the register list must not be empty.

Addressing modes

The transfer addresses are determined by the contents of the base register (Rn), the pre/post bit (P) and the up/down bit (U). The registers are stored such that the lowest register is always at the lowermost address in memory, the highest register is always at the uppermost address, and the others are stored in numerical order between them.

(As an aside, this means that instruction sequences such as:

STMIA   R0,{R1,R2}
LDMIA   R0,{R2,R1}

do not swap the contents of R1 and R2.)

By way of illustration, consider the transfer of R1, R5 and R7 in the case where Rn=1000H and write back of the modified base is required (W=1). The figures below show the sequence of register transfers, the addresses used, and the value of Rn after the instruction has completed.

(In all cases, had write back of the modified base not been required (W=0), Rn would have retained its initial value of 1000H unless it was also in the transfer list of a load multiple register instruction, when it would have been overwritten with the loaded value.)

Post-increment addressing

Pre-increment addressing

Post-decrement addressing

Pre-decrement addressing

Transfer of R15

Whenever R15 is stored to memory, the value transferred is the PC together with the PSR flags. The stored value of the PC will be 12 bytes on from the address of the STM instruction.

If R15 is in the transfer list of a load multiple (LDM) instruction the PC is overwritten, and the effect on the PSR is controlled by the S bit. If the S bit is 0 the PSR is preserved unchanged, but if the S bit is 1 the PSR will be overwritten by the corresponding bits of the loaded value. In user mode, however, the I, F, M0 and M1 bits are protected from change whatever the value of the S bit. The mode at the start of the instruction determines whether these bits are protected, and the supervisor may return to the user program, re-enabling interrupts and restoring user mode with one LDM instruction.

Forcing transfer of the user bank

For STM instructions the S bit is redundant as the PSR is always stored with the PC whenever R15 is in the transfer list. For LDM instructions the S bit is redundant if R15 is not in the transfer list.

In both the above cases, the S bit is instead used to force transfers in non-user modes to use the user register bank instead of the current register bank. This is useful for saving and restoring the user state on process switches. You must not use write back of the base when forcing user bank transfer.

For an LDM instruction, you must take care not to read from a banked register during the following cycle; if in doubt insert a no-op.

Use of R15 as the base

When the base is the PC, the PSR bits will be used to form the address as well, so unless all interrupts are enabled and all flags are zero an address exception will occur. Also, write back is never allowed when the base is the PC (setting the W bit will have no effect).

Inclusion of the base in the register list

When writeback is specified, the base is written back at the end of the second cycle of the instruction. During a STM, the first register is written out at the start of the second cycle. A STM which includes storing the base, with the base as the first register to be stored, will therefore store the unchanged value, whereas with the base second or later in the transfer order, will store the modified value. An LDM will always overwrite the updated base if the base is in the list.

When the base register is in the list of registers

The base register may be stored and if write back is not in operation, no problem will occur.
If write back is in operation, the STM is performed in the following order:
- write lowest-numbered register to memory
- perform the write back
- write other registers to memory in ascending order.
Thus, if the base register is the lowest-numbered register in the list, its original value is stored. Otherwise, its written back value is stored.
If the base register is loaded the pop operation will continue successfully. The entire block transfer runs on an internal copy of the base, and will not be aware that the base register has been loaded with a new value.

Address exceptions

On an ARM2 or ARM3 processor, if the address of the first transfer falls outside the legal address space (ie has a 1 somewhere in bits 26 to 31), an address exception trap will be taken. The instruction will first complete in the usual number of cycles, though an STM will be prevented from writing to memory. The processor state will be the same as if a data abort had occurred on the first transfer cycle (see next section).

Only the address of the first transfer is checked in this way; if subsequent addresses over- or under-flow into illegal address space they will be truncated to 26 bits but will not cause an address exception trap.

Later versions of the ARM do not generate address exceptions when in a 32 bit configuration (as used by RISC OS from very soon after reset), even when running in 26 bit modes.

Data Aborts

Some legal addresses may be unacceptable to a memory management system, and the memory manager can indicate a problem with an address by taking the ABORT signal HIGH. This can happen on any transfer during a multiple register load or store, and must be recoverable if the ARM is to be used in a virtual memory system.

Aborts during STM instructions

If the abort occurs during a store multiple instruction, ARM takes little action until the instruction completes, whereupon it enters the data abort trap. The memory manager is responsible for preventing erroneous writes to the memory. The only change to the internal state of the processor will be the modification of the base register if write-back was specified, and this must be reversed by software (and the cause of the abort resolved) before the instruction may be retried.

Aborts during LDM instructions

When ARM detects a data abort during a load multiple instruction, it modifies the operation of the instruction to ensure that recovery is possible

Overwriting of registers stops when the abort happens. The aborting load will not take place, but earlier ones may have overwritten registers. The PC is always the last register to be written, and so will always be preserved.
The base register is restored, to its modified value if write-back was requested. This ensures recoverability in the case where the base register is also in the transfer list, and may have been overwritten before the abort occurred.

The data abort trap is taken when the load multiple has completed, and the system software must undo any base modification (and resolve the cause of the abort) before restarting the instruction.

32 bit operation

For an STM instruction where R15 is in the transfer list, the PC is stored, but the CPSR is not stored to the current mode's SPSR. (The intuitive extension of the 26 bit behaviour would be for the CPSR to be stored.)

For an LDM instruction where R15 is in the transfer list, if the S bit is set then as well as overwriting the PC, the SPSR for the current mode is moved to the CPSR. This is what you would expect as an extension of the 26 bit behaviour.

The S bit must not be set for instructions that are to be executed in user mode.

You must not use R15 as the base register.

Examples

        LDMFD   SP!,{R0,R1,R2}  ; unstack 3 registers

        STMIA   BASE,{R0-R15}   ; save all registers

These instructions may be used to save state on subroutine entry, and restore it efficiently on return to the calling routine:

        STMED   SP!,{R0-R3,R14} ; save R0 to R3 to use as workspace
                                ; and R14 for returning

        BL somewhere             ; this nested call will overwrite R14
                
        LDMED   SP!,{R0-R3,R15}^; restore workspace and return
                                ; (also restoring PSR flags)

Single data swap (SWP)

Instruction for swapping atomically between a register and external memory

This instruction is not available on the ARM2 processor

Instruction format

Assembler syntax

SWP «cond»«B» Rd,Rm,[Rn]

`«cond»`	is a two-character condition mnemonic; see the chapter entitled The condition field.
`«B»`	means to transfer a byte, otherwise a word is transferred.
`Rd`, `Rm` & `Rn`	are expressions evaluating to a valid ARM register number.

Synopsis

The data swap instruction is used to swap atomically a byte or word quantity between a register and external memory. It is implemented as a memory read followed by a memory write to the same address, which are 'locked' together. The processor cannot be interrupted until both operations have completed, and the memory manager is warned to treat them as inseparable. This instruction is particularly useful for implementing software semaphores.

The swap address is determined by the contents of the base register (Rn). The processor first reads the contents of the swap address. It then writes the contents of the source register (Rm) to the swap address, and stores the old memory contents in the destination register (Rd). The same register may be specified as both the source and destination; its contents are correctly swapped with memory.

The LOCK output goes HIGH for the duration of the read and write operations to signal to the external memory manager that they are locked together, and should be allowed to complete without interruption. This is important in multi-processor systems, where the swap instruction is the only indivisible instruction which may be used to implement semaphores. Control of the memory must not be removed from a processor while it is performing a locked operation.

The SWP instruction is not supported by the ARM2 processor, but is available in the ARM3, in the ARM2aS macrocell (as used for the ARM250 chip in the Acorn A3010, 3020 and A4000), and on the ARM6 series and later.

Bytes and words

This instruction may be used to swap a byte (B=1) or a word (B=0) between a register and memory. The SWP instruction is implemented as a LDR followed by a STR, and the action of these is as described in Single data transfer (LDR, STR).

Use of R15

You must not use R15 as an operand (Rd, Rn or Rm in a SWP instruction.

Data aborts

If the address used for the swap is unacceptable to a memory management system, the internal MMU or external memory manager can flag the problem by driving ABORT HIGH. This can happen on either the read or the write cycle (or both), and in either case, the Data Abort trap will be taken. It is up to the system software to resolve the cause of the problem. Once this has been done, the instruction can be restarted and the original program continued.

Examples

        SWP     R0,R1,[R2]      ; load R0 with the word addressed by R2,
                                ; and then store R1 at the same address

        SWPB    R2,R3,[R4]      ; load R2 with the byte addressed by R4,
                                ; and then store bits 0 to 7 of R3 at the
                                ; same address

        SWPEQ   R0,R0,[R1]      ; conditionally swap the word addressed
                                ; by R1 with the contents of R0

Software interrupt (SWI)

Instruction for entering supervisor mode in a controlled manner

Instruction format

Assembler syntax

SWI«cond» expression

`«cond»`	is a two-character condition mnemonic; see the chapter entitled The condition field.
`expression`	is evaluated and placed in the comment field as a SWI number (which is ignored by ARM).

Synopsis

The software interrupt instruction is used to enter supervisor mode in a controlled manner. The instruction causes the software interrupt trap to be taken, which effects the mode change. The PC is then forced to the SWI vector. If this address is suitably protected (by external memory management hardware) from modification by the user, a fully protected operating system may be constructed.

The instruction is only executed if the condition is true. The various conditions are defined in the The condition field.

Return from the supervisor

The PC and PSR are saved in R14_svc upon entering the software interrupt trap, with the PC adjusted to point to the word after the SWI instruction. MOVS R15,R14_svc will return to the calling program, and restore the PSR.

Note that the link mechanism is not re-entrant, so if the supervisor code wishes to use software interrupts within itself it must first save a copy of the return address.

Comment field

The bottom 24 bits of the instruction are ignored by ARM, and may be used to communicate information to the supervisor code. For instance, the supervisor may look at this field and use it to index into an array of entry points for routines which perform the various supervisor functions (as in RISC OS).

32 bit operation

The CPSR is saved in SPSR_svc. The MOVS R15,R14_svc instruction used to return to the supervisor restores the CPSR from SPSR_svc. This is what you would expect as an extension of the 26 bit behaviour.

Examples

        SWI     Read                    ; get next character from read stream

        SWI     WriteI+"k"              ; output a "k" to the write stream

        SWINE   0                       ; conditionally call supervisor
                                        ; with 0 in comment field

The above examples assume that suitable supervisor code exists at the SWI vector address, for instance:

        B       Supervisor              ; SWI entry point

EntryTable                              ; addresses of supervisor routines
        DCD     ZeroRtn
        DCD     ReadCRtn
        DCD     WriteIRtn
        ...
Zero    EQU     0
ReadC   EQU     256
WriteI  EQU     512

Supervisor

; SWI has routine required in bits 8-23, data (if any) in bits 0-7.
; Assumes R13_svc points to a suitable stack.

        STM     R13,{R0-R2,R14}         ; Save work registers and return
                                        ; address
        BIC     R0,R14,#&FC000003       ; Clear PSR bits
        LDR     R0,[R0,#-4]             ; Get SWI instruction
        BIC     R0,R0,#&FF000000        ; Clear top 8 bits
        MOV     R1,R0,LSR #8            ; Get routine offset
        ADR     R2,EntryTable           ; Get start address of entry table
        LDR     R15,[R2,R1,LSL #2]      ; Branch to appropriate routine

WriteIRtn                               ; Enter with character in R0 bits 0-7
...
        LDM     R13,{R0-R2,R15}^        ; Restore workspace and return.

Coprocessor data operations (CDP)

Instruction for telling a coprocessor to perform some internal operation

Instruction format

Assembler syntax

CDP«cond» CP#,operation,CRd,CRn,CRm«,info»

`«cond»`	is a two-character condition mnemonic; see the chapter entitled The condition field.
`CP#`	is the unique number of the required coprocessor, which must be a symbol defined via the CP directive.
`operation`	is evaluated to a constant and placed in the `CP` `Opc` field.
`CRd`, `CRn` & `CRm`	are expressions evaluating to a valid coprocessor register number, which must be a symbol defined via the CN directive.
`info`	where present is evaluated to a constant and placed in the `CP` field.

Synopsis

This instruction is used to tell a coprocessor to perform some internal operation. No result is communicated back to ARM, and it will not wait for the operation to complete. The coprocessor could contain a queue of such instructions awaiting execution, and their execution can overlap other ARM activity, allowing the coprocessor and ARM to perform independent tasks in parallel.

The instruction is only executed if the condition is true. The various conditions are defined in the The condition field.

The coprocessor fields

Only bit 4 and bits 24 to 31 are significant to ARM; the remaining bits are used by coprocessors. The above field names are used by convention, and particular coprocessors may redefine the use of all fields except CP# as appropriate. The CP# field is used to contain an identifying number (in the range 0 to 15) for each coprocessor, and a coprocessor will ignore any instruction which does not contain its number in the CP# field.

The conventional interpretation of the instruction is that the coprocessor should perform an operation specified in the CP Opc field (and possibly in the CP field) on the contents of CRn and CRm, and place the result in CRd.

Restriction

Current ARM chips have a fault in the implementation of CDP which will cause a Software Interrupt to take the Undefined Instruction trap if the SWI is the next instruction after the CDP. This problem only arises when a hardware coprocessor is attached to the system, but if it is ever intended to add hardware to support a CDP (rather than trapping to an emulator) the sequence CDP SWI should be avoided.

Examples

        CDP     p1,10,CR1,CR2,CR3       ; Request coprocessor 1 to do
                                        ; operation 10 on CR2 and CR3,
                                        ; and put the result in CR1.

        CDPEQ   p2,5,CR1,CR2,CR3,2      ; If Z flag is set, request
                                        ; coprocessor 2 to do operation 5
                                        ; (type 2) on CR2 and CR3,
                                        ; and put the result in CR1.

Coprocessor data transfers (LDC, STC)

Instructions for transferring data between the coprocessor and main memory

Instruction format

Assembler syntax

LDC|STC«cond»«L» CP#,CRd,address

LDC loads from memory to coprocessor (L=1).

STC stores from coprocessor to memory (L=0).

«L» when present perform long transfer (N=1), otherwise perform short transfer (N=0).

«cond» is a two-character condition mnemonic; see the chapter entitled The condition field.

CP# is the unique number of the required coprocessor, which must be a symbol defined via the CP directive.

CRd is an expression evaluating to a valid coprocessor register number, which must be a symbol defined via the CN directive.

address

can be:

An expression which generates an address:

expression

ObjAsm will attempt to generate an instruction using the PC as a base and a corrected immediate offset to address the location given by evaluating the expression. This will be a PC relative, pre-indexed address. If the address is out of range, an error will be generated.
A pre-indexed addressing specification:

[Rn] offset of zero
[Rn,#expression]«!» offset of expression bytes
A post-indexed addressing specification:

[Rn],#expression offset of expression bytes

Rn is an expression evaluating to a valid ARM register number. Note if Rn is R15 then ObjAsm will subtract 8 from the offset value to allow for ARM pipelining.

«!» if present sets the W bit to write-back the base register.

Synopsis

These instructions are used to load (LDC) or store (STC) a subset of the coprocessor's registers directly to memory. ARM is responsible for supplying the memory address, and the coprocessor supplies or accepts the data and controls the number of words transferred.

The instruction is only executed if the condition is true. The various conditions are defined in the The condition field.

The coprocessor fields

The CP# field is used to identify the coprocessor which is required to supply or accept the data, and a coprocessor will only respond if its number matches the contents of this field.

The CRd field and the N bit contain information for the coprocessor which may be interpreted in different ways by different coprocessors, but by convention CRd is the register to be transferred (or the first register where more than one is to be transferred), and the N bit is used to choose one of two transfer length options. For instance N=0 could select the transfer of a single register, and N=1 could select the transfer of all the registers for context switching.

Addressing modes

ARM is responsible for providing the address used by the memory system for the transfer, and the addressing modes available are a subset of those used in single data transfer instructions. Note, however, that the immediate offsets are 8 bits wide and specify word offsets for coprocessor data transfers, whereas they are 12 bits wide and specify byte offsets for single data transfers.

The 8 bit unsigned immediate offset is shifted left 2 bits and added to (U=1) or subtracted from (U=0) a base register (Rn); this calculation may be performed either before (P=1) or after (P=0) the base is used as the transfer address. The modified base value may be overwritten back into the base register (if W=1), or the old value of the base may be preserved (W=0). Note that post-indexed addressing modes require explicit setting of the W bit, unlike LDR and STR which always write-back when post-indexed.

The value of the base register, modified by the offset in a pre-indexed instruction, is used as the address for the transfer of the first word. The second word (if more than one is transferred) will go to or come from an address one word (4 bytes) higher than the first transfer, and the address will be incremented by one word for each subsequent transfer.

Address alignment

The base address should normally be a word aligned quantity. The bottom 2 bits of the address will appear on A[1:0] and might be interpreted by the memory system.

Use of R15

If Rn is R15, the value used will be the PC without the PSR flags, with the PC being the address of this instruction plus 8 bytes. Write-back to the PC is inhibited, and the W bit will be ignored.

Address exceptions

If the address used for the first transfer is illegal the address exception mechanism will be invoked. Instructions which transfer multiple words will only trap if the first address is illegal; subsequent addresses will wrap around inside the 26 bit address space.

Data aborts

If the address is legal but the memory manager generates an abort, the data abort trap will be taken. The writeback of the modified base will take place, but all other processor state will be preserved. The coprocessor is partly responsible for ensuring that the data transfer can be restarted after the cause of the abort has been resolved, and must ensure that any subsequent actions it undertakes can be repeated when the instruction is retried.

32 bit operation

If R15 is specified as the base register (Rn), you must not use write-back.

Examples

        LDC     p1,CR2,table            ; Load CR2 of coprocessor 1 from
                                        ; address table, using a PC relative
                                        ; address.
        
        STCEQL  p2,CR3,[R5,#24]!        ; Conditionally store CR3 of
                                        ; coprocessor 2 into an address
                                        ; 24 bytes up from R5, write this
                                        ; address back into R5, and use long
                                        ; transfer option (probably to store
                                        ; multiple words)

Note that though the address offset is expressed in bytes, the instruction offset field is in words. ObjAsm will adjust the offset appropriately.

Coprocessor register transfers (MCR, MRC)

Instructions for communicating information between ARM and a coprocessor

Instruction format

Assembler syntax

MCR|MRC«cond» CP#,operation,Rd,CRn,CRm«,info»

`MCR`	moves from coprocessor to ARM register (L=1).
`MRC`	moves from ARM register to coprocessor (L=0).
`«cond»`	is a two-character condition mnemonic; see the chapter entitled The condition field.
`CP#`	is the unique number of the required coprocessor, which must be a symbol defined via the CP directive.
`operation`	is evaluated to a constant and placed in the `CP` `Opc` field.
`Rd`	is an expression evaluating to a valid ARM register number.
`CRn` & `CRm`	are expressions evaluating to a valid coprocessor register number, which must be a symbol defined via the CN directive.
`info`	where present is evaluated to a constant and placed in the `CP` field.

Synopsis

These instructions are used to communicate information directly between ARM and a coprocessor. An example of a coprocessor to ARM register transfer (MCR) instruction would be a FIX of a floating point value held in a coprocessor, where the floating point number is converted into a 32 bit integer within the coprocessor, and the result is then transferred to an ARM register. A FLOAT of a 32 bit value in an ARM register into a floating point value within the coprocessor illustrates the use of an ARM register to coprocessor transfer (MRC).

An important use of this instruction is to communicate control information directly from the coprocessor into the ARM PSR flags. As an example, the result of a comparison of two floating point values within a coprocessor can be moved to the PSR to control the subsequent flow of execution.

Note that the ARM6 series and later have an internal coprocessor (#15) for control of on-chip functions. Accesses to this coprocessor are performed during coprocessor register transfers.

The instruction is only executed if the condition is true. The various conditions are defined in The condition field.

The coprocessor fields

The CP# field is used, as for all coprocessor instructions, to specify which coprocessor is being called upon to respond.

The CP Opc, CRn, CP and CRm fields are used only by the coprocessor, and the interpretation presented here is derived from convention only. Other interpretations are allowed where the coprocessor functionality is incompatible with this one. The conventional interpretation is that the CP Opc and CP fields specify the operation the coprocessor is required to perform, CRn is the coprocessor register which is the source or destination of the transferred information, and CRm is a second coprocessor register which may be involved in some way which depends on the particular operation specified.

Transfers to R15

When a coprocessor register transfer to ARM has R15 as the destination, bits 31, 30, 29 and 28 of the transferred word are copied into the N, Z, C and V flags (respectively) of the PSR. The other bits of the transferred word are ignored, and the PC and other PSR flags are unaffected by the transfer.

Transfers from R15

A coprocessor register transfer from ARM with R15 as the source register will store the PC together with the PSR flags.

32 bit operation

Transfers to R15

When a coprocessor register transfer to ARM has R15 as the destination, bits 31, 30, 29 and 28 of the transferred word are copied into the N, Z, C and V flags (respectively) of the CPSR. The other bits of the transferred word are ignored, and the PC and other PSR flags are unaffected by the transfer. This is what you would expect as an extension of the 26 bit behaviour.

Transfers from R15

A coprocessor register transfer from ARM with R15 as the source register will store the PC+12. Unlike the 26 bit behaviour, it does not store the CPSR to the coprocessor.

Examples

        MRC     2,5,R3,CR5,CR6          ; Request Co-Proc 2 to perform
                                        ; operation 5 on CR5 and CR6, and
                                        ; transfer the (single 32 bit word)
                                        ; result back to R3.

        MRCEQ   3,9,R3,CR5,CR6,2        ; Conditionally request Co-Proc 2 to
                                        ; perform operation 9 (type 2) on
                                        ; CR5 and CR6, and transfer the
                                        ; result back to R3.

Undefined instructions

Undefined instructions

Instruction format

Assembler syntax

At present ObjAsm has no mnemonics for generating these instructions. If they are adopted in the future for some specified use, suitable mnemonics will be added to ObjAsm. Until such time, these instructions should not be used.

Synopsis

If the condition is true, the undefined instruction trap will be taken.

Note that the undefined instruction mechanism involves offering these instructions to any coprocessors which may be present, and all coprocessors must refuse to accept them by letting CPA float HIGH.

(Note that some instruction codes are not defined but do not cause the Undefined instruction trap to be taken, for instance a Multiply instruction with bit 5 or bit 6 changed to a 1. These instructions should be avoided, as their action may change in future ARM implementations.)

The instruction is only executed if the condition is true. The various conditions are defined in The condition field.

Instruction set summary

Instructions available on ARM, briefly summarised

Instruction formats

Assembler syntax

B|BL«cond» expression

MOV|MVN«cond»«S»Rd,op2 CMN|CMP|TEQ|TST«cond»«P»Rn,op2 ADC|ADD|AND|BIC|OR|ORR|RSB|RSC|SBC|SUB«cond»«S»Rd,Rn,op2

MRS«cond» Rd,psr MSR«cond» psr,Rm MSR«cond» psrf,Rm MSR«cond» psrf,#expression

MUL«cond»«S» Rd,Rm,Rs MLA«cond»«S» Rd,Rm,Rs,Rn

UMULL|SMULL|UMLAL|SMLAL«cond»«S» RdLo,RdHi,Rm,Rs

LDR|STR«cond»«B»«T» Rd,address

LDM|STM«cond»FD|ED|FA|EA|IA|IB|DA|DB Rn«!»,Rlist«^»

SWP«cond»«B» Rd,Rm,[Rn]

SWI«cond» expression

CDP«cond» CP#,operation,CRd,CRn,CRm«,info»

LDC|STC«cond»«L» CP#,CRd, address

MCR|MRC«cond» CP#,operation,Rd,CRn,CRm«,info»

Parameters for the above, alphabetically sorted

address

can be:

An expression which generates an address:
expression

ObjAsm will attempt to generate an instruction using the PC as a base and a corrected immediate offset to address the location given by evaluating the expression. This will be a PC relative, pre-indexed address. If the address is out of range, an error will be generated.
A pre-indexed addressing specification:

[Rn] offset of zero
[Rn,#expression]«!» offset of expression bytes
[Rn,«+|-»Rm«,shift»]«!» offset of ±contents of index register, shifted by shift (not available for LDC/STC).
A post-indexed addressing specification:

[Rn],#expression offset of expression bytes
[Rn],«+|-»Rm«,shift» offset of ±contents of index register, shifted by shift (not available for LDC/STC).

Rn and Rm are expressions evaluating to a valid ARM register number. Note if Rn is R15 then ObjAsm will subtract 8 from the offset value to allow for ARM pipelining.

shift is a general shift operation (see the chapter entitled Shift types), but note that the shift amount may not be specified by a register.

«!» if present sets the W bit to write-back the base register.

«B» means to transfer a byte, otherwise a word is transferred.

«cond» is a two-character condition mnemonic; see the chapter entitled The condition field.

CP# is the unique number of the required coprocessor, which must be a symbol defined via the CP directive.

CRd, CRn & CRm are expressions evaluating to a valid coprocessor register number, which must be a symbol defined via the CN directive.

expression for B and BL is a program-relative expression describing the branch destination, from which ObjAsm calculates the offset; for SWI, it is evaluated and placed in the comment field as a SWI number (which is ignored by ARM).

#expression

is an expression symbolising a 32 bit value.

If #expression is used, ObjAsm will attempt to match the expression by generating a shifted immediate 8-bit field. If this is impossible, it will give an error.

info where present is evaluated to a constant and placed in the CP field.

«L» when present perform long transfer (N=1), otherwise perform short transfer (N=0).

op2

may be any of the operands that the barrel shifter can produce.

The syntax is Rm«,shift» or #expression

If #expression is used, ObjAsm will attempt to match the expression by generating a shifted immediate 8-bit field. If this is impossible, it will give an error.

shift is shiftname Rs or shiftname #expression, or RRX (rotate right one bit with extend). shiftnames are: ASL, LSL, LSR, ASR, and ROR. (ASL is a synonym for LSL, and the two assemble to the same code.) See Shift types.

operation is evaluated to a constant and placed in the CP Opc field.

«P» means to take the result of a CMN, CMP, TEQ or TST operation, and move it to the bits of R15 that hold the PSR - even though the instruction has no destination register. Bits corresponding to the PC are masked out, as are (in User mode) the I, F, and mode bits.

psr

is CPSR, CPSR_all, SPSR or SPSR_all.

(CPSR and CPSR_all are synonyms, as are SPSR and SPSR_all.)

psrf is CPSR_flg or SPSR_flg. The most significant four bits of Rm or #expression are written to the N, Z, C and V flags respectively.

Rd, RdLo, RdHi, Rm, Rn & Rs are expressions evaluating to a valid ARM register number.

Rlist is either a comma-separated list of registers and/or of register ranges indicated by hyphens, all enclosed in {} (e.g. {R0,R2-R7,R10}); or an expression evaluating to the 16 bit operand.

«S» means to set the PSR's condition codes from the operation. ObjAsm forces this for CMN, CMP, TEQ and TST, provided the P flag is not specified.

«T» means to set the W bit in a post-indexed instruction, forcing non-privileged mode for the transfer cycle. T is not allowed when a pre-indexed addressing mode is specified or implied.

«!» if present sets the W bit to write-back the base register.

«^» if present sets the S bit to load the PSR with the PC, or forces storing of user bank registers when in a non-user mode.

Synopsis

For a detailed synopsis of the various instructions, see the following sections:

Branch, Branch with Link (B, BL)

Data processing

PSR transfer (MRS, MSR)

Multiply and Multiply-Accumulate (MUL, MLA)

Multiply Long and Multiply-Accumulate Long (UMULL, SMULL, UMLAL, SMLAL)

Single data transfer (LDR, STR)

Block data transfer (LDM, STM)

Single data swap (SWP)

Software interrupt (SWI)

Coprocessor data operations (CDP)

Coprocessor data transfers (LDC, STC)

Coprocessor register transfers (MCR, MRC)

Undefined instructions

Further instructions

Further instructions

The above completes the description of all the basic ARM instructions. However, ObjAsm understands a number of other instructions, which it translates into appropriate basic ARM instructions.

Extended range immediate constants

Synopsis

In the case of an instruction such as

MOV R0,#VALUE

ObjAsm will evaluate the expression and produce a CPU instruction to load the value into the destination register. This may not in fact be the machine level instruction known as MOV, but the programmer need not be aware that an alternative instruction has been substituted. A common example is

MOV Rn,#-1

which the CPU cannot handle directly (as -1 is not a valid immediate constant). ObjAsm will accept this syntax, but will convert it and generate object code for

MVN Rn,#0

which results in Rn containing -1. Such conversions also takes place between the following pairs of instructions:

BIC/AND
ADD/SUB
ADC/SBC
CMP/CMN

The ADR instruction

Assembler syntax

ADR«cond» register,expression

Synopsis

This produces an address in a register. ARM does not have an explicit 'calculate effective address' instruction, as this can generally be done using ADD, SUB, MOV or MVN. To ease the construction of such instructions, ObjAsm provides an ADR instruction.

The expression may be register-relative, program-relative or numeric:

Register-relative
ADD|SUB register,register2,#constant
will be produced, where register2 is the register to which the expression is relative.
Program-relative
ADD|SUB register,PC,#constant
will be produced.
Numeric
MOV|MVN register,#constant
will be produced.

In all three cases, an error will be generated if the immediate constant required is out of range.

If the program has a fixed origin (that is, if the ORG directive has been used), the distinction between program-relative and numeric values disappears. In this case, ObjAsm will first try to treat such a value as program-relative. If this fails, it will try to treat it as numeric. An error will only be generated if both attempts fail.

The ADRL instruction

Assembler syntax: ADR«cond»L register,expression
Synopsis: This form of ADR is provided by ADRL and allows a wider collection of effective addresses to be produced. ADRL can be used in the same way as ADR, except that the allowed range of constants is any constant specified as an even rotation of a value less than &10000. Again program-relative, register relative and numeric forms exist. The result produced will always be two instructions, even if it could have been done in one. An error will be generated if the necessary immediate constants cannot be produced.

Literals

Assembler syntax

LDR register,=expression

Synopsis

Literals are intended to enable the programmer to load immediate values into a register which might be out of range as MOV/MVN arguments.

ObjAsm will take certain actions with literals. It will:

if possible, replace the instruction with a MOV or MVN,
otherwise, generate a program-relative LDR and if no such literal already exists within the addressable range, place the literal in the next literal pool.

Program-relative expressions and imported symbols are also valid literals. See the chapter entitled Organisational directives - END, ORG, LTORG and KEEP for further information.

www.riscos.com Technical Support:Acorn Assembler

CPU instruction set

The condition field

Conditional instruction sequence

Instruction timings

The barrel shifter

Unshifted register

Register shifted by a constant amount

Value resulting from rotating register and carry bit one bit right

Register shifted by n bits

8-bit constant rotated right by 2n bits

8-bit constant rotated right by 2n bits and specified explicitly

Shift types

Mnemonics

Specification of the shift amount

Instruction specified shift amount

Register specified shift amount

Logical shift left, or arithmetic shift left

Special cases

Logical shift right

Special cases

Arithmetic shift right

Special cases

Rotate right

Special cases

Rotate right with extend

Coprocessor instructions

Branch, Branch with Link (B, BL)

Instruction format

Assembler syntax

Synopsis

The link bit

32 bit operation

Calculating the offset

The link bit

Examples

Data processing

Instruction format

Assembler syntax

MOV and MVN - single operand

CMN, CMP, TEQ and TST - no result written

ADC, ADD, AND, BIC, OR, ORR, RSB, RSC, SBC, SUB - two operands

Parameters

Opcodes

Synopsis

The S bit

Logical operations (AND, BIC, EOR, MOV, MVN, ORR, TEQ, TST)

Arithmetic operations (ADC, ADD, CMP, CMN, RSB, RSC, SBC, SUB)

The P flag

Shifts

Immediate operand rotates

Using R15 as the destination or operand

Using R15 as the destination

Using R15 as an operand

32 bit operation

TEQP, TSTP, CMPP and CMNP

Using R15 as the shift register

Using R15 as the destination

Examples

PSR transfer (MRS, MSR)

Instruction format

Assembler syntax

Synopsis

Operand restrictions

Reserved bits

Examples

Multiply and Multiply-Accumulate (MUL, MLA)

Instruction format

Assembler syntax

Synopsis

PSR flags

Operand restrictions

32 bit operation

Examples

Multiply Long and Multiply-Accumulate Long (UMULL, SMULL, UMLAL, SMLAL)

Instruction format

Assembler syntax

Synopsis

PSR flags

Operand restrictions

www.riscos.com Technical Support:
Acorn Assembler