Description
Work environment
| Questions | Answers |
|---|---|
| OS/arch/bits | MacOS AArch64 |
| Architecture | armv8 |
| Source of Capstone | git clone |
| Version/git commit | df72286 |
Sorry for not using the template fully, and sorry in advance for the long issue, but I have identified a fair few instructions (mainly SVE) with incorrect access types, and others with incorrect implicit register reads / writes or no immediate encoding.
Incorrect implicit destinations
- Opcode AArch64_MRS (0x42d03bd5) currently has the NZCV register as an implicit write register. This isn't correct.
- Opcode AArch64_BLR (0x20003fd6) has an implicit read of SP. Again, this isn't correct.
Incorrect access permissions
This concerns SVE instructions of the form fsub zdn, pg, zdn, zm, that is, where the 1st and 2nd z-regs must be the same. Both of these operands have their access set to READ | WRITE.
Although this is technically correct, as register zdn is being written to and read from, I think it can be confusing. My reasoning for this is that operand[0] represents the register being written to, so its access should be just WRITE. Then operand[2] (in the example above) is the first source vector and should be READ.
If someone does not know that the destination vector register and the first source vector are mandated by the ISA spec to be the same register, then it could be confusing to see 2 registers being written to.
Here is an example of this occurring:
0 20 00 19 04 eor z0.b, p0/m, z0.b, z1.b
ID: 311 (eor)
op_count: 4
operands[0].type: REG = z0
operands[0].access: READ | WRITE
Vector Arrangement Specifier: 0x8
operands[1].type: PREDICATE
operands[1].pred.reg: p0
operands[1].access: READ
operands[2].type: REG = z0
operands[2].access: READ | WRITE
Vector Arrangement Specifier: 0x8
operands[3].type: REG = z1
operands[3].access: READ
Vector Arrangement Specifier: 0x8
Write-back: True
Registers read: z0 p0 z1
Registers modified: z0
Groups: HasSVEorSME
This could instead be:
0 20 00 19 04 eor z0.b, p0/m, z0.b, z1.b
ID: 311 (eor)
op_count: 4
operands[0].type: REG = z0
operands[0].access: WRITE
Vector Arrangement Specifier: 0x8
operands[1].type: PREDICATE
operands[1].pred.reg: p0
operands[1].access: READ
operands[2].type: REG = z0
operands[2].access: READ
Vector Arrangement Specifier: 0x8
operands[3].type: REG = z1
operands[3].access: READ
Vector Arrangement Specifier: 0x8
Write-back: True
Registers read: z0 p0 z1
Registers modified: z0
Groups: HasSVEorSME
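To illustrate that the proposed change only affects the per-operand view, here is a minimal Python sketch. It is a toy model, not Capstone's implementation: `summarize`, `current` and `proposed` are hypothetical names, and the two flag constants are defined locally on the assumption that they match the bit values of Capstone's `CS_AC_READ` and `CS_AC_WRITE`. It shows that the aggregated "Registers read" / "Registers modified" sets come out the same for both operand layouts of the eor example.

```python
# Locally defined flags, assumed to mirror Capstone's CS_AC_READ / CS_AC_WRITE bits.
CS_AC_READ = 1
CS_AC_WRITE = 2


def summarize(operands):
    """Aggregate (register, access) pairs into (registers read, registers modified) sets."""
    read, modified = set(), set()
    for reg, access in operands:
        if access & CS_AC_READ:
            read.add(reg)
        if access & CS_AC_WRITE:
            modified.add(reg)
    return read, modified


# eor z0.b, p0/m, z0.b, z1.b as currently reported.
current = [
    ("z0", CS_AC_READ | CS_AC_WRITE),
    ("p0", CS_AC_READ),
    ("z0", CS_AC_READ | CS_AC_WRITE),
    ("z1", CS_AC_READ),
]

# The same instruction with the proposed per-operand access.
proposed = [
    ("z0", CS_AC_WRITE),
    ("p0", CS_AC_READ),
    ("z0", CS_AC_READ),
    ("z1", CS_AC_READ),
]

print(summarize(current) == summarize(proposed))
```

In this model both layouts give z0, p0 and z1 as read and z0 as modified, so only the operand-level access flags would differ.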
I don't have a comprehensive list of all instructions that are affected by this, but it generally seems to be SVE only and with the format zdn, pg, zdn, <zm|#imm>.
Below is a list of opcode enums (and some bytecodes where I've made a note of them) of the ones I have run into so far:
- AArch64_FSUB_ZPmI_D
- AArch64_FSUB_ZPmI_H
- AArch64_FSUB_ZPmI_S // Example bytecode - 00849965
- AArch64_FMUL_ZPmI_D
- AArch64_FMUL_ZPmI_H
- AArch64_FMUL_ZPmI_S // Example bytecode - 00809a65
- AArch64_FADD_ZPmI_D // Example bytecode - 0584d865
- AArch64_FADD_ZPmI_H
- AArch64_FADD_ZPmI_S
- AArch64_AND_ZPmZ_D // Example bytecode - 4901da04
- AArch64_AND_ZPmZ_H
- AArch64_AND_ZPmZ_S
- AArch64_AND_ZPmZ_B
- AArch64_SMULH_ZPmZ_B // Example bytecode - 20001204
- AArch64_SMULH_ZPmZ_D
- AArch64_SMULH_ZPmZ_H
- AArch64_SMULH_ZPmZ_S
- AArch64_SMIN_ZPmZ_B
- AArch64_SMIN_ZPmZ_D
- AArch64_SMIN_ZPmZ_H
- AArch64_SMIN_ZPmZ_S // Example bytecode - 01008a04
- AArch64_SMAX_ZPmZ_B
- AArch64_SMAX_ZPmZ_D
- AArch64_SMAX_ZPmZ_H
- AArch64_SMAX_ZPmZ_S // Example bytecode - 01008804
- AArch64_MUL_ZPmZ_B // Example bytecode - 40001004
- AArch64_MUL_ZPmZ_D
- AArch64_MUL_ZPmZ_H
- AArch64_MUL_ZPmZ_S
- AArch64_FSUBR_ZPmZ_D
- AArch64_FSUBR_ZPmZ_H
- AArch64_FSUBR_ZPmZ_S // Example bytecode - 24808365
- AArch64_FSUB_ZPmZ_D
- AArch64_FSUB_ZPmZ_H
- AArch64_FSUB_ZPmZ_S // Example bytecode - 24808165
- AArch64_FMUL_ZPmZ_D
- AArch64_FMUL_ZPmZ_H
- AArch64_FMUL_ZPmZ_S // Example bytecode - 83808265
- AArch64_FDIV_ZPmZ_D // Example bytecode - 0184cd65
- AArch64_FDIV_ZPmZ_H
- AArch64_FDIV_ZPmZ_S
- AArch64_FDIVR_ZPmZ_D // Example bytecode - 0184cc65
- AArch64_FDIVR_ZPmZ_H
- AArch64_FDIVR_ZPmZ_S
- AArch64_FADDA_VPZ_D
- AArch64_FADDA_VPZ_H
- AArch64_FADDA_VPZ_S // Example bytecode - 01249865
- AArch64_FADD_ZPmZ_D // Example bytecode - 6480c065
- AArch64_FADD_ZPmZ_H
- AArch64_FADD_ZPmZ_S
- AArch64_FCADD_ZPmZ_D // Example bytecode - 2080c064
- AArch64_FCADD_ZPmZ_H
- AArch64_FCADD_ZPmZ_S
- AArch64_ADD_ZPmZ_B // Example bytecode - 00000004
- AArch64_ADD_ZPmZ_D
- AArch64_ADD_ZPmZ_H
- AArch64_ADD_ZPmZ_S
- AArch64_EOR_ZPmZ_B // Example bytecode - 20001904
- AArch64_EOR_ZPmZ_D
- AArch64_EOR_ZPmZ_H
- AArch64_EOR_ZPmZ_S
Similar behaviour has also been seen with unpredicated SVE instructions where operand[0] and operand[1] must be the same SVE vector register:
- AArch64_SMAX_ZI_B
- AArch64_SMAX_ZI_D
- AArch64_SMAX_ZI_H
- AArch64_SMAX_ZI_S // Example bytecode - 03c0a825
- AArch64_AND_ZI // Example bytecode - 00068005
- AArch64_ADD_ZI_B // Example bytecode - 00c12025
- AArch64_ADD_ZI_D
- AArch64_ADD_ZI_H
- AArch64_ADD_ZI_S
Incorrect access permissions pt. 2
There are some other instructions I have found with wrong access information.
AArch64_CASALX and AArch64_CASALW // Example bytecode - 02fce188
0 02 fc e1 88 casal w1, w2, [x0]
ID: 127 (casal)
op_count: 3
operands[0].type: REG = w1
operands[0].access: READ | WRITE
operands[1].type: REG = w2
operands[1].access: READ
operands[2].type: MEM
operands[2].mem.base: REG = x0
operands[2].access: READ | WRITE
Write-back: True
Registers read: w1 w2 x0
Registers modified: w1 x0
Groups: HasLSE
All permissions should be READ as no register is updated with CASAL. Also writeback should be False:
0 02 fc e1 88 casal w1, w2, [x0]
ID: 127 (casal)
op_count: 3
operands[0].type: REG = w1
operands[0].access: READ
operands[1].type: REG = w2
operands[1].access: READ
operands[2].type: MEM
operands[2].mem.base: REG = x0
operands[2].access: READ
Registers read: w1 w2 x0
Registers modified: w1 x0
Groups: HasLSE
AArch64_FCVTNv4i32 // Example bytecode - 0168614e
0 01 68 61 4e fcvtn2 v1.4s, v0.2d
ID: 367 (fcvtn2)
op_count: 2
operands[0].type: REG = q1 (vreg)
operands[0].access: READ | WRITE
Vector Arrangement Specifier: 0x420
operands[1].type: REG = q0 (vreg)
operands[1].access: READ
Vector Arrangement Specifier: 0x240
Write-back: True
Registers read: fpcr q1 q0
Registers modified: q1
Groups: HasNEON
operands[0] should be WRITE only. More variants of this instruction may be affected, but I haven't verified this:
0 01 68 61 4e fcvtn2 v1.4s, v0.2d
ID: 367 (fcvtn2)
op_count: 2
operands[0].type: REG = q1 (vreg)
operands[0].access: WRITE
Vector Arrangement Specifier: 0x420
operands[1].type: REG = q0 (vreg)
operands[1].access: READ
Vector Arrangement Specifier: 0x240
Write-back: True
Registers read: fpcr q1 q0
Registers modified: q1
Groups: HasNEON
Imm not set when a shift is present
For many instructions that take an immediate, a shift can also optionally be provided. When the shift is not provided, the instructions work fine.
However, when the shift is provided, the shift amount is often fixed or in a range. As such, the Capstone / LLVM disassembler automatically works out the shifted value. The shifted immediate is given correctly in the operand string, but not in the disassembly info.
Example: AArch64_CPY_ZPzI_H: // Example bytecode - 01215005
0 01 21 50 05 mov z1.h, p0/z, #0x800
ID: 273 (cpy)
Is alias: 1429 (mov) with REAL operand set
op_count: 4
operands[0].type: REG = z1
operands[0].access: WRITE
Vector Arrangement Specifier: 0x10
operands[1].type: PREDICATE
operands[1].pred.reg: p0
operands[1].access: READ
operands[2].type: IMM = 0x0
operands[2].access: READ
operands[3].type: IMM = 0x0
operands[3].access: READ
Registers read: p0
Registers modified: z1
Groups: HasSVEorSME
Here, there is an extra operand in operands[3], and the imm is not set in operands[2]. The expected output would be:
0 01 21 50 05 mov z1.h, p0/z, #0x800
ID: 273 (cpy)
Is alias: 1429 (mov) with REAL operand set
op_count: 4
operands[0].type: REG = z1
operands[0].access: WRITE
Vector Arrangement Specifier: 0x10
operands[1].type: PREDICATE
operands[1].pred.reg: p0
operands[1].access: READ
operands[2].type: IMM = 0x800
operands[2].access: READ
Registers read: p0
Registers modified: z1
Groups: HasSVEorSME
An alternative assembly for this instruction (and the one I used to generate the bytecode) is cpy z1.h, p0/z, #8, lsl #8, where the only LSL available is by #8.
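As a cross-check of the expected immediate, here is a minimal Python sketch that decodes the example bytecode by hand. The helper name `decode_cpy_imm` is hypothetical, and the field positions (imm8 in bits 12:5, the sh bit in bit 13) are my assumption about the CPY (immediate) encoding; they are consistent with this one example but should be checked against the Arm ARM before relying on them.

```python
def decode_cpy_imm(encoding_hex):
    """Return (imm8, shift, value) for a CPY (immediate) encoding given as little-endian hex bytes."""
    word = int.from_bytes(bytes.fromhex(encoding_hex), "little")
    imm8 = (word >> 5) & 0xFF  # assumed: imm8 occupies bits 12:5
    if imm8 & 0x80:  # treat imm8 as a signed 8-bit field
        imm8 -= 0x100
    shift = 8 if (word >> 13) & 1 else 0  # assumed: sh (bit 13) selects LSL #8
    return imm8, shift, imm8 << shift


imm8, shift, value = decode_cpy_imm("01215005")
print(imm8, shift, hex(value))  # 8 8 0x800
```

The decoded value 0x800 matches the operand string, which is the value I would expect to find in operands[2].imm.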
The instructions we have found to be affected are:
- AArch64_ADD_ZI_B
- AArch64_ADD_ZI_D
- AArch64_ADD_ZI_H
- AArch64_ADD_ZI_S
- AArch64_CPY_ZPzI_B
- AArch64_CPY_ZPzI_D
- AArch64_CPY_ZPzI_H
- AArch64_CPY_ZPzI_S
This issue is likely to affect all instructions which use immediates and optional shifts in this way.
FP immediate not shown in disassembly information
For instructions which take a fixed floating point immediate value, it is correctly identified that one exists, and the EXACTFPIMM field is populated. However, there is also the .fp field in the cs_aarch64_op union. It could be useful to populate this field as well as the enum, for better clarity and improved in-project usage.
Example: AArch64_FADD_ZPmI_D // Example bytecode - 0584d865
0 05 84 d8 65 fadd z5.d, p1/m, z5.d, #0.5
ID: 332 (fadd)
op_count: 4
operands[0].type: REG = z5
operands[0].access: READ | WRITE
Vector Arrangement Specifier: 0x40
operands[1].type: PREDICATE
operands[1].pred.reg: p1
operands[1].access: READ
operands[2].type: REG = z5
operands[2].access: READ | WRITE
Vector Arrangement Specifier: 0x40
operands[3].type: SYS IMM:
operands[3].subtype EXACTFPIMM = 1
Write-back: True
Registers read: z5 p1
Registers modified: z5
Groups: HasSVEorSME
Could be:
0 05 84 d8 65 fadd z5.d, p1/m, z5.d, #0.5
ID: 332 (fadd)
op_count: 4
operands[0].type: REG = z5
operands[0].access: READ | WRITE
Vector Arrangement Specifier: 0x40
operands[1].type: PREDICATE
operands[1].pred.reg: p1
operands[1].access: READ
operands[2].type: REG = z5
operands[2].access: READ | WRITE
Vector Arrangement Specifier: 0x40
operands[3].type: SYS IMM:
operands[3].subtype EXACTFPIMM = 1
operands[3].fp = 0.5
Write-back: True
Registers read: z5 p1
Registers modified: z5
Groups: HasSVEorSME
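Until the .fp field is populated, a user-side workaround could look like the following Python sketch. The table is an assumption based on my understanding of LLVM's ExactFPImm values; only the entry 1 -> 0.5 is confirmed by the output above, and `EXACT_FP_IMM` and `exact_fp_value` are hypothetical names rather than part of Capstone.

```python
# Assumed EXACTFPIMM subtype -> value table; only 1 -> 0.5 is confirmed by the dump above.
EXACT_FP_IMM = {
    0: 0.0,
    1: 0.5,
    2: 1.0,
    3: 2.0,
}


def exact_fp_value(subtype):
    """Return the floating point value for an EXACTFPIMM subtype, or None if it is unknown."""
    return EXACT_FP_IMM.get(subtype)


print(exact_fp_value(1))  # 0.5
```

Having Capstone fill in .fp directly would remove the need for each project to carry a table like this.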
Thanks in advance!