Skip to main content

A tiny RISC-V instruction set simulator

Project description

tinyRV

A RISC-V instruction decoder and instruction set simulator in less than 200 lines of python.

Mission: Make the most useful RISC-V disassembler/simulator for understanding the ISA and reverse-engineering binaries with the least amount of easily extendable code. Simulation performance is secondary.

  • Uses official RISC-V specs to decode every specified RISC-V instruction.
  • Simulates the base ISAs and is easily extendable.
  • RV32IMAZicsr_Zifencei and RV64IMAZicsr_Zifencei compliance validated using RISCOF (see Testing below).
  • Boots Linux! Big thanks to CNLohr for mini-rv32ima.

Getting Started

pip install tinyrv

Print all RISC-V instructions in a binary:

tinyrv-dump firmware.bin

Outputs for firmware.bin from picorv32:

00000000: custom0                                  # INVALID data=0x800400b
00000004: custom0                                  # INVALID data=0x600600b
00000008: jal        zero, 0x3e0                   # rv_i
0000000c: addi       zero, zero, 0                 # rv_i
00000010: custom0                                  # INVALID data=0x200a10b
00000014: custom0                                  # INVALID data=0x201218b
00000018: lui        ra, 0                         # rv_i
0000001c: addi       ra, ra, 0x160                 # rv_i
00000020: custom0                                  # INVALID data=0x410b
00000024: sw         sp, 0(ra)                     # rv_i
00000028: custom0                                  # INVALID data=0x1410b
0000002c: sw         sp, 4(ra)                     # rv_i
00000030: custom0                                  # INVALID data=0x1c10b
00000034: sw         sp, 8(ra)                     # rv_i
00000038: sw         gp, 12(ra)                    # rv_i
0000003c: sw         tp, 16(ra)                    # rv_i
...

picorv32 uses some custom instructions for IRQ handling.

Decode instructions from data:

tinyrv-dump 0xf2410113 0xde0ec086 0x2013b7

or in python:

import tinyrv
for op in tinyrv.decoder(0xf2410113, 0xde0ec086, 0x2013b7):
    print(op)

Outputs four instructions (the second word contains actually two 16-bit compressed instructions):

addi       sp, sp, -220
c.swsp     ra, uimm8sp_s=64
c.swsp     gp, uimm8sp_s=60
lui        t2, 0x201000

Each decoded instruction comes with a lot of metadata and parsed arguments:

op = tinyrv.decode(0xf2410113)
print(hex(op.data), op.name, op.extension, op.variable_fields, bin(op.mask), bin(op.match), op.valid())
print(op.args, op.rd, op.rs1, op.imm12)
print(op.arg_str())
0xf2410113 addi ['rv_i'] ['rd', 'rs1', 'imm12'] 0b111000001111111 0b10011 True
{'rd': 2, 'rs1': 2, 'imm12': -220} 2 2 -220
sp, sp, -220

Simulate a binary:

rv = tinyrv.sim(xlen=32)  # xlen affects overflows, sign extensions
rv.read_bin('firmware.bin', base=0)
print(rv.x)  # print registers
print()
rv.step()  # simulate a single instruction at rv.pc

Outputs:

x00(ro)=00000000  x08(fp)=00000000  x16(a6)=00000000  x24(s8)=00000000
x01(ra)=00000000  x09(s1)=00000000  x17(a7)=00000000  x25(s9)=00000000
x02(sp)=00000000  x10(a0)=00000000  x18(s2)=00000000  x26(10)=00000000
x03(gp)=00000000  x11(a1)=00000000  x19(s3)=00000000  x27(11)=00000000
x04(tp)=00000000  x12(a2)=00000000  x20(s4)=00000000  x28(t3)=00000000
x05(t0)=00000000  x13(a3)=00000000  x21(s5)=00000000  x29(t4)=00000000
x06(t1)=00000000  x14(a4)=00000000  x22(s6)=00000000  x30(t5)=00000000
x07(t2)=00000000  x15(a5)=00000000  x23(s7)=00000000  x31(t6)=00000000


00000000: unimplemented: 0800400b custom0    
00000000: custom0                                  #

Simulation halts at the first instruction that is not implemented. Just set the pc and carry on:

rv.pc = 8
rv.run(50)
00000008: jal        zero, 0x3e0                   # 

000003e0: addi       ra, zero, 0                   # 
000003e4: addi       sp, zero, 0                   # 
(... boring initialization stuff skipped ...)
00000454: addi       t5, zero, 0                   # 
00000458: addi       t6, zero, 0                   # 
0000045c: lui        sp, 0x20000                   # sp=00020000
00000460: jal        ra, 0xbdc                     # ra=00000464

00000bdc: lui        a0, 0xc000                    # a0=0000c000
00000be0: addi       a0, a0, 0x79c                 # a0=0000c79c
00000be4: jal        zero, 0xb08                   # 

00000b08: lui        a4, 0x10000000                # a4=10000000
00000b0c: lbu        a5, 0(a0)                     # mem[0000c79c]->68 a5=00000068
00000b10: bne        a5, zero, 0xb18               # 

00000b18: addi       a0, a0, 1                     # a0=0000c79d
00000b1c: sw         a5, 0(a4)                     # 00000068->mem[10000000]
00000b20: jal        zero, 0xb0c                   # 

00000b0c: lbu        a5, 0(a0)                     # mem[0000c79d]->65 a5=00000065
00000b10: bne        a5, zero, 0xb18               # 

00000b18: addi       a0, a0, 1                     # a0=0000c79e
00000b1c: sw         a5, 0(a4)                     # 00000065->mem[10000000]
00000b20: jal        zero, 0xb0c                   # 

00000b0c: lbu        a5, 0(a0)                     # mem[0000c79e]->6c a5=0000006c
00000b10: bne        a5, zero, 0xb18               # 

Each jump, taken branch produces a newline, right-hand side has register changes and memory transactions. Memory is paged, allocated on demand and persists. Now let's get past this loop by setting a breakpoint:

rv.run(1000, bpts={0xb14})
rv.run(10)
...
00000b1c: sw         a5, 0(a4)                     # 00000064->mem[10000000]
00000b20: jal        zero, 0xb0c                   # 

00000b0c: lbu        a5, 0(a0)                     # mem[0000c7a7]->0a a5=0000000a
00000b10: bne        a5, zero, 0xb18               # 

00000b18: addi       a0, a0, 1                     # a0=0000c7a8
00000b1c: sw         a5, 0(a4)                     # 0000000a->mem[10000000]
00000b20: jal        zero, 0xb0c                   # 

00000b0c: lbu        a5, 0(a0)                     # mem[0000c7a8]->00 a5=00000000
00000b10: bne        a5, zero, 0xb18               # 
00000b14: jalr       zero, 0(ra)                   # 

00000464: addi       ra, zero, 0x3e8               # ra=000003e8

00000468: unimplemented: 0a00e00b custom0    
00000468: custom0                                  # 

Simulate from command line:

# usage: tinyrv-sim file.bin [(32|64) [run limit in decimal [start PC in hex]]]

tinyrv-sim firmware.bin 32 100 8

Dev Setup

All core simulator code is in tinyrv/tinyrv.py and has no external dependencies. tinyRV loads opcode specs from tinyrv/opcodes.py, which is auto-generated from riscv-opcodes by tinyrv_opcodes_gen.py.

Do this to re-generate:

git clone https://github.com/riscv/riscv-opcodes.git
make -C riscv-opcodes
python3 tinyrv_opcodes_gen.py

Some VMs use external libraries:

Testing

Install riscv-gnu-toolchain or homebrew-riscv (for MacOS).

RISCOF

Install the RISC-V compatibility framework RISCOF:

pip3 install setuptools wheel
git clone https://github.com/riscv/riscof.git
cd riscof
pip3 install -e .

Install the Sail ISA specification language:

brew install opam zlib z3 pkg-config
opam init
opam switch create ocaml-base-compiler
opam install sail
eval $(opam config env)

Install the RISCV Sail Model:

git clone https://github.com/riscv/sail-riscv.git
cd sail-riscv
ARCH=RV32 make c_emulator/riscv_sim_RV32
ARCH=RV64 make c_emulator/riscv_sim_RV64
# copy / link c_emulator/riscv_sim_RV{32,64} into $PATH location

Optionally, install Spike RISC-V ISA Simulator:

git clone https://github.com/riscv-software-src/riscv-isa-sim.git
cd riscv-isa-sim
mkdir build
cd build
../configure --prefix=/path/to/install  # /path/to/install/bin must be in $PATH
make
make install
spike  # test

Then, run the tests:

make -C tests run_riscof

riscv-software-src/riscv-tests

make -C tests run_riscv_tests

This will automatically clone and build the test suite if necessary.

Run Coremark

make -C tests run_coremark

This will automatically clone and build coremark if necessary.

Boot Linux

make -C tests run_linux

Will download an Image and boot it. Boottime is about a minute on a reasonable fast machine and recent Python.

The Image was made using buildroot. The configuration is based on qemu_riscv64_nommu_virt_defconfig with FPU, compressed instructions and other extensions turned off.

To (re-)create this Image, run:

make -C tests Image

This will download buildroot and build the kernel (takes some time). The build configuration is in tests/br2_external_tinyrv.

Related

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

tinyrv-0.0.8.tar.gz (71.8 kB view hashes)

Uploaded Source

Built Distribution

tinyrv-0.0.8-py3-none-any.whl (63.4 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page