A tiny RISC-V instruction set simulator
tinyRV
A RISC-V instruction decoder, instruction set simulator and basic system emulator in less than 1000 lines of python.
Mission: Make the most useful RISC-V disassembler/simulator for understanding the ISA and reverse-engineering binaries with the least amount of easily extendable code. Simulation performance is secondary.
- Uses official RISC-V specs to decode every specified RISC-V instruction.
- Simulates the base ISAs and is easily extendable.
- RV32GC and RV64GC compliance validated using riscof and riscv-tests (see Testing below).
- IEEE754 compliant single-precision and double-precision floating point.
- Emulates a basic user environment for running ELFs. Supports some Linux system calls, argc/argv, semihosting, HTIF.
- Emulates a virt system similar to qemu with: UART, CLINT, PLIC, basic boot-loader, DTB generation.
- Boots nommu Linux images! Big thanks to CNLohr for mini-rv32ima.
Getting Started
pip install tinyrv
Print all RISC-V instructions in a binary:
tinyrv-dump firmware.bin
Outputs for firmware.bin from picorv32:
00000000: custom0 # INVALID data=0x800400b
00000004: custom0 # INVALID data=0x600600b
00000008: jal zero, 0x3e0 # rv_i
0000000c: addi zero, zero, 0 # rv_i
00000010: custom0 # INVALID data=0x200a10b
00000014: custom0 # INVALID data=0x201218b
00000018: lui ra, 0 # rv_i
0000001c: addi ra, ra, 0x160 # rv_i
00000020: custom0 # INVALID data=0x410b
00000024: sw sp, 0(ra) # rv_i
00000028: custom0 # INVALID data=0x1410b
0000002c: sw sp, 4(ra) # rv_i
00000030: custom0 # INVALID data=0x1c10b
00000034: sw sp, 8(ra) # rv_i
00000038: sw gp, 12(ra) # rv_i
0000003c: sw tp, 16(ra) # rv_i
...
picorv32 uses some custom instructions for IRQ handling.
Decode instructions from data:
tinyrv-dump 0xf2410113 0xde0ec086 0x2013b7
or in python:
import tinyrv
for op in tinyrv.decoder(0xf2410113, 0xde0ec086, 0x2013b7):
    print(op)
Outputs four instructions (the second word actually contains two 16-bit compressed instructions):
addi sp, sp, -220
c.swsp ra, 0x40(sp)
c.swsp gp, 0x3c(sp)
lui t2, 0x201000
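Why three words yield four instructions can be sketched with the standard RISC-V length rule: a 16-bit parcel whose two lowest bits are not 11 is a compressed instruction, otherwise it starts a 32-bit instruction. The helper below is a hypothetical illustration, not tinyRV's own code; it assumes little-endian words and ignores instruction formats longer than 32 bits.

```python
import struct

def split_parcels(*words):
    """Split 32-bit little-endian words into 16-bit and 32-bit instruction parcels."""
    raw = b"".join(struct.pack("<I", w) for w in words)
    halves = struct.unpack(f"<{len(raw) // 2}H", raw)
    parcels, i = [], 0
    while i < len(halves):
        low = halves[i]
        if low & 0b11 == 0b11 and i + 1 < len(halves):
            # Low two bits are 11: a 32-bit instruction spanning two halfwords.
            parcels.append(low | (halves[i + 1] << 16))
            i += 2
        else:
            # Otherwise a 16-bit compressed instruction
            # (or a truncated trailing halfword, kept as-is).
            parcels.append(low)
            i += 1
    return parcels

parcels = split_parcels(0xF2410113, 0xDE0EC086, 0x2013B7)
print([hex(p) for p in parcels])
# ['0xf2410113', '0xc086', '0xde0e', '0x2013b7']
```

The two middle parcels, 0xc086 and 0xde0e, are the two c.swsp instructions listed above.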
Each decoded instruction comes with a lot of metadata and parsed arguments:
op = tinyrv.decode(0xf2410113)
print(hex(op.data), op.name, op.extension, op.variable_fields, bin(op.mask), bin(op.match), op.valid())
print(op.args, op.rd, op.rs1, op.imm12)
print(op.arg_str())
0xf2410113 addi ['rv_i'] ['rd', 'rs1', 'imm12'] 0b111000001111111 0b10011 True
{'rd': 2, 'rs1': 2, 'imm12': -220} 2 2 -220
sp, sp, -220
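The mask and match values printed above can be read as follows: an instruction word is an addi if the word ANDed with mask equals match, and the variable fields sit at fixed bit positions of the I-type format. The following is a minimal stand-alone sketch of that idea; decode_addi and sign_extend are hypothetical helper names, not part of the tinyrv API.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)

# Values taken from the output above: bin(op.mask), bin(op.match)
ADDI_MASK = 0b111000001111111   # funct3 and opcode bits
ADDI_MATCH = 0b10011            # funct3=000, opcode=0010011

def decode_addi(data):
    """Return the I-type fields of an addi word, or None if it is not an addi."""
    if data & ADDI_MASK != ADDI_MATCH:
        return None
    return {
        "rd": (data >> 7) & 0x1F,              # bits 11:7
        "rs1": (data >> 15) & 0x1F,            # bits 19:15
        "imm12": sign_extend(data >> 20, 12),  # bits 31:20, signed
    }

print(decode_addi(0xF2410113))
# {'rd': 2, 'rs1': 2, 'imm12': -220}
```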
Simulate a binary:
rv = tinyrv.sim(xlen=32) # xlen affects overflows, sign extensions
rv.copy_in(0, open('firmware.bin', 'rb').read())
print(rv.x) # print registers
print()
rv.step() # simulate a single instruction at rv.pc
Outputs:
x00(ro)=00000000 x08(fp)=00000000 x16(a6)=00000000 x24(s8)=00000000
x01(ra)=00000000 x09(s1)=00000000 x17(a7)=00000000 x25(s9)=00000000
x02(sp)=00000000 x10(a0)=00000000 x18(s2)=00000000 x26(10)=00000000
x03(gp)=00000000 x11(a1)=00000000 x19(s3)=00000000 x27(11)=00000000
x04(tp)=00000000 x12(a2)=00000000 x20(s4)=00000000 x28(t3)=00000000
x05(t0)=00000000 x13(a3)=00000000 x21(s5)=00000000 x29(t4)=00000000
x06(t1)=00000000 x14(a4)=00000000 x22(s6)=00000000 x30(t5)=00000000
x07(t2)=00000000 x15(a5)=00000000 x23(s7)=00000000 x31(t6)=00000000
00000000: unimplemented: 0800400b custom0
00000000: custom0 # [0]
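The comment on tinyrv.sim(xlen=32) says that xlen affects overflows and sign extensions. As a rough illustration of what that means (a sketch with hypothetical helper names, not tinyRV's implementation), register values wrap at 2**xlen, and the same bit pattern can be negative at one width and positive at another:

```python
def wrap(value, xlen):
    """Keep only the low xlen bits, as a register of that width would."""
    return value & ((1 << xlen) - 1)

def to_signed(value, xlen):
    """Interpret an xlen-bit register value as a signed integer."""
    value = wrap(value, xlen)
    return value - (1 << xlen) if value >> (xlen - 1) else value

print(hex(wrap(0xFFFFFFFF + 1, 32)))  # 0x0 (overflow wraps around)
print(hex(wrap(0xFFFFFFFF + 1, 64)))  # 0x100000000 (no overflow yet)
print(to_signed(0xFFFFFF24, 32))      # -220
print(to_signed(0xFFFFFF24, 64))      # 4294967076
```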
Simulation halts at the first instruction that is not implemented. Just set the pc and carry on:
rv.pc = 8
rv.run(50)
00000008: jal zero, 0x3e0 # [1]
000003e0: addi ra, zero, 0 # [2] ra=00000000
000003e4: addi sp, zero, 0 # [3] sp=00000000
(... boring initialization stuff skipped ...)
00000454: addi t5, zero, 0 # [31] t5=00000000
00000458: addi t6, zero, 0 # [32] t6=00000000
0000045c: lui sp, 0x20000 # [33] sp=00020000
00000460: jal ra, 0xbdc # [34] ra=00000464
00000bdc: lui a0, 0xc000 # [35] a0=0000c000
00000be0: addi a0, a0, 0x79c # [36] a0=0000c79c
00000be4: jal zero, 0xb08 # [37]
00000b08: lui a4, 0x10000000 # [38] a4=10000000
00000b0c: lbu a5, 0(a0) # [39] mem[0000c79c]->68 a5=00000068
00000b10: bne a5, zero, 0xb18 # [40]
00000b18: addi a0, a0, 1 # [41] a0=0000c79d
00000b1c: sw a5, 0(a4) # [42] 00000068->mem[10000000]
00000b20: jal zero, 0xb0c # [43]
00000b0c: lbu a5, 0(a0) # [44] mem[0000c79d]->65 a5=00000065
00000b10: bne a5, zero, 0xb18 # [45]
00000b18: addi a0, a0, 1 # [46] a0=0000c79e
00000b1c: sw a5, 0(a4) # [47] 00000065->mem[10000000]
00000b20: jal zero, 0xb0c # [48]
00000b0c: lbu a5, 0(a0) # [49] mem[0000c79e]->6c a5=0000006c
00000b10: bne a5, zero, 0xb18 # [50]
Each jump or taken branch produces a newline; the right-hand side shows register changes and memory transactions. Memory is paged, allocated on demand, and persists. This loop writes ASCII characters to address 0x10000000, so the firmware apparently expects a UART there. Now let's get past this loop by setting a breakpoint:
rv.run(1000, bpts={0xb14})
rv.run(10)
...
00000b0c: lbu a5, 0(a0) # [89] mem[0000c7a6]->64 a5=00000064
00000b10: bne a5, zero, 0xb18 # [90]
00000b18: addi a0, a0, 1 # [91] a0=0000c7a7
00000b1c: sw a5, 0(a4) # [92] 00000064->mem[10000000]
00000b20: jal zero, 0xb0c # [93]
00000b0c: lbu a5, 0(a0) # [94] mem[0000c7a7]->0a a5=0000000a
00000b10: bne a5, zero, 0xb18 # [95]
00000b18: addi a0, a0, 1 # [96] a0=0000c7a8
00000b1c: sw a5, 0(a4) # [97] 0000000a->mem[10000000]
00000b20: jal zero, 0xb0c # [98]
00000b0c: lbu a5, 0(a0) # [99] mem[0000c7a8]->00 a5=00000000
00000b10: bne a5, zero, 0xb18 # [100]
00000b14: jalr zero, 0(ra) # [101]
00000464: addi ra, zero, 0x3e8 # [102] ra=000003e8
00000468: unimplemented: 0a00e00b custom0
00000468: custom0 # [103]
Another custom IRQ instruction.
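The traces above touch addresses as far apart as 0xc79c and 0x10000000, which is cheap when memory is paged and allocated on demand. The toy class below sketches that idea under stated assumptions: the 4 KiB page size and the class itself are made up for illustration, and only the (buffer, offset) shape of page_and_offset is modelled on how tinyRV's method of that name is used in this document.

```python
import struct

PAGE_BITS = 12                # assumption: 4 KiB pages
PAGE_SIZE = 1 << PAGE_BITS

class PagedMemory:
    """Sparse memory: pages are created the first time they are touched."""

    def __init__(self):
        self.pages = {}       # page number -> bytearray

    def page_and_offset(self, addr):
        page = self.pages.setdefault(addr >> PAGE_BITS, bytearray(PAGE_SIZE))
        return page, addr & (PAGE_SIZE - 1)

    def store_byte(self, addr, value):
        page, offset = self.page_and_offset(addr)
        page[offset] = value & 0xFF

    def load_byte(self, addr):
        # Note: in this sketch a load also allocates the page it touches.
        return struct.unpack_from("B", *self.page_and_offset(addr))[0]

mem = PagedMemory()
mem.store_byte(0x0000C79C, 0x68)
mem.store_byte(0x10000000, 0x68)
print(len(mem.pages))                 # 2 pages allocated, 8 KiB in total
print(hex(mem.load_byte(0x0000C79C))) # 0x68
```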
Let us ignore these custom instructions and add a UART by subclassing tinyrv.sim. This code example is from tests/fwsim.py:
import tinyrv, struct
from tinyrv.system import uart8250

class fwsim(tinyrv.sim):
    def __init__(self, xlen=64, trap_misaligned=True):
        super().__init__(xlen, trap_misaligned)
        self.uart = uart8250(self)
    def _custom0(self, **_): self.pc += 4
    def notify_stored(self, addr):
        if addr == 0x10000000: self.uart[0] = struct.unpack_from('B', *self.page_and_offset(addr))[0]

def main():
    rv = fwsim(xlen=32, trap_misaligned=False)
    rv.copy_in(0, open('tinyrv-test-blobs/picorv32_fw/firmware.bin', 'rb').read())
    rv.pc = 0
    rv.run(10, trace=False)
    rv.run(0, bpts={0x3e0}, trace=False)

if __name__ == '__main__': main()
It shows several features of tinyRV:
- Sub-classes can implement additional instructions by simply defining methods named '_'+instruction_name. All parameters are passed as kwargs; ignore all unused arguments with **_.
- The custom0 instruction is defined as a nop. It only advances the PC.
- Several callbacks and hooks are available for sub-classes. Here we use notify_stored to catch writes to 0x10000000 and forward them to a UART. Other callbacks are: notify_loading, hook_csr, hook_exec.
- copy_in loads binary data into memory at a given address. Here: 0.
- The first run will get past the reset vector (at 0x3e0), the second run continues simulation until the next reset.
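The naming convention in the first bullet can be pictured with a toy dispatcher. This is only a sketch of one common way to implement such a convention (a getattr lookup on '_' + name); ToySim, ToyFwSim and execute are hypothetical names, and tinyRV's internals may differ.

```python
class ToySim:
    """Toy model: instructions are methods named '_' + instruction name."""

    def __init__(self):
        self.pc = 0
        self.x = [0] * 32

    def _addi(self, rd, rs1, imm12, **_):
        if rd:                       # x0 stays hard-wired to zero
            self.x[rd] = self.x[rs1] + imm12
        self.pc += 4

    def execute(self, name, **args):
        handler = getattr(self, "_" + name, None)
        if handler is None:
            print(f"{self.pc:08x}: unimplemented: {name}")
            return False             # halt; pc is left unchanged
        handler(**args)              # all decoded fields arrive as kwargs
        return True

class ToyFwSim(ToySim):
    def _custom0(self, **_):         # treat custom0 as a nop
        self.pc += 4

base, fw = ToySim(), ToyFwSim()
print(base.execute("custom0", rd=1))  # prints the "unimplemented" line, then False
print(fw.execute("custom0", rd=1))    # True
print(fw.pc)                          # 4
```

Because every handler accepts **_, a sub-class method only needs to name the fields it actually uses.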
Running this code will output data sent to the UART:
hello world
lui..OK
auipc..OK
j..OK
jal..OK
jalr..OK
beq..OK
bne..OK
blt..OK
bge..OK
...
tinyRV comes with two virtual machines that can be launched from the command line. Call with -h for more info:
- tinyrv-user-elf simulates ELFs in a minimal user environment. Use this for running cross-compiled user programs.
- tinyrv-system-virt emulates a system similar to qemu's virt. Use this to boot kernels.
Dev Setup
All core simulator code is in tinyrv/tinyrv.py and has no external dependencies.
tinyRV loads opcode specs from tinyrv/opcodes.py, which is auto-generated from riscv-opcodes by tinyrv_opcodes_gen.py.
Do this to re-generate:
git clone https://github.com/riscv/riscv-opcodes.git
make -C riscv-opcodes
python3 tinyrv_opcodes_gen.py
Some VMs use external libraries:
- lief for loading ELFs
- readchar for reading keystrokes
- dataclasses-struct for easier binary data manipulation
Testing
Install riscv-gnu-toolchain or homebrew-riscv (for macOS).
RISCOF
Install the RISC-V compatibility framework RISCOF:
pip3 install setuptools wheel
git clone https://github.com/riscv/riscof.git
cd riscof
pip3 install -e .
Install the Sail ISA specification language:
brew install opam zlib z3 pkg-config
opam init
opam switch create ocaml-base-compiler
opam install sail
eval $(opam config env)
Install the RISC-V Sail Model:
git clone https://github.com/riscv/sail-riscv.git
cd sail-riscv
ARCH=RV32 make c_emulator/riscv_sim_RV32
ARCH=RV64 make c_emulator/riscv_sim_RV64
# copy / link c_emulator/riscv_sim_RV{32,64} into $PATH location
Optionally, install Spike RISC-V ISA Simulator:
git clone https://github.com/riscv-software-src/riscv-isa-sim.git
cd riscv-isa-sim
mkdir build
cd build
../configure --prefix=/path/to/install # /path/to/install/bin must be in $PATH
make
make install
spike # test
Then, run the tests:
make -C tests run_riscof
riscv-software-src/riscv-tests
make -C tests run_riscv_tests
This will automatically clone and build the test suite if necessary.
Run Coremark
make -C tests run_coremark
This will automatically clone and build coremark if necessary.
Boot Linux
make -C tests run_linux
This will download an Image and boot it. Boot time is about a minute on a reasonably fast machine with a recent Python.
The Image was made using buildroot. The configuration is based on qemu_riscv64_nommu_virt_defconfig with FPU, compressed instructions and other extensions turned off.
To (re-)create this Image, run:
make -C tests Image
This will download buildroot and build the kernel (takes some time). The build configuration is in tests/br2_external_tinyrv.