/Xv6 Rust 0x02/ - printf!("Hello xv6-rust!")
With the help of the previous article, we now have a good foundation for running Rust on the RISC-V platform.
In this second episode, we are going to jump into some real xv6 code, take care of the initialization from machine level to supervisor level, and finally make the printf!() macro available in our code!
1. Short but important assembly code
In our latest code, we set the entry point of our code to main(), and after that we did only one thing before running the test code: setting the stack pointer.
However, xv6 does more during initialization. It keeps the very first code in a separate assembly file named "entry.S" (some code and comments are truncated here; please follow the link attached to the file to see the full code):
1 | ### entry.S |
The above code basically sets up a 4096-byte stack for every hart. The start address of the stack, which we hard-coded as 0x80001000 in our previous code, now comes from a constant stack0 located in "start.rs".
1 | // start.rs |
Defining stack0 as a u8 array of length 4096*NCPU safely reserves enough space for the kernel stack of each hart. After compilation, stack0 is placed in the .rodata section, at an address inside the kernel's memory range.
Let's take a look at it (the binary kernel is the kernel output of xv6-rust):
1 | readelf -s kernel | grep stack0 |
The output above shows that stack0 has the address 0x80015800, and Ndx = 2 means that stack0 is located in the .rodata section.
Basically, entry.S is only responsible for initializing the stack pointer, and then it jumps directly to Rust code.
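To make the stack setup concrete, here is a minimal sketch of the address arithmetic, assuming entry.S follows the same formula as the original xv6 (sp = stack0 + (hartid + 1) * 4096). The function name stack_top and the NCPU value are made up for illustration; the stack0 address is the one reported by readelf in this article.

```rust
// Hypothetical constants for illustration; the real ones live in the xv6-rust sources.
const NCPU: u64 = 8;
const STACK_SIZE: u64 = 4096;

/// Initial stack pointer for a hart. Stacks grow downward on RISC-V,
/// so each hart starts at the END of its own 4096-byte slice of stack0.
fn stack_top(stack0: u64, hartid: u64) -> u64 {
    stack0 + (hartid + 1) * STACK_SIZE
}

fn main() {
    let stack0 = 0x8001_5800; // address of stack0 taken from the readelf output
    for hart in 0..NCPU {
        println!("hart {} -> sp = {:#x}", hart, stack_top(stack0, hart));
    }
}
```

Because every hart adds its own id to the formula, no two harts ever share a stack slice, even though they all run the same entry code.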
Finally, please don't forget to update the program entry in entry.ld, as well as the text section:
1 | ... ... |
And declare entry.S in main.rs:
1 | // main.rs |
2. Machine -> Supervisor
No doubt the last line of the ASM, call start, will bring us to start(), and here is the core part of start():
1 | // start.rs |
Actually, we can hardly call it a piece of "Rust" code, because if you clone the repo and go through the related code, you may find that nearly all functions here (like r_mstatus() or w_mepc()) are wrappers around ASM code.
Almost all of the above functions operate on RISC-V CSRs (control and status registers). Of course, we could follow the RISC-V specification to learn the details of those CSRs (the entire 166-page Privileged Specification is essentially about CSRs), but I'm gonna post the following table to briefly introduce what they do in start().
CSRs are a group of registers that can only be accessed in a privileged mode, such as machine mode or supervisor mode. These registers store status or change the system configuration, and they are read and written with dedicated CSR instructions.
Register | Name | Description |
---|---|---|
mstatus | Machine Status Register | The mstatus register keeps track of and controls the hart’s current operating state. Here we only care about the MPP field, which stores the previous privilege mode: M = 11; S = 01; U = 00. Back to the code, it sets the previous privilege mode from machine to supervisor. (Note that the MPP field only stores the mode value; the privilege mode is not switched immediately.) |
mepc | Machine Exception Program Counter | When a trap is taken into M-mode, mepc is written with the virtual address of the instruction that was interrupted or that encountered the exception, and it may also be explicitly written by software. So why does the code write the address of the function kmain into it? It's closely connected with the mret instruction; we will get back to this afterward. |
satp | Supervisor Address Translation and Protection | It controls supervisor-mode address translation and protection. Here we just set it to 0 to disable virtual address translation. |
medeleg / mideleg | Machine Trap Delegation Registers | By default, all traps at any privilege level are handled in machine mode. In our code, both registers are set to 0xffff to indicate that all traps are delegated to S-mode. |
sie | Supervisor Interrupt-Enable Register | In the code, the External / Timer / Software interrupts are all enabled in S-mode. |
pmpaddr0 / pmpcfg0 | Physical Memory Protection | These two registers together control the access permissions for a specific address range. Here, xv6 allows RWX permission in S-mode across the range 0~0x3ffffffffffff, which covers almost 1 PiB of address space. |
tp | Thread Pointer | tp is one of the general-purpose registers, not a CSR. As the name thread pointer suggests, this register is meant for thread-local storage. That makes the code easy to understand: it stores the hart id in tp for quicker access. |
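The mstatus row is easier to follow with the bit twiddling spelled out. Below is a minimal sketch that works on a plain u64 instead of the real CSR, so it runs on any host. MPP occupies bits 12:11 of mstatus per the privileged specification; the constant and function names are my own picks and are not necessarily the ones used in xv6-rust.

```rust
// MPP lives in bits 12:11 of mstatus.
const MSTATUS_MPP_MASK: u64 = 3 << 11;
const MSTATUS_MPP_M: u64 = 3 << 11; // 0b11 = machine
const MSTATUS_MPP_S: u64 = 1 << 11; // 0b01 = supervisor

/// Clear the MPP field, then write "supervisor" into it.
/// All other bits of mstatus are left untouched.
fn set_mpp_supervisor(mstatus: u64) -> u64 {
    (mstatus & !MSTATUS_MPP_MASK) | MSTATUS_MPP_S
}

/// Extract the two-bit MPP value.
fn mpp(mstatus: u64) -> u64 {
    (mstatus & MSTATUS_MPP_MASK) >> 11
}

fn main() {
    let before = MSTATUS_MPP_M; // pretend the previous mode is "machine"
    let after = set_mpp_supervisor(before);
    println!("MPP before = {:#04b}, after = {:#04b}", mpp(before), mpp(after));
}
```

In the kernel, the same read-modify-write would go through the CSR wrappers, something like reading with r_mstatus(), patching the value, and writing it back.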
The instruction mret is closely related to the mepc register, as described in the table above. mret is one of the "Trap-Return Instructions", which are used to return from a trap.
Generally speaking, when a trap such as an interrupt or exception happens, the address of the instruction that triggered the trap is stored in the xepc register (like mepc or sepc), and then the program is redirected to the trap handler for that specific trap. Once the handler has done its work and the program needs to return to the original location, it fetches the address from xepc, sets the program counter to it, and thereby jumps back to that address.
mret (and, not surprisingly, there is an sret too) does the whole process in a single instruction. Besides, it also triggers the privilege mode switch to the mode saved in the MPP field of mstatus.
So I suppose you now understand the code logic here: first write the address of kmain to mepc, then do some other work, and finally call mret so that the program jumps to kmain while the privilege mode is switched to S-mode as well.
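If it helps, the effect of mret can be modeled as a tiny state machine. This is only a toy model, not real hardware behavior in full: it ignores the interrupt-enable bits (MIE/MPIE) that mret also updates, and the Hart struct and the kmain address are invented for the example.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Mode {
    User,
    Supervisor,
    Machine,
}

/// Toy model of the few pieces of hart state that matter here.
struct Hart {
    pc: u64,
    mode: Mode,
    mepc: u64, // where mret will jump to
    mpp: Mode, // mode that mret will switch to (MPP field of mstatus)
}

impl Hart {
    /// Simplified mret: jump to mepc and drop to the mode saved in MPP.
    fn mret(&mut self) {
        self.pc = self.mepc;
        self.mode = self.mpp;
        // Per the spec, MPP is then reset to the least-privileged supported mode.
        self.mpp = Mode::User;
    }
}

fn main() {
    let kmain_addr = 0x8000_1234; // hypothetical address of kmain
    let mut hart = Hart { pc: 0x8000_0000, mode: Mode::Machine, mepc: 0, mpp: Mode::Machine };

    hart.mepc = kmain_addr;      // like w_mepc(kmain as ...)
    hart.mpp = Mode::Supervisor; // like setting MPP = S in mstatus
    hart.mret();

    println!("pc = {:#x}, mode = {:?}", hart.pc, hart.mode);
}
```

Nothing has actually trapped at boot time; start() just fills in mepc and MPP by hand so that mret "returns" into a place the hart has never been.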
How does RISC-V deal with the privilege mode switch?
.... RISC-V Privileged Specification Chapter 1.2 ...
A hart normally runs application code in U-mode until some trap (e.g., a supervisor call or a timer interrupt) forces a switch to a trap handler, which usually runs in a more privileged mode. The hart will then execute the trap handler, which will eventually resume execution at or after the original trapped instruction in U-mode. Traps that increase privilege level are termed vertical traps, while traps that remain at the same privilege level are termed horizontal traps. The RISC-V privileged architecture provides flexible routing of traps to different privilege layers.
.... RISC-V Privileged Specification Chapter 1.2 ...
Generally, when a trap happens, the address of the instruction that caused the trap is saved in mepc or sepc, depending on which privilege mode takes the trap. After the trap has been handled by its handler, the handler should call either mret or sret to return to the previous mode, which is stored in the MPP or SPP field of mstatus.
3. We need UART
Once mret is executed, the program runs into a new file: main.rs. It's hard to say whether it's really new, because we already have one; it isn't exactly the same one, though, since we will introduce a new function kmain to replace our previous main.
Don't be frightened by the many new functions called within kmain. We don't need them for now; the only functions we should pay attention to are Uart::init() and Console::init():
1 | // main.rs |
QEMU's generic virtual platform for RISC-V supports an "NS16550 compatible UART". According to the memory address mapping we talked about in the last chapter:
1 | qemu-system-riscv64 -monitor stdio |
The UART address range starts at 0x10000000. There are about 10 registers to configure and control the UART (for more details, refer to the 16550 specification).
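To give a feel for what "registers at an address" means, here is a small sketch of a 16550-style register map, assuming the UART base address 0x10000000 used by QEMU's virt machine. The offsets are the standard 16550 ones; the constant names and the reg_addr helper are my own and may differ from what uart.rs uses. The sketch only computes addresses and never touches memory, so it is safe to run on a host.

```rust
const UART0: usize = 0x1000_0000; // UART base address on QEMU's virt machine

// Byte-wide register offsets of a 16550-style UART.
// Some offsets are shared: which register you hit depends on read vs. write.
const RHR: usize = 0; // receive holding register (read)
const THR: usize = 0; // transmit holding register (write)
const IER: usize = 1; // interrupt enable register
const FCR: usize = 2; // FIFO control register (write)
const ISR: usize = 2; // interrupt status register (read)
const LCR: usize = 3; // line control register
const LSR: usize = 5; // line status register

/// Memory-mapped address of a UART register.
const fn reg_addr(offset: usize) -> usize {
    UART0 + offset
}

fn main() {
    let regs = [("RHR", RHR), ("THR", THR), ("IER", IER), ("FCR", FCR),
                ("ISR", ISR), ("LCR", LCR), ("LSR", LSR)];
    for (name, offset) in regs {
        println!("{} @ {:#x}", name, reg_addr(offset));
    }
}
```

In the kernel, each of these addresses would be accessed through a volatile pointer read or write rather than printed.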
Let's go back to the code. We can find all UART-related code in the file uart.rs.
Basically, Uart::init() configures the UART for 8-bit words, a 38.4k baud rate, and FIFOs with interrupts enabled.
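Where does a number like 38.4k end up in the hardware? On a 16550, the baud rate is set through a 16-bit divisor latch: divisor = input clock / (16 * baud). The sketch below assumes the classic 1.8432 MHz reference clock, which is my assumption and not something stated in the xv6-rust code; the constant names are also illustrative.

```rust
/// Divisor latch value for a 16550: input clock / (16 * baud rate).
fn baud_divisor(clock_hz: u32, baud: u32) -> u16 {
    (clock_hz / (16 * baud)) as u16
}

// Line control register bits (standard 16550 layout).
const LCR_EIGHT_BITS: u8 = 3 << 0; // word length select = 8 bits
const LCR_BAUD_LATCH: u8 = 1 << 7; // DLAB: expose the divisor latch registers

fn main() {
    // Assumed reference clock of 1.8432 MHz.
    let div = baud_divisor(1_843_200, 38_400);
    let lsb = (div & 0xff) as u8; // goes to the divisor latch low byte
    let msb = (div >> 8) as u8;   // goes to the divisor latch high byte

    println!("divisor = {} (LSB = {:#04x}, MSB = {:#04x})", div, lsb, msb);
    println!("LCR while setting baud = {:#04x}, afterwards = {:#04x}",
             LCR_BAUD_LATCH, LCR_EIGHT_BITS);
}
```

Under that clock assumption the divisor for 38.4k comes out as 3, which matches the LSB = 0x03, MSB = 0x00 pair that the original xv6 writes during UART initialization.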
In fact, after initialization, we can directly put or get chars with the following code:
1 | // uart.rs |
Let's have a quick test and print an "A" to the console:
1 | // main.rs |
1 | ... ... |
Awesome, we have printed the first letter! Since we can print a letter, printf!() is just around the corner.
4. printf!()
At last, we got here. So far we have already output the letter "A" through the UART; next we simply need to create a printer that calls the UART internally to print.
Generally speaking, the only difference between the UART and a printer is that the printer takes a format string rather than a character. This means the printer sits on a higher abstraction level: it needs to preprocess the format string into a plain string, and then break that string down into characters.
Refer to print.rs: the macro printf!() receives its input arguments as "format_args":
1 | // print.rs |
"format_args" allows us to print a string with parameters, such as printf!("This is a {}", "param").
The best part is that we don't need to do anything ourselves to parse the relatively complex arguments "This is a {}" and "param". The Rust trait core::fmt::Write takes care of all that stuff!
Let's go to console.rs:
1 | // console.rs |
The Write trait provides a default implementation of write_fmt, so we only need to implement write_str here and output the string that has already been formatted. The string can be output by calling the UART function putc_sync.
Finally, we can print something with printf!()!
1 | // main.rs |
And the output:
1 | ... ... |
It works!