TL; DR. This will be a series regarding labs I gave during the spring 2022 semester.
The reason why I am writing this down is that it has been a week and no students have asked for the solution for the last Lab. I realize that the learning gap between students is huge, especially when a non-profit university is admitting more and more students. To help all students understand the concepts of modern OS, I decided to write this post.
It starts with the past lab content I have (as the skeleton), and will be amended with extra materials I think it helps.
Remember, it’s for help in learning. DON’T COPY & PASTE CODE!
Lab1-Introduction
Lab3-Process
Lab4-IPC
Lab6-7-Scheduling
First user process in xv6
Kernel works
In xv6, as the same as conventional Linux OS, the very first user-level process is init
. Before init
’s running, all the OS bootstraps happen in a highly privileged mode(kernel level).
Xv6’s kernel has the entry point as the main function located in the file main.c
. The main function invokes 17 functions to set up kernel page tables, interrupt handlers, I/O devices and etc. When all kernel preparations are done, by calling the function userinit()
, the kernel will boot up process init.
int main(void) { kinit1(end, P2V(4*1024*1024)); // phys page allocator kvmalloc(); // kernel page table mpinit(); // collect info about this machine lapicinit(); seginit(); // set up segments cprintf("\ncpu%d: starting xv6\n\n", cpu->id); picinit(); // interrupt controller ioapicinit(); // another interrupt controller consoleinit(); // I/O devices & their interrupts uartinit(); // serial port pinit(); // process table tvinit(); // trap vectors binit(); // buffer cache fileinit(); // file table ideinit(); // disk if(!ismp) timerinit(); // uniprocessor timer startothers(); // start other processors kinit2(P2V(4*1024*1024), P2V(PHYSTOP)); // must come after startothers() userinit(); // first user process // Finish setting up this processor in mpmain. mpmain(); }
It’s tricky since that init is a user process, but kernel can’t call any user-level system calls to create it.
Why? 1. Kernel has all privileges to create a user process. So it doesn’t need to call system calls such as fork().
And 2. All other user processes can be created by forking from its parent.
Forking including clone the whole user virtual memory layout. However, the first process has no parent to fork from.
That’s why it makes the creation of the first user process becomes so unique.
In proc.c, userinit() define there gives us the whole procedure of creating init.
Similar to the fork(), but more simple.
Process control block(structures for storing the process status) was created at the very first by calling allocproc().
After then, by invoking setupkvm()(defined in vm.c), kernel memory map was setup for the process.
During setting up kernel memory map, a page size virtual memory will be assigned to the process as ready.
And later, this page size memory will be used to store instructions of init.
Followed by setup kernel stack for the init process, calling inituvm() will load init‘s text into the page that is just being allocated.
inituvm() takes 3 arguments: a pointer to the process’s page directory (p->pgdir),
a char-type pointer declared from external which point to init‘s text segment(_binary_initcode_start), and
a char-type pointer which points to an external integer as the size of the init‘s text segment(_binary_initcode_size).
Simply put, it will load instructions of init into the memory.
So now, the problem becomes when and where did instructions for init have compiled into the kernel?
void userinit(void) { struct proc *p; extern char _binary_initcode_start[], _binary_initcode_size[]; p = allocproc(); initproc = p; if((p->pgdir = setupkvm()) == 0) panic("userinit: out of memory?"); inituvm(p->pgdir, _binary_initcode_start, (int)_binary_initcode_size); p->sz = PGSIZE; memset(p->tf, 0, sizeof(*p->tf)); p->tf->cs = (SEG_UCODE << 3) | DPL_USER; p->tf->ds = (SEG_UDATA << 3) | DPL_USER; p->tf->es = p->tf->ds; p->tf->ss = p->tf->ds; p->tf->eflags = FL_IF; p->tf->esp = PGSIZE; p->tf->eip = 0; // beginning of initcode.S safestrcpy(p->name, "initcode", sizeof(p->name)); p->cwd = namei("/"); p->state = RUNNABLE; }
Where the user-level code was integrated?
If you search the keyword “_binary_initcode_start” in the source code, you can’t find any references.
The clue comes from the Makefile.
In the makefile, initcode is a prerequisites to compile the kernel image.
Step 1: Before kernel was compiled, initcode.S was first compiled to a runnable binary initcode.
This binary was very odd because it was not supposed to let any other OS to run it.
Initcode.s was first compiled without any standard including, and generating the intermediate file initcode.o.
Step 2: Initcode.o then linked to Initcode.out with two uncommon settings.
First it specify the entry of this binary file as when “start” symbol points to.
This “start” symbol was declared in the assembly code.
Second it specify a absolute address(0) for the text segments.
By doing this, text segments will be placed at the start of the binary file (except the header of the ELF)[^ldman].
Step 3: Initcode.out is already a minimized binary but it’s not enough.
That’s why when using objcopy to copy it to the file initcode, it further strip all headers and debug information[^objcopyman].
At this point, we have a minimal binary file initcode.
From the first byte of this file, it’s only includes runnable instructions.
And the size of the file is only 44 bytes.
initcode: initcode.S $(CC) $(CFLAGS) -nostdinc -I. -c initcode.S # Step 1 $(LD) $(LDFLAGS) -N -e start -Ttext 0 -o initcode.out initcode.o # Step 2 $(OBJCOPY) -S -O binary initcode.out initcode # Step 3 $(OBJDUMP) -S initcode.o > initcode.asm
This binary later were appended to the kernel using following commands.
And during this appending, 3 symbols were generated and added to the symbol table of the kernel[^ldman].
“_binary_initcode_start” contains the address of where the initcode segment was appended to.
“_binary_initcode_end” contains the address of where the initcode segment was ended at.
“_binary_initcode_size” is a *ABS* type symbol with value 0x2C(45) that specify the size of the initcode segment is 45 bytes.
kernel: $(OBJS) entry.o entryother initcode kernel.ld $(LD) $(LDFLAGS) -T kernel.ld -o kernel entry.o $(OBJS) -b binary initcode entryother # <- This Line $(OBJDUMP) -S kernel > kernel.asm $(OBJDUMP) -t kernel | sed '1,/SYMBOL TABLE/d; s/ .* / /; /^$/d' > kernel.sym
In short summary,
using objdump, we can verify that source code initcode.S has been compiled and loaded into the kernel.
Also the segment of initcode’s instructions was located by the pointer “_binary_initcode_start”.
That’s explain when calling inituvm(p->pgdir, _binary_initcode_start, (int)_binary_initcode_size);,
functionalities implemented in initcode.S will be loaded into the runtime of the first process within xv6.
# Header of the file kernel kernel: file format elf32-i386 kernel architecture: i386, flags 0x00000112: EXEC_P, HAS_SYMS, D_PAGED start address 0x0010000c Program Header: LOAD off 0x00001000 vaddr 0x80100000 paddr 0x00100000 align 2**12 filesz 0x00008c6a memsz 0x00008c6a flags r-x ... Sections: Idx Name Size VMA LMA File off Algn 0 .text 00008586 80100000 00100000 00001000 2**2 CONTENTS, ALLOC, LOAD, READONLY, CODE ... SYMBOL TABLE: ... 8010b50c g .data 00000000 _binary_initcode_end ... 8010b4e0 g .data 00000000 _binary_initcode_start ... 0000002c g *ABS* 00000000 _binary_initcode_size ...
User-level code
Take a look of content in the initcode.S, you will find the code can explain itself well.
There are no other jobs but just calling system call exec to run a user-level binary “init”.
Initcode.S:
# Initial process execs /init. #include "syscall.h" #include "traps.h" # exec(init, argv) .globl start start: pushl $argv pushl $init pushl $0 // where caller pc would be movl $SYS_exec, %eax int $T_SYSCALL # for(;;) exit(); exit: movl $SYS_exit, %eax int $T_SYSCALL jmp exit # char init[] = "/init\0"; init: .string "/init\0" # char *argv[] = { init, 0 }; .p2align 2 argv: .long init .long 0
The “init” mentioned above is not a pure user-level binary executable that compiled from the source code init.c.
Within init.c, a file named console will be created at the runtime for saving standard outputs and errors.
Then it will forked a child process(the second user process), and let it run program “sh”.
“sh” is the xv6’s default shell, a user-level program that generated from source sh.c.
After the shell boots up, you can interactive with the xv6.
This’s how first process (and second process) was started in the xv6.
init.c:
// init: The initial user-level program #include "types.h" #include "stat.h" #include "user.h" #include "fcntl.h" char *argv[] = { "sh", 0 }; int main(void) { int pid, wpid; if(open("console", O_RDWR) < 0){ mknod("console", 1, 1); open("console", O_RDWR); } dup(0); // stdout dup(0); // stderr for(;;){ printf(1, "init: starting sh\n"); pid = fork(); if(pid < 0){ printf(1, "init: fork failed\n"); exit(); } if(pid == 0){ exec("sh", argv); printf(1, "init: exec sh failed\n"); exit(); } while((wpid=wait()) >= 0 && wpid != pid) printf(1, "zombie!\n"); } }
[^ldman]: ld(1) – Linux man page
[^objcopyman]: 3 objcopy – binutils mannual
Xv6’s round robin schduler
The Scheduler is the core of an operating system.
With the scheduling of processes, the kernel can achieve near-real-time execution of multiple workloads.
The scheduling problem is also an active aspect of computer science research.
You can’t have one algorithm to fit all scenarios.
Xv6 by default has a round-robin scheduler.
It’s controlled using two-level for-loops, where the top-level for-loop is an endless loop that will keep the scheduler busy running.
The second-level nested for-loop will iterate a data structure named Ptable where all control information for processes is stored.
Information including pid, process name, etc. is stored in a structure called proc. Ptable is an array of processes.
Every runnable process in the Ptable will run strictly 1 time tick until the for-loop reached the last process in the Ptable.
Then it will loop back to the top-level for-loop for the next iteration of processes.
// In file proc.c struct { struct spinlock lock; struct proc proc[NPROC]; } ptable; // In file proc.h struct proc { uint sz; // Size of process memory (bytes) pde_t* pgdir; // Page table char *kstack; // Bottom of kernel stack for this process enum procstate state; // Process state int pid; // Process ID struct proc *parent; // Parent process struct trapframe *tf; // Trap frame for current syscall struct context *context; // swtch() here to run process void *chan; // If non-zero, sleeping on chan int killed; // If non-zero, have been killed struct file *ofile[NOFILE]; // Open files struct inode *cwd; // Current directory char name[16]; // Process name (debugging) };
// In file proc.c void scheduler(void) { struct proc *p; for(;;){ // Enable interrupts on this processor. sti(); // Loop over process table looking for process to run. acquire(&ptable.lock); for(p = ptable.proc; p < &ptable.proc[NPROC]; p++){ if(p->state != RUNNABLE) continue; // Switch to chosen process. It is the process's job // to release ptable.lock and then reacquire it // before jumping back to us. proc = p; switchuvm(p); p->state = RUNNING; swtch(&cpu->scheduler, proc->context); switchkvm(); // Process is done running for now. // It should have changed its p->state before coming back. proc = 0; } release(&ptable.lock); } }
It’s not hard to understand why this logic makes a round-robin manner.
This is very important to understand how to pick a process to run because scheduling is about always picking the appropriate process to achieve higher performance.
You can always come up with some new ideas for designing a good scheduler policy.
Understanding how to switch from one process to another is equivalently important.
Once the process for the next time tick is selected.
It’s time to switch from the running scheduler to the selected process. Wait for a second, there are two questions we haven’t answered.
- What is the running scheduler?
- How did the last running process stop running and give the CPU back to the scheduler?
Leave a Reply