XV6 Series: MIT Labs 2024

Write-ups for Lab: Page tables

Published

January 5, 2025

Inspect a user-process page table

General knowledge:

Since xv6 is running on Sv39 RISC-V, only the bottom 39 bits of 64-bit virtual address are used; the top 25 bits are not used. Within 39 bits, there are 27 bits are used for Page Table Entry (PTE) index; the other 12 bits are used for the offset.

Each PTE contains a 44-bit physical page number (PPN) and 10-bit flags.

Given the PTE value, we can calculate the Physical Page Number (PPN) by the following:

#define PTE2PA(pte) (((pte) >> 10) << 12)

Right shift 10 bits to remove the flags
Left shift 12 bits to align with the physical address where the bottom 12 bits are used for offset.

Given the PTE value, we can get the PTE Flags by the following:

#define PTE_FLAGS(pte) ((pte) & 0x3FF)

By using bit-wise AND with 0x3FF (001111111111), we can zero all the bits in the PTE except for the first 10 bits, representing the FLAGS.

A typical PTE in RISC-V has the following format:

Bit Position	Name	Description
0	V	Indicates if the PTE is valid.
1	R	Page is readable.
2	W	Page is writable.
3	X	Page is executable.
4	U	Accessible in user mode.
5	G	Shared across all address spaces.
6	A	Set by hardware on the first access.
7	D	Set by hardware on the first write.
8-9		Reserved for future use.
10-53	PPN	Physical page number (mapping to memory).
54-63		Reserved for future use.

Solution:

Approach:

Using the PTE format to decode the permission.
Using the order of virtual address to determine the page content since they are allocated in order.

PTE entry	Content	Permission
va 0x0 pte 0x21FCD85B pa 0x87F36000 perm 0x5B	text	01011011 = AUXRV
va 0x1000 pte 0x21FD1417 pa 0x87F45000 perm 0x17	data segment	00010111 = UWRV
va 0x2000 pte 0x21FD1007 pa 0x87F44000 perm 0x7	guard page	0111 = WRV
va 0x3000 pte 0x21FD40D7 pa 0x87F50000 perm 0xD7	stack	11010111 = DAUWRV
va 0x4000 pte 0x0 pa 0x0 perm 0x0	unused
va 0x5000 pte 0x0 pa 0x0 perm 0x0	unused
va 0x6000 pte 0x0 pa 0x0 perm 0x0	unused
va 0x7000 pte 0x0 pa 0x0 perm 0x0	unused
va 0x8000 pte 0x0 pa 0x0 perm 0x0	unused
va 0x9000 pte 0x0 pa 0x0 perm 0x0	unused
va 0xFFFF6000 pte 0x0 pa 0x0 perm 0x0	unused
va 0xFFFF7000 pte 0x0 pa 0x0 perm 0x0	unused
va 0xFFFF8000 pte 0x0 pa 0x0 perm 0x0	unused
va 0xFFFF9000 pte 0x0 pa 0x0 perm 0x0	unused
va 0xFFFFA000 pte 0x0 pa 0x0 perm 0x0	unused
va 0xFFFFB000 pte 0x0 pa 0x0 perm 0x0	unused
va 0xFFFFC000 pte 0x0 pa 0x0 perm 0x0	unused
va 0xFFFFD000 pte 0x0 pa 0x0 perm 0x0	unused
va 0xFFFFE000 pte 0x21FC90C7 pa 0x87F24000 perm 0xC7	trapframe	11000111 = DAWRV
va 0xFFFFF000 pte 0x2000184B pa 0x80006000 perm 0x4B	trampoline	01001011 = AXRV

Speed up system calls

General knowledge:

System calls usually require a context switch from user mode to kernel mode, which is expensive. To speed up the certain system calls, we use a technique called virtual dynamic shared object (vDSO). vDSO is a kernel mechanism for exporting a carefully selected set of kernel space routines to user space applications so that applications can call these kernel space routines in-process, without incurring the performance penalty of a mode switch from user mode to kernel mode that is inherent when calling these same kernel space routines by means of the system call interface.

Besides, we should also understand how xv6 allocates pages to the process and how the virtual memory mapping to the physical memory.

TBU: explain those concepts.

Solution:

ugetpid() is declared in the file user/user.h, thus any user program can call it.
Also, the definition of ugetpid() is defined in the user/ulib.c file as follows:

int
ugetpid(void)
{
  struct usyscall *u = (struct usyscall *)USYSCALL;
  return u->pid;
}

From the definition, we can see that the pid is stored in the usyscall structure, which is defined in kernel/memlayout.h as follows:

struct usyscall {
  int pid;  // Process ID
};

The problem description suggests that: > When each process is created, map one read-only page at USYSCALL. At the start of this page, store a struct usyscall

The USYSCALL is defined in the kernel/memlayout.h as follows:

#define USYSCALL (TRAPFRAME - PGSIZE)

That means the struct usyscall is stored in the page right after the TRAPFRAME page.

To do that, we have to deep into the function where the memory is allocated for the process. In this case, it is the alloproc function defined in kernel/proc.c.

static struct proc*
allocproc(void) {
  ...
}

The allocproc uses the function kalloc to allocate 4096-byte page in physical memory to process’s component. For example, this is the line where a single page in physical memory is allocated for the process’s trapframe.

p->trapframe = (struct trapframe *)kalloc()

After this line, the p->trapframe contains the memory of the page a.k.a. It plays as a pointer to the page in physical memory.

Thus, to allocate a page for the struct usyscall, we can use the following code:

if((p->usyscall = (struct usyscall *)kalloc()) == 0){
  freeproc(p) ;
  release(&p->lock);
  return 0;
}

We also assign the process ID to the usyscall structure as follows:

p->usyscall->pid = p->pid;

Then we map the physical address to virtual address of the process via the function proc_pagetable defined in the file kernel/proc.c.

 if(mappages(pagetable, USYSCALL, PGSIZE,
              (uint64)(p->usyscall), PTE_R | PTE_U) < 0) {
      uvmunmap(pagetable, TRAMPOLINE, 1, 0);
      uvmunmap(pagetable, TRAPFRAME, 1, 0);
      uvmfree(pagetable, 0);
      return 0;
  }

The function mappages is used to create a pagetable entry for the given physical address.

After the mapping is successful, the system can use the pagetable to convert the virtual address into physical address. That it’s. The rest of the code is to clean up the memory when freeing the process, which is defined in the freeproc function.

static void
freeproc(struct proc *p)
{
  ...
  if(p->usyscall)
    kfree((void*)p->usyscall);
  ...
}

Print a page table

General knowledge:

Xv6 uses a three-level pagetable. For each level, there are 2^9 (512) entries. The first and second level entries contain the physical addresses for page-table pages in the lower level, while the lowest level entries contain the physical page number (PPN).

The pagetable entries are stored sequentially in the memory as follows:

| Level2_Entry0 | Level1_Entry0 | Level0_Entry0 | ... | Level2_Entry1 | Level1_Entry0 | Level0_Entry0 | ...

Thus, the gap between 2 PTE varies by level: - Level 2 entries: 512 * 512 * PGSIZE - Level 1 entries: 512 * PGSIZE - Level 0 entries: PGSIZE

The virtual address of the first pagetable entry is 0, which is set in the uvmfirst function.

void
uvmfirst(pagetable_t pagetable, uchar *src, uint sz)
{
  char *mem;

  if(sz >= PGSIZE)
    panic("uvmfirst: more than a page");
  mem = kalloc();
  memset(mem, 0, PGSIZE);
  mappages(pagetable, 0, PGSIZE, (uint64)mem, PTE_W|PTE_R|PTE_X|PTE_U);
  memmove(mem, src, sz);
}

Solution:

void
vmprint_helper(pagetable_t pagetable, uint64 level, uint64 va) {
  uint64 sz = 0;
  if (level == 2) sz = 512 * 512 * PGSIZE; 
  else if (level == 1) sz = 512 * PGSIZE;
  else sz = PGSIZE;
  for (int i = 0; i < 512; i++, va += sz) {
    pte_t pte = *(pagetable + i); // Dereference the pagetable pointer to get the pagetable entry content
    if ((pte & PTE_V) == 0) continue;
    for (int j = 0; j < 3 - level; ++j) printf(" ..");
    printf("%p: pte %p pa %p\n", (void *) va, (void *) pte, (void *) PTE2PA(pte));
    if (PTE_LEAF(pte) == 0) // if it is not the leave
        vmprint_helper((void *) PTE2PA(pte), level - 1, va);
  }
}