Accessing User Memory

As part of a system call, the kernel must often access memory through pointers provided by a user program. The kernel must be very careful about doing so, because the user can pass a null pointer, a pointer to unmapped virtual memory, or a pointer to kernel virtual address space (above PHYS_BASE). All of these types of invalid pointers must be rejected without harm to the kernel or other running processes, by terminating the offending process and freeing its resources.

There are at least two reasonable ways to do this correctly:

  • Verify the validity of a user-provided pointer, then dereference it. This is the simplest way to handle user memory access. If you choose this route, look at the functions in src/userprog/pagedir.c and src/threads/vaddr.h. The function is_user_vaddr (declared in src/threads/vaddr.h) checks whether a pointer refers to user memory, and pagedir_get_page (declared in src/userprog/pagedir.h) can be used to check whether a given address is mapped in user space.
  • Check only that a user pointer points below PHYS_BASE, then dereference it. An invalid user pointer will cause a “page fault” that you can handle by modifying the code for page_fault() in src/userprog/exception.c. This technique is normally faster because it takes advantage of the processor’s memory management unit (MMU), so it tends to be used in real kernels (including Linux).
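
The first approach can be sketched as a single predicate. This is a minimal sketch, not a reference solution: PHYS_BASE, is_user_vaddr, and pagedir_get_page are replaced here by simplified stand-ins so that the block compiles outside Pintos, and MAPPED_PAGE and is_valid_user_ptr are hypothetical names introduced only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins so the sketch compiles outside Pintos. In Pintos, PHYS_BASE and
   is_user_vaddr() come from src/threads/vaddr.h, and pagedir_get_page() from
   src/userprog/pagedir.h. */
#define PHYS_BASE ((uintptr_t) 0xc0000000)
#define PGSIZE 4096

/* Hypothetical: the only user page this toy "page directory" maps. */
#define MAPPED_PAGE ((uintptr_t) 0x08048000)

static uint8_t fake_frame[PGSIZE];

static bool is_user_vaddr (const void *vaddr)
{
  return (uintptr_t) vaddr < PHYS_BASE;
}

/* Stand-in lookup: non-NULL if 'uaddr' is mapped, NULL otherwise. */
static void *pagedir_get_page (uint32_t *pd, const void *uaddr)
{
  uintptr_t a = (uintptr_t) uaddr;
  (void) pd;   /* The toy lookup ignores the page directory. */
  if (a >= MAPPED_PAGE && a < MAPPED_PAGE + PGSIZE)
    return fake_frame + (a - MAPPED_PAGE);
  return NULL;
}

/* Hypothetical helper: true only if 'uaddr' is non-null, lies below
   PHYS_BASE, and is mapped. The user-address test runs before the lookup
   so that a kernel address never reaches pagedir_get_page(). */
static bool is_valid_user_ptr (uint32_t *pd, const void *uaddr)
{
  return uaddr != NULL
         && is_user_vaddr (uaddr)
         && pagedir_get_page (pd, uaddr) != NULL;
}
```

In Pintos itself the stand-ins would be dropped in favor of the real headers, and the page directory would presumably be the one belonging to the current process. A buffer or string argument needs this check for every page it spans, not just for its first byte.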

In either case, you need to make sure not to “leak” resources. For example, suppose that your system call has acquired a lock or allocated memory with malloc(). If you encounter an invalid user pointer afterward, you must still release the lock or free the memory. If you choose to verify user pointers before dereferencing them, this should be straightforward. It’s more difficult to handle if an invalid pointer causes a page fault, because there’s no way to return an error code from a memory access. Therefore, for those who want to try the latter technique, we’ll provide a little bit of helpful code:

/* Reads a byte at user virtual address 'uaddr'. 'uaddr' must be below PHYS_BASE.
   Returns the byte value if successful, -1 if a segfault occurred. */
static int get_user (const uint8_t *uaddr)
{
    int result;
    /* Store the address of label 1 in %eax as the recovery point, then load
       the byte. If the load faults, page_fault() leaves -1 in %eax and
       resumes at label 1. */
    asm ("movl $1f, %0; movzbl %1, %0; 1:"
         : "=&a" (result) : "m" (*uaddr));
    return result;
}

/* Writes 'byte' to user address 'udst'. 'udst' must be below PHYS_BASE.
   Returns true if successful, false if a segfault occurred. */
static bool put_user (uint8_t *udst, uint8_t byte)
{
    int error_code;
    /* Same recovery scheme as get_user: %eax holds the address of label 1
       unless the store faults, in which case it holds -1. */
    asm ("movl $1f, %0; movb %b2, %1; 1:"
         : "=&a" (error_code), "=m" (*udst) : "q" (byte));
    return error_code != -1;
}

Each of these functions assumes that the user address has already been verified to be below PHYS_BASE. They also assume that you’ve modified page_fault() so that a page fault in the kernel merely sets %eax to 0xffffffff and copies its former value into %eip.
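
The page_fault() modification described above amounts to two register assignments on the saved interrupt frame. The sketch below models it outside Pintos: struct frame_model and recover_from_kernel_fault are hypothetical names, and the real struct intr_frame has more fields and different types. The user parameter is assumed to correspond to whatever page_fault() uses to tell a user-mode fault from a kernel-mode one.

```c
#include <stdbool.h>
#include <stdint.h>

/* Minimal model of the two saved registers the text mentions. This is not
   the real Pintos struct intr_frame. */
struct frame_model
{
  uint32_t eax;   /* Saved %eax: holds the recovery address at fault time. */
  uint32_t eip;   /* Saved %eip: where execution resumes after the handler. */
};

/* Hypothetical helper. If the fault happened in kernel mode, treat it as a
   failed get_user()/put_user() access: resume at the address that was in
   %eax and leave 0xffffffff (-1) in %eax as the error indication.
   Returns true if the fault was handled this way. */
static bool recover_from_kernel_fault (struct frame_model *f, bool user)
{
  if (user)
    return false;          /* User-mode fault: not handled by this path. */
  f->eip = f->eax;         /* Copy the former %eax into %eip. */
  f->eax = 0xffffffff;     /* Then set %eax to 0xffffffff. */
  return true;
}
```

The order of the two assignments matters: %eax must be copied into %eip before it is overwritten with the error value.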

If you do choose to use the second option (rely on the processor’s MMU to detect bad user pointers), do not feel pressured to use the get_user and put_user functions from above. There are other ways to modify the page fault handler to identify and terminate processes that pass bad pointers as arguments to system calls, some of which are simpler and faster than using get_user and put_user to handle each byte.
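
Whichever approach you choose, the rule about not leaking resources can be followed by routing every exit through one cleanup path. The sketch below assumes the verify-first approach. The names fake_lock_acquire, fake_lock_release, user_ptr_ok, and copy_in_example are hypothetical stand-ins so that it compiles outside Pintos, where a real lock and a real validity check would take their place.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a kernel lock: counts outstanding acquisitions. */
static int lock_depth;
static void fake_lock_acquire (void) { lock_depth++; }
static void fake_lock_release (void) { lock_depth--; }

/* Hypothetical stand-in for a real pointer check; it rejects only NULL. */
static bool user_ptr_ok (const void *uaddr)
{
  return uaddr != NULL;
}

/* Copies 'size' bytes from a user buffer into a temporary kernel buffer.
   Returns 'size' on success or -1 on failure. Once the lock is held, every
   path leaves through 'done', so the lock and the buffer are always
   released. */
static int copy_in_example (const void *ubuf, size_t size)
{
  int status = -1;
  char *kbuf = malloc (size);
  if (kbuf == NULL)
    return -1;             /* Nothing else has been acquired yet. */

  fake_lock_acquire ();

  if (!user_ptr_ok (ubuf))
    goto done;             /* Bad pointer: fall through to cleanup. */

  memcpy (kbuf, ubuf, size);
  status = (int) size;

done:
  fake_lock_release ();
  free (kbuf);
  return status;
}
```

The caller would terminate the offending process on a -1 result, by which point the lock and the buffer have already been released.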