Wednesday, April 30, 2014

[OSDI] Read Frame Usage in Kernel Code

Intro

When a user-space process asks for memory, the kernel hands out "pages" that are addressed by virtual addresses. However, the kernel only allocates real physical memory when the process starts to write to that address. That is, a page is backed by a real page "frame" only after its first write.
The conversion from a virtual address to a frame is done by looking up the page table. Each process has its own page table, and it is reachable from the process's memory descriptor.
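
To see this lazy allocation from user space, here is a minimal sketch (not part of the lab code). It assumes Linux with /proc mounted; the helper names resident_pages and growth_after_touch are made up for illustration. It maps anonymous memory, reads the resident page count from /proc/self/statm, writes to every page, and reads the count again.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE            /* for MAP_ANONYMOUS under strict -std modes */
#endif
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Resident page count of the calling process: the 2nd field of
 * /proc/self/statm. Returns -1 on failure. Linux only. */
static long resident_pages(void)
{
    long size, resident;
    FILE *f = fopen("/proc/self/statm", "r");

    if (f == NULL)
        return -1;
    if (fscanf(f, "%ld %ld", &size, &resident) != 2)
        resident = -1;
    fclose(f);
    return resident;
}

/* Map npages anonymous pages, write to all of them, and return how many
 * pages the resident set grew across the writes (-1 on failure). */
static long growth_after_touch(size_t npages)
{
    size_t len = npages * (size_t)sysconf(_SC_PAGESIZE);
    long before, after;
    char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (buf == MAP_FAILED)
        return -1;
    before = resident_pages();     /* mapped, but nothing written yet */
    memset(buf, 1, len);           /* first write faults the frames in */
    after = resident_pages();
    munmap(buf, len);

    if (before < 0 || after < 0)
        return -1;
    return after - before;
}
```

Calling growth_after_touch(8192) should report growth on the order of the number of pages written, although the kernel's counters are approximate, so the value is not expected to match exactly.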

Structures

In the Linux kernel, the relevant structures and their hierarchy are:
  • task_struct: this describes a process
    • (char [TASK_COMM_LEN]) comm: name of the process
    • (struct mm_struct *) mm: memory descriptor of the process
      • (struct vm_area_struct *) mmap: head of the list of virtual memory areas
The virtual memory areas of a process form a linked list in the kernel, so we start from the first one, which is stored in task->mm->mmap.
  • vm_area_struct: a virtual memory area
    • vm_start: start virtual address
    • vm_end: end virtual address
    • vm_next: next vm_area_struct in the linked list, NULL if it's the last one
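
To make the hierarchy concrete, the following is a user-space sketch that models only the fields listed above. The struct definitions are simplified stand-ins, not the real kernel definitions, and SKETCH_PAGE_SIZE and count_virtual_pages are hypothetical names. It walks the list and counts the virtual pages covered by all areas, which is only an upper bound on the number of frames, since not every page has been written yet.

```c
#include <stddef.h>

/* Simplified user-space stand-ins (NOT the kernel definitions);
 * only the fields mentioned in the text are modelled. */
struct vm_area_struct {
    unsigned long vm_start;           /* first address of the area */
    unsigned long vm_end;             /* first address past the area */
    struct vm_area_struct *vm_next;   /* NULL for the last area */
};

struct mm_struct {
    struct vm_area_struct *mmap;      /* head of the VMA list */
};

struct task_struct {
    char comm[16];                    /* process name */
    struct mm_struct *mm;             /* NULL when there is no user memory */
};

#define SKETCH_PAGE_SIZE 4096UL       /* assumption: 4 KiB pages */

/* Count the virtual pages covered by all VMAs of a task. */
static unsigned long count_virtual_pages(const struct task_struct *task)
{
    unsigned long pages = 0;
    const struct vm_area_struct *vma;

    if (task == NULL || task->mm == NULL)
        return 0;
    for (vma = task->mm->mmap; vma != NULL; vma = vma->vm_next)
        pages += (vma->vm_end - vma->vm_start) / SKETCH_PAGE_SIZE;
    return pages;
}
```

Two areas covering 0x1000-0x4000 and 0x7000-0x9000, for example, add up to five virtual pages.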

Process

To scan through all processes and find the one we want:
struct task_struct *task;
for_each_process(task) {

    /* kernel threads have no user address space, so skip them */
    if( task->mm == NULL )  continue;

    /* comm holds the process name */
    if( strcmp(task->comm, "reclim-me")==0 ){
        ...
    }
}

To scan through the virtual memory areas of one memory descriptor (mm):

for( vma=mm->mmap ; vma!=NULL ; vma=vma->vm_next) { ... }

In practice, the page table has several levels, which means we have to look them up one after another and check at each level whether the address is mapped:

/* assumed to run inside a loop over each page address of a vma */
pgd = pgd_offset(mm, address);
if (!pgd_present(*pgd)) continue;

pud = pud_offset(pgd, address);
if (!pud_present(*pud)) continue;

pmd = pmd_offset(pud, address);
if (!pmd_present(*pmd)) continue;

pte = pte_offset_map(pmd, address);

/* hold the page table lock while reading the entry */
ptl = pte_lockptr(mm, pmd);
spin_lock(ptl);
if (pte_present(*pte)) {
    sum++;  /* this page is backed by a frame */
}
pte_unmap_unlock(pte, ptl);
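
Each *_offset call above picks one entry out of a table by using a slice of the virtual address as an index. As an illustration only, here is a user-space sketch of that slicing, assuming the x86-64 layout with 4-level paging and 4 KiB pages (9 index bits per level, 12 offset bits). The SKETCH_* macros, struct walk_indices, and split_address are hypothetical names; other architectures and configurations use different shifts.

```c
#include <stdint.h>

/* Assumption: x86-64, 4-level paging, 4 KiB pages.
 * Every level indexes a 512-entry table with 9 bits of the address. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PMD_SHIFT  21
#define SKETCH_PUD_SHIFT  30
#define SKETCH_PGD_SHIFT  39
#define SKETCH_INDEX_MASK 0x1ffULL

struct walk_indices {
    unsigned pgd;      /* index into the top-level table */
    unsigned pud;
    unsigned pmd;
    unsigned pte;      /* index into the last-level table */
    unsigned offset;   /* byte offset inside the 4 KiB page */
};

/* Split a virtual address into the index used at each level. */
static struct walk_indices split_address(uint64_t address)
{
    struct walk_indices w;

    w.pgd = (unsigned)((address >> SKETCH_PGD_SHIFT) & SKETCH_INDEX_MASK);
    w.pud = (unsigned)((address >> SKETCH_PUD_SHIFT) & SKETCH_INDEX_MASK);
    w.pmd = (unsigned)((address >> SKETCH_PMD_SHIFT) & SKETCH_INDEX_MASK);
    w.pte = (unsigned)((address >> SKETCH_PAGE_SHIFT) & SKETCH_INDEX_MASK);
    w.offset = (unsigned)(address & ((1ULL << SKETCH_PAGE_SHIFT) - 1));
    return w;
}
```

If the entry at any level is not present, the walk stops early, which is what the continue statements in the kernel snippet do.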

Implementation

Finally, for this OSDI lab, we have to print out the number of frames of a process. To check our result against the correct answer, we can use:
  • cat /proc/<pid>/statm | awk '{print $2}'
  • get_mm_rss(task->mm)
To inspect the virtual memory areas:
  • cat /proc/<pid>/maps
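
Each line of /proc/<pid>/maps begins with the start and end address of one area in hexadecimal, which correspond to vm_start and vm_end. As a rough sketch (pages_in_maps_line is a hypothetical helper, and the sample line in the usage note is made up), the page count of one area can be read from such a line like this:

```c
#include <stdio.h>

/* Number of pages covered by the area described on one line of
 * /proc/<pid>/maps, which starts with "<start>-<end>" in hex.
 * Returns -1 if the line cannot be parsed or page_size is 0. */
static long pages_in_maps_line(const char *line, unsigned long page_size)
{
    unsigned long start, end;

    if (page_size == 0)
        return -1;
    if (sscanf(line, "%lx-%lx", &start, &end) != 2 || end < start)
        return -1;
    return (long)((end - start) / page_size);
}
```

For an illustrative line such as "00400000-0040b000 r-xp 00000000 08:01 1234 /bin/cat" with 4096-byte pages, the area spans 0xb000 bytes, that is 11 pages.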
However, we have to implement our own calculation in this lab, and here's the code:

By counting the frames of the program "reclim-me", we can see that the kernel hands out physical memory only when the process starts to write.

Other

It seems that the TA mistyped "reclaim" as "reclim", which is not a real word; for now, however, I keep the original spelling in the code on my GitHub.

Reference