Wednesday, May 7, 2014

[OSDI] Adding Snapshot Feature for Block Devices in Kernel Code

This is the 8th lab for the OSDI course, and it is a fun and interesting one. We are asked to implement snapshots in kernel code, so that we can roll a block device back to an earlier state.

How to Implement

We add two new ioctl numbers for block devices: SNAPSHOT and ROLLBACK. Calling SNAPSHOT switches the driver into snapshot mode, in which data is written to shadow pages instead of normal pages. Calling ROLLBACK removes the shadow pages, which brings us back to the state at the time of the snapshot.

Edit drivers/block/brd.c

I am editing the kernel code on version 2.6.32.60 (original here)

Global and Macros

  • Pick unused ioctl (I/O control) numbers for our custom commands.
  • MASK for distinguishing normal pages and shadow pages in page->index. I am using the most significant bit. If 1, it's a shadow page; otherwise, a normal page.
    • #define SHADOW_MASK (1UL << ((sizeof(pgoff_t))*8-1))
    • Note the 1UL: shifting a plain int 1 by 63 bits is undefined behavior on a 64-bit kernel.
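To make the tagging scheme concrete, here is a minimal user-space sketch of the mask arithmetic. It assumes pgoff_t is an unsigned long, as in the kernel; the three helper functions are hypothetical names for illustration and are not part of brd.c.

```c
#include <limits.h>   /* CHAR_BIT */

/* Assumption: mirrors the kernel, where pgoff_t is an unsigned long. */
typedef unsigned long pgoff_t;

/* Most significant bit of pgoff_t. The cast keeps the shift in the
 * wide type, so it is well defined on 64-bit builds too. */
#define SHADOW_MASK ((pgoff_t)1 << (sizeof(pgoff_t) * CHAR_BIT - 1))

/* Hypothetical helpers for illustration only. */
static pgoff_t to_shadow_index(pgoff_t idx)
{
    return idx | SHADOW_MASK;        /* tag the index as a shadow page */
}

static pgoff_t to_normal_index(pgoff_t idx)
{
    return idx & ~SHADOW_MASK;       /* strip the tag again */
}

static int is_shadow_index(pgoff_t idx)
{
    return (idx & SHADOW_MASK) != 0; /* 1 for shadow, 0 for normal */
}
```

A normal page and its shadow page therefore share the same low bits and differ only in the top bit, so both can live in the same radix tree without colliding.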

Modify brd_lookup_page

Add the following to handle page lookup actions when snapshot is enabled:
    if( snapshot_enable ) {
        // snapshot mode: look for a shadow page first
        rcu_read_lock();
        idx = (sector >> PAGE_SECTORS_SHIFT) | SHADOW_MASK; // shadow index
        page = radix_tree_lookup(&brd->brd_pages, idx);
        rcu_read_unlock();

        BUG_ON(page && page->index != idx);
        if( page ){
            printk("[osdi] shadow page found\n");
            return page;
        }
        // no shadow page: fall through to the normal lookup
    }
This means that when snapshot mode is enabled, we return the shadow page if it exists; otherwise, the normal lookup runs as before.

Modify brd_insert_page

Add the following code at the beginning:
if (!snapshot_enable && page)   return page;
if (snapshot_enable && page && ((page->index & SHADOW_MASK)!=0))    return page;

And, add the following line before "radix_tree_insert":
if (snapshot_enable)    idx |= SHADOW_MASK;
So, in snapshot mode we always insert into shadow pages (creating a shadow page if there is none for that index yet). Outside snapshot mode, the old rules apply.
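The lookup and insert rules above can be tried out in user space with a small model. This is only a sketch under stated assumptions: a fixed array stands in for the radix tree, and model_disk, model_page, find_exact, model_lookup and model_insert are hypothetical names, not kernel APIs. New pages start zeroed in this model, and nothing is copied from the normal page.

```c
#include <limits.h>
#include <stddef.h>

typedef unsigned long pgoff_t;           /* assumption: as in the kernel */
#define SHADOW_MASK ((pgoff_t)1 << (sizeof(pgoff_t) * CHAR_BIT - 1))
#define MAX_PAGES 64                     /* arbitrary size for the model */

struct model_page { pgoff_t index; int in_use; char data; };

struct model_disk {
    struct model_page pages[MAX_PAGES];  /* stands in for the radix tree */
    int snapshot_enable;
};

/* Exact-index search, playing the role of radix_tree_lookup(). */
static struct model_page *find_exact(struct model_disk *d, pgoff_t idx)
{
    for (size_t i = 0; i < MAX_PAGES; i++)
        if (d->pages[i].in_use && d->pages[i].index == idx)
            return &d->pages[i];
    return NULL;
}

/* Lookup: in snapshot mode prefer the shadow page, else the normal one. */
static struct model_page *model_lookup(struct model_disk *d, pgoff_t idx)
{
    if (d->snapshot_enable) {
        struct model_page *p = find_exact(d, idx | SHADOW_MASK);
        if (p)
            return p;
    }
    return find_exact(d, idx);
}

/* Insert: return the page a write should go to, creating it if needed. */
static struct model_page *model_insert(struct model_disk *d, pgoff_t idx)
{
    struct model_page *p = model_lookup(d, idx);

    if (!d->snapshot_enable && p)
        return p;                        /* normal mode: reuse the page */
    if (d->snapshot_enable && p && (p->index & SHADOW_MASK) != 0)
        return p;                        /* shadow page already exists */
    if (d->snapshot_enable)
        idx |= SHADOW_MASK;              /* new pages become shadow pages */

    for (size_t i = 0; i < MAX_PAGES; i++) {
        if (!d->pages[i].in_use) {
            d->pages[i].in_use = 1;
            d->pages[i].index = idx;
            d->pages[i].data = 0;        /* zeroed, nothing is copied */
            return &d->pages[i];
        }
    }
    return NULL;                         /* the model is full */
}
```

In this model, a write made in snapshot mode lands in a separate shadow entry, so the normal entry keeps the value it had when the snapshot was taken.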

Modify brd_ioctl

Make sure that we detect the new ioctl numbers and call their handlers.

static int brd_ioctl(struct block_device *bdev, fmode_t mode,
                        unsigned int cmd, unsigned long arg)
{
    int error;
    struct brd_device *brd = bdev->bd_disk->private_data;

    if (cmd == BLKFLSBUF) {
            ...
        return error;
    }
    else if (cmd == SNAPSHOT) {
        error = snapshot_handler();
        return error;
    }
    else if (cmd == ROLLBACK) {
        error = rollback_handler(bdev);
        return error;
    }
    return -ENOTTY;
}

SNAPSHOT & ROLLBACK Handlers

static int snapshot_handler(void) {
    snapshot_enable = TRUE;
    return 0;
}

static int rollback_handler(struct block_device *bdev) {
    int error = 0;
    struct brd_device *brd = bdev->bd_disk->private_data;

    snapshot_enable = FALSE;
    brd_free_shadow_pages(brd);

    if(error)   printk("[osdi] error = %d\n", error);
    return error;
}

* brd_free_shadow_pages is almost the same as brd_free_pages, except that it adds the check "if( (pages[i]->index & SHADOW_MASK)==0 )" at the beginning of the for loop, so that only shadow pages are freed.
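The filtering idea behind brd_free_shadow_pages can be sketched in user space as follows. This is not the kernel function: the slot structure and free_shadow_slots are hypothetical, and a plain array replaces the radix tree and its gang lookup. Only the mask check mirrors the condition quoted above.

```c
#include <limits.h>
#include <stddef.h>

typedef unsigned long pgoff_t;           /* assumption: as in the kernel */
#define SHADOW_MASK ((pgoff_t)1 << (sizeof(pgoff_t) * CHAR_BIT - 1))

/* Hypothetical stand-in for one entry of the page tree. */
struct slot { pgoff_t index; int in_use; };

/* Releases every shadow slot and returns how many were released. */
static size_t free_shadow_slots(struct slot *slots, size_t n)
{
    size_t freed = 0;

    for (size_t i = 0; i < n; i++) {
        if (!slots[i].in_use)
            continue;                    /* empty slot */
        if ((slots[i].index & SHADOW_MASK) == 0)
            continue;                    /* normal page: keep it */
        slots[i].in_use = 0;             /* shadow page: drop it */
        freed++;
    }
    return freed;
}
```

After the call, only normal entries remain, which corresponds to the device contents at the time of the snapshot.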

Full Code ( drivers/block/brd.c )

The full code is in this codepad; please let me know if the link is not working.

User Programs

To control our devices, we need to write our user programs. Here, we will use /dev/ram0 as our block device.

SNAPSHOT.c

#include <unistd.h>
#include <sys/ioctl.h>
#include <fcntl.h>

#include <stdio.h>

#define SNAPSHOT _IO(0x10, 30)

int main() {

    int fb;

    fb = open("/dev/ram0", O_RDWR);
    if( fb<0 ) {
        printf("can't access the device\n");
        return -1;
    }
    if( ioctl(fb, SNAPSHOT, 0)<0 ) {
        perror("ioctl SNAPSHOT"); // report why the kernel refused
        close(fb);
        return -1;
    }
    close(fb);

    return 0;
}

ROLLBACK.c

#include <unistd.h>
#include <sys/ioctl.h>
#include <fcntl.h>

#include <stdio.h>

#define ROLLBACK _IO(0x10, 31)

int main() {

    int fb;

    fb = open("/dev/ram0", O_RDWR);
    if( fb<0 ) {
        printf("can't access the device\n");
        return -1;
    }
    if( ioctl(fb, ROLLBACK, 0)<0 ) {
        perror("ioctl ROLLBACK"); // report why the kernel refused
        close(fb);
        return -1;
    }
    close(fb);

    return 0;
}
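Since the two programs differ only in the ioctl number, the shared part could be factored into one helper that also reports failures. The following is a sketch, assuming the same ioctl numbers as above; send_brd_ioctl is a hypothetical name.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>

#define SNAPSHOT _IO(0x10, 30)
#define ROLLBACK _IO(0x10, 31)

/* Hypothetical helper: opens the device at path, issues cmd, and
 * returns 0 on success or a negative errno value on failure. */
static int send_brd_ioctl(const char *path, unsigned long cmd)
{
    int fd = open(path, O_RDWR);
    int ret = 0;

    if (fd < 0)
        return -errno;           /* e.g. -ENOENT or -EACCES */
    if (ioctl(fd, cmd, 0) < 0)
        ret = -errno;            /* save errno before close() */
    close(fd);
    return ret;
}
```

With this helper, each program reduces to a main() that calls send_brd_ioctl("/dev/ram0", SNAPSHOT) or send_brd_ioctl("/dev/ram0", ROLLBACK) and prints a message when the result is negative.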

Makefile

CC=gcc
all: ROLLBACK.c SNAPSHOT.c
        -$(CC) ROLLBACK.c -o ROLLBACK
        -$(CC) SNAPSHOT.c -o SNAPSHOT

Run & Test

  • sudo mke2fs -m 0 /dev/ram0 # format the ram partition to ext2
  • sudo mkdir /ramdisk/
  • sudo mount /dev/ram0 /ramdisk/
  • sudo mount # check if it's mounted successfully
  • sudo mkdir /ramdisk/TEST1
  • sudo umount /ramdisk/
We've created a directory called "TEST1" and unmounted the ram disk. It's time to take a snapshot!
  • sudo ./SNAPSHOT
  • sudo mount /dev/ram0 /ramdisk/
  • sudo mkdir /ramdisk/TEST2
  • sudo umount /ramdisk/
Now we've created the second directory, called "TEST2", and it's time to test rollback.
  • sudo ./ROLLBACK
  • sudo mount /dev/ram0 /ramdisk/
  • ls /ramdisk 
After the last command, we should see only the directory "TEST1", without "TEST2". So, the snapshot works!

Question Raised

Q: Why do we always unmount the device before running our SNAPSHOT or ROLLBACK user program in this lab?
A: Since ROLLBACK checks whether the device is in use, we must unmount the device before asking it to roll back.