Create Your Own Kernel

In this article, we will create a simple kernel in C from scratch.

Introduction
 
You already know what kernel is.
 
The first part of writing an operating system is to write a bootloader in 16-bit assembly (real mode). A bootloader is a piece of program that runs on any operating system is running. It is used to boot other operating systems. Usually, each operating system has a set of bootloaders specific for it.

Go to the following link to create your own bootloader in 16-bit assembly.
Bootloaders generally select a specififc operating system and starts it's process and then operating system loads itself into memory.
If you are writing your own bootloader for loading a kernel you need to know the overall addressing/interrupts of memory as well as BIOS.
Mostly each operating system has specific bootloader for it.

There are lots of bootloaders available out there in online market.

But there are some proprietary bootloaders such as Windows Boot Manager for Windows operating systems or BootX for Apple's operating systems.

But there are lots of free and open source bootloaders.see the comparison,
  • https://en.wikipedia.org/wiki/Comparison_of_boot_loaders 
Among most famous is GNU GRUB - GNU Grand Unified Bootloader package from the GNU project for Unix like systems.
  • https://en.wikipedia.org/wiki/GNU_GRUB 
We will use GNU GRUB to load our kernel because it supports a multiboot of many operating systems.

Requirements
  • GNU/Linux - I am using GNU/Kali Linux 2017 i386 distribution .
  • Assembler - I am using GNU Assembler(gas) to instruct the bootloader for loading the starting point of our kernel.
  • GCC - GNU Compiler Collection a cross compiler. A newer version of GCC. I am using 7.2.0 version of GCC. The most important thing.
  • If you use old version you may face multiboot header not found error.
  • Xorriso - A package that creates, loads, manipulates ISO 9660 filesystem images.(man xorriso)
  • grub-mkrescue - Make a GRUB rescue image, this package internally calls the xorriso functionality to build an iso image.
  • QEMU - Quick EMUlator to boot our kernel in virtual machine without rebooting the main system.
Start 
 
Alright, writing a kernel from scratch is to print something on screen. So we have a VGA(Visual Graphics Array), a hardware system that controls the display.

For more details,
https://en.wikipedia.org/wiki/Video_Graphics_Array
  • VGA has a fixed amount of memory and addresssing is 0xA0000 to 0xBFFFF.
  • 0xA0000 for EGA/VGA graphics modes (64 KB)
  • 0xB0000 for monochrome text mode (32 KB)
  • 0xB8000 for color text mode and CGA-compatible graphics modes (32 KB)
First you need a multiboot bootloader file that instruct the GRUB to load it. Following fields must be define.
 
  • Magic :- A fixed hexadecimal number identified by the bootloader as the header(starting point) of the kernel to be loaded.
  • flags :- If bit 0 in the flags word is set, then all boot modules loaded along with the operating system must be aligned on page (4KB) boundaries.
  • checksum :- which is used by special purpose by bootloader and its value must be the sum of magic no and flags.
We don't need other information,

but for more details
  • https://www.gnu.org/software/grub/manual/multiboot/multiboot.pdf 
Ok lets write a GAS assembly code for above information. we dont need some fields as shown in above image.
 
boot.S
  1. # set magic number to 0x1BADB002 to identified by bootloader   
  2. .set MAGIC,    0x1BADB002  
  3.   
  4. # set flags to 0  
  5. .set FLAGS,    0  
  6.   
  7. # set the checksum  
  8. .set CHECKSUM, -(MAGIC + FLAGS)  
  9.   
  10. # set multiboot enabled  
  11. .section .multiboot  
  12.   
  13. # define type to long for each data defined as above  
  14. .long MAGIC  
  15. .long FLAGS  
  16. .long CHECKSUM  
  17.   
  18.   
  19. # set the stack bottom   
  20. stackBottom:  
  21.   
  22. # define the maximum size of stack to 512 bytes  
  23. .skip 512  
  24.   
  25.   
  26. # set the stack top which grows from higher to lower  
  27. stackTop:  
  28.   
  29. .section .text  
  30. .global _start  
  31. .type _start, @function  
  32.   
  33.   
  34. _start:  
  35.   
  36.   # assign current stack pointer location to stackTop  
  37.     mov $stackTop, %esp  
  38.   
  39.   # call the kernel main source  
  40.     call KERNEL_MAIN  
  41.   
  42.     cli  
  43.   
  44.   
  45. # put system in infinite loop  
  46. hltLoop:  
  47.   
  48.     hlt  
  49.     jmp hltLoop  
  50.   
  51. .size _start, . - _start  
We have defined a stack of size 512 bytes and managed by stackBottom and stackTop identifiers. Then in _start, we are storing a current stack pointer, and calling the main function of a kernel. As you know, every process consists of different sections such as data, bss, rodata and text. You can see the each sections by compiling the source code without assembling it.

e.g.: Run the following command,

gcc -S kernel.c

and see the kernel.S file.

And this sections requires a memory to store them, this memory size is provided by the linker image file. Each memory is aligned with the size of each block. It mostly require to link all the object files together to form a final kernel image.

Linker image file provides how much size should be allocated to each of the sections. The information is stored in the final kernel image. If you open the final kernel image(.bin file) in hexeditor, you can see lots of 00 bytes.

The linker image file consists of an entry point,(in our case it is _start defined in file boot.S) and sections with size defined in the BLOCK keyword aligned from how much spaced.

linker.ld 
  1. /* entry point of our kernel */  
  2. ENTRY(_start)  
  3.   
  4. SECTIONS  
  5. {  
  6.     /* we need 1MB of space atleast */  
  7.     . = 1M;  
  8.   
  9.     /* text section */  
  10.     .text BLOCK(4K) : ALIGN(4K)  
  11.     {  
  12.         *(.multiboot)  
  13.         *(.text)  
  14.     }  
  15.   
  16.     /* read only data section */  
  17.     .rodata BLOCK(4K) : ALIGN(4K)  
  18.     {  
  19.         *(.rodata)  
  20.     }  
  21.   
  22.     /* data section */  
  23.     .data BLOCK(4K) : ALIGN(4K)  
  24.     {  
  25.         *(.data)  
  26.     }  
  27.   
  28.     /* bss section */  
  29.     .bss BLOCK(4K) : ALIGN(4K)  
  30.     {  
  31.         *(COMMON)  
  32.         *(.bss)  
  33.     }  
  34.   
  35. }  
Now you need a configuration file that instruct the grub to load menu with associated image file

grub.cfg 
  1. menuentry "MyOS" {  
  2.     multiboot /boot/MyOS.bin  
  3. }  
Now let's write a simple HelloWorld kernel code.

kernel_1
  1. #ifndef _KERNEL_H_  
  2. #define _KERNEL_H_  
  3.   
  4. #define VGA_ADDRESS 0xB8000  
  5.   
  6. #define WHITE_COLOR 15  
  7.   
  8. typedef unsigned short UINT16;  
  9.   
  10. UINT16* TERMINAL_BUFFER;  
  11.   
  12. #endif  
Here we are using 16 bit, on my machine the VGA address is starts at 0xB8000 and 32 bit starts at 0xA0000. An unsigned 16 bit type terminal buffer pointer that points to VGA address.

It has 8*16 pixel font size.

see above image.

kernel.c
  1. #include"kernel.h"  
  2.   
  3. static UINT16 VGA_DefaultEntry(unsigned char to_print) {  
  4.     return (UINT16) to_print | (UINT16)WHITE_COLOR << 8;  
  5. }  
  6.   
  7. void KERNEL_MAIN()  
  8. {  
  9.   TERMINAL_BUFFER = (UINT16*) VGA_ADDRESS;  
  10.   
  11.   TERMINAL_BUFFER[0] = VGA_DefaultEntry('H');  
  12.   TERMINAL_BUFFER[1] = VGA_DefaultEntry('e');  
  13.   TERMINAL_BUFFER[2] = VGA_DefaultEntry('l');  
  14.   TERMINAL_BUFFER[3] = VGA_DefaultEntry('l');  
  15.   TERMINAL_BUFFER[4] = VGA_DefaultEntry('o');  
  16.   TERMINAL_BUFFER[5] = VGA_DefaultEntry(' ');  
  17.   TERMINAL_BUFFER[6] = VGA_DefaultEntry('W');  
  18.   TERMINAL_BUFFER[7] = VGA_DefaultEntry('o');  
  19.   TERMINAL_BUFFER[8] = VGA_DefaultEntry('r');  
  20.   TERMINAL_BUFFER[9] = VGA_DefaultEntry('l');  
  21.   TERMINAL_BUFFER[10] = VGA_DefaultEntry('d');  
  22. }  
The value returned by VGA_DefaultEntry() function is the UINT16 type with highlighting the character to print with white color. The value is stored in the buffer to display the characters on a screen.

First lets point our pointer TERMINAL_BUFFER to VGA address 0xB8000.

Now you have an array of VGA, you just need to assign specific value to each index of array according to what to print on a screen as we usually do in assigning the value to array.

See the above code that prints each character of HelloWorld on a screen.

Ok lets compile the source. type sh run.sh command on terminal.

run.sh
  1. #assemble boot.s file  
  2. as boot.s -o boot.o  
  3.   
  4. #compile kernel.c file  
  5. gcc -c kernel.c -o kernel.o -std=gnu99 -ffreestanding -O2 -Wall -Wextra  
  6.   
  7. #linking the kernel with kernel.o and boot.o files  
  8. gcc -T linker.ld -o MyOS.bin -ffreestanding -O2 -nostdlib kernel.o boot.o -lgcc  
  9.   
  10. #check MyOS.bin file is x86 multiboot file or not  
  11. grub-file --is-x86-multiboot MyOS.bin  
  12.   
  13. #building the iso file  
  14. mkdir -p isodir/boot/grub  
  15. cp MyOS.bin isodir/boot/MyOS.bin  
  16. cp grub.cfg isodir/boot/grub/grub.cfg  
  17. grub-mkrescue -o MyOS.iso isodir  
  18.   
  19. #run it in qemu  
  20. qemu-system-x86_64 -cdrom MyOS.iso  
the output is,
 
   
  
 
As you can see, it is a overhead to assign each and every value to VGA buffer, so we can write a function for that, which can print our string on a screen (means assigning each character value to VGA buffer from a string).
 
kernel_2

kernel.h
  1. #ifndef _KERNEL_H_  
  2. #define _KERNEL_H_  
  3.   
  4. #define VGA_ADDRESS 0xB8000  
  5.   
  6. #define WHITE_COLOR 15  
  7.   
  8. typedef unsigned short UINT16;  
  9.   
  10. int DIGIT_ASCII_CODES[10] = {0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37, 0x38, 0x39};  
  11.   
  12. unsigned int VGA_INDEX;  
  13.   
  14. #define BUFSIZE 2200  
  15.   
  16. UINT16* TERMINAL_BUFFER;  
  17.   
  18. #endif  
DIGIT_ASCII_CODES are hexadecimal values of characters 0 to 9. we need them when we want to print them on a screen.
VGA_INDEX is the our VGA array index. VGA_INDEX is increased when value is assigned to that index.
BUFSIZE is the limit of our VGA.

Following function prints a string on a string by assigning each character to VGA.
  1. void printString(char *str)  
  2. {  
  3.   int index = 0;  
  4.   while(str[index]){  
  5.     TERMINAL_BUFFER[VGA_INDEX] = VGA_DefaultEntry(str[index]);  
  6.     index++;  
  7.     VGA_INDEX++;  
  8.   }  
  9. }  
To print an 32 bit integer, first you need to convert it into a string.
  1. int digitCount(int num)  
  2. {  
  3.   int count = 0;  
  4.   if(num == 0)  
  5.     return 1;  
  6.   while(num > 0){  
  7.     count++;  
  8.     num = num/10;  
  9.   }  
  10.   return count;  
  11. }  
  12.   
  13. void itoa(int num, char *number)  
  14. {  
  15.   int digit_count = digitCount(num);  
  16.   int index = digit_count - 1;  
  17.   char x;  
  18.   if(num == 0 && digit_count == 1){  
  19.     number[0] = '0';  
  20.     number[1] = '\0';  
  21.   }else{  
  22.     while(num != 0){  
  23.       x = num % 10;  
  24.       number[index] = x + '0';  
  25.       index--;  
  26.       num = num / 10;  
  27.     }  
  28.     number[digit_count] = '\0';  
  29.   }  
  30. }  
  31.   
  32. void printInt(int num)  
  33. {  
  34.   char str_num[digitCount(num)+1];  
  35.   itoa(num, str_num);  
  36.   printString(str_num);  
  37. }  
To print a new line, you have to skip some bytes in VGA pointer(TERMINAL_BUFFER) according to the pixel font size. For this we need another variable that stores the current Y th index.
  1. static int Y_INDEX = 1;  
  2.   
  3. void printNewLine()  
  4. {  
  5.   if(Y_INDEX >= 55){  
  6.     Y_INDEX = 0;  
  7.     Clear_VGA_Buffer(&TERMINAL_BUFFER);  
  8.   }  
  9.   VGA_INDEX = 80*Y_INDEX;  
  10.   Y_INDEX++;  
  11. }  
And in KERNEL_MAIN(), just call the functions,
  1. void KERNEL_MAIN()  
  2. {  
  3.   TERMINAL_BUFFER = (UINT16*) VGA_ADDRESS;  
  4.   printString("Hello World!");  
  5.   printNewLine();  
  6.   printInt(1234567890);  
  7.   printNewLine();  
  8.   printString("GoodBye World!");  
  9. }  
  
 
As you can see it is the overhead to call each and every function for displaying values, that's why C programming provides a printf() function with format specifiers which print/set specific value to standard output device with each specifier with literals such as \n \t \r etc. 
 
kernel_3
 
VGA provides 15 colors,
 
      BLACK = 0,
      BLUE = 1,
      GREEN = 2,
      CYAN = 3,
      RED = 4,
      MAGENTA = 5,
      BROWN = 6,
      LIGHT_GREY = 7,
      DARK_GREY = 8,
      LIGHT_BLUE = 9,
      LIGHT_GREEN = 10,
      LIGHT_CYAN = 11,
      LIGHT_RED = 12,
      LIGHT_MAGENTA = 13,
      YELLOW = 14,
      WHITE = 15,

Just change the function name VGA_DefaultEntry() to some another with UINT8 type of color parameter with replacing the WHITE_COLOR to it.

For keyboard interrupt, you have inX function provided by gas, where X could be byte,word,dword or long etc. The BIOS keyboard interrupt value is 0x60, which is in bytes, passed to the parameter as to inb instruction.
  1. UINT8 IN_B(UINT16 port)  
  2. {  
  3.   UINT8 ret;  
  4.   asm volatile("inb %1, %0" :"=a"(ret) :"Nd"(port) );  
  5.   return ret;  
  6. }  
We can also create a simple linked list data structure, as a starting point of an file system. let's say we have following record,
  1. typedef struct list_node{  
  2.   int data;  
  3.   struct list_node *next;  
  4. }LIST_NODE;  
but we need memory to allocate this block because there is no malloc() function exists. Instead we use a memory address assigning to pointer to structure for storing this data block. well you can use any memory address but not those addresses who are used for special purposes.
 
0x00000000 - 0x000003FF - Real Mode Interrupt Vector Table
0x00000400 - 0x000004FF - BIOS Data Area
0x00000500 - 0x00007BFF - Unused
0x00007C00 - 0x00007DFF - Our Bootloader
0x00007E00 - 0x0009FFFF - Unused
0x000A0000 - 0x000BFFFF - Video RAM (VRAM) Memory
0x000B0000 - 0x000B7777 - Monochrome Video Memory
0x000B8000 - 0x000BFFFF - Color Video Memory
0x000C0000 - 0x000C7FFF - Video ROM BIOS
0x000C8000 - 0x000EFFFF - BIOS Shadow Area
0x000F0000 - 0x000FFFFF - System BIOS
 
In above addresses range, 0x00000500 - 0x00007BFF or 0x00007E00 - 0x0009FFFF can be used to store our linked list data. You can access the whole memory(RAM) if you know the limit of it or can be stored in a stack.

Download the source code.

So here's a function that return a allocated LIST_NODE memory block with starting at address 0x00000500, 
  1. LIST_NODE *getNewListNode(int data)  
  2. {  
  3.   LIST_NODE *newnode = (LIST_NODE*)0x00000500 + MEM_SIZE;  
  4.   newnode->data = data;  
  5.   newnode->next = NULL;  
  6.   MEM_SIZE += sizeof(LIST_NODE);  
  7.   return newnode;  
  8. }  
 
  
 
References 
  • http://wiki.osdev.org/Expanded_Main_Page 
  • http://www.brokenthorn.com/ 
  • http://mikeos.sourceforge.net/