Building an OS from Scratch
This article is an intro to OS development. It is a brief overview of the development process I went through: where to start, what needs to be done, and what are the challenges. In the next article, I will get into more details.
I assume you are a developer or a researcher with a basic understanding of OS internals and computer hardware.
How do we start?
Before we start with the software we need to understand the hardware that executes our software, specifically, we need to understand how the hardware bootstraps and how it expects to find the software. We are going to use Intel’s x86 architecture. Although there are many CPU architectures, the concepts described in the article are similar.
When we power on our PC, the firmware of the motherboard (aka BIOS or UEFI) is executed. It checks the hardware (POST) and looks for the software to run next - the bootloader. The BIOS looks for a bootloader in hardware devices (e.g., hard disk, CD, network). Then it loads the first bootloader it finds into the memory and executes it. The main goal of the bootloader is to load the OS. It changes the CPU’s mode from real-mode which is a very limited mode of the CPU (for example, we have access to only ~1MB of memory) to protected-mode, which is intended for running the OS. Then, it loads the OS. We could write our own bootloader, but I preferred to focus on the OS so I used an existing bootloader - GRUB.
As the OS, we control everything and we can implement anything we want in any way we want. But that comes with a big cost: we have to work very hard for everything since we cannot rely on any dependency: We don’t have drivers: no network, filesystem, monitor, or keyboard. We can’t use 3rd-party libraries, not even the C standard library. We can’t even use a standard C compiler. There is no user-space, everything is now kernel-space. All we have is RAM, CPU and its assembly instruction set. That’s it.
The first thing we want to do is to be able to write code in C instead of in assembly. For that, we need a compiler. The compiler assumes the software it compiles is going to run on the same machine (with the same OS and CPU architecture). We can configure the compiler not to use any of the current machine’s dependencies, but it’s prone to problems. The solution for that is to compile a compiler (I had to wait ~30 minutes for it to compile).
To do something interesting, we need to interact with more hardware devices such as the monitor, the NIC, and the keyboard. To do that, we need to write some drivers. A driver is a software that communicates directly with hardware devices.
There are 2 ways for communicating with devices:
- Mapped memory: a pre-defined address space with a pre-defined purpose that we read and write to, together with the hardware device.
- I/O ports: in and out assembly instructions, very similar to send/recv functions in sockets.
There are 2 difficulties concerning drivers:
- Hardware devices may vary a lot in their behavior and protocol and they are poorly documented if documented at all.
- They usually run in kernel-mode, which gives them more resources and permissions. If not written carefully they can crash the system.
The monitor driver
This is a pretty simple driver because it doesn’t involve interrupts (described in the next paragraph). There are multiple ways of interacting with the monitor. The most basic one is in VGA-text mode. In this mode, we mainly communicate through mapped memory. The memory represents a 2-dimensional array of characters shown on the screen. In order to set a blinking cursor, we will need to communicate directly with the device through I/O ports. For real graphics (i.e., drawing pixels and not just text), we need to write a driver that interacts with the graphics card in VGA mode. It is more complex, but possible and will work on most graphics cards. There is one downside - the driver will have low resolution. For high resolution, we will need to write a driver for a specific graphics card and it’s much more complex and much less documented online.
Other hardware devices generate input, which can come at any point in time. The CPU executes instructions serially from the memory. During program execution, hardware devices have events (such as a key press on the keyboard or data received from the NIC) that need to be handled. In order to handle those events, we have the interrupts mechanism. When an event from a device occurs, the CPU gets a hardware interrupt, which notifies the CPU about an event from a device hardware. Before every instruction the CPU executes, it checks for interrupts. If there is an interrupt, it stops his current program execution and runs the code that handles that specific interrupt. The code handling the interrupt is called ISR (Interrupt Service Routine, not to be confused with In-Service Register). Each device has a different handler (ISR) for its interrupts. The IDT (Interrupt Descriptor Table) maps between the interrupts and their handlers. After handling the interrupt, the CPU gets back to its previous context and continues to execute the program where it stopped. As part of our OS, we need to implement the IDT and an ISR for each interrupt.
GDT (Global Descriptor Table)
Now we start a bit with memory management. Implementing the GDT is required before implementing the IDT and ISRs. The GDT is a table that defines memory segments, it defines different address spaces and their permissions (for example, top 1GB of memory is the kernel data segment with read/write permissions and no execute permissions). With GDT we implement segmentation, which is obsolete but still required, so we’ll implement a flat-memory model (all segments overlap and cover the whole address space).
Back to drivers
Now that we understand hardware interrupts and we implemented GDT and IDT, we can work on more complex drivers. There are 2 big differences with these drivers:
- We need to implement an interrupt handler (ISR) and add it to the IDT
- They have a more complex protocol and behavior
For each one of them, we need to find documentation (which sometimes is not an easy task).
Those are the most important drivers we might implement next:
- Keyboard: it can be connected via USB or PS/2. USB is more advanced (allows more features such as custom keys) but it’s more complex.
- NIC: it is connected via a PCI connection. After finding it among other PCI devices, we need to implement the network stack (ARP, IP, UDP, TCP, etc), TCP being the hardest part.
- Hard drive: it is connected through ATA/PATA/SATA interface, allowing to read and write directly to sectors in the hard disk through I/O ports. We then need to implement a file system (such as FAT, NTFS, ext).
Advanced OS features
After coding the drivers, we gained a lot of capabilities we can use. We can decide what we want to implement and how: memory management, processes, threads, scheduling, process communication, userspace, API for userspace, POSIX-compatibility, UI (CLI, GUI), porting an existing software to our OS, and more.
Each part can be implemented in many ways, it can be extended and be taken to the extreme:
We could create an OS that loads all software from the internet.
We could implement an os-level, web-based GUI, similar to Electron.
We could make a snapshots mechanism like in VMs.
We could allow user-mode drivers.
We could implement most of the OS in a high-level language such as Python.
There are endless possibilities and ideas.
We started by understanding the boot sequence and our starting point. Then we set up our development environment, a basic memory structure, an interrupt infrastructure and drivers. I hope you got the general idea of how an OS is developed: where and how to start, what are the challenges, and what are the possibilities. In the next article, we will get into more details, stay tuned.