Track 01 · complete

The Boot Path

Follow LainDOS from the reset vector to a running game in eight steps, each paired with the real source that does the work and a view of the memory it touches.

00
Reset vector

Power-on & BIOS

When an x86 powers on it can't run a program yet: there's nothing in RAM. The CPU starts executing at the reset vector (physical FFFF0h, 16 bytes below the top of the 1 MiB real-mode space), which holds a far jump into the motherboard ROM. That ROM is the BIOS: firmware that provides services before DOS exists, loads the boot sector, and exposes interrupts such as INT 13h disk I/O.

The BIOS runs POST: it counts RAM and initialises video, the PIC (Programmable Interrupt Controller, which masks and dispatches hardware IRQs such as keyboard IRQ1) and the PIT (Programmable Interval Timer, which drives the classic PC timer tick and many game timing loops). It then bootstraps: it reads the first 512-byte sector of the boot disk to physical 0x7C00, checks that its last two bytes are 55 AA, and far-jumps to it. The only thing it hands you in a register is DL, the drive you booted from.

AA55

The boot signature the BIOS checks at offset 510. boot.asm ends with times 510-($-$$) db 0 then dw 0xAA55 to guarantee it; the word is stored little-endian, so the bytes on disk are 55 AA.

DL

00h = floppy A:, 80h = first hard disk. boot.asm saves it immediately because the BIOS calls that follow do not preserve it.
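The signature check is easy to model outside the machine. The sketch below is a Python illustration, not LainDOS code; `has_boot_signature` is a hypothetical helper name, and the dummy sector is padded the same way the text says boot.asm pads itself.

```python
def has_boot_signature(sector: bytes) -> bool:
    """Return True if a 512-byte sector ends with the bytes 55 AA."""
    return len(sector) == 512 and sector[510:512] == b"\x55\xAA"

# Zero fill up to offset 510, then the word 0xAA55 stored little-endian,
# mirroring "times 510-($-$$) db 0" followed by "dw 0xAA55".
sector = bytes(510) + (0xAA55).to_bytes(2, "little")

print(has_boot_signature(sector))      # True
print(has_boot_signature(bytes(512)))  # False: all zeros, no signature
```

Writing the signature as a 16-bit word and letting the byte order fall out is the same trick the assembler source relies on.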

state → DL = boot drive · CS:IP = 0000:7C00
01
Establishing a stack

The boot sector takes over

We're now executing 512 bytes at 0x7C00 with no operating system underneath us — we have to build a sane machine state from scratch before doing anything else. That means: turn off interrupts, point all the segment registers somewhere known, and set up a stack.

Real-mode addresses are segment:offset, where the linear address is segment×16 + offset. Zeroing DS/ES/SS makes every address we write a flat offset from 0, which keeps the arithmetic in this tiny sector simple.
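As a quick check of the segment×16 + offset rule, here is a minimal Python sketch. The function name `linear` is ours, and the sample values are addresses that appear elsewhere on this page.

```python
def linear(segment: int, offset: int) -> int:
    """Real-mode linear address: segment * 16 + offset."""
    return (segment << 4) + offset

# Two different segment:offset pairs can name the same byte.
print(hex(linear(0x0000, 0x7C00)))  # 0x7c00
print(hex(linear(0x07C0, 0x0000)))  # 0x7c00
# With every segment register zeroed, the offset is the linear address.
print(linear(0x0000, 0x0500) == 0x0500)  # True
```

This aliasing is why zeroing DS/ES/SS matters: it pins down which of the many equivalent pairs the boot sector's labels are computed against.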

src/boot.asm · NASM · 16-bit
31  boot:
32      cli                 ; no interrupts during setup
33      xor ax,ax
34      mov ds,ax
35      mov es,ax
36      mov ss,ax
37      mov sp,0x7C00       ; stack grows down from us
38      sti
39      cld                 ; string ops count upward
40      mov [drv],dl        ; remember the boot drive
32

cli clears the interrupt flag (IF=0). Between loading SS and SP the stack is momentarily inconsistent; an interrupt firing there would push onto a garbage stack, so we mask them.

33

xor ax,ax is the idiomatic two-byte way to zero a register. You can't mov ds, 0 directly: segment registers cannot be loaded from an immediate, only from a general-purpose register or memory, so AX is the courier.

37

SP set to 0x7C00 means the stack grows downward from just below our code (pushes decrement SP), so it never overwrites the 512 bytes living at 0x7C00–0x7DFF.

39

cld clears the direction flag (DF=0) so lodsb / movsb / cmpsb auto-increment SI and DI — the boot sector relies on this for its string compares and copies.

40

DL is copied into a local variable (and to fixed address 0x500) so the value survives the BIOS calls to come, which clobber it.

state → SS:SP = 0000:7C00 · [drv] = DL
01b
FAT12 by hand

…and loads the kernel off the floppy

There's no file system driver yet, so the boot sector is one. Using the BIOS Parameter Block (BPB, the FAT boot-sector table that describes sector size, FAT location, root directory size, and cluster layout) baked into its own first bytes, it reads the FAT to segment 0x60 and the root directory to 0x0A20, then scans 32-byte directory entries for the 8.3 name "KERNEL  SYS" (the name padded with spaces to 8 characters, followed by the 3-character extension). That 0x0A20 copy is boot-sector staging; the relocated kernel later reloads active root directories into ROOT_SEG at 0x0B40.

Once found, it follows the file's cluster chain. FAT12, the 12-bit FAT format used by 1.44 MB floppy images, packs each 12-bit entry across byte boundaries, so fat_next reads a word and shifts it right by 4 bits for odd clusters (even clusters keep the low 12 bits). Each cluster is read with INT 13h, the BIOS disk interrupt, to LOAD_SEG (0x1000); when the chain hits an end-of-chain marker (≥ 0xFF8) the boot sector jumps into the loaded kernel.
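The 12-bit unpacking is the fiddly part, so here is a small Python model of it. This is a sketch of the standard FAT12 decoding rule, not a transcription of fat_next; the function names and the nine-byte sample FAT are made up for illustration.

```python
def fat12_next(fat: bytes, cluster: int) -> int:
    """Return the FAT12 entry for `cluster` (the next cluster in the chain)."""
    offset = cluster + cluster // 2              # each entry is 1.5 bytes
    word = fat[offset] | (fat[offset + 1] << 8)  # little-endian 16-bit read
    # Odd clusters live in the high 12 bits, even clusters in the low 12.
    return word >> 4 if cluster & 1 else word & 0x0FFF

def cluster_chain(fat: bytes, first: int) -> list:
    """Follow the chain until an end-of-chain marker (>= 0xFF8)."""
    chain = []
    cluster = first
    while cluster < 0xFF8:
        chain.append(cluster)
        cluster = fat12_next(fat, cluster)
    return chain

# Hypothetical FAT: entries 0 and 1 are reserved, 2 -> 3 -> 4 -> end (0xFFF).
fat = bytes([0xF0, 0xFF, 0xFF, 0x03, 0x40, 0x00, 0xFF, 0x0F, 0x00])
print(cluster_chain(fat, 2))  # [2, 3, 4]
```

A real loader would also guard against loops and out-of-range entries in a corrupt FAT; the sketch leaves that out to keep the bit manipulation visible.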

src/boot.asm · NASM · 16-bit
 84      mov si,kn           ; "KERNEL  SYS"
 85      mov cx,11
 86      repe cmpsb          ; compare 11 bytes

 96  fk: mov si,[es:di+26]   ; first cluster from dir entry

122  ldk:
123      mov dl,[drv]
124      db 0xEA             ; far jmp opcode
125      dw 0                ; offset
126      dw LOAD_SEG         ; segment -> 1000:0000
86

repe cmpsb compares DS:SI (our target name) with ES:DI (the directory entry) byte by byte, CX times, stopping early on a mismatch. CX=11 covers the 8-char name + 3-char extension with no separator.

96

Offset 26 of a FAT directory entry is the starting cluster number. Everything about reading the file flows from this one 16-bit value.

124

A hand-assembled far jump: EA + 16-bit offset + 16-bit segment. It loads CS=0x1000, IP=0 and starts executing the kernel. DL is reloaded just before so the kernel still knows the boot drive.
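The directory scan and the hand-assembled jump described in the notes above can both be modelled in a few lines of Python. This is an illustrative sketch only: `to_83`, `find_first_cluster` and `far_jmp` are hypothetical helpers, and the directory contents are invented.

```python
import struct

def to_83(filename: str) -> bytes:
    """Convert 'KERNEL.SYS' to the 11-byte padded form stored on disk."""
    name, _, ext = filename.upper().partition(".")
    return name.ljust(8).encode("ascii") + ext.ljust(3).encode("ascii")

def find_first_cluster(root_dir: bytes, filename: str):
    """Scan 32-byte entries; return the word at offset 26 of the match."""
    target = to_83(filename)
    for pos in range(0, len(root_dir), 32):
        if root_dir[pos:pos + 11] == target:
            return struct.unpack_from("<H", root_dir, pos + 26)[0]
    return None

def far_jmp(segment: int, offset: int) -> bytes:
    """Encode a 16-bit far jump: opcode EA, offset word, segment word."""
    return struct.pack("<BHH", 0xEA, offset, segment)

# Hypothetical root directory: one empty slot, then KERNEL.SYS at cluster 2.
entry = bytearray(32)
entry[0:11] = b"KERNEL  SYS"
struct.pack_into("<H", entry, 26, 2)
root = bytes(32) + bytes(entry)

print(to_83("kernel.sys"))                     # b'KERNEL  SYS'
print(find_first_cluster(root, "KERNEL.SYS"))  # 2
print(far_jmp(0x1000, 0).hex())                # ea00000010
```

Note that the on-disk name has no dot: the 11-byte compare works only because both sides use the padded form.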

state → KERNEL.SYS resident at 1000:0000
02
Moving day

The kernel relocates itself

The kernel is assembled with org 0 and was loaded high at 0x1000, but it wants to live low so the entire span above it (0x1000–0xA000) is free for programs. So its first act is to copy itself down to RELOC_SEG (0x0340) and far-jump into the copy.

The entry point is re-entrant by design: it compares CS against RELOC_SEG and only relocates if it isn't already there. After the jump it points every segment register at 0x0340, installs a real stack at 0xC800, and resets the FPU.

src/kernel.asm · NASM · 16-bit
 95  kernel_entry:
 96      mov ax, cs
 97      cmp ax, RELOC_SEG           ; already low?
 98      je .relocated

111      mov ax, RELOC_SEG
112      mov es, ax
115      mov cx, kernel_end
117      rep movsb                   ; copy kernel -> 0x0340
118      jmp RELOC_SEG:.relocated
119  .relocated:
123      mov ss, ax
124      mov sp, KERNEL_STACK_TOP    ; 0xC800
127      fninit
97

CS is read into AX and compared with 0x0340. On the first pass CS is still 0x1000, so the branch falls through into the copy; the second time (post-jump) it's 0x0340 and we skip straight to .relocated.

117

rep movsb copies CX bytes from DS:SI to ES:DI. With CX = kernel_end (the assembled size), DS=0x1000 and ES=0x0340, the whole kernel image is moved down 0xCC0 paragraphs in one instruction.

118

The far jump is what actually changes CS to 0x0340; until now we were still executing the original copy at 0x1000. Code can't mov into CS; only a far control transfer (far jmp, far call, retf or iret) reloads it.

124

SP is set to 0xC800. Combined with SS=0x0340 that's linear 0x0340×16 + 0xC800 = 0FC00h: a stack sitting safely below the MCB arena and above the FAT16 root buffer (there are compile-time assertions guarding both boundaries).

127

fninit resets the x87 FPU to a known state so that later floating-point code, and games that probe for a coprocessor, behave predictably.
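To see the relocation and the stack placement together, the Python sketch below models the first 1 MiB as a bytearray. The constants are the values quoted on this page; the 1 KiB "kernel" image and the `relocate` helper are stand-ins, and the checks are only a loose analogue of the compile-time assertions mentioned above, not the real ones.

```python
LOAD_SEG, RELOC_SEG = 0x1000, 0x0340
ROOT_SEG, MCB_START = 0x0B40, 0x1000
KERNEL_STACK_TOP = 0xC800

def relocate(mem: bytearray, size: int) -> None:
    """Model of rep movsb: copy `size` bytes from LOAD_SEG:0 to RELOC_SEG:0."""
    src, dst = LOAD_SEG << 4, RELOC_SEG << 4
    mem[dst:dst + size] = mem[src:src + size]

mem = bytearray(1 << 20)               # 1 MiB of real-mode address space
image = bytes(range(256)) * 4          # hypothetical 1 KiB kernel image
mem[LOAD_SEG << 4:(LOAD_SEG << 4) + len(image)] = image
relocate(mem, len(image))

stack_top = (RELOC_SEG << 4) + KERNEL_STACK_TOP
print(hex(LOAD_SEG - RELOC_SEG))       # 0xcc0 paragraphs moved down
print(hex(stack_top))                  # 0xfc00
# The stack top sits above the root buffer and below the MCB arena.
print((ROOT_SEG << 4) < stack_top <= (MCB_START << 4))  # True
```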

state → CS = 0340 · SS:SP = 0340:C800
03
Hello, hardware

Bring-up: serial, VGA, memory

With a stack in place the kernel brings up its I/O and takes inventory of memory. Serial comes first because LainDOS logs almost everything over COM1, the first PC serial port and the project's primary debugging channel; then the VGA text screen is cleared and "LainDOS booted" is printed.

Two BIOS calls measure RAM. INT 12h returns conventional memory (the classic ≤640 KiB) in AX; INT 15h/AH=88h returns the extended memory above 1 MiB for the XMS (Extended Memory Specification) shim, capped so the BIOS-backed block move stays under 16 MiB.

src/kernel.asm · NASM · 16-bit
129      call serial_init
130      call vga_clear
132      mov si, msg_booted      ; "LainDOS booted"
133      call serial_print
135      int 0x12                ; conventional KiB -> AX
136      mov [mem_kib], ax

145      call init_xms_size      ; INT 15h AH=88h
129

serial_init programs the 8250/16550 UART at port 0x3F8: it sets the divisor latch for the baud rate and configures 8 data bits, no parity, 1 stop bit (8N1). After this, serial_print can poll the line-status register and shift bytes out.

135

int 0x12 is a BIOS service that takes no inputs and returns the conventional-memory size in KiB in AX (typically 640). It's the simplest possible way to learn how much low RAM exists.

145

init_xms_size issues INT 15h with AH=88h; the BIOS returns KiB of extended memory in AX. LainDOS clamps it to XMS_MAX_KB (15360, i.e. 15 MiB on top of the first 1 MiB) so its INT 15h-based XMS block moves never address past 16 MiB.
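The clamp is simple arithmetic, sketched here in Python. XMS_MAX_KB is the value quoted above; `clamp_extended_kib` and the sample BIOS readings are hypothetical.

```python
XMS_MAX_KB = 15360          # 15 MiB, as quoted in the text
FIRST_MIB_KB = 1024         # extended memory starts above the first 1 MiB

def clamp_extended_kib(bios_kib: int) -> int:
    """Limit the INT 15h/AH=88h result to what the XMS shim will use."""
    return min(bios_kib, XMS_MAX_KB)

print(clamp_extended_kib(7168))    # 7168: an 8 MiB machine is left alone
print(clamp_extended_kib(65535))   # 15360: a large reading is capped
# Highest address reachable after clamping, in MiB:
print((FIRST_MIB_KB + XMS_MAX_KB) // 1024)  # 16
```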

state → memory sized · serial live
04
The vector table

Installing the interrupts

This is the moment LainDOS becomes a DOS. The Interrupt Vector Table (IVT) lives at physical 0: 256 entries, each a 4-byte far pointer (offset word, then segment word). By writing its own routines into specific slots, the kernel makes the standard DOS software interrupts call its code.

INT 21h is the DOS API itself, with AH selecting the service; INT 20h terminates a program; INT 2Fh is the multiplex interrupt, used here for XMS; INT 33h is the conventional mouse-driver interrupt, served by the built-in mouse driver. The kernel also claims three CPU exception vectors (01h debug, 06h invalid opcode, 0Dh general protection) so a crashing program dumps state over serial instead of hanging silently.

src/kernel.asm · NASM · 16-bit
1414  init_interrupts:
1418      xor ax, ax
1419      mov es, ax                              ; ES=0 -> the IVT
1420      mov [es:0x20*4], word int20_handler     ; terminate
1421      mov [es:0x20*4+2], cs
1422      mov [es:0x21*4], word int21_handler     ; DOS API
1423      mov [es:0x21*4+2], cs
1425      mov [es:0x2F*4], word int2f_handler     ; XMS mux
1435      mov [es:0x33*4], word int33_handler     ; mouse
1419

ES is zeroed so ES:offset addresses the IVT at the very bottom of memory. Each vector N is at byte N×4.

1420

Vector 0x20 (terminate) is at byte 0x80. The low word gets the handler's offset within the kernel segment…

1421

…and the high word gets CS (0x0340). Together they form the far pointer the CPU jumps to when int 20h executes. Every handler is installed as this offset+segment pair.

1422

INT 21h is the heart of the DOS API: AH selects the function (09h print string, 3Dh open, 3Fh read, 4Ch exit, …). Pointing 0x21 at int21_handler is what lets ordinary DOS programs talk to LainDOS.
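The offset-then-segment layout of a vector is easy to get backwards, so here is a Python model of the table. It is a sketch: `set_vector` and `get_vector` are our names, and the handler offset 0x1234 is a placeholder, not the real address of int21_handler.

```python
import struct

ivt = bytearray(256 * 4)   # 256 vectors, 4 bytes each, at physical 0

def set_vector(table: bytearray, n: int, segment: int, offset: int) -> None:
    """Store a far pointer in slot n: offset word first, then segment word."""
    struct.pack_into("<HH", table, n * 4, offset, segment)

def get_vector(table: bytearray, n: int):
    """Read slot n back as a (segment, offset) pair."""
    offset, segment = struct.unpack_from("<HH", table, n * 4)
    return segment, offset

set_vector(ivt, 0x21, 0x0340, 0x1234)   # 0x1234 is a hypothetical offset

print(hex(0x21 * 4))              # 0x84: byte address of vector 21h
print(ivt[0x84:0x88].hex())       # 34124003
print(get_vector(ivt, 0x21) == (0x0340, 0x1234))  # True
```

Reading the raw bytes back shows both little-endian words: 34 12 for the offset, then 40 03 for the segment.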

state → INT 20h/21h/2Fh/33h live
05
Carving up RAM

Memory arena & filesystem

DOS hands out memory as a chain of Memory Control Blocks (MCBs): 16-byte headers, each describing the block that follows it. LainDOS seeds the chain with a single block covering everything from 0x1000 up to VGA memory at 0xA000, marked 'Z' (the last/only block) and owner 0 (free).
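A Python sketch of that seed header follows. The field offsets (signature at byte 0, owner at bytes 1 and 2, size at bytes 3 and 4) are the ones this step describes; `make_mcb` is a hypothetical helper, and MCB_START and MEM_TOP take the values quoted above.

```python
import struct

MCB_START = 0x1000   # segment of the first (and only) header
MEM_TOP = 0xA000     # segment where VGA memory begins

def make_mcb(signature: bytes, owner: int, size_paragraphs: int) -> bytearray:
    """Build a 16-byte MCB header; bytes 5..15 are left as zero here."""
    header = bytearray(16)
    struct.pack_into("<cHH", header, 0, signature, owner, size_paragraphs)
    return header

size = MEM_TOP - MCB_START - 1          # minus the one-paragraph header
mcb = make_mcb(b"Z", 0, size)

print(hex(size))                        # 0x8fff
print(size * 16)                        # 589808 bytes, just under 576 KiB
print(mcb[:5].hex())                    # 5a0000ff8f
```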

It then re-validates the BPB the boot sector copied up and derives the on-disk layout: where the FAT starts, where the root directory sits, where the data region begins, and the maximum cluster number. Both FAT12 and FAT16 (the 16-bit FAT format LainDOS uses for larger hard-disk style images) are supported.

src/kernel.asm · NASM · 16-bit
159      mov ax, MCB_START               ; 0x1000
160      mov es, ax
161      mov byte [es:0], MCB_SIG_Z      ; 'Z' = last block
162      mov word [es:1], 0              ; owner 0 = free
163      mov ax, MEM_TOP - MCB_START - 1
164      mov word [es:3], ax             ; size in paragraphs

171      call init_bpb_geometry
161

Byte 0 of an MCB is its signature: 'M' means another block follows, 'Z' means it's the last one. A single 'Z' block is the simplest valid arena.

162

Bytes 1–2 are the owner PSP segment; 0 means free. When a program allocates, this gets stamped with the segment of its PSP (Program Segment Prefix) so the memory can be reclaimed on exit.

164

Bytes 3–4 hold the block size in 16-byte paragraphs. MEM_TOP − MCB_START − 1 = 0xA000 − 0x1000 − 1 = 0x8FFF paragraphs ≈ 576 KiB, the whole conventional arena minus the one-paragraph header.

171

init_bpb_geometry checks that bytes/sector is 512 and that sectors-per-cluster is a power of two, then computes krsta/kdsta (FAT and data start) and kmax_cluster, picking FAT12 vs FAT16 from the file-system type string.
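The layout arithmetic can be sketched in Python using the standard BPB field offsets (bytes per sector at offset 11 through sectors per FAT at offset 22). This is not a port of init_bpb_geometry: the function and key names are ours, the FAT12/FAT16 decision is left out, and the sample values are those of a typical 1.44 MB floppy.

```python
import struct

def bpb_geometry(boot_sector: bytes) -> dict:
    """Derive FAT/root/data start sectors and the highest cluster number."""
    (bps, spc, reserved, nfats, root_entries,
     total, _media, spf) = struct.unpack_from("<HBHBHHBH", boot_sector, 11)
    if bps != 512:
        raise ValueError("bytes/sector must be 512")
    if spc == 0 or spc & (spc - 1):
        raise ValueError("sectors/cluster must be a power of two")
    fat_start = reserved
    root_start = fat_start + nfats * spf
    root_sectors = (root_entries * 32 + bps - 1) // bps
    data_start = root_start + root_sectors
    clusters = (total - data_start) // spc
    return {"fat_start": fat_start, "root_start": root_start,
            "data_start": data_start, "max_cluster": clusters + 1}

# Hypothetical boot sector carrying 1.44 MB floppy values.
sector = bytearray(512)
struct.pack_into("<HBHBHHBH", sector, 11, 512, 1, 1, 2, 224, 2880, 0xF0, 9)
print(bpb_geometry(bytes(sector)))
# {'fat_start': 1, 'root_start': 19, 'data_start': 33, 'max_cluster': 2848}
```

The highest cluster number is the cluster count plus one because data clusters are numbered from 2.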

state → MCB chain armed · BPB validated
06
MZ or not MZ

Finding & sizing the program

Now the kernel loads the program it exists to run. resolve_path locates the boot file by name and returns its directory entry; from there it reads the starting cluster and file size, and pulls the first sector into the sector buffer.

The crucial test is two bytes. DOS executables come in two shapes: a flat .COM image, or a relocatable .EXE that begins with the signature 'MZ'. Read little-endian, those bytes form the word 5A4Dh; one compare decides how much memory to reserve and how the image is laid out.

src/kernel.asm · NASM · 16-bit
183      mov si, fname_exe
184      call resolve_path           ; find it on disk

217      mov dx, SEC_BUF
218      mov ds, dx                  ; DS -> first sector
219      cmp word [0x0000], 0x5A4D   ; 'MZ' ?
220      je .peek_mz                 ; -> EXE path
184

resolve_path walks the path one 8.3 component at a time, descending subdirectories, and returns the matching directory entry in ES:DI (or sets carry if not found). The default name comes from -DBOOT_FILE at assembly time (SHELL.COM on this image).

218

DS is pointed at SEC_BUF (0x0B00) so [0x0000] reads the very first word of the file we just loaded.

219

The bytes 'M','Z' are 4Dh,5Ah on disk; little-endian that's the word 0x5A4D. (The format is named after Mark Zbikowski, who designed it.) A match means a header-bearing .EXE…

220

…and the EXE path computes minimum/maximum paragraph counts from the header. No match falls through to the COM path, which just reserves the file size plus a fixed slack.
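The same two-byte test, plus a peek at the header's allocation fields, looks like this in Python. It is a sketch under assumptions: `classify_image` is a hypothetical name, the minimum and maximum extra-paragraph words are read from their standard MZ header offsets (0Ah and 0Ch), and how LainDOS turns them into a block size is not shown.

```python
import struct

MZ_WORD = 0x5A4D   # bytes 'M','Z' read as a little-endian word

def classify_image(first_sector: bytes):
    """Return ('EXE', min_paras, max_paras) or ('COM', None, None)."""
    (signature,) = struct.unpack_from("<H", first_sector, 0)
    if signature == MZ_WORD:
        min_alloc, max_alloc = struct.unpack_from("<HH", first_sector, 0x0A)
        return ("EXE", min_alloc, max_alloc)
    return ("COM", None, None)

# Hypothetical headers for illustration.
exe = bytearray(32)
exe[0:2] = b"MZ"
struct.pack_into("<HH", exe, 0x0A, 0x0010, 0xFFFF)
com = bytes([0xE9, 0x00, 0x01]) + bytes(29)   # starts with a jmp, no header

print(classify_image(bytes(exe)))   # ('EXE', 16, 65535)
print(classify_image(com))          # ('COM', None, None)
```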

state → image type resolved
07
Showtime

Handoff — the game runs

The kernel allocates a block from the MCB arena, stamps it with the new owner, loads the image cluster-by-cluster just above its Program Segment Prefix (PSP), and builds that 256-byte PSP (command tail, environment pointer, default or inherited file handles).
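For a feel of what goes into those 256 bytes, here is a partial Python model using the conventional PSP offsets: the INT 20h instruction at 00h, the top-of-memory segment at 02h, the environment segment at 2Ch and the command tail at 80h. `sketch_psp` is our name, not the kernel's build_psp; the handle table is omitted, and the sample segments are invented.

```python
import struct

def sketch_psp(top_of_memory_seg: int, env_seg: int, tail: bytes) -> bytearray:
    """Fill in a few well-known PSP fields; everything else stays zero."""
    psp = bytearray(256)
    psp[0:2] = b"\xCD\x20"                          # INT 20h (terminate)
    struct.pack_into("<H", psp, 0x02, top_of_memory_seg)
    struct.pack_into("<H", psp, 0x2C, env_seg)
    tail = tail[:126]                               # leave room for the CR
    psp[0x80] = len(tail)                           # length byte
    psp[0x81:0x81 + len(tail)] = tail               # the tail itself
    psp[0x81 + len(tail)] = 0x0D                    # terminating CR
    return psp

psp = sketch_psp(0xA000, 0x0FF0, b" midemo")        # hypothetical values
print(psp[0:2].hex())                # cd20
print(psp[0x80], psp[0x81:0x88])     # 7 bytearray(b' midemo')
```

A .COM image loaded directly after this block starts at offset 0x100, which is why a .COM entry point is always 0x100.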

For an .EXE, setup_exe_dyn walks the relocation table, adding the load segment to each flagged word so far pointers resolve correctly, then sets the entry CS:IP and SS:SP from the header and transfers control. For a .COM it simply jumps to offset 0x100. Control won't return until the program calls INT 20h or INT 21h/AH=4Ch.

src/kernel.asm · NASM · 16-bit
346      mov ax, [cs:prog_seg]
347      call build_psp          ; 256-byte PSP

362      mov ax, [cs:prog_seg]
363      call setup_exe_dyn      ; relocate, set CS:IP/SS:SP, go

         ; control returns only via INT 20h
         ; or INT 21h AH=4Ch (terminate).
347

build_psp constructs the Program Segment Prefix at offset 0 of the program's segment: the INT 20h terminate instruction, the top-of-memory word, the environment segment, the command-line tail and the inherited/default handle table. DOS programs expect DS/ES to point here on entry.

363

setup_exe_dyn applies the .EXE relocations (each is a far-pointer word that needs the actual load segment added), then loads CS:IP and SS:SP from the header and jumps. The very next thing on screen is the program's own output — for the demo image, the SHELL.COM prompt. Run midemo from there to start Monkey Island.
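The relocation pass itself is a short loop. The Python sketch below shows the general MZ technique (each table entry is an offset:segment pair locating one 16-bit word in the load image); it is not the body of setup_exe_dyn, and the image, table and load segment are invented for the example.

```python
import struct

def apply_relocations(image: bytearray, relocs, load_seg: int) -> None:
    """Add load_seg to every word named by an (offset, segment) entry."""
    for offset, segment in relocs:
        pos = segment * 16 + offset                  # position in the image
        (value,) = struct.unpack_from("<H", image, pos)
        struct.pack_into("<H", image, pos, (value + load_seg) & 0xFFFF)

# Hypothetical image: a far pointer 0001:0010 stored at image offset 4,
# whose segment word (at offset 6) is flagged for relocation.
image = bytearray(32)
struct.pack_into("<HH", image, 4, 0x0010, 0x0001)
relocs = [(6, 0x0000)]

apply_relocations(image, relocs, load_seg=0x2000)
print([hex(w) for w in struct.unpack_from("<HH", image, 4)])
# ['0x10', '0x2001']
```

Only the segment half of the pointer changes; the offset half is already correct relative to its segment.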

state → program owns the CPU
Memory · step 00 (1/9)
Low memory, by segment:
A000  VGA graphics memory
1000  Program + environment arena
0B40  Root directory buffer
0B20  Read cache buffer
0B00  Sector buffer
0340  Relocated kernel
0060  FAT scratch buffer
0040  BIOS Data Area
0000  Interrupt Vector Table

Registers & state after
CS:IP   0000:7C00   where the boot sector starts
DL      00h / 80h   boot drive (only reliable input)
others  undefined   DS, ES, SS, SP must be set by us