diff --git a/content/blog/MOP2/intro.adoc b/content/blog/MOP2/intro.adoc
new file mode 100644
index 0000000..6cf1f51
--- /dev/null
+++ b/content/blog/MOP2/intro.adoc
@@ -0,0 +1,189 @@
+= Intro to MOP2 programming
+Kamil Kowalczyk
+2025-10-14
+:jbake-type: post
+:jbake-tags: MOP2 osdev
+:jbake-status: published
+
+This is an introductory post on MOP2 (my-os-project2) user application programming.
+
+All source code (kernel, userspace and other files) is available at https://git.kamkow1lair.pl/kamkow1/my-os-project2.
+
+Let's start by doing the most basic thing ever: quitting an application.
+
+== AMD64 assembly
+
+.Hello program in AMD64 assembly
+[source,asm]
+----
+.section .text
+
+.global _start
+_start: // our application's entry point
+    movq $17, %rax // select proc_kill() syscall
+    movq $-1, %rdi // -1 means "self", so we don't need to call proc_getpid()
+    int $0x80 // perform the syscall
+    // We are dead!!
+----
+
+As you can see, even though we're on AMD64, we use `int $0x80` to perform a syscall.
+
+The technically correct and better way would be to implement support for `syscall/sysret`, but `int $0x80` is
+just easier to get going and requires way less setup. Maybe in the future the ABI will move towards
+`syscall/sysret`.
+
+`int $0x80` is not ideal, because it's a software interrupt and these come with a lot of interrupt overhead.
+Intel tried to solve this earlier with `sysenter/sysexit`, but those have fallen out of fashion due to their complexity.
+
+For the purposes of a silly hobby OS project, `int $0x80` is completely fine. We don't need to have the world's best
+performance (yet ;) ).
+
+=== "Hello world" and the `debugprint()` syscall
+
+Now that we have our first application, which can quit at a blazingly fast speed, let's try to print something.
+For now, we're not going to discuss IPC and pipes, because that's a little complex.
+
+The `debugprint()` syscall came about as the first syscall ever (it even has an ID of 1) and it was used for
+printing way before pipes were added into the kernel. It's still useful for debugging purposes, when we want to
+literally just print a string and not go through the entire pipeline of printf-style formatting and only then
+writing something to a pipe.
+
+.Usage of `debugprint()` in AMD64 assembly
+[source,asm]
+----
+.section .data
+
+STRING:
+    .string "Hello world!!!"
+STRING_LEN:
+    .quad . - STRING
+
+.section .text
+
+.global _start
+_start:
+    movq $1, %rax // select debugprint()
+    lea STRING(%rip), %rdi // load the address of STRING
+    lea STRING_LEN(%rip), %rsi // load the address of STRING_LEN
+    int $0x80
+
+    // quit
+    movq $17, %rax
+    movq $-1, %rdi
+    int $0x80
+----
+
+Why are we using `lea` to load stuff? Why not `movq`? Because we can't...
+
+We can't just `movq`, because the kernel doesn't support relocatable code - everything is loaded at a fixed
+address in a process's address space. Because of this, we have to address everything relative to `%rip`
+(the instruction pointer). We're essentially writing position-independent code (PIC) by hand. This is what
+the `-fPIC` GCC flag does, BTW.
+
+== Getting into C and some bits of `ulib`
+
+Now that we've gone over how to write some (very) basic programs in assembly, let's try to untangle how we get
+into C code and understand some portions of `ulib` - the userspace programming library.
+
+This code snippet should be understandable by now:
+._start.S
+[source,asm]
+----
+.extern _premain
+
+.global _start
+_start:
+    call _premain
+----
+
+Here `_premain()` is a C startup function that gets executed before running `main()`. `_premain()` is also
+responsible for quitting the application.
+
+._premain.c
+[source,c]
+----
+// Headers skipped.
+
+extern void main(void);
+extern uint8_t _bss_start[];
+extern uint8_t _bss_end[];
+
+void clearbss(void) {
+    uint8_t *p = _bss_start;
+    while (p < _bss_end) {
+        *p++ = 0;
+    }
+}
+
+#define MAX_ARGS 25
+static char *_args[MAX_ARGS];
+
+size_t _argslen;
+
+char **args(void) {
+    return (char **)_args;
+}
+
+size_t argslen(void) {
+    return _argslen;
+}
+
+// ulib initialization goes here
+void _premain(void) {
+    clearbss();
+
+    for (size_t i = 0; i < ARRLEN(_args); i++) {
+        _args[i] = umalloc(PROC_ARG_MAX);
+    }
+
+    proc_argv(-1, &_argslen, _args, MAX_ARGS);
+
+    main();
+    proc_kill(proc_getpid());
+}
+----
+
+First, in order to load our C application without UB from the get-go, we need to clear the `BSS` section of an
+ELF file (which MOP2 uses as its executable format). We use `_bss_start` and `_bss_end` symbols for that, which
+come from a linker script defined for user apps:
+
+.link.ld - linker script for user apps
+[source]
+----
+ENTRY(_start)
+
+SECTIONS {
+    . = 0x400000;
+
+    .text ALIGN(4K):
+    {
+        *(.text .text*)
+    }
+
+    .rodata (READONLY): ALIGN(4K)
+    {
+        *(.rodata .rodata*)
+    }
+
+    .data ALIGN(4K):
+    {
+        *(.data .data*)
+    }
+
+    .bss ALIGN(4K):
+    {
+        _bss_start = .;
+        *(.bss .bss*)
+        . = ALIGN(4K);
+        _bss_end = .;
+    }
+}
+----
+
+After that, we need to collect our application's command-line arguments (like argc and argv in UNIX-derived
+systems). To do that we use the `proc_argv()` syscall, which fills a preallocated memory buffer with them. The main
+limitation of this approach is that the caller must ensure that enough space within the buffer was allocated.
+25 arguments is enough for pretty much all applications on this system, but this is something that may be a little
+problematic in the future.
+
+After we've exited from `main()`, we just gracefully exit the application.
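The preallocated-buffer contract of `proc_argv()` described in the post can be tried out on a host machine. The sketch below is not ulib code and makes several assumptions: `mock_proc_argv()` is a hypothetical stand-in for the real syscall (its truncation behaviour is a guess), `PROC_ARG_MAX` is given an assumed value of 128, and plain `malloc()` replaces `umalloc()`. It only shows the shape of the pattern: allocate all `MAX_ARGS` buffers up front, then let the callee fill as many as it has.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ARGS 25
#define PROC_ARG_MAX 128 /* ASSUMPTION: the real value is not shown in the post */

/*
 * HYPOTHETICAL stand-in for the proc_argv() syscall, for host-side testing.
 * Copies at most maxargs strings from src into the caller's preallocated
 * buffers (PROC_ARG_MAX bytes each), truncating strings that do not fit.
 */
void mock_proc_argv(const char *const *src, size_t srclen,
                    size_t *argslen, char **args, size_t maxargs)
{
    size_t n = srclen < maxargs ? srclen : maxargs;

    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(src[i]);
        if (len > PROC_ARG_MAX - 1) {
            len = PROC_ARG_MAX - 1;
        }
        memcpy(args[i], src[i], len);
        args[i][len] = '\0';
    }
    *argslen = n;
}

/*
 * Mirrors the allocation pattern of _premain(): every buffer is allocated
 * before the call, because the callee cannot allocate on our behalf.
 * Returns 0 on success, -1 if an allocation fails.
 */
int collect_args(const char *const *src, size_t srclen,
                 char **args, size_t *argslen)
{
    assert(args != NULL && argslen != NULL);

    for (size_t i = 0; i < MAX_ARGS; i++) {
        args[i] = malloc(PROC_ARG_MAX); /* malloc() stands in for umalloc() */
        if (args[i] == NULL) {
            while (i > 0) {
                free(args[--i]);
            }
            return -1;
        }
    }

    mock_proc_argv(src, srclen, argslen, args, MAX_ARGS);
    return 0;
}
```

Passing 30 source strings to `collect_args()` leaves `argslen` at 25. In this sketch the extra arguments are simply dropped, which is one way the fixed `MAX_ARGS` limit could show up in practice.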
diff --git a/content/blog/2025/09/hellow-world.adoc b/content/blog/hellow-world.adoc
similarity index 100%
rename from content/blog/2025/09/hellow-world.adoc
rename to content/blog/hellow-world.adoc