MOP2 programming intro
189
content/blog/MOP2/intro.adoc
Normal file
@ -0,0 +1,189 @@
= Intro to MOP2 programming
Kamil Kowalczyk
2025-10-14
:jbake-type: post
:jbake-tags: MOP2 osdev
:jbake-status: published

This is an introduction to MOP2 (my-os-project2) user application programming.

All source code (kernel, userspace and other files) is available at https://git.kamkow1lair.pl/kamkow1/my-os-project2.

Let's start by doing the most basic thing ever: quitting an application.

== AMD64 assembly

.Hello program in AMD64 assembly
[source,asm]
----
.section .text

.global _start
_start:                 // our application's entry point
    movq $17, %rax      // select proc_kill() syscall
    movq $-1, %rdi      // -1 means "self", so we don't need to call proc_getpid()
    int $0x80           // perform the syscall
    // We are dead!!
----

As you can see, even though we're on AMD64, we use `int $0x80` to perform a syscall.

The technically correct and better way would be to implement support for `syscall/sysret`, but `int $0x80` is
just easier to get going and requires way less setup. Maybe in the future the ABI will move towards
`syscall/sysret`.

`int $0x80` is not ideal, because it's a software interrupt, and those come with a lot of overhead.
Intel tried to solve this earlier with `sysenter/sysexit`, but they've fallen out of fashion due to complexity.

For the purposes of a silly hobby OS project, `int $0x80` is completely fine. We don't need the world's best
performance (yet ;) ).

=== "Hello world" and the `debugprint()` syscall

Now that we have our first application, which can quit at a blazingly fast speed, let's try to print something.
For now, we're not going to discuss IPC and pipes, because that's a little complex.

The `debugprint()` syscall came about as the first syscall ever (it even has an ID of 1) and it was used for
printing way before pipes were added to the kernel. It's still useful for debugging purposes, when we want to
literally just print a string and not go through the entire pipeline of printf-style formatting and only then
writing something to a pipe.

.Usage of `debugprint()` in AMD64 assembly
[source,asm]
----
.section .data

STRING:
    .string "Hello world!!!"
STRING_LEN:
    .quad . - STRING

.section .text

.global _start
_start:
    movq $1, %rax               // select debugprint()
    lea STRING(%rip), %rdi      // load the address of STRING
    lea STRING_LEN(%rip), %rsi  // load the address of STRING_LEN
    int $0x80

    // quit
    movq $17, %rax
    movq $-1, %rdi
    int $0x80
----

Why are we using `lea` to load stuff? Why not `movq`? Because we can't...

We can't just `movq`, because the kernel doesn't support relocatable code - everything is loaded at a fixed
address in a process's address space. Because of this, we have to address everything relative to `%rip`
(the instruction pointer). We're essentially writing position-independent code (PIC) by hand. This is what
the `-fPIC` GCC flag does, BTW.
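
Both listings above follow the same pattern: syscall number in `%rax`, first argument in `%rdi`, then `int $0x80`.
As a bridge to the C section, here is a minimal sketch of that pattern wrapped in GCC inline assembly. It is only
an illustration: `mop2_syscall1()`, `mop2_quit_self()` and the `MOP2_*` constants are made-up names, not `ulib`
functions, and the sketch assumes the kernel may overwrite `%rax` and leaves the other registers alone, which this
post doesn't confirm.

```c
#include <stdint.h>

/* Syscall numbers and the "self" PID, taken from the assembly listings.
 * The MOP2_* names themselves are hypothetical. */
enum {
    MOP2_SYS_DEBUGPRINT = 1,
    MOP2_SYS_PROC_KILL  = 17,
    MOP2_PID_SELF       = -1
};

#if defined(__x86_64__)
/* Hypothetical helper: issue a one-argument syscall.
 * The number goes in %rax ("a"), the argument in %rdi ("D").
 * Assumption: the kernel may overwrite %rax, so it is marked
 * read-write ("+a") and its value is ignored afterwards. */
static inline void mop2_syscall1(int64_t num, int64_t arg1) {
    __asm__ volatile("int $0x80"
                     : "+a"(num)
                     : "D"(arg1)
                     : "memory");
}

/* Same effect as the first assembly program: proc_kill(-1). */
static inline void mop2_quit_self(void) {
    mop2_syscall1(MOP2_SYS_PROC_KILL, MOP2_PID_SELF);
}
#endif
```

This only makes sense when running on MOP2. On other systems `int $0x80` means something entirely different, so
don't call `mop2_quit_self()` on your development machine.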

== Getting into C and some bits of `ulib`

Now that we've gone over how to write some (very) basic programs in assembly, let's untangle how we get
into C code and understand some portions of `ulib`, the userspace programming library.

This code snippet should be understandable by now:

._start.S
[source,asm]
----
.extern _premain

.global _start
_start:
    call _premain
----

Here `_premain()` is a C startup function that gets executed before running `main()`. `_premain()` is also
responsible for quitting the application.

._premain.c
[source,c]
----
// Headers skipped.

extern void main(void);
// Provided by the linker script (link.ld)
extern uint8_t _bss_start[];
extern uint8_t _bss_end[];

// Zero every byte between _bss_start and _bss_end
void clearbss(void) {
    uint8_t *p = _bss_start;
    while (p < _bss_end) {
        *p++ = 0;
    }
}

#define MAX_ARGS 25
static char *_args[MAX_ARGS];

size_t _argslen;

// Accessors for the arguments collected in _premain()
char **args(void) {
    return (char **)_args;
}

size_t argslen(void) {
    return _argslen;
}

// ulib initialization goes here
void _premain(void) {
    clearbss();

    // Preallocate one buffer per possible argument
    for (size_t i = 0; i < ARRLEN(_args); i++) {
        _args[i] = umalloc(PROC_ARG_MAX);
    }

    // -1 means "self"
    proc_argv(-1, &_argslen, _args, MAX_ARGS);

    main();
    proc_kill(proc_getpid());
}
----

First, to start our C application without undefined behavior (UB) from the get-go, we need to zero the `.bss`
section of the ELF file (ELF is MOP2's executable format). We use the `_bss_start` and `_bss_end` symbols for
that, which come from the linker script defined for user apps:

.link.ld - linker script for user apps
[source]
----
ENTRY(_start)

SECTIONS {
    . = 0x400000;

    .text ALIGN(4K):
    {
        *(.text .text*)
    }

    .rodata (READONLY): ALIGN(4K)
    {
        *(.rodata .rodata*)
    }

    .data ALIGN(4K):
    {
        *(.data .data*)
    }

    .bss ALIGN(4K):
    {
        _bss_start = .;
        *(.bss .bss*)
        . = ALIGN(4K);
        _bss_end = .;
    }
}
----
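
To look at the zeroing loop from `clearbss()` in isolation, here is a small host-side sketch that can be compiled
on any machine. It is not `ulib` code: `fake_bss` is a made-up array standing in for the region between
`_bss_start` and `_bss_end`, and `clear_range()`, `range_is_zero()` and `demo_clear_fake_bss()` are hypothetical
names.

```c
#include <stddef.h>
#include <stdint.h>

/* Zero every byte in the half-open range [start, end),
 * the same walk that clearbss() does over the real .bss. */
static void clear_range(uint8_t *start, uint8_t *end) {
    for (uint8_t *p = start; p < end; p++) {
        *p = 0;
    }
}

/* Return 1 if every byte in [start, end) is zero, 0 otherwise. */
static int range_is_zero(const uint8_t *start, const uint8_t *end) {
    for (const uint8_t *p = start; p < end; p++) {
        if (*p != 0) {
            return 0;
        }
    }
    return 1;
}

/* Fill a stand-in buffer with garbage, clear it, and report
 * whether it ended up fully zeroed. */
static int demo_clear_fake_bss(void) {
    uint8_t fake_bss[64];
    for (size_t i = 0; i < sizeof fake_bss; i++) {
        fake_bss[i] = 0xAA; /* pretend the memory was dirty */
    }
    clear_range(fake_bss, fake_bss + sizeof fake_bss);
    return range_is_zero(fake_bss, fake_bss + sizeof fake_bss);
}
```

Note that in the linker script `_bss_end` is set after `. = ALIGN(4K)`, so the real loop also clears the padding
up to the next 4K boundary.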

After that, we need to collect our application's command-line arguments (like `argc` and `argv` in UNIX-derived
systems). To do that we use the `proc_argv()` syscall, which fills a preallocated memory buffer with them. The main
limitation of this approach is that the caller must ensure that enough space was allocated within the buffer.
25 arguments is enough for pretty much all applications on this system, but this may become a little
problematic in the future.
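
To make the limitation concrete, here is a host-side sketch of the "caller preallocates, callee fills" pattern.
It does not call the real syscall: `sketch_fill_args()` is a hypothetical stand-in for `proc_argv()`,
`SKETCH_ARG_MAX` is a made-up value standing in for `PROC_ARG_MAX`, and dropping the extra arguments is an
assumption (this post doesn't say what the kernel does when there are more than the caller has room for).

```c
#include <stddef.h>
#include <stdio.h>

#define SKETCH_MAX_ARGS 25 /* same limit as MAX_ARGS in _premain.c */
#define SKETCH_ARG_MAX  64 /* hypothetical stand-in for PROC_ARG_MAX */

/* Hypothetical stand-in for proc_argv(): copy at most max_args strings
 * from src into caller-provided buffers of SKETCH_ARG_MAX bytes each,
 * truncating strings that are too long, and store the number of
 * arguments actually copied in *out_len. */
static void sketch_fill_args(const char *const *src, size_t src_len,
                             size_t *out_len, char **bufs, size_t max_args) {
    size_t n = src_len < max_args ? src_len : max_args;
    for (size_t i = 0; i < n; i++) {
        /* snprintf always NUL-terminates within the given size */
        snprintf(bufs[i], SKETCH_ARG_MAX, "%s", src[i]);
    }
    *out_len = n;
}
```

With room for only three buffers and four incoming arguments, the fourth one is simply lost in this sketch. That's
the kind of situation the fixed `MAX_ARGS` limit can run into.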

After `main()` returns, `_premain()` gracefully exits the application by calling `proc_kill()`.