= Intro to MOP2 programming
Kamil Kowalczyk
2025-10-14
:jbake-type: post
:jbake-tags: MOP2 osdev
:jbake-status: published

This is an introductory post on MOP2 (my-os-project2) user application programming.

All source code (kernel, userspace and other files) is available at https://git.kamkow1lair.pl/kamkow1/my-os-project2.

Let's start by doing the most basic thing ever: quitting an application.

== AMD64 assembly

.Hello program in AMD64 assembly
[source,asm]
----
.section .text

.global _start
_start: // our application's entry point
    movq $17, %rax // select proc_kill() syscall
    movq $-1, %rdi // -1 means "self", so we don't need to call proc_getpid()
    int $0x80 // perform the syscall
    // We are dead!!
----

As you can see, even though we're on AMD64, we use `int $0x80` to perform a syscall.

The technically correct and better way would be to implement support for `syscall/sysret`, but `int $0x80` is
just easier to get going and requires way less setup. Maybe in the future the ABI will move towards
`syscall/sysret`.

`int $0x80` is not ideal, because it's a software interrupt and these come with a lot of interrupt overhead.
Intel had tried to solve this before with `sysenter/sysexit`, but they've fallen out of fashion due to complexity.

For the purposes of a silly hobby OS project, `int $0x80` is completely fine. We don't need to have the world's best
performance (yet ;) ).

=== "Hello world" and the `debugprint()` syscall

Now that we have our first application, which can quit at a blazingly fast speed, let's try to print something.
For now, we're not going to discuss IPC and pipes, because that's a little complex.

The `debugprint()` syscall came about as the first syscall ever (it even has an ID of 1) and it was used for
printing way before pipes were added to the kernel. It's still useful for debugging purposes, when we want to
literally just print a string and not go through the entire pipeline of printf-style formatting and only then
writing something to a pipe.

.Usage of `debugprint()` in AMD64 assembly
[source,asm]
----
.section .data

STRING:
    .string "Hello world!!!"
STRING_LEN:
    .quad . - STRING

.section .text

.global _start
_start:
    movq $1, %rax // select debugprint()
    lea STRING(%rip), %rdi // load the address of STRING
    lea STRING_LEN(%rip), %rsi // load the address of STRING_LEN
    int $0x80

    // quit
    movq $17, %rax
    movq $-1, %rdi
    int $0x80
----

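A note on the data section in the listing above: `. - STRING` measures everything emitted since the `STRING` label,
so the quad stored at `STRING_LEN` also counts the terminating NUL that `.string` appends. Also, `lea` computes an
address and never reads memory. The following is only a host-side C sketch of those two facts, not MOP2 code. The
helper names (`string_len_address()`, `string_len_value()`, `string_chars()`) are made up for illustration, and
which of the two forms `debugprint()` expects in `%rsi` is up to the kernel's syscall ABI, which this sketch does
not cover.

[source,c]
----
#include <assert.h> /* static_assert */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mirrors `STRING: .string "Hello world!!!"` (NUL-terminated). */
static const char STRING[] = "Hello world!!!";

/* Mirrors `STRING_LEN: .quad . - STRING`: bytes emitted since STRING. */
static const uint64_t STRING_LEN = sizeof STRING;

/* 14 visible characters plus the NUL written by .string. */
static_assert(sizeof STRING == 15, "size includes the terminator");

/* What `lea STRING_LEN(%rip), %rsi` produces: the address of the quad. */
const uint64_t *string_len_address(void) {
    return &STRING_LEN;
}

/* What `movq STRING_LEN(%rip), %rsi` would produce: the stored value. */
uint64_t string_len_value(void) {
    return STRING_LEN;
}

/* Number of characters without the terminator. */
size_t string_chars(void) {
    return strlen(STRING);
}
----
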
Why are we using `lea` to load stuff? Why not `movq`? Because we can't...

We can't just `movq`, because the kernel doesn't support relocatable code - everything is loaded at a fixed
address in a process's address space. Because of this, we have to address everything relative to `%rip`
(the instruction pointer). We're essentially writing position-independent code (PIC) by hand. This is what
the `-fPIC` GCC flag does, BTW.

== Getting into C and some bits of `ulib`

Now that we've gone over how to write some (very) basic programs in assembly, let's try to untangle how we get
into C code and understand some portions of `ulib` - the userspace programming library.

This code snippet should be understandable by now:

._start.S
[source,asm]
----
.extern _premain

.global _start
_start:
    call _premain
----

Here `_premain()` is a C startup function that gets executed before running `main()`. `_premain()` is also
responsible for quitting the application.

._premain.c
[source,c]
----
// Headers skipped.

extern void main(void);
extern uint8_t _bss_start[]; // both symbols come from the linker script
extern uint8_t _bss_end[];

// Zero every byte between _bss_start and _bss_end.
void clearbss(void) {
    uint8_t *p = _bss_start;
    while (p < _bss_end) {
        *p++ = 0;
    }
}

#define MAX_ARGS 25
static char *_args[MAX_ARGS];

size_t _argslen;

char **args(void) {
    return (char **)_args;
}

size_t argslen(void) {
    return _argslen;
}

// ulib initialization goes here
void _premain(void) {
    clearbss();

    // Preallocate one buffer per possible argument.
    for (size_t i = 0; i < ARRLEN(_args); i++) {
        _args[i] = umalloc(PROC_ARG_MAX);
    }

    // Have the kernel fill the preallocated buffers.
    proc_argv(-1, &_argslen, _args, MAX_ARGS);

    main();
    proc_kill(proc_getpid());
}
----

First, in order to load our C application without UB from the get-go, we need to clear the `BSS` section of an
ELF file (which MOP2 uses as its executable format). We use the `_bss_start` and `_bss_end` symbols for that, which
come from a linker script defined for user apps:

.link.ld - linker script for user apps
[source]
----
ENTRY(_start)

SECTIONS {
    . = 0x400000;

    .text ALIGN(4K):
    {
        *(.text .text*)
    }

    .rodata (READONLY): ALIGN(4K)
    {
        *(.rodata .rodata*)
    }

    .data ALIGN(4K):
    {
        *(.data .data*)
    }

    .bss ALIGN(4K):
    {
        _bss_start = .;
        *(.bss .bss*)
        . = ALIGN(4K);
        _bss_end = .;
    }
}
----

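To make the `. = ALIGN(4K);` line and the clearing loop a bit more concrete, here is a small host-side C sketch.
It is not part of `ulib`: `align_up()`, `clear_range()`, `demo_clear()`, `PAGE_4K` and the `fake_bss` buffer are
names invented for this example, and the buffer only stands in for the region between `_bss_start` and `_bss_end`.
`align_up()` shows the rounding that `ALIGN(4K)` performs on the location counter, which is why `_bss_end` lands
on a page boundary and the loop zeroes whole pages.

[source,c]
----
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_4K 4096u /* the 4K used by ALIGN(4K) in link.ld */

/* Round addr up to the next multiple of align (align is a power of two). */
uint64_t align_up(uint64_t addr, uint64_t align) {
    return (addr + align - 1) & ~(align - 1);
}

/* Same loop as clearbss(), but over an explicit [start, end) range. */
void clear_range(uint8_t *start, uint8_t *end) {
    for (uint8_t *p = start; p < end; p++) {
        *p = 0;
    }
}

/* Hypothetical stand-in for the .bss region of a loaded program. */
static uint8_t fake_bss[2 * PAGE_4K];

/* Dirty the buffer, clear it, and return how many bytes are still nonzero. */
size_t demo_clear(void) {
    assert(sizeof fake_bss % PAGE_4K == 0);

    for (size_t i = 0; i < sizeof fake_bss; i++) {
        fake_bss[i] = 0xAA;
    }
    clear_range(fake_bss, fake_bss + sizeof fake_bss);

    size_t nonzero = 0;
    for (size_t i = 0; i < sizeof fake_bss; i++) {
        if (fake_bss[i] != 0) {
            nonzero++;
        }
    }
    return nonzero;
}
----
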
After that, we need to collect our application's command-line arguments (like argc and argv in UNIX-derived
systems). To do that we use the `proc_argv()` syscall, which fills out a preallocated memory buffer with them. The main
limitation of this approach is that the caller must ensure that enough space within the buffer was allocated.
25 arguments is enough for pretty much all applications on this system, but this is something that may be a little
problematic in the future.

After we've exited from `main()`, we just gracefully exit the application.

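The fixed-size argument buffers described above can be modelled on the host to see where the limit bites. This is
a rough sketch under stated assumptions, not the real syscall: `model_fill_args()` and `demo_store()` are
hypothetical helpers, `ARG_BUF_SIZE` is a made-up stand-in for `PROC_ARG_MAX`, and dropping or truncating excess
input is just one possible policy (how `proc_argv()` itself treats arguments that don't fit is not covered here).

[source,c]
----
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ARGS 25     /* same limit as in _premain.c */
#define ARG_BUF_SIZE 32 /* hypothetical; stands in for PROC_ARG_MAX */

/* Preallocated storage, one fixed-size buffer per possible argument. */
static char arg_storage[MAX_ARGS][ARG_BUF_SIZE];

/*
 * Copy up to `max` arguments from `src` into the fixed buffers.
 * Arguments beyond `max` are dropped and long ones are truncated.
 * Returns the number of arguments actually stored.
 */
size_t model_fill_args(char dst[][ARG_BUF_SIZE], size_t max,
                       const char *const *src, size_t nsrc) {
    size_t n = nsrc < max ? nsrc : max;
    for (size_t i = 0; i < n; i++) {
        strncpy(dst[i], src[i], ARG_BUF_SIZE - 1);
        dst[i][ARG_BUF_SIZE - 1] = '\0';
    }
    return n;
}

/* Push `count` dummy arguments through the model; report how many fit. */
size_t demo_store(size_t count) {
    const char *src[64];
    if (count > 64) {
        count = 64;
    }
    for (size_t i = 0; i < count; i++) {
        src[i] = "arg";
    }
    return model_fill_args(arg_storage, MAX_ARGS, src, count);
}
----
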