Asset packing with zip.c
One of the technical decisions I had to make when writing this website was how to distribute the assets (images, GPP, page templates, etc.). I decided to pack (or bake) the assets into the main binary, so that I don't have to worry about updating an asset folder alongside the application.
One flaw of this approach is that as the website gets bigger, the binary will too. Fortunately I don't have to worry about this too much right now, since I've reimplemented asset packing with the excellent zip library by `kuba--` (Polska GUROM!!!111!!). In this article I'd like to demonstrate how I've minimised the size of our assets using `zip` and `incbin`.
References
- zip: https://raw.githubusercontent.com/kuba--/zip/refs/heads/master/src/zip.c
- incbin: https://github.com/graphitemaster/incbin
- zip format: https://en.wikipedia.org/wiki/ZIP_(file_format)
- the source code: https://git.kamkow1lair.pl/kamkow1/aboba
Generating the bundle - `bundle.zip`
To compress our assets, first we need to generate a bundle file. I've decided to call it bundle.zip - simple and descriptive.
Generating the bundle requires some changes to our build program. See https://git.kamkow1lair.pl/kamkow1/aboba/src/branch/master/build.c for reference.
```
#define BUNDLED_FILES \
    "./gpp1", \
    "./tmpls/home.html", \
    "./tmpls/page-missing.html", \
    "./tmpls/template-blog.html", \
    "./tmpls/blog.html", \
    "./etc/hotreload.js", \
    "./etc/theme.js", \
    "./etc/simple.css", \
    "./etc/highlight.js", \
    "./etc/hljs-rainbow.css", \
    "./etc/marked.js", \
    "./etc/favicon.ico", \
    "./etc/me.jpg", \
    "./etc/tmoa-engine.jpg", \
    "./etc/tmoa-garbage.jpg", \
    "./blog/blog-welcome.md", \
    "./blog/blog-weird-page.md", \
    "./blog/blog-curious-case-of-gebs.md", \
    "./blog/blog-the-making-of-aboba.md", \
    "./blog/blog-asset-packing-with-zip.c.md"

const char *bundle_zip_deps[] = { BUNDLED_FILES };
RULE_ARRAY("./bundle.zip", bundle_zip_deps) {
    RULE("./gpp1", "./gpp/gpp.c") {
        CMD("cc", "-DHAVE_STRDUP", "-DHAVE_FNMATCH_H", "-o", "gpp1", "gpp/gpp.c");
    }
    struct zip_t *zip = zip_open("./bundle.zip", BUNDLE_ZIP_COMPRESSION, 'w');
    defer { zip_close(zip); }
    for (size_t i = 0; i < sizeof(bundle_zip_deps)/sizeof(bundle_zip_deps[0]); i++) {
        char *copy = strdup(bundle_zip_deps[i]);
        defer { free(copy); }
        char *name = basename(copy);
        String_Builder sb = {0};
        defer { sb_free(&sb); }
        sb_read_file(&sb, bundle_zip_deps[i]);
        zip_entry_open(zip, name);
        zip_entry_write(zip, sb.items, sb.count);
        zip_entry_close(zip);
    }
    LOGI("Generated bundle.zip\n");
}

RULE("./aboba",
     "./main.c",
     "./routes.c",
     "./routes.h",
     "./baked.c",
     "./baked.h",
     "./commit.h",
     "./timer.c",
     "./timer.h",
     "./CONFIG.h",
     "./locked.h",
     "./mongoose.o",
     "./bundle.zip",
     BUNDLED_FILES
) {
    // build mongoose.o - skipped
    // Generate commit.h - skipped

    #define CC "cc"
    #define TARGET "-o", "aboba"
    #if MY_DEBUG
    #define CFLAGS "-fsanitize=address", "-fPIC", "-ggdb"
    #define DEFINES "-DMY_DEBUG=1", "-D_GNU_SOURCE", "-DGEBS_NO_PREFIX", "-DINCBIN_PREFIX=", "-DINCBIN_STYLE=INCBIN_STYLE_SNAKE", \
                    "-DGEBS_ENABLE_PTHREAD_FEATURES"
    #define EXTRA_SOURCES "./cJSON/cJSON.c", "./zip/src/zip.c", "./md5-c/md5.c"
    #else
    #define CFLAGS "-fPIC"
    #define DEFINES "-DMY_DEBUG=0", "-D_GNU_SOURCE", "-DGEBS_NO_PREFIX", "-DINCBIN_PREFIX=", "-DINCBIN_STYLE=INCBIN_STYLE_SNAKE", \
                    "-DGEBS_ENABLE_PTHREAD_FEATURES"
    #define EXTRA_SOURCES "./cJSON/cJSON.c", "./zip/src/zip.c"
    #endif
    #define SOURCES "./main.c", "./routes.c", "./baked.c", "./timer.c"
    #define OBJECTS "./mongoose.o"
    #define LINK_FLAGS "-Wl,-z,execstack", "-lpthread"
    #define INC_FLAGS "-I.", "-I./zip/src"

    CMD(CC, TARGET, CFLAGS, DEFINES, INC_FLAGS, SOURCES, OBJECTS, EXTRA_SOURCES, LINK_FLAGS);

    // generate compile_flags.txt - skipped
    // #undef macros - skipped
}
```
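A side note on `BUNDLED_FILES`: the same macro is expanded twice above, once as an array initializer (`bundle_zip_deps`) and once as trailing arguments to `RULE("./aboba", ...)`. Here is a minimal standalone sketch of that trick. `DEMO_FILES`, `demo_deps_count()` and `count_paths()` are hypothetical names made up for this example, and the `NULL` sentinel is only an assumption of the sketch; how gebs' `RULE()` actually consumes its arguments isn't shown here.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical, shortened stand-in for BUNDLED_FILES. */
#define DEMO_FILES \
    "./tmpls/home.html", \
    "./etc/simple.css", \
    "./blog/blog-welcome.md"

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Use 1: the macro expands into an array initializer. */
static const char *demo_deps[] = { DEMO_FILES };

/* Number of entries the macro expanded to. */
size_t demo_deps_count(void)
{
    return ARRAY_LEN(demo_deps);
}

/* Use 2: the same macro expands into trailing function arguments.
 * This sketch terminates the list with a NULL sentinel (an assumption,
 * not necessarily what the real build library does). */
size_t count_paths(const char *first, ...)
{
    assert(first != NULL);
    size_t n = 0;
    va_list ap;
    va_start(ap, first);
    for (const char *p = first; p != NULL; p = va_arg(ap, const char *)) {
        n++;
    }
    va_end(ap);
    return n;
}
```

With these definitions, `demo_deps_count()` is 3 and `count_paths("./main.c", DEMO_FILES, (const char *)NULL)` is 4, so adding a file to the macro updates both places at once.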
If you go through the commit history, you'll see that apart from just generating the bundle file, I've also cleaned up the build commands a bit with `#define`s. Let's take a closer look at the bundle generation code.
```
const char *bundle_zip_deps[] = { BUNDLED_FILES };
RULE_ARRAY("./bundle.zip", bundle_zip_deps) {
    RULE("./gpp1", "./gpp/gpp.c") {
        CMD("cc", "-DHAVE_STRDUP", "-DHAVE_FNMATCH_H", "-o", "gpp1", "gpp/gpp.c");
    }
    struct zip_t *zip = zip_open("./bundle.zip", BUNDLE_ZIP_COMPRESSION, 'w');
    defer { zip_close(zip); }
    for (size_t i = 0; i < sizeof(bundle_zip_deps)/sizeof(bundle_zip_deps[0]); i++) {
        char *copy = strdup(bundle_zip_deps[i]);
        defer { free(copy); }
        char *name = basename(copy);
        String_Builder sb = {0};
        defer { sb_free(&sb); }
        sb_read_file(&sb, bundle_zip_deps[i]);
        zip_entry_open(zip, name);
        zip_entry_write(zip, sb.items, sb.count);
        zip_entry_close(zip);
    }
    LOGI("Generated bundle.zip\n");
}
```
We declare a dependency (using the `RULE*()` macros), which says that `./bundle.zip` depends on the files listed in `BUNDLED_FILES`. To generate the bundle we use `zip_open()` and `zip_close()`. `zip_open()` takes a so-called "compression level". The zip library only provides `ZIP_DEFAULT_COMPRESSION_LEVEL`, a macro that evaluates to 6. I wasn't satisfied with that, so I looked at `miniz.h` (the library that zip is built on) and found that this default matches miniz's `MZ_DEFAULT_LEVEL`, which is also 6, while the highest level is `MZ_UBER_COMPRESSION`, which is 10. Using that value gives us the most size-efficient compression available.
I've decided to `#define` the compression level to avoid arbitrary magic numbers, and so that the same definition can be used both in the application and in the build program. Here's what `CONFIG.h` looks like now:
```
#ifndef CONFIG_H_
#define CONFIG_H_

#if MY_DEBUG
#   define CONFIG_LISTEN_URL "http://localhost:8080"
#else
#   define CONFIG_LISTEN_URL "http://localhost:5000"
#endif

#define BUNDLE_ZIP_COMPRESSION 10

#endif // CONFIG_H_
```
The only "downside" here is that, since we're compressing so hard, it takes more time to generate the bundle. I've put "downside" in quotes on purpose, because it doesn't really apply in our case: the files we're packing are already quite small and there aren't many of them. We're just doing this to squeeze out some extra space savings.
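One more detail of the bundle generation loop is worth spelling out: each file is stored under its `basename()`, so `./tmpls/home.html` ends up in the bundle as plain `home.html`. The sketch below mirrors the `strdup()` + `basename()` pattern from the build code in a standalone helper. `entry_name_for()` is a hypothetical name used only for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <libgen.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Copies the final path component of `path` into `out`.
 * POSIX basename() is allowed to modify its argument, which is why
 * we hand it a temporary copy rather than the original string. */
void entry_name_for(const char *path, char *out, size_t out_size)
{
    assert(path != NULL && out != NULL && out_size > 0);
    char *copy = strdup(path);
    assert(copy != NULL);
    snprintf(out, out_size, "%s", basename(copy));
    free(copy);
}
```

For example, `entry_name_for("./tmpls/home.html", buf, sizeof buf)` leaves `home.html` in `buf`. A consequence of this naming scheme is that two files with the same name in different directories would end up under the same entry name, so file names in the bundle need to stay unique.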
Previously we were baking-in the assets like so:
from 447362c74d/baked.c
```
INCBIN(gpp1, "./gpp1");

INCBIN(home_html, "./tmpls/home.html");
INCBIN(page_missing_html, "./tmpls/page-missing.html");
INCBIN(template_blog_html, "./tmpls/template-blog.html");
INCBIN(blog_html, "./tmpls/blog.html");

INCBIN(simple_css, "./etc/simple.css");
INCBIN(favicon_ico, "./etc/favicon.ico");
#if MY_DEBUG
INCBIN(hotreload_js, "./etc/hotreload.js");
#endif
INCBIN(theme_js, "./etc/theme.js");
INCBIN(highlight_js, "./etc/highlight.js");
INCBIN(hljs_rainbow_css, "./etc/hljs-rainbow.css");
INCBIN(marked_js, "./etc/marked.js");
INCBIN(me_jpg, "./etc/me.jpg");
INCBIN(tmoa_engine_jpg, "./etc/tmoa-engine.jpg");
INCBIN(tmoa_garbage_jpg, "./etc/tmoa-garbage.jpg");

INCBIN(blog_welcome_md, "./blog/welcome.md");
INCBIN(blog_weird_page_md, "./blog/weird-page.md");
INCBIN(blog_curious_case_of_gebs_md, "./blog/curious-case-of-gebs.md");
INCBIN(blog_the_making_of_aboba_md, "./blog/the-making-of-aboba.md");
```
Now that we have our `bundle.zip`, we do it like this:
```
INCBIN(bundle_zip, "./bundle.zip");
```
And there we go, we have our bundle!
I've also had to slightly change the way we add the assets to the resource hash table:
```
void add_baked_resource(char *key, const uchar *data, size_t size)
{
    int fd = memfd_create(key, 0);
    if (fd < 0) {
        LOGE("Could not create resource %s. Aborting...\n", key);
        abort();
    }
    write(fd, data, size);
    shput(baked_resources.value, key, ((Baked_Resource_Value){ .memfd = fd, .bufptr = (void *)data }));
}

void init_baked_resources(void)
{
    struct zip_t *zip = zip_stream_open(bundle_zip_data, bundle_zip_size, BUNDLE_ZIP_COMPRESSION, 'r');
    size_t n = zip_entries_total(zip);
    for (size_t i = 0; i < n; i++) {
        zip_entry_openbyindex(zip, i);

        const char *name = strdup(zip_entry_name(zip));
        size_t size = zip_entry_size(zip);
        char *buf = malloc(size);
        zip_entry_noallocread(zip, buf, size);
        add_baked_resource((char *)name, buf, size);

        zip_entry_close(zip);
    }
    zip_stream_close(zip);
}
```
`add_baked_resource()` hasn't changed here, but `init_baked_resources()` has. Here we use one of zip's abilities, which is unpacking an in-memory .zip file. We iterate over the entries in the bundle, get each entry's name and size, allocate a buffer of that size and read the entry into it. We can then add the resource to the hash table as we did previously.
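For readers who haven't used `memfd_create()`, which `add_baked_resource()` relies on: it creates an anonymous file that lives only in memory and is accessed through a regular file descriptor. Below is a minimal, Linux-only sketch of writing a buffer into such a file and reading it back. `make_memfd()` and `read_from_start()` are hypothetical helper names for this example and are not part of the site's code.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* needed for memfd_create() in glibc */
#endif
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Creates an in-memory file named `name` and writes `size` bytes of
 * `data` into it. Returns the file descriptor, or -1 on failure. */
int make_memfd(const char *name, const void *data, size_t size)
{
    assert(name != NULL);
    int fd = memfd_create(name, 0);
    if (fd < 0) return -1;

    const char *p = data;
    size_t left = size;
    while (left > 0) { /* write() may write fewer bytes than requested */
        ssize_t n = write(fd, p, left);
        if (n <= 0) {
            close(fd);
            return -1;
        }
        p += n;
        left -= (size_t)n;
    }
    return fd;
}

/* After writing, the file offset sits at the end of the file,
 * so rewind before reading the contents back. */
ssize_t read_from_start(int fd, void *buf, size_t size)
{
    if (lseek(fd, 0, SEEK_SET) < 0) return -1;
    return read(fd, buf, size);
}
```

The name passed to `memfd_create()` is only a label for debugging (it shows up under `/proc/self/fd`), so using the resource key for it, as the code above does, is purely a convenience.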
Conclusion
One question you may ask is: how much space are we saving? I don't remember the exact sizes of the binary, but before compression it was `~1.2M` and now, with compression implemented, it's `1.6M`. How is that an improvement if the binary gained weight? That `.4M` is likely due to zip format overhead: metadata, file and directory headers and so on. What we gain is that the binary hasn't changed much in size despite adding more files, for example the article that I'm writing right now. It has stayed at `~1.6M` so far.
So far the binary isn't even growing by single bytes, which can be checked with `stat --printf "%s" ./aboba`. What we're gaining here is a slower size increase. Previously, if I wanted to add a 30K JPEG, the binary would literally go up by 30K.
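If you'd rather track the binary size from a program than from the shell, the `stat --printf "%s" ./aboba` check can be approximated in C with `stat()`. This is only a small sketch; `file_size_bytes()` is a hypothetical helper name and isn't part of the site's code.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Returns the size of the file at `path` in bytes, or -1 if it
 * cannot be stat'ed (for example, because it does not exist). */
long long file_size_bytes(const char *path)
{
    assert(path != NULL);
    struct stat st;
    if (stat(path, &st) != 0) return -1;
    return (long long)st.st_size;
}
```

Calling `file_size_bytes("./aboba")` before and after adding an asset and subtracting the two results gives the exact growth in bytes.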