Freestanding/Bare-Metal Stdlib: Supporting OS Development and Accelerator Targets

Problem

Mojo’s stdlib currently assumes a full hosted POSIX environment with libc, making it unusable for:

  • OS/kernel development
  • Embedded systems
  • Direct GPU/accelerator programming
  • Any bare-metal environment
  • Non-POSIX targets such as NT (Windows) platforms

Even basic operations like string handling pull in dependencies on dup, fdopen, fclose, and other POSIX functions that don’t exist in freestanding environments, as well as on Mojo’s dynamically linked runtime libraries.
Additionally, manual control over memory is challenging: even the UnsafePointer model is rather restrictive.

Discussion Points

  • What other use cases need this beyond OS development?
  • Should freestanding be a compile flag or separate modules?
  • How should memory allocation work without a heap?
  • Which stdlib components are most critical for bare-metal?

Anyone else hitting similar issues? Would love to hear about other use cases and ideas for how to tackle this.

I asked @lesoup-mxd to move some Discord discussion over here.

I think their stabs at kernel development, as well as potential user confusion about “Why can’t I open a file in a MAX kernel?” or similar issues, are sufficiently motivating for a discussion about how Mojo is going to classify different targets. The clearest example of this differentiation is MAX accelerators.

If we’re on the host CPU, you can do whatever you want, even if it’s a bit ill-advised to go open a TCP socket in the middle of a matmul.

If you’re on an Nvidia DC GPU with an Nvidia NIC, you have RDMA, file access via GDS, NCCL, NVSHMEM, printf and even Ethernet networking. That’s a pretty substantial set of IO capabilities.

Take away the Nvidia NIC and the DC GPU loses a lot of capabilities.

Consumer Nvidia GPUs get a restricted form of GDS and printf, and most other things are locked down.

NPUs like the Qualcomm Hexagon NPU are essentially general purpose processors with big vector units. They don’t really have access to the OS but can run more or less any freestanding C code you want.

Then we have fixed function hardware, the “Hand me 2 matrices and I’ll multiply them” NPUs. These are unlikely to ever run MAX, but might provide a useful target for offloading particular parts of the stdlib, as would various cryptographic accelerators.

Rust has the idea of separate libraries:

  • core: Everything that should be able to function on a Turing machine. This is where core language concepts like “What is a u8?” live.
  • alloc: Stuff that needs a global, general purpose allocator to be present.
  • std: Things that could be reasonably assumed to require an operating system of some sort.
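
To make that split concrete, here is a rough sketch (not taken from any real crate; `checksum` and `hex_dump` are names made up for this example). The first function touches only core, so it would build on a target with no allocator at all. The second needs alloc because it returns a heap-backed String.

```rust
// Sketch only: `checksum` and `hex_dump` are hypothetical names.
// In a real freestanding crate this file would start with `#![no_std]`;
// it is left out here so the snippet also builds on a hosted toolchain.
extern crate alloc;

use alloc::string::String;
use core::fmt::Write;

/// Needs only `core`: no heap, no OS.
pub fn checksum(bytes: &[u8]) -> u8 {
    bytes.iter().fold(0u8, |acc, b| acc.wrapping_add(*b))
}

/// Needs `alloc`: the returned `String` lives on the heap.
pub fn hex_dump(bytes: &[u8]) -> String {
    let mut out = String::with_capacity(bytes.len() * 2);
    for b in bytes {
        // Writing into a `String` cannot fail, so the result is ignored.
        let _ = write!(out, "{:02x}", b);
    }
    out
}
```

Neither function needs std, which is where anything that assumes files, sockets, or threads would live.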

However, this approach has caused issues for Rust and there was discussion around making things more capability based. This is partially to deal with OS differences, and partially to make Rust extend into embedded better. As many of you know, I am strongly in favor of capability-based abstractions, but I want to have some discussion around how others think this should work.

cc @joe

If we had custom allocator injection, that issue would be solved. But I’m not sure how we would approach it, given that the parameter would have to be propagated upwards through all types that might choose the allocator they use. This would also mean fixing their use in function signatures where any sort of [] is used if they have a default allocator (which we most likely would). Maybe a global allocator flag would be best (what would we be sacrificing?).
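
To show what that propagation looks like, here is a small Rust sketch. Every name in it (`Allocator`, `SystemHeap`, `Arena`, `Buffer`, `Packet`) is hypothetical, and it does not use any real allocator API. It only shows how an allocator parameter with a default climbs through every owning type, and how a signature that omits it pins the default.

```rust
// Sketch only: all names below are hypothetical.
pub trait Allocator {
    fn name(&self) -> &'static str;
}

pub struct SystemHeap;
impl Allocator for SystemHeap {
    fn name(&self) -> &'static str { "system" }
}

pub struct Arena;
impl Allocator for Arena {
    fn name(&self) -> &'static str { "arena" }
}

/// The leaf type picks its allocator through a defaulted parameter.
pub struct Buffer<A: Allocator = SystemHeap> {
    alloc: A,
    len: usize,
}

/// `Packet` owns a `Buffer`, so it has to repeat the parameter.
pub struct Packet<A: Allocator = SystemHeap> {
    payload: Buffer<A>,
}

impl<A: Allocator> Packet<A> {
    pub fn new_in(alloc: A, len: usize) -> Self {
        Packet { payload: Buffer { alloc, len } }
    }
    pub fn allocator_name(&self) -> &'static str {
        self.payload.alloc.name()
    }
    pub fn len(&self) -> usize {
        self.payload.len
    }
}

/// Omitting the parameter here means `Packet<SystemHeap>` only.
pub fn describe(p: &Packet) -> &'static str {
    p.allocator_name()
}
```

Calling `describe` with a `Packet<Arena>` would be a type error, so every signature like it would need revisiting. That is the cost a single global allocator setting would avoid.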

As for the capabilities, I think this should somehow be solved when compiling for a target. I’m a bit out of my depth so I might be rambling, but what if we had a way to disable only the functions that use certain capabilities while still being able to use other methods in that type/module? Maybe we could define some compiler flags that can be expressed in Mojo code (magic decorator?) that constrain the function to be compiled only for targets with that specific capability.

I would much rather have a limited set of APIs for stdlib types than having to do what Arduino did and basically reimplement a “standard library for microcontrollers”. Or going the MicroPython route, which had to do even more work by reimplementing a Python interpreter :melting_face:

Custom allocators help a bit, and I agree that’s necessary, but I think that’s a separate discussion.

I would prefer things be enabled when we know they should work, rather than disabling things we don’t think work. Something like Rust’s cfg might be good, but I think I’d prefer a way to disable structs/functions based on the result of a function which returns a boolean (ex: @enable_if(sys.info.is_linux())).
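
For reference, this is roughly how the opt-in style looks with Rust’s cfg. It is a sketch with made-up function names (`open_log`, `clamp_percent`, `has_file_io`), and the list of targets is just an assumption for the example.

```rust
// Sketch only: `open_log`, `clamp_percent`, and `has_file_io` are hypothetical.
// The item exists only on targets where we know it works (opt-in),
// rather than being removed from targets where we suspect it does not.
#[cfg(any(target_os = "linux", target_os = "macos"))]
pub fn open_log(path: &str) -> std::io::Result<std::fs::File> {
    std::fs::File::create(path)
}

/// Available everywhere: pure computation, no OS involved.
pub fn clamp_percent(value: i32) -> i32 {
    value.clamp(0, 100)
}

/// `cfg!` evaluates the same predicate as a compile-time boolean.
pub fn has_file_io() -> bool {
    cfg!(any(target_os = "linux", target_os = "macos"))
}
```

One limitation is that cfg predicates are a fixed key/value vocabulary. That is part of why I’d rather gate on an ordinary function that returns a boolean.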

I think we still want a well-featured stdlib, and it can be the same one we use on more capable platforms; we just need to enable only the things the platform can actually do. A good portion of the stdlib is things like string handling, math, and basic collections, which can be done on a microcontroller without any issues.

Won’t this dovetail nicely with the requires clause at some point? e.g. doing

struct UnsafePointer:
    fn alloc(...) requires sys.info.has_heap_allocator(): ...

fn stack_allocation[...](...) requires sys.info.has_stack_allocator(): ...

That is one way to do it, but you might also want to disable whole structs.

AFAIK structs will also get requires clauses

In that case then we’ll wait for requires.

That depends on whether we get compiler support for these sorts of checks; otherwise it’ll be a lot of if statements before you can call alloc:

if sys.info.has_heap_allocator():
    unsafe_ptr.alloc(...)

If we’re using requires, you can just do that for all of the functions that actually do allocation.
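
As a loose analogy for the compile-time version (not how Mojo’s requires would actually be spelled), a Rust trait bound behaves similarly. All names in this sketch are hypothetical, and a marker trait stands in for the “has a heap allocator” capability.

```rust
// Sketch only: every name here is hypothetical.
// A marker trait stands in for a capability such as "has a heap allocator".
pub trait HasHeap {}

pub struct Hosted; // e.g. a Linux process
pub struct BareMetal; // e.g. a kernel before its allocator is set up

impl HasHeap for Hosted {}
// No `impl HasHeap for BareMetal`: the capability is simply absent.

/// The bound is checked once, at the call site, by the compiler,
/// so callers never write a runtime `if` before allocating.
pub fn alloc_buffer<T: HasHeap>(_target: &T, len: usize) -> Vec<u8> {
    vec![0u8; len]
}

/// Needs no capability, so it is callable for every target.
pub fn stack_sum<T>(_target: &T, values: [u32; 4]) -> u32 {
    values.iter().sum()
}
```

Here `alloc_buffer(&BareMetal, 3)` is rejected at compile time with an unsatisfied trait bound, while `stack_sum(&BareMetal, ...)` compiles fine. Only the functions that really allocate carry the constraint.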

Wouldn’t requires be overly verbose for non-detached targets?
It feels to me like such a solution induces pain on one side or the other.
E.g., either all POSIX-targeting CPU libs will have to overcomplicate their requires signatures,
or all detached code will have to carry a verbose definition that derives from the standard one.
I might not be fully in the know here.
