New `UntypedPointer` for heterogeneous data (idea, proposal)

JulianJS · July 10, 2025, 8:51am

Hi all,
this is a little idea/proposal for some kind of “UntypedPointer” to use with heterogeneous data.

(Disclaimer: I am new to low-level memory management, but sometimes that can bring some fresh ideas, I hope :D)

Intro

What makes Mojo great?

great abstractions over low-level concepts (GPU)
great performance
great ergonomics
great type system
familiar pythonic syntax

When working with heterogeneous data, the current UnsafePointer is a “good” but not “great” abstraction.

I think there is room for improvement to make low-level memory management much more approachable and ergonomic, especially for newcomers.
My impression is that low-level memory management is often avoided, despite interesting use cases, because the API is “hard to use” and not “fun” (yet)!

Mojo has soooo much potential and making high level and low level coding fun and ergonomic is a great way to attract more people to the language and an absolute game changer!

Pointers

Pointer (homogeneous data)

When working with homogeneous data (array, list), we need:

memory location: Where are we?
type: What is the type? The type is the SAME for all elements

Pointer (heterogeneous data)

When working with heterogeneous data (PNG, Apache Arrow, etc.), we need:

memory location: Where are we?

IMO having a type is not ideal, because the type is not the same for all elements.
Instead, we could specify the type when we read from the pointer, which would allow us to use the same pointer for different types.
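
To make the "type at read time" idea concrete, here is a small model in Python (not Mojo), using the standard struct module. The class name UntypedCursor and its read method are hypothetical stand-ins for the proposed API, and the little-endian, unpadded layout is an assumption of the sketch:

```python
import struct

class UntypedCursor:
    """Hypothetical model of the proposed pointer: a buffer plus a byte offset."""

    def __init__(self, data: bytes, offset: int = 0):
        self.data = data
        self.offset = offset

    def read(self, fmt: str, move: bool = False):
        """Read one value; fmt is a struct code such as "b" (Int8) or "h" (Int16)."""
        # "<" selects little-endian with no padding (an assumption of this sketch).
        (value,) = struct.unpack_from("<" + fmt, self.data, self.offset)
        if move:
            # Advance past the value that was just read.
            self.offset += struct.calcsize("<" + fmt)
        return value

# One Int8, one Int16 and one Int32 packed back to back: 1 + 2 + 4 = 7 bytes.
data = struct.pack("<bhi", -5, 300, 70000)
cur = UntypedCursor(data)
a = cur.read("b", move=True)
b = cur.read("h", move=True)
c = cur.read("i", move=True)
```

The cursor itself carries no element type. Each read names the type it wants, and the offset advances by that type's size.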

Example 1 (from `UnsafePointer` documentation)

From the docs:

def read_chunks(var ptr: UnsafePointer[UInt8]) -> List[List[UInt32]]:
    chunks = List[List[UInt32]]()
    # A chunk size of 0 indicates the end of the data
    chunk_size = Int(ptr[])
    while (chunk_size > 0):
        # Skip the 1 byte chunk_size and get a pointer to the first
        # UInt32 in the chunk
        ui32_ptr = (ptr + 1).bitcast[UInt32]()
        chunk = List[UInt32](capacity=chunk_size)
        for i in range(chunk_size):
            chunk.append(ui32_ptr[i])
        chunks.append(chunk)
        # Move our pointer to the next byte after the current chunk
        ptr += (1 + 4 * chunk_size)
        # Read the size of the next chunk
        chunk_size = Int(ptr[])
    return chunks

Problems with this approach:

multiple pointers (each type requires its own pointer)
multiple memory locations
bitcasting required
pointer arithmetic is not very ergonomic
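
For reference, the byte layout that read_chunks walks can be modeled in Python (not Mojo). The helper encode_chunks is hypothetical and only builds test data. Little-endian byte order is an assumption here, since the docs example reads values in the machine's native order:

```python
import struct

def encode_chunks(chunks):
    """Hypothetical helper: pack lists of UInt32 values into the chunk format."""
    out = bytearray()
    for chunk in chunks:
        out += struct.pack("<B", len(chunk))             # 1-byte chunk size
        out += struct.pack("<%dI" % len(chunk), *chunk)  # chunk_size UInt32 values
    out += b"\x00"                                       # a size of 0 ends the data
    return bytes(out)

data = encode_chunks([[1, 2], [3]])
# The first chunk occupies 1 + 4 * 2 = 9 bytes, so the next size byte sits at
# offset 9. This is the same stride as `ptr += (1 + 4 * chunk_size)` above.
next_size_offset = 1 + 4 * 2
```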

The proposed solution would look like this:

def read_chunks2(var ptr: UntypedPointer) -> List[List[UInt32]]:
    chunks = List[List[UInt32]]()

    # The chunk size is stored in a single byte, so read a UInt8 and widen it to Int.
    # move=True advances the pointer past the value that was just read.
    chunk_size = Int(ptr.read[UInt8, move=True]())
    while (chunk_size > 0):
        chunk = List[UInt32](capacity=chunk_size)

        for _ in range(chunk_size):
            chunk.append(ptr.read[UInt32, move=True]())

        chunks.append(chunk)

        # Read the size of the next chunk
        chunk_size = Int(ptr.read[UInt8, move=True]())
    return chunks

Benefits of this approach:

single pointer for all types
single memory location
easier to use, read, and understand
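
The control flow of the proposed version can be checked with a Python model (not Mojo). The inner read function is a hypothetical equivalent of a read with move=True, and little-endian byte order is an assumption of the sketch:

```python
import struct

def read_chunks2(data: bytes):
    """Model of the proposed loop: a single offset, advanced by every read."""
    offset = 0

    def read(fmt):
        # Hypothetical equivalent of a read with move=True: decode, then advance.
        nonlocal offset
        (value,) = struct.unpack_from("<" + fmt, data, offset)
        offset += struct.calcsize("<" + fmt)
        return value

    chunks = []
    chunk_size = read("B")  # 1-byte size, as in the docs example
    while chunk_size > 0:
        chunks.append([read("I") for _ in range(chunk_size)])
        chunk_size = read("B")
    return chunks

# Two chunks ([1, 2] and [3]) followed by the terminating size byte of 0.
sample = struct.pack("<BIIBIB", 2, 1, 2, 1, 3, 0)
result = read_chunks2(sample)
```

Note that the size is read as a single byte. Reading it as a wider integer would consume part of the chunk payload.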

Example 2 (many types)

This is just a contrived example to show the difference between the current UnsafePointer and the proposed UntypedPointer.

p = UnsafePointer(...)

p_int8 = p.bitcast[Int8]()
value_int8 = p_int8[]
p_int8 += 1

p_int16 = p_int8.bitcast[Int16]()
value_int16 = p_int16[]
p_int16 += 1

p_int32 = p_int16.bitcast[Int32]()
value_int32 = p_int32[]
p_int32 += 1

Problem:

multiple pointers (each type requires its own pointer)
multiple memory locations

Proposed solution:

p = UntypedPointer(...)
value_int8 = p.read[Int8, move=True]()
value_int16 = p.read[Int16, move=True]()
value_int32 = p.read[Int32, move=True]()

Benefits:

single pointer for all types
single memory location
easier to use, read, and understand

API Design and Comparison

Reading from a pointer

UnsafePointer:

p_int8 = p.bitcast[Int8]()
_ = p_int8[0]

Problem:

no autocompletion: Indexing data structures usually does not provide autocompletion
no “documentation” (hover text): Indexing data structures usually does not provide “documentation” (hover text)
potentially multiple pointers: Each type requires its own pointer

Proposed:

p.read[Int8]()

Benefits:

easier to use
autocompletion: p.<TAB> shows all available functions including read
documentation: IDE will provide documentation (hover text) for read

“Moving” a pointer (pointer arithmetic)

UnsafePointer:

p_int8 = p.bitcast[Int8]()
p_int8 += 1
p_int16 = p_int8.bitcast[Int16]()
p_int16 += 4

Problem:

no nice abstraction
bitcasting required
no autocompletion or documentation

Proposed:

p.move[Int8](amount=1)
p.move[Int16](amount=4)

Benefits:

easier to use
autocompletion: p.<TAB> shows all available functions including move
documentation: IDE will provide documentation (hover text) for move
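
The arithmetic behind the proposed move can be modeled in Python (not Mojo). The function below is a hypothetical stand-in that assumes move[T](amount) advances the pointer by amount times the size of T, which is what the UnsafePointer version does through bitcasting:

```python
import struct

def move(offset: int, fmt: str, amount: int) -> int:
    """Hypothetical model of p.move[T](amount): advance by amount * size of T."""
    return offset + amount * struct.calcsize("<" + fmt)

offset = 0
offset = move(offset, "b", 1)  # Int8:  1 * 1 byte
offset = move(offset, "h", 4)  # Int16: 4 * 2 bytes
```

Both versions end up 9 bytes past the starting address. The proposed form just keeps that bookkeeping on one pointer.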

Conclusion

Making low-level memory management more “fun” would be so great!
The main benefit of the proposed ideas:

Making low-level memory management “fun” and accessible (just like GPU programming in Mojo)

Would love to hear your thoughts on this idea!

Thanks for reading!
