Mojo-toml v0.3.0 - Native TOML Parser


I’ve released mojo-toml, a native TOML 1.0 parser for Mojo with zero Python dependencies. Currently read-only.

What it does

Parses TOML configuration files into native Mojo structures:

  • Core TOML 1.0 types (strings, integers, floats, booleans, arrays, tables); datetimes and arrays of tables are on the roadmap
  • Nested structures with dotted keys
  • Duplicate key detection
  • Clear error messages with line/column context
  • 96 tests ensuring reliability

Installation

git clone https://github.com/DataBooth/mojo-toml.git
cd mojo-toml
pixi run test-all

Coming soon to the modular-community channel.

Usage

from toml import parse

fn main() raises:
    var config = parse("""
        [database]
        host = "localhost"
        port = 5432
    """)
    
    var db = config["database"].as_table()
    print(db["host"].as_string())  # "localhost"
    print(db["port"].as_int())     # 5432

What’s in v0.3.0

  • Proper dotted key support and duplicate detection
  • Parser improvements with reset() method
  • Reorganised test suite (96 tests across 10 files)
  • Performance benchmarks and documentation

See the CHANGELOG for full details.

Roadmap

Planned features for future releases:

  • TOML writer (serialiser)
  • Arrays of tables ([[array]])
  • Hex/octal/binary integers
  • Native datetime parsing

Acknowledgements

This project is sponsored by DataBooth, building high-performance data and AI services with Mojo.

Feedback and contributions most welcome!

Great foundational work for Mojo, thanks for this ecosystem contribution!

As always, I’ll encourage anyone writing Mojo libraries to add some benchmarks. We’ve had more than a few “accidentally faster than python’s SOTA” libraries.

Hi Owen - happy to add a benchmark, although this is a simple parsing package that should be fast enough for real-world use cases. I'll try to add a comparison with the Python package / the standard-library parser in Python 3.11+.

A more complete response 🙂

Thanks @owenhilyard! Great timing on the benchmark suggestion - v0.5.0 now includes a comprehensive benchmarking system! 🎯

We’re running both mojo-toml and Python’s tomllib (stdlib) on the same test cases with full machine specs:

pixi run benchmark-mojo    # mojo-toml performance
pixi run benchmark-python  # Python baseline

Both generate markdown reports in benchmarks/reports/ with system info (CPU, GPU, RAM, Mojo/Python versions).

Current results (Apple M1 Pro):

  • Real-world config files (pixi.toml): ~2ms parse time
  • Simple configs: 40K+ parses/sec
  • Python’s tomllib is currently 2-10x faster (note that CPython’s tomllib is itself written in Python rather than C, although it leans on the C-implemented re and str machinery)

Not “accidentally faster” yet 😄, but very competitive for a pure Mojo implementation. More importantly, it’s fast enough for the use case - config files parse in milliseconds, which is imperceptible for typical config loading. The focus has been on correctness, completeness, and usability rather than raw speed.

As Mojo’s compiler optimisations mature, I would expect mojo-toml to close the gap with Python or overtake it. Details are in PERFORMANCE.md and the benchmark reports.

I would love to hear whether there are specific performance-sensitive use cases we should optimise for, or any specific performance tweaks you would suggest, as I am still very new to Mojo!

Fantastic! I’m sure that with a little sprinkle of SIMD this could easily catch up (note that Mojo doesn’t do autovectorization, so your scalar code stays scalar).

The main use-case I would have for making this faster is parsing a lot of config files (think kubernetes amounts of config but in TOML). That’s almost pure drag racing, so it’s up to you if you want to go for it.