Mojo-yaml v0.1.0 Lite - Native YAML Parser

mjboothaus · January 14, 2026, 11:12am

From the author (@mjboothaus) of mojo-toml, mojo-ini, and mojo-dotenv comes mojo-yaml, a native YAML Lite parser for Mojo with zero dependencies, covering ~80% of common YAML use cases.

What it does

Parses block-style YAML configuration files into native Mojo structures:

Nested mappings (dicts) and sequences (lists) of any depth
Inline list-mappings: - name: Alice\n age: 30
All scalar types: int, float, bool, null, string (quoted/unquoted)
Comments anywhere with #
Type-safe value access with .get() and .get_at()
Clear error messages with line/column context
91 tests ensuring reliability (100% passing)

Installation

git clone https://github.com/DataBooth/mojo-yaml.git
cd mojo-yaml
pixi run test-all
pixi run example-all  # See working examples

Coming soon to the modular-community channel.

Usage

Basic Parsing:

from yaml import parse

fn main() raises:
    var config = parse("""
server:
  host: localhost
  port: 8080
  debug: true
users:
  - name: Alice
    role: admin
  - name: Bob
    role: user
""")
    
    # Type-safe access
    var server = config.get("server")
    print(server.get("host").as_string())   # localhost
    print(server.get("port").as_int())      # 8080
    print(server.get("debug").as_bool())    # True
    
    # Navigate sequences
    var users = config.get("users")
    var first_user = users.get_at(0)
    print(first_user.get("name").as_string())  # Alice

File I/O:

from yaml import parse
from pathlib import Path

fn main() raises:
    var content = Path("config.yaml").read_text()
    var config = parse(content)
    # Work with parsed data

What’s in v0.1.0 Lite

Core Parser: Lexer (518 LOC) + Parser (300 LOC) + YamlValue (318 LOC)
Coverage: ~80% of common YAML patterns (block-style only)
Tests: 91/91 passing across 15 test suites (100%)
Examples: 4 working code examples with real-world fixtures
Documentation: Comprehensive README, CHANGELOG, and COMPATIBILITY.md

Real-World Testing

Works: .pre-commit-config.yaml, custom configs
Requires quoting: Multi-word strings, version numbers
Not supported: Flow-style [...], empty values, anchors/aliases

See COMPATIBILITY.md for detailed feature matrix.

Compatibility Tips

Do:

version: "1.0.0"           # Quote version numbers
description: "My app"      # Quote multi-word strings  
host: localhost            # Single words OK unquoted
list:
  - item1                  # Use block style
  - item2

Don’t:

version: 1.0.0             # ❌ Multiple dots fail
description: My app        # ❌ Spaces in unquoted strings
list: [item1, item2]       # ❌ Flow style not supported
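
The entries in the "Don't" list are reported to fail. Below is a minimal sketch of handling that in calling code. It assumes (the post does not confirm this) that `parse` raises an error for unsupported input such as an unquoted version number, and it uses only the `parse` import shown under Usage:

```mojo
from yaml import parse

fn main():
    # Unquoted version number: listed above as failing ("Multiple dots fail").
    # Assumption: the failure surfaces as a raised error, not a silent fallback.
    try:
        _ = parse("version: 1.0.0")
        print("parsed without error")
    except e:
        print("parse failed:", e)
```

If the error text carries the line/column context mentioned in the feature list, it should appear in the printed message.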

Feature Comparison

Feature	Support	Notes
Nested Mappings	Full	Any depth
Nested Sequences	Full	Any depth
Inline List-Mapping	Full	`- name: value` with continuation
Scalars	Full	int, float, bool, null, string
Comments	Full	`# anywhere`
Quoted Strings	Full	`"text"`, `'text'`
Unquoted Strings	Single word	Must quote multi-word
Version Numbers	Must quote	`"1.0.0"` not `1.0.0`
Flow Style	Not supported	`[1, 2]`, `{a: b}`
Empty Values	Not supported	`key:` → use `key: null`
Anchors/Aliases	Not supported	`&anchor`, `*ref`
Multi-Document	Not supported	`---`
Writing YAML	Not implemented	Reader-only v0.1.0
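
Putting the table's workarounds together, a config restricted to the supported subset could be parsed as sketched below. The keys `proxy` and `tags` are made up for illustration, and only the `parse`, `.get()`, `.get_at()` and `.as_string()` calls from the Usage section are used:

```mojo
from yaml import parse

fn main() raises:
    # Every value stays inside the supported subset:
    # quoted version, quoted multi-word string, explicit null, block-style list.
    var config = parse("""
version: "1.0.0"
description: "My app"
proxy: null
tags:
  - web
  - api
""")

    # Quoted version number is read back as a string
    var version = config.get("version")
    print(version.as_string())

    # Block-style sequence accessed by index
    var tags = config.get("tags")
    var second = tags.get_at(1)
    print(second.as_string())
```

Writing `proxy: null` instead of a bare `proxy:` follows the "Empty Values" row above.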

Why “Lite”?

Full YAML 1.2 is complex (~84-page spec). YAML Lite focuses on the ~80% use case:

Configuration files (pre-commit, custom configs)
Data serialization (block-style only)
Nested structures (any depth)
Not included: advanced features (anchors, flow-style, multi-doc)

This provides immediate value while keeping implementation maintainable.

Roadmap

Possible enhancements for v0.2.0+:

Support version number patterns (multiple decimal points)
Handle empty values gracefully
Flow-style array support [1, 2, 3]
YAML writer functionality

Not planned:

Anchors/aliases (complex, rarely needed)
Multi-document streams (niche use case)
Literal/folded blocks (marginal utility)

Related Projects

mojo-toml - TOML 1.0 parser/writer for modern configs
mojo-ini - INI/ConfigParser for Mojo
mojo-dotenv - Environment variable management

Together these provide comprehensive configuration file support for Mojo!

Acknowledgements

Open source project with initial development sponsored by DataBooth, building high-performance data and AI services with Mojo.

Feedback and contributions welcome!

melodyogonna · January 14, 2026, 12:09pm

You’re on fire

HammadHAB · January 14, 2026, 1:56pm

Awesome! Maybe XML Parser next?

mjboothaus · January 15, 2026, 2:13am

Hmmm… apparently significantly more effort than toml… maybe my parsing days are done

clattner · January 16, 2026, 12:40am

Very cool! Out of curiosity, have you looked at performance at all?

mjboothaus · January 16, 2026, 9:24am

There is now some initial (not comprehensive) benchmarking, and I have included the outputs in the README.md.

As a rough, high-level indication, the benchmarks show mojo-yaml running 9-21x faster than Python’s pyyaml.

Caveat: this is just a “lite” implementation in Mojo for now. As noted (also in the README), more intricate YAML features are not implemented.

Maybe a Mojo BenchmarkSuite to mirror the testing TestSuite?

owenhilyard · January 16, 2026, 7:16pm

Maybe a Mojo BenchmarkSuite to mirror the testing TestSuite?

Today’s your lucky day! benchmark | Modular

mjboothaus · January 17, 2026, 3:25am

Hi @owenhilyard

Actually I basically simultaneously discovered benchmark in the stdlib
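
For reference, timing a single parse call with the stdlib `benchmark` module might look roughly like the sketch below. It assumes the `benchmark.run[func]()` form and `Report.print()` from the stdlib docs; the exact signature may differ between Mojo releases. The helper name `parse_small` is hypothetical:

```mojo
import benchmark
from yaml import parse

fn parse_small():
    # Errors are swallowed so the function matches the non-raising
    # form used in the stdlib docs example.
    try:
        _ = parse("""
server:
  host: localhost
  port: 8080
""")
    except:
        pass

fn main() raises:
    # Run the function repeatedly and print the timing summary
    var report = benchmark.run[parse_small]()
    report.print()
```

A single small document like this gives only indicative numbers.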

I wanted to share a complementary approach, inspired by TestSuite, that I have prototyped in the following repo:

Mojo BenchSuite: TestSuite-style patterns for benchmarking

Repo: GitHub - DataBooth/mojo-benchsuite: A lightweight, TestSuite-inspired benchmarking framework for Mojo

Benchmarking is clearly important to the Modular team (the stdlib module is excellent!). This project explores making it as frictionless as TestSuite.

Key additions over stdlib `benchmark`:

Suite-level organisation — Group and run multiple benchmarks
Environment capture — OS/CPU/version for reproducibility
Adaptive iterations — Auto-adjust for reliable statistics
Multiple outputs — Console, markdown, CSV with timestamps

Example

from benchsuite import BenchReport

fn my_algorithm():
    pass

def main():
    var report = BenchReport()
    report.benchmark[my_algorithm]("my_algorithm")

Future: Exploring TestSuite.discover_tests[__functions_in_module__()]() pattern for auto-discovery of bench_* functions.

Any feedback and ideas welcome!
