This conversation is a long-running one and has come up separately in many PRs and issues: whether to rename a function to snake_case, rename it entirely to be more explicit, or choose a different behavior for a function's edge cases.
I wanted to start this conversation about API design here in the forum as a central place and future reference.
This is an excerpt of a conversation I had in issue #4217 with @gryznar:
Let’s finally drop this narration about motivating decisions just because “Python has that”
IMO we should still take them into account, we can’t just say we want to eventually become a superset and just go and destroy the compatibility of both languages.
Compatibility may be achieved in other ways than just replicating the API (an AI-based migration tool, for example, will handle such cases quite well in the near future, I suppose).
I still see reimplementing the API (by copying it) as a big foot-gun that blocks better design decisions and innovation.
I agree in a sense (I wanted to e.g. remove the raising of split() on an empty separator, and many other instances like that), but I’m also wary of going full on “let’s optimize every single bit of every decision, forgoing ergonomics/familiarity”.
For example, maxsplit could easily be moved to parameter space, since I don’t know of any use case where one would need it to be a runtime argument. We could also rename maxsplit to max_split. While we are at it, we could add an overload where the separator itself is also in parameter space, so that the _mem* implementations can optimize away several checks… this can go on and on about every single detail in every API. Where do we stop?
Ultimately, all decisions are up to Modular. However, as I posted in this issue, I’d like to see Mojo as Python++ (a better Python, not its superset). There are a lot of possibilities to design the API in a much better and more consistent way, and just sticking to Python seems to me like ballast that achieves, in the best case, semi-compatibility.
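For reference, the Python behavior being debated in the excerpt above can be checked directly. This is a minimal sketch of standard CPython `str.split` only; the helper name `split_or_error` is made up for illustration, and nothing here describes Mojo's implementation.

```python
# Standard CPython str.split behavior referenced in the excerpt.
parts_all = "a,b,c".split(",")               # no limit on the number of splits
parts_one = "a,b,c".split(",", maxsplit=1)   # at most one split is performed


def split_or_error(text, sep):
    """Hypothetical helper: return the split result, or the error text if split() raises."""
    try:
        return text.split(sep)
    except ValueError as exc:
        return f"ValueError: {exc}"


# Python raises on an empty separator instead of returning a result.
empty_sep = split_or_error("abc", "")

print(parts_all)   # ['a', 'b', 'c']
print(parts_one)   # ['a', 'b,c']
print(empty_sep)   # ValueError: empty separator
```

Note that `maxsplit` is an ordinary runtime argument in Python, which is exactly the kind of detail the excerpt suggests a statically compiled language could handle differently.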
I still see reimplementing the API (by copying it) as a big foot-gun that blocks better design decisions and innovation.
I fully agree with this sentiment. The expressed desire currently seems to be for Mojo to be a “Python family” language that focuses on high performance. It’s Python-like, which allows for skipping a lot of language design rabbit holes and makes getting a first draft of things done faster.
But when it comes to APIs/stdlib conventions, I don’t think copying Python has much value. Python’s APIs aren’t known for consistency, and the stdlib is so large that we’d have to draw arbitrary lines somewhere to start with.
If we want to defer to another implementation’s way of doing things in the name of skipping design work, let it be Rust, or some other language with a cohesive set of API design guidelines and a well-thought-out stdlib.
The only caveat I’d have is if Modular has designs around some fantastically seamless interop story between the two languages where same-API would be needed.
In short, I don’t think we should be tied to keeping APIs one-to-one with Python.
Rust APIs are not the best that could ever be made, but they are at least consistent, which is a big plus for me.
In the case of such interoperability, Mojo can just seamlessly call CPython under the hood as it does nowadays, but with support from the LSP and better wrapper APIs.
I think that was the long term goal originally. “Rename all your .py files to .mojo and magically everything is faster”, which surely would be nice, but I think freeing ourselves from the shackles of being exactly the same will be a good thing long term. We are free to make improvements where Python made mistakes, and if we can achieve a nice workflow of offloading expensive operations in Python codebases into Mojo, like we currently do with C, I think that’s good enough. As long as we keep things close enough that we still achieve the goal of closing the gap between AI research code, and AI deployment code.
I’d suggest splitting this into two different problems:
How do we provide drop-in compatibility with Python someday?
How do we benefit from muscle memory today?
In the first case, my expectation is that dynamically typed values will be something like PythonObject, and their API will follow Python exactly, in a way similar to how Python itself does it.
This means that non-PythonObject values (e.g. Int vs int) CAN have completely different APIs if they want to. The problem then is the second question: if int and Int are completely different, does that make the ecosystem stronger or weaker?
I think the answer here is that we keep things similar to Python wherever it makes sense, but deviate when there is a strong reason to. I don’t think that “consistency” or “beauty” by itself would be a strong enough reason to deviate, but “type safety” and “generics features” and “ability to use traits to extend things” or other features that are key to static languages can be.
Cherry-picking where we deviate from Python and where we don’t will lead us to a state where some things work as in Python and some do not.
Following Python will also raise a lot of unavoidable questions like: “Why does thing A not work like in Python if thing B does?”
Every deviation in this area may be more confusing for people than just keeping a language with a Python-like vibe, but with its own nature. Most copies are just worse than the original.
I still vote for going our own way and not creating the misleading impression that Mojo is a Python superset (which is not achievable if we want a performant language with good design).
I think we can easily provide superficial API compatibility through extensions, similar to how OCaml provides custom operators via open XxxNotation. This approach allows the compatibility layer to be maintained separately from the core standard library.
    # This example is only meant to illustrate the idea
    # os.path.isabs is actually a free function.

    # in stdlib/path/path.mojo:
    struct Path:
        fn is_absolute(...) -> Bool:
            ...

    # in stdlib/path/compat.mojo:
    extension Path:
        fn isabs(...) -> Bool:
            ...

    # ordinary Mojo usage:
    fn main():
        p = Path(...)
        if p.is_absolute():
            ...

    # Mojo in "Python compat mode":
    from path.compat import *
    fn main():
        p = Path(...)
        if p.isabs():
            ...
I love this idea of extensions!! It would also allow us to separate type implementations into multiple files for logical isolation and let library authors organize their code better.
I think this would also open the door to making normal/breaking API changes with versioning using these extensions. When a change has a simple workaround, keeping backwards compatibility will be trivial (this can already be done, but it bloats the main file). @deprecated, for example, doesn’t allow one to mark arguments or parameters as deprecated, so one could build an extension in a separate file that holds the deprecated API along with a deprecation warning. Theoretically, many versions of backward compatibility could be maintained this way.
This could totally wait further down the roadmap, but it would be something very nice to aim for IMO.
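The idea above of keeping a deprecated API in a separate layer can be roughly approximated in Python today. This is only an analogue under stated assumptions: the `extension` syntax in the earlier example is a proposal, so the sketch uses plain Python attribute assignment instead, and `Path`, `install_compat`, and the POSIX-only absolute-path check are all hypothetical names and simplifications chosen for illustration.

```python
import warnings


class Path:
    """Hypothetical core type: only the preferred API lives here."""

    def __init__(self, raw: str) -> None:
        self.raw = raw

    def is_absolute(self) -> bool:
        # Simplified POSIX-style check, for illustration only.
        return self.raw.startswith("/")


def install_compat(cls) -> None:
    """Hypothetical compat layer that could live in its own file.

    It attaches the old spelling as an alias that emits a
    DeprecationWarning and forwards to the preferred method.
    """

    def isabs(self) -> bool:
        warnings.warn(
            "isabs() is deprecated; use is_absolute()",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.is_absolute()

    cls.isabs = isabs


# Opting in to the compat layer is an explicit step.
install_compat(Path)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = Path("/usr/lib").isabs()

print(result, len(caught))
```

The core class stays free of legacy names, and code that never calls `install_compat` never sees `isabs` at all, which mirrors the opt-in import in the extension example.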
Yeah, I am also +1 on extensions, and I think the consensus from the Mojo team overall is that this is the right thing to do; they just haven’t had time to implement it yet.