TheStateOfGarnet2021
I am working on a new programming language called Garnet, so I wanted to write up a bit of a progress report about what it is, where it is, and where it’s going. Basically partially to get my thoughts in order, partly to drum up interest (if it is deserving of any), and partly to solicit feedback from anyone who is interested. I’ve been working on it off and on for something like a year and a half now, fairly seriously for about a year, and I intend to keep going with it until either it has solved the questions I want to solve or it stops being useful because someone’s made something better.
What is Garnet?
The core question of Garnet is “what if Rust were simple?”
The origins of Garnet as a general idea are very old, but didn’t solidify until I learned Rust. I love Rust a lot, and it has basically replaced anything else I might use for any class of problem I’m interested in. I essentially no longer write C#, F#, OCaml, Python or C for anything serious, if I can at all avoid it. For me, it actually is that good. However, Rust is not the last language that ever needs to exist, and like any real-world system that gets used for complicated stuff it has its share of flaws, downsides, and just outright mistakes. Garnet might or might not ever become anything large or popular, but I want to experiment with taking a lot of Rust’s ideas, making different decisions about how those ideas should work, and see what happens.
I’m not going to fully explain the language here. Check out the WIP rationale notes for the design philosophy, and to see it in action you’re best off looking at the docs and tests directories in the source repo.
The Past
Or, what I did in 2021.
Basically, I got the compiler working end-to-end and compiling real programs that include types other than integers and booleans. There are Rust-style sized integers, there are basic enums and tuples and structs and lambdas, but no closures or generics yet. There is a Never
type which works way better than I ever expected. There is some very basic expression-level type inference. All in all, it now looks like a programming language that did not come out of an undergrad course.
Lots of small tooling and quality of life things were made, which are not exactly innovative but also makes everything feel much more real. There is a command line program that you can actually use to compile programs. There is a test harness that compiles and runs a bunch of real programs written as standalone files and checks that their output is correct, which seems like a small thing but actually makes life so much better. There is a CI pipeline that runs tests and does code coverage, though it’s still a little jank. Error handling for type checking got way better. Symbol tables and string interning got way better. I even started writing an informal spec.
The backend was rewritten to simply output Rust code. This is dumb and also works very well. I went around in circles on it about three times, going from outputting Webassembly to creating an SSA intermediate representation to outputting Webassembly again to realizing that I didn’t know enough to write my own re-looper equivalent and I should just do something simple because a backend is not the important part right now. I’ll worry about the backend once I know what the type system should look like. A simple backend is not terribly difficult anyway, and a complicated backend is a life-long project that I don’t particularly want.
Then… I spent about six months grappling with type checking, type systems and generics. I am not finished doing that yet, either. The type checker has been rewritten about three times as well, and has grown to be about as sophisticated as is necessary to work without any kind of generics, but I banged my head into it a lot trying to go further without much luck. Plain HM type checking is not good enough. So I’ve spent the last couple months hitting the papers and experimenting with the results of them, and writing a proof of concept type checker based on the “Complete and Easy Bidirectional Typechecking for Higher-Rank Polymorphism” paper by Dunfield and Krishnaswami, 2020. Implementing the paper worked, now I am working on adding the rest of Garnet’s existing types into it which also seems to be working so far. Then I should have a good basis for handling polymorphism in general, basically about on par with System F, and I can replace Garnet’s existing type checker with it.
I’ve also concluded that I’m probably the only person on the face of the planet who wrote a parser and was basically happy with it on the first attempt. I guess some things have to go well sooner or later.
The Present
Or, what I want to do in 2022.
Garnet is very much still in the embryonic phase. A compiler exists, the language is technically Turing complete and has a lot of the foundational features like functions, loops, basic tuples and structs, and so on. However, it’s not able to do anything particularly interesting yet.
Major features it does not have yet are:
- Borrow checking
- Module system
- Generics and something like traits/objects/interfaces/typeclasses
These are all sooooorta interrelated, but I will try to tease them apart.
Borrow checking
Okay I have to admit I have no idea what I’m doing here, in a formal sense. I am on very good terms with Rust’s borrow checker and we very rarely get into fights, but I have only a general idea about how it actually works under the hood and what the algorithms are like. IMO there’s no particular need to do better than Rust’s borrow checker, but there’s still a lot to learn to be equally good.
So, this is both easy and difficult.
Module system
This is the simplest feature at its core. Having namespaces in code is useful, and having those namespaces defined by separate file is pretty convenient. On the other hand, structs produce namespaces already, so I really may end up doing something similar to Zig or Lua, and make modules simply be structs. We can evaluate them at compile time, remove any unnecessary indirection, and compile the function calls to code just as if those structs never existed. But perhaps if we also preserve those structs explicitly, we can use them as interfaces, vtables, DLL export tables, or other handy bits of runtime metadata.
Which leads us to…
Generics
This is the biggie. Some sort of mechanism for safe code generation based on types is essential for a strongly typed language in the 21st century. However, it makes everything exponentially more complicated.
It is easy to make types and functions accept arbitrary data types, as long as all they can do is move data around. The difficulty comes when you want to do stuff with your data based on partial information about it, and it becomes more difficult again when your data can be code anyway. Various language have different appropaches to this, Rust traits and Haskell typeclasses and C# objects and SML modules; the basic principle of “here is a set of functions you can apply to this data type” is pretty clear. But then when you nest these, or generate them based on other type constraints, then things can quickly become impossible to typecheck or impossible to generate efficient code for.
So this is a design space that needs some exploration. Rust’s traits are very nice, but I dislike the constraints around trait objects, coherence and how much they invite metaprogramming without quite living up to the full possibilities. Zig’s essentially macro-like code generation approach is very compelling, but has the problem where things can be invisibly Wrong until they are instantiated and used. Object-ish systems are quite well explored, but have a lot of holes and do a lot of their work at runtime via slower dynamic dispatch. If our traits/interfaces are structs, then can we define our generics simply by having functions that take a type and return a struct implementing the generic for that type? SML’s modules work a lot like this, and the research language 1ML demonstrates you can make it work with just functions and first-class types. This starts out seeming easy, then you realize it means you can have structs with types as their members and have to figure out how to actually implement that, and kinda goes down-hill from there. Can we do something like that robustly, without needing arbitrary compile-time code execution that may take infinite amounts of time? Moreover, can we do it without it being a total pain in the ass?
I also really want it to take it a step further though. In Rust there’s a lot of functions that have equivalents that operate on owned objects, on references, and on mutable references. This irks me, and occasionally (such as when making builder pattern objects) causes real problems. I think being able to be generic over ownership would make the language both simpler and nicer. We will see though.
The Future
Or, what isn’t going to get done in 2022.
Backend
The compiler is written in Rust, and rather cheekily also outputs Rust code and calls rustc
to compile it. Obviously, this does not really fulfill any of the “portable” or “quick to compile” goals, but should be good enough to bootstrap itself with. Keeping a backend around that emits Rust, C, or some other high level langauge might actually be quite useful for making archival snapshots or bootstrapping on new platforms – OCaml does something like this, for example. But the intention is to eventually have a Real Backend of some kind or another.
What to use for its backend, though? There’s not actually that many general-purpose compiler backends out there, your choices are basically GCC or LLVM, and both of those are large hulking monsters I would really prefer not to have to deal with. This basically leaves two options: Use a less-developed but more lightweight compiler backend like Cranelift or QBE, or write our own backend from scratch. Not sure which way is the best right now. Both have upsides and risks.
Properties
I want the compiler to explicitly prove and promise that certain functions or types have certain properties. Rust does this a lot with traits and derive macros, but that often comes down to Compiler Magic in the end, such as the Copy
trait. Mainly I want to be able to be able to say that certain functions do not panic, or do not allocate dynamic memory, or do not use unbounded stack space, things like that. Then if I violate one of these properties by accident the compiler can catch it for me. This seems fairly straightforward to begin with, but as it gets fancier you start edging into algabraic effects, which are a topic of open research. So we may go there, or we may cherry-pick the most useful bits of functionality from them without the complexity of the whole system, or the whole idea may be more trouble than it’s worth. We will see.
Tooling
Tooling is Important. Rust is a great example of what a difference modern tooling can make, and C++ is a great example of what it’s like when a langauge makes that impossible. So, if I want Garnet to become a Big Boy Language, it must have at least:
- Build/packaging tool (cargo)
- Package index (crates.io)
- Formatter (rustfmt)
- Doc generator (rustdoc)
- Basic editor integration (syntax highlighting and basic context info)
If things get more serious, more powerful tools also become useful:
- Linter (clippy)
- IDE integration (language server)
Frankly, I’m halfway convinced I could just make cargo
believe that Garnet code is Rust, and invoke garnetc
appropriately. That way we could just use an existing build tool. This isn’t necessarily a good idea, but it would be hilarious if it works.
None of this exists yet.
Conclusions
So yeah! The easy stuff is mostly done. Spending six months grappling with type theory and still not having a real solution is… pretty demoralizing, but not particularly surprising in retrospect. I know the shape of what I want so I think I just have to hang in there and get there in my own fashion, and I am now reasonably certain I am on the right track. Math has never been my friend but I know I shall bend it to my cruel whim sooner or later. Always have before.