TheStateOfGarnet2022

Up to date as of May 2023.

I am working on a new programming language called Garnet, so I wanted to write up a bit of a progress report about what it is, where it is, and where it’s going. This is partly to get my thoughts in order, partly to drum up interest (if it is deserving of any), and partly to solicit feedback from anyone who is interested. I did this at the end of 2021, but life has been been busy so it’s been closer to 18 months since then rather than 12. So, here’s the next installment.

What is Garnet?

The core question of Garnet is “what if Rust were small?”

Basically, I think there’s a niche out there for a borrow-checked systems language that tries to be more minimalist than Rust. If you divide programming languages into “small” and “large”, Rust is definitely on the large side. I don’t see anything fundamentally wrong with this, since it means there’s lots of Useful Stuff in it, but it also comes with downsides: Rust is difficult to learn, complicated to implement, difficult to modify, and its (blessedly few) sharp edges tend to be arcane and unpleasant. I also want to prioritize a lot of the useful things that Rust doesn’t prioritize: I want to make a tool with a stable ABI, excellent support for bare-metal programming, and fast compile times. That way we can have programming languages that complement each other.

So I think we need something like Rust but simpler. Now’s a good time for it, ’cause we’re also going through a little bit of a renaissance of low-level systems languages. Before 2015 or so if you wanted to write a program touching raw memory your realistic choices were “C, C++ or maybe Pascal”, but now aside from Rust there’s Zig, Hare, Odin, Austral, and probably others I’ve never heard of. However, with the exception of Austral, none of these have a borrow checker. I think borrow-checking is a killer feature that should be more common, so Garnet is a small, borrow-checked system language with a careful set of interlocking features that hopefully will produce a useful, flexible language. I kind of want it to be a low-level Lua. Ubiquitous, adaptable, handy and easy to pick up and hack on.

The Past, or, what I did in 2022.

Basically, I got type checking and inference working with generic functions and data, figured out how to make generics work semi-nicely without traits, and hammered out some weird edge cases of the type system. That was basically all of 2022. I also played with compiler backends a little, and fought a lot with my dysfunctional brain. Since then I’ve been doing some learning and cleanup, and worked a lot on monomorphization, which is one of those things which is not exactly hard but a total pain in the ass. So, progress has been slow and fraught. Apparently I am not alone in considering monomorph annoying, which makes me feel a little bit better. I’ll get it hammered down eventually, it’s just an annoying blocker because some of the features of Garnet’s generic type system can’t be expressed in Rust’s own generics and, since the compiler emits Rust code for those moments, those features typecheck properly but just won’t work until monomorphization is done.

The Present

However, I am now in the state where I can talk a little bit about Garnet as if it were a real thing you can write programs in, because it is! Not like, super interesting programs, but still nontrivial ones! At first blush, Garnet looks like Rust mushed up with Lua, because it is:

fn the_answer() {} =
    __println(42)
end

I can’t write Hello World because we don’t have strings yet.. I should probably add those someday but it won’t be too difficult, so for now they’re just skipped. There’s lots of things like that I’m afraid. Here we declare a function named the_answer() which returns an empty tuple, unit, {}. It calls one compiler-builtin function, __println(), and returns the result of it. The syntax is expression-based like Rust, so a block returns its last value. Most compound statements use keywords like do ... end or fn ... end or if ... then ... end instead of curly braces, but they work exactly the same way. Since we don’t use curly braces for blocks, we can use them for delimiting tuples like Erlang, so a tuple literal is written {1, 2, 3}. Struct literals are written {.label1 = value1, .label2 = value2}; it feels weird copying C but it works surprisingly well. One of the goals for Garnet is to make tuples and structs fairly interchangeable, so it makes me happy that they can use the same delimiters. Arrays are written [a, b, c], like Rust or Python.

Most of the easy stuff is present in the language: we have let for declaring constant values, let mut for variables, primitive loop and break expressions, recursion, a basic collection of math and logic operators, all that good stuff. The type system is strong with no implicit conversion, like with Rust, because this has saved me from so many tedious errors. There are structs, tuples, arrays, Rust-like enums aka sum types, and C-like enums, because Rust’s lousy support for C-like enums irks me. Sometimes you just want a sequence of integer constants. So, this is a program you can compile and run right now:

fn fib(x I32) I32 =
    if x < 2 then x
    else fib(x-1) + fib(x - 2)
    end
end

fn main() {} =
    let val = 40
    __println(fib(val))
end

The compiler is written in Rust and emits Rust code, which is a little kooky but means that you only need one language’s toolchain installed. There’s lots of potential for backends to choose from but so far none of them have stood out as the Obviously Best Solution, so this is fine to get going with. (LLVM is slow, and I want Garnet to compile quickly.)

We have integers and booleans as our built-in types: floats, characters and such are in the same boat as strings, which is to say not difficult to add but I just haven’t needed to get around to it yet. Functions are first-class types, but with no borrow checker we can’t really do closures yet. References, borrowing and move semantics are still in the future, and will probably be the next big hurdle. I could add unsafe pointers but there’s not a whole lot of point yet. We have sum types, but no pattern matching yet so they’re somewhat limited.

Arrays are written T[3] for an array of 3 T’s, all types compose postfix. So you can write:

fn get_center_element(T | matrix T[3][3]) T =
    matrix[1][1]
end

T is a generic parameter here, in Rust this function signature would be written fn center_element<T>(matrix: [[T;3];3]) -> T. We have slightly less punctuation than Rust; I tried to make generic type parameters entirely inferred like OCaml but it was one of those things that was more trouble than it was worth.

So that’s all pretty basic, on to the interesting stuff! Tuple types are written {T1, T2, T3}. A struct type is written struct a: T1, b: T2, c: T3 end. (Now that I look at it, maybe that should change.) Like tuples, you don’t have to declare structs before using them; by default structs are anonymous and duck-typed by their field names and types. This was honestly an accident but seems to have worked out well in practice. So you can write something like this:

let foo struct
  .a: I32, 
  .b: Bool,
end = {.a = 3, .b = false}

let bar struct
  .b: Bool,
  .a: I32, 
end = foo

Kinda wordy, but the variables foo and bar above are the same type. This ability just kinda emerged on its own out of how the type system works; I don’t expect anyone to ever really do this, though it’s occasionally handy. Instead there will be an alias declaration that lets you add shortcut names for types, like so:

alias Thing = struct
  .a: I32, 
  .b: Bool,
end

let foo Thing = {.a = 3, .b = false}
let bar Thing = foo

If you want to introduce nominal types, you do that with the type declaration, which produces a type constructor function and makes it so you can no longer interchange values of different types, even if they’re “the same”:

type Thing = struct
  .a: I32, 
  .b: Bool,
end

let foo = Thing {.a = 3, .b = false}

 -- This no longer compiles, foo and bar are not the same type
let bar struct
  .a: I32, 
  .b: Bool,
end = foo

In Rust, this sort of single-value nominal type would be called a “tuple struct”. In Garnet type just takes any ol’ type declaration as a parameter, so it can contain a primitive type, a tuple, a struct, a sum type, something with generic parameters, whatever:

type Result(T, E) = sum
    Ok T,
    Err E,
end

Type inference works with generics, which is really the hard part of type inference, so this snippet typechecks and returns “false”:

let x = Result.Ok 42
let y = Result.Err "computer blown up early"
-- x and y have the inferred type Result(I32, String) because == compares two values of the same type.
x == y 

And someday pattern matching will let us match on nominal types to get the inner structural type and vice versa. So all in all there’s a lot I’m pretty happy with.

The last big feature of Garnet is generic types. Or rather, how do you generalize generic types? Sure it’s easy to have an array of any type T, but what if you want to write a function that takes an array of any T’s and checks if they’re all equal to some value? You need to be able to define equality for a particular type T and then tell your new function about it, which rapidly leads you into metaprogramming questions like “how do I tell the language that anything which is Orderable is also Comparable?” Older languages like C and Pascal take the easy answer of “you don’t”, which works surprisingly well a lot of the time but results in writing lots of fragile boilerplate code and/or ugly macros. That’s no longer particularly satisfactory though; in the end these sorts of generics are really about robustness and reducing code duplication. Java, C#, and other “traditional OO” languages do this with inheritance, which works pretty poorly all in all. Dynamic languages like Python and Scheme handle this all at runtime, and it’s just a matter of whether or not you can make your compiler generate fast code for it.

Newer languages have better approaches for handling this sort of thing, which is why we’re seeing dynamic languages become less popular. Rust uses traits (aka typeclasses), which are a language feature that let you define collections of functions that are “attached” to particular types, and has rules about how they interact. Zig takes another interesting approach and lets you write code that manipulates types at compile time to generate new code, somewhat like D/C++ templates. This is nice because it doesn’t require as much fanciness from the type system; as long as this template code runs successfully and outputs more valid code then you’re good. Both of these approaches essentially compile to entirely static function calls, there’s no type lookup at runtime or anything.

However, in my opinion traits are part of what makes Rust complicated, so I didn’t want to use that approach; if Garnet had traits then it would mostly turn into Rust sooner or later. Zig’s approach is very tempting, since what you really want out of type metaprogramming is to write code that writes code, and treating the language itself like a mega macro system seems like the lowest-friction way of doing that. I quite want a proper macro system of some kind anyway, sooner or later! However, this has some downsides too: if you screw up one of these comptime functions then it can just generate broken code, and you may not get informed that the code is broken until you try to actually use it. If you never call this function, or you call it with arguments that don’t trigger the broken functionality, you never know it’s broken. They can also sneakily incorporate non-local effects, so that an innocent change somewhere in your program can make things break somewhere entirely else. Basically, Rust’s expansion of generic types occurs after type checking, so it always generates a program that will typecheck. Zig’s expansion of generic types occurs before type checking, so it can generate code that then does not typecheck. This isn’t necessarily a deal-breaker, and it’s not too common to run into this in practice when writing Zig, but it still leaves me kinda spooked by the idea.

So I spent a long time thinking about this. What makes a language simple? What do you need in your programming language to keep it simple? Well, generally you start with values, functions/expressions, and types. Ok, why can’t we build type metaprogramming out of these pieces? Lua does it, and it works startlingly well, but it does it by looking up types and functions at runtime. Traits look a lot like structs full of functions plus some syntactic sugar, so what happens if we just make actual structs full of functions and go from there? Turns out the ML family of languages has done this already, and given the result the unhelpful name of “modules”. So I took a stab at implementing ML modules, and what do you know, it actually more or less works. So what does it look like if we define our aforementioned equality trait with that approach and try to use it? Our “traits” are just structs, and our “trait implementations” become just normal constant values that are fixed at compile time, and the compiler can still monomorphize it and inline everything so it’s just as fast as Rust and Zig. So put that together and we should get something like this:

--- Define our equality signature
type Eq(T) = struct
    eq: fn(T, T) Bool,
end

--- impl Eq for I32
const IntEq Eq(I32) = Eq {
    .eq = fn(lhs I32, rhs I32) Bool = ... end,
}

--- impl Eq for Bool
const BoolEq Eq(Bool) = Eq {
    .eq = fn(lhs Bool, rhs Bool) Bool = ... end,
}

--- Write a function using our Eq module
fn all(T | eq_impl Eq(T), val T, array T[]) Bool =
    -- iterators in for loops don't actually exist yet, but let's pretend they do...
    for i in range(0, array.len()) do
        if not eq_impl.eq(array[i], val) then return false end
    end
    return true
end

all(IntEq, 3, [1, 2, 3])                -- returns false
all(BoolEq, true, [true, true, true])   -- returns true

Booyah, it works. Turns out that ML modules and Haskell/Rust traits/typeclasses are basically isomorphic, you can turn either into the other one automatically with enough work. The differences are primarily ergonomic: modules are more explicit, and making them actually nice to use will probably require some syntactic sugar and argument inference that I’m not going to get into here. But to me at least they’re a lot conceptually simpler than traits: it’s all just structs, functions and generic types. You do higher level metaprogramming by writing const functions that return new modules, your trait constraints are just other modules those functions take as arguments, etc. So far you don’t even need anything like Rust’s associated types, which IMO is also a win, though it may eventually end up being simpler to include something like them anyway. The downside of modules, besides verbosity, is that you can make modules do some things that traits can’t, and some of those things are pretty ugly. But some of those things are real helpful, too, and I don’t think you can get one without the other. I wrote a whole Thing on the details if you’re interested. So I’m going to run with it and see how it goes.

So that’s more or less what Garnet looks like right now, give or take a few details or placeholders – the syntax I’ve demonstrated here is a little idealized, the reality has a bit lot more WIP junk in it I need to get around to fixing. All the basics are there though; it’s technically Turing-complete and even reasonably pleasant to use. Simple modules work, but complicated ones don’t, and that’s the main thrust of what I want to improve next. And as you may have noticed, there’s tons of small things that need to be done that aren’t directly related to that. Strings, iterators, floating point numbers, better packages/namespaces, and so on. I just now realized that I can create and typecheck arrays but actually indexing them doesn’t generate any code. So the language can’t actually do much yet, but it’s tantalizingly close.

The Future

So apart from modules and the associated features, and all the “little stuff” I keep insisting is just a Small Matter Of Programming, what is left in Garnet to design and implement? Well…

I said tuples and structs are intended to be interchangeable, but didn’t say anything about how. Well, the answer is I’m not really sure yet, but I have a few ideas of what I want. Basically, it’s a question of how much control you want to have about the layout of things in memory. For me this is one of the things I want to be a killer feature for Garnet, since it’s such a giant pain in the ass in C, Rust, and most other languages I’ve seen. The only language I’ve seen that lets you describe the layout of a collection in memory really well is Erlang, so I want to do something similar. My vague thought is that, similarly to how you can write patterns and then pattern-match to stuff different parts of a value into them, I want to be able to write bit-patterns and then pattern-match to stuff different types into them. How this will really work, I don’t exactly know yet, but there’s no reason it couldn’t. None of this exists, but it is an example of what I want to be able to write someday, one way or another:

--- Define bit patterns for the fields of 32 and 64 bit floating point numbers
type Float32 = << Sign:1, Exponent:  8, Fraction: 23 >>
type Float64 = << Sign:1, Exponent: 11, Fraction: 53 >>

--- Turn a Float32 bit pattern into a Float64 one
fn f32_to_f64(f Float32) Float64 = 
    match f with
        -- zero/sign extension here is probably wrong but you get the idea
        << Sign, Exponent, Fraction >> => Float64 << Sign = Sign, Exponent = zero_extend(Exponent), Fraction = sign_extend(Fraction) >>
    end
end

--- Define a struct that has the fields and bit layout of a Float32
alias PackedFloat32 = struct with layout Float32
end

--- Reinterpret any array of >=4 bytes into a Float32 and extract the Exponent part of it as a byte
fn get_exponent_from_bytes(bytes U8[]) U8 = 
    let byte_slice = bytes[..4]
    let float = byte_slice @as(Float32)    -- the compiler knows that byte_slice is exactly 32 bits and has a compatible layout
    return float.Exponent @as(U8)
end

99% of the time you won’t care about this sort of layout detail, and that’s fine, but Garnet is intended to be useful for that 1% of times when you’re writing device drivers, super memory-optimized things, and so on. You can do this sort of thing in Rust or C, but not particularly conveniently or safely. Honestly, that’s silly. It’s just so bad. Let’s make our tools better!

What else needs to be done… there’s a bunch of syntax warts to iron out and some things that aren’t implemented just ’cause I haven’t had energy. For example, there’s a few syntax bugs associated with the lack of semicolons. The Austral spec says “for many people, semicolons represent the distinction between an old and crusty language and a modern one”, which I love because it’s so true. It then goes on to use Austral’s semicolons as a way to repel the people who will care overmuch about that distinction, which I love even more because it’s so genre-savvy. (My insistence on not using curly braces for blocks serves a similar purpose, I realize belatedly.) I really don’t care whether a language uses semicolons or not but I’ve usually felt that they shouldn’t be necessary, so I wrote Garnet without them. It turns out that they’re not necessary but they are certainly useful sometimes, and so their absence makes some things tougher. I kind of want to try making newlines be significant and see how well that works in practice, and it feels like it may clear up other warts, but I just haven’t had time. There’s lots of little exploratory stuff like that sitting around.

There’s a very basic proof-of-concept for namespaces/packages that let you break a program up over several files, but nothing particularly functional. It will work basically like Rust’s modules, give or take, and that’s just fine. There isn’t really any “protection” mechanisms like public/private functions or types yet. I’m not opposed to adding one but want to first try the language out just using naming conventions for public/private values a la Python. I expect I will semi-inevitably decide it’s a bad idea and add proper exports and such, but if I can I’d like to omit features until they’re necessary. On the other hand I really want to make Garnet compile fast, and that means being able to parallelize every stage of compilation possible, so the package system may shake out a little differently than Rust’s “glom all files together into one compilation unit”. Need to do some more research there; Zig has some crazy and awesome ideas. I also want to be able to have a sensible ABI, at least for most things, so that’s related too. Rust has way too many great libraries that are far too difficult to use from other languages.

And the borrow checker! I haven’t even touched this for real yet, ’cause it’s scary, but frankly what I really want to do is just steal Rust’s borrow checker, or possibly a simplified version of it. If we can do that then anything more is gravy. Though, inspired by Austral, I also want to have linear types in Garnet. “Linear types” are a superset of Rust’s drop-checking, which automatically calls destructors on values when they go out of scope. Linear types mean that you must call destructors on values when they go out of scope. To me the way Austral does it is needlessly verbose, but there’s cases where your destructor doesn’t fit the function signature of Rust’s Drop trait, so you really want to be able to write an arbitrary function that destroys a type and have the compiler make sure that function is always called somewhere. So I want Garnet to have something like Rust’s Drop trait, and also the ability to say “this type can’t be Drop’ed and needs to be destroyed manually instead”. I think the absence of this in Rust is a wart that it can never quite fix for real without breaking existing code, so it’d be nice to get it Right in Garnet.

Conclusions

A lot of the time it feels like I haven’t made much progress. But looking back on it since the start of 2022, most of the hard stuff seems to work and I’ve converged on a fundamental design of the language that I don’t hate. That’s pretty good, honestly! That’s one reason I like writing these little updates. Now it’s a matter of getting a few of the last big blockers done, and then features can mostly be added one by one without interfering with each other.

So if you’ve read this far and want to work on a programming language, I would love some actual help. The last 18 months has taught me a lot about what I am actually good at in compiler writing, and the answers are basically the more engineerly parts of it. Wonky data structures, assembly programming, big high-level designs, I love all of it, but give me a paper full of turnstiles and gammas and my eyes glaze over. I can do the more math and theory intensive parts, eventually, but I sure as heck don’t enjoy it much. I want to work on a bunch of the missing features and “future” bits of the language, and I can’t because there’s these annoyingly critical things like type inference and borrow checking that keep getting in the way.

But I’m doomed to do it anyway, ‘cause nobody will do it for me, and that leaves lots of low-hanging fruit for people who want to pluck some. Wanna add strings? Make error messages suck less? Do some cool backend optimization? Figure out how to handle errors more smoothly than Rust’s Result+panic? Write a borrow checker? Make a standard lib? Design bit patterns? Come up with a heckin’ name for ML modules besides “modules” so we don’t have to call a file full of code a “package” or “namespace”? Maybe even, heaven forbid, get it to the point where we can start writing something resembling a standard library? Go for it, sky’s the limit.

The project is all on sr.ht. There’s two issue trackers, one for language design and one for compiler implementation. There’s a wiki and some mailing lists ’cause that’s how they do things on sourcehut; if emailing patches or whatever gets too much of a pain in the ass then maybe we’ll change to something else. To hack on the compiler you will need Mercurial and rustc and that should be it.

I suck at writing endings. Onward!