The State of Garnet, 2023
Garnet is my attempt at making a programming language that could be summed up as “What if Rust was simple?” The goal is to make a small, borrow-checked language aimed at lower-level programming, designed for portability and robustness. It’s the end of another year, so let’s take a look at what’s new.
Last updated December 2023.
Previous years:
What got done in 2023
- Finished writing a basic type checker with inference and generics; it now mostly works as long as you don't get very fancy
- Made newlines syntactically significant, which dramatically unfucked old syntax issues
- Tried to improve generic type inference, which dramatically conjured new syntax issues, so it’s probably not worth it
- Sum types now work, as do arrays and tuples
- Took a stab at monomorphization, which doesn’t work yet but is a blocker ’cause we need to be able to generate some code that isn’t valid Rust. Since we just compile to Rust for now, this is a bit of a nuisance.
- Experimented with a namespace/import system; it’s on hold until there’s a real plan for separate compilation.
- Burned myself out and decided to spend a couple (okay, four) months working on a side project, which conveniently made me get around to learning Elixir too. Elixir is a pretty wild ride and I love it.
I also did a lot of cleanup and minor convenience-level language improvements, like adding `const`s and propagating them into generated code where possible, sorting out type and operator syntax a bit, adding nicer support for compiler intrinsics, making a slightly better autoformatter and integrating it into the language test harness, adding a verification pass to the type checker to make sure it actually finds concrete types for everything in the program and doesn't leave anything un-inferred, etc etc. There's still a lot of small pieces left to do though; for example, bit operations like shifts and xors are currently just compiler builtin functions, there's no real way to convert values between types, and there's no characters or even constant strings. It would take a day or (more realistically) four to add each of those, they're just less important than getting the foundations down more solidly.
So right now, this is a small example of the hypothetical Garnet I am aiming for, implementing the PCG32 pseudorandom number generator:
```
type Rand32 = struct
    state: U64,
    increment: U64,
end

--- Mutably borrows a `Rand32` and produces a random
--- `U32` in the range `[0, 2^32-1]`.
fn rand_u32(self ^uniq Rand32) U32 =
    -- We have to explicitly dereference borrows with the ^ operator,
    -- which is postfix. Not sure if we'll eventually have Rust-style
    -- auto-deref of borrows, don't feel like we need it yet.
    let oldstate U64 = self^.state
    -- The "method" syntax foo:bar() is just equivalent to calling bar(foo).
    -- I'm considering replacing it with a more functional-style pipeline
    -- operator, foo |> bar(), might play with that someday.
    self^.state = oldstate
        :wrapping_mul(RAND32_MULTIPLIER)
        :wrapping_add(self^.increment)
    let xorshifted = cast(|U32| ((oldstate >> 18) xor oldstate) >> 27)
    let rot = cast(|U32| oldstate >> 59)
    xorshifted rotr rot
end

const RAND32_DEFAULT_INC U64 = 1442695040888963407
const RAND32_MULTIPLIER U64 = 6364136223846793005

--- Creates a new RNG state with the given seed
--- and the default increment value.
fn new_rng(seed U64) Rand32 =
    let mut rng = Rand32 {
        .state = 0,
        .increment = RAND32_DEFAULT_INC,
    }
    rng:rand_u32()
    rng.state = rng.state:wrapping_add(seed)
    rng:rand_u32()
    rng
end

fn main() =
    let mut rng = new_rng(4)
    let num = rng:rand_u32()
    println(num)
end
```
Then when you rewrite this into something the Garnet compiler can currently actually build, you get this:
```
type Rand32 = struct
    state: U64,
    increment: U64,
end

-- There's no borrows or references yet, or even raw pointers, so for now we just
-- take the struct as a value and return a tuple containing a new, modified
-- struct, immutable-language style.
fn rand_u32(self Rand32) {Rand32, U32} =
    let mut self = self
    -- Currently declared types are strictly nominal, you can't do much with them
    -- besides pass them to a function that expects that type. This includes
    -- accessing the fields of the struct. So the $ operator is an "unwrap"
    -- operator to access the inner value, like doing `foo.0` with a newtype
    -- struct in Rust. It should be able to be inferred away when I can bother.
    let oldstate = self$.state
    -- No wrapping/saturating/etc math operators yet, so this will actually
    -- panic on overflow.
    self$.state = oldstate * RAND32_MULTIPLIER + self$.increment
    -- Bitwise operators are just built-in functions right now.
    let xorshifted = __rshift_u64(__bxor_u64(__rshift_u64(oldstate, 18), oldstate), 27)
    let rot = __rshift_u64(oldstate, 59)
    -- Type conversions are also builtin placeholders.
    let res = __ror_u64(xorshifted, __u64_to_i32(rot))
    {self, __u64_to_u32(res)}
end

const RAND32_DEFAULT_INC U64 = 1442695040888963407
const RAND32_MULTIPLIER U64 = 6364136223846793005

fn new_rand32(seed U64) Rand32 =
    let mut rng = Rand32 {
        .state = 0,
        .increment = RAND32_DEFAULT_INC,
    }
    -- No pattern matching yet, so we index the returned tuple to access the
    -- first element.
    rng = rng:rand_u32().0
    rng$.state = rng$.state + seed
    rng = rng:rand_u32().0
    rng
end

fn main() =
    let mut rng = new_rand32(4)
    let res = rng:rand_u32()
    __println_u32(res.1)
end
```
I’d totally forgotten I’d implemented unsigned integers; neat! So there’s a lot of placeholders and fiddly edges, but honestly it’s better than I thought it was, which is of course part of the point of doing a retrospective like this.
Type system
The main blocker on handling a lot of those fiddly edges is improving Garnet's type system, which is really where 90% of the effort has gone. It's also the part where I expose how little I know, so let's get it out of the way first. Garnet's type system is basically the younger sibling of Rust's, which is a younger sibling of ML's, but there's still a lot of design space to play around in there. Beyond the modules-vs-typeclasses thing I went through last year, here are the main points that I've mostly settled on for Garnet:
- Garnet has Rust-like generics that get typechecked and monomorphized – as opposed to, for example, Zig, which treats generics as template-y macro-y things that generate new code before typechecking, or OCaml, which last I checked doesn't monomorphize most generics.
- Limited-at-most dependent types. Ie, you can't treat arbitrary values as part of types, such as being able to write `fn square(Int) returns an Int >= 0` and have the type system prove it correct. Arrays with integer lengths can be treated like a special case of this, and if there are other special cases they'll be limited to fairly-simple integers and maybe C-like enums. This is mainly because whole-hog dependent types tend to creep towards having a Turing machine in your type system, and also from what I've seen when people want dependent types they really want to be able to do math on type sizes or reflect on stuff at compile time. So I'm not sure it's worth the cost as a general-purpose tool.
- Higher-kinded types are a bit of a morass that I'm still undecided on. The question that HKT's answer is: can you write a generic type constructor that takes generic parameters? Can a function take a type like `Map(K,V)`, where `Map` is a generic type, and let you pass it a `BTreeMap(K,V)` or a `HashMap(K,V)`, without writing the equivalent of separate trait impls for both? HKT's are nice, but are they nice enough to be worth the added complexity? And is a language without them actually that much simpler, given that you can often work around their absence with some effort – and how nasty is that workaround, really? I dunno. Every time I look at them I come up with a different conclusion.
- So for the math-y among us, I guess this is fairly close to System Fω?
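For a concrete feel of the workaround in question, here's roughly what you end up writing in today's Rust instead of passing `Map(K,V)` around: a hand-written trait with the "separate trait impls for both" that HKT's would make unnecessary. (The `MapLike` trait and its methods are invented for illustration.)

```rust
use std::collections::{BTreeMap, HashMap};

// Without HKTs we can't abstract over the type constructor `Map` itself,
// so we define a trait and implement it once per concrete map type.
trait MapLike<K, V> {
    fn insert_kv(&mut self, k: K, v: V);
    fn size(&self) -> usize;
}

impl<K: Ord, V> MapLike<K, V> for BTreeMap<K, V> {
    fn insert_kv(&mut self, k: K, v: V) { self.insert(k, v); }
    fn size(&self) -> usize { self.len() }
}

impl<K: std::hash::Hash + Eq, V> MapLike<K, V> for HashMap<K, V> {
    fn insert_kv(&mut self, k: K, v: V) { self.insert(k, v); }
    fn size(&self) -> usize { self.len() }
}

// Generic over "any map-like thing", but only because we wrote
// the boilerplate impls above by hand.
fn fill<M: MapLike<u32, &'static str>>(m: &mut M) {
    m.insert_kv(1, "one");
    m.insert_kv(2, "two");
}

fn main() {
    let mut b = BTreeMap::new();
    let mut h = HashMap::new();
    fill(&mut b);
    fill(&mut h);
    assert_eq!(b.size(), 2);
    assert_eq!(h.size(), 2);
}
```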
Details about Garnet’s type system that don’t fall into these categories:
- Garnet will have move semantics with optional linear types. Values that are non-`Copy` will mostly be drop-checked automatically a la Rust, but you will have the option to label types "must be destroyed explicitly", in which case the compiler will force you to call their destructors when they go out of scope. This lets you handle the cases where you need to drop objects in a way that doesn't fit into the `Drop` API. Basically all the actual details of this are up in the air though.
- There's about 25% of an algebraic effects system, which I call "properties". "Effects systems" a la Koka seem to really be made of two parts which, to me, seem unrelated: declaring what side-effects functions can have, and attaching non-local control flow to those side-effects. I just want to be able to assert things about functions in Garnet and have the compiler check my work for me. For me the interesting effects are mostly fairly low-level things like `const`/`comptime`, "no infinite loops", "no dynamic allocation", "no panics", "no stack overflow", etc, and only like half of them actually need compiler intervention to check. For all the rest, all you really need is the ability to tell the compiler "this specific function has an effect called `X`", and then anything that doesn't call it has your "doesn't do `X`" property. For example, any function has the property "doesn't write to stdout" as long as it doesn't call another function that writes to stdout anywhere in its call tree, and you just have a low-level `write()` function somewhere that has the property "this writes to stdout". Then you can parameterize higher-order functions on it, so `map(List, F)` has the "doesn't write to stdout" property as long as `F` does as well. Naturally this gets tricky sometimes when you want to dodge side effects, such as with logging or debugging functions, but figuring out how to make that work well is part of the fun.
- As described, those properties are ways of describing constraints on functions, but the original motivation for properties was looking for something better than Rust's marker traits on data. Beyond `Send`/`Sync`, I want to be able to express things like `Zero` for types where zero initialization produces a valid value, or `ConstDefault` where a non-zero but fixed bit pattern is a valid initializer, or `Total` where any bit pattern is valid. Hopefully this lets you specify some things that tend to be sharp-edged UB in Rust or C, such as "this chunk of uninitialized memory will contain garbage but it's still a valid integer", or "casting this integer into a file handle is a bug because you can't just conjure valid file handles out of nowhere". Those sorts of things are honestly probably more useful than function properties, but the semantics around them are also harder to think about, so my ideas about how to use them are less mature.
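As a rough sketch of what a `Zero`-style property corresponds to in today's Rust, you'd reach for an unsafe marker trait. (The `Zero` trait here is hypothetical, mirroring the name in the text; crates like `bytemuck` provide the production version of this idea.)

```rust
// Hypothetical `Zero` marker: an unsafe trait asserting that the
// all-zeroes bit pattern is a valid value of the implementing type.
unsafe trait Zero: Sized {
    fn zeroed() -> Self {
        // Sound only because implementors promise all-zero bits are valid.
        unsafe { std::mem::zeroed() }
    }
}

// Fine: every bit pattern of these types is valid, including zero.
unsafe impl Zero for u64 {}
unsafe impl Zero for [u8; 16] {}

// Deliberately *not* implemented for, say, a file-handle newtype:
// you can't conjure a valid handle out of a zeroed integer.

fn main() {
    assert_eq!(u64::zeroed(), 0);
    assert_eq!(<[u8; 16]>::zeroed(), [0u8; 16]);
}
```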
Compilation model
This is another tough part that I haven’t gotten much real work done on yet, but it’s also one place where Garnet’s goals depart from Rust’s the most. Rust makes some very deliberate design decisions: Compilation units are big, everything is statically linked, and there is no stable ABI by default. This makes the compiler design easier, linking and versioning of code easier, and optimizations more aggressive. But it also has some downsides: It’s harder to make compilation parallel, it’s harder to make compilation incremental, and it’s more work to distribute code artifacts (which is not always necessary, but is pretty useful sometimes).
So instead Garnet will have:
- API versioning built into the toolchain by default, with semver checking. In a strongly typed language there’s basically no reason I know of why the compiler shouldn’t be able to detect breaking API changes 100% of the time, or at least 99% of the time with a 1% false-positive rate and 0% false-negative rate.
- Compilation needs to be embarrassingly parallel, or at least have as few single-threaded bottlenecks as we can manage. I’m not sure I’m quite as dedicated to it as Zig, which guarantees you can lex each line in a file in parallel, but there’s a lot you can do to design away those bottlenecks if you try.
- There will be a stable ABI, though that's honestly pretty damn hard and a project for the far future. The current lowest-common-denominator ABI is the C ABI for whatever OS you are on, but I would also like to have a stable Garnet ABI that can express things like generics and borrows, including metadata and maybe partially-compiled code for another compiler to pick up on and work with. Having this can also make fast parallel compilation easier, since the compiler can read some metadata about a compiled artifact and know what to do with it, rather than having to start entirely from the source. Interop with other languages like C and Rust is a necessity, but honestly my dream is that it will eventually be useful and desirable to add a `repr(Garnet)` directive to Rust and C compilers. How to achieve all this? Eeeeh, ask me in a couple years.
- Because of all of this, using the existing C-descended linking model is prooooobably a bad move? The biggest issue with it in my mind is inlining. Inlining is really important, and not being able to trust compilers to do it results in a surprisingly large pile of terrible shit in the C/C++ code that people actually write. But currently your options are basically "don't have inlining across compilation units" (C/C++), "have only one compilation unit" (Rust/Go), or "trust your linker's LTO magic to do something sensible". I don't like the last option, partially because a linker doing LTO magic tends to be one gigantic bottleneck in a program's otherwise-parallel compilation. (Linking is always the step at which building a giant program like Chrome or LLVM suddenly consumes 60 GB of memory and cacks out.) On the other hand, the issue of linking is a long-term performance concern, and I'm certainly not going to sit down and start writing a new linker tomorrow. Zig and Swift are both doing some fascinating work exploring this space, so there's nothing wrong with letting them lead the way on that a bit and learning from their work.
Neat features that exist right now
Okay, I've done lots of talking about what I want to do with Garnet, but there are some small things that Garnet does right now that just make me happy, so I'm going to talk about those. These are mostly places where I feel like Rust has some nasty rough edges and I wanted to make them nicer. There's also a big pile of smaller mostly-cosmetic things that I like in my own opinionated way, like having only one type of doc comment or making struct and tuple literal syntax mirrors of each other, but while they spark some joy in me they don't actually Matter outside my own head.
More composable unary operator syntax
Basically, most unary operators in Garnet are postfix, particularly the reference and dereference operators. This is similar to Zig, and hopefully will become the standard idiom for most languages in the future for reasons I shall now explain, somewhat similar to how types-compose-prefix is now an unspoken standard among Go, Zig, most of Rust, and a few other more minor languages. (Garnet does that, too.)
This solves one significant wart in Rust: the operator precedence of chained unary operators. Basically, in Rust when you write `*foo.bar.baz` the `*` binds less tightly than the `.`, so you're writing `*(foo.bar.baz)`. If you want to write `(*foo).bar.baz` then you have to do that explicitly. However, in Rust you usually don't notice, because the compiler will insert reference or dereference operators for you, with some consistent rules that actually work quite well. So if you need to write `(*foo).bar.baz` or even `*(&(*(*foo).bar).baz)` then you can actually write `*foo.bar.baz` and rustc will usually be able to just Figure It Out For You. Except that when you're using raw pointers you really want dereferences to be explicit, so rustc doesn't do ref/deref inference for raw pointers. So in that one damn case you're stuck writing `*(&(*(*foo).bar).baz)` anyway. It's cosmetic, but it also gets old really quick. C gets around this by having both `.` and `->` operators, where `foo->bar` is sugar for `(*foo).bar`, and it works but I kinda dislike that too.
Meanwhile, in Garnet `^` (aka `*`) and `&` are postfix, so to write Rust's `(*foo).bar.baz` in Garnet you write `foo^.bar.baz`, and to write `*(foo.bar.baz)` you write `foo.bar.baz^`. If you want to write `*(&(*(*foo).bar).baz)`, then well, you just do `foo^.bar^.baz&^`. No parens needed, and instead of having to read it from the inside out you just read it from beginning to end. Because of this Garnet has no auto-deref rules; it may grow some eventually, but IMO it doesn't really need them so far.
This syntax is hardly new or exotic: it's exactly the logic behind Rust's postfix `.await`, Zig's pointer dereferences are postfix `.*`s, and Pascal did the same thing with its own pointer syntax in the heckin' 1970s. But it still makes me happy to replace a subtle complication with a nice simplification.
Nominal vs. structural types
Ok, this one is actually more than cosmetic. In Rust, tuples are structurally typed, so every `(i32, String)` is interchangeable with every other one, while structs are nominally typed, so `struct Foo { x: i32, y: String }` and `struct Bar { x: i32, y: String }` are entirely distinct types and you can't assign some `Foo` to some `Bar` even though they have the same fields. Both nominal and structural types are useful, so in Rust you can turn anything into a new nominal type with the "newtype struct" pattern, `struct Foo(Whatever);`. But you can't make a struct into a structural type, and "newtype structs" always seemed like a kinda half-assed solution both syntactically and semantically, and there's no real reason for it that I can tell. It just ended up that way.
In Garnet, tuples are structural, so every `{I32, String}` is interchangeable with every other `{I32, String}`, and structs are also structural, so a variable of type `struct x: I32, y: String end` is interchangeable with every other variable declared with that type. (Yeah, the syntax is too verbose for inline types; idk what would be better.) It's just an anonymous struct type, and its only identifying features are its field names and types, so it's also the same type as a struct written `struct y: String, x: I32 end`. In this it's a little like OCaml's and C's structs, which I frankly never particularly liked. But then we make the `type Foo = ...` declaration create a new nominal type, just like Rust's "newtype structs", and suddenly everything just slots together.
`type Foo = struct ... end` creates a nominal struct like Rust's, and `type Thing = {...}` creates a nominal tuple like Rust's newtype structs. Then you can have a `$` operator to "unwrap" or "take apart" the nominal part of the type to get at the underlying structural type. A "type constructor" like `Foo { .x = 3, .y = "something" }` or `Thing {a, b, c}` is just a function which takes a single argument, which is an anonymous struct literal, and returns an instance of that struct. So you can do things like `some_array.map(Foo)` to turn an array of tuples or anonymous structs into an array of `Foo`s. In contrast, in Rust if you make a newtype struct `Foo` then its constructor is a function, but if you declare a struct type `Foo` then its constructor is not a function, and you just suck it up and deal. (And now that I think of it, Garnet's `$` "unwrap" operator is just a function too, which dispatches on the type of its argument and does the opposite operation.)
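For comparison, the Rust side of that asymmetry is easy to demonstrate: a tuple-struct constructor is a plain function you can pass to `map`, while a braced struct's is not. (A small sketch; the `Meters` and `Point` types are made up for illustration.)

```rust
struct Meters(f64);             // tuple struct: `Meters` *is* a function
struct Point { x: i32, y: i32 } // braced struct: no constructor function

fn main() {
    // The newtype constructor can be passed to map directly:
    let lengths = vec![1.0, 2.5];
    let in_meters: Vec<Meters> = lengths.into_iter().map(Meters).collect();
    assert_eq!(in_meters.len(), 2);

    // For a braced struct you have to write the closure by hand:
    let pts: Vec<Point> = vec![(1, 2)]
        .into_iter()
        .map(|(x, y)| Point { x, y })
        .collect();
    assert_eq!(pts[0].x, 1);
}
```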
Then under the hood, both tuples and structs compile to tuples, which is to say, ordered, heterogeneous sequences of values. So the plan is to have a "layout" concept of some kind that can turn a struct into a tuple or vice versa, so you can tell the compiler that you can turn a `{I32, String}` into a `struct x: I32, y: String end` by mapping the first field to `x` and the second field to `y`, or vice versa. This gives you a way to apply a layout that imposes an ordering on a struct, or field names on a tuple, and suddenly tuples and structs become interchangeable with each other and a lot of the annoying little decisions you have to make when choosing between them in Rust go away. And somewhere down the line (with additions to describe padding and alignment of fields) this will let you express things like Rust's `repr(C)`, which defines an ordering for struct fields in memory as well, except instead of magical compiler builtins with fixed rules they'll just be something you can define yourself.
Oh, I nearly forgot: this also meshes nicely with sum types. Basically each variant in a sum type, aka an `enum` in Rust, just takes one argument. That argument can be an anonymous struct or tuple, or it can be a nominal type, or whatever. It's a bit rough still, but have you ever wanted to write a function that only takes or returns one particular variant of a sum type while still being the same type? Or one that takes a type constructor function for a particular variant and calls it with some particular data? Or one that takes a sum type's discriminant and a tuple of values and constructs one of several variants with identical fields? In Garnet that should all be possible, or at least smoother and less manual than in Rust, because the contents of a type and the name attached to it are always separate things.
Coroutines
So in June 2023 Graydon Hoare, the creator of Rust, wrote a blog post titled The Rust I Wanted Had No Future. It’s full of fascinating little nuances here and there, many of them about design goals and how that interacts with people management/diplomacy/politics. It paints a pretty interesting story where Graydon wanted a higher-level language with some low-level features, something more in the vein of Swift, C# or Elixir. But other people saw that Rust could be a safe low-level language instead, and really wanted that, and Graydon basically ended up bowing to their wishes. I’m pretty glad he did, too. I wouldn’t say that the Rust he wanted had no future (Swift exists after all, and Graydon spent several years at Apple working on it) but I think the Rust we got fills a niche that has been unfilled for far too long.
One of the interesting features he mentioned losing the argument over was “exterior iteration”, which caught my attention ’cause I’d never heard the term before. All he really says about it is “Iteration used to be by stack / non-escaping coroutines, which we also called ‘interior’ iteration, as opposed to ‘exterior’ iteration by pointer-like things that live in variables you advance.” So now I had to figure out what “non-escaping coroutines” were. I’ve heard of coroutines of course, but never really grokked them particularly deeply. They were always one of those concepts, like type theory notation or thermodynamics, that I kinda get in the abstract but never understood well enough to figure out how to do anything useful with it. Plus every language that says it has coroutines seems to mean something slightly different, and it’s not a small list – there’s Lua, Modula-2, C# with tasks, Python with generators, Rust with async, etc etc. So I figured it was time to try to figure coroutines out for real.
This led me down a fascinating and awesome rabbit hole for months. The short version is that Garnet isn't going to do interior iteration like Graydon wanted Rust to do. The long version is really cool though. Basically, what Graydon was referring to by "interior" iteration is making iteration happen via functions that take closures, like Ruby does it. So instead of writing in Rust:
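Say, a plain loop like this (a stand-in example, not the original snippet):

```rust
fn main() {
    let list = vec![1, 2, 3];
    // Exterior iteration: the loop construct lives in *your* code and
    // advances an iterator value step by step.
    for item in &list {
        println!("{}", item);
    }
}
```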
in Ruby you write:
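Roughly like so (again, a stand-in example):

```ruby
# Interior iteration: you hand `each` a block, and `each` runs the
# loop for you, calling the block once per element.
list = [1, 2, 3]
list.each do |item|
  puts item
end
```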
Ok, so instead of a language construct for a loop you have a function that takes a closure. Does this really change anything? Why yes, because what do you do if you want to break the loop early? If you write a `return` in the loop in the first example then it returns from the function containing the loop. If you write `return` in the second one, it... returns from the closure and continues the loop from the top. Right? So with interior iteration you don't have any built-in way to write `break` or `return` from inside a loop, just `continue`. Riiiight? Well, that seems like a strict downside, 'cause now if you want to write a loop that breaks early you have to make the inner closure return some marker value like a bool or an `Option` that tells the outer function "keep iterating" or "stop here". That is exactly what Rust iterator adapters like `map_while` do, and while it works, it makes nesting loops and keeping track of what is what rather harder. It seems like a strange thing for Graydon to yearn for.
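For concreteness, here's that marker-value protocol with Rust's `map_while`: the closure returns `Some` to keep iterating and `None` to stop.

```rust
fn main() {
    let v = vec![1, 2, -1, 3];
    // The Option return value is the "keep iterating"/"stop here" marker:
    // iteration halts at the first non-positive number.
    let taken: Vec<i32> = v
        .iter()
        .map_while(|&x| if x > 0 { Some(x) } else { None })
        .collect();
    assert_eq!(taken, vec![1, 2]);
}
```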
But no, that's not how the second example works at all! Fire up `irb` and try it out! Add a `break` or `return` to your loop and see what happens. Nest a few loops! The block being passed to `each()` can `break` or `return` through multiple levels of function calls; the `each()` method can say "I'm calling this thing, and if it calls `return`, it returns all the way to here even if there's multiple function calls between this call and that `return`". But how the hell does that work? To figure that out, let's back up a little.
So a function call is, at its heart, saving the current code’s state onto a stack, jumping to a new piece of code and doing some stuff, and then going back to where it came from and restoring its state. A coroutine is an extension of this: if calling a function is a jump, a coroutine is a jump and a stack switch. This explains why coro’s are both very simple and fundamental (it’s not actually doing anything particularly fancy, just loading a new stack pointer, so implementing them in the 1970’s languages where the term originates wasn’t a huge stretch), and also arcane as heck (they look a lot like hardware interrupts, and hardware interrupts are where the theoretical underpinnings of the Turing machine model of computation goes to die).
Ok, so a coro is a jump and a stack switch. What the heck does that mean? Well, first it means that calling a coro is usually a distinct operation from a function call, called `resume`, so it doesn't get mixed up with function calls, because resuming a coro does things that a function call can't. And it means that to exit from inside a coro, along with `return` you have a new operation called `yield`. Inside a coro you can call a function and return from it and it all works like normal, but also instead of `return` you can call `yield`, which hops back up the call chain to where the coro was `resume`'d from – similar to throwing an exception. So it's like, `yield` is `throw` and `resume` is `catch`? Not quite; unlike an exception, the call stack of the coro doesn't go away when you call `yield`, so if you call `resume` on the same coro again you just hop right back up the call chain to where you left off and keep executing just after the `yield`, with all the function calls between those two points still intact. Instead of unwinding the stack and destroying everything on it until you hit a `catch`, it keeps the coro's piece of the stack around so you can switch back to it later. That's why the stack switch happens: so you can pick up a coro where it left off. So a coro also has state, the state of a new stack that you can jump into and out of.
Or, you can never `yield` and just `return` normally, or you can `resume` a different coro and never return to the old one. Or, you can recursively `resume` yourself and build up a chain of `yield` calls that would have to be made to go back to where the coro was started. Or you can `resume` a coro and pass it another coro and tell it "`resume` this thing when event Foo happens".
This is about where most people's heads start hurting, so let me try to restate it.
In its most general form:
- A coro is a first-class object with state, like a closure is. It uses memory that has to be allocated and freed.
- Like a closure, a coro can be passed arguments or capture its environment, so it can refer back to objects that existed where it was created.
- Unlike a closure (usually), when you call a coro it might never return to the caller; it's entirely possible for it to just wander off down a diverging control flow path without ever calling `yield` or `return`, such as having two coro's that just repeatedly `resume` each other.
- And that coro may create new coros, pass them as arguments to each other, dynamically decide which ones to call, etc.
- This means that a coro is a piece of data that incorporates arbitrary control flow.
For the Schemers out there, if this starts sounding like continuations, it's not a coincidence; there's a hierarchy of power out there that basically goes `exceptions < coroutines < continuations`. (Apparently coroutines are equal in power to continuations that can only be called once, and I don't even know how to start thinking about that.) For the Lispers out there, if it sounds like Common Lisp "conditions", also not a coincidence; afaict they're exactly the same thing. For Go or Erlang people, your lightweight threads are basically implemented by the compiler as coroutines that get stuffed into a global data structure and selected by a scheduler. The compiler inserts `yield` calls for you at various places in every function, so each process/goroutine runs for a little while and then does its jump+stack-switch back to the scheduler, which chooses a new one and does a jump+stack-switch into it to pick up where it left off. In fact, you can implement coroutines with lightweight threads very easily.
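Here's a sketch of that trick in Rust, using OS threads and channels instead of lightweight threads (heavier, but the same shape): `resume` becomes a channel send plus a blocking receive, and `yield` is the mirror image. The `running_sum` coroutine is an invented example.

```rust
use std::sync::mpsc;
use std::thread;

// A "coroutine" built from a thread plus two channels: sending on the
// returned Sender resumes it with a value, receiving on the Receiver
// waits for it to yield one back.
fn running_sum() -> (mpsc::Sender<i64>, mpsc::Receiver<i64>) {
    let (resume_tx, resume_rx) = mpsc::channel::<i64>();
    let (yield_tx, yield_rx) = mpsc::channel::<i64>();
    thread::spawn(move || {
        let mut total = 0;
        // Each recv() is where the coroutine sits suspended until resumed.
        while let Ok(v) = resume_rx.recv() {
            total += v;
            // "yield": hand the value (and control) back to the resumer.
            if yield_tx.send(total).is_err() {
                break;
            }
        }
    });
    (resume_tx, yield_rx)
}

fn main() {
    let (resume, yielded) = running_sum();
    resume.send(3).unwrap();
    println!("{}", yielded.recv().unwrap()); // 3
    resume.send(4).unwrap();
    println!("{}", yielded.recv().unwrap()); // 7
}
```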
There's also a few more details I will mention just for completeness:
- A coroutine usually has an "exit" control flow distinct from `yield` that says "this coro is over and cannot be resumed". That's what happens if a coroutine calls `return` instead of `yield` at the toplevel.
- Coro's have nothing at all to do with threads... apart from them both involving a jump+stack-switch! With coro's that operation is done explicitly, when the code says to do it; with threads it is done entirely by surprise, when the hardware says to do it. So in some ways they are very similar but in others they are very different beasts.
Ok so this is all awesome and brain-bending, but how are you actually supposed to use these? That's a very good question, and seems to be why so many variations of coroutines exist. Most of them have various limitations to specialize them for various purposes, and so you get a whole fucking zoo of distinguishing features:
- Asymmetric vs. symmetric coroutines – asymmetric are the more common variety currently; they have `yield` and `resume` as I've described, while symmetric ones only have `resume`. If you want something like `yield` in symmetric coro's, you build it out of `resume` and pass a parameter for where to yield to. They're equally powerful, you can implement either one in the other style, but people seem to like asymmetric ones more 'cause they act more like functions than goto's.
- Stackful vs. stackless coroutines – Stackful coro's are "normal" as described; stackless ones are basically Python's generators and can only call `yield` in the top-level function. This means they can't recurse, and also means that calling `yield` only ever yields through one function call of the stack; it can't do the jump-up-multiple-levels-of-calls that stackful coro's can. However, this also means that stackful coro's may need to allocate their own dynamically-sized stack, while stackless ones can always be turned into fixed-size objects at compile time. (I have a hunch that this is incomplete, and you can often make a coro fixed-size without restricting it to only yield from the top level, but I haven't dug into that too much.)
- Escaping vs. non-escaping coroutines – In GC'ed languages like Lua or Common Lisp, coroutines can capture their environment like closures can, and so can refer to variables on the stack of the function where they were created. Just like closures! And like closures, those coro's can be returned or passed to other functions or stuffed into arbitrary data structures, escaping the place they were created, so they have to be able to bring the fragments of that environment with them. So when Graydon was talking about "stack / non-escaping coroutines", he was talking about coroutines that are limited so they can't do this; these coroutines are essentially limited like Rust's closures are: you make sure a coro either can't outlive the function that called it, or there's some kind of explicit copy/move/GC/whatever involved to make sure the shared bits it uses stay alive as long as it needs them to.
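The fixed-size-object point is easiest to see by hand-rolling what a compiler lowers a stackless generator to: the whole coroutine becomes a small state machine. (This Rust sketch, with its invented `Counter` type, is roughly what generator compilation produces for you.)

```rust
// A stackless coroutine as a fixed-size state machine: `resume`
// advances it one yield at a time. This one "yields" 0, 1, 2.
enum Counter {
    Start,
    At(u32),
    Done,
}

impl Counter {
    fn resume(&mut self) -> Option<u32> {
        match *self {
            Counter::Start => {
                *self = Counter::At(1);
                Some(0) // first "yield"
            }
            Counter::At(n) if n < 3 => {
                *self = Counter::At(n + 1);
                Some(n) // subsequent "yields"
            }
            _ => {
                *self = Counter::Done;
                None // "exit": cannot be resumed again
            }
        }
    }
}

fn main() {
    let mut c = Counter::Start;
    let mut got = Vec::new();
    while let Some(v) = c.resume() {
        got.push(v);
    }
    assert_eq!(got, vec![0, 1, 2]);
}
```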
There’s probably more variations, and I’m not even going to try to go into promises/async/generators etc and what they actually mean. Right now, if you want to play with a language with the most archetypical, powerful and interesting coroutines around, check out Lua. If you want to learn more about the engineering and theory, check out the paper Revisiting Coroutines which is very easy to read, optionally followed by Coroutines in Lua which has a bit more math in it. (It’s not a coincidence that both were written by Lua implementers.) I feel like I’ve seen a third paper somewhere that’s somewhat newer, but I can’t find it?
Ok, so coroutines are all very cool, but still… so what?
Good question!
Oh right back to Garnet
Graydon’s original suggestion for non-escaping coroutines was to use them mainly for iteration, like Ruby does. Eeeeh, IMO iteration isn’t exactly a difficult problem that needs lots of interesting innovation done to it. Garnet is very goal-driven; if I’m going to consider adding coroutines to it, they need to have a purpose, so what else might we use coroutines for?
- State machines
- State machines
- State machines
- State machines
Ok, that’s a bit tongue-in-cheek, but it really is the essence. There are just a few kinds of state machines that are so common and special-purpose that they have their own names, so it’s more useful to think of it as:
- Iterators
- Error handling (especially involving transient errors or retries)
- Non-blocking I/O
- All other state machines
Great, if iterators aren’t too interesting and Garnet isn’t doing much exception-style stack-unwinding error handling due to all the various well-known issues with it, are the other use cases valuable enough to add a new feature? Specializing coro’s for non-blocking I/O pretty much gets you Rust’s `async`, and “do not have Rust’s `async`” was one of the OG goals for Garnet. I used to feel like Rust’s `async` was just a big heckin’ mess of unnecessary complexity, but withoutboats has written some compelling stuff on why it looks the way it does, and so I now think it’s a lot more of a useful local minimum. It contains lots of subtle details around moving values, ownership, and control flow which interlock to make it work, and you can’t really change much about one part without needing to change all the others. Still, I’d rather have a general-purpose tool than a special-purpose one, and Rust’s `async` is pretty lame compared to real coro’s, so maybe we can do better/nicer?
First, let’s get one thing clear: using coroutines for state machines is a feature of convenience, not necessity. If I understand correctly, any coroutine can be rewritten as a normal closure that takes a struct representing its current state and returns a new instance of that struct, optionally plus a global stack for the context of “where does `yield` yield to”. The thing is that keeping track of it all without the stack-switch part of the equation built in is just a gigantic pain in the ass, and doing it that way doesn’t compose.
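That rewrite is easy to show on a toy example (names and shapes here are mine, just a sketch): the generator below and the step function after it compute the same thing, with an explicit state dict playing the role of the suspended stack frame.

```python
def countdown(n):
    # Coroutine version: the suspended frame *is* the state.
    while n > 0:
        yield n
        n -= 1

def countdown_step(state):
    # Explicit-state version: take the state, return (output, new_state).
    n = state["n"]
    if n <= 0:
        return None, state
    return n, {"n": n - 1}

assert list(countdown(3)) == [3, 2, 1]

# Same result, but now *we* have to thread the state through every call,
# which is the part that stops composing once the machine gets bigger.
outs, st = [], {"n": 3}
while True:
    out, st = countdown_step(st)
    if out is None:
        break
    outs.append(out)
assert outs == [3, 2, 1]
```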
For example, writing non-blocking I/O in Rust without `async` isn’t that hard; whip out `mio` and just get to work. It’s a little tedious but far from difficult. You might have multiple states in a handshake protocol or error states or such, but writing those singly is also pretty easy. Boiler-plate-y, but I usually prefer simple boiler-plate to complicated hidden logic anyway. What gets difficult is composing these state machines, so that if you’re writing say a network protocol, you can have one portion that says “initial handshake”, one portion that says “talk to new peer of type A”, and a different portion that says “talk to new peer of type B”, then more portions that say “talk to previously-known peer of type A/B”, and so on and so on. It all becomes a big morass of state enums and fragile function calls back and forth dispatched by various giant `match` statements, and you’d better not forget any of them or mix them up or else you get subtle hard-to-spot bugs. Then if you need to add a new peer of type C with its own protocol variation? Be prepared to rip apart and replace big chunks of your existing code, and basically re-analyze the entire thing to make sure it all still works. Using coro’s instead, where each coro is a state and each `resume`/`yield` call is a state transition, is way nicer. I have some very ramble-y notes on how it looks to implement a small but non-trivial state machine this way, and it works out surprisingly well.
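The composition point can be sketched with Python generators (an illustrative made-up protocol, not from my notes): each protocol phase is its own coroutine, and delegation composes them with no central state enum or dispatch `match` anywhere.

```python
def handshake(peer):
    # One self-contained phase of the protocol.
    yield f"hello {peer}"
    yield "negotiate"

def talk(peer):
    yield f"data for {peer}"

def session(peer):
    # Composition is just delegation: adding a "peer of type C" phase
    # is one more `yield from` line, not a rewrite of a big match.
    yield from handshake(peer)
    yield from talk(peer)

assert list(session("A")) == ["hello A", "negotiate", "data for A"]
```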
Sounds good, right? Let’s do it! Wellllll… unfortunately there’s some wrinkles. Garnet has no GC; like Rust, the memory management the language understands is purely RAII+borrow-checking. So if it has coroutines, it is up to the user to tell the compiler whether a coro can live on the stack, in a `Box`, in an `Rc`, or something else. This is just like Rust’s closures, so it’s not an unsolvable problem, but with coro’s it’s way nastier. Borrow and drop checking works by analyzing the static control flow of the program and making sure there’s no code you can reach where Bad Stuff happens. Coroutines… are data that represents arbitrary control flow. So without getting too deep into the (fascinating) weeds about it, making a borrow checker that can try to figure out what the hell a coroutine is borrowing and how long those things live is a pretty hairy problem. Plus, remember that bit about a coro being a jump+stack-switch? The “stack switch” part means that creating a coro often means allocating dynamic memory, which Garnet needs to be able to function without. Stuff that lives on the stack needs to be fixed-size, so can you statically analyze a coro to make sure it can be fixed-size? Again, yes, you can do this like you can with closures, but that gets you the stackless coro’s described above, which really just… aren’t very useful in comparison to stackful ones. The best description I’ve heard for them is “weaksauce”. They work for promises or iterators but are way less useful for whole-hog state machines, so it’s really not very appealing.
And that’s without the whole morass you get with `Pin` in Rust’s async functions! Which is not a small deal, because if you don’t want every single coro to be boxed, then that means when you move them they move; you’re not just copying a pointer from one place to another. Which means you have the whole self-referential struct problem, but also you can have one object on one stack pointing to a different object on another stack, and that stack may suddenly get picked up and put somewhere else in memory. And unlike the call tree of function calls, because you can pass coro’s around as objects and stuff them into data structures and whatever, instead of a tree or DAG of function calls you can get a call graph with cycles in it that are constructed at runtime. And so as far as I can tell, when your compiler tries to figure out where to insert the `drop()` calls for you, there will always be cases where it just cries. This isn’t a problem in GC’ed languages because the GC has the power to move an object in memory and then fix up all the pointers pointing to it; manual memory management literally can’t do that. If you tried then you’d end up with 90% of a GC already. (Which sounds pretty cool now that I mention it, but not something for Garnet.) So if you want a coro that can be passed around as freely as it is in Lua or such, it would almost always need to be boxed.
So… yeah. Is this a solvable problem? Can we make:
- some sort of coroutine
- in a borrow-checked language without automatic memory allocation or a GC
- that can live on the stack and be copied around
- that isn’t weaksauce
- that can create and call new coro’s without having to box every single one
- and is still pretty nice to actually use?
I mean… probably? Eventually? If you took withoutboats, burntsushi and a bunch of other Rust lang design gurus and locked them in a room for a year or two to come up with a clean-slate design, I’m like 80% sure they’d come up with something that works really well. But Garnet’s goals are “small and obvious” and, more importantly, “don’t be a research project”. To me the dynamic control/data flow in coro’s makes them mesh really well with a garbage collector, and looks really hard without one. I still need to do a writeup on all the details to explain how I think about this, but there’s a lot to it and it’s not sorted very well in my head and I would have to spend like 15 hours making diagrams to think through all the various cases. So I haven’t gotten around to it yet.
So, unless an epiphany hits that makes everything fall together nicely without becoming massively more complicated, Garnet will not have coroutines.
wat
Yep! Sometimes you spend 3 months thinking real hard about something and then decide it’s not the path to take. This is kinda why I didn’t want Garnet to be a research project, but sometimes the allure of shiny objects is just too much to resist. I can’t call it wasted time though; I learned a shit-ton, and tied my brain up into many interesting pretzels that forever changed how I think about code. Don’t get me wrong, I would love it if Garnet had coro’s, but I’m not going to try to redesign the language around them any time soon.
Also, honestly the most useful part of coroutines, and also the hardest, is “writing general-purpose, composable state machines”, which doesn’t necessarily need to overlap with “coroutine” at all. There’s lots of ways to write state machines! So I’ve kinda pondered what it might look like to have “state machine” be a primitive of a programming language, like for loops or structs. Maybe coro’s are just too low-level, or the wrong abstraction, with all their detailed interactions with what they borrow on which stack and how the control flow interacts? If you just want to represent states and transitions in a way that’s easy to express, what would it look like? I think this idea has a lot of promise, but again, Garnet isn’t a research project, so it’s a problem to think about in the future once most of the current goals are sorted out. Maybe the next language!
So what’s left?
All the shit I don’t want to deal with but have to, alas.
Monomorphization
This is one of those bits of language design and compiler implementation that not enough people talk about, as far as I can tell, though that’s also not the same as nobody. It’s basically the process of taking a general-purpose function `fn append(&mut Vec<T>, T)` and turning it into specialized variants `fn append(&mut Vec<i32>, i32)` and `fn append(&mut Vec<SomeStruct>, SomeStruct)` and so on, so that you can actually compile them to efficient machine code without tons of pointer indirection. It’s very simple in concept and a total pain in the ass in reality, but at least I’m not the only one who thinks so. It’ll get done, it’s just tedious and error-prone. Basically most of the worst parts of type-checking without the payoff.
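The simple-in-concept part looks something like this toy sketch (a made-up illustration, nothing like garnetc’s actual pass): collect the concrete types used at call sites of a generic function, and stamp out one specialized copy per instantiation.

```python
# Hypothetical template for the generic fn; real monomorph works on an IR,
# not on strings, which is where all the actual pain lives.
GENERIC_TEMPLATE = "fn append_{ty}(v: &mut Vec<{ty}>, x: {ty})"

def monomorphize(call_site_types):
    # Deduplicate while preserving order, so each concrete instantiation
    # is emitted exactly once even if it's called from many places.
    seen = []
    for ty in call_site_types:
        if ty not in seen:
            seen.append(ty)
    return [GENERIC_TEMPLATE.format(ty=ty) for ty in seen]

sigs = monomorphize(["i32", "SomeStruct", "i32"])
assert sigs == [
    "fn append_i32(v: &mut Vec<i32>, x: i32)",
    "fn append_SomeStruct(v: &mut Vec<SomeStruct>, x: SomeStruct)",
]
```

The tedium comes from everything the sketch skips: nested generics, functions instantiating other generic functions, and making sure every call site points at the right specialized copy.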
The rest of modules
Sufficiently-trivial ML modules pretty much work in Garnet right now, and once monomorph works we’ll be able to write slightly-less-trivial functors that let you do actual type metaprogramming along the lines of Rust’s traits. Simple ones, at least, until I explore the boundaries more and figure out what else is missing. One thing is that modules can’t currently contain new type declarations, which is a limitation similar to having Rust traits without associated types, so there’s useful stuff they either can’t express or can express only very awkwardly. And then there’s the whole higher-kinded type question. And then there’s the whole modular implicits thing I want to do to try to make modules less of a pain in the ass to actually use, which may or may not need some special cases or nuanced design work to make it actually nice instead of merely less terrible. And then it would be really nice to have some notion of dynamic dispatch built into the language, ’cause it nicely avoids a lot of writing of giant match statements of function dispatches, but I haven’t even really begun to think about how to do that.
There’s lots of research papers about variations and nuances of ML modules and I really should read more of them than I have, but most of them seem to stop at the math or the simplest-case algorithms and say “so this proves all the rest of the stuff works”, but nobody heckin’ tells you how all the rest of the stuff works and what all the implications are. At least not outside of the OCaml compiler codebase, which is honestly another thing I should read more of. Anyone know of any good, thorough review papers on this topic that don’t descend into 40 pages of math? Or heck, even a mediocre one would be nice.
Borrow checker
Oh man, this is gonna suck and there’s no two ways about it. There’s very few people in the world who can tell you how to actually write a full borrow checker, and I am not one of them, but I’ll have to become one.
But, you can get a lot of the most immediately useful bits of borrow checking just by doing basically what Austral does and making non-escaping borrows, which seem actually pretty easy to implement. Basically with them you can write a function `fn foo(x: &Thing)` that borrows a thing and does stuff with it, but you can’t write a function `fn foo(x: &'a Thing) -> &'a OtherThing` that returns part of what it borrowed, or write a type `struct Thing<'a> { x: &'a OtherThing }` that borrows another value. That’s not really sufficient to be a full-blown solution IMO, but it’s a pretty okay place to start.
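The core of that check is small enough to sketch (toy AST, entirely hypothetical, nothing like garnetc’s internals): a borrowed parameter may be used inside the function body, but nothing derived from it may flow out through the return value.

```python
def check_non_escaping(func):
    # func: {"borrowed_params": [names], "returns": expr}
    # expr is ("param", name) for returning a parameter, or ("value", literal).
    kind, payload = func["returns"]
    if kind == "param" and payload in func["borrowed_params"]:
        return False  # the borrow would escape via the return value
    return True

# fn foo(x: &Thing) that returns a fresh value: fine.
assert check_non_escaping({"borrowed_params": ["x"], "returns": ("value", 42)})
# fn foo(x: &'a Thing) -> &'a ... returning its borrow: rejected.
assert not check_non_escaping({"borrowed_params": ["x"], "returns": ("param", "x")})
```

A real implementation also has to track borrows flowing into struct fields and other locals, but the “does it escape through the return?” question stays this shape.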
I occasionally dream of just including Datalog, Prolog or polonius whole-hog into `garnetc` and waiting for a random type-theory angel to wander along and use it to write a borrow checker for me, but that’s not a terribly actionable plan. Wandering type-theory angels usually have their own weird side-projects they’re working on; to coax them to contribute to your project you have to put in the work to make your project interesting to them first.
Pattern matching
This is a big, complex feature of its own, which I haven’t even started to touch for real, but it’s also pretty well-bounded. I don’t really have anything I want Garnet to do that Rust or Erlang can’t do already, so it’s just a matter of figuring out how to implement it. From what I’ve seen that’s its own side-quest that can consume infinite amounts of interesting time and energy, but a basic implementation can just turn a `match` into a series of `if` statements that are inefficient but correct and get on with life. There’s still questions that will need to be answered around exhaustiveness checking and overlapping match arms and such, but I think the simple-and-obvious solution will cover like 95% of what you usually need.
I’m probably wrong about that, now that I think about it. Look, let me dream!
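The naive lowering described above can be sketched like this (a hypothetical mini-IR of my own invention, not garnetc’s): each arm becomes one `if` that tests the pattern, tried top to bottom, with a runtime error as the non-exhaustiveness backstop.

```python
def lower_match(scrutinee, arms):
    # arms: list of (pattern, action); a pattern is a constructor tag
    # name like "Some", or "_" for a wildcard.
    tag, payload = scrutinee
    for pattern, action in arms:          # a chain of ifs; first match wins
        if pattern == "_" or pattern == tag:
            return action(payload)        # "bind" the payload into the arm
    raise ValueError("non-exhaustive match")

result = lower_match(("Some", 5), [
    ("Some", lambda x: x + 1),
    ("None", lambda _: 0),
])
assert result == 6
```

Inefficient (no decision trees, no jump tables) but correct, which is all a first implementation needs.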
The rest of the owl
The broad strokes of Garnet’s design are mostly settled in my mind but there’s tons more details that can be added more or less independently. This is actually where most of the actually interesting stuff goes, but I have to get the foundations down real good before I pay too much attention to them. So this includes things like:
- Properties. Probably 80% of the framework needed for function properties exists already, but actually doing anything with them is still TBD. Data/type properties are still mostly nebulous.
- Bit patterns and data layouts, and how they mesh with pattern matching and type conversions for real.
- How do linear types actually work? Do we have a `Drop` module or property, or something else?
- How do moves and copies actually work? Do we have a `Move` or a `Pin` property, or something else?
- Unsafe and pointer semantics. This is a Big Deal, and also a very important one, since one goal of Garnet is to make writing unsafe code less hazardous than it is in Rust. But it’s also low on the priority list for having a basic compiler work.
Conclusions
If I have monomorph and a solid module system done by the end of 2024, I’ll be pretty happy. If I have a non-escaping borrow checker too, or a prototype of something more sophisticated, I’ll be pretty ecstatic. Maybe then I can start designing more of the cool stuff.