I had another bad idea that I’ll never finish, so let’s write it down! The idea is basically to go through Pascal, Modula-2 and Oberon spec by spec, and see how each language evolves from the previous one. A bit of history of the inside of Niklaus Wirth’s brain, or something. We’re NOT going to cover any modern extensions or such, I can’t be arsed right now and at least for Pascal that is a big field I don’t particularly want to attack.

Disclaimer: I don’t know jack about any of these languages.



Pascal

Most of the background info I have on Pascal comes from https://wiki.freepascal.org/Standard_Pascal, which probably knows what it’s talking about. The original book about Pascal was written in 1974, then an ISO standard was written in 1983, ISO 7185. It was revised in 1990 to correct and clarify things without actually changing or adding anything. I will go off the 1990 ISO spec. Find it here: https://archive.org/details/iso-iec-7185-1990-Pascal

Diving in

Right off the bat, the standard states it will not specify “the size or complexity of a program and its data that will exceed the capacity of any specific data processing system or the capacity of a particular processor, nor the actions to be taken when the corresponding limits are exceeded” – in other words, what happens when malloc() – ahem, new() – fails is undefined?

There is a strict definition of an “error”, which is allowed to be undetected. Ie, accessing past the end of an array without a runtime bounds check is an error, but the implementation doesn’t have to insert the bounds check for you. It can if it wants to though.

Wow, the standard is dry as heck, even compared to other programming language specs like the Python or Rust spec or R5RS. Guess those aren’t ISO though. On the other hand, the standard requires implementations to either a) handle an error somehow at runtime or compile time, or b) document the fact that it isn’t handled, which is frankly a piece of rigor I approve of. On the gripping hand, there seems to be a formal split between “level 0” implementations and “level 1” implementations, and the only difference between them is something about how arrays are passed as function args. …Now that I look at the whole thing again, this distinction is never mentioned again.

Another example of functional dryness: I bet entire programming language standards get written these days that never use the word “denotation”. On the other hand, there’s an appendix full of nothing but cross-references from language concepts to various sections (no PDF search in 1990, it was probably originally a Postscript file intended to be printed out), another appendix with a list of every operation that may cause an error, another appendix for every feature that may be left implementation-defined…

Okay. Case is insensitive, even in variable names and stuff. Identifiers can be any length.

There are directives; the only one the standard defines is forward, for forward refs, though it also says that a common extension is external for external symbols.

Labels are numbers 0 to 9999, but no mention of what they get used for yet.

Strings are delimited by '. ' can be escaped by doubling it, which is defined in a sorta odd way. Valid characters and character encodings are implementation-defined.

Block comment opening chars { or (*, closing chars are } or *). You can’t nest comments, though you can do { comment *) or vice versa. No line comments.

There’s an up-arrow symbol that generally gets ascii-fied as ^. There are also official alternative tokens: ^ -> @, [ -> (., ] -> .). As far as I can tell, nobody actually uses the alternative tokens; I’ve never seen a pascal program which uses @T for pointers, always ^T.

Okay, we get to the real meat in chapter 6.2. A block is a list of declarations of labels, vars, constants, functions, procedures, etc that ends with a list of statements to be executed “upon an activation of the block”. They have to be in order, label -> const -> type -> var -> functions -> statements.
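
To make that ordering concrete, here’s a toy program (mine, not the standard’s) with every section present, in the mandatory order:

program Ordering(output);

label 99;
const Size = 3;
type Row = array [1 .. Size] of integer;
var r : Row;

procedure Clear(var x : Row);
var i : integer;
begin
    for i := 1 to Size do
        x[i] := 0
end;

begin
    Clear(r);
99: writeln('done')
end.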

This doc very much pretends that it’s defining a notation for some sort of math instead of a program that will be executed on a machine that has some kind of sensible environment. It’s a little weird to read. It’s written in such standards-ese that it’s really hard to figure out what’s going on past the basics.

Scope exists. Declaration of a name has to come before use.

Each “activation of a block” is explicitly independent of any others. This means each call of a procedure gets its own fresh locals, and if you run the program twice, there’s no shared state between different invocations of a program. Programming in 1973 must have been some wild shit.

Three types of types: simple (numbers, etc), structured, and pointer.

simple-type-identifier = type-identifier .
structured-type-identifier = type-identifier .
pointer-type-identifier = type-identifier .
type-identifier = identifier .

*squinting* Ooookay… Thanks for clearing that up, syntax definition.


Ah, types. Okay we’re in some more meat now. A simple type is explicitly an ordered set of values. There’s enum and subrange types, and each enum value maps 1-to-1 with an integer. Required built-in simple types are integers, reals, bools and chars. Bools are explicitly 0 for false and 1 for true – they’re simple types so they have to have an ordering relationship. Integers are signed by default? It defines chars as orderable and some constraints on the ordering you can rely on, but doesn’t define a character set or encoding.

Can’t assign values to enum members: “Enumerated type constants are ordered by the sequence in which they are defined, and they have consecutive ordinal numbers starting at zero.” Ok, that’s the first real missing feature I’ve seen so far that I think I would want.

Subranges are 1..100 or red..green. Can be any ordinal type but both ends must be the same type, natch, and if I’m reading correctly they define inclusive ranges [1 ... 100] instead of the now-typical half-open ranges [1 ... 101). …on the other hand this does avoid the “how to have a range ending at INT_MAX, you need to write INT_MAX+1 as the end value” problem that Rust requires ..= for.
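
The ordinal stuff so far in one toy declaration block (my example, not the standard’s):

program Ordinals(output);
type
    Colour = (red, green, blue);   { ord(red) = 0, ord(green) = 1, etc }
    Warm = red .. green;           { subrange of an enum }
    Percent = 0 .. 100;            { inclusive integer subrange; 100 is a legal value }
var
    c : Warm;
    p : Percent;
begin
    c := green;
    p := 100;
    writeln(ord(c))                { prints 1 }
end.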

Structured types can be arrays, records, sets or files. There can be a packed designator but it’s just a hint to optimize for size rather than speed.

Each array can be indexed by an ordinal type (integer, char, range, etc). You can have multi dimensional arrays or you can just nest them, so you could do array [Boolean] of array [integer] of real or array [Boolean, integer] of real.
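
So these two declarations describe the same shape of thing (my sketch):

program Arrays(output);
type
    Grid1 = array [Boolean, 1 .. 3] of real;
    Grid2 = array [Boolean] of array [1 .. 3] of real;
var
    g : Grid1;
begin
    g[true, 2] := 1.5;    { same element as g[true][2] }
    writeln(g[true, 2] : 1 : 1)
end.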

A record consists of a fixed part and/or a variant part. I… think that combines struct and tagged(?) union functionality? Yeah looks like it.
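
Roughly like so: a fixed part shared by every value, plus a tag-discriminated variant part (my example):

program Shapes(output);
type
    Kind = (circle, rect);
    Shape = record
        x, y : real;              { fixed part, common to every Shape }
        case k : Kind of          { variant part, discriminated by the tag k }
            circle : (radius : real);
            rect : (w, h : real)
    end;
var
    s : Shape;
begin
    s.x := 0.0; s.y := 0.0;
    s.k := circle;
    s.radius := 2.0;
    writeln(s.radius : 1 : 1)     { prints 2.0 }
end.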

Sets are sets of a particular type. Not much about them yet. Operations are defined later.

Files are listed as files of a particular type. Real, integer, structure, whatever. A modern spec would want to specify some kind of interoperability for the encoding of these types. Not here! Heterogeneous networks didn’t exist so much in 1973, and record-based file systems did. They’re smart enough to disallow files of type file though, hah. Not much is defined about how files work besides the absolute basics, which is defined math-style as a sequence of values plus a cursor. Files have two modes, “inspect” and “generate”, aka “read” or “write”. “The explicit denotation of the values Inspection and Generation is not part of the Pascal language.” Oh, there is also a special text file type, which is a sequence of lines separated by a nebulous “end of line” marker. They call this the magical text file. Anything you can do to a file of char can be done to a text file too, plus a few more things.

Pointer types are a nil value or a pointer to a non-nil value. They are written ^integer, amazingly not pointer to integer or such. “Since the nil-value is not an identifying-value, it does not identify a variable. The token nil shall denote the nil-value in all pointer-types.” So nil is a special magical token that can be any type of pointer, not an identifier. Fuck you, C, and your (void*) 0.
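
And nil really does fit any pointer type (my sketch):

program Pointers(output);
type PInt = ^integer;
var p : PInt;
begin
    p := nil;       { nil inhabits every pointer type }
    new(p);         { allocate an integer and point p at it }
    p^ := 42;
    writeln(p^);    { prints 42 }
    dispose(p)
end.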

Types can be “compatible” if they are ranges or sets that overlap or are of the same underlying type? Not sure what the point of that is yet.

Types can be assignment compatible if they’re the same type (and not files), or the destination is a real and the source an integer (the integer gets widened to a real), or they are both ordinal types and the value falls within the destination’s range, or they are both set types and all the members fit, or they are both compatible string types. …I don’t think we’ve talked about strings yet. Hmmm.

“The occurrence of an identifier in the identifier-list of a variable-declaration of the variable-declaration-part of a block shall constitute its defining-point as a variable-identifier for the region that is the block.” Oh good, thanks for clearing that up.

Arrays can be indexed. You can index a multi-dimensional array foo[i][j] or foo[i, j] and it does the same thing. Evaluation order of i and j is explicitly implementation-dependent.

Records can have their fields designated by name, and it chains as you would expect, so you can do foo^.bar^.baz or a[b].c.

You can have pointers to files, they are called “buffer variables”. “It shall be an error to alter the value of a file-variable f when a reference to the buffer-variable f^ exists. A reference or an access to a buffer-variable shall constitute a reference or an access, respectively, to the associated file-variable.” So no touchy the underlying file when a pointer to the file exists. What happens if you have two pointers to the same file? Does creating the second one constitute “altering the value of the file-variable f”? Hah, doesn’t say! Found another hole!

The main thing I may be learning is that reading the standard is a terrible way of learning anything about Pascal. Apparently the Free Pascal docs are the way to go. But I’m curious specifically about what standard Pascal is like, not what Free Pascal is like. So far standard Pascal is very explicitly not a language, but a framework within which to build a family of languages which will never need to inter-operate.

Procedures and functions

The standard doesn’t tell you what a procedure is. I guess you already know! This is, however, nearly the first portion that actually gives you significant example code.

Procedures can be nested. Take that, C! I… have no idea if they can access local variables from the enclosing procedure though. (It could be perfectly valid to do this without heap-allocated closures, as long as you can’t pass procedure pointers around.) “The occurrence of an identifier in the procedure-heading of a procedure-declaration shall constitute its defining-point as a procedure-identifier for the region that is the block closest-containing the procedure-declaration.” There’s like three more paragraphs of that stuff. I’m pretty sure what I want to know is somewhere in those paragraphs, but damned if I can figure out what it’s actually saying. No word here on whether procedures can recurse, either. (Note from the future: You can nest procedures, they can access local variables from the enclosing procedure, but they aren’t closures and inner procedures can’t be passed as function arguments.)
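
So, going by that note from the future, something like this should be fine: Inner reads and writes Outer’s local (my sketch):

program Nesting(output);

procedure Outer;
var counter : integer;

    procedure Inner;        { nested; sees Outer's locals }
    begin
        counter := counter + 1
    end;

begin
    counter := 0;
    Inner;
    Inner;
    writeln(counter)        { prints 2 }
end;

begin
    Outer
end.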

Functions are identical to procedures but return a value, because that’s an important distinction and you totally can’t define a unit type with structs. (Pro tip, you can.) Old fashioned. You return values by assigning to a magic var with the name of the procedure, instead of having a return statement. You must assign to this var at least once. No word on whether control flow has to actually pass through that assignment though, so I believe this is technically valid as written:

function Heck (explode : Boolean) : real;
begin
    if explode then
        Heck := 12.3
end;

Lots of extraneous semicolons around here guys. I’m not a huge fan. Functions can explicitly be recursive, though you may need a forward decl for them.

Function params (I’ll just call em “function params” instead of “function or procedure params” or “formal parameter”) can be marked as var, which I think makes them essentially inout. You can’t use a field of a packed struct as a var parameter, so no calling Heck(foo.bar) if foo is a packed struct. Seems an odd detail; can’t the compiler just generate the extraction and re-packing code at the site of the function call, so the receiving function doesn’t have to care about it? Or is it assumed that these are implemented via pointers? Could still unpack the value into a local, pass a pointer to it, and then pack it back up after the function call, but that’s considered “advanced” even these days, so oh well.
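
The basic var shape, for reference (my sketch; a plain, non-packed field would also be fine as the argument):

program VarParams(output);
var x : integer;

procedure Bump(var n : integer);    { var = in-out; n aliases the caller's variable }
begin
    n := n + 1
end;

begin
    x := 41;
    Bump(x);        { the argument must be a variable, not an expression }
    writeln(x)      { prints 42 }
end.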

I… THINK you can pass functions as function params? That’s interesting. Am I reading that right? I think I have to be.

The syntax and semantics of arrays and ranges as function args is actually pretty complicated. You simplify a fair amount by saying “arrays are always indexed by unsigned integers between 0 and N”, instead of “arrays can be indexed by arbitrary enums”. The latter also basically forces enum values to be contiguous, rather than sparse sets. This is actually a worthy design lesson to consider. How often do you want to index a dense collection by a char or an enum variant or something? Not that often, but more than never.

Files are once again called out as a bit of a special case in array function params. “The fixed-component-type of a value conformant array shall be one that is permitted as the component-type of a file-type.” So, basically arrays cannot be of type file. I feel like this is kinda coming towards an idea of “sized” vs “unsized” types, from an odd angle. On the other hand, “string terminated by a newline” is a valid component of a file type. Haven’t heard it say much about strings so far. Did I miss it? Array function params continue to be complicated.


Hoo boy, file handling stuff. There’s rewrite() and reset() functions and I have no idea what they do, but their pre and post conditions are defined. I could probably puzzle it out if I put ten minutes into it, which I will do because it’s bugging me. Cripes, okay, it’s in section 6.4: files are defined as having an L, R and M component, which is just the least useful thing ever. Ok, M is the mode, read or write… ahem, Inspection or Generation. L and R are each single values of the type contained in the file. Sooooo they’re the “previous item” and “next item”, gotcha. What about empty items? They also have sequences of things where S() is the empty sequence; they define all that shit earlier. Wait, no, I was incorrect: L and R are both sequences. That makes way more sense, now we actually have a sane file cursor represented. Sequence L is the stuff you have already read, sequence R is the stuff still to read, and RTL scripts can just get bent. Augh. Okay, NOW it makes sense.

Much proof very set theory wow.

Okay, rewrite() basically rewinds a file to the beginning, truncates it, and sets it to write mode. reset() rewinds a file to the beginning without truncating it, and sets it to read mode. get() gets a char and presumably returns it, but they don’t give function signatures for this, so as written it could just shove the char up your ass instead. put() writes… fucking something to the end of the file. It’s invalid to call get() when there’s nothing to read or put() when the file cursor isn’t at the end of the file. I have to salute them for expressing this very clearly entirely in math, but couldn’t they have just fucking said this instead of giving me a puzzle to figure out? I think that get() and put() might magically operate through matching file pointers/buffers, but I’m done trying to decode this.
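
My reading of how the four fit together, as a sketch using a local file of integer (so no opening or closing needed):

program FileDemo(output);
var
    f : file of integer;
    n : integer;
begin
    rewrite(f);     { empty the file, mode := Generation }
    f^ := 7;        { stage a value in the buffer variable... }
    put(f);         { ...and append it }
    f^ := 8;
    put(f);
    reset(f);       { rewind, mode := Inspection }
    n := f^;        { the buffer variable holds the next component }
    get(f);         { advance the cursor }
    writeln(n)      { prints 7 }
end.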

“NOTE: In order to facilitate interactive terminal input and output, the procedure get (and other input procedures) should be performed at the latest opportunity, and the procedure put (and other output procedures) should be performed at the first opportunity. This technique has been called lazy I/O.” So… buffered I/O? Or not? Sounds like buffered input and unbuffered output, which doesn’t seem to make sense. I think it’s trying to talk about the behavior of line-based terminals as we now understand them, but it’s hard to google lazy I/O without getting results talking about Haskell, so.

There’s a read() procedure which seems to magically take arbitrary numbers of args and read the contents of a file into them. So read(f, c1, c2, c3) reads the first 3 values of the file f into c1, c2 and c3. There’s a write() procedure which is likewise variadic. Seems odd; isn’t this what we have arrays for? This also means that the args to read() and write() are evaluated explicitly left-to-right, where I think they are not evaluated in any given order for normal functions.

No opening or closing files, I now notice. Presumably they’re platform defined. You could even have it so that files are just opened by the runtime based on the program invocation and passed to the program or whatever.

Memory allocation! new(p) allocates a new value of the type p is a pointer to(???) and points p to it. You can also do new(p, c1, c2, c3...), and if p points to a record with variants, the c’s are tag constants that select which variant to allocate space for (they don’t actually initialize anything). There’s also dispose(p), which works as you may expect, and dispose(p, c1, c2, c3...), which frees a variable allocated with the matching variant form. No word on error handling. Yeah, you can tell this was designed for education, not systems work.
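
A sketch of the variant forms, assuming I’ve read the rules right (the extra constants pick the variant; they don’t set the tag for you):

program Alloc(output);
type
    Kind = (circle, rect);
    Shape = record
        case k : Kind of
            circle : (radius : real);
            rect : (w, h : real)
    end;
    PShape = ^Shape;
var
    p : PShape;
begin
    new(p, circle);     { allocate space for the circle variant only }
    p^.k := circle;     { new doesn't set the tag; you do }
    p^.radius := 1.0;
    writeln(p^.radius : 1 : 1);
    dispose(p, circle)  { must name the same variant when freeing }
end.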

There’s built-ins for turning packed types into unpacked ones and vice versa. Don’t care.

There’s a small selection of math functions. They magically work on both integers and reals, for no particular reason, and abs() and sqr() return integers if given an integer argument and reals if given a real argument, because we can’t express that in the language itself.

There’s ord(), succ() and pred() functions that operate on any ordinal type, that’s neat. It’s an error if you call succ() on the last member of an ordinal or pred() on the first. Don’t see any way of checking whether you are calling it on the last or first member of an ordinal. Guess you just compare against the element you know is the first or last element, and hope nobody ever blindly reorders things.
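
So you guard by hand, something like (my sketch):

program Ordinal(output);
type Day = (mon, tue, wed, thu, fri);
var d : Day;
begin
    d := mon;
    if d <> fri then        { no built-in "is this the last one?" check }
        d := succ(d);
    writeln(ord(d))         { prints 1 }
end.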


Using an undefined variable is an error. Operator precedence is simple: not > multiplying > adding > relational. Any value can be treated as a value with a larger range.

Set constructors are written [foo, bar, baz]. [] is an empty set. You can have ranges in sets, such as [foo .. bar, baz].

Order of evaluation of a range foo .. bar is implementation defined.

/ divides two integers or reals and returns a real. div divides two integers and returns a truncated integer. Modulo is mod. No += or such. No bitwise operators either, now that I think of it.

+, - and * are set union, difference, and intersection.
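
All the set machinery in one place (my sketch):

program Sets(output);
type
    Day = (mon, tue, wed, thu, fri, sat, sun);
    DaySet = set of Day;
var
    work, rest, all : DaySet;
begin
    work := [mon .. fri];       { range inside a set constructor }
    rest := [sat, sun];
    all := work + rest;         { union }
    if wed in all - rest then   { difference, then membership test }
        writeln('midweek')      { prints midweek }
end.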

“The required constant-identifier maxint shall denote an implementation-defined value of integer-type. This value shall satisfy the following conditions. a) All integral values in the closed interval from -maxint to +maxint shall be values of the integer-type.” OOPS who sees the problem here? Hint: 2’s-complement. And what happens on integer overflow is unspecified.

Blah blah operators nothing to see here. Oh wait, you can use equality operators on strings, that’s neat. < and > too.

Statements may be prefixed by a label. Huzzah, we mention labels again, just fucking once! They’re goto labels, that’s all. No clear word about what happens if you jump into a loop or something, and I think it’s even allowed to jump between functions. Not sure.

If statements are not expressions, else statements dangle. Case statements don’t have a default branch; it’s an error if the value fed to them doesn’t have a label to go to. Lame.

Loops! Huh, Pascal does the same thing C does where loops just have a statement after them, and if you want multiple statements then you wrap it in a block. I didn’t know that.

There’s with statements that put a struct’s members into scope, didn’t know that either. Like so:

with date do
  if month = 12 then
    begin month := 1; year := year + 1 end
  else month := month + 1

{ is equivalent to }

if date.month = 12 then
  begin date.month := 1; date.year := date.year + 1 end
else date.month := date.month + 1

I/O functions, I’m kinda done with those. There’s some basic number formatting stuff in the output functions.

Each program starts with a name, which has absolutely no use inside the program whatsoever. There’s also some optional program parameters, the meaning of which is implementation defined. And… that’s about it!


Conclusions

  1. Pascal does not define a language. It defines a family of languages, with no pretense of inter-operation. It’s a basis for a language, begging for extension, and this is on purpose.
  2. The section of Pascal that the standard defines is pretty fine for basic programming education, and fuckin’ tragic for systems programming. Feels very much not intended to be complete. No way to override type conversions, lots of small footguns like case not having a default value or not having a way to find the last member of a set, I/O does magical stuff that the rest of the language can’t do, etc.
  3. Reading the standard is a terrible way of learning anything about Pascal. Apparently the Free Pascal docs are the way to go. But that documents specifically the FP sublanguage of Pascal, which I expect to be very different. It’s assuredly the language you actually want to use for anything real, but I’m doing this to learn about Wirth, not about Pascal.
  4. Actually the standard is pretty terribly written in general; contrast it with R5RS or WebAssembly for example. Not sure whether that’s because it’s an ISO doc, or that’s just how it’s written, or because the authors wanted to be very math-y and formal in how it is written, or because it was written in 1973 when nobody really knew what computers might look like in 20 years. Here in 2022, it’s pretty easy to make a good guess of what computers will look like in 20 years, even with a fair amount of wiggle room.
  5. There’s some bits that are more complicated than they should be. Mainly the desire to be able to define array indices as ranges, sets, characters, etc… That produces a fair bit of knock-on complexity involving how you write that shit down in non-trivial array types.
  6. The syntax is… needlessly fiddly. Not even in terms of keywords and stuff, mostly just the punctuation. Semicolons as separators, and having separators even after a begin ... end block and such, really gets a little wearing to read, let alone write.
  7. I actually kinda like the chonky “program, types, constants, variables, functions, program block” thing, it makes perfect sense for a one-pass compiler without a lot of smarts or available memory. When you’re reading or writing a short program then it’s really not much imposition at all. What about big programs? I can’t really say.
  8. Needlessly old-fashioned in some spots. Functions vs procedures for example.
  9. I still dislike assigning to a magic var instead of having a return statement.
  10. I kinda like the sets and ordinal types, though they’re more restrictive than I’d want. Maybe have enums with arbitrary unique values for each enum member, and also (optionally?) an ordering that is specified unambiguously. So you can both implement a set of all values in an enum as a bitset, or index an array by an enum, or have an enum that represents the bit sequence 0xFD00_0002 without having to define 0xFD00_0001 intermediate values.
  11. There’s lots of little bits and pieces that my Rust-trained brain considers under-defined, but honestly that’s okay for an educational language. There’s a few actual errors though: Integers must be within [-maxint, maxint], which means that you can’t actually write 2’s-complement MIN_INT. It never mentions what happens if you create multiple buffer-variables referring to the same file. It never says that the return variable has to be assigned through all control flow.
  12. Other small things I’d label mistakes, not actual errors. They’re designed that way on purpose but I think it’s the objectively wrong design even though it has some advantages. Ranges like 1..100 being closed instead of half-open, for example. Enumerations needing to be consecutive.
  13. Cripes I hope the Modula-2 standard is easier to get through, fuuuuuuuck.



Modula-2

Ok, it was created in 1978? Wikipedia says “between 1977 and 1985”, thanks Obama. Ok, the first report on Modula-2 was published in 1978, then there’s a series of books, “Programming In Modula-2”, with 4 editions, each describing apparently fairly minor corrections and updates. Then there’s the ISO version, ISO 10514, which has 1996 and 1998 versions. Then there’s extensions, ISO 10514-2 and ISO 10514-3, which we’re not gonna fucking touch.

Whatever happened to Modula-1? Ok, it’s just called Modula, and was published in 1975 but discontinued soon after. The paper describing it is here if you ever want to look it up. AND it comes free with some unknown editor’s(?) marginalia!

Are we gonna talk about Modula-3? Nah, not made by Wirth, though made by people who Wirth apparently knew and was probably buddies with. So, consider it an amicable fork.

First challenge: Finding a copy of the Modula-2 standard that I don’t have to pay $200 for. CodingFiend, the crusty old (and sometimes wise and always opinionated) Modula-2 curmudgeon on the PL Discord, says: “I bought my paper copy of the ISO standard decades ago, and paid something like $200. I felt ripped off. The standard was written in the Vienna style which was awful, and since neither M2 vendor (Stony Brook or p1 GmbH) exactly followed the ISO standard it was fairly pointless. The Book on M2 is close enough.” Well okay then. No idea what “the Vienna style” is but I recall mention somewhere of Modula-2 being the first programming language fully described by formal math in the standard, so it’s probably terrible.

Diving in

Ok, if I’m going to be using Programming In Modula-2 then that’s easier to get. Fourth edition, check. Wikipedia has a brief changelog of the differences in various editions and it doesn’t look like they are terribly significant, as far as I care.

Oh hey it’s written like an actually sane document instead of a psychopathic math paper. This… is not gonna be a fair comparison to Pascal, and also it is going to leave out some details I am Interested in. Well… I’ll try. I found some library reference docs that will hopefully fill in some of the gaps.

Okay first off, identifiers are case sensitive. Huh! Keywords must be upper case. Strings are… delimited by either single or double quotes, but have no escaping mechanism for quotes so you just don’t get to have a string containing a single and a double quote at the same time. …Huh. <> and # are both “not equal”, the latter presumably being an ASCII-ification of ≠, which to me feels like an author who had Opinions about what they actually wanted and also had arguments with existing Pascal users. Nothing else has two symbol options! (Wait, yes they do, & and AND, ~ and NOT. No | and OR though! Very curious.) Comments are (* *), which is fine. No mention whether or not they can be nested, hah.

So, operator precedence is Interesting. That’s one of the common critiques of Pascal, and moderately justified IMO. In Modula-2 there are still only a few levels of precedence, but they are:

  • NOT, parens, everything else
  • Binary *, /, DIV, MOD and logical AND
  • Binary +, - and logical OR
  • unary + and -

So math and logic ops share precedence, which is odd. No mention of comparison there either, I assume we’ll get to it. …Also no mention of XOR. Another symptom of Evil Abstract Math Thinking; admittedly outside of crypto or RNG code you really don’t need XOR very often. But… you still gotta have it there. It’s XOR, man, you can’t leave it out! Sometimes you need a good XOR’ing!

Ahem. Anyway…

Hmm, odd grammar artifact, you can never have two operators be adjacent; so x * -y is illegal. You have to say x * (-y). That’s actually one of those common Fiddly Bits of parsing expressions that tends to come up. If you use a Pratt parser or operator-precedence parser or something you can handle it pretty easily by having - be both a unary and binary operator with different precedences. Looking at the grammar in the book I think it should be totally possible to do for Modula-2, so I’m not sure what this is about. Upon inspection Oberon has the same artifact. Aaaaaanyway, DIV is integer division, / is float division.

Statements are separated by semicolons, not terminated by them. The book is very firm on that point; do I detect bitterness? But you can have an empty statement, so if you treat them like delimiters I think it still works? Still means you have to have END; after loops instead of just END. But it also means you do REPEAT ... UNTIL expr ; instead of needing a different terminator after the expr. So there’s that. …Huh, wait, it’s WHILE expr DO ... END, but REPEAT ... UNTIL expr. So WHILE occurs while expr is true and REPEAT occurs while expr is false? That’s weird. There’s also a LOOP statement that runs infinitely and is broken out of with the EXIT statement. Huh, Pascal didn’t have a break statement, did it?
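
All three loop forms side by side (my sketch; no I/O, so it doesn’t drag in any library modules):

MODULE Loops;
VAR i, total : CARDINAL;
BEGIN
    total := 0;
    i := 0;
    WHILE i < 10 DO             (* runs while the condition holds *)
        total := total + i;
        i := i + 1
    END;                        (* separator needed after this END *)
    REPEAT
        total := total - 1
    UNTIL total = 0;            (* runs until the condition holds *)
    LOOP
        total := total + 1;
        IF total = 5 THEN EXIT END   (* the only way out of a LOOP *)
    END
END Loops.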

If statements are IF expr THEN ... ELSIF expr THEN ... ELSE ... END. Augh, ELSIF! Is an extra E really that hard? Or just ELIF? ELSIF is just like… the most awkward compromise possible.

Integers are integers. Overflows are “usually” checked, at least according to the book! No wonder early compilers produced notoriously slow code, at least from the one time I’ve heard of Modula-2 actually being used. Wish I could find that article again, it was about someone making an apparently-multitasking OS GUI for like an 8-bit micro or something in the 80’s. Non-negative integers are called CARDINAL. Cardinals explicitly have a larger range than integers. You can’t intermix cardinals and integers in the same expression. There’s some conversion functions, called “transfer functions” for some reason. You can’t intermix integers and reals in math expressions either.

FALSE < TRUE is true, apparently. You can’t, say, compare an integer and a real though.

A CHAR is written just like a string of length 1, which is kinda weird. The book tells you how ASCII works, and there are CHR and ORD functions for turning chars into numbers and back, but it doesn’t promise anything beyond that about character encoding. Legit.

BITSETs are sets of integers. That’s easier to understand than Pascal’s “set of arbitrary symbol ish things”, tbh, based on the conversations I had trying to fucking explain it to people. Max size of a BITSET is impl defined, and often small – the implication is it’s a machine word, I imagine. Operators are +, -, * for union, difference, intersection respectively, and / for “symmetric set difference”, which I’ve never heard of, let’s look it up… aha, THAT is where XOR has been hiding. Ok then. You can also do i IN some_set to test for set membership. You can write bitsets by doing {1, 4, 9, 16} etc, which sets bit 1, bit 4, bit 9, etc. Doesn’t seem to be any way to assign names to a bitset’s members, which is interesting. It appears that instead of creating a new type, all BITSETs are interchangeable, and if you want the members to have names you just do CONST MyName = 5; or whatever.
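
A BITSET sketch along those lines, with CONSTs standing in for member names (my example):

MODULE Bits;
CONST
    ReadBit = 0;    (* names by convention only; nothing ties them to the set *)
    WriteBit = 1;
VAR
    perms, other : BITSET;
    ok : BOOLEAN;
BEGIN
    perms := {ReadBit, WriteBit};   (* bits 0 and 1 set *)
    other := {WriteBit, 3};
    ok := ReadBit IN perms;         (* TRUE *)
    perms := perms / other          (* symmetric difference: {0, 3} *)
END Bits.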

Constants and variables are constants and variables. Nothing special to see here.

Arrays have their spans given by a range, and ranges are inclusive like Pascal so the canonical example is ARRAY [0 .. N-1] OF INTEGER. You can assign arrays to each other just by doing x := y; no memcpy or shit like that. That means that those arrays have to be of the same index type or else it just doesn’t typecheck, that’s kinda neat. You can have multidimensional arrays by doing ARRAY [0 .. N-1], [0 .. M-1] OF INTEGER, which is an abbreviation for ARRAY [0 .. N-1] OF ARRAY [0 .. M-1] OF INTEGER. You can also give a type as an array’s range, so ARRAY CHAR OF REAL. No word on what happens if you do ARRAY REAL OF WHATEVER; downside of reading a book and not a spec. The syntax is valid! Presumably one valid answer is just “your compiler yells at you for trying to make a 4 GB array in your data segment” and nothing else really matters. I don’t think you can make a range indexed by real values anyway.

I just realized on the third scan-through: there literally are no array literals. No struct literals either, now that I think of it.

For loops are FOR i := 0 TO n - 1 [BY expr] DO ... END. Aha, the BY is nicer than Pascal’s DOWNTO or such. Not bad.

Function definitions and calls can have the () behind them omitted if you want. Feels kinda weird, but sure. Functions are called “procedures”, sigh, though they may return a value if you want, so they’re actually just functions. A procedure that returns a value is called a “function procedure”, sigh. Oh, if they do return a value they have to be called as foo() and not foo for some reason, even if you’re giving them no params. Sigh. Functions have scope, and can be nested. Nested functions can refer to variables in their enclosing functions, though they presumably aren’t closures. Function definitions have to end with END FunctionName; for some reason, even though END alone is unambiguous. Yay, adding more tedium! My boi Wirth likes his semicolons I guess. Modules likewise end with END ModuleName. (final period included), even though you can’t have more than one module per compilation unit. Was he worried files might get silently truncated by disk errors, or did he just really need three terminating tokens?

So far Modula-2 is definitely ahead of its time, and intended to be a real systems language as opposed to Pascal. Pretty damn good for the 80’s, let alone 1978. Lots of weird old-fashioned gubbins, and lots of things that are surprisingly modern. It feels like the stylistic pinnacle of an old-fashioned world that is now destroyed, like the dreams of all the teenagers that died fighting WW1. I should write a lang that is literally just Modula-2 with Lisp’s syntax, it would be the most success-immune language of all time.

Like Pascal, params can be passed as “value parameters” or “variable parameters”. Variable parameters are INOUT parameters, aka pass by reference. It never calls them “pass by reference” or anything like it, but there are rules and admonitions that don’t make sense otherwise. You can’t pass an arbitrary expr as a VAR parameter; it has to be an lvalue (the prose never treats lvalues as a separate concept, but the grammar does, and calls them “designators”). Arrays can be passed to functions directly, but there’s a suggestion to pass them as VAR parameters so they don’t use up extra storage. So yes, VAR means pass by reference. To me this feels like a leaky abstraction.
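Python has no VAR parameters, but you can sketch the value-vs-reference distinction with a one-slot list standing in for an lvalue (this is entirely my own analogy):

```python
def inc_by_value(x):
    # value parameter: the callee gets its own copy of the binding
    x = x + 1  # invisible to the caller

def inc_by_var(cell):
    # "VAR parameter": the callee writes through a shared cell
    cell[0] = cell[0] + 1  # visible to the caller

n = [5]  # one-slot list playing the role of a designator/lvalue
inc_by_value(n[0])
assert n[0] == 5  # value parameter left the original alone
inc_by_var(n)
assert n[0] == 6  # VAR parameter changed it
```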

Functions are able to say that they take arrays of any length. “The lower bound of the index range of the formal is then always taken to be 0. The upper bound is obtained by calling the standard function HIGH(s). Its value is equal to the number of elements minus 1.” Presumably the caller passes the length of the array in a secret function param that HIGH() accesses. Get ye some slices, heathen! Nah they literally hadn’t been invented yet; this is actually relatively convenient, as annoying special cases go. It continues the trend of having inclusive rather than half-open array indexing, but at least they’re consistent about it.
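The HIGH() convention sketched in Python, with len doing the secret-parameter job (function names mine):

```python
def high(a):
    # HIGH(a): upper bound of an open array formal, i.e. number of elements - 1
    return len(a) - 1

def total(a):
    # the formal's index range is 0 .. HIGH(a), INCLUSIVE, hence the + 1
    s = 0
    for i in range(0, high(a) + 1):
        s = s + a[i]
    return s

assert high([10, 20, 30]) == 2
assert total([10, 20, 30]) == 60
```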

Return statements, yay! They’re even written RETURN expr and the expr is optional, so that you can do early returns from procedures. Guess even Wirth decided that assigning magic variables was silly. You can’t return arrays or structs though. Those gotta be VAR parameters! Never mind.

Functions can be recursive, nothing really special to see here. Tail recursion is not a thing. …actually, what happens if you have a nested function that touches the outer function’s vars, and that inner function recurses? Inner functions are allowed to recurse, as far as I can tell. I guess the recursive inner function has to actually take a pointer to the outer function’s stack frame, not just say “the outer function’s stack frame is X bytes above my own stack frame”. Never really considered that, it’s a sorta unusual design space between “no nested functions at all” and “real closures”.
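To see why the recursive inner function needs an actual pointer to the defining frame rather than a fixed offset, here’s the equivalent situation in Python, where closures make it Just Work:

```python
def outer():
    count = 0
    def inner(n):
        nonlocal count  # touches the OUTER function's variable
        if n == 0:
            return
        count += n
        inner(n - 1)    # recursion: every level must still see the SAME
                        # outer frame, so a fixed stack offset can't work
    inner(3)
    return count

assert outer() == 6  # 3 + 2 + 1
```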

Huh, a certain amount of the language semantics is actually encoded in the grammar. For example, there’s a SimpleType rule that is basically “what an array index type is allowed to be”, and it finds use elsewhere in sets and a couple other places. Const expressions are a different AST node from non-const ones. This means that the language is more rigid and the parser is more complicated, but the type checker is simpler. It Feels Weird but I can definitely sympathize with wanting a simpler type checker.


Two kinds of types, unstructured (atomic) and structured (has subcomponents). Integers are not bitfields, ’cause their internal representation is undefined. Vars of different types can’t be assigned to each other unless they are “compatible” such as an array with a range being assigned to a larger range type. Integers and cardinals are an exception, you can assign those to each other and if the cardinal is out of range of the integer I guess Something Just Happens.

Enumerations make a new unstructured type with identifiers as its members. Like Pascal, enums have a value that can be extracted with ORD(thing) and are assigned strictly starting from zero. No mention of a way to get the upper bound of an enum; you’d think the HIGH builtin could be used for this, but no. Booleans are the enum FALSE, TRUE.
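The ORD-values-from-zero behavior, sketched with a Python IntEnum (the Color enum is my own example, not the book’s):

```python
from enum import IntEnum

class Color(IntEnum):  # Modula-2: TYPE Color = (red, green, blue);
    red = 0            # ORD values assigned strictly from zero
    green = 1
    blue = 2

assert int(Color.green) == 1   # ORD(green) = 1
assert Color(2) is Color.blue  # going back from ordinal to member
```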

Subrange types are written ["a" .. "z"] and are contiguous and inclusive. You can specify a particular underlying type by doing something like INTEGER [0 .. 99], otherwise it just makes a good guess. For integer subranges, if the lower bound is negative it’s an integer, otherwise it’s a cardinal. A value exceeding the bounds of a subrange is a runtime error.

Sets are set types of a base type. A BITSET is just SET OF [0 .. W-1] where W is the word size of the machine. A set literal for a set declared FOO = SET OF [0..10] is written FOO { 1, 3, 9 }. So if you want a set to have symbolic members you make it a set of an enum, aha. On the other hand, the base type of a set must be an enum or a subrange, and the implementation may limit the max size of a set, usually to the word size of the computer (“usually 16 or 32”). There are standard functions INCL(s,x) and EXCL(s,x) which modify the set variable s to include/exclude value x? So… INCL() is s |= x and EXCL() is s &= ~x? Looks like it, yes. What about the bitset operators like + and /? idk, they probably work but it’s not really described that way. More weird redundancy?
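INCL/EXCL on a word-backed set, sketched in Python with a plain integer as the backing word (my reading of the semantics, going off the book):

```python
def incl(s, x):
    return s | (1 << x)   # INCL(s, x): set bit x

def excl(s, x):
    return s & ~(1 << x)  # EXCL(s, x): clear bit x

s = 0
s = incl(s, 1)
s = incl(s, 3)
assert s == 0b1010
s = excl(s, 1)
assert s == 0b1000

# ...and the set operators fall out as plain bitwise ops on the word:
assert (0b1010 | 0b0110) == 0b1110  # +  union
assert (0b1010 & 0b0110) == 0b0010  # *  intersection
assert (0b1010 ^ 0b0110) == 0b1100  # /  symmetric difference
```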

The author is very cutely excited by the idea of painting pixels onto a bitmap display using bitsets. On the other hand, it’s a good way of introducing the concept.

Record types! Pretty standard to start off. Like Pascal, there’s with statements to import a record’s fields into a scope. They mention that with x[1] do a := 3; b := 4 end may be more efficient than x[1].a := 3; x[1].b := 4. Must have been hard to live in a time when you couldn’t even trust compilers to factor out common sub-expressions. And, as we will see later, you can’t just aim a pointer at x[1] and do it that way!

You can have records with variant parts, which are basically tagged unions, aka real algebraic data types. It’s a little funny that you can make a record with no fields besides one of these “variant parts”, but they’re still attached to the record; just take the idea a tiiiiiiny bit further buddy! Come on, you can do it! Nope, oh well. Maybe in Oberon. The different variants are even separated by |’s though! It’s so close to sum types without quite getting there!

Variant parts are broken down with a CASE statement, which appears to import the variant’s fields into the local scope the same way a WITH statement does. Not quite real pattern matching but I didn’t really expect it to be. You write it like so:

WITH resident[i] DO
  CASE male OF
      TRUE: WriteString("Male, military rank is ", MilitaryRank)
    | FALSE: WriteString("Female, maiden name is ", MaidenName)
  END
END

ahahaa the MaritalStatus enum consists of single, married or widowed, but no option for divorced. Ah, the good ol’ days when men were men, women couldn’t join the army, and your computer console had an ashtray built in.
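For contrast, here’s the same variant record done as honest-to-goodness separate types with a runtime tag check, sketched in Python (the class and field names are my own invention, loosely following the book’s example):

```python
from dataclasses import dataclass

# What the variant part would look like as a standalone sum type:
@dataclass
class Male:
    military_rank: str

@dataclass
class Female:
    maiden_name: str

def describe(p):
    # plays the role of CASE male OF TRUE: ... | FALSE: ... END
    if isinstance(p, Male):
        return "Male, military rank is " + p.military_rank
    elif isinstance(p, Female):
        return "Female, maiden name is " + p.maiden_name

assert describe(Male("corporal")) == "Male, military rank is corporal"
assert describe(Female("Smith")) == "Female, maiden name is Smith"
```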

You can put any constant expression in the CASE statement’s labels, and there IS an ELSE field for “everything else”, but the author sternly warns against the bad style of doing something like:

CASE some_integer OF
  1: thing1
| 2: thing2
| 173: thing3
ELSE other_thing
END

Instead you’re supposed to do:

IF some_integer = 1 THEN thing1
ELSIF some_integer = 2 THEN thing2
ELSIF some_integer = 173 THEN thing3
ELSE other_thing
END

I wonder why? This is a new phenomenon to me. Though I actually recently heard someone from India(?) say they were taught the same thing in classes. Weird.

You can’t put a | in front of the first case arm though, because we use separators in this household and that’s that, delimiters are not something that proper god-fearing Americans countenance! Yes I know Wirth is Swiss, let me have my fun.

Pointers! Pointers point to a particular type. They are written POINTER TO sometype, rather than Pascal’s ^sometype. Tedious, but more consistent with ARRAY OF sometype so I guess I’ll let it pass. There’s a magic allocation procedure, Storage.Allocate(). So if you do VAR p: POINTER TO Thing; Storage.Allocate(p, SIZE(Thing)) it sets p to a new Thing. Or you can abbreviate it to NEW(p) apparently. You dereference it with p^, which they still write with an up-arrow (↑) despite talking about ASCII earlier. They write ASCII char 94 in their ASCII table as ↑ too; not sure whether it’s style or stubbornness. :P Nope, Wikipedia says “Earlier versions of ASCII used the up arrow instead of the caret (5E hex) and the left arrow instead of the underscore (5F hex)” and links to a 1963 standard as proof. Though the book’s ASCII table has an underscore or maybe an em-dash(?) as 0x5F, so, idk. Whatever, doesn’t matter. The pointer dereference operator is ASCII 0x5E.

There is a value NIL that is a null pointer for any pointer type. “We may regard every pointer type as being a record with two variants, one pointing at objects of the specified, bound type, the other pointing at no object”. The variables a pointer points to “are anonymous and can be accessed via pointers only” – so no pointing at local variables or module variables, or anything besides what Storage.Allocate() gave you! No word on whether pointer dereferences are null-checked, surprisingly, just that they’re invalid.

There’s a Storage.Deallocate(p, SIZE(Thing)) procedure too. It can’t catch double-frees, natch. There’s a DISPOSE() built-in which mirrors NEW(), letting you omit the call to SIZE().

Procedure types! You can have actual first-class functions and assign them to variables. So declaring a variable as VAR f: PROCEDURE(REAL, INTEGER): REAL is perfectly valid. You can also pass functions as arguments to functions. It literally demonstrates the visitor pattern as an example, ahahahaha. “…we emphasize the restriction that procedures that are assigned to variables or passed as parameters must not be declared local to any other procedure. Neither can they be standard procedures (such as ODD, INCL, etc).” It doesn’t say whether the compiler will disallow local procedures to be passed as params, actually, just that you can’t do it. So we have no closures (unsurprisingly), and generic functions must remain a special case. Compiler built-ins are very clearly allowed to do things you cannot write as functions in the language itself, and they are used all over the place for that purpose. This is the sort of thing that really bugs me in my own language designs. Apparently Wirth is more pragmatic in this regard though.
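First-class procedures and the visitor pattern, sketched in Python (the example functions are mine):

```python
# Roughly what VAR f: PROCEDURE(REAL, INTEGER): REAL buys you:
def power(base, n):
    return base ** n

f = power  # assign a procedure to a variable
assert f(2.0, 10) == 1024.0

# ...and the visitor pattern: pass a procedure over every element.
def for_each(items, visit):
    for x in items:
        visit(x)

seen = []
for_each([1, 2, 3], seen.append)
assert seen == [1, 2, 3]
```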


Modules are divided into a definition part and implementation part. Importing a module only requires the definition part. The definition part consists of function signatures, types, public vars and such. You can declare types with their whole implementation, or just the type names. All good. Definition and implementation parts can contain import lists. The definition only needs to import the things required for the definition itself.

Huh, the “main program” is apparently not really a module at all. They say explicitly you can imagine it as being enclosed in an unnamed root module, but some details of its syntax and semantics are slightly different from a real module’s. I’m too tired to dig into that much more closely.

You can do IMPORT Foo to import the module Foo, or FROM Foo IMPORT Bar to import Foo.Bar as Bar. Gee, I wonder where Python got that!

Standard routines are imported into all programs. There’s no rigid definition of which modules and routines count as “standard”, though.

You can have local modules that are nested inside other modules. They are not separately compilable. They can import symbols from their enclosing modules just by doing IMPORT a. Local modules can export symbols to make them visible to the outer module. If there’s a name clash you can refer to symbols by their qualified names.

No word that I can find on what order modules’ program parts are executed in. Doesn’t matter as long as they don’t have any side effects visible to other modules, right? …right?! This seems like a pretty big omission to me.


It starts off by saying “we really can’t specify I/O because every system has its own I/O facilities and they need to be platform specific to be any good. …But we’ll present an interface for some basic things.” Text and binary streams, basically. Boring. Reasonably complete, generally satisfactory, but old-fashioned and boring.

Then there’s a chapter where he gets gleefully excited about bitmapped graphics displays again. He goes on to describe windows, context menus, pointers, etc, which makes for interesting history but isn’t really about Modula-2, apart from some practical code examples. “We call this input device a Mouse, a name that is a reflection of a particular realization of the given specifications as a device that is hand-held and moved around on the table. The fact that it is equipped with three keys (eyes) and is connected to the keyboard by a thin, perhaps grey, cable has given this device its name.” Mr. Wirth I have concerns. Concerns about the mice you’ve seen with three eyes. Dude, Chernobyl hasn’t even happened yet!

figure 1

omg it’s adorable

Low level stuff

Aha, this oughta be interesting! Low level code is defined as stuff that breaks the rules of the language. He talks about use cases for low level code: type punning, touching machine-dependent data with particular layouts, FFI, memory-mapped I/O. Notes that these facilities cannot be checked by the compiler or at runtime.

There’s “type transfer functions” that let you coerce types without any conversion, a la Rust’s transmute(). The internal representations of things like integers are implementation defined. There’s also two basic data types for raw memory access, WORD and ADDRESS. You can’t do anything to a WORD besides assign it or turn it into another type via type transfer functions. How these actually work is implementation-defined, so if your WORD is 8 bits and your INTEGER is 32 bits (or any other value; 16 bits, 99 bits, whatever) then INTEGER(some_word) is 100% the compiler’s problem. WORDs also have some funky implicit coercion effects when used as function parameters, as long as the type they’re standing in for fits in a single storage word. Oops, not too useful in these days of byte-addressed machines, sorry! Oh, or you can say your function takes ARRAY OF WORD and give it literally any type. Then the function gets the words that represent that value and can interpret them however it wants; it basically becomes something a little like C’s ... function param, except less cursed and only for one function parameter at a time instead of letting you walk the stack and annoying every CPU in the world with more than 6 registers. SIZE(variable) or SIZE(type) gives you the var or type’s size in words.
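A type transfer in the ARRAY OF WORD spirit, sketched in Python with the struct module reinterpreting a float’s storage words as an integer (the function name is mine):

```python
import struct

def transfer_real_to_bits(x):
    # "type transfer": reinterpret the storage of a 64-bit REAL as an
    # integer, no conversion performed, just the raw words
    return struct.unpack("<Q", struct.pack("<d", x))[0]

bits = transfer_real_to_bits(1.0)
assert bits == 0x3FF0000000000000  # the IEEE 754 double encoding of 1.0
```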

Type ADDRESS is defined as POINTER TO WORD. Values of type ADDRESS can be assigned to any pointer type. If a function takes an arg of type ADDRESS you can pass any pointer to it. You can do pointer arithmetic on an ADDRESS; they explicitly act as if they were of type CARDINAL, ie, an unsigned int. There’s a built-in function ADR() which will give you the address of any value. No word on what happens if you write ADR(x+3). Presumably it’s an error ’cause temporaries don’t have an address, but idk. Also no word on what happens if, say, your word size is 1 byte and your integer size is 4 bytes and you dereference an address containing an integer. I suppose you’re supposed to treat it like a C void pointer: turn your ADDRESS into a POINTER TO INTEGER and then dereference it. That’s fine.

There’s some talk in chapter 31 about interacting with interrupts and raw devices and such too. Apparently you can specify specific addresses for particular variables by doing VAR s [WHATEVER_ADDRESS]: CHAR or such. I guess that’s nice. I don’t have a PDP-11 handy, so I feel safe skimming this bit though.

Processes and coroutines

Oho, this is unexpected. He talks about different types of multiprogramming: single processors that are multiplexed via time-slicing, symmetric multi-processors, asymmetric multi-processors. This is like, dawn-of-time shit for this topic, so we won’t be too harsh; I considered skipping this entirely, but I want to see what this world looks like in 1978.

You can start a process by giving it a function and specifying the amount of memory it’s given. So, spawn a thread. The process can wait for a signal, which has a type of some kind. You can send a signal, basically triggering a semaphore afaict, that will wake up one process that is waiting for that signal. You also can check whether there are any processes waiting for a particular signal.
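The signal business maps pretty directly onto a modern condition variable; a sketch in Python (all the names are mine):

```python
import threading

# WAIT(signal) / SEND(signal), approximated with a condition variable
cond = threading.Condition()
ready = False
log = []

def worker():
    with cond:
        while not ready:  # WAIT: sleep until someone sends the signal
            cond.wait()
    log.append("woke")

t = threading.Thread(target=worker)
t.start()
with cond:
    ready = True
    cond.notify()  # SEND: wake ONE process waiting on this signal
t.join()
assert log == ["woke"]
```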

It also talks about shared variables. “No process should meddle with common variables while another is performing some critical action upon them.” Ok dad yeah sure. :-P “A reasonable solution to this problem is to encapsulate common variables in a module which guarantees mutual exclusion of processes. Such a module is called a monitor…” Not a mutex? It literally has the words “mutual exclusion” in it. Is this where Java’s bogus synchronization comes from?

Apparently you specify a module to be a monitor by giving it a priority number, and then any exterior accesses to its variables are synchronized. So close, and yet so far. What if you put the synchronization on a record, buddy, and required accessors to get/set its contents? Also doesn’t seem to consider at all the cost of having to lock/unlock the mutex every time you access the variable in the monitor. And what if you want to hold a mutex in a critical section that’s longer than a single variable access?

Can’t really blame him, literally nobody knew how to do multithreading well before Rust. Even in 2013 your choices were limited to “Erlang” and “suffering”. And in 1978 I’m not sure that anyone even had done enough multiprogramming for practical concerns like that to come up.

It says you can’t busy-wait in code because control flow might only change when you send or wait on a signal. AKA, cooperative multithreading, coroutines; you might not have a timer interrupt to switch to another process. Though it also says this is really implementation dependent, and one possible solution among many. So a coroutine may invoke another via a procedure TRANSFER(VAR source, destination: ADDRESS). This suspends the calling function, sticks some identifier into source (presumably an instruction and stack pointer or something), and goes to the process at destination. To create a coroutine, aka invoking it the first time, you have to call NEWPROCESS(p: PROCEDURE, a: ADDRESS, n: INTEGER, VAR new: ADDRESS).
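Here’s roughly what TRANSFER-style cooperative control flow looks like, sketched with Python generators (the mapping to NEWPROCESS/TRANSFER is my own loose analogy):

```python
# Cooperative coroutines: control only moves at explicit transfer points.
def producer(log):
    for i in range(3):
        log.append(("produced", i))
        yield  # TRANSFER back to whoever resumed us

def run():
    log = []
    co = producer(log)  # NEWPROCESS: create the coroutine, don't start it
    for _ in range(3):
        next(co)                 # TRANSFER into the coroutine
        log.append(("consumed",))  # we only get here when it yields back
    return log

assert run()[0] == ("produced", 0)
assert run()[1] == ("consumed",)
```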

Cripes Wirth’s immunity to returning fucking results from functions really irks me sometimes. Is this something about having to then copy the value into the callee’s stack frame or such because the compiler is too stupid to optimize it out? That’s the only reason I can conceive of to have a single ADDRESS-typed VAR parameter instead of just returning an ADDRESS. It has nothing to do with the language and everything to do with the programmer, so just… what’s going on here?

Never really talks about how priorities work in monitors, though it’s implied later on that they’re intended to map to interrupt priority levels. Does that mean that priority 1 processes can interrupt priority 2 processes, or vice versa? What happens if they’re the same? Fine, we’ll just consider that bit implementation-defined.


And the book ends with a copy of the original Modula-2 Report! Cute.

Well it’s certainly more rigorous than Pascal. And while having a textbook on the language gives you fewer edge-case details than the standard does, it is also about 100x easier and less tedious to read.

All in all… A reasonable systems programming language. I don’t hate it. It’s far from perfect; there’s a number of fancy features that are probably in the Not Gonna Need It category (indexing arrays by random types, monitors, etc) and a bunch of others that I am perfectly happy to let die with history (inclusive ranges). But if my choices were C or Modula-2… all else being equal I’d probably choose Modula-2. I might commit some atrocities to its syntax first, though.




Ok, Oberon was published in 1987. There’s no ISO standard, thank Eris. Just a report that is available for free from ETH Zurich. This is apparently the 1990 version, I assume it has some editing fixes but that’s all. There’s a wild handful of derivatives: Oberon-2 is a small enhancement made by Wirth and others in 1991, then there’s Active Oberon which was made in 1998ish at ETH Zurich and appears to be its own kinda fork(?), then there’s Oberon-07 which was published by Wirth in 2007 and appears to have been lightly revised since then, up to 2016. Oberon-07 is based on the OG 1987 Oberon, not Oberon-2.

I will be looking at the 1990 version of Oberon; I might add some notes about Oberon-07 where I find them, but won’t try to note every change and update. Wirth already did that for me.

There’s also an operating system called Oberon, implemented by Wirth and co for an experimental desktop computer called Ceres in the 1980’s. I actually read the book that was written as a result of this, back in high school, after finding it in a dusty engineering library. There’s a small zoo of derivatives with no single definite family tree that I can discern; making Oberon variants seems to be a semi-common hobby or research project. As of writing, Wirth is 88 years old and appears to still be playing with Oberon systems and implementations; frankly I don’t expect him to make anything cutting edge anymore, but if he wants to keep playing around with cool ideas then more power to him.

Diving in

It starts with “Make it as simple as possible, but no simpler”, eh? We’ll see about that. It also says “What remains unsaid is mostly left so intentionally, either because it is derivable from stated rules of the language, or because it would require to commit the definition when a general commitment appears as unwise.” Great, I love puzzles! (I don’t love puzzles.)

String literals are delimited by quote marks… and cannot contain quote marks. There is no escaping mechanism. ALREADY TOO SIMPLE! ABORT! ABORT! BAIL OUT! Okay dis gonna be a wild ride, let’s keep going.

Like Modula-2, integers can be suffixed with H for hex, and reals can be written 12E3 for 12 × 10^3. Chars are explicitly ASCII. They can be written as single-character strings or as nnX where nn is a hex number, nice, so a space can be written " " or 20X. In Modula-2 you’d have to do CHAR(20H).

Keywords are all in caps, of course. Identifiers are case sensitive.

Comments are delimited by (* *). Can comments be nested? Undefined, but “derived from the stated rules of the language” I’m guessing not.

An identifier with * suffix is exported, so PROCEDURE foo*() is like saying pub fn foo(). It’s a little weird, but I kinda don’t hate it.


There’s multiple integer sizes, SHORTINT <= INTEGER <= LONGINT. There’s also REAL and LONGREAL. Their sizes are whatever is appropriate for the architecture, except for SHORTINT, which is explicitly -128 to 127, aka a 2’s-complement byte. Oh, apparently LONGINT <= REAL <= LONGREAL. Hmmmmmm, that’s not quite how floats work buddy; you can have an i32 value that you can’t exactly represent with an f32. Observe the following Python3 (which uses 64-bit values, but same idea):

>>> float(0xFFFF_FFFF_FFFF_FFFF)
1.8446744073709552e+19
>>> int(float(0xFFFF_FFFF_FFFF_FFFF))
18446744073709551616

Whatever, i32::MAX was kinda impossibly huge in 1988.

Besides strings and numbers you can also have ARRAY OF Type, POINTER TO Type, records, and functions. Oh, and BOOLEAN and SET. A SET is basically Modula-2’s BITSET in concept, it’s a set of integers between 0 and MAX(SET).

Arrays are indexed 0 .. LEN-1; finally, zero-indexed arrays! They’re written ARRAY LEN OF Type. Multi-dimensional arrays still have a little sugar: ARRAY LEN1, LEN2 OF Type.

Record types no longer encompass tagged unions, sad. No unions at all, in fact, as far as I can tell. No enum type either. Hmm now that I mention it, that means that BOOLEAN isn’t an orderable/number type, it’s just two magical compiler literals, TRUE and FALSE. I consider this a violation of “…but no simpler”, you remove a feature from the language but add a special case anyway.

Records are extensible. Big change there! The Big One, really. Public fields must be marked by identifiers ending with *. Maybe he considers this to subsume tagged unions and such? It kiiiiiiinda does. As someone who’s written plenty of C# and plenty of OCaml, you can make object inheritance and some reflection/specialization do anything you would do with sum types, it just sucks ass.
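Here’s the “extensible records subsume sum types” trick sketched in Python with subclassing plus a runtime type test (names are mine; this is exactly the isinstance-flavored suffering I mean):

```python
# A base record plus extensions, discriminated at runtime (Oberon's IS test):
class Shape:          # Shape = RECORD END
    pass

class Circle(Shape):  # Circle = RECORD (Shape) r: REAL END
    def __init__(self, r):
        self.r = r

class Square(Shape):  # Square = RECORD (Shape) side: REAL END
    def __init__(self, side):
        self.side = side

def area(s):
    if isinstance(s, Circle):    # s IS Circle
        return 3.14159 * s.r * s.r
    elif isinstance(s, Square):  # s IS Square
        return s.side * s.side

assert area(Square(3.0)) == 9.0
```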

Pointers must point to a record or array type. Any pointer can have the value NIL. Pointers may point to extended records. Creating a new value is done by calling NEW(p). p is explicitly NIL if NEW fails. He’s learning!

Procedure types, basically the same as Modula-2. Procedure types can also have the value NIL. It’s almost… like… they’re actually just pointers. Again, first-class procedures cannot be local procedures or one of the magical predefined procedures like NEW.

Constant expressions are expressions that “can be evaluated without executing the program”. A little tautological, but I don’t hate it.

No unsigned integer types this time! Huh, interesting.

Procedures and functions

They’re called procedures, of course. There’s VAR procedure arguments that denote inout arguments, aka pass-by-reference, aka pointers. VAR args must be designators/lvalues.

Procedure arg evaluation order is undefined… cause expressions can’t have side effects. Hm! That’s actually a weakness of the modern “everything is an expression” model …wait no, function calls are expressions too, because you can do Something(x) + 3. So expressions having side effects is inevitable in any language where side effects are possible.

Statements vs expressions, ofc. NIL can be assigned to any pointer or procedure type (abstraction break, ’cause procedures are just pointers of course). Statements are separated by ;.

ELSIF, sigh. Is ALGOL to blame for this? Prolly.

Case statements exist. Case arms are separated by | like Modula-2. There’s an ELSE arm. If the ELSE arm is omitted and nothing matches, it is a (runtime) error.

While and repeat until loops exist. No for loops. Not the economization I would have picked. Exit (break) and return statements exist. REPEAT ... UNTIL test loops still have the weird inversion that they continue while test is false, and WHILE loops continue while test is true. There’s also LOOP ... END, which just loops forever.

WITH statements exist, work like they do in Modula-2. They allow you to assign a name though. But WITH is pure sugar! C’mon man! No FOR loops, but you have REPEAT ... UNTIL and WITH ? Not the decision I would have made. (…ahahahaha apparently in Oberon-07 WITH statements are removed and FOR loops added. I’m not sure whether to be entertained that he ends up revising the language to make the same decisions I would have, or critical that he needed 20 years to do so. Oh well, I’m standing on the shoulders of giants here.)

RETURN statements exist. Functions must contain a RETURN statement. “Derived from the stated rules of the language”, I assume all paths of control flow must terminate in a RETURN. (In Oberon-07 again, standalone RETURN statements are outright removed and a function that returns a value must end with a RETURN. Great, I LOVE keeping track of control flow stored in mutable state!)

Procedures can be nested, same as Modula-2. Forward references are still a thing. Recursion is a thing.

Procedures are written PROCEDURE Thing(...) ... END Thing. Dude, more redundant names? Show me in the text editor where the end delimiter hurt you, Wirth.


It sorta exists. It’s a handful of pseudo-functions that are generic or take types as arguments. None that are variadic though, surprisingly; they all take a fixed number of arguments, vs Modula-2 where some of the built-in pseudo-functions could take different numbers of arguments. Oops no I’m wrong, “The second parameter of INC and DEC may be omitted, in which case its default value is 1.” Looks like those are the only variadic functions around. I prefer += and -= style, though.

It includes COPY(x, v) which takes two strings and copies v to x. In Modula-2 this was just an assignment. Weird. (In Oberon-07 they become just assignments again.)

There is no DELETE function. IIRC the Oberon OS was garbage-collected, going from the book I read on it in like 2002. I guess garbage collection can be “derived from the stated rules of the language”? Feels like a pretty big assumption there, buddy.

There is a SYSTEM module that has low-level stuff. It has a BYTE type, and if a function takes ARRAY OF BYTE as an arg then it can take any type. It has some functions for type and pointer casting operations and direct memory access. There’s LSH and ROT operations (rotate left??? It doesn’t say!), and I assume you can give them negative numbers to shift or rotate right. There’s no arithmetic shift though. (In Oberon-07 LSL and ASR functions get added, ROT has been renamed to ROR (rotate right), and a few other things are fixed.)
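Since the report doesn’t say which way ROT rotates, here’s a sketch of both rotations over a fixed-width word in Python (function names and the 32-bit default are my assumptions):

```python
def ror(x, n, width=32):
    # rotate a width-bit word right by n
    n %= width
    mask = (1 << width) - 1
    return ((x >> n) | (x << (width - n))) & mask

def rol(x, n, width=32):
    # rotate left = rotate right by the complement
    return ror(x, width - (n % width), width)

assert ror(0x00000001, 1) == 0x80000000  # the low bit wraps to the top
assert rol(0x80000000, 1) == 0x00000001  # and back again
```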


The description of module semantics is painfully brief. However it does include the sentence “The statement sequence following the symbol BEGIN is executed when the module is added to a system (loaded).” So that is some kind of attempt at module execution ordering… Again, going from the Oberon OS book, modules were basically DLLs and were loaded lazily when something first invoked a function from them. Speaking from experience, in reality module initializer order is a giant footgun, but you couldn’t necessarily know that at the time.

Modules seem fairly complete though; while simpler than Modula-2 there seem to be name overrides and other such practical things you’d want. Modules also have to end with the module name, MODULE Foo; ... END Foo. The period is also necessary, just to be sure. There are no interface modules for separate compilation like Modula-2 had. (Again, the Oberon operating system treated each module as a compilation unit and loaded them lazily; each module was a DLL and the function calls across modules resolved at runtime. Java has demonstrated to us since then that this is too fine a granularity.) There also do not seem to be submodules??? Interesting. Simpler than M-2, though.


What we’d now call an “lvalue” is explicitly defined and called a “designator”. This was the case in M2 too, the book was just less clear about it. There are module identifiers (aka paths) like Foo.Bar.Baz. Both Oberon and Modula-2 call them qualidents, qualified identifiers, a term which makes perfect sense but which I’m sort of glad never caught on.

Structure references now implicitly dereference pointers, that’s interesting. So foo.Bar can actually be foo^.Bar. Pointers… can NOT point to other pointer types directly, which is also interesting, so this is unambiguous. You can’t have POINTER TO POINTER TO Foo. To represent that you’d have to make an intermediate struct type that contains only a pointer. Similarly, if foo is a pointer to an array, foo[bar] is actually foo^[bar]. This is actually kinda nice, compared to C’s -> vs ., or Rust’s auto-dereferencing rules that sometimes Aren’t What You Want and result in you writing (*foo.bar).baz for unsafe pointers a lot. Is this new? This is new; Modula-2 didn’t do this. Having pointers not able to point to certain types seems a steep price to pay, but still, it’s appealing.

There is a “typeguard” expression, “The typeguard v(T0) asserts that v is of type T0, i.e. it aborts program execution, if it is not of type T0.” It’s valid when T0 is an extension of v’s declared type, and v must be a pointer or a VAR parameter of record type. So records passed as VAR parameters are sneakily passed by pointer, so that this record-extension mechanism can work. Can you put records on the stack? Can you make a value of record T and pass it to a function that replaces it with a larger record T1?
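
The semantics are easy enough to sketch in Python, where every object carries its dynamic type anyway. This is my own illustration, not Oberon’s mechanism: guard, Shape, and Circle are made-up names, and Python’s isinstance stands in for the runtime type test.

```python
# Hypothetical sketch of Oberon's typeguard v(T0): check the dynamic
# type of v and trap (here: raise) if it isn't T0 or an extension of T0.
class Shape:
    pass

class Circle(Shape):
    def __init__(self, r):
        self.r = r

def guard(v, T):
    # Like Oberon's v(T0): abort execution if the dynamic type doesn't match.
    if not isinstance(v, T):
        raise TypeError("typeguard failed")
    return v

s = Circle(2)                         # statically we only "know" s is a Shape
assert guard(s, Circle).r == 2        # guard passes, so the field is accessible
try:
    guard(Shape(), Circle)            # dynamic type is plain Shape: trap
except TypeError:
    pass
```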

Four levels of precedence again, with relations at the bottom again like Pascal. Wirth just has Issues with this, doesn’t he. ~ is negation, then multiplication, then addition, then relation. It’s OK, friend; operator precedence is a wholly human artifact with no intrinsic meaning, let it go.

& is “and” but OR is for “or” and ~ is for “not”. At least there are no synonyms, I suppose? DIV is for idiv and MOD is for modulus. Idk if MOD is modulo or remainder, but there’s an equation for it, so I assume it’s something sane: x = (x DIV y) * y + (x MOD y), though only for positive values of y. Sigh. = is equal, # is not-equal, so he finally gets to wholly have his way over the <> preferers. Then you have the usual set operators, + for union, - for difference, * for intersection, / for symmetric difference (i.e. XOR). Then X IN Y for set membership.
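
For the curious, all of these have direct Python analogues, which makes the DIV/MOD equation and the set operators easy to sanity-check (Python’s // and % are floor-style, which lines up with the spec’s equation at least for positive y):

```python
# The spec's integer-division identity: x = (x DIV y) * y + (x MOD y)
x, y = 17, 5
assert x == (x // y) * y + (x % y)

# Oberon's set operators mapped onto Python sets:
a, b = {1, 2, 3}, {3, 4}
assert a | b == {1, 2, 3, 4}   # Oberon: a + b (union)
assert a - b == {1, 2}         # Oberon: a - b (difference)
assert a & b == {3}            # Oberon: a * b (intersection)
assert a ^ b == {1, 2, 4}      # Oberon: a / b (symmetric difference / XOR)
assert 3 in a                  # Oberon: 3 IN a
```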

Then you have type tests, X IS T, aha! So, it’s basically instanceof. So structs probably carry some runtime type info with them. Getting very object-y.

Strings are arrays of chars but don’t have to fill the whole array.
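
As I understand it (this is my reading, not something the report belabors), the convention is that a string shorter than its array is marked by a terminating 0X character, with the rest of the array as slack. A rough Python sketch, with made-up names:

```python
# A 5-char string living in a 16-char array, terminated by 0X,
# roughly like an ARRAY 16 OF CHAR in Oberon holding "hello".
name = list("hello") + ["\x00"] * 11
length = name.index("\x00")            # chars before the terminator
assert "".join(name[:length]) == "hello"
assert len(name) == 16                 # the array itself stays full-size
```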

Oh, I just realized: Arithmetic overflow is never mentioned. Out of bounds array access is never mentioned. Dereferencing a nil pointer is never mentioned. What the hell? I actually have no idea whether this is “derivable from stated rules of the language” or “would require to commit the definition when a general commitment appears as unwise”. I’m a tedious Rust fanboi, so to me this shit is important. I really hate how much goes unsaid in programming languages sometimes. (It scares the everloving shit out of me, too.)

NIL IS T is explicitly mentioned as undefined for any type T, on the other hand. NIL is not a real value; it has no type, it’s just a placeholder for where a pointer should go.

You know what? You fucking know what? As far as I can recall this has NEVER been mentioned before in any of these languages, but logical operators are short-circuiting. There is an example WHILE (t # NIL) & (t.key # i) DO ..., and that can NOT work if logical operators do NOT short circuit. JFC THIS SHIT IS IMPORTANT. AUGH. The definition of & is “p & q stands for if p then q, else FALSE”, which one could interpret as doing short-circuit evaluation if you think real hard about it. I’m not sure whether I’m more angry about how vague it is, or that I didn’t catch this sooner.
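
To see why the spec’s example forces the issue, here it is transliterated into Python (Node and find are my names; the loop mirrors the Oberon one). With eager evaluation, t.key would blow up the moment t runs off the end of the list; Python’s `and`, like Oberon’s &, skips the right operand instead:

```python
# WHILE (t # NIL) & (t.key # i) DO t := t.next END, in Python.
class Node:
    def __init__(self, key, nxt=None):
        self.key = key
        self.next = nxt

def find(t, i):
    # `and` never evaluates t.key once t is None: that's the short circuit.
    while t is not None and t.key != i:
        t = t.next
    return t

lst = Node(1, Node(2, Node(3)))
assert find(lst, 2).key == 2
assert find(lst, 99) is None   # reaches the end without dereferencing None
```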

Modula-2’s “ProcName; calls the procedure the same as ProcName();” goes away, intelligently, because you want to be able to know whether x := ProcName is assigning the value of ProcName or the result of calling it. Calling a procedure always needs the parentheses.


Oberon’s minimalism is both an example and a warning.


You might also be interested in Mesa, a Wirth-designed language that was used internally at Xerox PARC. I believe for a while most of their system software was written in it, including the Smalltalk-80 VM, an OS called Tajo, and the software of the Xerox Star (the first commercial GUI computer.)

I did a fair amount of coding in Mesa when I was a summer intern at Xerox in the mid-80s. I don’t remember the details, except that it elegantly unified structs and function parameters, and that all language keywords had to be written in ALL CAPS which made it annoying to type; one of the few times that the Caps Lock key came in handy!


So chronologically it fits between Pascal and Modula-2. Do… do I want to go through this? No. Oh, it actually wasn’t Wirth-designed, according to Wikipedia… though it’s Wikipedia, so details may be wrong. It’s apparently where Modula-2’s modules come from though. Eh, I might look at it someday, but not right now.


So what have we learned?

First, let’s outline some things that stay the same:

  • All of these languages follow the same block syntax style designed for quite dumb one-pass compilers. By the time it starts compiling the actual program it has already emitted all the stack handling code for local variables, for example. Pascal goes even further than C89 does by forcing a particular declaration ordering on you of header, imports/exports, var/const/type declarations, functions, entry point. (…wait, I just realized Oberon doesn’t have forward references for functions; it can’t be compiled with a one-pass compiler! Holy shit.)
  • All of these languages have largely the same syntax. Wirth never decides that ELSEIF is better than ELSIF or vice versa, or that curly braces are the way to go. The biggest single change is probably going from Pascal’s ^T to Modula-2’s POINTER TO T.
  • All the syntax is, by modern sentiments, insufferable, or at best deeply unfashionable. I don’t think I need to go into details. I pity Wirth’s RSI, that’s all.
  • All of these languages are basically C-level system programming languages, but with much more conservative ideas about pointers and dynamic memory.
  • Generics and metaprogramming are totally absent, and cases where you need generics or functions that operate on types are handled via compiler built-ins.
  • There’s plenty of compiler built-ins that look like functions but operate by different rules than the rest of the language. This never seems to bother him. Lucky guy.
  • Pointers are more pervasive than they look, but are hidden. VAR arguments are pointers, procedure types are pointers, there may be one or two other cases that I’m missing (potentially variable-length array args?). I actually dislike this ’cause it’s a leaky abstraction: VAR args have limits on what can be put into them, procedure types can be NIL, etc.
  • Variadic functions never happen.

Trends that progress through all the languages:

  • Each language is progressively more detailed and systems-oriented. I think this is both a trajectory of Wirth’s career (he wasn’t writing OS code in 1973 as far as I know, but he was in 1988) and also an artifact of computers becoming more uniform and standard (in 1973 you could not say that a byte-addressed machine with ASCII characters would be easily available; in 1988 you could).
  • Another symptom of this is that error handling gets progressively better, IMO. What happens in many error cases becomes progressively more explicit. Usually. (Oberon’s brevity annoys me.)
  • The syntax of each language becomes incrementally more tedious and long-winded. Not by a whole lot, but each one adds a new case where you have to do END SomeName. instead of just END.
  • That said he is not entirely immune to syntactic sugar, Oberon’s foo^.bar -> foo.bar is quite pleasant.

Trends that don’t progress through the languages:

  • Simplicity/minimalism. The middle language is the most complex one; both Pascal and Oberon are simpler than Modula-2.
  • Any care/awareness of functional programming languages. I see no influence of Lisp, Scheme, ML, or the various precursors of Haskell; whatever Wirth may have thought of them, that line of progress is entirely divorced from the design of these languages. Concepts that came from those languages like immutability, non-nullable pointers, type inference, powerful macros, closures and such are entirely absent. Considering the context and the compiler technology, this isn’t super unreasonable, but is still worth noticing. Garbage collection does sneak into Oberon, albeit from a kinda weird angle. On the flip side, these languages have always had first-class function types, even if they’re relatively rare.
  • Each language tries to keep operator precedence simple with four levels, but decides to do it a slightly different way. IMO this is a sign of the “waterbed effect”, pushing complexity around instead of being able to solve it – because the solution is either more or fewer precedence levels, which appears to have been distasteful to Wirth.

Wirth’s most persistent programming language innovation is undoubtedly Modula-2’s module system, which worked its way into Python and Java, and then from there into everything else. Then if you compare his languages to Rust, which is IMHO the most carefully designed systems programming language of the last 30 years, you can see lots of places where Rust makes Better Decisions based on what has happened during those 30 years. (Don’t talk to me about Zig; it’s on my to-do list.)



From lorddimwit:

On the gripping hand, there seems to be a formal split between “level 0” implementations and “level 1” implementations, and the only difference between them is something about how arrays are passed as function args.

A lot of people have trouble with this. I’ve seen some really bad answers as to what Level 1 Conformant Arrays are (not that TFA is bad, just on places like SO).

Pascal has a nominal type system, like Go, so

TypeA = ARRAY [1..10] OF INTEGER;
TypeB = ARRAY [1..10] OF INTEGER;

are two different types. Moreover, the size of an array is part of its type, so

String256 = ARRAY [1..256] OF CHAR;

Specifies, well, an array of 256 characters.

So when you define a procedure in Pascal, the formal parameters have to have a type. Array types must include their sizes.

In other words, it’s not possible in Standard Pascal (Level 0) to have a procedure that operates on arrays of arbitrary size.

Level 1 introduces Conformant Arrays. That lets you pass array sizes as additional parameters to a procedure, and relaxes the typing rules so that types that are conformant to the formal parameter array type can be used.

For example:

PROCEDURE Foo(VAR a: ARRAY [low..high: INTEGER] OF INTEGER);

defines a procedure that takes an array of integers with the additional parameters of low and high, which indicate the bounds of the array.

(To date this remains my sole contribution to Stack Overflow…)

From doug-moen:

Peak dryness was achieved in the Algol 68 standard, which used a two-level van Wijngaarden grammar to formalize as much of the syntax and semantics as possible. (BNF grammars were invented for Algol 60, and they were such a hit that people wanted to take BNF to the next level.) But the Algol 68 report was so dry that, rumour has it, the only person who could read and understand it was Adriaan van Wijngaarden. Wirth resigned from the Algol 68 committee in protest.

All of Wirth’s languages are reactions to and attempts to improve on Algol. His first was Algol W, which was a refinement of Algol 60, and was originally presented to the Algol committee as the successor to Algol 60 (we got Algol 68 instead, which took ideas from Algol W).

The next I remember is Euler, which was a dynamically typed dialect of Algol: it was simpler, more regular, more powerful. I really liked it (in the context of its time, I mean, since it hasn’t aged well). It had features missing from Wirth’s later languages, like lambda expressions and standalone array literals.