HowManyRegistersShouldAnX86_64CPUHave

This is a hypothetical complement to https://blog.yossarian.net/2020/11/30/How-many-registers-does-an-x86-64-cpu-have, which I enjoyed more than is probably good for me. That article looks at all the weird edge cases in a modern x86-64 CPU’s registers. x86-64 in general is a classic example of incremental bloat, so I’m curious to take the same data and the same article structure, and try to chop it down to what you might get if you could redo basically the same design from scratch. Note that I’m no CPU designer, and a mediocre assembly programmer at the very best, but it’s fun to toy with ideas. Most of my info comes from the aforementioned article.

Last updated Nov 30, 2020

General-purpose registers

We have 16 GPR’s, but they grew out of fewer, smaller, different registers, so they’re addressed as a giant pile of special cases. That’s silly: we only need 16 registers, plus instructions to extract whichever bits of them we want.
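To make "instructions to extract the bits of them we want" a bit more concrete, here is a minimal sketch in Python of what such extract/insert operations might compute. The names `extract_bits` and `insert_bits` are made up for illustration and are not real x86-64 instructions; the point is only that the AL/AH/AX/EAX views fall out of one generic shift-and-mask operation on a single 64-bit register.

```python
MASK64 = (1 << 64) - 1


def extract_bits(value: int, lo: int, width: int) -> int:
    """Return `width` bits of a 64-bit `value`, starting at bit `lo`."""
    return (value >> lo) & ((1 << width) - 1)


def insert_bits(value: int, lo: int, width: int, field: int) -> int:
    """Return `value` with `width` bits starting at bit `lo` replaced by `field`."""
    mask = ((1 << width) - 1) << lo
    return ((value & ~mask) | ((field << lo) & mask)) & MASK64


rax = 0x1122334455667788

eax = extract_bits(rax, 0, 32)  # the bits x86-64 calls EAX
ax = extract_bits(rax, 0, 16)   # the bits x86-64 calls AX
al = extract_bits(rax, 0, 8)    # the bits x86-64 calls AL
ah = extract_bits(rax, 8, 8)    # the bits x86-64 calls AH

# Overwrite only the "AH" byte, leaving the rest of the register alone.
patched = insert_bits(rax, 8, 8, 0xFF)
```

With something like this, the named sub-registers become (offset, width) arguments rather than separate architectural names.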

Current registers in this group: 68

Revised registers in this group: 16

Running revised total: 16

Special registers

The instruction pointer and flags registers are similarly broken down into weird sub-registers. We don’t need that, just IP and FLAGS.

Current registers in this group: 4

Revised registers in this group: 2

Running revised total: 18

Segment registers

Ah, time for the great sin of x86, its wacky segment handling. There are six segment registers, and in long mode (which is all we care about anymore) AMD and Intel have done their best to make them useless and irrelevant, but you still need to have them. Despite the flat address space, FS and GS are still used for a few things, mainly as base pointers for thread-local variables. For now let’s just pretend that we don’t need them as registers: FS and GS are subsumed by the model-specific registers that hold their base addresses (FSBASE and GSBASE, or IA32_FS_BASE and IA32_GS_BASE in Intel’s naming).

Current registers in this group: 6

Revised registers in this group: 0

Running revised total: 18

SIMD and FP registers

Like the original article, let’s do them in order: x87, MMX, SSE, AVX

x87

The x86-64 standard requires us to have SSE2 at least, which basically means x87 is obsolete. You never need to touch it.

Current registers in this group: 13

Revised registers in this group: 0

Running revised total: 18

MMX

The relationship between SSE and MMX is kinda weird, and I don’t understand it very well, so pardon any inaccuracies here. I think their feature sets overlap to some extent and MMX code is still occasionally used today(???), but MMX uses (part of) the x87 registers and SSE defines its own register set.

I think that it’s pretty safe to say we should just have one SIMD register set, and have all our SIMD instructions able to operate on it. All our SIMD instructions should also use the same status registers and such. So, for MMX specifically, we need no new registers.

Current registers in this group: 9

Revised registers in this group: 0

Running revised total: 18

SSE+AVX

x86-64 has 32 SIMD registers. Like the general purpose registers, they’re each made of multiple overlapping registers of different widths, and are apparently considered entirely different in the instruction encoding. So, all we really need is 32 SIMD registers and instructions with flexible addressing.
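As a rough illustration of "one register, flexible addressing", the sketch below models a single 512-bit register as a Python integer and derives the narrower views from it. The helper names `low_bits` and `lanes` are hypothetical, and this is a toy model of the overlap rather than how any real encoding works.

```python
ZMM_BITS = 512


def low_bits(reg: int, width: int) -> int:
    """Low `width` bits of a wide register (the XMM/YMM-style view)."""
    return reg & ((1 << width) - 1)


def lanes(reg: int, lane_bits: int, total_bits: int = ZMM_BITS) -> list:
    """Split a register into lanes of `lane_bits` each, lowest lane first."""
    mask = (1 << lane_bits) - 1
    return [(reg >> shift) & mask for shift in range(0, total_bits, lane_bits)]


# Build a 512-bit value whose sixteen 32-bit lanes hold 0, 1, ..., 15.
zmm0 = sum(i << (32 * i) for i in range(16))

xmm_view = low_bits(zmm0, 128)  # what x86-64 would call XMM0
ymm_view = low_bits(zmm0, 256)  # what x86-64 would call YMM0

xmm_lanes = lanes(xmm_view, 32, 128)  # four 32-bit lanes
ymm_lanes = lanes(ymm_view, 32, 256)  # eight 32-bit lanes
zmm_lanes = lanes(zmm0, 32)           # sixteen 32-bit lanes
```

The width and lane size are just parameters here, which is the property the revised design is after.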

Maybe we need a SIMD status or flag register somewhere? I’ll assume there’s space for it in the CPU FLAGS register, I suppose.

Current registers in this group: 96

Revised registers in this group: 32

Running revised total: 50

Bounds registers

The original article says we don’t need these, and I see no reason to disagree.

Current registers in this group: 7

Revised registers in this group: 0

Running revised total: 50

Debug registers

These look reasonable the way they are. I think that’s a first.

Current registers in this group: 6

Revised registers in this group: 6

Running revised total: 56

Control registers

I don’t know what most of these are used for, though some of them are used for OS-ish things like interrupts and setting up protected/long mode. So while we might be able to ditch one or two of them if we really tried, for now I’ll just accept them as they are.

Current registers in this group: 6

Revised registers in this group: 6

Running revised total: 62

System table pointer registers

These are annoying and full of old-fashioned state for things like segments, real-mode emulation and such. But they also do things that are still important: they tell the CPU what its memory layout looks like when it goes to and from system calls, interrupts, and task switches. So, like the control registers, we might be able to get rid of one or two of them, but I’ll be conservative and keep them as they are. You can find some more info on how they work on the OSDev Wiki.

Current registers in this group: 4

Revised registers in this group: 4

Running revised total: 66

Memory-type-range registers

Never heard of these before. Apparently most other people haven’t, either. We’ll ignore ’em.

Current registers in this group: ?

Revised registers in this group: 0

Running revised total: 66

Model specific registers

MSR’s basically occupy a 32-bit address space, as far as I can tell. There are apparently over 400 of them currently defined, and I know the function of three of them: there’s a couple that tell the OS kernel where to go on syscall/sysret instructions, and one that gives information about the APIC interrupt controller. So it’s safe to say they’re Important, and there’s bound to be a big pile of obsolete junk in there, but I’m not going to spend the time to sort it out.
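One way to picture a 32-bit address space with only a few hundred occupants is a sparse map from index to 64-bit value. The sketch below is a toy model, not an emulator: the class `MsrFile` and the exception `MsrFault` are invented for illustration, and the value written to the syscall entry register is an arbitrary made-up address. The three indices are the documented ones for the APIC base and the two syscall-related registers mentioned above.

```python
class MsrFault(Exception):
    """Hypothetical stand-in for the fault raised on an undefined MSR."""


class MsrFile:
    """Toy model: a sparse map from 32-bit MSR index to 64-bit value."""

    def __init__(self, defined):
        # Only the indices listed here exist; each one starts at zero.
        self._regs = {index: 0 for index in defined}

    def rdmsr(self, index: int) -> int:
        if index not in self._regs:
            raise MsrFault(f"undefined MSR {index:#x}")
        return self._regs[index]

    def wrmsr(self, index: int, value: int) -> None:
        if index not in self._regs:
            raise MsrFault(f"undefined MSR {index:#x}")
        self._regs[index] = value & ((1 << 64) - 1)


IA32_APIC_BASE = 0x0000001B  # APIC base address and enable bits
IA32_STAR = 0xC0000081       # segment selectors for syscall/sysret
IA32_LSTAR = 0xC0000082      # 64-bit syscall entry point

msrs = MsrFile([IA32_APIC_BASE, IA32_STAR, IA32_LSTAR])
msrs.wrmsr(IA32_LSTAR, 0xFFFFFFFF81000000)  # arbitrary example address
entry_point = msrs.rdmsr(IA32_LSTAR)
```

Almost all of the 2^32 possible indices are simply absent, which is why counting "how many MSR’s exist" means counting the entries in the map rather than the size of the index space.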

So, I’m going to take a vague guess. Judging from this list so far, x86-64 has about 3x as many registers as it needs, and it’s probably a vaguely reasonable assumption that the same holds true for MSR’s. There are apparently over 400 (documented) MSR’s, and a third of that is about 133, so let’s round it down to a nice number like 128. 128 MSR’s Should Be Enough For Anyone, right?

A more accurate count of which MSR’s are actually useful is left as an exercise to the reader.

Current registers in this group: 400+

Revised registers in this group: 128

Running revised total: 194
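For anyone who wants to check the arithmetic, here is a small tally of the per-group counts from the sections above. The counts are copied from this page; the variable names and the `floor_pow2` helper are just for illustration, and the memory-type-range registers are left out because their current count is listed as "?".

```python
# (group, current count, revised count), copied from the sections above.
GROUPS = [
    ("general-purpose", 68, 16),
    ("special", 4, 2),
    ("segment", 6, 0),
    ("x87", 13, 0),
    ("MMX", 9, 0),
    ("SSE+AVX", 96, 32),
    ("bounds", 7, 0),
    ("debug", 6, 6),
    ("control", 6, 6),
    ("system table pointer", 4, 4),
]


def floor_pow2(n: int) -> int:
    """Largest power of two that is <= n (for n >= 1)."""
    return 1 << (n.bit_length() - 1)


current = sum(cur for _, cur, _ in GROUPS)
revised = sum(rev for _, _, rev in GROUPS)
ratio = current / revised  # the "about 3x" figure, before counting MSR's

DOCUMENTED_MSRS = 400  # lower bound quoted above
msr_estimate = floor_pow2(DOCUMENTED_MSRS // round(ratio))

total_current = current + DOCUMENTED_MSRS
total_revised = revised + msr_estimate

print(f"non-MSR: {current} -> {revised} ({ratio:.2f}x)")
print(f"MSR estimate: {msr_estimate}")
print(f"overall: {total_current}+ -> {total_revised}")
```

Rounding to a power of two is only one reading of "a nice number"; any other rounding rule near 133 would work just as well for a guess this rough.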

Conclusion

From this brief glance, to make something Kinda Like x86-64, you probably need 200ish registers, and things that aren’t operating systems only actually need to care about maybe 50 of them. The running total of actual registers comes to 619+, and applications probably need to care about 150-180 of them. About 100 of those, however, are overlapping general-purpose and SIMD registers, which IMO are not a particularly nice design but also aren’t hard to understand or use. So the ugly design and the nice design are within a factor of 2-3x of each other in size, both overall and in user-facing surface.

Cruft tends to accumulate in corners and edges where it’s ugly, but is also mostly harmless. x86-64 is ugly as sin, but the ugliest places are hidden by the operating system and in old modes that are used rarely or only by old applications. So I sure won’t make any apologies for it, but it works. RISC-V is certainly nicer, but give it 40 years of continuous development and I bet it’ll collect a decent share of warts. ARM sure has.