Context

Published . Estimated reading time: 9 minutes.


The big deal with RGBDS and its backwards compatibility is likely obvious to any seasoned member of GBDev, but maybe you, dear reader, are not one.

Let’s talk about what backwards compatibility is (in a rather general sense), and why it matters.

What is RGBDS?

The short answer can be found on its history page. But, essentially, it’s a collection of four tools that enable people to program their own games for the Game Boy, and which was initially released at roughly the same time as the Game Boy Color itself.

RGBDS was originally developed as a mostly single-person endeavour, but passed through several maintainers and contributors, as well as multiple platforms. And, recently, RGBDS has seen releases more frequent than ever, bigger than ever… and also some compatibility breakage every six months on average.

Why is this a problem?

Or: “why is backwards compatibility desirable”.

XKCD #2224

An article about programming, written by a nerd, wouldn't be complete without some XKCDs~

Due to the popularity of RGBDS within the niche of “Game Boy assembly programming”, over the years, quite a lot of code relying on it has accrued. Most of that code is either maintained by someone who uses their local copy of RGBDS, or is abandoned. It has turned out that backwards compatibility breakages affect both of these categories.

See, the maintainers of active codebases have complained that when they want to upgrade (to benefit from a new feature or from a bugfix), they have to also adapt to the breaking changes from the new version.

And, meanwhile, people interested in checking out some of the older codebases have been faced with the problem that the latest version of RGBDS just spewed a bunch of errors… and they had no idea which version would be the right one—for those who realised that using an earlier version would work, in the first place!

So, given these clear incentives for RGBDS not to break backwards compat, why has it been repeatedly broken for several years in a row?

The case against back-compat

Or: “how much do you actually want back-compat?”

RGBDS was made by humans. Humans are imperfect.

RGBDS was started back in 1997 as a single programmer’s hobby project, seemingly with only moderate knowledge of programming language theory. Add to that the fact that RGBDS changed hands several times over the years, still between hobby programmers, and it becomes fair to assume that some features were implemented without a lot of forethought.

Setting aside what works well, we are left with two categories:

Bugs bugs bugs 🐛

At first blush, fixing bugs seems uncontroversial. Oh no, Bad Behaviour! Who would want that? Let’s fix it!

XKCD #1172

And yet.

I like to point to one RGBDS bug report as a very good example of the XKCD above: issue #362, “Labels can be defined with colons in their name”. Here is some code that triggered the bug [1]:

MACRO mklabel  ; Defines a macro, that, when called...
Label_\1:      ; ...defines a label, "copy-pasting" the macro's first argument into its name.
ENDM

  mklabel x    ; Defines Label_x, perfectly legitimate.
  mklabel a:b  ; Defines Label_a:b, which shouldn't be possible!

Colons are not valid characters for symbol names in RGBASM (labels being one kind of symbol), so this is definitely a bug. Worse still:

Any character that can be used in a macro arg can be used in a symbol name using this trick. For example, you can even put spaces in a symbol name.

Spaces in symbol names are a huge problem, because they break the sym file format [2] that RGBLINK ends up emitting. So, naturally, the bug got fixed.
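To make that breakage concrete, here is a minimal Python sketch (not RGBDS code). It rests on two assumptions: that macro arguments are pasted in as plain text, and that a sym file line looks roughly like `00:0150 Label_x`, i.e. a hexadecimal bank and address followed by the symbol name. Both `expand_macro_body` and `parse_sym_line` are hypothetical helpers written for this illustration only.

```python
def expand_macro_body(body, args):
    """Toy model of macro expansion: paste each argument in place of \\1..\\9."""
    for index, arg in enumerate(args[:9], start=1):
        body = body.replace("\\" + str(index), arg)
    return body


def parse_sym_line(line):
    """Parse one assumed 'BB:AAAA Name' line into (bank, address, name).

    Returns None for blank lines and ';' comments.
    Raises ValueError when the line does not split into exactly two fields.
    """
    text = line.split(";", 1)[0].strip()
    if not text:
        return None
    fields = text.split()
    if len(fields) != 2:
        raise ValueError("cannot tell where the symbol name ends: " + repr(text))
    location, name = fields
    bank, address = location.split(":")
    return int(bank, 16), int(address, 16), name
```

With these helpers, `expand_macro_body(r"Label_\1:", ["a b"])` produces a label definition whose name contains a space, and a tool reading the resulting `00:0153 Label_a b` line with a whitespace-based parser like the one above has no way to recover the name.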

In and of itself, fixing the bug broke compatibility. But, since the behaviour wasn’t matching the documentation, correcting the behaviour appeared much more sensible to us maintainers than altering the documentation! Imagine reading a new tool’s documentation, and stumbling upon:

Symbol names can only contain the aforementioned characters, except if they are generated through a macro argument precisely at the end of a label, in which case no guarantees are made […]

Fixing this bug fixed the code shown above, and as usual after a bugfix, everyone was happier. …that is, until the following release, when someone else reported that their previously-accepted code was now rejected. They were using code highly similar to the above, but with $ as the “bad character” instead of :, because they were relying on another RGBASM feature. Why didn’t they notice? Well, it turns out that on other platforms (such as x86, which that user was used to), $ is not only permitted in labels, but common! [3]

The user could update their code to no longer rely on the bug, but this would have required significantly increasing its complexity, so they refused. Instead, we agreed upon a compromise, which would give them a way to stop relying on the bug with only minimal changes.

This was not an isolated case, either.

All of these perfectly illustrate Hyrum’s Law:

With a sufficient number of users of an API, it does not matter what you promise in the contract: all observable behaviors of your system will be depended on by somebody.

Unfortunately, “all observable behaviours” includes bugs. Yet, in the interest of having reliable software that performs as documented, I prefer fixing at least some of these bugs.

[1] Actually, MACRO mklabel was invalid at the time, and you had to use mklabel: MACRO. This blog’s code highlighter only knows the former syntax, though, so I cheated to keep it aesthetically pleasant.

[2] Actually, this specification didn’t exist yet at the time of this bug report. But the format was almost exactly the same.

[3] Amusingly, RGBDS v0.9.0 made $ valid in identifiers for that reason; so, five years later, we have come full circle 😆

It Sounded Better In My Head

After “things that don’t work”, let’s discuss “things that work but not well”. These are often called “papercuts”, because they’re not “lethal” to one’s experience… but still at least an annoyance.

A well-known example of a “papercut” is C++’s “Most Vexing Parse”:

#include <cstdio>
#include <string>

struct Horse {
	std::string name;
	unsigned int age = 0;
};

int main() {
	Horse cadey("Cadey", 26);
	puts(cadey.name.c_str());

	Horse baby("???"); // Default value used for `age`, here 0.
	puts(baby.name.c_str());

	Horse anonymous(); // Default value used for `name` (empty string) and `age`... right?
	puts(anonymous.name.c_str());
}

main.cpp: In function 'int main()':
main.cpp:17:24: error: request for member 'name' in 'anonymous', which is of non-class type 'Horse()'
   17 |         puts(anonymous.name.c_str());
      |                        ^~~~

(Try this example yourself!)

The “papercut” here is that the “obvious” syntax does not do what you expect. (The correct thing to do here is Horse anonymous;.) This is also known as a violation of the Principle of least astonishment (or “POLA” for short).

Why are special cases bad? Since we need to remember the tool’s rules, what’s just one more?

The first problem is that special cases increase “friction”: they make <the thing> significantly harder to learn, because there’s just more to learn; they also make it harder to use, since you have to keep in mind that the general rule has an exception (and, sometimes, that you must handle the exception specially). They also increase the chance of mistakes (including bugs), because humans are way better at sticking to rules than remembering exceptions.

(Also, this argument often comes from veterans who have gotten used to the papercuts, which makes it sound closer to “Skill issue” than I’d like.)

Let’s give an example in RGBASM, with what’s perhaps its most common papercut:

MACRO fancy_println
  println "✨ ", \#, " ✨"
ENDM

  println "one"
println "two"
  fancy_println "three"
fancy_println "four"

one
two
✨ three ✨
error: main.asm(8):
    syntax error, unexpected string, expecting : or ::
    To invoke `fancy_println` as a macro it must be indented
error: Assembly aborted (1 error)!

(Try this yourself!)

To be clear, the papercut here is that the built-in println directive can be indented or not, and that makes no difference; but the similar-looking fancy_println macro must be indented. More broadly, you have two animals that quack like a duck, fly like a duck, swim like a duck… but one of them tastes like red pepper.

Let’s do a brief aside on that To invoke `fancy_println` as a macro it must be indented line. I have witnessed several users reacting to it with a frustrated “if it knows what I mean to do, why doesn’t it just do it!?”, and I understand. The problem is that in this case, RGBASM can infer what was intended; but in some cases, it can’t, and thus the rule has to exist. An analogy may help: if you were to mis-type simklar, your reader may be able to guess that you meant similar; but if you typed dkg, did you mean dig or dog? Or maybe dug?
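The rule behind that error can be sketched as a toy classifier in Python. This is a deliberately simplified model and not RGBASM’s actual lexer: `DIRECTIVES` is a tiny illustrative subset, and `classify_line` is a hypothetical function that only looks at the first word of a line and at whether the line is indented.

```python
# Tiny illustrative subset of built-in keywords (an assumption of this sketch).
DIRECTIVES = {"println", "macro", "endm"}


def classify_line(line):
    """Guess what the first word of a source line is, using indentation only."""
    stripped = line.strip()
    if not stripped:
        return "empty"
    first_word = stripped.split()[0]
    # Built-in keywords are recognised wherever they appear.
    if first_word.lower() in DIRECTIVES:
        return "directive"
    # Any other name: indentation is the only hint available.
    indented = line[0] in " \t"
    return "macro" if indented else "label"
```

In this model, `println "two"` is a directive whether or not it is indented, the indented `fancy_println "three"` is a macro invocation, and the unindented `fancy_println "four"` is taken for a label name, which is why a colon is expected next.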

Summary

The key takeaway regarding backwards compatibility is that there is a fundamental tension between keeping what’s currently working, well, working; and changing what’s making life difficult. Both aim to improve the user experience, but sometimes, the same change can improve some users’ experience and harm others’!

So, then, the natural next step is to ponder how to handle changes. The next part in this series compares various strategies, and weighs their pros and cons.


