Average English

On the merits of writing unsurprising English

I deeply believe that writing is a superpower. Especially now that we are interfacing with LLMs. Or, in Andrej Karpathy’s words: “The hottest new programming language is English.”1

But this goes far beyond writing prompts. It’s also about writing unsurprising, average English.

No surprises here

Let’s start with the principle of unsurprising language. I like to suggest to developers that they write code that is least surprising.

The principle of least astonishment or surprise originates from systems language design. Quoting the original source:

For those parts of the system which cannot be adjusted to the peculiarities of the user, the designers of a systems programming language should obey the “Law of Least Astonishment.” […] Widely accepted conventions should be followed whenever possible, and exceptions to previously established rules of the language should be minimal. 2

What does idiomatic mean?

It is that latter part about language or code that we’re interested in here. I think you can paraphrase it as: be idiomatic. What does that mean? Idiomatic language is the use of phrases, expressions, and structures that appear natural to the speakers or practitioners in the field.3

Applied to code this means using the right keywords, functions, and patterns that are commonly used in the programming language or framework you are working with.
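Here is a minimal sketch of what that can look like, using Python purely for illustration (the function names and sample data are made up). Both functions return the same result, but the second one uses the conventions a Python reader expects, namely `enumerate` and a list comprehension, so there is nothing to decode.

```python
def label_items_surprising(items):
    # Works, but manual index bookkeeping is not what Python readers expect.
    result = []
    i = 0
    while i < len(items):
        result.append(str(i + 1) + ". " + items[i])
        i = i + 1
    return result


def label_items_idiomatic(items):
    # enumerate() with a start value plus an f-string: the conventional form.
    return [f"{position}. {item}" for position, item in enumerate(items, start=1)]


headings = ["Intro", "Setup", "Usage"]
print(label_items_idiomatic(headings))
```

The idiomatic version is shorter, but brevity is a side effect. The real gain is that a reviewer recognizes the pattern at a glance.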

Let me give you a real-world example. If we are working with a client, we establish a common vocabulary. They describe to us how they want an interactive component that does x, y and z and we respond with: Ah, you want a table of contents component and a search bar with autocomplete. We could have decided to call it an “anchor navigation” or “magic search”, but we chose the term with least ambiguity and highest specificity.

Why?

If you search for “table of contents component”, you find lots of helpful examples and code snippets for the thing you want. What are these UI elements called at Wikipedia, MDN, or the NestJS documentation? Table of contents or TOC.

You search for “anchor navigation” and you find a lot of different things that are not what you are looking for. The next developer reviewing your code initially has no clue what you are talking about. They have to read the code and figure out what you meant. This is not only a waste of time, but it also leads to misunderstandings and bugs.
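The same choice shows up in code. The following is a hypothetical sketch in Python, not taken from any real project: the names `TableOfContentsEntry`, `slugify` and `build_table_of_contents` are my own assumptions, picked because they match the term people actually search for.

```python
import re
from dataclasses import dataclass


@dataclass
class TableOfContentsEntry:
    title: str
    anchor: str  # URL fragment the entry links to


def slugify(title: str) -> str:
    # Lowercase, then collapse every run of other characters into one hyphen.
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def build_table_of_contents(headings: list[str]) -> list[TableOfContentsEntry]:
    # Named after the widely used term, not "anchor navigation".
    return [
        TableOfContentsEntry(title=heading, anchor=f"#{slugify(heading)}")
        for heading in headings
    ]


toc = build_table_of_contents(["Getting Started", "API Reference"])
for entry in toc:
    print(entry.title, entry.anchor)
```

Had the function been called `build_anchor_navigation`, the code would run just the same, but the next developer would have to read the body to learn what it builds.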

As developers, we have a natural tendency to create neologisms, fight tooth and nail over naming and thus converge on idiomatic patterns.4 To an extent, I’m happy about this creative outlet. On the other hand, constant deliberations over naming are a waste of time. Not reinventing the wheel also applies to language that has already been well formalized.

Instead, you think hard about how to express things in a way that is least surprising to the next developer.5 That is time well spent because it compounds to time saved.

Idiomatic language in action.

Writing actionable tickets

Now, let’s talk about writing user stories and tickets. Incredibly important and consequential pieces of data. Be respectful of your colleagues and write user stories that are precise, easy to parse and unsurprising when they look deeper into them.

Whether you find yourself creating the initial user story or re-wording a ticket, the goal is not to have user stories that are as generic as possible. I see this happen all the time.

Take this example:

As a user, I want to be informed about important topics and be able to jump to them.

You would have been unpleasantly surprised to see what this user story actually meant.

Would you have thought that it asked for a popup? Would you say that being presented with a popup that asks the user to subscribe to a newsletter is something the user would want? Would you think that the definition of important topics is shared across all users? Probably not.

So instead, you might put it differently and reflect the true intent of what we’re doing (even if it sounds less nice):

As a business, we want to prompt first-time visitors with a popup so we can collect their email addresses and increase newsletter conversions.

If you see a generic user story, challenge it. Make it specific. Make it honest.

Making a point

In journalism school, you learn that everyone can write long, meandering articles. The real skill is to edit and subtract until nothing can be taken away. The same applies to software engineering. Think about what to say ahead of time. Use the bottom line up front principle. Write it down, cut out the fluff.

Do you like those Fireship short-form videos? They are a great example of this principle in action. You can rest assured that Jeff Delaney spends a lot of time and effort on the scripts he writes for those videos. He is a master of brevity and clarity.6

So my advice would be to seek out every opportunity to practice conveying your point in as few words as possible. In a sequence and cadence that makes sense. So that people know what to expect (no surprises).

Software 3.0

Prompting feels a little like googling. You formulate a query, hoping for the best possible answer. The more specific and well-structured your query, the better the results.

With the advent of LLMs, that has become doubly true. You want to write code that is as statistically close as possible to the “average” code that LLMs have been trained on. Conversely, you want to query LLMs in language that is as close as possible to the “average” English that LLMs have been trained on. The more precise your language, the better the results. Dropping specific concepts and verbs into the model will yield specific results.

Your ROI

Sure, LLMs are forgiving of typos and fuzziness. Claude will create a well-formed PLAN.md for you. But why not use the opportunity to improve your own language skills? The more you practice, the better you get. LLMs are also inherently bad at precise language. They are trained to answer in a language that is appealing and positively framed. Not a role model for clarity.

Don’t fall into the trap of using LLMs as a crutch for your own language skills. Even higher up the food chain, I see unedited AI slop getting passed around as briefings, requirements or PRDs now.

A good indicator that language truly is a rare skill.

Rare means valuable.

Footnotes

  1. See Andrej Karpathy’s tweet and his Software 3.0 talk. ↩︎

  2. The fact that the source is 50 years old should be testament alone to the timelessness of this principle. Bergeron et al. (1972). Systems programming languages. In Advances in Computers (Vol. 12, pp. 175–284). ↩︎

  3. If you wanted, you could argue about whether constructions or vocabulary specific to a certain programming language are really more idiosyncratic than idiomatic. But I don’t. ↩︎

  4. As an old programming humor comic strip puts it: “The only valid measurement of code quality is WTFs/minute”, where fewer is better. ↩︎

  5. There are two hard things in computer science: cache invalidation, naming things and off-by-one errors. ↩︎

  6. This also maps nicely to the adage of “every line of code is a liability”. ↩︎