The oral tradition that built software may not survive AI

Until I became a software engineer at 32, my whole professional life was organized around the written word. I was a historian, one who was firmly anchored in books and archives and articles. I switched careers for reasons that aren’t important here and that I’ve written about elsewhere; suffice it to say, the job market for historians was sufficiently terrible that I wanted to do something else. I became a software engineer because I liked the problem-solving and design aspects of it. I work as a backend engineer for Hagerty Insurance. Somehow, I’ve been able to fit into it, perhaps even do well at it. But the thing that continues to confound me in so many ways in this job is that so little is ever written down.

The reality is this: Software development is an oral tradition. Especially when you’re just starting out as an engineer, you’re not working on brand-new code; you’re probably in a legacy codebase. You’re going to face more questions than answers about what stuff does or why it was written the way that it was, and when you go looking for answers, there’s not going to be much written down. Perhaps there’s an early design doc, but then it turns out that everything was substantially revised before work began. Maybe there are a few wiki pages explaining known issues, some of which were solved a long time ago and others that have been left to molder in the codebase. Somebody might have left a comment in the code itself, but typically it’s a warning not to change something or else something else will break.

In this respect, being a historian turned out to be an unexpected cultural advantage. Historians are used to reconstructing stories from fragments rather than finding neat explanations in the archive. So when it comes time to make a change in an unfamiliar codebase, you usually wind up finding another developer to ask for help. They’ve been there long enough to understand what’s happening under the hood, and they might be able to explain why changing one seemingly harmless thing could turn into a very bad idea.

Software engineering has an ambivalent relationship with documentation. Everyone agrees documentation matters in theory, but in practice it’s inconsistent, outdated, or missing entirely. Part of that is simple inertia. Writing documentation is usually less interesting than writing the code itself. But it’s also ideological. The Agile movement emerged in part as a reaction against the heavily documented Waterfall methodology, and one of Agile’s core values explicitly prioritizes “working software over comprehensive documentation.” In escaping bureaucratic overdocumentation, the industry also normalized underdocumentation.

Comparing software development to an oral tradition might make some of my friends still in higher education wince. Software engineers don’t perform their code; we don’t tell it as a story, and if somebody asks me to tell them why we wrote a stored procedure as opposed to something else, I’m not going to reply in metered verse. But oral traditions do have a teaching component, and at that level it is replicated among engineers. Certain software development patterns or implementations will be prioritized over others, or deprioritized (engineers tend to have sections of a codebase that they hold up as a cautionary example of what not to do). This is a major part of how new engineers are trained.

It’s not that oral knowledge is a bad thing or that documentation is an inherently good thing. Societies relying on oral tradition to pass down culture and information did so stably, in many cases for thousands of years, and they developed highly accurate ways of transmitting that information. But in one respect, software engineering is very different: the turnover among its practitioners. Societies that relied on oral tradition didn’t lose their storytellers every five to seven years, but even employers at prestigious tech companies can reasonably expect to shed engineers every few years. I was stunned to find that within a few years on a job, there were projects where I might be the only person left who had originally worked on them. And the crux of what gets lost, more than “what does this do?”, is “why was it written this way, and what will happen if I change something?”

The result is a constant drain of domain knowledge. That doesn’t just make onboarding new people slower and more difficult; it makes solving ingrained tech debt that much harder. If people new to a codebase dive in and start making changes, you run the risk of injecting a great deal of instability and making damaging mistakes. Tech debt is itself a burgeoning problem and has been for decades, and every attempted fix becomes riskier, which makes tackling it even less attractive than it might otherwise be.

It’s tempting therefore to imagine that generative AI will step into the breach and solve this for us. After all, even if you don’t want to turn a large language model (LLM) loose on a legacy codebase (and there are plenty of reasons that you shouldn’t), having it generate documentation on the codebase itself might sound like a solution to the absence of other written information. LLMs can certainly summarize code back to you.

But hold off on that idea. Beyond hallucinations, there’s a deeper problem: Writing documentation is itself part of the thinking process. Whether I’m writing history or software, putting an approach into words helps refine it before I sink hours into implementation. Documentation also captures intent. An LLM may be able to summarize what a codebase does, but it cannot reliably explain why a developer chose one approach over another, or what trade-offs shaped that decision.

Moreover, documentation is a chance for somebody else to understand why I did what I did. If they plan to change what I wrote (especially in a few years), they might understand why I needed to write it that way and what might be lost if they take it out. An LLM can read code that I’ve written. It might even scan a large codebase and accurately summarize what it’s doing. But it can’t assess authorial intent.

The solution isn’t offloading the mental labor, at least if you want to reduce tech debt and ease onboarding. Instead, it’s reembracing the written word, or at least the parts of it that can work for us. Just as there’s a culture of orality in tech, there’s also a culture of the written word, and they can coexist with each other. One of the best examples came during the development of ARPANET, where engineers built a culture around RFCs, or Requests for Comments: informal memoranda used to discuss standards, problems, proposals, and best practices. Some were serious, some humorous, but all were written for other engineers. Imagine treating documentation the same way: not as bureaucratic homework for managers, but as communication with the people who will inherit your code later.

We reject documentation at our own peril. It’s been done badly in the past: Waterfall foundered in no small part because it had engineers writing records and papers for project managers and administrators. We can let them write their own documentation, and they should. For our part, we need to write our own.