Four Rules for Software Longevity

The very repairable and long-lasting Dualit toaster

For a long time we had a fun bit of technology in the house that would make a cheering sound whenever Arsenal scored, running off a Raspberry Pi, using the BBC live football “vidiprinter” and our streaming music sound system. I was always amazed to find it still worked at the start of each new football season: an untouched, unloved little piece of code, never upgraded, continuing to quietly do its thing. Until eventually it stopped. Not because, as my amusing friends would have it, Arsenal stopped scoring goals, but because the BBC service it relied on finally got moved.

Is it really possible to build technology and create services that last for 10 years or more? I don’t mean by burying it in a mountain like the 10,000 year clock, or sending it into space like the Voyager space probes or the Solar Orbiter. I am talking about practical small web projects rather than industrial or embedded systems. In my case: as I refresh various 7-year-old projects like the Arsenal cheer creator and our fridge dashboard, how long will they last this time? Can they ever compete with MOCAS, which has been tracking contracts and payments for the US Department of Defense since 1958 and is thought to be the oldest computer program still in use?

Repairability

The longest-serving appliance in our house is a Dualit toaster. With just a screwdriver all the parts can be replaced, and they’re readily available. It reminds me of the famous saying “I’ve had the same hammer for 50 years. I’ve replaced the head twice, and the handle four times, but it’s still the same hammer”. The Dualit is a simple device with a timer and heating elements; no sensors, modes or clever electronics. Sometimes appliances do reach the end of their useful life, as in the recent case of a very old TV set that took down the broadband for an entire Welsh village every time it was switched on in the morning. Even so, the right to repair movement campaigns for easier consumer repairability for all devices. From 2021 in Europe, manufacturers will have to supply spare parts for 10 years after a product is sold, and products like the Fairphone are designed with repairability in mind.

Why doesn’t this work for software? Actually, people will tell you it does. They’ll explain that clear APIs and standards mean that individual components can be swapped out, just like toaster parts. However, this ignores the failures that happen over a longer time span: underlying operating systems, languages and frameworks fall out of support, hardware and networking assumptions change, providers disappear, new requirements are imposed. It would be as if the electricity supply to the toaster changed voltage every few years, bread sizes doubled, and new regulations declared it a fire hazard. A good example is security: when secure connections became the default in web browsers from 2018, it forced upgrades across the industry. Secure network connections rely on up-to-date libraries and root certificates, with a fundamental assumption that continual patching and upgrading takes place.

Can you have both security and longevity?

This is a great question, and the hardest one to answer. Consider health systems: they certainly need both, and the consequences of a security breach in a hospital can be dire. September 2020 saw the first patient death attributed to ransomware. A misdirected cyberattack disrupted essential systems at a hospital in Düsseldorf (30 servers were encrypted by the ransomware); a patient had to be sent 20 miles to another hospital and tragically died en route. So a regime of regular upgrading and patching is essential, and indeed a standard such as Cyber Essentials mandates applying all critical patches within 2 weeks. One consequence in practice is that older versions of Microsoft Windows, which no longer receive security patches, have to be phased out. In 2018 the NHS purchased Windows 10 licences to upgrade over a million staff and machines.

But what about specialised medical equipment that is expected to last 10 years but relies on a vulnerable Windows XP terminal, like many CT or MRI scanners? Manufacturers of regulated medical devices need thorough quality and validation procedures before making updates, to ensure ongoing clinical safety. Even if they and their hospital customers can keep up to date with patching, a wholesale upgrade to a new version of Windows may not happen (according to Palo Alto Networks, more than 80% of US imaging systems are running on out-of-date operating systems with known vulnerabilities and no ongoing security updates). One solution is to create separate private networks as safe cocoons within which to leave older technologies running where necessary, minimising exposure to outside attackers. As we’ll see below, another might be to rely on simpler, smaller operating systems that are easier to protect or cheaper to upgrade and test.

The flow strategy: continual migration

Should we even aim for a decade of minimal maintenance? Instead we could plan to be in a state of regular migration to newer platforms and systems: always be in flow, practise continuous deployment over the long term. Expect the platforms we use to practise planned obsolescence. We’re forced into this position with personal media: our tapes became CDs which became MP3 files, and now playlists we curate on Spotify. Our family photos went from negatives, slides and prints to scans and now to digital files in the cloud. Trying to think about the long term, I have photos stored locally in simple folders marking year and occasion, with a changing roster of cloud backup providers as a safety net, but every few years the work I’ve done to mark favourite items is lost in a hard disk failure, or a move from iPhoto to Aperture to Apple Photos.

For the flow strategy to work you need to migrate your code to new platforms. It’s easy to test if that’s worked as expected. But you also need to migrate your data, which is much harder to test. Keeping data in simple, standalone, easy-to-understand formats helps, as does avoiding proprietary databases. You might think you’re safe as GDPR enshrines a right to data portability, but in reality progress has been slow. For example, the high-profile Data Transfer Project between Google, Twitter, Facebook, Microsoft and Apple has so far only enabled transfer of photos (without metadata) between Google and Facebook.
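One way to make a data migration more testable is to fingerprint every record before and after the move and compare the two sets. The following is only a sketch: the record layout and the names fingerprint and migration_diff are hypothetical, and it assumes each record can be represented as a JSON-serialisable dictionary.

```python
import hashlib
import json


def fingerprint(record: dict) -> str:
    """Hash a record in a canonical form so that key order does not matter."""
    canonical = json.dumps(record, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def migration_diff(old_records, new_records):
    """Return (missing, unexpected) sets of fingerprints after a migration."""
    old = {fingerprint(r) for r in old_records}
    new = {fingerprint(r) for r in new_records}
    return old - new, new - old


# Hypothetical example data: the second photo lost its "favourite" flag.
old_store = [
    {"file": "2013/holiday/001.jpg", "favourite": True},
    {"file": "2013/holiday/002.jpg", "favourite": True},
]
new_store = [
    {"favourite": True, "file": "2013/holiday/001.jpg"},
    {"file": "2013/holiday/002.jpg"},
]

missing, unexpected = migration_diff(old_store, new_store)
print(f"{len(missing)} changed or lost, {len(unexpected)} unexpected")
```

Here the first photo matches despite its keys being in a different order, while the second shows up once in each set because its metadata changed on the way. A check like this only works when the data is in a format simple enough to read back out of both systems.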

Lessons from tiny code

Most commercial production systems have teams of people feeding and watering them, so the genuinely frozen and unattended small pieces of code I had at home for the last 7 years provide a nice time capsule; a simple case study of the factors that help and hinder longevity.

This was code with a simple structure running on a basic computer on an isolated home network, using well supported libraries and tools common among open source hobbyists and makers, without any automatic updates.

The refreshed fridge dashboard. Weather is still from forecast.io, which became Dark Sky and was then acquired by Apple in 2020. How much longer will that API work? Strava needed new authentication; the Park Run panel parses the web site. The Covid-19 data is assembled from the great London Data Store, and I sincerely hope that API is no longer required in 10 years’ time!

Over time very simple maintenance was needed. Data and APIs changed or needed different authentication (although the most backward one, which you might have guessed would be the most fragile, is a web scraper to access Park Run results that still runs unchanged). Output devices needed refreshing: the trusty Squeezebox music system is no more, but a simple speaker plugged into the Pi is better; the “Kindle on the fridge” display has been remarkably resilient, although a second-hand replacement was needed a few years ago. But the core of the system has continued working untouched for many years.

So what finally broke it? It was security that got me in the end. New API endpoints insisted on secure connections, which unsurprisingly didn’t work on a 2013 version of Python 2.7. On my first generation Raspberry Pi things had got too tangled up, and it wasn’t clear this upgrade would be possible without touching the entire stack down to the operating system. Once you begin the upgrade chain it is hard to stop: trying to fix one library I ended up migrating to Python 3, Processing 3, a newer Debian, a new toolchain for managing Python runtimes and environments and, in for a penny in for a pound, a new Raspberry Pi.
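A quick way to see whether a given Python runtime is likely to cope with modern secure endpoints is to ask its ssl module what it supports. The sketch below assumes Python 3.7 or later, since the HAS_TLSv1_2 and HAS_TLSv1_3 flags do not exist in older versions; tls_report is a hypothetical helper name, not part of the original project.

```python
import ssl
import sys


def tls_report() -> dict:
    """Collect the facts that decide whether HTTPS endpoints will work."""
    context = ssl.create_default_context()
    return {
        "python": sys.version.split()[0],
        # The OpenSSL build Python was linked against; old builds lack new protocols.
        "openssl": ssl.OPENSSL_VERSION,
        "tls_1_2": ssl.HAS_TLSv1_2,
        "tls_1_3": ssl.HAS_TLSv1_3,
        # The lowest protocol version a default client context will accept.
        "minimum_version": context.minimum_version.name,
    }


for key, value in tls_report().items():
    print(f"{key}: {value}")
```

The values depend entirely on the machine it runs on. On a frozen system, an old OpenSSL version or a missing TLS 1.2 flag is an early warning that secure endpoints will start refusing connections.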

Thanks to xkcd for this gem

Readers familiar with the technical side will now be pointing out that package management and containers are the solution, removing dependencies on the operating system and the mess of library versioning. However, many of today’s relatively mature tools were not available 10 years ago. For example, pyenv launched in 2012, pipenv in 2017 and Docker in 2013. Will they still be the right tools in 10 years’ time?

Rules of Longevity

So here are my humble suggestions for four “rules”. I’m sure you’ll have your own so let me know what you think of these:

1. Keep it Stupid

The original KISS acronym stands for Keep it Simple, Stupid. In the spirit of KISS I have shortened it to KIS. When writing software, the “stupid” version is simpler and more readable. Architecting for separation of concerns or “don’t repeat yourself” may create an appealing elegance, but is not really as simple as a single function that does everything. If it isn’t going to be reused or adapted, don’t make it reusable. If it doesn’t need to scale, don’t make it scalable. If it doesn’t need to perform, don’t tweak the performance. Go with the stupid version.

2. Self-contained Pieces

Each piece of your project should be as standalone as possible. Keep the metadata for a photo inside the JPEG file. Keep the documentation inside the code, or in a README in the same directory. Don’t use a library or framework unless you really need to (see “stupid” above). Be skeptical of pre-processing steps and domain-specific languages, and stick to interpreted languages where possible. Don’t normalise. If you can get away with simple file storage, use that instead of a database. Prefer a network cable to wifi. Don’t use clever cloud services; they come and go too quickly. At the moment I am personally still using Medium for articles like this, which means I’m likely to need to migrate over the next few years when their business model changes or they get acquired or go out of business. I know I should really be keeping them in Markdown on a personal server, Indieweb-style.
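As a rough illustration of simple file storage instead of a database, the sketch below keeps one JSON record per line in a plain text file, using only the Python standard library. The function names, the file name goals.jsonl and the sample records are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def append_record(path: Path, record: dict) -> None:
    """Append one record to the file as a single line of JSON text."""
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record, sort_keys=True) + "\n")


def read_records(path: Path) -> list:
    """Read every record back; a missing file just means no data yet."""
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Hypothetical usage, written to a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "goals.jsonl"
    append_record(log, {"team": "Arsenal", "minute": 12})
    append_record(log, {"team": "Arsenal", "minute": 78})
    records = read_records(log)

print(len(records), "records read back")
```

The file can be opened in any text editor, copied with ordinary tools and read by almost any language. The trade-off is that there are no indexes, transactions or concurrent writers, which is fine for a project this small.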

3. Safety in Numbers

If your technology is in the mainstream of a hobbyist and maker community, your project has a great chance of persisting long into the future. What you really need are platforms used by a world full of tinkerers and enthusiasts with lively online communities, understanding and fixing things, working on open source projects. Think about the groups who are building emulators to let you run old Nintendo games on new hardware, or getting Doom running on a pregnancy testing kit. Python is a great platform in this regard. Personally I have been using it since the mid-1990s, and you can still hire Python developers today. A minimal Linux installation is much more likely to stand the test of time than a complex commercial operating system like Windows or macOS, although even the tested and hardened VxWorks real-time operating system on a 1997 Mars rover had to be patched remotely to fix a bug.

4. Don’t Touch

If you’ve done this right, your project should need a little spruce up every couple of years, and major maintenance every 5 or 10. In the meantime, leave it be. Don’t tinker. Don’t add features. Don’t refactor. If it ain’t broke… remember that every change is a risk. If you upgrade one small thing, it may break a whole chain of dependencies and before you know it you’ll be starting from scratch. This is the hardest rule for any developer. In fixing up my Arsenal cheer software I thought it would be even better if it read out the score and scorer. In order to do so I used Amazon’s text-to-speech Polly API, thus breaking rules 1, 2 and 3, and guaranteeing I will have to break rule 4 in a year or two!

Challenges for the future

Security is the one I keep coming back to, as it is so hard to see how you can get away without upgrades, and the external world continually imposes new requirements. We’re compelled to upgrade our cars to less polluting models, our appliances to more energy efficient models, our lightbulbs to CFLs and then LEDs. Similarly we need to keep an eye on security libraries, certificates and vulnerabilities.

Even harder will be dealing with accounts, passwords and authentication as we age. Following good advice I now use a password manager (with its own complex password), and two-factor authentication across many services, which generally depends on access to a mobile number or handset. This is a fragile system that is unlikely to last a decade, let alone be usable as we age. Will I be able to log in to my Raspberry Pi when I’m 80?

Reflection

It has been interesting thinking through these rules as they generally contradict how we normally think about scalability, maintainability, quality, performance and architecture. I am suggesting deliberately choosing bad architecture and antipatterns! But perhaps this makes sense if the main goal is sustainability — minimising effort, maintenance and replacement cost while maximising the usable life of a technology.

Acknowledgments

Thanks to Chris for a discussion that added some good ideas and Stefan for reviewing.

Technology at Our Future Health; Non-exec director at NHS Digital
