Kindle Notes & Highlights
Read between
May 2 - August 20, 2022
Because we don’t talk about modernizing old tech, organizations fall into the same traps over and over again. Failure is predictable because so many software engineers think the conversations about modernizing legacy technology are not relevant to their careers.
We are horrified to discover that most people do not actually care how healthy a piece of technology is as long as it performs the function they need it to with a reasonable degree of accuracy in a timeframe that doesn’t exhaust their patience. In technology, “good enough” reigns.
The first mistake software engineers make with legacy modernization is assuming technical advancement is linear.
Changing technology should be about real value and trade-offs, not faulty assumptions that newer is by default more advanced.
The lesson to learn here is the systems that feel familiar to people always provide more value than the systems that have structural elegances but run contrary to expectations.
Engineers tend to overestimate the value of order and neatness. The only thing that really matters with a computer system is its effectiveness at performing its practical application.
“the single worst strategic mistake that any software company can make.”
Unfortunately, when confronted with the troubles of existing systems, engineering teams tend to build the most momentum around starting from scratch.
So programmers prefer full rewrites over iterating legacy systems because rewrites maintain an attractive level of ambiguity while the existing systems are well known and, therefore, boring.
It’s no accident that proposals for full rewrites tend to include introducing some language, design pattern, or technology that is new to the engineering team. Very few rewrite plans take the form of redesigning the system using the same language or merely fixing a well-defined structural issue.
A big red flag is raised for me when people talk about the phases of their modernization plans in terms of which technologies they are going to use rather than what value they will add.
Modernizations should be based on adding value, not chasing new technology.
Familiar interfaces help speed up adoption.
Technical debt is most likely to happen when assumptions or requirements have changed and the organization resorts to a quick fix rather than budgeting the time and resources to adapt.
Large problems are always tackled by breaking them down into smaller problems. Solve enough small problems, and eventually the large problem collapses and can be resolved.
These are the kinds of situations where people become frustrated and start convincing themselves that the best thing to do is throw the whole thing out and build it from scratch.
When both observability and testing are lacking on your legacy system, observability comes first. Tests tell you only what won’t fail; monitoring tells you what is failing.
The most relevant guide for legacy modernizations is Michael Feathers’ Working Effectively with Legacy Code.
Software can have serious bugs and still be wildly successful. Lotus 1-2-3 famously mistook 1900 for a leap year, but it was so popular that versions of Excel to this day have to be programmed to honor that mistake to ensure backward compatibility. And because Excel’s popularity ultimately dwarfed that of Lotus 1-2-3, the bug is now part of the ECMA Office Open XML specification.
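As a rough illustration of what "honoring that mistake" can look like in code, here is a minimal sketch. The function names and the `compat` flag are hypothetical, chosen for this example only; this is not how Excel or Lotus 1-2-3 actually implement it. It simply contrasts the Gregorian leap-year rule with a compatibility rule that special-cases 1900.

```python
import calendar


def lotus_is_leap(year: int) -> bool:
    """Leap-year check that reproduces the Lotus 1-2-3 quirk.

    Hypothetical helper: 1900 is (incorrectly) reported as a leap year;
    every other year follows the normal Gregorian rule.
    """
    return year == 1900 or calendar.isleap(year)


def days_in_february(year: int, compat: bool = False) -> int:
    """Return the length of February, optionally in 'compat' mode."""
    leap = lotus_is_leap(year) if compat else calendar.isleap(year)
    return 29 if leap else 28


# Under the Gregorian rule 1900 is not a leap year; in compat mode it is.
print(days_in_february(1900))               # 28
print(days_in_february(1900, compat=True))  # 29
```

The point of the sketch is that backward compatibility often means keeping a known-wrong special case alive on purpose, isolated behind an explicit switch.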
Building a modern infrastructure is not a goal.
In all likelihood, the business side of the organization does not understand what’s wrong with the existing system. Rolling out features they already have is not something they will celebrate.
If you’re thinking about rearchitecting a system and cannot tie the effort back to some kind of business goal, you probably shouldn’t be doing it at all.
The number-one killer of big efforts is not technical failure. It’s loss of momentum.
If you are running a team tasked with just cleaning up the debt and migrating onto more suitable technologies, it means the existing organization has failed to adapt.
Left to their own devices, software engineers will almost invariably over-engineer things to tackle bigger, more complex, long-view problems instead of the problems directly in front of them.
When organizations stop aiming for perfection and accept that all systems will occasionally fail, they stop letting their technology rot for fear of change and invest in responding faster to failure.
Problem-solving versus problem-setting is the difference between being reactive and being responsive.
A good rule of thumb is questions that begin with why produce more abstract statements, while questions that begin with how generate answers that are more specific and actionable.
Individual incentives have a role in design choices. People will make design decisions based on how a specific choice—using a shiny new tool or process—will shape their future.
To save face, reorgs and full rewrites become preferable solutions, even though they are more expensive and often less effective.
“The greatest single common factor behind many poorly designed systems now in existence has been the availability of a design organization in need of work.”
When an organization has no clear career pathway for software engineers, they grow their careers by building their reputations externally.
By defining what the expectations were for every experience level of engineering and hiring managers who would coach and advocate for their engineers, engineers could earn promotions and opportunities without the need to show off.
Well-integrated, high-functioning software that is easy to understand usually blends in. Simple solutions do not do much to enhance one’s personal brand. They are rarely worth talking about.
Introducing new languages or tools to optimize performance for the sake of optimizing performance
Most of the systems I work on rescuing are not badly built. They are badly maintained.
Conway argued against aspiring for a universally correct architecture. He wrote in 1968, “It is an article of faith among experienced system designers that given any system design, someone someday will find a better one to do the same job. In other words, it is misleading and incorrect to speak of the design for a specific job, unless this is understood in the context of space, time, knowledge, and technology.”
Joel Spolsky once described rewriting software as the single worst strategic mistake any organization could make, but he attributed its nearly universal appeal to a clever maxim that code is easier to write than read.
A hundred errors on a legacy system is not failure-prone if it handles two million requests over that period. When looking at legacy systems, we tend to overrepresent failures.
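To put the numbers in that highlight in proportion, a quick calculation helps. This is only a sketch: `error_rate` is a hypothetical helper, and the figures are the ones quoted above (one hundred errors, two million requests).

```python
def error_rate(errors: int, requests: int) -> float:
    """Fraction of requests that ended in an error."""
    if requests <= 0:
        raise ValueError("requests must be positive")
    return errors / requests


rate = error_rate(100, 2_000_000)
print(f"{rate:.4%}")  # 0.0050%
```

Read as a rate rather than a raw count, the hundred errors amount to one failure per twenty thousand requests.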
The systems we like to rewrite from scratch also tend to be complex with many layers of abstraction and integrations.
Our perception of risk cues up another cognitive bias that makes rewrites more appealing than incremental improvements.
When success seems certain, we gravitate toward more conservative, risk-averse solutions. When failure seems more likely, we switch mentalities completely. We go bold, take more risks.
We are swapping a system that works and needs to be adjusted for an expensive and difficult migration to something unproven.
Legacy modernizations are ultimately transitions and require leaders with high tolerance for ambiguity.
It is impossible to improve a large, complex, debt-ridden system without breaking it.
As an industry, we reflect on success but study failure. Sometimes obsessively. I’m suggesting that if you’re modeling your project after another team or another organization’s success, you should devote some time to actually researching how that success came to be in the first place.
The most valuable skill a leader can have is knowing when to get out of the way.
Automation that fails either silently or with unclear error messages at best wastes a lot of valuable engineering time and at worst triggers unpredictable and dangerous side effects.
Future-proofing means constantly rethinking and iterating on the existing system. People don’t go into building a service thinking that they will neglect it until it’s a huge liability for their organization. People fail to maintain services because they are not given the time or resources to maintain them.
Legacy modernizations themselves are anti-patterns.