Kindle Notes & Highlights
Read between February 27 - March 17, 2022
technology has a number of trade-offs where optimizing for one characteristic diminishes another important characteristic.
Looking for absolute truths in situations that are ambiguous and value-based is painful. Sometimes it helps just to highlight the fact that the disagreement is really over what to optimize for, rather than pure technical correctness.
It’s the basis of iteration—you build something, collect data on how it is performing, modify it to improve performance, and start the cycle over. This is how effective technology is built, so engineering teams should get comfortable using it to make hard decisions.
It’s surprisingly easy to change people’s minds about the inevitability of failure when you demonstrate that success is possible.
The hard problems around legacy modernization are not technical problems; they’re people problems. The technology is usually pretty straightforward.
The important word in the phrase proof of concept is proof. You need to prove to people that success is possible and worth doing.
There is a point where cynicism is so high, no single first step will ever provide enough value to prove the project will work.
The first question to ask is does this particular migration actually add any value at all? Or are we migrating because there’s a new shiny technology in front of us? After all, monoliths are not universally bad.
When things are working well and money is coming in, engineers can tolerate a multitude of sins. When things are bad, the perception of value added by nearly any change goes up.
Any system more than five years old will have at least a couple major things wrong with it.
you need to learn to talk about what you are doing in a way that minimizes the number of big decisions that need to be made—particularly big decisions that include changes in process or anything that would need multiple stakeholders to sign off on and many rounds of approvals to change.
You may think that by giving projects fancy names, projecting budgets, and settling staffing questions up front you are being diligent, and you are! But you’re also making the project look like a series of big decisions, which for audiences insulated from the day-to-day pain of legacy systems seems too risky.
opportunity cost is money lost by not doing something because you have chosen another opportunity instead.
Investing in the health of your technology makes sense to everyone only when the technology is visibly failing, and by that point, the problem is much larger and much harder to solve.
The pressure to delay maintenance work on legacy systems in favor of new features and products is constant at most organizations. There’s never a good time for it, although it always seems that if the organization could just get through the latest challenge, things will calm down and the cleanup can begin.
Organizations tend to underestimate the amount of work and level of investment modernization requires. An unfortunate consequence of that assumption is that they do not seek out expertise until they are in trouble.
Meetings with ever-expanding invite lists suggest something is wrong in that area of the project.
people react to a struggling project in basically two ways. There are the people who roll up their sleeves and focus on helping, even if helping means unglamorous work not usually part of their responsibilities, and then there are the people who spend the time they could be helping drafting excuses that explain why the failure is not their fault.
Coming in midstream means the project hasn’t officially failed yet, and what people are getting wrong, they are probably doubling and tripling down on.
Having unclear responsibilities means teams feel like they are asked to pick up the slack for someone else too often. They become self-righteous and start ignoring tasks that aren’t part of their jobs as they see it, making the situation worse.
When you can, it is always better to set up someone else for victory rather than solving the problem yourself.
reasons organizations try to fix things that aren’t broken. They assume new technology is more advanced than older technology. They aspire to artificial consistency. They confuse success with quality. They optimize past the point of diminishing returns.
Building a product from the beginning with a service-oriented architecture is usually a mistake.
the level of abstraction your design has should be inversely proportional to the number of untested assumptions you’re making.
The benefits of tight coupling are that one person can hold enough knowledge of the system in her head to anticipate behavior in a variety of conditions.
small organizations build monoliths because small organizations are monoliths.
nobody starts a large organization, just as nobody gives birth to a teenager.
Most monoliths will eventually have to be rethought and redesigned, but trying to pinpoint when is like trying to predict the exact moment you will outgrow a favorite sweater.
Don’t believe anyone who tells you that ditching your monolith is the solution to all your problems. Monoliths can and do scale. Sometimes they are more expensive to scale, but the notion...
Fixing things that are not broken means you’re taking on all the risks of a modernization but will not be able to find the compelling value add and build the momentum that keeps things going.
I had a friend who used to say her greatest honor was hearing a system she built had to be rewritten in order to scale it. This meant she had built something that people loved and found useful to the point where they needed to scale it.
Optimizing to minimize rewrites might seem like a sensible strategy, but if not properly reined in, it invites behavior that ultimately makes systems more brittle.
“Metawork is more interesting than work.” Left to their own devices, software engineers will almost invariably over-engineer things to tackle bigger, more complex, long-view problems instead of the problems directly in front of them.
Decisions motivated by wanting to avoid rewriting code later are usually bad decisions.
Set the expectation that all systems need to be rewritten eventually. Engineers at the highest level write programs that have to be revised. No one is smart enough to anticipate every new use case or feature, every advancement in hardware, or every adjustment or shift that might require code to be rewritten.
If an ugly piece of code meets its SLO, it might not be broken, it might be just an ugly piece of code. Technology doesn’t need to be beautiful or to impress other people to be effective, and all technologists are ultimately in the business of producing effective technology.
software engineers are socialized around the idea that their discipline is so difficult, nonengineers are incapable of understanding even the most basic concepts. Resistance from the nontechnical side of an organization tends to be dismissed as ignorance.
Contract testing is a form of automated testing that checks whether components of a system have broken their data contracts with one another.
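As a rough illustration of the idea in this highlight, the sketch below checks a provider's response against the fields and types a consumer expects. It is a minimal hand-rolled example, not the book's method or any particular contract-testing tool; `USER_CONTRACT`, `contract_violations`, and the sample payloads are all hypothetical names invented here.

```python
# Hypothetical contract: the fields (and types) a consumer expects
# in a provider's response. Invented for illustration only.
USER_CONTRACT = {"id": int, "email": str, "active": bool}


def contract_violations(payload, contract):
    """Return a list of human-readable violations (empty if the contract holds)."""
    problems = []
    for field, expected_type in contract.items():
        if field not in payload:
            problems.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            problems.append(
                f"wrong type for {field}: expected {expected_type.__name__}, "
                f"got {type(payload[field]).__name__}"
            )
    return problems


# Extra fields are tolerated; only the fields the consumer relies on are checked.
good = {"id": 7, "email": "a@example.com", "active": True, "extra": "ignored"}
# The provider changed "id" to a string and dropped "active".
bad = {"id": "7", "email": "a@example.com"}

print(contract_violations(good, USER_CONTRACT))
print(contract_violations(bad, USER_CONTRACT))
```

Run in an automated test suite, a check like this would flag the second payload before the change reaches the consumer.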
Engineers make decisions that are worse for the health of systems overall but are less likely to trigger outages that they can be blamed for as individuals.
You have to be comfortable with the unknown. You can do that by emphasizing resilience over reliability.
“Anything over four nines is basically a lie.” The more nines you are trying to guarantee, the more risk-averse engineering teams will become, and the more they will avoid necessary improvements.
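To put the "nines" in concrete terms, the arithmetic below converts an availability target into the downtime it permits per year. This is a back-of-the-envelope sketch assuming a 365-day year; the function name `allowed_downtime_minutes` is made up for this example.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def allowed_downtime_minutes(nines):
    """Yearly downtime budget for an availability of `nines` nines (e.g. 4 -> 99.99%)."""
    unavailable_fraction = 10 ** -nines
    return unavailable_fraction * MINUTES_PER_YEAR


for n in (2, 3, 4, 5):
    print(f"{n} nines: about {allowed_downtime_minutes(n):.2f} minutes of downtime per year")
```

Under these assumptions, four nines leaves roughly 52.6 minutes of downtime a year and five nines leaves about 5.3, which gives a sense of how little room for change each extra nine allows.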
When organizations stop aiming for perfection and accept that all systems will occasionally fail, they stop letting their technology rot for fear of change and invest in responding faster to failure.
Resilience in engineering is all about recovering stronger from failure. That means better monitoring, better documentation, and better processes for restoring services, but you can’t improve any of that if you don’t occasionally fail.
Leaders have their fiefdoms. They fought hard for the resources they have. If they reroute even a small portion of those resources to institutional problems while their peers ignore the problem and the problem is not solved, those resources could be permanently forfeited.
Code Yellow, which is a cross-functional team created to tackle an issue critical to operational excellence.