Kindle Notes & Highlights
by Jez Humble
Read between July 3 - July 3, 2017
Once we have a prioritized list of target conditions and impact maps created collaboratively by technical and business people, it is up to the teams to determine the shortest possible path to the target condition.
Specifying target conditions rather than features allows us to rapidly respond to changes in our environment and to the information we gather from stakeholders as we work towards the target condition.
Our task is to find the shortest path to the target condition.
In Lean UX, Josh Seiden and Jeff Gothelf suggest the template shown in Figure 9-2 as a starting point for capturing hypotheses.6
Figure 9-2. Jeff Gothelf’s template for hypothesis-driven development
Use the 80/20 rule and don’t worry about corner cases: Build the 20% of functionality that will deliver 80% of the expected benefit.
Don’t build for scale: Experiments on a busy website are usually only seen by a tiny percentage of users.
Don’t bother with cross-browser compatibility: With some simple filtering code, you can ensure that only users with the correct browser get to see the experiment.
Don’t bother with significant test coverage: You can add test coverage later if the feature is validated. Good monitoring is much more important when developing an experimentation platform.
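The browser-filtering point above can be sketched in a few lines. This is a hypothetical illustration, not code from the book: the function name, the supported-browser set, and the 5% traffic share are all assumptions, and the hash-based bucketing is one common way to assign users stably to an experiment.

```python
# Hypothetical sketch: gate an experiment to a small slice of users on
# compatible browsers, so cross-browser work can be skipped entirely.
import hashlib

SUPPORTED_BROWSERS = {"chrome", "firefox"}  # assumed supported set
EXPERIMENT_TRAFFIC_PCT = 5                  # only a tiny share of users see it

def in_experiment(user_id: str, browser: str) -> bool:
    """Return True if this user should see the experimental variant."""
    if browser.lower() not in SUPPORTED_BROWSERS:
        return False  # everyone else silently gets the control experience
    # Stable hash-based bucketing: the same user always lands in the same bucket.
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return bucket < EXPERIMENT_TRAFFIC_PCT

print(in_experiment("user-42", "safari"))  # unsupported browser -> False
```

Because bucketing is derived from a stable hash of the user ID rather than a random draw, a user who sees the experiment once keeps seeing it on every visit.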
Twyman’s Law: “If a statistic looks interesting or unusual, it is probably wrong.”
If we’re going to adopt a thorough experimental approach, we need to change what we consider to be the outcome of our work: not just validated ideas but the information we gain in the course of running the experiments. We also need to change the way we think about developing new ideas; in particular, it’s essential to work in small batches and test every assumption behind the idea we are validating. This, in turn, requires that we implement continuous delivery.
Working in small batches creates flow — a key element of Lean Thinking.
Taking a scientific approach to customer and product development requires intensive collaboration between product, design, and technical people throughout the lifecycle of every product. This is a big cultural change for many enterprises where technical staff do not generally contribute to the overall design process.
One of the most common challenges encountered in software development is the focus of teams, product managers, and organizations on managing cost rather than value.
By focusing on the outcomes we wish to achieve, rather than solutions and features, we can separate what we are trying to do from the possible ways to do it.
The best managers figure out how to get great outcomes by setting the appropriate context, rather than by trying to control their people. Reed Hastings
Prescriptive, rule-based processes also act as a brake on continuous improvement unless people operating the process are allowed to modify them. Finally, an overreliance on process tends to drive out people who tinker, take risks, and run safe-to-fail experiments. These kinds of people tend to feel suffocated in a process-heavy environment — but they are essential drivers of an innovation culture.
Similarly, as organizations grow, the systems they build and operate increase in complexity. To get new features to market quickly, we often trade off quality for higher velocity. This is a sensible and rational decision. But at some point, the complexity of our systems becomes a limiting factor on our ability to deliver new work, and we hit a brick wall. Many enterprises have thousands of services in production, including mission-critical systems running on legacy platforms. These systems are often interconnected in ways that make it very hard to change any part of the system without also …
CEO Jeff Bezos turned this problem into an opportunity. He wanted Amazon to become a platform that other businesses could leverage, with the ultimate goal of better meeting customer needs.
Each team is thus effectively engaged in product development — even the people working on the infrastructural components that comprise Amazon Web Services, such as EC2. It’s hard to overemphasize the importance of this transition from a project-based funding and delivery paradigm to one based on product development.
To control this problem, Amazon stipulated that all teams must conform to the “two pizza” rule: they should be small enough that two pizzas can feed the whole team — usually about 5 to 10 people. This limit on size has four important effects:
Each two-pizza team (2PT) is as autonomous as possible. The team’s lead, working with the executive team, decides upon the key business metric that the team is responsible for, known as the fitness function, which becomes the overall evaluation criterion for the team’s experiments. The team is then able to act autonomously to maximize that metric, using the techniques we describe in Chapter 9.
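A fitness function in this sense is just a single metric the team agrees to maximize. The sketch below is purely illustrative — the choice of conversion rate as the metric and all of the numbers are assumptions, not taken from Amazon or the book — but it shows how one metric can serve as the sole evaluation criterion for an experiment.

```python
# Hypothetical sketch: a team's fitness function (assumed here to be
# conversion rate) used as the single criterion for judging an experiment.

def fitness(sessions: int, purchases: int) -> float:
    """Conversion rate: the metric this hypothetical team is asked to maximize."""
    return purchases / sessions if sessions else 0.0

control = fitness(sessions=10_000, purchases=300)   # 3.0% conversion
variant = fitness(sessions=10_000, purchases=360)   # 3.6% conversion
winner = "variant" if variant > control else "control"
print(winner)
```

The point of a single fitness function is that the team does not need to negotiate trade-offs with a central authority for each experiment: whichever variant moves the metric wins.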
An essential element of Amazon’s strategy was the link between the organizational structure of a 2PT and the architectural approach of a service-oriented architecture.
There are two rules of thumb architects follow when decomposing systems. First, ensure that adding a new feature tends to change only one service or a component at a time. This reduces interface churn.6 Second, avoid “chatty” or fine-grained communication between services. Chatty services scale poorly and are harder to impersonate for testing purposes.
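The “chatty services” rule of thumb above can be made concrete with a toy example. Everything here is hypothetical — `FakeInventoryService` and its methods stand in for a remote service — but counting calls shows why fine-grained interfaces scale poorly and are harder to stub out in tests.

```python
# Hypothetical sketch contrasting chatty vs. coarse-grained service calls.
# Each "call" would be a network round trip against a real remote service.

class FakeInventoryService:
    def __init__(self, stock):
        self.stock = stock
        self.calls = 0

    def get_stock(self, sku):           # fine-grained: one round trip per SKU
        self.calls += 1
        return self.stock.get(sku, 0)

    def get_stock_batch(self, skus):    # coarse-grained: one round trip total
        self.calls += 1
        return {sku: self.stock.get(sku, 0) for sku in skus}

svc = FakeInventoryService({"a": 3, "b": 0, "c": 7})

chatty = {sku: svc.get_stock(sku) for sku in ["a", "b", "c"]}   # 3 round trips
calls_chatty = svc.calls

svc.calls = 0
coarse = svc.get_stock_batch(["a", "b", "c"])                    # 1 round trip
calls_coarse = svc.calls

print(calls_chatty, calls_coarse)  # 3 1
```

Both styles return the same data, but the coarse-grained interface does it in one round trip — and a test double only has to fake one method with one obvious contract.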
What differentiates a service-oriented architecture is that its components can be deployed to production independently of each other. No more “big bang” releases of all the components of the system together: each service has its own independent release schedule. This architectural approach is essential to continuous delivery of large-scale systems.
The most important rule that must be followed is this: the team managing a service has to ensure that its consumers don’t break when a new version is released.
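One common way teams honor that rule — not named in the passage, so treat this as an illustrative technique rather than the book’s prescription — is the “tolerant reader”: consumers ignore fields they don’t recognize and supply defaults for ones they expect, so the producing team can add fields without breaking anyone. All names below are hypothetical.

```python
# Hypothetical "tolerant reader" sketch: a consumer that ignores unknown
# fields and defaults missing ones, so new producer versions don't break it.

def read_order(payload: dict) -> dict:
    """Extract only the fields this consumer cares about."""
    return {
        "order_id": payload["order_id"],             # required in every version
        "currency": payload.get("currency", "USD"),  # default for old producers
        # any fields the producer adds later are simply ignored
    }

v1 = {"order_id": "o-1"}                                             # old producer
v2 = {"order_id": "o-2", "currency": "EUR", "loyalty_tier": "gold"}  # new producer

print(read_order(v1))
print(read_order(v2))
```

The same consumer code handles both payload versions, which is exactly the property that lets each service release on its own schedule.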
“Organizations which design systems…are constrained to produce designs which are copies of the communication structures of these organizations.” One way to apply Conway’s Law is to align API boundaries with team boundaries. In this way we can distribute teams all across the world.
Organizations often try to fight Conway’s Law. A common example is splitting teams by function, e.g., by putting engineers and testers in different locations (or, even worse, by outsourcing testers).
The key to moving fast at scale is to create many small, decentralized, autonomous teams, based on the model of Mission Command described in Chapter 1.
In truly decentralized organizations, we follow the principle of subsidiarity: by default, decisions should be made by the people who are directly affected by those decisions.
In companies such as Amazon, Netflix, and Etsy, in many cases teams do not need to raise tickets or have changes reviewed by an advisory board to get them deployed to production.
Engineers are expected to consult with each other before pushing changes, and certain types of high-risk changes (such as database changes or changes to a PCI-DSS cardholder data environment) are managed out of band.
ITIL supports this concept in the form of standard changes. All changes that launch dark (and which thus form the basis of A/B tests) should be considered standard changes.
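A dark launch is the mechanism that makes such changes low risk: the new code path ships to production behind a flag, invisible to users until the flag (and an A/B allocation) turns it on. The sketch below is a minimal, hypothetical feature-flag gate — the flag store and function names are assumptions.

```python
# Hypothetical dark-launch sketch: new code deploys "dark" behind a flag,
# so the deployment itself changes nothing users can observe.

FLAGS = {"new_checkout": False}  # deployed dark: code is live, feature is off

def checkout(cart_total: float) -> str:
    if FLAGS["new_checkout"]:
        return f"new-flow:{cart_total:.2f}"   # experimental path
    return f"old-flow:{cart_total:.2f}"       # current behavior, unchanged

print(checkout(10.0))          # flag off: users still see the old flow
FLAGS["new_checkout"] = True   # flipping the flag starts the experiment
print(checkout(10.0))
```

Because deploying the code and releasing the feature are now separate decisions, the deployment itself fits ITIL’s definition of a low-risk, preapproved standard change.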
If no such system exists, or it is unsuitable, the team should be allowed to choose their own stack — but must be prepared to meet any applicable regulatory constraints and bear the costs of supporting the system in production.
Autonomy — combined with an enterprise architecture that supports it — reduces dependencies between teams so they can get changes out faster.
If we can quickly learn what users actually value, we can stop wasting time building things that don’t add value.
The most important metric is: how fast can we learn? Change lead time is a useful proxy variable for this metric, and autonomous teams are essential to improving it.
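Change lead time is straightforward to compute: it is the elapsed time from a change being committed to that change running in production. A minimal sketch, with made-up timestamps:

```python
# Hypothetical sketch: change lead time (commit -> running in production)
# as a proxy for how fast the team can learn.
from datetime import datetime, timedelta

def lead_time(committed_at: datetime, deployed_at: datetime) -> timedelta:
    return deployed_at - committed_at

commit = datetime(2017, 7, 3, 9, 30)   # change committed (illustrative)
deploy = datetime(2017, 7, 3, 11, 0)   # change live in production
print(lead_time(commit, deploy))       # 1:30:00
```

Tracking the distribution of this number over time (not just the average) shows whether autonomy and architecture changes are actually shortening the learning loop.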
When everybody on the team can build small experiments, push them into production, and analyze the metrics, the entire team comes into contact with users on a day-to-day basis.
a proof of your autonomy, mastery, and purpose.
It’s one thing to adopt the principles of Mission Command in a growing startup — but another thing entirely in an enterprise with a more traditional, centralized approach to management and decision making. Mission Command drastically changes the way we think about management — in particular, management of risk, cost, and other system-level outcomes.
The role of finance, the project management office, enterprise architects, GRC teams, and other centralized groups changes: they specify target outcomes, help to make the current state transparent, and provide support and tools where requested, but do not dictate how cost, processes, and risk are managed.
To enable both continuous delivery and decentralization, teams must be able to get changes out quickly and safely.
Architecting for continuous delivery and service orientation means evolving systems that are testable and deployable
A common response to getting stuck in a big ball of mud is to fund a large systems replacement project. Such projects typically take months or years before they deliver any value to users, and the switchover from the old to the new system is often performed in “big bang” fashion. These projects also run an unusually high risk of running late and over budget and being cancelled. Systems rearchitecture should not be done as a large program of work funded from the capital budget. It should be a continuous activity that happens as part of the product development process.
Amazon did not replace their monolithic Obidos architecture in a “big bang” replacement program. Instead, they moved to a service-oriented architecture incrementally, while continuing to deliver new functionality, using a pattern known as the “strangler application.” As described by Martin Fowler, the pattern involves gradual replacement of a system by implementing new features in a new application that is loosely coupled to the existing system, porting existing functionality from the original application only where necessary.9
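Mechanically, a strangler application usually sits behind a routing layer: requests for features the new application has taken over go to it, and everything else still hits the legacy system. The sketch below is a hypothetical illustration of that routing rule — the paths and labels are assumptions, not Amazon’s actual design.

```python
# Hypothetical strangler-application router: as features migrate, their
# path prefixes move into NEW_APP_PATHS and traffic shifts incrementally.

NEW_APP_PATHS = {"/recommendations", "/wishlist"}  # features built in the new app

def route(path: str) -> str:
    prefix = "/" + path.lstrip("/").split("/")[0]
    return "new-app" if prefix in NEW_APP_PATHS else "legacy"

print(route("/recommendations/123"))  # new-app
print(route("/orders/456"))           # legacy
```

Migration then becomes a series of small, reversible routing changes rather than one cutover: each new prefix added to the set strangles a little more of the old system.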
Over time, the old application is “strangled” — just like a tree enveloped by a tropical s...
Always find ways to satisfy a need that is not served by the existing software, and prioritize features using the cost of delay divided by duration (CD3), as described in Chapter 7, to ensure you deliver the largest amount of value in the shortest possible time.
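CD3 is simple arithmetic: divide each feature’s cost of delay by its duration, then build in descending order of that ratio. A worked example with made-up numbers:

```python
# Worked CD3 example (cost of delay divided by duration). All figures are
# invented for illustration; CoD is per week, duration is in weeks.

features = [
    ("A", 10_000, 4),   # CD3 = 10000 / 4 = 2500
    ("B", 6_000, 1),    # CD3 =  6000 / 1 = 6000
    ("C", 9_000, 3),    # CD3 =  9000 / 3 = 3000
]

def cd3(cost_of_delay: float, duration: float) -> float:
    return cost_of_delay / duration

ranked = sorted(features, key=lambda f: cd3(f[1], f[2]), reverse=True)
print([name for name, *_ in ranked])  # ['B', 'C', 'A']
```

Note that B wins despite having the lowest absolute cost of delay: because it takes only a week, delivering it first unlocks value fastest — which is exactly the behavior CD3 is designed to reward.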
Do not attempt to port existing functionality unless it is to support a business process change.
Deliver something fast: Make the initial release of your new application small enough that you can get it deployed and providing value in a few weeks to a few months.
The measure of success for the first release is how quickly you can do it, not how much functionality is in it.
Design for testability and deployability: Functionality in the new application must always be built using good software development practices: test-driven development, continuous integration, and a well-encapsulated, loosely coupled modular design.
Make sure the team working on it is enthusiastic about these methods and has enough experience to have a good chance at succeeding.
There is of course a trade-off to migrating in an incremental way. Overall, it takes longer to do a replacement incrementally compared to a hypothetical “big bang” rearchitecture delivering the same functionality. However, since a strangler application delivers customer value from early on, evolves in response to changing customer needs, and can advance at its own rate, it is almost always to be preferred.
Enterprise architecture is usually driven by expensive, top-down plans to “rationalize” the architecture, move from legacy systems to a modern platform, and remove duplication to create a single source of truth.

