More on this book
Community
Kindle Notes & Highlights
Splitting off the parts of the system that typically change at different speeds allows them to change more quickly.
Within distributed teams, communication is limited since they must explicitly request a physical or virtual space and time to communicate across locations.
Working across different time zones aggravates these communication delays and introduces bottlenecks when manual approvals or code reviews are needed
for a team to communicate efficiently, the options are between full colocation (all team members sharing the same physical space) or a true remote-first approach (explicitly restricting communication to agreed channels—such as messaging and collaboration apps—that everyone on the team has access to and consults regularly). When neither of these options is feasible (full colocation or remote first), then it’s better to split off the monolith into separate subsystems for teams in different locations.
Splitting off subsystems with clearly different risk profiles allows mapping the technology changes to business appetite or regulatory needs.
parts of applications subject to peaks of demand at a large scale (like yearly tax submissions on the last day), require a level of scaling and failover optimization not necessary for the rest of the system. Splitting off such a subsystem based on particular performance demands helps to ensure it can scale autonomously, increasing performance and reducing cost.
it makes sense to split off subsystems for user personas in these types of situations. The effort required to remove dependencies or coupling between features is compensated with a sharper focus on customers’ needs and experience using the system, which should result in higher customer satisfaction and improve the organization’s bottom line.
Could we, as a team, effectively consume or provide this subsystem as a service? If the answer is yes, then the subsystem is a good candidate for splitting off and assigning to a team to own and evolve.
Choose Software Boundaries to Match Team Cognitive Load
“If you have microservices but you wait and do end-to-end testing of a combination of them before a release, what you have is a distributed monolith.”
Technologies and organizations should be redesigned to intermittently isolate people from each other’s work for best collective performance in solving complex problems. —Ethan Bernstein, Jesse Shore, and David Lazer,
In some cases, the same two teams might need to collaborate closely at some point in time but be independent six or nine months later in order to achieve the most effective software delivery outcomes.
In many organizations, poorly defined team interactions and responsibilities are a source of friction and ineffectiveness.
Collaboration: working closely together with another team X-as-a-Service: consuming or providing something with minimal collaboration Facilitating: helping (or being helped by) another team to clear impediments
The collaboration team mode is suitable where a high degree of adaptability or discovery is needed, particularly when exploring new technologies or techniques. The collaboration interaction mode is good for rapid discovery of new things, because it avoids costly hand-offs between teams.
the cognitive load of ongoing collaboration can be much higher than working purely inside the team’s “natural” area. This means the communication overhead is going to be higher, possibly resulting in the apparent reduction of team effectiveness when viewed as a single team. Instead, the investment in higher effectiveness is being made in the ensemble of the two teams through rapid discovery of new practices.
Short-term or light-touch occasional collaboration to establish or refine the interfaces is fine, but a need for ongoing collaboration suggests incorrect domain boundaries and/or team responsibilities, or the incorrect mix of skills within a team.
The remit of the team undertaking the facilitation is to enable the other team(s) to be more effective, learn more quickly, understand a new technology better, and discover and remove common problems or impediments across the teams.
Leading technologist James Urquhart, writing about team intercommunication with Conway’s law in mind, describes the need for “a communication backchannel that avoids much of the politics, bandwidth constraints, and simple inefficiency of human-to-human communication.”4 This is exactly the kind of outcome this chapter’s well-defined team interactions should provide.
a dedicated architecture team is usually an anti-pattern to be avoided. However, a small group of software and systems architects can be hugely effective within an organization when the remit of architecture is to discover, adjust, and reshape the interactions between teams, and therefore, the architecture of the system.
use people skilled in API design to design the APIs between teams within the organization.
Collaborate on potentially ambiguous interfaces until the interfaces are proven stable and functional.
As Don Reinertsen says, “We need to be alert for the white space between the roles, gaps that nobody feels responsible for.”
X-as-a-Service is best for situations where predictable delivery is more important than rapid discovery.
Symptoms A startup company grows beyond fifteen people (Dunbar’s number). Other teams spend lots of time waiting on a single team to undertake changes. Changes to certain components or workflows in the system routinely get assigned to the same people, even when they’re already busy or away. Team members complain about lack of system documentation.
This can lead to an (often unspoken) specialization within the team regarding different components of the system. Requests that require changes to a specific component or workflow routinely get assigned to the same team member(s), because they will be able to deliver faster than other team members.
This level of specialization introduces bottlenecks in delivery (as storied in The Phoenix Project and explained in The DevOps Handbook). The routine aspect can also negatively affect individual motivation.
By treating operations as rich, sensory input to development, a cybernetic feedback system is set up that enables the organization to self steer.
To sense things (and make sense of things), organisms need defined, reliable communication pathways. Similarly, with well-defined and stable communication pathways between teams, organizations can detect signals from across the organization and outside it, becoming something like an organism.
Jeff Sussna, author of Designing Delivery, puts it like this: “Businesses normally treat operations as an output of design. . . . In order to empathize, though, one must be able to hear. In order to hear, one needs input from operations. Operations thus becomes an input to design.”
The team that designs and builds the software needs to be involved in its running and operational aspects in order to be able to build it effectively in the first place. The team providing this “design and run” continuity of care also needs to have some responsibility for the commercial viability of the software service; otherwise, decisions will be made in a vacuum separate from financial reality.
To generate and receive high-fidelity information from frontline, operational systems, highly skilled, highly aware people are needed. This means that—contrary to how many IT operations teams were staffed in the past—people in IT operations need to be able to recognize and triage problems quickly and accurately, providing accurate, useful information to their colleagues who are focused on building new features.
Team Topologies helps to address all these points by setting forth a team-first approach to software delivery predicated on four fundamental team types, three team interaction patterns, and ways of using difficulties in delivery that empower the organization to sense its surroundings.
The team considers not just its code as part of its external “API” but also its documentation, onboarding processes, interactions with other teams in person and via chat tools, and anything else that other teams need in order to interact with its members.
To avoid software becoming ever larger and eventually overwhelming a team, the size of subsystems (or components) are limited to that manageable by a single team.
Approaches to building and operating cloud software such as microservices accommodate the need for human-sized chunks of functionality alongside enhanced deployability. Teams have a greater chance of innovating and supporting a system if they can understand the constituent parts and feel a sense of ownership over the code, rather than being treated like workers on an assembly line.
A healthy organizational culture: an environment that supports the professional development of individuals and teams—one in which people feel empowered and safe to speak out about problems, and the organization expects to learn continuously. Good engineering practices: test-first design and development of all aspects of the systems, a focus on continuous delivery and operability practices, pairing and mobbing for code review, avoiding the search for a single “root cause” for incidents, designing for testability, and so on. Healthy funding and financial practices: avoiding the pernicious
...more
technology is only ever a part of the platform; roadmaps, guided evolution, clear documentation, a concern for DevEx, and appropriate encapsulation of underlying complexity are all key parts of an effective delivery platform for stream-aligned teams.
In particular, you need people who understand and practice: Team coaching Mentoring (especially of senior staff) Service management (for all kinds of teams and areas, not just for production systems) Well-written documentation Process improvement
No serious sports team would consider not employing coaches and trainers, and no serious organization should be without coaches and trainers either.