Kindle Notes & Highlights
by
Gene Kim
Read between
January 23 - March 14, 2022
Excella noticed during a team retrospective that their cycle times were beginning to rise. They had what Joshua Cohen described as a case of the “almost dones.” He noted, “During standup, our developers would give an update on the feature they were working on the previous day. They would say, ‘Hey, I made a lot of progress. I’m almost done.’ And the next morning they would say, ‘Hey, I ran into some issues but I worked through them. I just have a few more tests to run. I’m almost done.’”12
Keep Pushing Quality Closer to the Source
The effectiveness of approval processes decreases as we push decision-making further away from where the work is performed.
Lean defines two types of customers that we must design for: the external customer (who most likely pays for the service we are delivering) and the internal customer (who receives and processes the work immediately after us).
Creating fast feedback is critical to achieving quality, reliability, and safety in the technology value stream.
Generative organizations are characterized by actively seeking and sharing information to better enable the organization to achieve its mission. Responsibilities are shared throughout the value stream, and failure results in reflection and genuine inquiry.
the leader’s role is to create the conditions so their team can discover greatness in their daily work.
complementary working relationship and mutual respect that must occur between leaders and frontline workers. According to Womack, this relationship is necessary because neither can solve problems alone—
True North goals,
These target conditions frame the scientific experiment: we explicitly state the problem we are seeking to solve, our hypothesis of how our proposed countermeasure will solve it, our methods for testing that hypothesis, our interpretation of the results, and our use of learnings to inform the next iteration.
Shewhart-Deming PDCA (Plan, Do, Check, Act) continuous improvement cycle.
Eno is credited as saying: “Scenius stands for the intelligence and the intuition of a whole cultural scene. It is the communal form of the concept of the genius.”24
A vital aspect of the culture was that there should be no fear of failures. As Kelly explained, “The odds of creating a new and popular technology were always stacked against the innovator; only where the environment allowed failure could truly groundbreaking ideas be pursued.”
Flow, Feedback, and Continual Learning and Experimentation.
Greenfield development is when we build on undeveloped land. Brownfield development is when we build on land that was previously used for industrial purposes, potentially contaminated with hazardous waste or pollution. In urban development, many
what predicted performance was whether the application was architected (or could be re-architected) for testability and deployability.13
Increase the Visibility of Work
To know if we are making progress toward our goal, it’s essential that everyone in the organization knows the current state of work.
These observations led to what is now known as Conway’s Law, which states that “organizations which design systems . . . are constrained to produce designs which are copies of the communication structures of these organizations. . . . The larger an organization is, the less flexibility it has and the more pronounced the phenomenon.”2
Done poorly, Conway’s Law will prevent teams from working safely and independently; instead, they will be tightly coupled, all waiting on each other for work to be done,
They created a small team that wrote a PHP object-relational mapping (ORM) layer,* enabling
In the field of decision sciences, there are three primary types of organizational structures that inform how we design our DevOps value streams with Conway’s Law in mind: functional, matrix, and market.
Broadly speaking, to achieve DevOps outcomes, we need to reduce the effects of functional orientation (“optimizing for cost”) and enable market orientation (“optimizing for speed”) so we can have many small teams working safely and independently, quickly delivering value to the customer.
Having architecture that is loosely coupled means that services can update in production independently, without having to update other services.
Conway’s Law helps us design our team boundaries in the context of desired communication patterns but it also encourages us to keep our team sizes small, reducing the amount of inter-team communication and encouraging us to keep the scope of each team’s domain small and bounded.
Nice in concept, but it requires a larger group of people. Staffing a team of 10 for each of 4 major services would require 40 people, when only 12 are available in total. The architect role also seems to be overlooked in this discussion. That might make sense if the architect is expected to hold a higher-level view as an enterprise architect. Still, a system architect would be desirable for each team.
The team’s lead, working with the executive team, decides on the key business metric that the team is responsible for, known as the fitness function, which becomes the overall evaluation criteria for the team’s experiments.
Is the executive team supposed to include the architect, product manager, business analyst, and solution management? It would be nice to know who they consider the executive team for these 2-pizza groups.
quick meeting where everyone on the team gets together and presents to each other three things: what was done yesterday, what is going to be done today, and what is preventing you from getting your work done.¶
However, we must remind everyone that improvement of daily work is more important than daily work itself, and that all teams must have dedicated capacity for this (e.g., reserving 20% of all capacity for improvement work, scheduling one day per week or one week per month, etc.). Without doing this, the productivity of the team will almost certainly grind to a halt under the weight of its own technical and process debt.
always use production-like environments at every stage of the value stream. Furthermore, these environments must be created in an automated manner, ideally on demand from scripts and configuration information stored in version control and entirely self-serviced, without any manual work required from Operations.
re-create the entire production environment based on what’s in version control.
Now that our environments can be created on demand and everything is checked into version control, our goal is to ensure that these environments are being used in the daily work of Development. We need to verify that our application runs as expected in a production-like environment long before the end of the project or before our first production deployment.
done, widely defined as when we have “working and potentially shippable code.”
In other words, we will only accept development work as done when it can be successfully built, deployed, and confirmed that it runs as expected in a production-like environment, instead of merely when a developer believes it to be done. Ideally, it runs under a production-like load with a production-like dataset, long before the end of a sprint.
Data supports the importance of architecture and its role in driving elite performance, with the DORA and Puppet 2017 State of DevOps Report finding that architecture was the largest contributor to continuous delivery. The analysis found that teams who scored highest on architectural capabilities could complete their work independently of other teams, and change their systems without dependencies.26
And Martin Fowler’s blog explaining his Strangler Fig Application Pattern is still an essential read. (martinfowler.com/bliki/StranglerFigApplication.html).
• DEBUG level: Information at this level is about anything that happens in the program, most often used during debugging. Often, debug logs are disabled in production but temporarily enabled during troubleshooting.
• INFO level: Information at this level consists of actions that are user-driven or system specific (e.g., “beginning credit card transaction”).
• WARN level: Information at this level tells us of conditions that could potentially become an error (e.g., a database call taking longer than some predefined time). These will likely initiate an alert and troubleshooting, while other logging ...
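The three levels in the highlight above can be sketched with Python's standard `logging` module. This is only an illustration, not something from the book: the logger name `payments` and every message except the credit card one are hypothetical, and Python labels the WARN level `WARNING`.

```python
import io
import logging

# Capture log output in memory so the effect of each level is easy to inspect.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

# "payments" is a hypothetical logger name used only for this sketch.
logger = logging.getLogger("payments")
logger.addHandler(handler)
logger.propagate = False

# Production-style setting: DEBUG records are suppressed.
logger.setLevel(logging.INFO)
logger.debug("card token parsed")                   # dropped at INFO level
logger.info("beginning credit card transaction")    # user-driven action
logger.warning("database call took longer than the predefined threshold")

# Temporarily enable DEBUG while troubleshooting.
logger.setLevel(logging.DEBUG)
logger.debug("retrying database call")              # now recorded

output = stream.getvalue()
print(output, end="")
```

The first debug call leaves no trace because the logger threshold is INFO. Lowering the threshold to DEBUG is the "temporarily enabled during troubleshooting" step.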
StatsD can generate timers and counters with one line of code (in Ruby, Perl, Python, Java, and other languages) and is often used in conjunction with Graphite or Grafana, which render metric events into graphs and dashboards.
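As a rough sketch of why one line is enough: StatsD accepts plain-text datagrams of the form `name:value|type` over UDP, where `c` marks a counter and `ms` a timer. The helper names and metric names below are hypothetical, and the host and port are assumptions (8125 is the conventional StatsD default). A real project would more likely use an existing client library than hand-rolled helpers like these.

```python
import socket


def format_counter(name: str, value: int = 1) -> bytes:
    """Build a StatsD counter datagram, e.g. b"login.successful:1|c"."""
    return f"{name}:{value}|c".encode("ascii")


def format_timer(name: str, millis: int) -> bytes:
    """Build a StatsD timer datagram, e.g. b"login.time:320|ms"."""
    return f"{name}:{millis}|ms".encode("ascii")


def send_metric(payload: bytes, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Send one datagram over UDP; fire-and-forget, no reply is expected."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
```

With these helpers, instrumenting an event is a single call such as `send_metric(format_counter("login.successful"))`. Because UDP is connectionless, the application does not wait on the metrics server.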

