Kindle Notes & Highlights
by David Farley
Read between December 20, 2014 - June 26, 2018
Failure of your code to meet preset thresholds for these metrics should fail the commit stage the same way that a failing test does. Useful metrics include:
• Test coverage (if your commit tests only cover 5% of your codebase, they’re pretty useless)
• Amount of duplicated code
• Cyclomatic complexity
• Afferent and efferent coupling
• Number of warnings
• Code style
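As a sketch of how such a gate might work, the short script below fails the build when line coverage drops below an agreed threshold. The report path, the Cobertura-style XML format, and the 80% figure are all assumptions for illustration:

    # quality_gate.py - a minimal sketch of failing the commit stage when a
    # metric misses its preset threshold. The report location, XML format,
    # and threshold value are assumptions; adapt them to your own tools.
    import sys
    import xml.etree.ElementTree as ET

    COVERAGE_THRESHOLD = 0.80                    # agreed by the whole team
    REPORT_PATH = "build/reports/coverage.xml"   # hypothetical report path

    def line_coverage(report_path):
        # Assumes a Cobertura-style report whose root element carries a
        # 'line-rate' attribute between 0.0 and 1.0.
        root = ET.parse(report_path).getroot()
        return float(root.get("line-rate", "0.0"))

    if __name__ == "__main__":
        coverage = line_coverage(REPORT_PATH)
        if coverage < COVERAGE_THRESHOLD:
            print(f"Commit stage FAILED: coverage {coverage:.0%} is below "
                  f"the {COVERAGE_THRESHOLD:.0%} threshold")
            sys.exit(1)   # nonzero exit fails the build, like a failing test
        print(f"Coverage {coverage:.0%} meets the threshold")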
The Origin of the Term “Deployment Pipeline” When we first used this idea, we named it a pipeline not because it was like a liquid flowing through a pipe; rather, for the hardcore geeks amongst us, it reminded us of the way processors “pipeline” their instruction execution in order to get a degree of parallelism. Processor chips can execute instructions in parallel. But how do you take a stream of machine instructions intended to be executed serially and divide them up into parallel streams that make sense? The way processors do this is very clever and quite complex, but in essence they often
…
Fixing broken builds remains the top priority for the development team even when those breakages occur in the later stages of the pipeline. We are gambling on success—but are ready to pay our technical debts should our gamble fail.
If you only implement a commit stage in your development process, it usually represents an enormous step forward in the reliability and quality of the output of your teams. However, there are several more stages necessary to complete what we consider to be a minimal deployment pipeline.
The mitigation of these problems is very simple when we view the release step as a natural outcome of our deployment pipeline. Fundamentally, we want to
• Have a release plan that is created and maintained by everybody involved in delivering the software, including developers and testers, as well as operations, infrastructure, and support personnel
With automated deployment and release, the process of delivery becomes democratized.
Developers, testers, and operations teams no longer need to rely on ticketing systems and email threads to get builds deployed so they can gather feedback on the production readiness of the system. Testers can decide which version of the system they want in their test environment without needing to be technical experts themselves, nor relying on the availability of such expertise to make the deployment. Since deployment is simple, they can change the build under test more often, perhaps returning to an earlier version of the system to compare its behavior with that of the latest version when
…
On no account should you have a different process for backing out than you do for deploying, or perform incremental deployments or rollbacks. These processes will be rarely tested and therefore unreliable. They will also not start from a known-good baseline, and therefore will be brittle. Always roll back either by keeping an old version of the application deployed or by completely redeploying a previous known-good version.
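A minimal sketch of that rule, with an invented artifact layout and deployment script: the rollback path is the deployment path, just pointed at an earlier known-good version.

    # deploy.py - sketch: one process for deploying AND rolling back.
    # Rolling back is just deploying a previous known-good version, so the
    # process is exercised on every deployment. Paths and the helper script
    # name are invented for illustration.
    import subprocess
    import sys

    ARTIFACT_REPO = "/mnt/artifacts"   # assumed artifact repository mount

    def deploy(version: str, environment: str) -> None:
        # Always a complete redeploy from a known-good baseline, never an
        # incremental patch on top of whatever is already there.
        artifact = f"{ARTIFACT_REPO}/myapp-{version}.tar.gz"
        subprocess.run(["./run_deployment.sh", artifact, environment],
                       check=True)   # hypothetical deployment script

    if __name__ == "__main__":
        # `deploy.py 1.4.2 staging` and, later, `deploy.py 1.4.1 staging`
        # (a rollback) exercise exactly the same code path.
        deploy(version=sys.argv[1], environment=sys.argv[2])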
Feedback is at the heart of any software delivery process. The best way to improve feedback is to make the feedback cycles short and the results visible.
You should measure continually and broadcast the results of the measurements in some hard-to-avoid manner, such as on a very visible poster on the wall, or on a computer display dedicated to showing bold, big results. Such devices are known as information radiators…
What you choose to measure will have an enormous influence on the behavior of your team (this is known as the Hawthorne effect).
For the software delivery process, the most important global metric is cycle time. This is the time between deciding that a feature needs to be implemented and having that feature released to users.
As Mary Poppendieck asks, “How long would it take your organization to deploy a change that involves just one single line of code? Do you do this on a repeatable, reliable basis?”4 This metric is hard to measure because it covers many parts of the software delivery process—from analysis, through development, to release. However, it tells you more about your process than any other metric.
Once you know the cycle time for your application, you can work out how best to reduce it. You can use the Theory of Constraints to do this by applying the following process.
1. Identify the limiting constraint on your system. This is the part of your build, test, deploy, and release process that is the bottleneck. To pick an example at random, perhaps it’s the manual testing process.
2. Exploit the constraint. This means ensuring that you maximize the throughput of that part of the process. In our example (manual testing), you would make sure that there is always a buffer of stories …
While cycle time is the most important metric in software delivery, there are a number of other diagnostics that can warn you of problems. These include
• Automated test coverage
• Properties of the codebase such as the amount of duplication, cyclomatic complexity, efferent and afferent coupling, style problems, and so on
• Number of defects
• Velocity, the rate at which your team delivers working, tested, ready-for-use code
• Number of commits to the version control system per day
• Number of builds per day
• Number of build failures per day
• Duration of build, including automated tests
One small point worth noting is that a task has two essential features: the thing it does and the other things it depends on.
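A toy sketch of that observation, not tied to any real build tool: each task is just an action plus the tasks it depends on, and dependencies run first.

    # tasks.py - toy model of a build task: the thing it does (action)
    # and the other things it depends on (deps). Names are invented.
    def run(task, done=None):
        """Run a task's dependencies first, then its action, skipping
        anything that has already been run."""
        done = done if done is not None else set()
        if task["name"] in done:
            return
        for dep in task["deps"]:
            run(dep, done)
        task["action"]()
        done.add(task["name"])

    compile_task = {"name": "compile", "deps": [],
                    "action": lambda: print("compiling")}
    test_task = {"name": "test", "deps": [compile_task],
                 "action": lambda: print("running tests")}
    package_task = {"name": "package", "deps": [compile_task, test_task],
                    "action": lambda: print("packaging")}

    run(package_task)   # prints: compiling, running tests, packaging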
Deploying and Testing Layers If there is a fundamental core to our approach to delivery in general and to the building and deployment of complex systems specifically, it is that you should always strive to build on foundations that are known to be good. We don’t bother testing changes that don’t compile, we don’t bother trying to acceptance-test changes that have failed commit tests, and so on.
So, when should you think about automating a process? The simplest answer is, “When you have to do it a second time.” The third time you do something, it should be done using an automated process.
Finally, it bears reiterating that scripts are first-class parts of your system. They should live for its entire life. They should be version-controlled, maintained, tested, and refactored, and be the only mechanism that you use to deploy your software. So many teams treat their build system as an afterthought; in our experience, build and deployment systems are nearly always the poor relation when it comes to design. As a result, such poorly maintained systems are often the barrier to a sensible, repeatable release process, rather than its foundation. Delivery teams should spend time and care
…
Remember, though, that if the commit stage fails, the rule is that the delivery team must immediately stop whatever they are doing and fix it. Don’t fail the commit test for some reason that hasn’t been agreed upon by the whole team, or people will stop taking failures seriously and continuous integration will break down. Do, however, continuously review your application’s quality and consider enforcing quality metrics through the commit stage where appropriate.
Scripts that are treated as secondary to application code rapidly become impossible to understand and maintain.
Give Developers Ownership At some organizations, there are teams of specialists who are experts at the creation of effective, modular build pipelines and the management of the environments in which they run. We have both worked in this role. However, we consider it a failure if we get to the point where only those specialists can maintain the CI system.
In larger or more widely spread teams, this isn’t always easy. Under these circumstances it is useful to have someone to play the role of a “build master.” Their job is to oversee and direct the maintenance of the build, but also to encourage and enforce build discipline. If a build breaks, the build master notices and gently—or not gently if it has been a while—reminds the culprit of their responsibility to the team to fix the build quickly or back out their changes. Another situation where we have found this role useful is in teams new to continuous integration. In such teams, build
…
The Results of the Commit Stage
…binaries generated by the commit stage are precisely the same ones that will be reused throughout the pipeline, and potentially released to users.
The outputs of the commit stage, your reports and binaries, need to be stored somewhere for reuse in the later stages of your pipeline, and for your team to be able to get hold of them. The obvious place might appear to be your version control system. There are several reasons why this is not the right thing to do…
Figure 7.2 The role of the artifact repository
The following details each step in the happy path of a release candidate that makes it successfully into production. The numbers refer to the enumerated steps shown in Figure 7.2.
1. Somebody on your delivery team commits a change.
2. Your continuous integration server runs the commit stage.
3. On successful completion, the binary as well as any reports and metadata are saved to the artifact repository.
4. Your CI server then retrieves the binaries created by the commit stage and deploys to a production-like test environment.
5. Your continuous integration server then runs the acceptance …
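Steps 3 and 4 boil down to a store-then-retrieve contract with the artifact repository. A minimal sketch, in which the directory layout, checksum scheme, and function names are all assumptions:

    # artifact_store.py - sketch of steps 3 and 4: save commit-stage output
    # keyed by the revision that produced it, then fetch the identical
    # binary later. Layout and naming are invented, not any real product.
    import hashlib
    import shutil
    from pathlib import Path

    REPO = Path("/var/artifact-repo")   # assumed repository root

    def store(binary: Path, revision: str) -> Path:
        """Step 3: file the binary under its revision and record a
        checksum so later stages can verify integrity."""
        dest = REPO / revision / binary.name
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(binary, dest)
        digest = hashlib.sha256(dest.read_bytes()).hexdigest()
        (dest.parent / (dest.name + ".sha256")).write_text(digest)
        return dest

    def retrieve(revision: str, name: str) -> Path:
        """Step 4: later stages fetch the exact binary the commit stage
        built, verifying it is byte-for-byte identical."""
        artifact = REPO / revision / name
        expected = (artifact.parent / (name + ".sha256")).read_text()
        actual = hashlib.sha256(artifact.read_bytes()).hexdigest()
        assert actual == expected, "artifact corrupted or substituted"
        return artifact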
The unit tests that form the bulk of your commit tests should never rely on the database.
To achieve this, you should be able to separate the code under test from its storage.
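One way to achieve that separation, sketched with invented names: the code under test depends on a small storage interface, and the commit-stage unit tests supply an in-memory substitute instead of a database.

    # sketch: separating the code under test from its storage (names invented)
    class OrderStore:
        """The interface the application depends on. The production
        implementation would talk to the database; unit tests never do."""
        def save(self, order): raise NotImplementedError
        def find(self, order_id): raise NotImplementedError

    class InMemoryOrderStore(OrderStore):
        """Substitute used by commit-stage unit tests: no database."""
        def __init__(self):
            self._orders = {}
        def save(self, order):
            self._orders[order["id"]] = order
        def find(self, order_id):
            return self._orders.get(order_id)

    def place_order(store: OrderStore, order):
        # Business logic under test, oblivious to the storage technology.
        order["status"] = "placed"
        store.save(order)

    def test_place_order_marks_order_placed():
        store = InMemoryOrderStore()
        place_order(store, {"id": 42})
        assert store.find(42)["status"] == "placed"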
For example, if your system posts a message and then acts on it, wrap the raw message-sending technology with an interface of your own. Then you can confirm that the call is made as you expect in one test case, perhaps using a simple stub that implements the messaging interface or using mocking as described in the next section. You can add a second test that verifies the behavior of the message handler, simply calling the point that would be normally called by the messaging infrastructure. Sometimes, though, depending on your architecture, this is not possible without a lot of work.
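A sketch of both tests just described, with an invented MessageSender interface standing in for the raw messaging technology:

    # sketch: wrapping the raw message-sending technology in our own
    # interface (all names here are invented for illustration)
    class MessageSender:
        def send(self, message): raise NotImplementedError

    class RecordingSender(MessageSender):
        """Simple stub: records what was sent, no real messaging involved."""
        def __init__(self):
            self.sent = []
        def send(self, message):
            self.sent.append(message)

    def submit_trade(sender: MessageSender, trade):
        # Code under test: posts a message through our own interface.
        sender.send({"type": "trade", "payload": trade})

    def handle_message(message):
        """The handler the messaging infrastructure would normally invoke."""
        return f"processed {message['payload']}"

    # Test 1: confirm the call is made as expected, via the stub.
    def test_trade_is_sent():
        sender = RecordingSender()
        submit_trade(sender, "buy 100 shares")
        assert sender.sent == [{"type": "trade", "payload": "buy 100 shares"}]

    # Test 2: verify the handler's behavior by calling it directly and
    # synchronously, with no messaging infrastructure in the loop.
    def test_handler_processes_trade():
        message = {"type": "trade", "payload": "buy 100 shares"}
        assert handle_message(message) == "processed buy 100 shares"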
We recommend that you work very hard to eliminate asynchrony in the commit stage testing. Tests which rely on infrastructure, such as messaging (even in-memory), count as component tests, not unit tests. More complex, slower-running component tests should be part of your acceptance test stage, not your commit stage.
Thus the establishment of a commit stage—an automated process, launched on every change, that builds binaries, runs automated tests, and generates metrics—is the minimum you can do on the way to your adoption of the practice of continuous integration.
The focus on acceptance testing as a means of showing that the application meets its acceptance criteria for each requirement has an additional benefit. It makes everyone involved in the delivery process—customers, testers, developers, analysts, operations personnel, and project managers—think about what success means for each requirement. We will cover this in more detail in the “Acceptance Criteria as Executable Specifications” section on page 195.
The objective of acceptance tests is to prove that our application does what the customer meant it to, not that it works the way its programmers think it should.
Unit tests can sometimes share this focus, but not always.
There has always been a great deal of controversy around automated acceptance tests. Project managers and customers often think they are too expensive to create and maintain—which indeed, when done badly, they are. Many developers believe that unit test suites created through test-driven development are enough to protect against regressions. Our experience has been that the cost of a properly created and maintained automated acceptance test suite is much lower than that of performing frequent manual acceptance and regression testing, or that of the alternative of releasing poor-quality
…
There are several flaws in this argument. First, no other type of test proves that the application, running more or less as it would in production, delivers the business value its users are expecting. Unit and component tests do not test user scenarios, and are thus incapable of finding the kinds of defects that appear when users put the application through a series of states in the course of interacting with it. Acceptance tests are designed exactly for this. They are also great at catching threading problems, emergent behavior in event-driven applications, and other classes of bugs caused by
…
Finally, teams that choose to forgo automated acceptance tests place a much greater burden on testers, who must then spend much more time on boring and repetitive regression testing. The testers that we know are not in favor of this approach. While developers can take on some of this burden, many developers—who write unit and component tests—are simply not as effective as testers at finding defects in their own work. Automated acceptance tests written with the involvement of testers are, in our experience, a great deal better at finding defects in user scenarios than tests written by
…
We have split the problem of creating and maintaining effective automated acceptance tests into four sections: creating acceptance tests; creating an application driver layer; implementing acceptance tests; and maintaining acceptance test suites. We will briefly introduce our approach before we go into detail.
How to Create Maintainable Acceptance Test Suites
Unfortunately, this antipattern is very common. Most tests are written at the level of detailed execution: “Poke this, prod that, look here for a result.” Such tests are often the output of record-and-playback-style test automation products, which is one of the main reasons automated acceptance tests are perceived as expensive.
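To make the contrast concrete, here is a hedged sketch: the first test works at the level of detailed execution, while the second expresses intent through an invented application driver layer.

    # sketch: "poke this, prod that" vs. an intent-level test (names invented)

    # Antipattern: coupled to screen mechanics, brittle under any UI change.
    def test_order_via_ui_pokes(browser):
        browser.click("#nav-menu-3")
        browser.type("#field-27", "100")
        browser.click("#btn-submit")
        assert browser.text("#div-result-4") == "OK"

    # Better: the test states what should happen; an application driver
    # layer hides how the system is actually driven.
    def test_customer_can_place_order(app_driver):
        app_driver.place_order(item="widget", quantity=100)
        assert app_driver.order_confirmed(item="widget")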
Once the acceptance criteria have been defined, just before the requirement is to be implemented, the analyst and tester sit with the developers who will do the implementation, along with the customer if available. The analyst describes the requirement and the business context in which it exists, and goes through the acceptance criteria. The tester then works with the developers to agree on a collection of automated acceptance tests that will prove that the acceptance criteria have been met.
These short kick-off meetings are a vital part of the glue that binds the iterative delivery process together, ensuring that every party to the implementation of a requirement has a good understanding of that requirement and of their role in its delivery. This approach prevents analysts from creating “ivory tower” requirements that are expensive to implement or test. It prevents testers from raising defects that aren’t defects but are instead a misunderstanding of the system. It prevents developers from implementing something that bears little relationship to what anyone really wants.
As automated testing has become more central to the delivery of projects that use iterative processes, many practitioners have realized that automated testing is not just about testing. Rather, acceptance tests are executable specifications of the behavior of the software being developed. This is a significant realization which has spawned a new approach to automated testing, known as behavior-driven development. One of the core ideas of behavior-driven development is that your acceptance criteria should be written in the form of the customer’s expectations of the behavior of the application.
…
This approach has some significant advantages. Most specifications begin to become out-of-date as the application evolves. This is not possible for executable specifications: If they don’t specify what the application does accurately, they will raise an exception to that effect when run. The acceptance test stage of the pipeline will fail when run against a version of the application that does not meet its specifications, and that version will therefore not be available for deployment or release.
Acceptance tests are business-facing, which means they should verify that your application d...
Given some initial context, When an event occurs, Then there are some outcomes.
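A sketch of that template as an executable specification, written as a plain test with an invented account domain:

    # sketch: the Given/When/Then template as an executable specification
    # (the account domain and function names are invented for illustration)
    def withdraw(account, amount):
        if amount > account["balance"]:
            raise ValueError("insufficient funds")
        account["balance"] -= amount

    def test_withdrawal_reduces_balance():
        # Given some initial context
        account = {"balance": 100}

        # When an event occurs
        withdraw(account, 30)

        # Then there are some outcomes
        assert account["balance"] == 70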

