Kindle Notes & Highlights
by
Gene Kim
Continuous Delivery (Addison-Wesley, 2010)
However, that doesn’t mean that DevOps organizations don’t have effective controls. Instead of security and compliance activities only being performed at the end of the project, controls are integrated into every stage of daily work in the software development life cycle, resulting in better quality, security, and compliance outcomes.
they are creating safe systems of work and enabling small teams to quickly and independently develop and validate code that can be safely deployed to customers.
They care not just about implementing user features, but also actively ensure their work flows smoothly and frequently through the entire value stream without causing chaos and disruption to IT Operations or any other internal or external customer.
For most of us, this is not the world we live in. More often than not, the system we work in is broken, resulting in extremely poor outcomes that fall well short of our true potential. In our world, Development and IT Operations are adversaries; testing and Infosec activities happen only at the end of a project, too late to correct any problems found; and almost any critical activity requires too much manual effort and too many handoffs, leaving us to always be waiting. Not only does this contribute to extremely long lead times to get anything done, but the quality of our work, especially…
- Respond to the rapidly changing competitive landscape
- Provide stable, reliable, and secure service to the customer

Frequently, Development will take responsibility for responding to changes in the market, deploying features and changes into production as quickly as possible. IT Operations will take responsibility for providing customers with IT service that is stable, reliable, and secure, making it difficult or even impossible for anyone to introduce production changes that could jeopardize production. Configured this way, Development and IT Operations have diametrically opposed goals and…
Time and time again, we learn that when IT fails, the entire organization fails. As Steven J. Spear noted in his book The High-Velocity Edge, whether the damages “unfold slowly like a wasting disease” or rapidly “like a fiery crash...the destruction can be just as complete.”
As Christopher Little, a software executive and one of the earliest chroniclers of DevOps, said, “Every company is a technology company, regardless of what business they think they’re in. A bank is just an IT company with a banking license.”§
“It is virtually impossible to make any business decision that doesn’t result in at least one IT change.”
BREAKING THE DOWNWARD SPIRAL WITH DEVOPS

Ideally, small teams of developers independently implement their features, validate their correctness in production-like environments, and have their code deployed into production quickly, safely, and securely. Code deployments are routine and predictable. Instead of starting deployments at midnight on Friday and spending all weekend working to complete them, deployments occur throughout the business day when everyone is already in the office and without our customers even noticing—except when they see new features and bug fixes that delight them. And…
In this scenario, everyone feels productive—the architecture allows small teams to work safely and architecturally decoupled from the work of other teams who use self-service platforms that leverage the collective experience of Operations and Information Security. Instead of everyone waiting all the time, with large amounts of late, urgent rework, teams work independently and productively in small batches, quickly and frequently delivering new value to customers.
because everyone fully owns the quality of their work, everyone builds automated testing into their daily work and uses peer reviews to gain confidence that problems are addressed long before they can impact a customer. These processes mitigate risk, as opposed to approvals from distant authorities, allowing us to deliver value quickly, reliably, and securely—even proving to skeptical auditors that we have an effective system of internal controls. And when something does go wrong, we conduct blameless post-mortems, not to punish anyone, but to better understand what caused the accident and how…
When we increase the number of developers, individual developer productivity often significantly decreases due to communication, integration, and testing overhead. This is highlighted in the famous book by Frederick P. Brooks, The Mythical Man-Month, where he explains that when projects are late, adding more developers not only decreases individual developer productivity but also decreases overall productivity.
DevOps shows us that when we have the right architecture, the right technical practices, and the right cultural norms, small teams of developers are able to quickly, safely, and independently develop, integrate, test, and deploy changes into production.
technology value stream, including Product Management, Development, QA, IT Operations, and Infosec.
high-trust management model.
Value Stream Mapping: How to Visualize Work and Align Leadership for Organizational Transformation as “the sequence of activities an organization undertakes to deliver upon a customer request” or “the sequence of activities required to design, produce, and deliver a good or service to a customer, including the dual flows of information and material.”
In DevOps, we typically define our technology value stream as the process required to convert a business hypothesis into a technology-enabled service that delivers value to the customer.
The input to our process is the formulation of a business objective, concept, idea, or hypothesis; the process starts when we accept the work in Development, adding it to our committed backlog of work.
From there, Development teams that follow a typical Agile or iterative process will likely transform that idea into user stories and some sort of feature specification, which is then implemented in code into the application or service being built. The code is then checked in to the version control repository...
Because value is created only when our services are running in production, we must ensure that we are not only delivering fast flow, but that our deployments can also be performed without causing chaos and disruptions such as service outages…
Instead of large batches of work being processed sequentially through the design/development value stream and then through the test/operations value stream (such as when we have a large batch waterfall process or long-lived feature branches), our goal is to have testing and operations happening simultaneously with design/development, enabling fast flow and high quality. This method succeeds when we work in small batches and build quality into every part of our value stream.†††
This is most easily achieved when we have architecture that is modular, well encapsulated, and loosely coupled, so that small teams are able to work with high degrees of autonomy, with failures being small and contained, and without causing global disruptions.
The First Way enables fast left-to-right flow of work from Development to Operations to the customer. In order to maximize flow, we need to make work visible, reduce our batch sizes and intervals of work, build in quality by preventing defects from being passed to downstream work centers, and constantly optimize for the global goals.
The Second Way enables the fast and constant flow of feedback from right to left at all stages of our value stream. It requires that we amplify feedback to prevent problems from happening again, or enable faster detection and recovery.
The Third Way enables the creation of a generative, high-trust culture that supports a dynamic, disciplined, and scientific approach to experimentation and risk-taking, facilitating the creation of organizational learning, both from our successes and failures.
Infosec can only do compliance checking, which is the opposite of security engineering—and
Rugged DevOps.
DevOpsSec,
One way we can do this is by inviting Infosec to the product demonstrations at the end of each development interval so that they can better understand the team goals in the context of organizational goals, observe their implementations as they are being built, and provide guidance and feedback at the earliest stages of the project, when there is the most time and freedom to make corrections.
“When it came to information security and compliance, we found that blockages at the end of the project were much more expensive than at the beginning—and Infosec blockages were among the worst. ‘Compliance by demonstration’ became one of the rituals we used to shift all this complexity earlier in the process.”
When Infosec is an assigned part of the team, even if they are only being kept informed and observing the process, they gain the business context they need to make better risk-based decisions. Furthermore, Infosec is able to help feature teams learn what is required to meet security and compliance objectives.
When possible, we want to track all open security issues in the same work tracking system that Development and Operations are using, ensuring the work is visible and can be prioritized against all other work. This is very different from how Infosec has traditionally worked, where all security vulnerabilities are stored in a GRC (governance, risk, and compliance) tool that only Infosec has access to.
“We put all security issues into JIRA, which all engineers use in their daily work, and they were either ‘P1’ or ‘P2,’ meaning that they had to be fixed immediately or by the end of the week, even if the issue is only an internally-facing application.”
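A minimal sketch of what filing a finding into that shared tracker might look like against JIRA's REST API. The project key, priority names, and environment variables here are illustrative assumptions, not details from the book:

```python
# Hedged sketch: file a security finding where all other engineering work
# lives, so it can be prioritized alongside feature work.
import os
import requests

JIRA_URL = os.environ["JIRA_URL"]                  # e.g. https://jira.example.com (assumed)
AUTH = (os.environ["JIRA_USER"], os.environ["JIRA_TOKEN"])

def file_security_issue(summary: str, description: str, priority: str = "P1") -> str:
    """Create a security issue in the same backlog engineers use daily."""
    payload = {
        "fields": {
            "project": {"key": "SEC"},             # hypothetical project key
            "issuetype": {"name": "Bug"},
            "priority": {"name": priority},        # P1 = fix now, P2 = fix this week
            "summary": summary,
            "description": description,
        }
    }
    resp = requests.post(f"{JIRA_URL}/rest/api/2/issue", json=payload, auth=AUTH)
    resp.raise_for_status()
    return resp.json()["key"]                      # e.g. "SEC-123"

if __name__ == "__main__":
    key = file_security_issue(
        "Reflected XSS in search endpoint",
        "Found by automated scan; see pipeline run logs.",
    )
    print(f"Filed {key}")
```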
“Any time we had a security issue, we would conduct a post-mortem, because it would result in better educating our engineers on how to prevent it from happening again in the future, as well as a fantastic mechanism for transferring security knowledge to our engineering teams.”
Because everyone in the DevOps value stream uses version control for anything they build or support, putting our information security artifacts there makes it much easier to influence the daily work of Dev and Ops, since anything we create is available, searchable, and reusable.
create and operate shared security-relevant platforms, such as authentication, authorization, logging, and other security and auditing services that Dev and Ops require.
When engineers use one of these predefined libraries or services, they won’t need to schedule a separate security design review for that module; they’ll be using the guidance we’ve created concerning configuration hardening, database security settings, key lengths, and so forth.
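A minimal sketch of this kind of shared, pre-reviewed security library, using only the Python standard library: engineers call one vetted function instead of hand-rolling TLS settings. The specific hardening choices shown are illustrative assumptions, not a prescribed baseline:

```python
# Hedged sketch of a shared security library: consumers inherit vetted
# TLS defaults, so no separate security design review is needed per service.
import ssl
import urllib.request

def hardened_client_context() -> ssl.SSLContext:
    """Return a TLS context with our pre-approved, hardened defaults."""
    ctx = ssl.create_default_context()              # verifies certs and hostnames
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2    # refuse legacy protocols
    ctx.set_ciphers("ECDHE+AESGCM")                 # forward-secret AEAD suites (TLS <= 1.2)
    return ctx

# Usage: any HTTPS client built this way gets the vetted configuration.
opener = urllib.request.build_opener(
    urllib.request.HTTPSHandler(context=hardened_client_context())
)
```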
Ideally, these automated security tests will be run in our deployment pipeline alongside the other static code analysis tools.
Tools such as Gauntlt have been designed to integrate into deployment pipelines, running automated security tests against our applications, our application dependencies, our environment, etc.
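Gauntlt itself drives attacks from Gherkin-style ".attack" files; as a language-neutral illustration of the same idea, here is a hedged Python sketch of security checks running as ordinary pipeline test steps. The source path and tool choices (bandit for static analysis, pip-audit for dependency checks) are assumptions:

```python
# Hedged sketch: security checks as pytest-style pipeline steps, failing
# the build the same way a broken unit test would.
import subprocess

def test_static_security_analysis():
    """Fail the pipeline if bandit finds medium-or-higher severity issues."""
    result = subprocess.run(
        ["bandit", "-r", "src/", "-ll"],          # -ll: report medium severity and up
        capture_output=True, text=True,
    )
    assert result.returncode == 0, result.stdout

def test_dependency_vulnerabilities():
    """Fail the pipeline if any pinned dependency has a known vulnerability."""
    result = subprocess.run(
        ["pip-audit", "-r", "requirements.txt"],
        capture_output=True, text=True,
    )
    assert result.returncode == 0, result.stdout
```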
This case study illustrates just how necessary it is to integrate security into the daily work and tools of DevOps and how effectively it can work. Doing so mitigates security risk, reduces the probability of vulnerabilities in the system, and helps teach developers to write more secure code.
we should do whatever is required to help ensure that the environments are in a hardened, risk-reduced state.
We do this by generating automated tests to ensure that all appropriate settings have been correctly applied for configuration hardening, database security settings, key lengths, and so forth. Furthermore, we will use tests to scan our environments for known vulnerabilities.¶
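A minimal sketch of such automated hardening checks. The file paths and the specific settings asserted are illustrative assumptions; real policies would come from your own hardening baseline:

```python
# Hedged sketch: assert configuration hardening and key-length policy.
from cryptography.hazmat.primitives.serialization import load_pem_private_key

def test_ssh_password_auth_disabled():
    """Configuration hardening: password logins must be off."""
    settings = {}
    with open("/etc/ssh/sshd_config") as f:         # assumed path
        for line in f:
            parts = line.split(None, 1)
            if len(parts) == 2 and not line.lstrip().startswith("#"):
                settings[parts[0]] = parts[1].strip()
    assert settings.get("PasswordAuthentication") == "no"

def test_tls_key_length():
    """Key-length policy: server keys must be at least 2048-bit RSA."""
    with open("/etc/ssl/private/server.key", "rb") as f:   # assumed path
        key = load_pem_private_key(f.read(), password=None)
    assert key.key_size >= 2048
```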
Examples of tools for this include Nmap, to ensure that only expected ports are open, and Metasploit, to ensure that we’ve adequately hardened our environments against known vulnerabilities, such as by scanning for SQL injection attacks. The output of these tools should be put into our artifact repository and compared with the previous version as part of our functional testing process.
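A minimal sketch of the Nmap check just described: assert that only the ports in our baseline are open. The target host and expected-port set are illustrative assumptions:

```python
# Hedged sketch: fail the build if an unexpected port is listening.
import re
import subprocess

EXPECTED_OPEN_PORTS = {22, 443}     # our hardening baseline (assumed)

def open_ports(host: str) -> set[int]:
    """Scan with nmap and parse open ports from its grepable (-oG) output."""
    out = subprocess.run(
        ["nmap", "-p", "1-65535", "--open", "-oG", "-", host],
        capture_output=True, text=True, check=True,
    ).stdout
    return {int(p) for p in re.findall(r"(\d+)/open", out)}

def test_only_expected_ports_open():
    assert open_ports("app.example.com") == EXPECTED_OPEN_PORTS
```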
In order to detect problematic user behavior that could be an indicator or enabler of fraud and unauthorized access, we must create the relevant telemetry in our applications.
For instance, as an early indicator of brute-force login attempts to gain unauthorized access, we might display the ratio of unsuccessful login attempts to successful logins. And, of course, we should create alerting around important events to ensure we can detect and correct issues quickly.
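A minimal sketch of emitting that login telemetry via the common statsd pattern; the metric names and the statsd host/port are illustrative assumptions:

```python
# Hedged sketch: count every login attempt so dashboards can plot the
# failed:successful ratio, an early indicator of brute-force attacks.
from statsd import StatsClient

statsd = StatsClient("localhost", 8125, prefix="auth")   # assumed endpoint

def record_login_attempt(success: bool) -> None:
    statsd.incr("login.success" if success else "login.failure")
```

Alerting would then fire when the failure-to-success ratio exceeds its normal range, rather than on any single failed login.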
We need to monitor and potentially alert on items including the following:

- OS changes (e.g., in production, in our build infrastructure)
- Security group changes
- Changes to configurations (e.g., OSSEC, Puppet, Chef, Tripwire)
- Cloud infrastructure changes (e.g., VPC, security groups, users and privileges)
- XSS attempts (i.e., “cross-site scripting attacks”)
- SQLi attempts (i.e., “SQL injection attacks”)
- Web server errors (e.g., 4XX and 5XX errors)

We also want to confirm that we’ve correctly configured our logging so that all telemetry is being sent to the right place.
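As one concrete illustration of the last item, here is a hedged sketch that buckets web server status codes from an access log. The log path, common-log-format assumption, and alert threshold are all illustrative:

```python
# Hedged sketch: watch 4XX/5XX rates in a web server access log.
import re
from collections import Counter

STATUS_RE = re.compile(r'" (\d{3}) ')   # status code field in common log format

def error_counts(log_path: str) -> Counter:
    counts = Counter()
    with open(log_path) as f:
        for line in f:
            m = STATUS_RE.search(line)
            if m:
                counts[m.group(1)[0] + "XX"] += 1   # bucket as 2XX/4XX/5XX
    return counts

if __name__ == "__main__":
    c = error_counts("/var/log/nginx/access.log")   # assumed path
    if c.get("5XX", 0) > 10:                        # assumed alert threshold
        print("ALERT: elevated 5XX errors:", c["5XX"])
```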
“One of the results of showing this graph was that developers realized that they were being attacked all the time! And that was awesome, because it changed how developers thought about the security of their code as they were writing the code.”

