Kindle Notes & Highlights
by
Sam Newman
Read between
December 20, 2018 - January 14, 2019
a single-service per host/container. Look at alternative technologies like LXC or...
Types of Tests
The Great Pile-up The long feedback cycles associated with end-to-end tests aren’t just a problem when it comes to developer productivity.
Consumer-Driven Tests to the Rescue
As the JSON Pact specification is created by the consumer, this needs to become an artifact that the producer build has access to. You could store this in your CI/CD tool’s artifact repository, or else use the Pact Broker, which allows you to store multiple versions of your Pact specifications.
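As a rough illustration of the idea in this highlight, the sketch below treats a consumer's expectation as a plain JSON document and checks a producer response against it. This is not the Pact library's API or file format: the field names (`request`, `response`, `status`, `body`), the service names, and the helper `verify_contract` are all hypothetical.

```python
import json

# Hypothetical consumer expectation, kept as JSON so a producer build could load it
# from an artifact repository. The layout is an assumption, not the Pact format.
CONTRACT_JSON = """
{
  "consumer": "web-shop",
  "provider": "customer-service",
  "request": {"method": "GET", "path": "/customers/42"},
  "response": {"status": 200, "body": {"id": 42, "name": "Alice"}}
}
"""


def verify_contract(contract, actual_status, actual_body):
    """Return a list of mismatches between the contract and an actual response."""
    expected = contract["response"]
    problems = []
    if actual_status != expected["status"]:
        problems.append(f"status: expected {expected['status']}, got {actual_status}")
    # Only fields the consumer relies on are checked; extra fields are tolerated.
    for key, value in expected["body"].items():
        if key not in actual_body:
            problems.append(f"body: missing field '{key}'")
        elif actual_body[key] != value:
            problems.append(f"body: field '{key}' has unexpected value {actual_body[key]!r}")
    return problems


contract = json.loads(CONTRACT_JSON)
```

In this sketch the producer build would call `verify_contract` with the response its own service returns for the recorded request, and fail the build if the returned list is not empty.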
They end up using many of those end-to-end journey tests to monitor the production system using a technique called semantic monitoring, which we will discuss more in Chapter 8.
A common example of this is the smoke test suite, a collection of tests designed to be run against newly deployed software to confirm that the deployment worked.
Graphite is one such system that makes this very easy.
Libraries exist for a number of different platforms that allow our services to send metrics to standard systems. Codahale’s Metrics library is one such example library for the JVM.
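To make the idea of sending metrics concrete, here is a minimal sketch that formats and sends one metric using Graphite's plaintext protocol, where each line is a metric path, a value, and a Unix timestamp. The host, the port default of 2003, and the metric name are assumptions about a particular setup. A metrics library would normally handle batching and reconnection for you.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Build one plaintext-protocol line: metric path, value, Unix timestamp."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_metric(path, value, host="localhost", port=2003, timestamp=None):
    """Send a single metric over TCP. Host and port are assumptions about your setup."""
    line = format_metric(path, value, timestamp)
    with socket.create_connection((host, port), timeout=2) as sock:
        sock.sendall(line.encode("ascii"))
```

For example, `send_metric("shop.orders.count", 3)` would report a value of 3 for a hypothetical `shop.orders.count` metric, provided a listener is reachable at the assumed address.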
Software such as Zipkin can also trace calls across multiple system boundaries. Based on the ideas from Google’s own tracing system, Dapper, Zipkin can provide very detailed tracing of interservice calls, along with a UI to help present the
Personally, I’ve found the requirements of Zipkin to be somewhat heavyweight, requiring custom clients and supporting collection systems.
Needing to handle tasks like consistently passing through correlation IDs can be a strong argument for the use of thin shared client wrapper libraries.
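A thin wrapper of the kind this highlight describes might look like the sketch below. The header name `X-Correlation-ID` and the function `outgoing_headers` are assumptions for illustration, and the lookup is a simplification because it treats header names as case-sensitive.

```python
import uuid

CORRELATION_HEADER = "X-Correlation-ID"  # assumed header name, not a standard


def outgoing_headers(incoming_headers, extra=None):
    """Build headers for a downstream call, reusing the caller's correlation ID.

    If the incoming request carries no ID, a new one is generated so that the
    chain of calls starting here can still be traced.
    """
    correlation_id = incoming_headers.get(CORRELATION_HEADER) or str(uuid.uuid4())
    headers = dict(extra or {})
    headers[CORRELATION_HEADER] = correlation_id
    return headers
```

Every service would route its outbound calls through a helper like this, so the ID is passed through the same way everywhere rather than depending on each developer remembering to copy it.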
Some of these libraries, such as Hystrix for the JVM, also do a good job of providing these monitoring capabilities for you.
scope of this book, but a great place to start is Stephen Few’s excellent book Information Dashboard Design: Displaying Data for At-a-Glance Monitoring (Analytics Press).
Riemann is an event server that allows for fairly advanced aggregation and routing of events and can form part of such a solution. Suro is Netflix’s data pipeline and operates in a similar space.
At the time of writing, OpenAM and Gluu are two of the very few options available in this space, compared
So while I think OpenID Connect is the future, it’s quite possible it’ll take a while to reach widespread adoption.
duplicated work.
Another problem is that if we have decided to offload responsibility for authentication to a gateway, it can be harder to reason about how a microservice behaves when looking at it in isolation.
Do be careful, though. Gateway layers tend to take on more and more functionality,
The server needs to manage its own SSL certificates, which can become problematic when it is managing multiple machines. Some organizations take on their own certificate issuing process, which is an additional administrative
It makes calls to a server-side shop application, using the backends-for-frontends pattern we described in Chapter 4.
There is a type of vulnerability called the confused deputy problem, which in the context of service-to-service communication refers to a situation
This problem, unfortunately, has no simple answer, because it isn’t a simple problem. Be aware that it exists, though. Depending on the sensitivity of the operation in question,
If there is nothing else you take away from this chapter, let it be this: don’t write your own crypto. Don’t
Finally, understand the importance of defense in depth, make sure you patch your operating systems, and even if you consider yourself a rock star, don’t try to implement your own cryptography!
discussion of cryptography, check out the book Cryptography Engineering by Niels Ferguson, Bruce Schneier, and Tadayoshi Kohno (Wiley).
Moore’s law, for example, which states that the density of transistors on integrated circuits doubles every two years, has proved to be uncannily accurate (although some people predict that this trend is already slowing).
Any organization that designs a system (defined more broadly here than just information systems) will inevitably produce a design whose structure is a copy of the organization’s communication structure.
This ensured that the architecture of the system was optimized for speed of change. Effectively, Netflix designed the organizational structure for the system architecture it wanted.
One key reason people move toward shared services is to avoid delivery bottlenecks.
Internal Open Source
Conway’s Law in Reverse
The Antifragile Organization
In his book Antifragile (Random House), Nassim Taleb talks about things that actually benefit from failure
The most famous of these programs is the Chaos Monkey, which during certain hours of the day will turn off random machines.
simulates slow network connectivity between machines. Netflix has made these tools available under an open source license.
Not everyone needs to go to the sorts of extremes that Google or Netflix do, but it is important to understand the mindset shift that is required
The Command-Query Responsibility Segregation (CQRS) pattern refers to an alternate model for storing and querying information. With
CAP Theorem
a distributed system, we have three things we can trade off against each other: consistency, availability, and partition tolerance. Specifically, the theorem tells us that we get to keep two in a failure mode.
Getting multinode consistency right is so hard that I would strongly, strongly suggest that if you need it, don’t try to invent it yourself. Instead, pick a data store or lock service that offers these characteristics. Consul, for example, which we’ll discuss shortly, implements a strongly consistent key/value store designed to share configuration between multiple nodes.
Dynamic Service Registries
Zookeeper was originally developed as part of the Hadoop project.
provides a hierarchical namespace for storing information. Clients can insert new nodes in this hierarchy, change them, or query them. Furthermore, they can add watches to nodes to be told when they change.
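The hierarchical namespace and watches described in this highlight can be pictured with a toy in-memory model. This is not the ZooKeeper client API. The class `Namespace` and its methods are hypothetical, and the choice to fire each watch only once is a modelling assumption.

```python
class Namespace:
    """Toy in-memory tree of path -> value with change notifications."""

    def __init__(self):
        self._nodes = {"/": None}
        self._watches = {}

    def create(self, path, value=None):
        """Insert a new node; its parent must already exist."""
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self._nodes:
            raise KeyError(f"parent {parent} does not exist")
        if path in self._nodes:
            raise KeyError(f"{path} already exists")
        self._nodes[path] = value

    def get(self, path, watch=None):
        """Query a node, optionally registering a callback for its next change."""
        value = self._nodes[path]  # raises KeyError if the node is missing
        if watch is not None:
            self._watches.setdefault(path, []).append(watch)
        return value

    def set(self, path, value):
        """Change a node and notify its watchers (each watch fires once)."""
        if path not in self._nodes:
            raise KeyError(f"{path} does not exist")
        self._nodes[path] = value
        for callback in self._watches.pop(path, []):
            callback(path, value)
```

A client interested in where a service lives could `get` a node such as `/services/catalog` with a watch, then re-read and re-register whenever the callback fires.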
Like Zookeeper, Consul supports both configuration management and service discovery. But it goes further than Zookeeper in providing more support for these key use cases. For example, it exposes an HTTP interface for service discovery,
Consul is very new, and given the complexity of the algorithms it uses, this would normally make me hesitant in recommending it for such an important job. That said, Hashicorp, the team behind it, certainly has a great track record in creating very useful open source technology (in the form of both Packer and Vagrant), the project is being actively developed, and I’ve spoken to a few people who are happily using it in production.
Eureka
Netflix’s open source Eureka system bucks the trend of systems like Consul and Zookeeper in that it doesn’t also try to be a general-purpose configuration store. It is actually very targeted in its use case.

