In this chapter
- You get a foundation for the rest of the book.
- You write your first search algorithm (binary search).
- You learn how to talk about the running time of an algorithm (Big O notation).
- You’re introduced to a common technique for designing algorithms (recursion).
log₁₀ 100 is like asking, “How many 10s do we multiply together to get 100?” The answer is 2: 10 × 10. So log₁₀ 100 = 2. Logs are the flip of exponentials.
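Binary search is where that log shows up: each guess halves the remaining list, so a list of 100 items takes at most 7 guesses (in Big O notation, log always means log₂). Here's a sketch close to the book's Python version; the example list is just for illustration.

def binary_search(sorted_list, item):
    low = 0
    high = len(sorted_list) - 1
    while low <= high:
        mid = (low + high) // 2        # check the middle element
        guess = sorted_list[mid]
        if guess == item:
            return mid                 # found it
        if guess > item:
            high = mid - 1             # too high: keep the left half
        else:
            low = mid + 1              # too low: keep the right half
    return None                        # the item isn't in the list

print(binary_search([1, 3, 5, 7, 9], 3))   # 1
print(binary_search([1, 3, 5, 7, 9], -1))  # None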
Big O notation
Big O notation is special notation that tells you how fast an algorithm is. Who cares? Well, it turns out that you’ll use other people’s algorithms often—and when you do, it’s nice to understand how fast or slow they are. In this section, I’ll explain what Big O notation is and give you a list of the most common running times for algorithms using it.
Here are five Big O run times that you’ll encounter a lot, sorted from fastest to slowest:
- O(log n), also known as log time. Example: Binary search.
- O(n), also known as linear time. Example: Simple search.
- O(n * log n). Example: A fast sorting algorithm, like quicksort (coming up in chapter 4).
- O(n²). Example: A slow sorting algorithm, like selection sort (coming up in chapter 2).
- O(n!). Example: A really slow algorithm, like the traveling salesperson (coming up next!).
Algorithm speed isn’t measured in seconds, but in growth of the number of operations. Instead, we talk about how quickly the run time of an algorithm increases as the size of the input increases. Run time of algorithms is expressed in Big O notation. O(log n) is faster than O(n), but it gets a lot faster as the list of items you’re searching grows.
Linked lists are great if you’re going to read all the items one at a time: you can read one item, follow the address to the next item, and so on. But if you’re going to keep jumping around, linked lists are terrible.
What’s better if you want to insert elements in the middle: arrays or lists? With lists, it’s as easy as changing what the previous element points to. With arrays, you have to shift all the rest of the elements down.
What if you want to delete an element? Again, lists are better, because you just need to change what the previous element points to. With arrays, everything needs to be moved up when you delete an element.
There are two different types of access: random access and sequential access. Sequential access means reading the elements one by one, starting at the first element. Linked lists can only do sequential access. If you want to read the 10th element of a linked list, you have to read the first 9 elements and follow the links to the 10th element. Random access means you can jump directly to the 10th element.
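A toy sketch makes both points concrete (the Node class here is mine, not the book's): reading the nth element means following n links, but inserting or deleting next to a node you already hold is a single pointer change.

class Node:
    """One element of a singly linked list: a value plus the address of the next element."""
    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node

head = Node(1, Node(2, Node(4)))   # the list 1 -> 2 -> 4

# Sequential access: reaching the 3rd element means following two links.
node = head
for _ in range(2):
    node = node.next
print(node.value)                  # 4

# Insertion in the middle: change what the previous element points to.
prev = head.next                   # the node holding 2
prev.next = Node(3, prev.next)     # now 1 -> 2 -> 3 -> 4

# Deletion: skip over the unwanted node the same way.
prev.next = prev.next.next         # back to 1 -> 2 -> 4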
Recap
- Your computer’s memory is like a giant set of drawers.
- When you want to store multiple elements, use an array or a list.
- With an array, all your elements are stored right next to each other.
- With a list, elements are strewn all over, and one element stores the address of the next one.
- Arrays allow fast reads.
- Linked lists allow fast inserts and deletes.
- All elements in the array should be the same type (all ints, all doubles, and so on).
Quicksort
Quicksort is a sorting algorithm. It’s much faster than selection sort and is frequently used in real life. For example, the C standard library has a function called qsort, which is its implementation of quicksort. Quicksort also uses D&C (divide and conquer).
D&C works by breaking a problem down into smaller and smaller pieces. If you’re using D&C on a list, the base case is probably an empty array or an array with one element. If you’re implementing quicksort, choose a random element as the pivot. The average runtime of quicksort is O(n log n)! The constant in Big O notation can matter sometimes. That’s why quicksort is faster than merge sort.
The constant almost never matters for simple search versus binary search, because O(log n) is so much faster than O(n) when your list gets big.
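Here's a quicksort sketch close to the book's Python version, except it picks the random pivot recommended above (the book's own code pivots on the first element):

import random

def quicksort(array):
    if len(array) < 2:
        return array                         # base case: already sorted
    pivot = random.choice(array)             # random pivot keeps the average case O(n log n)
    less = [x for x in array if x < pivot]
    equal = [x for x in array if x == pivot]
    greater = [x for x in array if x > pivot]
    return quicksort(less) + equal + quicksort(greater)

print(quicksort([10, 5, 2, 3]))  # [2, 3, 5, 10]

Real qsort implementations sort in place; like the book's version, this one trades memory for clarity.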
Collisions
Like I said earlier, most languages have hash tables. You don’t need to know how to write your own. So, I won’t talk about the internals of hash tables too much. But you still care about performance! To understand the performance of hash tables, you first need to understand what collisions are. The next two sections cover collisions and performance. First, I’ve been telling you a white lie: I told you that a hash function always maps different keys to different slots in the array. In reality, two keys sometimes map to the same slot; that’s a collision.
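The book's fix is to start a linked list at the crowded slot. A toy sketch of that idea, using plain Python lists as the chains (the class and its grocery prices are illustrative, not the book's code):

class SimpleHashTable:
    """Toy hash table that resolves collisions by chaining: one list of key/value pairs per slot."""
    def __init__(self, size=10):
        self.slots = [[] for _ in range(size)]

    def _index(self, key):
        return hash(key) % len(self.slots)   # different keys can land in the same slot

    def set(self, key, value):
        chain = self.slots[self._index(key)]
        for pair in chain:
            if pair[0] == key:               # key already present: update it
                pair[1] = value
                return
        chain.append([key, value])           # otherwise append to the slot's chain

    def get(self, key):
        for k, v in self.slots[self._index(key)]:
            if k == key:
                return v
        return None

prices = SimpleHashTable(size=2)             # a tiny table forces collisions
prices.set("apple", 0.67)
prices.set("milk", 1.49)
prices.set("avocado", 1.49)
print(prices.get("milk"))                    # 1.49, even though slots are shared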
If you enqueue two items to the list, the first item you added will be dequeued before the second item. You can use this for your search list! People who are added to the list first will be dequeued and searched first. The queue is called a FIFO data structure: First In, First Out. In contrast, a stack is a LIFO data structure: Last In, First Out.
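In Python, collections.deque gives you an efficient queue, and a plain list works as a stack; the names below come from the book's friend-search example.

from collections import deque

queue = deque()                 # FIFO: first in, first out
queue.append("claire")          # enqueue
queue.append("anuj")
print(queue.popleft())          # dequeue -> "claire" (added first, comes out first)

stack = []                      # LIFO: last in, first out
stack.append("claire")          # push
stack.append("anuj")
print(stack.pop())              # pop -> "anuj" (added last, comes off first)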
Running time
If you search your entire network for a mango seller, that means you’ll follow each edge (remember, an edge is the arrow or connection from one person to another). So the running time is at least O(number of edges). You also keep a queue of every person to search. Adding one person to the queue takes constant time: O(1). Doing this for every person will take O(number of people) total. Breadth-first search takes O(number of people + number of edges), and it’s more commonly written as O(V+E) (V for number of vertices, E for number of edges).
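Here's a breadth-first search sketch close to the book's implementation, with its example graph and its toy mango-seller check (a name ending in "m"):

from collections import deque

graph = {
    "you": ["alice", "bob", "claire"],
    "alice": ["peggy"],
    "bob": ["anuj", "peggy"],
    "claire": ["thom", "jonny"],
    "anuj": [], "peggy": [], "thom": [], "jonny": [],
}

def person_is_seller(name):
    return name[-1] == "m"               # the book's toy test for a mango seller

def search(start):
    search_queue = deque(graph[start])
    searched = set()                     # never check the same person twice
    while search_queue:
        person = search_queue.popleft()  # dequeue: first added, first searched
        if person in searched:
            continue
        if person_is_seller(person):
            print(person + " is a mango seller!")
            return True
        search_queue += graph[person]    # enqueue all of this person's friends
        searched.add(person)
    return False

search("you")                            # thom is a mango seller!

Every edge is followed once (O(E)) and every person passes through the queue (O(V)), which is exactly where O(V+E) comes from.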
To recap, Dijkstra’s algorithm has four steps:
1. Find the cheapest node. This is the node you can get to in the least amount of time.
2. Check whether there’s a cheaper path to the neighbors of this node. If so, update their costs.
3. Repeat until you’ve done this for every node in the graph.
4. Calculate the final path. (Coming up in the next section!)
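Here's a sketch close to the book's implementation, run on a small start/a/b/fin graph (the weights are illustrative):

graph = {
    "start": {"a": 6, "b": 2},
    "a": {"fin": 1},
    "b": {"a": 3, "fin": 5},
    "fin": {},
}
infinity = float("inf")
costs = {"a": 6, "b": 2, "fin": infinity}        # cheapest known cost to reach each node
parents = {"a": "start", "b": "start", "fin": None}
processed = set()

def find_lowest_cost_node(costs):
    lowest = None                                # step 1: find the cheapest unprocessed node
    for node, cost in costs.items():
        if node not in processed and (lowest is None or cost < costs[lowest]):
            lowest = node
    return lowest

node = find_lowest_cost_node(costs)
while node is not None:                          # step 3: repeat for every node
    for neighbor, weight in graph[node].items():
        new_cost = costs[node] + weight
        if new_cost < costs[neighbor]:           # step 2: cheaper path to this neighbor?
            costs[neighbor] = new_cost
            parents[neighbor] = node             # remember how we got there
    processed.add(node)
    node = find_lowest_cost_node(costs)

path = ["fin"]                                   # step 4: walk the parents back to the start
while path[-1] != "start":
    path.append(parents[path[-1]])
print(costs["fin"], list(reversed(path)))        # 6 ['start', 'b', 'a', 'fin']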
Recap
- Breadth-first search is used to calculate the shortest path for an unweighted graph.
- Dijkstra’s algorithm is used to calculate the shortest path for a weighted graph.
- Dijkstra’s algorithm works when all the weights are positive.
- If you have negative weights, use the Bellman-Ford algorithm.
Recap
- Greedy algorithms optimize locally, hoping to end up with a global optimum.
- NP-complete problems have no known fast solution.
- If you have an NP-complete problem, your best bet is to use an approximation algorithm.
- Greedy algorithms are easy to write and fast to run, so they make good approximation algorithms.
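That last recap point is the book's set-covering example in a nutshell: at each step, greedily pick the radio station that covers the most uncovered states. A sketch close to the book's code:

states_needed = {"mt", "wa", "or", "id", "nv", "ut", "ca", "az"}
stations = {
    "kone": {"id", "nv", "ut"},
    "ktwo": {"wa", "id", "mt"},
    "kthree": {"or", "nv", "ca"},
    "kfour": {"nv", "ut"},
    "kfive": {"ca", "az"},
}

final_stations = set()
while states_needed:
    best_station = None
    states_covered = set()
    for station, states in stations.items():
        covered = states_needed & states         # uncovered states this station would add
        if len(covered) > len(states_covered):   # greedy step: keep the best so far
            best_station = station
            states_covered = covered
    states_needed -= states_covered
    final_stations.add(best_station)

print(final_stations)  # {'kone', 'ktwo', 'kthree', 'kfive'}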
But that still doesn’t tell you the formula to use. Computer scientists sometimes joke about using the Feynman algorithm. The Feynman algorithm is named after the famous physicist Richard Feynman, and it works like this: Write down the problem. Think real hard. Write down the solution. Computer scientists are a fun bunch!
Recap
- Dynamic programming is useful when you’re trying to optimize something given a constraint.
- You can use dynamic programming when the problem can be broken into discrete subproblems.
- Every dynamic-programming solution involves a grid.
- The values in the cells are usually what you’re trying to optimize.
- Each cell is a subproblem, so think about how you can divide your problem into subproblems.
- There’s no single formula for calculating a dynamic-programming solution.
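To see the grid in code, here's a sketch of the book's knapsack example (a 4 lb bag; a guitar, a stereo, and a laptop). Each cell holds the answer to a smaller subproblem: the best value using only the first i items and a bag of capacity w.

items = [("guitar", 1, 1500), ("stereo", 4, 3000), ("laptop", 3, 2000)]  # (name, lb, $)
capacity = 4

# One row per item, one column per capacity 0..4, initialized to 0.
grid = [[0] * (capacity + 1) for _ in range(len(items) + 1)]

for i, (name, weight, value) in enumerate(items, start=1):
    for w in range(1, capacity + 1):
        without_item = grid[i - 1][w]                    # best value from the previous row
        with_item = 0
        if weight <= w:                                  # the item fits:
            with_item = value + grid[i - 1][w - weight]  # its value + best use of leftover space
        grid[i][w] = max(without_item, with_item)

print(grid[-1][-1])  # 3500: the guitar plus the laptop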
From the graph, you can tell visually that fruits A and B are similar. Let’s measure how close they are. To find the distance between two points, you use the Pythagorean formula: distance = √((x₁ − x₂)² + (y₁ − y₂)²). The distance between A and B, for example, works out to 1. You can find the rest of the distances, too.
A mathematician would say, instead of calculating the distance in two dimensions, you’re now calculating the distance in five dimensions. But the distance formula remains the same. It just involves a set of five numbers instead of a set of two numbers.
The distance formula is flexible: you could have a set of a million numbers and still use the same old distance formula to find the distance. Maybe you’re wondering, “What does distance mean when you have five numbers?” The distance tells you how similar those sets of numbers are. Here’s the distance between Priyanka and Justin.
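A short sketch of the same formula in Python; the 2-D coordinates and the five ratings below are stand-ins for the book's figures.

from math import sqrt

def distance(a, b):
    """Euclidean distance between two equal-length sets of numbers."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Two dimensions: two nearby fruits on the graph.
print(distance((2, 2), (2, 3)))          # 1.0

# Five dimensions: ratings in five movie categories.
priyanka = (3, 4, 4, 1, 4)
justin = (4, 3, 5, 1, 5)
print(distance(priyanka, justin))        # 2.0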
Introduction to machine learning
KNN is a really useful algorithm, and it’s your introduction to the magical world of machine learning! Machine learning is all about making your computer more intelligent. You already saw one example of machine learning: building a recommendations system. Let’s look at some other examples.
You found Maggie! It’s almost like running a binary search! Searching for an element in a binary search tree takes O(log n) time on average and O(n) time in the worst case. Searching a sorted array takes O(log n) time in the worst case, so you might think a sorted array is better. But a binary search tree is a lot faster for insertions and deletions on average.
Binary search trees have some downsides too: for one thing, you don’t get random access. You can’t say, “Give me the fifth element of this tree.” Those performance times are also on average and rely on the tree being balanced. Suppose you have an imbalanced tree like the one shown next.
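The book doesn't implement a binary search tree (in practice you'd reach for a library), but a minimal sketch shows how search mirrors binary search when the tree is balanced:

class TreeNode:
    def __init__(self, value):
        self.value = value
        self.left = None     # everything smaller goes left
        self.right = None    # everything larger goes right

def insert(root, value):
    if root is None:
        return TreeNode(value)
    if value < root.value:
        root.left = insert(root.left, value)
    else:
        root.right = insert(root.right, value)
    return root

def search(root, value):
    while root is not None:
        if value == root.value:
            return True
        # Each step discards one side of the tree, like halving in binary search.
        root = root.left if value < root.value else root.right
    return False

root = None
for name in ["david", "adit", "manning", "maggie", "mike"]:
    root = insert(root, name)
print(search(root, "maggie"))  # True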
The Fourier transform
The Fourier transform is one of those rare algorithms: brilliant, elegant, and with a million use cases. The best analogy for the Fourier transform comes from Better Explained (a great website that explains math simply): given a smoothie, the Fourier transform will tell you the ingredients in the smoothie.[1] Or, to put it another way, given a song, the transform can separate it into individual frequencies.
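The book gives no code here, but NumPy's FFT makes the smoothie analogy concrete: mix two sine waves, and the transform recovers their frequencies (the signal below is made up for illustration).

import numpy as np

sample_rate = 100                                  # samples per second
t = np.arange(0, 1, 1 / sample_rate)               # one second of time
signal = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 12 * t)

spectrum = np.abs(np.fft.rfft(signal))             # "unblend the smoothie"
freqs = np.fft.rfftfreq(len(signal), d=1 / sample_rate)

top_two = freqs[np.argsort(spectrum)[-2:]]         # the two strongest ingredients
print(sorted(top_two))                             # [5.0, 12.0]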
Linear programming
I saved the best for last. Linear programming is one of the coolest things I know. Linear programming is used to maximize something given some constraints. For example, suppose your company makes two products, shirts and totes. Shirts need 1 meter of fabric and 5 buttons. Totes need 2 meters of fabric and 2 buttons. You have 11 meters of fabric and 20 buttons. You make $2 per shirt and $3 per tote. How many shirts and totes should you make to maximize your profit? Here you’re trying to maximize profit, and you’re constrained by the amount of materials you have.
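The book doesn't solve this in code, but SciPy's linprog handles it directly. linprog minimizes, so the profits are negated; the constraint numbers mirror the ones above.

from scipy.optimize import linprog

c = [-2, -3]                      # maximize 2s + 3t by minimizing -(2s + 3t)
A_ub = [[1, 2],                   # fabric:  1s + 2t <= 11 meters
        [5, 2]]                   # buttons: 5s + 2t <= 20
b_ub = [11, 20]

result = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
shirts, totes = result.x
print(shirts, totes, -result.fun)  # 2.25 4.375 17.625

The linear-programming optimum is fractional; requiring whole shirts and totes would turn this into an integer-programming problem.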
Epilogue
I hope this quick tour of 10 algorithms showed you how much more is left to discover. I think the best way to learn is to find something you’re interested in and dive in. This book gave you a solid foundation to do just that.

