Kindle Notes & Highlights
Read between February 22 - April 5, 2021
Computer science is not about machines, in the same way that astronomy is not about telescopes. There is an essential unity of mathematics and computer science.
When you’re on a complex task, keep your brain at the top of its game: dump all important stuff on paper.
Programmers often use flowcharts for writing down computing processes. When doing so, follow standard guidelines so others can understand your flowcharts.
A model is a set of concepts that represents a problem and its characteristics.
Hot Server: A server crashes if it’s overheating while the air conditioning is off. It also crashes if it’s overheating and its chassis cooler fails. In which conditions does the server work?
Whenever you’re dealing with things that assume one of two possibilities, remember they can be modeled as logic variables.
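For example, the Hot Server puzzle can be modeled with three boolean variables (a sketch of mine; the names are illustrative):

    # Hot Server, modeled with boolean variables (names are mine, not the book's).
    def server_crashes(overheating, ac_on, cooler_fails):
        # Crashes if overheating with the AC off, or overheating with a failed cooler.
        return overheating and (not ac_on or cooler_fails)

    # Enumerate all combinations and print the ones where the server works:
    # it must not be overheating, or both AC and chassis cooler must be fine.
    for overheating in (True, False):
        for ac_on in (True, False):
            for cooler_fails in (True, False):
                if not server_crashes(overheating, ac_on, cooler_fails):
                    print(overheating, ac_on, cooler_fails)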
The working principles of the modern CPU are still based on boolean algebra. A modern CPU is just a circuit of millions of microscopic wires and logic gates that manipulate electric currents carrying information.
When the outcome of an event does not influence the outcome of another event, they are independent. The probability that two independent events will happen is the product of their individual probabilities.
When two events cannot happen simultaneously, they are mutually exclusive.
When two mutually exclusive events cover all possible outcomes, they are complementary.
Past events never affect the outcome of an independent event. Never. Ever.
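For example, with a fair die (my own illustration of these three rules):

    from fractions import Fraction

    p_six = Fraction(1, 6)                # rolling a six
    # Independent events multiply: two sixes in a row.
    p_two_sixes = p_six * p_six           # 1/36
    # Mutually exclusive events add: a single roll can't be both a 1 and a 2.
    p_one_or_two = Fraction(1, 6) + Fraction(1, 6)   # 1/3
    # Complementary events: "six" and "not six" cover all outcomes.
    p_not_six = 1 - p_six                 # 5/6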
For instance, you can use the “Pigeonhole Principle” to prove at least two people in New York City have exactly the same number of hairs: a human head has far fewer than a million hairs, yet the city has millions of inhabitants, so at least one hair count must be shared!
A method is a list of unambiguous instructions for achieving a goal. A method that always requires a finite series of operations is called an algorithm.
Time complexity is written T(n). It gives the number of operations the algorithm performs when processing an input of size n. We also refer to an algorithm's T(n) as its running cost.
There's a special notation to refer to classes of growth: the Big-O notation.
The notation is used for expressing the dominant term of algorithms' cost functions in the worst case—that's the standard way of expressing time complexity.
That's why you need time complexity analysis when you design systems that handle very large inputs.
Some algorithms always run for a constant duration regardless of input size—they're O(1).
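For illustration (my own toy functions, one per growth class):

    def get_first(items):          # O(1): constant, regardless of input size
        return items[0]

    def contains(items, target):   # O(n): scans the whole input in the worst case
        for item in items:
            if item == target:
                return True
        return False

    def has_duplicate(items):      # O(n^2): compares every pair of items
        for i in range(len(items)):
            for j in range(i + 1, len(items)):
                if items[i] == items[j]:
                    return True
        return False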
During execution, algorithms need working storage to keep track of their ongoing calculations. This consumes computer memory, which is not infinite.
The measure for the working storage an algorithm needs is called space complexity.
We observe how space complexity evolves when the algorithm's input size grows.
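For example (a sketch of mine): summing a list needs only constant working storage, while detecting duplicates with a set can consume memory proportional to the input:

    def total(numbers):        # O(1) space: a single running accumulator
        result = 0
        for n in numbers:
            result += n
        return result

    def seen_before(numbers):  # O(n) space: the set can grow as large as the input
        seen = set()
        for n in numbers:
            if n in seen:
                return True
            seen.add(n)
        return False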
In future chapters, we'll see how clever data handling can improve space complexity.
The iterative strategy consists in using loops (e.g. for, while) to repeat a process until a condition is met. Each step in a loop is called an iteration.
The process can be written in a single while loop:
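A sketch of such a loop (my example, not necessarily the book's): checking whether a word is a palindrome by walking inward from both ends.

    def palindrome(word):
        # Compare characters from both ends, moving inward one step per iteration.
        left, right = 0, len(word) - 1
        while left < right:
            if word[left] != word[right]:
                return False
            left += 1
            right -= 1
        return True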
We say there's recursion when a function delegates work to clones of itself.
How do you code a function that returns the nth Fibonacci number?
Recursive algorithms have base cases, when the input is too small to be further reduced. Base cases for fib are numbers 1 and 2; for palindrome, they're words with one or zero characters.
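A direct recursive fib following those base cases might look like this (a sketch, counting Fibonacci numbers from 1):

    def fib(n):
        if n <= 2:                        # base cases: fib(1) = fib(2) = 1
            return 1
        return fib(n - 1) + fib(n - 2)    # delegate the rest to two clones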
This simplicity comes at a price. Recursive algorithms spawn numerous clones of themselves when they run, introducing a computational overhead.
This potential problem can be visualized in recursion trees: a diagram showing how the algorithm spawns more calls as it delves deeper in calculations.
If performance must be maximized, we can avoid this overhead by rewriting recursive algorithms in a purely iterative form. Doing so is always possible. It's a trade: the iterative code generally runs faster but it's also more complex and harder to understand.
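For example, fib rewritten iteratively (my sketch):

    def fib_iterative(n):
        previous, current = 0, 1
        for _ in range(n - 1):            # advance the pair n-1 times
            previous, current = current, previous + current
        return current

Here the loop happens to stay simple; for recursions that branch more heavily, the iterative rewrite usually needs an explicit stack, which is where the extra complexity comes from.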
The brute force strategy solves problems by inspecting all of the problem's possible solution candidates.
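For example (my sketch): given daily prices, find the best possible profit from buying on one day and selling on a later one by inspecting every pair of days:

    def best_trade(prices):
        # Brute force: check every buy day / sell day combination, O(n^2) candidates.
        best = 0
        for buy_day in range(len(prices)):
            for sell_day in range(buy_day + 1, len(prices)):
                best = max(best, prices[sell_day] - prices[buy_day])
        return best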
Backtracking works best in problems where the solution is a sequence of choices and making a choice restrains subsequent choices. It identifies as soon as possible when the choices you've made cannot give you the solution you want, so you can step back sooner and try something else. Fail early, fail often.
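A minimal backtracking sketch (my code) for the classic puzzle of placing n queens on an n×n board so none attack each other:

    def queens(n, placed=()):
        # placed[r] is the column of the queen on row r. A bad choice is
        # abandoned (we backtrack) as soon as it attacks a placed queen.
        if len(placed) == n:
            return placed                     # all rows filled: solution found
        row = len(placed)
        for col in range(n):
            if all(col != c and abs(col - c) != row - r
                   for r, c in enumerate(placed)):
                solution = queens(n, placed + (col,))
                if solution:
                    return solution           # stop at the first full solution
        return None                           # dead end: step back and retry

    print(queens(8))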
A heuristic method, or simply a heuristic, is a method that leads to a solution without guaranteeing it is the best or optimal one.
A very common heuristic approach to problems is the greedy approach. It consists in never coming back to previous choices.
A salesman must visit n given cities, ending the trip in the city he started. Which travel plan minimizes the total distance traveled?
Here's a simple greedy algorithm for this problem:
1. Visit the nearest unvisited city.
2. Repeat until all cities are visited.
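A sketch of that nearest-neighbor heuristic (my code; cities as (x, y) points, starting arbitrarily at the first one):

    import math

    def nearest_neighbor_tour(cities):
        # Greedy: always travel to the closest city not yet visited.
        tour = [cities[0]]
        unvisited = set(range(1, len(cities)))
        while unvisited:
            here = tour[-1]
            nearest = min(unvisited, key=lambda i: math.dist(here, cities[i]))
            tour.append(cities[nearest])
            unvisited.remove(nearest)
        return tour + [cities[0]]   # end the trip where it started

It's fast, but as with any heuristic, the tour it returns isn't guaranteed to be the shortest.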
Dynamic programming is identifying repeated subproblems in order to compute them only once.
We can fix it by storing fib calculations as we do them, only spawning fib calls for calculations not already stored. This trick for reusing partial calculations is called memoization.
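For example, fib with memoization, here using functools.lru_cache as the store (one way to do it, not necessarily the book's):

    from functools import lru_cache

    @lru_cache(maxsize=None)   # remembers every fib(k) already computed
    def fib(n):
        if n <= 2:
            return 1
        return fib(n - 1) + fib(n - 2)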
Many problems involve minimizing or maximizing a target value: find the shortest path, get the maximum profit, etc. They're called optimization problems.
When the solution is a sequence of choices, we often use a strategy called branch and bound.
Upper and Lower Bounds: Bounds refer to the range of a value.
We can often easily get suboptimal solutions: a short path, but maybe not the shortest; a big profit, but maybe not the biggest. These solutions provide bounds to the optimal solution. For instance, the shortest route between two places is never shorter than their straight linear distance. The linear distance is thus a lower bound of the shortest driving distance.
The wise use of upper and lower bounds allowed us to find the optimal profit with minimal computational effort.
We dynamically adapted our search space as we were exploring possibilities. To recap, here's how branch and bound works:
1. Divide the problem into subproblems,
2. Find upper and lower bounds of each new subproblem,
3. Compare subproblem bounds of all branches,
4. Return to step 1 with the most promising subproblem.
With backtracking, we remove paths after having explored them as far as we can, and we stop when we're OK with a solution. With branch and bound, we predict which paths are worst and we avoid wasting energy exploring them.
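A branch-and-bound sketch (my code) for a knapsack-style profit problem; the upper bound of a branch is the optimistic profit obtained by pretending we could take fractions of the remaining items:

    def knapsack(items, capacity):
        # items: list of (value, weight) pairs, weights assumed positive.
        # Explore include/exclude choices, pruning branches whose upper
        # bound can't beat the best profit found so far.
        items = sorted(items, key=lambda it: it[0] / it[1], reverse=True)
        best = 0

        def upper_bound(i, room, profit):
            # Optimistic bound: fill the remaining room with fractional items.
            for value, weight in items[i:]:
                if weight <= room:
                    room -= weight
                    profit += value
                else:
                    return profit + value * room / weight
            return profit

        def explore(i, room, profit):
            nonlocal best
            best = max(best, profit)
            if i == len(items) or upper_bound(i, room, profit) <= best:
                return                      # prune: this branch can't improve best
            value, weight = items[i]
            if weight <= room:
                explore(i + 1, room - weight, profit + value)  # take item i
            explore(i + 1, room, profit)                       # skip item i

        explore(0, capacity, 0)
        return best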
An Abstract Data Type (ADT) is the specification of a group of operations that make sense for a given data type.
The Stack is used when we have a pile of items, and only work with its top item.
A Stack implementation must provide at least these two operations:
• push(e): add an item e to the top of the stack,
• pop(): retrieve and remove the item on top of the stack.
For implementing the "undo" feature in your text editor, every edit you make is pushed onto a stack. Should you want to undo, the text editor pops an edit from the stack and reverts it.
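A minimal list-backed Stack with the undo idea wired in (a sketch of mine):

    class Stack:
        def __init__(self):
            self.items = []

        def push(self, e):
            self.items.append(e)      # add e on top

        def pop(self):
            return self.items.pop()   # retrieve and remove the top item

    # Undo feature: push each edit; pop to revert the most recent one.
    history = Stack()
    history.push("typed 'hello'")
    history.push("deleted a line")
    print(history.pop())              # -> "deleted a line"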
The Queue is the Stack's antagonist. It's also used for storing and retrieving items, but the retrieved item is always the one in front of the Queue, i.e., the one that has been on the queue the longest.
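A minimal Queue sketch using Python's collections.deque as the container (my choice):

    from collections import deque

    queue = deque()
    queue.append("first")     # enqueue at the back
    queue.append("second")
    print(queue.popleft())    # dequeue from the front -> "first"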