Computer Science Distilled: Learn the Art of Solving Computational Problems (Code is Awesome)
Rate it:
45%
Flag icon
The Priority Queue is similar to the Queue, with the difference that enqueued items must have an assigned priority.
46%
Flag icon
When storing a bunch of items, you sometimes need more flexibility. For instance, you could want to freely reorder the items; or to access, insert and remove items at any position. In these cases, the List is handy.
46%
Flag icon
The Stack or Queue should be preferred when the flexibility of List isn't needed. Using a simpler ADT ensures data is handled in a strict and robust way (FIFO or LIFO). It also makes the code easier to understand: knowing a variable is a Stack helps to see how data flows in and out.
46%
Flag icon
The Sorted List is useful when you need to maintain an always sorted list of items.
47%
Flag icon
The Map (aka Dictionary) is used to store mappings between two objects: a key object and a value object.
47%
Flag icon
The Set represents unordered groups of unique items, like mathematical sets described in Appendix
48%
Flag icon
The Array is the simplest way to store a bunch of items in computer memory.
48%
Flag icon
Each object in an array occupies the same amount of space in memory.
49%
Flag icon
When performance isn't an issue, we can rely on these generic ADT implementations and not worry about data structures. But when performance must be optimal, or when working with a lower level language that doesn't have such features, you must decide which data structures to use.
50%
Flag icon
A tree with the origins of Indo-European languages.
50%
Flag icon
In the Tree terminology, a cell is called a node, and a pointer from one cell to another is called an edge.
55%
Flag icon
Searching for existing algorithms should always be your first move when solving problems.
57%
Flag icon
How do you find a node in a Graph? If the Graph's structure offers no navigation help, you must visit every node in the graph until you find the one you want. To achieve that, there are two approaches: depth-first and breadth-first.
57%
Flag icon
Figure 5.2 Exploring a graph depth-first versus breadth-first.
58%
Flag icon
Searching a graph via Depth First Search (DFS), we keep following edges, going deeper and deeper into the graph.
58%
Flag icon
If going deep in the graph isn't a good approach, you can try Breadth First Search (BFS). It explores the graph level per level: first the neighbors of your start node, then its neighbors' neighbors, and so on.
59%
Flag icon
When you need to explore all the nodes of a graph, it's usually better to stick with DFS for its simple implementation and smaller memory footprint.
59%
Flag icon
Figure 5.3 DFS, courtesy of http://xkcd.com.
60%
Flag icon
For this problem, we're not gonna show an algorithm. You should use what you learned so far and try solving this problem by yourself. You can do so at UVA,3 an online judge that will test your proposed solution.
60%
Flag icon
One famous and very effective way of finding the shortest path is the Dijkstra Algorithm
61%
Flag icon
Before founding Google, Sergey Brin and Larry Page were computer science academics at Stanford University, researching graph algorithms. They modeled the Web as a graph: web pages are nodes, and links between web pages are edges. They figured if a web page receives many links from other important pages, then it must be important as well. They created the PageRank Algorithm after that idea. The algorithm runs in rounds. Each web page in the graph starts with an equal number of "points". After each round, each page distributes its points to the pages it has links to. The process is repeated ...more
61%
Flag icon
During World War II, the British Army needed to make the best strategic decisions to optimize the impact of their operations. They created many analytical tools to figure out the best way to coordinate their military operations. That practice was named operations research. It improved the British early warning radar system, and helped the United Kingdom better manage its manpower and resources. During the war, hundreds of Brits were involved in operations research. After, the new ideas were applied to optimize processes in businesses and industries. Operations research involves defining an ...more
61%
Flag icon
Problems where the objective and constraints can be modeled using linear equations4 are called linear programming problems.
62%
Flag icon
Simplex solves linear programming problems very efficiently. It has been helping industries solve complex problems since the 1960s.
63%
Flag icon
Finding a new problem that hasn't been explored before is rare. When researchers find a new problem, they write a scientific paper about it.
64%
Flag icon
DataBase Management System (DBMS): a special piece of software for managing databases. The DBMS organizes and stores the data. It mediates accesses and changes to the database.
64%
Flag icon
we can often extract valuable information from data. That's called data mining.
64%
Flag icon
For instance, a big grocery chain analyzed its product-transaction data, and found that its top-spending customers often buy a type of cheese ranked below 200 in sales. Normally, they discontinued products with low sales. Data mining inspired managers not only to keep that cheese product, but to put it in more visible spots. That pleased their best customers, and made them come back even more.
64%
Flag icon
The emergence of the relational model in the late 1960s was a huge leap for information management. Relational databases make it easy to avoid duplicate information and data inconsistencies. The majority of database systems used today are relational.
65%
Flag icon
The organization of a database table is given by its fields and the restrictions they enforce. This combination of fields and restrictions is called the table's schema.
65%
Flag icon
We use ID values to refer to a specific row within a table.
65%
Flag icon
The ID field of a table is also known as its primary key. A field that records references to other rows' IDs is called a foreign key.
65%
Flag icon
When a database is organized in a way that it is completely free of duplicate information, we say that the database is normalized. The process of transforming a database with replicated data to one without is called normalization.
66%
Flag icon
As an application grows and new features are added, it's unlikely its database structure (the schema of all its tables) remains the same. When we need to change the database structure, we create a schema migration script.
66%
Flag icon
Without creating schema migrations, your "manual" database changes will be hard to revert to a specific working version. It will be hard to guarantee compatibility between the local databases of different software developers. These problems occur frequently in big software projects with careless database practices.
66%
Flag icon
Almost every relational DBMS works with a query language called SQL.
67%
Flag icon
A database manager must always take into account the product of the number of rows of joined tables. For very large tables, joins become unfeasible. The JOIN is the greatest power and, at the same time, the major weakness of relational databases.
67%
Flag icon
For a table's primary key to be useful, we must be able to quickly retrieve a data entry when given its ID value. To that end, the DBMS builds an auxiliary index, mapping row IDs to their respective addresses in memory. An index is essentially a self-balancing binary search tree (sec. 4.3). Each row in the table corresponds to a node in the tree.
67%
Flag icon
Node keys are the values in the field we index. To find the register with a given value, we search for the value in the tree. Once we find a node, we get the address it stores, and use it to fetch the register. Searching a binary search tree is (log n), so finding registers in large tables is fast.
68%
Flag icon
When inserting a new row, the DBMS must search the entire table to make sure no uniqueness constraint is violated. Finding if a value is present in a field without an index means checking all rows in the table. With an index, we can quickly search if the value we're trying to insert is already present. Indexing fields that have a uniqueness constraint
68%
Flag icon
Many DBMSs might even refuse to fulfill queries asking to sort by a non-indexed field when the query involves too many rows.
68%
Flag icon
When sorting by two fields is required, joint indexes are used.
68%
Flag icon
So indexes are awesome: they allow for super fast querying and instant sorted data access. Then why don't we have indexes for all fields in every table? The problem is when a new register is inserted to or removed from the table, all its indexes must be updated to reflect that. If there are a lot of indexes, updating, inserting or removing rows can become computationally expensive (remember tree balancing). Moreover, indexes occupy disk space, which is not an unlimited resource.
68%
Flag icon
If your queries are wasting too much time sequentially scanning data in a field, add an index for that field and see how that helps.
68%
Flag icon
To adjust a database for higher performance, it's crucial to know which indexes to keep and which to discard. If a database is mostly read and rarely updated, it might make sense to keep more indexes. Poor indexing is a major cause for slowdown in commercial systems. Careless system administrators often won't investigate how common queries are run—they will just index random fields they "feel" will help performance. Don't do this! Use "explain" tools to check your queries and add indexes only when it makes a difference.
69%
Flag icon
A transaction is a list of database operations that must be executed atomically.
69%
Flag icon
The most widely known type of NoSQL database is the document store.
69%
Flag icon
Data in the relational model (top) vs. NoSQL (bottom).
69%
Flag icon
The non-relational model expects us to duplicate information at each relevant place. It's hard to keep duplicated data updated and consistent. In return, by grouping relevant data together, document stores can offer more flexibility: You don't need to join rows, You don't need fixed schemas, Each data entry can have its own configuration of fields. This means there are no "tables" and "rows" in document stores. Instead, a data entry is called a document. Related documents are grouped in a collection.
70%
Flag icon
The key-value store is the simplest form of organized, persistent data storage. It's mainly used for caching.