Computer Science Distilled: Learn the Art of Solving Computational Problems (Code is Awesome)
In a graph database, data entries are stored as nodes, and relationships as edges.
This is the most flexible type of database. Letting go of tables and collections, you can store networked data in intuitive ways.
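As a rough illustration of the node-and-edge idea, here is a minimal in-memory sketch. It is not how any real graph database is implemented; the class `TinyGraph`, its methods, and the labels `FRIEND_OF` and `WORKS_AT` are all hypothetical names chosen for the example.

```python
from collections import defaultdict


class TinyGraph:
    """Toy graph store: data entries are nodes, relationships are edges."""

    def __init__(self):
        self.nodes = {}                 # node id -> dict of properties
        self.edges = defaultdict(list)  # node id -> list of (label, target id)

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = properties

    def add_edge(self, source, label, target):
        # An edge is a labeled, directed relationship between two nodes.
        self.edges[source].append((label, target))

    def neighbors(self, source, label):
        # Follow every edge with the given label that starts at `source`.
        return [target for (edge_label, target) in self.edges[source]
                if edge_label == label]


graph = TinyGraph()
graph.add_node("alice", kind="person")
graph.add_node("bob", kind="person")
graph.add_node("acme", kind="company")
graph.add_edge("alice", "FRIEND_OF", "bob")
graph.add_edge("alice", "WORKS_AT", "acme")

print(graph.neighbors("alice", "FRIEND_OF"))  # ['bob']
```

No table schema is declared anywhere: a new kind of relationship is just a new edge label.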
The buzzword Big Data describes data-handling situations that are extremely challenging in terms of Volume, Velocity, or Variety.
Whenever you need a non-standard data management approach because of volume, velocity or variety, you can say it's a "Big Data" application.
SQL vs NoSQL: Relational databases are data-centered: they maximize data structuring and eliminate duplication, regardless of how the data will be needed. Non-relational databases are application-centered: they facilitate access and use according to your needs.
Your non-relational database will be powerful, but you will be responsible for updating the duplicated information across documents and collections.
There are several situations in which not one, but several computers must act in coordination to provide a database system:
For these scenarios, there are DBMSs that can run on several coordinated computers, forming a distributed database system.
Sharding: If your database receives many write queries for large amounts of data, it's hard to synchronize the database everywhere in the cluster.
A sharding setup with three replicas per shard.
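One common way to decide which shard holds a given entry is to hash its key. The sketch below assumes that approach; the source does not say which routing scheme it has in mind. `SHARD_COUNT`, `shard_for`, `write`, and `read` are hypothetical names, each shard is just a Python dict, and replication within a shard is left out.

```python
import hashlib

SHARD_COUNT = 4  # hypothetical number of shards in the cluster

# Each shard is modeled as a plain dict standing in for a separate machine.
shards = [dict() for _ in range(SHARD_COUNT)]


def shard_for(key, shard_count=SHARD_COUNT):
    """Map a key to a shard index using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def write(key, value):
    # Only the shard that owns the key receives the write.
    shards[shard_for(key)][key] = value


def read(key):
    return shards[shard_for(key)].get(key)


write("user:42", {"name": "Ada"})
write("user:43", {"name": "Grace"})
print(read("user:42"))  # {'name': 'Ada'}
```

Because every key always maps to the same shard, a write touches one part of the cluster instead of all of it.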
Data Consistency: In distributed databases with replication, updates made in one machine don't propagate instantly across all replicas. It takes some time until all machines in the cluster are synchronized. That can damage the consistency of your data.
If your database queries do not strongly enforce data consistency, they are said to work under eventual consistency.
In many cases, working with eventual consistency won't cause problems.
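The delay between a write and its propagation can be imitated with a toy simulation. This is only a sketch: the classes `Replica` and `Cluster` are hypothetical, and real systems propagate updates over a network rather than through an explicit `synchronize` call.

```python
class Replica:
    def __init__(self):
        self.data = {}


class Cluster:
    """Toy cluster where writes reach one replica first and the rest later."""

    def __init__(self, replica_count):
        self.replicas = [Replica() for _ in range(replica_count)]
        self.pending = []  # (replica index, key, value) updates not yet applied

    def write(self, key, value):
        # The first replica is updated immediately; the others are queued.
        self.replicas[0].data[key] = value
        for index in range(1, len(self.replicas)):
            self.pending.append((index, key, value))

    def read(self, replica_index, key):
        return self.replicas[replica_index].data.get(key)

    def synchronize(self):
        # Apply every queued update, bringing all replicas up to date.
        for index, key, value in self.pending:
            self.replicas[index].data[key] = value
        self.pending.clear()


cluster = Cluster(replica_count=3)
cluster.write("stock", 5)
stale = cluster.read(2, "stock")   # replica 2 has not seen the write yet
cluster.synchronize()
fresh = cluster.read(2, "stock")   # now every replica agrees

print(stale, fresh)  # None 5
```

A query served by replica 2 before synchronization returns outdated data; under eventual consistency the application accepts that window.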
These applications ushered in the development of special database systems, known as Geographical Information Systems (GIS). They provide specially designed fields for geographical data: PointField, LineField, PolygonField, and so on.
Many general-use DBMSs provide GIS extensions.
GIS applications are often used in day-to-day life, for instance with GPS navigators like Google Maps or Waze.
How can we store data outside of our database, in a format that is interoperable across different systems? For instance, we might want to back up the data, or export it to another system. To do this, the data has to go through a process called serialization, where it is transformed according to an encoding format.
SQL is the most common format for serializing relational databases. We write a series of SQL commands that replicate the database and all its details.
XML is another way to represent structured data, one that doesn't depend on the relational model or on a particular database system implementation.
JSON is the serialization format most of the world is converging on. It can represent relational and non-relational data in a way that is intuitive to coders.
CSV, or Comma Separated Values, is arguably the simplest format for data exchange. Data is stored in textual form, with one data element per line.
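To make two of these formats concrete, the sketch below serializes the same records to JSON and to CSV using only Python's standard library. The sample records and variable names are made up for the example.

```python
import csv
import io
import json

# Hypothetical records, as they might come out of a database query.
records = [
    {"id": 1, "name": "Ada"},
    {"id": 2, "name": "Grace"},
]

# JSON: nested structures and basic types survive a round trip.
json_text = json.dumps(records)
restored = json.loads(json_text)

# CSV: a header line, then one record per line, all stored as text.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["id", "name"])
writer.writeheader()
writer.writerows(records)
csv_text = buffer.getvalue()

csv_rows = list(csv.DictReader(io.StringIO(csv_text)))

print(restored == records)  # True
print(csv_rows[0])          # {'id': '1', 'name': 'Ada'}
```

Note that the CSV round trip returns the `id` values as strings: CSV carries no type information, so the reader has to know how to interpret each column.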
Reference
Almost all computers, including our laptops and phones, have the same working principle as the first computing model invented by von Neumann in 1945.
A computer is a machine that follows instructions to manipulate data. It has two main components: processor and memory.
Since the memory is an electrical component, we transmit cell addresses through wires as binary numbers. Each wire transmits a binary digit. Wires are set at higher voltage for the "one" signal or lower voltage for the "zero" signal.
There are two things the memory can do with a given cell's address: get its value, or store a new value. The memory has a special input wire for setting its operational mode:
The memory can operate in read or write mode.
Usually, each memory cell stores an 8-digit binary number, which is called a byte.
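The memory behavior described in these passages (binary addresses on wires, a read or write mode, one byte per cell) can be sketched as a small model. `ToyMemory` and its method names are hypothetical, and the model ignores timing and electrical details entirely.

```python
class ToyMemory:
    """Toy memory: cells are addressed in binary and each holds one byte."""

    def __init__(self, address_wires):
        self.address_wires = address_wires
        # With n address wires there are 2**n distinct addresses.
        self.cells = [0] * (2 ** address_wires)

    def address_signals(self, address):
        # One binary digit per wire: '1' is high voltage, '0' is low voltage.
        return format(address, f"0{self.address_wires}b")

    def access(self, address, write_mode=False, value=None):
        if not 0 <= address < len(self.cells):
            raise IndexError("address does not fit on the address wires")
        if write_mode:
            if value is None or not 0 <= value <= 0xFF:
                raise ValueError("a cell stores one byte (0 to 255)")
            self.cells[address] = value
        # In read mode (and after a write) report the cell's current value.
        return self.cells[address]


memory = ToyMemory(address_wires=4)
print(memory.address_signals(5))              # 0101
memory.access(5, write_mode=True, value=200)  # write mode: store a byte
print(memory.access(5))                       # 200
```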
Computer code is essentially a sequence of numbers representing CPU operations.
That's all there is to it. Whether you open a website, play a computer game, or edit a spreadsheet, computations are always the same: a series of simple operations which can only sum, compare, or move data across memory.
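To see how a plain list of numbers can act as a program, here is a toy interpreter for an invented machine. The opcode numbering (`HALT`, `LOAD`, `ADD`, `STORE`) and the single accumulator register are assumptions made for the example and do not correspond to any real CPU.

```python
# Hypothetical opcodes for an invented machine.
HALT, LOAD, ADD, STORE = 0, 1, 2, 3


def run(program, memory):
    """Execute a list of numbers as (opcode, operand) pairs."""
    accumulator = 0
    position = 0
    while True:
        opcode = program[position]
        if opcode == HALT:
            return memory
        operand = program[position + 1]  # a memory address
        if opcode == LOAD:       # move data from memory to the accumulator
            accumulator = memory[operand]
        elif opcode == ADD:      # sum a memory cell into the accumulator
            accumulator += memory[operand]
        elif opcode == STORE:    # move the accumulator back to memory
            memory[operand] = accumulator
        else:
            raise ValueError(f"unknown opcode {opcode}")
        position += 2


# "Load cell 0, add cell 1, store into cell 2, halt", written as numbers.
program = [1, 0, 2, 1, 3, 2, 0]
memory = [7, 5, 0]
print(run(program, memory))  # [7, 5, 12]
```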
People played it in arcade machines equipped with a 2 MHz CPU. That number indicates the CPU's clock: the number of basic operations it executes per second. With a two million hertz (2 MHz) clock, the CPU performs roughly two million basic operations per second.
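The clock arithmetic can be checked directly. The sketch keeps the passage's simplification of one basic operation per clock tick, and compares the 2 MHz arcade CPU with a 2 GHz clock; the variable names are made up for the example.

```python
MHZ = 1_000_000
GHZ = 1_000_000_000

arcade_clock_hz = 2 * MHZ
modern_clock_hz = 2 * GHZ

# Simplifying assumption: one basic operation per clock tick.
arcade_ops_per_second = arcade_clock_hz
seconds_per_operation = 1 / arcade_clock_hz

# How many times faster a 2 GHz clock ticks than a 2 MHz one.
speedup = modern_clock_hz // arcade_clock_hz

print(arcade_ops_per_second)  # 2000000
print(speedup)                # 1000
```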
With modern technological progress, ordinary desktop computers and smartphones typically have 2 GHz CPUs. They can perform billions of machine instructions every second.
CPU Architectures
32-bit vs. 64-bit Architecture: The first commercial microprocessor, the Intel 4004, was built on a 4-bit architecture. This means it could operate on (sum, compare, move) binary numbers of up to 4 digits in a single machine instruction. The 4004 had data and address buses with only four wires each.
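The consequences of a given bit width follow from simple powers of two. In the sketch below the helper names are hypothetical, and the wraparound function assumes that a result too large for the register is simply truncated to the register width.

```python
def largest_unsigned(bits):
    """Largest unsigned number that fits in `bits` binary digits."""
    return 2 ** bits - 1


def addressable_cells(address_wires):
    """Number of distinct addresses that fit on the given number of wires."""
    return 2 ** address_wires


def add_with_wraparound(a, b, bits):
    # Assumption: digits beyond the register width are dropped.
    return (a + b) & largest_unsigned(bits)


print(largest_unsigned(4))            # 15
print(addressable_cells(4))           # 16
print(add_with_wraparound(9, 8, 4))   # 1
print(largest_unsigned(32))           # 4294967295
```

With 4-bit numbers, 9 + 8 = 17 does not fit, so larger values have to be handled in several instructions; wider architectures push that limit much further out.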
Big-Endian vs. Little-Endian
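Endianness is the order in which the bytes of a multi-byte number are laid out: big-endian puts the most significant byte first, little-endian puts the least significant byte first. A short sketch using Python's built-in integer conversions shows the difference; the value `0x12345678` is an arbitrary example.

```python
import sys

value = 0x12345678  # arbitrary 32-bit example value

big = value.to_bytes(4, byteorder="big")        # most significant byte first
little = value.to_bytes(4, byteorder="little")  # least significant byte first

print(big.hex())     # 12345678
print(little.hex())  # 78563412

# Reading bytes back with the wrong byte order yields a different number.
misread = int.from_bytes(big, byteorder="little")
print(hex(misread))  # 0x78563412

# The byte order of the machine running this code ('little' or 'big').
print(sys.byteorder)
```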
Emulators: Sometimes it's useful to run, on your own computer, code that was designed for a different CPU. That way, you can test an iPhone app without an iPhone, or play your favorite vintage Super Nintendo game. For these tasks, there are pieces of software called emulators.
But we rarely write our programs directly as CPU instructions. It would be impossible for a human to write a realistic 3D computer game this way. To express our orders in a more "natural" and compact way, we created programming languages. We write our code in these languages. Then, we use a program called a compiler to translate our orders into machine instructions a CPU can run.
The compiler translates complex instructions in a programming language into equivalent CPU instructions.
Compiled computer programs are essentially sequences of CPU instructions. As we learned, code compiled for a desktop computer won't run on a smartphone, because these machines have CPUs of different architectures. Still, even two computers that share the same CPU architecture may not both be able to run the same compiled program. That's because programs must communicate with the computer's operating system to run.
Besides targeting a specific CPU architecture, compiled code also targets a specific operating system.
Focus on writing clean, self-explanatory code. If you have performance issues, use profiling tools to discover bottlenecks in your code, and try computing these parts in smarter ways.
Some programming languages, called scripting languages, are executed without a direct compilation to machine code. These include JavaScript, Python, and Ruby. Code in these languages is executed not directly by the CPU, but by an interpreter that must be installed on the machine that is running the code. Since the interpreter translates the code for the machine in real time, interpreted code usually runs much slower than compiled code. On the other hand, the programmer can always run the code immediately, without waiting through the compilation process. When a project is very big, compiling can ...more
Google engineers had to constantly compile large batches of code. That made coders "lose" (fig. 7.9) a lot of time. Google couldn't switch to scripting languages because they needed the higher performance of the compiled binary. So they developed Go, a language that compiles incredibly fast, but still has very high performance.
Given a compiled computer program, it's impossible to recover the exact source code it was compiled from.
Underground hackers often analyze the binary code from licensed programs like Windows, Photoshop, and Grand Theft Auto, in order to determine which part of the code verifies the license. They modify the binary code, placing an instruction to directly JUMP to the part of the code that executes after the license has been validated. When the modified binary is run, it gets to the injected JUMP command before the license is even checked, so people can run these illegal, pirated copies without paying.
The most famous attack of this kind was Stuxnet, a cyberweapon widely reported to have been built by agencies from the United States and Israel. It slowed down Iran's nuclear program by infecting computers that controlled Iranian uranium-enrichment centrifuges.
Without the original source code, even though you can change the binary a little bit to hack it in small ways, it's practically impossible to make any major change to the program, such as adding a new feature. Some people believe that it's much better to build code collaboratively, so they started to make their source code open for other people to change. That's the main concept behind open source: software that everyone can use and modify freely. Linux-based operating systems (such as Ubuntu, Fedora, Debian) are open source, whereas Windows and Mac OS are closed source.
With open-source software, there are more eyes on the code, so it's harder for malicious third parties and government agencies to insert surveillance backdoors. When using Mac OS or Windows, you have to trust that Apple or Microsoft aren't compromising your security and are doing their best to prevent any severe security flaw. Open-source systems are open to public scrutiny, so there is less chance that security flaws slip through unnoticed.
If memory access is slow, the CPU has to sit idle, waiting for the RAM to do its work. The time it takes to read and write data in memory is directly reflected in computer performance.
Recent technological developments increased CPU speeds exponentially. Memory speeds also increased, but at a much slower rate. This performance gap between CPU and RAM is known as the Processor-Memory Gap: