Data Driven
Read between July 27 - August 2, 2025
9%
Using data effectively is not just about which database you use or how many data scientists you have on staff, but rather it’s a complex interplay between the data you have, where it is stored and how people work with it, and what problems are considered worth solving.
10%
the best organizations recognize that people are at the center of this complexity.
10%
In any organization, the answers to questions such as who controls the data, who they report to, and how they choose what to work on are always more important than whether to use a da...
This highlight has been truncated due to consecutive passage length restrictions.
14%
a data scientist must be able to ask the right questions.
14%
Asking the right questions involves domain knowledge and expertise, coupled with a keen ability to see the problem, see the available data, and match up the two.
19%
The most well-known data-driven organizations are consumer Internet companies: Google, Amazon, Facebook, and LinkedIn. However, being data driven isn’t limited to the Internet. Walmart has pioneered the use of data since the 1970s.
20%
This enabled it to become the first company to have more than $1 billion in sales during its first 17 years.
21%
The company wanted to know what products were selling and how the placement of those products in the store impacted sales.
21%
As the number of stores and the volume of goods increased, the complexity of its inventory management increased.
22%
it became the first large company to invest in RFID
22%
FedEx and UPS are well known for using data to compete. UPS’s data led to the realization that, if its drivers took only right turns (limiting left turns), it would see a large improvement in fuel savings and safety, while reducing wasted time. The results were surprising: UPS shaved an astonishing 20.4 million miles off routes in a single year.
25%
acquires, processes, and leverages data in a timely fashion to create efficiencies, iterate on and develop new products, and navigate the competitive landscape.
26%
The first steps in working with data are acquiring and processing. But it’s not obvious what it takes to do these regularly. The best data-driven organizations focus relentlessly on keeping their data clean.
26%
The data must be organized, well documented, consistently formatted, and error free. Cleaning the data is often the most taxing part of data science, and is frequently 80% of the work.
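As a rough illustration of what "consistently formatted, and error free" can mean in practice, here is a minimal Python sketch. The records, field names (`store`, `date`, `units`), and accepted date formats are all hypothetical and chosen only for this example; real cleaning rules depend on the data at hand.

```python
from datetime import datetime

# Hypothetical raw records; the fields and values are made up for illustration.
RAW = [
    {"store": " Austin ", "date": "2013-02-08", "units": "12"},
    {"store": "austin", "date": "02/09/2013", "units": "15"},
    {"store": "Dallas", "date": "2013-02-09", "units": "n/a"},
]

# Assumed set of date formats that appear in the raw data.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def parse_date(text):
    """Return a date for the first matching format, or None if none match."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None


def clean(records):
    """Split records into consistently formatted rows and rejected rows."""
    cleaned, rejected = [], []
    for rec in records:
        day = parse_date(rec["date"])
        units = rec["units"].strip()
        if day is None or not units.isdigit():
            rejected.append(rec)  # keep bad rows for inspection, do not drop silently
            continue
        cleaned.append({
            "store": rec["store"].strip().title(),  # one spelling per store
            "date": day.isoformat(),                # one date format everywhere
            "units": int(units),
        })
    return cleaned, rejected


if __name__ == "__main__":
    good, bad = clean(RAW)
    print(good)
    print(bad)
```

Keeping the rejected rows alongside the cleaned ones is a deliberate choice in this sketch: it makes the cleaning step auditable instead of hiding how much data was discarded.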
26%
Setting up the process to clean data at scale adds further complexity. Successful organizations invest heavily in tooli...
This highlight has been truncated due to consecutive passage length restrictions.
27%
They have developed a culture that understands the importance of data quality; otherwise, as the adage ...
This highlight has been truncated due to consecutive passage length restrictions.
28%
They use the data to understand their customers and the nuances of their business. They develop experiments that allow them to test hypotheses that improve their organization and processes. And they use the data to build new products. The next section explains how they do it.
29%
The democratization of data is one of the most powerful ideas to come out of data science. Everyone in an organization should have access to as much data as legally possible.
29%
While broad access to data has become more common in the sciences (for example, it is possible to access raw data from the National Weather Service or the National Institutes of Health),
29%
Facebook was one of the first companies to give its employees access to data at scale. Early on, Facebook realized that giving everyone access to data was a good thing. Employees didn’t have to put in a request, wait for ...
This highlight has been truncated due to consecutive passage length restrictions.
31%
Access to data became a critical part of Facebook’s success, and remains something it invests in aggressively.
31%
All of the major web companies soon followed suit. Being able to access data through SQL became a mandatory skill for those in business functions at organizations like Google and LinkedIn.
32%
For example, the World Bank now makes its data open so that groups of volunteers can come together to clean and interpret it.
32%
Governments have also begun to recognize the value of democratizing access to data, at both the local and national level.
32%
The UK government has been a leader in open data efforts,
33%
As the public and the government began to see the value of making the data more open, governments began to catalog their data, provide training on how to use the data, and publish data in ways that are compatible with modern technologies.
34%
In New York City, access to data led to new Moneyball-like approaches that were more efficient, including finding “a five-fold return on the time of building inspectors looking for illegal apartments” and “an increase in the rate of detection for dangerous buildings that are highly likely to result in firefighter injury or death.”
35%
One challenge of democratization is helping people find the right data sets and ensuring that the data is clean. As we’ve said many times, 80% of a data scientist’s work is preparing the data, and users without a background in data analysis won’t be prepared to do the cleanup themselves.
36%
To help employees make the best use of data, a new role has emerged: the data steward.
36%
The steward’s mandate is to ensure consistency and quality of the data by investing in tooling and processes that make the cost of working with data scale logarithmicall...
This highlight has been truncated due to consecutive passage length restrictions.
37%
What Does a Data-Driven Organization Do Well?
There’s almost nothing more exciting than getting access to a new data set and imagining what it might tell you about the world!
37%
Data scientists may have a methodical and precise process for approaching a new data set, but while they are clearly looking for specific things in the data, they are also developing an intuition about the reliability of the data set and how it can be used.
39%
A bit of digging around those dates will show you that there’s no conspiracy here: that data represents Hurricane Sandy, when the bridges and tunnels were deliberately closed. It also explains the spike that happens when the bridges reopened. You also see traffic drop sharply for the blizzard of February 2013. The data set is as simple as they come — it’s just one integer per day — and yet there’s a fascinating story hiding here.
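One way to spot days like these in a one-integer-per-day series is to flag values that sit far from the typical day. The sketch below is an assumption for illustration, not the method the book used: it measures distance from the median in units of the median absolute deviation (MAD), and the function name, threshold, and sample counts are all invented.

```python
from statistics import median


def flag_unusual_days(counts, threshold=3.0):
    """Return the days whose count lies more than `threshold` MADs from the median.

    `counts` maps a day label to an integer count. The threshold of 3.0 is an
    arbitrary illustrative default, not a recommended value.
    """
    values = list(counts.values())
    mid = median(values)
    mad = median([abs(v - mid) for v in values])
    if mad == 0:
        return []  # no spread to measure against, so nothing can be flagged
    return [day for day, v in counts.items() if abs(v - mid) / mad > threshold]


if __name__ == "__main__":
    # Made-up daily counts: one sharp drop and one spike among ordinary days.
    sample = {
        "day1": 100, "day2": 102, "day3": 98, "day4": 101,
        "day5": 5, "day6": 99, "day7": 140,
    }
    print(flag_unusual_days(sample))
```

A check like this only points at the dates worth investigating. Working out that a dip corresponds to a storm or a closure still takes the digging the passage describes.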
40%
When data scientists initially dive into a data set, they are not just assembling basic statistics, they’re also developing an intuition for any flaws in the data and any unexpected things the data might be able to explain. It’s not a matter of checking statistics off a list, but rather of building a mental model of what data says about the world.