Data Jujitsu: The Art of Turning Data into Product
Kindle Notes & Highlights
5%
Thanks to large investments in the general area of data science, many major innovations (e.g., Hadoop, Voldemort, Cassandra, HBase, Pig, Hive, etc.) have made data products easier to build.
7%
Smart data scientists don’t just solve big, hard problems; they also have an instinct for making big problems small.
10%
Before investing in a big effort, you need to answer one simple question: Does anyone want or need your product? If no one wants the product, all the analytical work you throw at it will be wasted. So, start with something simple that lets you determine whether there are any customers. To do that, you’ll have to take some clever shortcuts to get your product off the ground. Sometimes, these shortcuts will survive into the finished version because they represent some fundamentally good ideas that you might not have seen otherwise; sometimes, they’ll be replaced by more complex analytic ...
17%
The key is to start simple and stay simple for as long as possible.
18%
One of the biggest challenges of working with data is getting the data in a useful form. It’s easy to overlook the task of cleaning the data and jump to trying to build the product, but you’ll fail if getting the data into a usable form isn’t the first priority.
21%
The point is to have a conversation rather than just a form. Engage the user to help you, rather than relying on analysis. You’re not just getting the user more involved (which is good in itself), you’re getting clean data that will simplify the work for your back-end systems. As a matter of practice, I’ve found that trying to solve a problem on the back end is 100-1,000 times more expensive than on the front end.
22%
As technologists, we are predisposed to look for scalable technical solutions. We often jump to technical solutions before we know what solutions will work. Instead, see if you can break down the task into bite-size portions that humans can do, then figure out a technical solution that allows the process to scale.
23%
Amazon’s Mechanical Turk is a system for posting small problems online and paying people a small amount (typically a couple of cents) for solutions.
25%
Humans are also useful for separating valid input from invalid.
27%
By using humans to solve the problem initially, we can learn a great deal about the problem at a very low cost.
28%
The human solution not only made it clear what they needed to build, it proved that the technical solution was worth the effort and bought them the time they needed to build it.
30%
The point is that technical solutions will always win in the long run; they’ll always be more efficient, and even a poor technical solution is likely to scale better than using humans to answer questions. But when you’re getting started, you don’t care about the long run. You just want to survive long enough to have a long run, to prove that your product has value. And in the short term, human solutions require much less work. Worry about scaling when you need to.
39%
LinkedIn’s People You May Know embodies both Data Jujitsu and grounding the product in the real world.
47%
By giving data back to the user, you can create both engagement and revenue. We’re far enough into the data game that most users have realized that they’re not the customer, they’re the product.
48%
How do you give data back to the user? LinkedIn has a product called “Who’s Viewed Your Profile.” This product lists the people who have viewed your profile (respecting their privacy settings, of course), and provides statistics about the viewers.
51%
In short, everyone reading this has probably spent the last year or more of their professional life immersed in data. But it’s not just us. Everyone, including users, has awakened to the value of data. Don’t hoard it; give it back, and you’ll create an experience that is more engaging and more profitable for both you and your company.
53%
One of the biggest challenges of developing a data product is figuring out how to give data back to the user.
54%
An “inverse interaction law” applies to most users: The more data you present, the less interaction.
54%
The best way to avoid data vomit is to focus on actionability of data. That is, what action do you want the user to take?
59%
What tools do we have to think about bad results — things like unfortunate recommendations and collaborative filtering gone wrong? Two concepts, precision and recall, let us describe the problem more precisely.
59%
Precision — The ability to provide a result that exactly matches what’s desired.
60%
Recall — The set of possible good recommendations. Recall is fundamentally about inventory: Good recall means that you have a lot of good recommendations, or a lot of advertisements that you can potentially show the user.
61%
Unfortunately, precision and recall often work against each other: As precision increases, recall drops, and vice versa.
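The trade-off between the two can be made concrete with a small sketch. It is not from the book: it assumes the standard information-retrieval definitions (precision as the fraction of shown items that are good, recall as the fraction of all good items that are shown), and the item names and the `precision_recall` helper are hypothetical, chosen only for illustration.

```python
def precision_recall(shown, good):
    """Return (precision, recall) for a set of shown items.

    Assumes the standard definitions:
      precision = |shown AND good| / |shown|
      recall    = |shown AND good| / |good|
    """
    shown, good = set(shown), set(good)
    hits = shown & good
    precision = len(hits) / len(shown) if shown else 0.0
    recall = len(hits) / len(good) if good else 0.0
    return precision, recall


# Hypothetical data: four genuinely good recommendations, and a ranked
# list in which the good items become sparser further down.
good = {"a", "b", "c", "d"}
ranked = ["a", "b", "x", "c", "y", "z", "d", "w"]

# Showing more results raises recall but lowers precision.
for k in (2, 4, 8):
    p, r = precision_recall(ranked[:k], good)
    print(f"top {k}: precision={p:.2f} recall={r:.2f}")
```

With this made-up ranking, showing only the top two items gives perfect precision but misses half of the good items, while showing all eight finds every good item at the cost of half the list being noise.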
62%
Low-precision search results yield a poor experience. On the other hand, low-precision ads are almost harmless.
65%
Another issue to contend with is subjectivity: How does the user perceive the results?
67%
The most common guideline is to strive for a distribution in which there are many good results, a few great ones, and no bad ones.
72%
We often focus on getting a limited set of data from a user. But done correctly, you can engage the user to give you more useful, high-quality data.
74%
Take heed not just to demand data. You need to explain to the user why you’re asking for data; you need to disarm the user’s resistance to providing more information by telling him that you’re going to provide value (in this case, more valuable recommendations), rather than abusing the data. It’s essential to remember that you’re having a conversation with the user, rather than giving him a long form to fill out.