Phil Simon's Blog, page 49
June 12, 2017
Understanding Data Lakes
Introduction
As I did the research for Analytics: The Agile Way, I encountered a relatively new concept in the business and tech landscape: the data lake. In this post and the next, I’ll broach the subject and describe why data lakes matter.
Let’s begin by examining data lakes in contrast to data warehouses. The latter are predicated upon strictly defined schema—typically either of the star or snowflake variety. That is, they require writing and storing data in a very structured manner or shape. Data warehouses require the strict manipulation of data; they do not store data in its “natural state.”
The tightly controlled process of data warehousing meets certain business needs, most often reporting. Still, it fails to meet others. (More on that in my next post on the subject.)
Enter the Data Lake
I’ve been saying for a while now that traditional data warehouses can’t do it all. To this end, data lakes fulfill a genuine business need and software vendors have taken notice.
Yes, at a high level, both data warehouses and data lakes store data, but there’s a key difference: schema on-write vs. schema on-read.
Foolish is the soul who believes that there’s no difference between on-write and on-read.
Let me explain.
Data lakes still require schema but that schema isn’t pre-defined. It’s ad hoc or, if you like, on-read. Data is applied to a plan or schema as it is pulled out of a stored location, not as it goes in. Put differently, data remains in its unaltered (read: natural) state. Critically, a data lake doesn’t define requirements unless and until users query the data. As Margaret Rouse writes:
Each data element in a lake inherits a unique identifier tagged with an extended set of metadata tags. When a business question arises, users can query the data lake for relevant data. The end goal: that those users can analyze that smaller dataset to help answer the question.
Think about it. When used correctly, data lakes allow business and technical users to query smaller, more relevant, and more flexible datasets. As a result, query times can drop to a fraction of what they would have been in a datamart, data warehouse, or relational database.
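To make on-read a bit more concrete, here is a minimal Python sketch. It assumes a tiny "lake" of raw JSON records that carry metadata tags; the record fields, the `SALES_SCHEMA` mapping, and the `read_with_schema` helper are all hypothetical names invented for illustration, not any vendor's API. The point is only that the raw records stay untouched and a schema gets applied when a question is asked.

```python
import json

# Raw events land in the "lake" exactly as they arrived: no upfront schema,
# and records do not have to share the same fields.
RAW_EVENTS = [
    '{"id": "a1", "tags": ["sales", "web"], "amount": "19.99", "region": "US"}',
    '{"id": "b2", "tags": ["support"], "ticket": 4411, "note": "login issue"}',
    '{"id": "c3", "tags": ["sales", "store"], "amount": "5.00"}',
]

# A hypothetical schema, defined only when a business question comes up:
# field name -> function that converts the raw value.
SALES_SCHEMA = {"id": str, "amount": float, "region": str}


def read_with_schema(raw_lines, tag, schema, defaults=None):
    """Apply `schema` on read: keep records carrying `tag` and shape them to the schema."""
    defaults = defaults or {}
    rows = []
    for line in raw_lines:
        record = json.loads(line)
        if tag not in record.get("tags", []):
            continue  # not relevant to this question; it stays untouched in the lake
        row = {}
        for field, convert in schema.items():
            value = record.get(field, defaults.get(field))
            row[field] = convert(value) if value is not None else None
        rows.append(row)
    return rows


# The question "how much did we sell?" yields a smaller, typed dataset.
sales = read_with_schema(RAW_EVENTS, "sales", SALES_SCHEMA, defaults={"region": "unknown"})
total = sum(row["amount"] for row in sales)
```

A warehouse would have rejected or reshaped the support record on the way in. Here it simply sits in storage until a different question, with a different schema, calls for it.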
Simon Says
I see a bright future for data lakes. Data volumes continue to increase—especially of the unstructured variety. Data storage costs keep plummeting and data is increasingly valuable. Rather than trying to retrofit useful and mature technologies to a very new environment, expect intelligent organizations to experiment with and adopt data lakes over the next few years.
Feedback
What say you?
In my next post on this subject, I’ll describe a few specific ways that companies can use data lakes.
This post was brought to you by IBM Global Technology Services. For more content like this, visit IT Biz Advisor.
The post Understanding Data Lakes appeared first on Phil Simon.
June 7, 2017
The Speed of Analytics
Introduction
As someone who’s attended more than his fair share of tech conferences over the years, I’ve heard the following refrain more times than I can count:
It takes our organization too long to make sense of our data.
Typically, I hear this from attendees in hallway side conversations, not from speakers who claim to know all of the answers. But why such a gap between theory and practice?
To be sure, in many organizations, analytics still falls under the “nice to have” umbrella in the whole scheme of things. That is, it doesn’t qualify as truly essential à la running payroll or being able to produce basic compliance or accounting reports. For this very reason, far too many professionals adopt a blasé approach when turning raw data into insights and, eventually, action.
The Law of Unintended Consequences
Of course, some organizations simply don’t have that option. Consider what the social-networking app Nextdoor did when its users began engaging in rampant racial profiling. As I write in Analytics: The Agile Way, its senior management moved very quickly to address what could be a company-killing issue. Over the course of several critical months in 2015, Nextdoor used Agile methods to redesign its app several times. Through intelligent design and an iterative approach, the company reduced racial profiling by an impressive 75 percent.
Organizations need to be more Agile in their analytics efforts.
No doubt that it’s easier in some ways for newer organizations and startups to react and adapt. After all, they generally aren’t encumbered by legacy technologies and dated infrastructure—two things that often impede organizations’ efforts to act nimbly.
Lessons from Nextdoor
As I write in the book, there’s a great deal that more mature firms can learn from lean startups such as Nextdoor. For starters, they can design—or redesign—their products with data in mind. Make no mistake: the data that an app, website, wearable, or tech product or service generates is a direct function of its design.
Second, speed begets more speed. Ditto for lethargy. Organizations unaccustomed to moving quickly almost always experience difficulty when crises and opportunities emerge. Put simply, when it comes to analytics, culture is critical. Foolish is the CXO who believes that his or her institution’s muscle memory doesn’t matter. It’s evident to me that employees, groups, departments, divisions, and even entire enterprises continue their patterns, whether positive or negative.
Simon Says: The speed imperative has never been greater.
The days of multi-year IT and business-intelligence projects are quickly coming to an end. As the world changes faster than ever, the notion of extracting value two years down the road seems downright silly.
Feedback
What say you?
This post was brought to you by IBM Global Technology Services. For more content like this, visit IT Biz Advisor.
The post The Speed of Analytics appeared first on Phil Simon.
May 24, 2017
Book Excerpt from Analytics: The Agile Way

Current Book Status: Writing 100%, Editing 100%, Layout 92.2%
Wiley has allowed me to post the Preface and Introduction to Analytics: The Agile Way. Enjoy.
Pre-order it on Amazon.
Feedback
What say you?
The post Book Excerpt from Analytics: The Agile Way appeared first on Phil Simon.
May 23, 2017
Expectations for Analytics: The Agile Way
Introduction
“It’s tough to make predictions, especially about the future.”
―Yogi Berra
The endorsements are in. The page proofs are nearly final. Ditto for the index and the cover copy. Next month, Analytics: The Agile Way hits the shelves.
Writing a book is a funny thing. If I’ve learned anything over the past nine years and eight books, it’s that making predictions about book sales is a fool’s errand. In other words, Yogi was right.
Muted Expectations This Time
I’m tempted here to revisit some of my previous texts.
Because of the book tour, my profile, and the efforts of my PR firm, I thought that Message Not Received would do very well. It didn’t—at least by my admittedly lofty standards. I had similarly high hopes for The Age of the Platform and I turned out to be right. Brass tacks: writing a book is ultimately a crapshoot.
In a word, my expectations this time are muted for several reasons. First, I’ve learned that ambitious sales goals increase the odds of disappointment. Second, I’m teaching this summer at ASU and won’t be doing a book tour, nor am I contracting a PR firm. (If you’re interested in a media copy, click here.)
All of this is to say that, relative to my other books, I won’t be expending the same time, energy, and resources promoting Analytics: The Agile Way. (Generally speaking, there’s a direct relationship between marketing and sales.)
Beyond that, Analytics is my first book geared towards an academic audience. No, it’s a textbook (far from it), but it’s a solid effort. I can certainly see it reaching a certain level and the Wiley folks obviously concur.
Finally, I don’t need Analytics to sustain me for the next year or more. I’m very content as a full-time faculty member at ASU. If my new book sells 20,000 copies, as The Age of the Platform did, then I’ll consider it gravy.
Simon Says: Embrace the uncertainty.
Would-be authors looking for certainty and guaranteed sales and income shouldn’t write books. I can’t speak for all writers, but many write because they enjoy the process, not because of certain financial rewards. To paraphrase a line from Rush drummer Neil Peart (a notoriously reluctant touring musician since the mid-1980s), “that’s just what a writer does.”
Feedback
What say you?
The post Expectations for Analytics: The Agile Way appeared first on Phil Simon.
May 22, 2017
Analytics: The Agile Way Media Copies
Yeah, I’m a media whore.
I’ll be giving out a limited number of media copies of Analytics: The Agile Way. If you are interested and write for a relevant publication or blog that reaches a respectable number of people, click the button on the right and fill out the relevant form.
The post Analytics: The Agile Way Media Copies appeared first on Phil Simon.
May 15, 2017
Thoughts on Smarter Transportation
Introduction
It’s no overstatement to say that we’ve made more progress with autonomous cars over the past five years than we made in the prior 50. John Markoff of The New York Times writes:
Cars are beginning to drive on their own in certain situations, and in the coming years, they will do increasingly more under computer control. They will follow curving roads, change lanes, pass through intersections, and stop and start.
Fascinating stuff to be sure, but will this actually happen? Put differently, can smart vehicles really reach their full potential on “dumb” roads?
The Case for Smarter Infrastructure
Let’s first disabuse ourselves of the notion that automobile travel is fundamentally safe. It’s not. Consider the following stats from The Association for Safe International Road Travel, a non-profit, humanitarian organization that promotes road travel safety through education and advocacy:
Nearly 1.3 million people die in road crashes each year. This averages to 3,287 deaths per day.
An additional 20-50 million are injured or disabled.
More than half of all road traffic deaths occur among young adults ages 15-44. (Texting while driving is increasingly problematic.)
Scary numbers to be sure. Technology and data may not solve every problem, but it’s hard to argue that there isn’t significant room for improvement here. As David Carr writes:
Vehicle-to-infrastructure (V2I) and vehicle-to-vehicle (V2V) technologies are intended to work together to promote safer, more efficient transportation. V2V would have cars send each other information such as speed, position, acceleration, size, brake status and other “basic safety message” data as often as 10 times per second.
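For the technically curious, the following Python sketch shows roughly what one second of those broadcasts might look like. It is only an illustration: the `SafetyMessage` fields, the JSON encoding, the placeholder coordinates, and the vehicle dimensions are assumptions made up here, and they do not follow the actual V2V message standard.

```python
from dataclasses import dataclass, asdict
import json

BROADCAST_HZ = 10  # "as often as 10 times per second", per the quote above


@dataclass
class SafetyMessage:
    """Illustrative message only; field names are hypothetical, not a real V2V standard."""
    vehicle_id: str
    timestamp_s: float
    latitude: float
    longitude: float
    speed_mps: float
    acceleration_mps2: float
    length_m: float
    width_m: float
    brake_applied: bool


def messages_for_one_second(vehicle_id, start_s, speed_mps, acceleration_mps2):
    """Build the messages one car would send in a second at constant acceleration."""
    interval = 1.0 / BROADCAST_HZ
    out = []
    for i in range(BROADCAST_HZ):
        t = i * interval
        out.append(SafetyMessage(
            vehicle_id=vehicle_id,
            timestamp_s=start_s + t,
            latitude=33.4255,       # placeholder position; a real car would read GPS
            longitude=-111.9400,
            speed_mps=speed_mps + acceleration_mps2 * t,
            acceleration_mps2=acceleration_mps2,
            length_m=4.5,           # assumed vehicle size
            width_m=1.8,
            brake_applied=acceleration_mps2 < 0,  # crude stand-in for a brake sensor
        ))
    return out


# A car travelling at 20 m/s and slowing by 2 m/s every second.
batch = messages_for_one_second("car-42", start_s=0.0, speed_mps=20.0, acceleration_mps2=-2.0)
payload = json.dumps(asdict(batch[-1]))  # one message serialized for sending
```

Even this toy version hints at the data volume involved: ten messages per second, per vehicle, for every car on the road.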
Cars and trucks that could seamlessly communicate with each other and with centralized hubs could do more than avert traffic. Dynamic algorithms could alert people and trucks to begin routes at optimal times of the day. We already know that Uber uses sophisticated tech and data to entice its “driver-partners” and minimize their downtime. There’s no reason that other organizations cannot follow its lead. Beyond this, expect things such as priority lanes for electric cars, interactive lights, and roads that glow in the dark.
Smarter cars and infrastructure certainly won’t eliminate accidents, but it’s not hard to envision lower incidences. For instance, how about a vehicle with sensors that forbid you to drive if you appear to be intoxicated à la “The Entire History of You”, my favorite episode of Black Mirror? How about roads that communicate with your car if you swerve excessively or drive too fast? Ten years ago this was pure science fiction, but it’s easy to imagine scenarios such as these today.
Simon Says: The future is exciting.
Phones and televisions used to be simple, data-free devices. This is no longer the case. Each can do far more than its antecedents. Why should cars, roads, and bridges remain stagnant?
Feedback
What say you?
This post was brought to you by IBM Global Technology Services. For more content like this, visit IT Biz Advisor. 
The post Thoughts on Smarter Transportation appeared first on Phil Simon.
May 11, 2017
Update on Analytics: The Agile Way
Good news on the new text.
As I speak, the production folks at Wiley are making the final pages. I’ve procured a few endorsements already and more are forthcoming. My indexer is doing her part.
Here are a few near-final pages from the 304-page book:
I’m excited that Wiley is publishing the book in late June—in time for me to use it in my summer Enterprise Analytics class at ASU.
Pre-order it here.
The post Update on Analytics: The Agile Way appeared first on Phil Simon.
May 8, 2017
Data and Analytics: Understanding the Human Element
Introduction
In my last post, I compared today’s Internet of Things (IoT) to the web circa 1998. The promise to make better decisions is massive, even if relatively few organizations and industries have taken the plunge.
One of the biggest impediments to mainstream adoption of the IoT remains the lack of universal protocols and standards. Few people want to come home from work and spend four hours trying to configure smart devices. Make no mistake, though. That’s hardly the only formidable obstacle. Security remains a major concern. Case in point: Last October, news of a widespread hack proved what many industry experts and IoT skeptics have feared for years: Despite the IoT’s enormous promise, security threats loom large.
The Thorny Human Element
Much of the ultimate success of the IoT hinges on what we do with data. Remember that “data” and “analytics” typically don’t act of their own volition. (Insert Terminator reference.) We do. At least today, algorithms that power search engines and high-frequency trading remain the exceptions that prove the rule. It’s helpful here to think about some fundamental human questions:
Do we know our limitations?
Are we really willing to go where the data takes us?
Or will we fall victim to confirmation bias? Will we simply dismiss new data that doesn’t conform to our world views?
For instance, even the most skilled and knowledgeable doctor cannot possibly read—let alone understand and interpret—every cancer study. The corpus is massive, complex, and constantly evolving. Yet, there’s evidence that artificial-intelligence engines such as Watson can help medical professionals make better decisions. As Robert Hackett writes:
No human could possibly read the entirety of medical literature, personal health records, and case file histories that might inform a doctor’s professional opinion when trying to save a cancer patient’s life. But a machine can.
I’m no doctor, but I can see how someone who spent a decade studying her craft and incurring massive debt might have a hard time taking advice from machines. This is the same tension that Billy Beane experienced as the general manager of the Oakland A’s in Moneyball. Traditional baseball talent scouts didn’t need newfangled analytics telling them which ballplayers were worth drafting. They just knew. (Of course, they were wrong.)
Simon Says: In chaos lies opportunity.
I have no doubt that new data sources and technologies will continue to unearth fascinating insights into just about all walks of life. Healthcare and the IoT are just the tips of the iceberg. I am equally sure, though, that many professionals will continue to discount their power. And it is here that the next generation of companies will distinguish itself from the pack.
Feedback
What say you?
This post was brought to you by IBM Global Technology Services. For more content like this, visit IT Biz Advisor. 
The post Data and Analytics: Understanding the Human Element appeared first on Phil Simon.
May 1, 2017
How Analytics Will Enable a Better, Smarter IoT
Introduction
I’m old enough to remember the rise of the Web. Twenty years ago, bullish industry experts and thought leaders portended the end of just about every brick-and-mortar business: commerce, food delivery, and even currency were supposed to go the way of the Dodo. (For a trip down memory lane, click here.)
Of course, we know what happened. Ridiculous dot-com valuations crumbled and very few of those original entrants remain. For every Amazon, Google, and eBay, thousands of companies have perished.
In retrospect, though, some ideas were truly terrible. Still, some early startups vying to disrupt traditional industries might have failed not because their models were “wrong” but because they were too early. Big difference. Indeed, a look at today’s landscape reveals no shortage of companies using technology, data, and analytics to do things that simply weren’t possible even a decade ago. It’s not hard to imagine primitive 1998 versions of Uber and Airbnb. For all I know, a few college kids were planning on hatching ride- and home-sharing websites when the bubble burst.
Today’s Dot-Com Analog: The IoT
Today, organizations can go beyond merely fixing servers, trucks, and airplanes after they break.
I often think about the original dot-com companies in today’s context. In a parallel vein, I’m hard-pressed to think of a more overhyped concept than the Internet of Things. Its potential is nearly impossible to overstate, yet relatively little of it has arrived yet, and that is very much the operative word. Twenty years from now, I suspect that we’ll look back at 2017 through a similar lens: the dots were there but few companies and industries connected them.
Rather than wax poetic about high-level trends, consider the following simple question: When will equipment fail? As Doug Bonderud writes:
For IT professionals, the use of data to address tech problems is nothing new. What’s changed is the amount and quality of this information. Ten years ago, companies were stuck in reactionary mode: Lacking the tools to analyze and act on real-time data, IT experts were forced to wait until software or services failed and then use the resulting data to address specific issues.
This is no longer the case. Put differently, waiting for the shoe to eventually drop seems so 1998. Today intelligent people and organizations can go beyond merely repairing servers, trucks, and airplanes after they break or fail. Expect continuous monitoring to become commonplace. To this end, I’ve heard the term predictive maintenance more in the past two years than in the past two decades. At a high level, though, how does this happen?
The short answer is that, thanks to increasingly smart sensors, we’ve got more data and better analytics than ever. As anyone with a modicum of statistical experience knows, better data leads to better predictions and, ultimately, better business outcomes. For instance, professionals will know that a given vehicle has a 20-30 percent chance of failing in the next two weeks.
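As a rough illustration of where a number like that could come from, consider the simplest possible approach: look at past vehicles with similar sensor readings and count how many failed. The Python sketch below does just that. The `HISTORY` data, the temperature bands, and the function names are all made up for the example; real predictive-maintenance systems use far richer data and models.

```python
# Hypothetical history: (peak engine temperature in °C, failed within 14 days?)
HISTORY = [
    (88, False), (91, False), (93, False), (95, False), (97, True),
    (99, False), (101, False), (103, True), (104, False), (106, False),
    (110, False), (112, True), (115, True), (118, True),
]


def failure_rate(history, low, high):
    """Share of past vehicles with a reading in [low, high) that failed within two weeks."""
    outcomes = [failed for reading, failed in history if low <= reading < high]
    if not outcomes:
        return None  # no comparable vehicles, so no estimate
    return sum(outcomes) / len(outcomes)


def estimate_for(history, reading, band=10):
    """Estimate failure risk for one vehicle from peers in the same temperature band."""
    low = (reading // band) * band
    return failure_rate(history, low, low + band)


# Risk for a vehicle whose sensor currently reports 102 °C.
risk = estimate_for(HISTORY, 102)
```

With this toy history, a vehicle in the 100 to 110 degree band has one failure among four comparable peers. More sensors and more history would narrow the bands and sharpen the estimate, which is the "better data leads to better predictions" point in miniature.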
Simon Says: Better data often yields better analytics and outcomes.
Of course, this isn’t always the case. With greater noise, it can be difficult to find the signal. Still, I’ll bet on the eventual success of organizations that turn raw data into meaningful analytics and insights.
Feedback
What say you?
This post was brought to you by IBM Global Technology Services. For more content like this, visit IT Biz Advisor. 
The post How Analytics Will Enable a Better, Smarter IoT appeared first on Phil Simon.


