A Simplified Definition of “Data Scientist”



There is a lot of controversy around the definition of a Data Scientist.



Some think it means being a statistician, others think it means being a technologist, and still others have different requirements.



I think the best definitions are more general and goal-based, and look something like these:






Data Scientist


/’dadə sīən(t)əst’/


noun
1. Someone who specializes in collecting, massaging, and/or displaying data in order to tell a story that results in a positive outcome.

2. Someone who can use technical skills to extract meaning from information in a way that enables decision makers to make better choices.

3. Someone who can extract business value from data using mathematics and technology.




Importantly, this could be a triple Ph.D. in statistics, maths, and computer science, or a talented graphic designer with some decent Python skills.



The key is that they’re able to use data to illuminate how the world works and facilitate progress.



So you can break down the definitions into 49.6 different categories and sub-categories, or you can use this approach and focus on outcomes.



I think this approach is more resilient, especially given how quickly the field is changing.



Notes


The definitions above assume both good faith and possession of requisite talent/skills. Manipulation and incompetence are not in scope.
There’s a humorous alternative definition which says, “A data scientist is someone who’s better at statistics than any software engineer, and better at software engineering than any statistician.”


Published on January 29, 2017 00:42


Daniel Miessler's Blog
