Data Scientist vs. Data Algorithm

Who is a good data scientist? A good data scientist must be able to define the problem and the system around it, collect and analyze data, integrate that analysis into strategic analytics and modeling, run simulations, and develop and implement algorithms that create value. Very few people would think that buying top-of-the-line power tools would turn them into a master carpenter. So why would anyone believe that applying algorithms or software alone could turn them into a data scientist? The "scientist," as the term is commonly understood, deals with universal issues, such as gravity or light, that affect all humans; data scientists deal with subject areas that affect a limited number of humans, but the protocols, procedures, and everything else are practically the same. A data scientist should also be subject to peer review if consumers expect to rely on the results in high-risk scenarios. To avoid bias and a lack of blinding, the subject matter expert should not be the person building the model. And the team should get more than one subject matter expert's opinion (as many as possible) about, for example, which candidate variables to consider, along with other expert inputs, and should use more than one modeler if the algorithms require subjective and potentially biased human choices.
How to define a good algorithm? A good algorithm is developed by integrating knowledge-based data into analytic models, testing it through simulation, and implementing it for problem-solving. Some business modeling isn't independently validated, for good reason: either its performance has proven it well enough that independent validation is not economically justified, or it is seen as the secret sauce. Algorithms are indeed nice tools for a data scientist. However, you need to keep in mind that underlying these algorithms are models, each with its own assumptions, strengths, and weaknesses. In addition, these algorithms require data, and understanding the idiosyncrasies of the data is critical to model performance. Understanding how to synthesize new predictors in a way that increases the predictive power of the data is critical to improving model performance. This is where the data scientist shows true value, and where algorithms fall flat on their faces. Consider the kinds of models involved:
- These models may predict the direction of an economy (usually large systems of simultaneous econometric equations), and are validated, independently reproduced, etc.
- These models may describe and predict the outcomes of interactions between compounds and biological systems. They often need to consider molecular structures as well as biological processes.
- These models can predict and simulate human behavior through a sequence of events.
- They may be physical models that estimate flow through a very dynamic situation using only differential frequency.
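
To make the point about synthesizing new predictors concrete, here is a minimal Python sketch. Everything in it is an assumption made up for illustration: the synthetic "loan" data, the names `income`, `debt`, and `risk`, and the rule that risk follows the debt-to-income ratio. It only shows that a predictor built from domain understanding can carry far more signal than either raw column on its own.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def make_synthetic_loans(n=500, seed=0):
    """Generate hypothetical (income, debt, risk) rows.

    Assumption for illustration only: risk is driven by debt relative
    to income, plus a little Gaussian noise.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        income = rng.uniform(20_000, 120_000)
        debt = rng.uniform(1_000, 60_000)
        risk = debt / income + rng.gauss(0, 0.05)
        rows.append((income, debt, risk))
    return rows


rows = make_synthetic_loans()
income = [r[0] for r in rows]
debt = [r[1] for r in rows]
risk = [r[2] for r in rows]

# The synthesized predictor: a ratio that someone who understands the
# domain would think to build, and that neither raw column exposes directly.
debt_to_income = [d / i for i, d in zip(income, debt)]

corr_income = pearson(income, risk)
corr_debt = pearson(debt, risk)
corr_ratio = pearson(debt_to_income, risk)

print(f"income vs risk:         {corr_income:+.2f}")
print(f"debt vs risk:           {corr_debt:+.2f}")
print(f"debt/income vs risk:    {corr_ratio:+.2f}")
```

In this toy setup the ratio tracks the outcome much more closely than either raw variable, but only because the data were generated that way. On real data, knowing which combination to try is exactly the judgment the algorithm cannot supply.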

The point is that we should all have some humility, recognize the limits of our expertise, and partner with other experts to apply analytical algorithms to problem-solving. Then there is the opportunity to make something great. Sometimes you will be unqualified but still the most qualified resource available; in other situations, strap on a set of ears and a little humility, and help empower the other expert to get the job done. And if a data scientist wants to be called a "scientist," remember that words mean things, and the word "scientist" is invariably associated with the rigor of a quality outcome.

Follow us at: @Pearl_Zhu
Published on July 30, 2015 00:00