Kindle Notes & Highlights
by Mariya Yao
Read between January 1 and January 1, 2019
Though AI refers to a larger umbrella of computational techniques, the most successful modern AI solutions are powered by machine learning algorithms.
Statistics is the discipline concerned with the collection, analysis, description, visualization, and drawing of inferences from data.
Data mining is the automation of exploratory statistical analysis on large-scale databases.
The goal of data mining is to extract patterns and knowledge from large-scale datasets so that they can be reshaped into a more understandable structure for later analysis.
Symbolic systems are programs that use human-understandable symbols to represent problems and reasoning. The most successful form of symbolic system is the expert system, which mimics the decision-making process of human experts.
What happens if you want to teach a computer to do a task, but you’re not entirely sure how to do it yourself? What if the problem is so complex that it’s impossible for you to encode all of the rules and knowledge upfront?
Machine learning enables computers to learn without being explicitly programmed. It is a field in computer science that builds on top of computational statistics and data mining.
Supervised learning occurs when the computer is given labeled training data, which consists of paired inputs and outputs (e.g. an image of a cat correctly labeled as “cat”), and learns general rules that can map new inputs to the correct output.
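The input-to-output mapping described above can be sketched with a toy 1-nearest-neighbor classifier. This is a hypothetical illustration, not an example from the book; the function name and data are invented.

```python
# A minimal sketch of supervised learning, assuming a toy 1-nearest-neighbor
# classifier over labeled (input, output) pairs.

def nearest_neighbor_predict(train, new_input):
    """Map a new input to the label of its closest training example."""
    closest = min(train, key=lambda pair: abs(pair[0] - new_input))
    return closest[1]

# Labeled training data: paired inputs and outputs.
train = [(1.0, "cat"), (1.2, "cat"), (5.0, "dog"), (5.3, "dog")]

print(nearest_neighbor_predict(train, 1.1))  # cat
print(nearest_neighbor_predict(train, 4.8))  # dog
```

The "general rule" here is simply distance to known examples; real supervised learners fit richer decision rules, but the labeled-pairs-to-prediction flow is the same.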
Unsupervised learning occurs when computers are given unstructured rather than labeled data, i.e. no input-output pairs, and asked to discover inherent structures and patterns that lie within the data. One common application of unsupervised learning is clustering, where input data is divided into different groups based on a measure of “similarity.”
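Clustering can be sketched with a toy 1-D k-means: no labels are given, and the algorithm groups points purely by distance to two moving centers. This is a hypothetical example with invented data; it assumes both groups stay non-empty.

```python
# A minimal sketch of clustering, assuming a toy 1-D k-means with k=2.

def kmeans_1d(points, iterations=10):
    """Split unlabeled points into two groups by distance to two centers."""
    centers = [min(points), max(points)]
    for _ in range(iterations):
        groups = [[], []]
        for p in points:
            # Assign each point to the nearest center (the "similarity" measure).
            idx = 0 if abs(p - centers[0]) <= abs(p - centers[1]) else 1
            groups[idx].append(p)
        # Move each center to the mean of its group (assumes non-empty groups).
        centers = [sum(g) / len(g) for g in groups]
    return groups

data = [1.0, 1.1, 0.9, 8.0, 8.2, 7.9]
print(kmeans_1d(data))  # [[1.0, 1.1, 0.9], [8.0, 8.2, 7.9]]
```

The algorithm never sees a label, yet it recovers the two obvious groups in the data.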
Reinforcement learning is learning by trial-and-error, in which a computer program is instructed to achieve a stated goal in a dynamic environment.
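The trial-and-error loop can be sketched with tabular Q-learning on a tiny corridor environment: four states, a goal state that pays reward 1, and left/right actions. This is a hypothetical toy environment, not from the book; the hyperparameter values are arbitrary.

```python
import random

# A minimal sketch of reinforcement learning: tabular Q-learning on a
# 4-state corridor where reaching state 3 pays reward 1.

random.seed(0)
n_states, actions = 4, [-1, +1]          # move left or right
Q = {(s, a): 0.0 for s in range(n_states) for a in actions}
alpha, gamma, epsilon = 0.5, 0.9, 0.2    # learning rate, discount, exploration

for episode in range(200):
    s = 0
    while s != 3:
        # Trial and error: mostly exploit the best known action, sometimes explore.
        if random.random() < epsilon:
            a = random.choice(actions)
        else:
            a = max(actions, key=lambda a: Q[(s, a)])
        s2 = min(max(s + a, 0), n_states - 1)
        r = 1.0 if s2 == 3 else 0.0
        # Nudge the estimate toward reward plus discounted future value.
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in actions) - Q[(s, a)])
        s = s2

# The learned greedy policy moves right (+1) from every non-goal state.
print([max(actions, key=lambda a: Q[(s, a)]) for s in range(3)])  # [1, 1, 1]
```

No one tells the agent the rules of the corridor; the stated goal plus repeated interaction with the environment is enough for the right policy to emerge.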
Deep learning is a subfield of machine learning that builds algorithms by using multi-layered artificial neural networks, which are mathematical structures loosely inspired by how biological neurons fire.
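The "multi-layered" idea can be sketched with a forward pass through one hidden layer that computes XOR, a function no single-layer network can represent. The weights here are hand-picked for illustration; real deep learning learns them from data.

```python
# A minimal sketch of a multi-layered network: hand-set weights computing XOR.

def step(z):
    """A crude stand-in for a neuron 'firing' past a threshold."""
    return 1 if z > 0 else 0

def forward(x1, x2):
    # Hidden layer: one neuron detects "x1 OR x2", another "x1 AND x2".
    h1 = step(x1 + x2 - 0.5)
    h2 = step(x1 + x2 - 1.5)
    # Output layer combines them: OR and not AND gives XOR.
    return step(h1 - h2 - 0.5)

print([forward(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])  # [0, 1, 1, 0]
```

Stacking layers lets the network build intermediate features (here, OR and AND) that the output layer composes into a function the inputs alone cannot express linearly.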
In practice, using simpler AI approaches like older, non-deep-learning machine learning techniques can produce faster and better results than fancy neural nets can. Rather than building custom deep learning solutions, many enterprises opt for Machine Learning as a Service (MLaaS) solutions from Google, Amazon, IBM, Microsoft, or leading AI startups.
Probabilistic programming enables us to create learning systems that make decisions in the face of uncertainty by making inferences from prior knowledge.
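The inference-from-prior-knowledge idea can be sketched without any probabilistic-programming language: update a uniform prior over a coin's bias after observing data, using a simple grid approximation. This is a hypothetical illustration of the underlying math, not a real probabilistic-programming system.

```python
# A minimal sketch of Bayesian inference by grid approximation:
# posterior over a coin's bias after observing 8 heads and 2 tails.

grid = [i / 100 for i in range(1, 100)]   # candidate bias values
prior = [1.0 for _ in grid]               # uniform prior knowledge

heads, tails = 8, 2                       # observed evidence
likelihood = [p**heads * (1 - p)**tails for p in grid]

unnorm = [pr * lk for pr, lk in zip(prior, likelihood)]
total = sum(unnorm)
posterior = [u / total for u in unnorm]

# The posterior mean is a decision-ready estimate under uncertainty.
mean = sum(p * w for p, w in zip(grid, posterior))
print(round(mean, 2))  # 0.75, matching the analytic Beta(9, 3) mean
```

Probabilistic programming languages automate exactly this prior-times-likelihood bookkeeping for far richer models than a single coin.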
Probabilistic programs have been used successfully in applications such as medical imaging, machine perception, financial predictions, and econometric and atmospheric forecasting.
There are four broad categories of ensembling: bagging, boosting, stacking, and bucketing. Bagging entails training the same algorithm on different subsets of the data and includes popular algorithms like random forest. Boosting involves training a sequence of models, where each model prioritizes learning from the examples that the previous model failed on. In stacking, you pool the output of many models. In bucketing, you train multiple models for a given problem and dynamically choose the best one for each specific input.
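The bagging idea in particular fits in a few lines: bootstrap resampling plus a majority vote over simple threshold "stumps", a hypothetical stand-in for the decision trees in a random forest. The data and function names are invented for illustration.

```python
import random

# A minimal sketch of bagging: train the same simple learner on bootstrap
# resamples of the data and combine predictions by majority vote.

def train_stump(sample):
    """Learn a threshold at the midpoint of the two class means."""
    zeros = [x for x, y in sample if y == 0]
    ones = [x for x, y in sample if y == 1]
    t = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
    return lambda x: 1 if x > t else 0

def bagged_predict(data, x, n_models=25, seed=0):
    rng = random.Random(seed)
    votes = 0
    for _ in range(n_models):
        # Bootstrap: resample the training set with replacement.
        sample = [rng.choice(data) for _ in data]
        if len({y for _, y in sample}) < 2:
            continue  # skip degenerate resamples containing only one class
        votes += train_stump(sample)(x)
    return 1 if votes > n_models / 2 else 0

data = [(1.0, 0), (1.5, 0), (2.0, 0), (6.0, 1), (6.5, 1), (7.0, 1)]
print(bagged_predict(data, 1.2), bagged_predict(data, 6.8))  # 0 1
```

Each stump sees a slightly different view of the data, so the vote averages away the quirks of any single resample, which is the point of bagging.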
Systems That Act are what we define as rule-based automata.
Systems That Predict are systems that are capable of analyzing data and using it to produce probabilistic predictions.
Machine learning approaches to lead scoring can perform better than rule-based or statistical methods.
Computers have been used for generative design and art for decades.
Sentiment analysis, also known as opinion mining or emotion AI, extracts and quantifies emotional states from our text, voice, facial expressions, and body language.
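For text, the simplest form of sentiment analysis is lexicon-based scoring. The tiny lexicon below is invented for illustration; production systems learn sentiment from data and, as noted above, also draw on voice, facial expressions, and body language.

```python
# A minimal sketch of lexicon-based sentiment analysis on text.

lexicon = {"love": +1, "great": +1, "happy": +1,
           "hate": -1, "terrible": -1, "sad": -1}

def sentiment(text):
    """Score text by summing word polarities; the sign gives the emotional state."""
    score = sum(lexicon.get(w.strip(".,!?").lower(), 0) for w in text.split())
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(sentiment("I love this great product!"))   # positive
print(sentiment("Terrible service, very sad."))  # negative
```

This both extracts an emotional state and quantifies it (the raw score), which is exactly the extract-and-quantify framing in the definition above.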
A System That Masters is an intelligent agent capable of constructing abstract concepts and strategic plans from sparse data.
You may have brilliant ideas for using artificial intelligence to improve your organization and community, but translating those ideas into viable software requires having the right mindset, dedicated leadership, and a diverse support team.
The ideal characteristics of an executive AI champion include:
- C-suite executive level or higher
- Business and domain expert
- Credible and influential
- Technically knowledgeable
- Analytical and data-driven
- Controls sufficient budget
- Encourages experimentation
- Understands and accepts risks
- Collaborates well with decision-makers across multiple business units
The CTO defines the technology architecture, runs engineering teams, and continuously improves the technology behind the company’s product offerings. Creativity, technical skill, and ability to innovate are essential to a CTO’s success.
The CIO runs an organization’s IT and Operations to streamline and support business processes. Unlike the CTO, the CIO’s customers are internal users, functional departments, and business units. CIOs typically adapt and integrate third-party infrastructure solutions to meet their unique business needs and do less custom development than CTOs do.
Chief Data Officers (CDOs) are becoming increasingly common, but their mandate is more often the security, regulation, and governance of enterprise data.
The CAO can then apply meaningful analytics to solve business problems. The roles overlap, and the titles are often interchangeable.
“A hundred years ago electricity transformed countless industries; 20 years ago the internet did, too. Artificial intelligence is about to do the same,” writes Andrew Ng.
Presenting a clear ROI on AI initiatives is the best way to persuade executive stakeholders, but this can be challenging while enterprise AI adoption is still early and unproven in many sectors.
A key strategy is to appeal to your business leaders by highlighting the potential to increase the bottom line.
When proposing a project, emphasize the value that new technology can deliver instead of the technical details of implementation.
AI systems largely handle individual tasks, not whole jobs.
Machine Intelligence Continuum (MIC) framework and AI applications within a specific industry, a training module on how to evaluate an organization for AI-readiness, and a hands-on project to design and implement a pilot.
A data science team manager understands how best to deploy the expertise of their team in order to maximize productivity on a project.
ML engineers build machine learning solutions to solve business and customer problems. These specialized engineers deploy models, manage infrastructure, and run operations related to machine learning projects. They are assisted by data scientists and data engineers, who manage databases and build the data infrastructure necessary to support the products and services used by their customers.
Data scientists collect data, spend most of their time cleaning it, and spend the rest looking for patterns in the data and building predictive models.
Researchers are more focused on driving scientific discovery and less concerned with pursuing industrial applications of their findings.
Applied researchers straddle research and engineering.
Underfitting occurs when your model is too simple to capture the complexities of your underlying data.
Overfitting occurs when your model does not generalize well outside of your training data.
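The two failure modes can be contrasted on toy data where the true rule is roughly y = 2x. This is a hypothetical illustration: the "underfit" model is too simple (it always predicts the training mean), while the "overfit" model memorizes the training pairs and has no rule to generalize with.

```python
# A minimal sketch contrasting underfitting and overfitting on noisy y ≈ 2x data.

train = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 8.0)]
test = [(5, 10.1), (6, 11.8)]

# Underfit: too simple to capture the trend; always predict the mean output.
mean_y = sum(y for _, y in train) / len(train)
underfit = lambda x: mean_y

# Overfit: memorize the training pairs exactly; unseen inputs get a default.
table = dict(train)
overfit = lambda x: table.get(x, 0.0)

def error(model, data):
    """Mean squared error of a model on a dataset."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

for name, model in [("underfit", underfit), ("overfit", overfit)]:
    print(name, round(error(model, train), 2), round(error(model, test), 2))
```

The memorizer scores a perfect 0.0 on training data yet has the worst test error, which is the signature of overfitting; the mean predictor is mediocre everywhere, the signature of underfitting.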
“The Hidden Technical Debt In Machine Learning Systems,”
Technical debt refers to the cost of additional rework that will be needed in the future when you opt for quick and hacky fixes early on.
However, the performance of your existing models will deteriorate as environmental conditions change over time.
Machine learning debt can be divided into three main types: code debt, data debt, and math debt.
math debt stems from the complexity of the model’s algorithms.
Companies such as Google, Facebook, and Airbnb have created internal Machine Learning as a Service (MLaaS) platforms to enable their engineering teams to build, deploy, and operate machine learning solutions with ease.
MLaaS also facilitates clear knowledge documentation. The following types of data are captured by Uber’s Michelangelo:
- Who trained the model
- Start and end time of the training job (some complicated training jobs can take hours or even days)
- Full model configuration (features used, hyper-parameter values, etc.)
- Reference to training and test datasets
- Model accuracy metrics
- Standard charts and graphs for each model type
- Full learned parameters for the model
- Summary statistics for model visualization
- Other notes and information