Neural Networks and Deep Learning

Neural Networks and Deep Learning is a free online book. The book will teach you about:
* Neural networks, a beautiful biologically-inspired programming paradigm which enables a computer to learn from observational data
* Deep learning, a powerful set of techniques for learning in neural networks

Neural networks and deep learning currently provide the best solutions to many problems in image recognition, speech recognition, and natural language processing. This book will teach you the core concepts behind neural networks and deep learning.

224 pages, ebook

First published November 25, 2013

167 people are currently reading
2184 people want to read

About the author

Michael Nielsen

13 books1,504 followers
On friend requests: I use the "updates" list a lot to find new books. So I will only accept friend requests if either (a) we're actually friends; or (b) your list of updates is mostly books that look really interesting to me, and like the kind of thing I might otherwise not see. It doesn't mean I don't like you if I don't accept the request!


On technical books: I struggle with how to shelve technical books. Serious reading of such a book may take an hour or more per page; I will rarely seriously read the entire book for that reason. On the other hand, I would like to put those books on my Goodreads profile.

Up until October 2023 I mostly didn't put them on my profile, with a few exceptions. Going forward I will add more, but tend to shelve them under "serious_use". Roughly speaking, that means I will have seriously used a fair bit of the book - many hours (or, sometimes, far more) of use.

Ratings & Reviews



Community Reviews

5 stars: 293 (65%)
4 stars: 121 (27%)
3 stars: 30 (6%)
2 stars: 3 (<1%)
1 star: 0 (0%)
Displaying 1 - 30 of 67 reviews
333 reviews, 24 followers
June 3, 2019
Thanks to this book, I can finally build my own neural net from scratch, not just run one line of code in TensorFlow, caret or Keras to get what I need. This is really a great tutorial: all code provided, step-by-step improvement of an ANN to classify the MNIST dataset, interactive plots to understand the basics of neural nets, etc. It is easy to read, and gives some good insights into the historical problems encountered in this field of research and how those problems got solved. The concepts are very well illustrated. Simply perfect.
Dennis Cahillane
115 reviews, 8 followers
May 29, 2019
Information dense, light on speculation and anecdotes. The author's brain is on the same wavelength as mine, which is a very good thing. Regarding difficulty, this book should be read after a book like The Nature of Code by Shiffman, and before Deep Learning by Goodfellow.
Pastafarianist
24 reviews
April 4, 2015
The book is still a work in progress, so don't take this review too seriously.

I really liked how diligently Nielsen explains the hows and whys of neural networks. If, like I once did, you fear the magic of backpropagation, this book will make every effort to dissolve your fears. An immensely careful and thorough treatment of backprop is probably the book's most valuable chapter. However, later on Nielsen starts making a lot of detours. Of course, the field of neural networks is barely developed at all, so the author has no choice but to keep saying here and there that we don't really know what's going on and why things work the way they do. Still, personally, I would have liked the book more if its ratio of things done to words said were a tiny bit higher.
49 reviews, 2 followers
August 4, 2016
The first book I've read on the topic. The author carefully lays out a narrative which helps one grasp the entire picture, piece by piece. He works with a (very popular) running example as the concepts are broken down for you. He steers clear of daunting mathematical proofs but doesn't shy away from logical discourses, some of which span an entire chapter.

This is by no means a comprehensive book on everything that is happening in the field, but if you, like me, are someone who's just begun and is looking to get in and get a hold of the basics, look no further.

At a steady pace, you can complete it in under four days.
Farsan Rashid
36 reviews, 17 followers
May 5, 2018
TLDR: Extraordinary for its intended readers. The book is intended for readers who want to understand how and why neural networks work, instead of using neural networks as a black box.

The book consists of six chapters: the first four cover neural networks and the last two lay the foundation of deep neural networks. Lots of animations, pictures and interactive elements all over the book make you visualize neural networks right in front of your eyes. From time to time, references to research papers are given for enthusiastic readers.

Chapter 1: Using neural nets to recognize handwritten digits
Starting from scratch, a neural network of fewer than 100 lines of code that classifies handwritten digits is created. Gradient descent, the heart of neural network training, is explained in detail. Other key topics of the chapter were neural network architecture (neuron, bias, weight, learning rate, cost function, activation function), training, test and validation data, stochastic gradient descent, and hyper-parameters.
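To make the gradient descent step concrete, here is a minimal sketch of the update rule v → v − η∇C that the chapter builds on. This is my own illustration, not the book's network code:

```python
import numpy as np

# Minimal gradient descent sketch (illustrative; not the book's code).
# Repeatedly step against the gradient: v -> v - eta * grad_C(v).
def gradient_descent(grad_C, v, eta=0.1, steps=200):
    for _ in range(steps):
        v = v - eta * grad_C(v)
    return v

# Example: C(v) = ||v||^2 has gradient 2v and its minimum at the origin.
v_min = gradient_descent(lambda v: 2.0 * v, np.array([3.0, -4.0]))
print(v_min)  # close to [0. 0.]
```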

Chapter 2: How the backpropagation algorithm works
Using four fundamental equations, the author describes how exactly backpropagation (the way neural networks learn) works.
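For reference, the four fundamental equations this summary refers to, in the book's notation (δ is the layer error, ⊙ the elementwise product):

```latex
\begin{align}
\delta^L &= \nabla_a C \odot \sigma'(z^L)
  && \text{(BP1: error in the output layer)}\\
\delta^l &= \left((w^{l+1})^T \delta^{l+1}\right) \odot \sigma'(z^l)
  && \text{(BP2: error propagated backward)}\\
\frac{\partial C}{\partial b^l_j} &= \delta^l_j
  && \text{(BP3: gradient w.r.t. biases)}\\
\frac{\partial C}{\partial w^l_{jk}} &= a^{l-1}_k \,\delta^l_j
  && \text{(BP4: gradient w.r.t. weights)}
\end{align}
```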

Chapter 3: Improving the way neural networks learn
I enjoyed this chapter the most. It teaches how to tune a neural network and what to worry about when designing a robust one. Learning slowdown, the cross-entropy cost function, overfitting, L1 and L2 regularization, dropout, softmax, better weight initialization, and heuristics for good hyper-parameter selection were the key topics of the chapter. A very important question, "Why does regularization help reduce overfitting?", is answered in great detail.
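As a reminder of what the cross-entropy cost looks like in practice, a small sketch for a sigmoid output a and binary target y (a hypothetical helper of mine, not the book's network class):

```python
import numpy as np

# Sketch of the cross-entropy cost the chapter discusses:
# C = -mean[y*ln(a) + (1-y)*ln(1-a)] for activations a and targets y.
def cross_entropy_cost(a, y, eps=1e-12):
    a = np.clip(a, eps, 1.0 - eps)  # guard against log(0)
    return float(-np.mean(y * np.log(a) + (1.0 - y) * np.log(1.0 - a)))

print(cross_entropy_cost(np.array([0.9, 0.1]), np.array([1.0, 0.0])))  # small cost
print(cross_entropy_cost(np.array([0.1, 0.9]), np.array([1.0, 0.0])))  # large cost
```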

Chapter 4: A visual proof that neural nets can compute any function
As the name suggests, this chapter shows why a neural net can, in principle, approximate any continuous function.

Chapter 5: Why are deep neural networks hard to train?
Discusses the challenges of training a deep neural network. The vanishing and exploding gradient problems were the main topics.

Chapter 6: Deep learning
An introduction to deep convolutional network architecture and implementation using the Theano library. Recurrent neural nets and long short-term memory units are discussed briefly. The author also shares his thoughts about the future of machine learning and recent deep learning research.

Who should read this book?
- "The purpose of this book is to help you master the core concepts of neural networks, including modern techniques for deep learning."
Who should not read this book?
- "This means the book is emphatically not a tutorial in how to use some particular neural network library. If you mostly want to learn your way around a library, don't read this book!"
Is this book math heavy?
Some chapters are math heavy, some are not. If you can answer the following two questions (a quick worked example follows the list), you are good to go.
* What does it mean to differentiate a function?
* How to find the derivative of a composite function? (Chain rule)
If you already know gradient descent there is absolutely nothing to worry about.
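A quick self-check on the chain rule, in the shape it takes throughout the book (a one-neuron example of my own, not one of the book's exercises):

```latex
% Chain rule applied to a one-neuron example f(x) = \sigma(wx + b):
\[
\frac{df}{dx} = \sigma'(wx + b) \cdot w
\]
% Backpropagation applies exactly this pattern, layer by layer.
```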
Do I need to know programming?
- Programs are written in Python, but if you have previous exposure to programming in any other language you are good to go.
105 reviews, 47 followers
January 26, 2018
Michael explains the simple questions, whys, and hows of neural networks in a fluent, practical way. His clarity about this field and the future of artificial intelligence is appreciable. Michael's book talks to the reader quite often and goes through the thinking procedure until getting to the idea. The thing I love most about this book is that it not only explains the facts, it takes you on a tour of why we do what we do and how it was discovered in the first place. The appendix's discussion of a simple algorithm for human intelligence is really intriguing. I appreciate Mr. Nielsen's honest views on the future of this field.

I personally congratulate Mr. Nielsen for writing this amazing book. It has certainly 'fired up some of my neurons'.
1 review
April 3, 2021
This book is an excellent introduction to the topic. Hardly any previous knowledge is required, and the important ideas are described in a lucid manner and often interactively, either by small widgets in the text or the code accompanying the book. There are also various problems and exercises, and, while the author doesn't recommend doing all of them, and I haven't, those that I did do helped me understand the underlying concepts.

I highly recommend implementing the networks and additional features described in the text yourself. There are several challenges to implementing a neural network, beginning with the need to use backpropagation to compute gradients rather than computing the gradient for each weight and bias individually. Developing your own "pet" neural networks alongside the text makes these challenges much more apparent, as well as being a fun exercise.
29 reviews
January 2, 2021
This book was a good introduction to the area. After reading it, I found myself quite familiar with the essential techniques and was able to follow some research papers in the area quite easily. The book focuses on explaining things in a more intuitive way, rather than formally, and requires no prior knowledge except basic calculus. The writing is straight to the point and flows, making it a pretty fast read.
Alex Aldo
6 reviews, 4 followers
February 8, 2019
Absolutely amazing! A must-read for everyone who wants a kick-start on how neural networks work, with some basic understanding of how to implement them in Python. Many useful exercises and problems test your understanding and motivate you to continue.
Harry Harman
827 reviews, 17 followers
January 9, 2022
in a neural network we don’t tell the computer how to solve our problem. Instead, it learns from observational data, figuring out its own solution to the problem at hand.

It’s not uncommon for technical books to include an admonition from the author that readers must do the exercises and problems.

patience in the face of such frustration is the only way to truly understand and internalize a subject.

In this chapter we’ll write a computer program implementing a neural network that learns to recognize handwritten digits. The program is just 74 lines long, and uses no special neural network libraries. But this short program can recognize digits with an accuracy over 96 percent, without human intervention. Furthermore, in later chapters we’ll develop ideas which can improve accuracy to over 99 percent. In fact, the best commercial neural networks are now so good that they are used by banks to process cheques, and by post offices to recognize addresses

along the way we’ll develop many key ideas about neural networks, including two important types of artificial neuron (the perceptron and the sigmoid neuron)

I just presented the basic mechanics of what’s going on, but it’s worth it for the deeper understanding you’ll attain.

artificial neuron called a perceptron

the main neuron model used is one called the sigmoid neuron

A perceptron takes several binary inputs, x1, x2,..., and produces a single binary output:

Rosenblatt proposed a simple rule to compute the output. He introduced weights, w1, w2, ..., real numbers expressing the importance of the respective inputs to the output. The neuron’s output, 0 or 1, is determined by whether the weighted sum ∑_j w_j x_j is less than or greater than some threshold value.
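A tiny sketch of that rule (my own illustration, not code from the book), using NAND-style weights:

```python
# Rosenblatt's rule as quoted above: output 1 if the weighted sum of
# inputs exceeds the threshold, else 0.
def perceptron(x, w, threshold):
    weighted_sum = sum(wj * xj for wj, xj in zip(w, x))
    return 1 if weighted_sum > threshold else 0

# With weights (-2, -2) and threshold -3 this perceptron computes NAND:
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, perceptron(x, w=(-2, -2), threshold=-3))
# (0,0)->1, (0,1)->1, (1,0)->1, (1,1)->0
```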

However, the situation is better than this view suggests. It turns out that we can devise learning algorithms which can automatically tune the weights and biases of a network of artificial neurons. This tuning happens in response to external stimuli, without direct intervention by a programmer. These learning algorithms enable us to use artificial neurons in a way which is radically different to conventional logic gates.

Just like a perceptron, the sigmoid neuron has inputs, x1, x2,.... But instead of being just 0 or 1, these inputs can also take on any values between 0 and 1. So, for instance, 0.638

Show that in the limit as c → ∞

Obviously, one big difference between perceptrons and sigmoid neurons is that sigmoid neurons don’t just output 0 or 1. They can have as output any real number between 0 and 1, so values such as 0.173... and 0.689... are legitimate outputs. This can be useful, for example, if we want to use the output value to represent the average intensity of the pixels in an image input to a neural network. But sometimes it can be a nuisance. Suppose we want the output from the network to indicate either “the input image is a 9” or “the input image is not a 9”. Obviously, it’d be easiest to do this if the output was a 0 or a 1, as in a perceptron. But in practice we can set up a convention to deal with this, for example, by deciding to interpret any output of at least 0.5 as indicating a “9”, and any output less than 0.5 as indicating “not a 9”.
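The sigmoid function behind those fractional outputs, as a quick sketch:

```python
import numpy as np

# The sigmoid activation: a smoothed step mapping any weighted input
# z = w.x + b into the open interval (0, 1).
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(0.0))   # 0.5, the midpoint used in the "is it a 9?" convention
print(sigmoid(5.0))   # ~0.993, nearly a perceptron's 1
print(sigmoid(-5.0))  # ~0.007, nearly a perceptron's 0
```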

As mentioned earlier, the leftmost layer in this network is called the input layer, and the neurons within the layer are called input neurons. The rightmost or output layer contains the output neurons, or, as in this case, a single output neuron. The middle layer is called a hidden layer, since the neurons in this layer are neither inputs nor outputs. The term “hidden” perhaps sounds a little mysterious – the first time I heard the term I thought it must have some deep philosophical or mathematical significance – but it really means nothing more than “not an input or an output”. Multiple layer networks are sometimes called multilayer perceptrons or MLPs.

Up to now, we’ve been discussing neural networks where the output from one layer is used as input to the next layer. Such networks are called feedforward neural networks. This means there are no loops in the network – information is always fed forward, never fed back.

However, there are other models of artificial neural networks in which feedback loops are possible. These models are called recurrent neural networks. The idea in these models is to have neurons which fire for some limited duration of time, before becoming quiescent. That firing can stimulate other neurons, which may fire a little while later, also for a limited duration. That causes still more neurons to fire, and so over time we get a cascade of neurons firing.

We humans solve this segmentation problem with ease, but it’s challenging for a computer program to correctly break up the image. We’ll focus on writing a program to solve the second problem, that is, classifying individual digits.

To understand why we do this, it helps to think about what the neural network is doing from first principles.

We’ll use the MNIST data set, which contains tens of thousands of scanned images of handwritten digits, together with their correct classifications. MNIST’s name comes from the fact that it is a modified subset of two data sets collected by NIST, the United States’ National Institute of Standards and Technology

let’s simply ask ourselves: if we were declared God for a day, and could make up our own laws of physics, dictating to the ball how it should roll, what law or laws of motion could we pick that would make it so the ball always rolled to the bottom of the valley?

∇C is just a piece of notational flag-waving, telling you “hey, ∇C is a gradient vector”

where η is a small, positive parameter (known as the learning rate)

we’ll keep decreasing C until – we hope – we reach a global minimum.

An idea called stochastic gradient descent can be used to speed up learning. The idea is to estimate the gradient ∇C by computing ∇Cx for a small sample of randomly chosen training inputs. By averaging over this small sample it turns out that we can quickly get a good estimate of the true gradient ∇C

We can think of stochastic gradient descent as being like political polling: it’s much easier to sample a small mini-batch than it is to apply gradient descent to the full batch, just as carrying out a poll is easier than running a full election. For example, if we have a training set of size n=60,000, as in MNIST, and choose a mini-batch size of (say) m = 10, this means we’ll get a factor of 6,000 speedup in estimating the gradient!
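The arithmetic behind the polling analogy, as a toy sketch of my own (sampling a mean stands in for sampling the gradient):

```python
import random

# Toy sketch of the polling analogy: a small random sample estimates the
# full-population average, just as a mini-batch gradient estimates the
# full-batch gradient, at a fraction of the cost.
n, m = 60_000, 10
data = [random.gauss(2.0, 1.0) for _ in range(n)]
sample = random.sample(data, m)
print(f"speedup factor: {n // m}")     # 6000, as in the quote
print(sum(data) / n, sum(sample) / m)  # full vs. sampled estimate
```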

There’s quite a bit going on in this equation, so let’s unpack it piece by piece.

Update the network’s weights and biases by applying gradient descent using backpropagation to a single mini batch. The "mini_batch" is a list of tuples "(x,y)", and "eta" is the learning rate.
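The update that docstring refers to, written out in the book's notation (averaging the per-example gradients over the m examples of the mini-batch):

```latex
\[
w \to w' = w - \frac{\eta}{m} \sum_{x} \frac{\partial C_x}{\partial w},
\qquad
b \to b' = b - \frac{\eta}{m} \sum_{x} \frac{\partial C_x}{\partial b}
\]
```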

If you’re in a rush you can speed things up by decreasing the number of epochs, by decreasing the number of hidden neurons, or by using only part of the training data.

once we’ve trained a network it can be run very quickly indeed, on almost any computing platform. For example, once we’ve learned a good set of weights and biases for a network, it can easily be ported to run in Javascript in a web browser, or as a native app on a mobile device.

to obtain these accuracies I had to make specific choices for the number of epochs of training, the mini-batch size, and the learning rate, η. As I mentioned above, these are known as hyper-parameters for our neural network, in order to distinguish them from the parameters (weights and biases) learnt by our learning algorithm.

perhaps the outcome will be that we end up understanding neither the brain nor how artificial intelligence works!

Networks with this kind of many-layer structure – two or more hidden layers – are called deep neural networks.

people now routinely train networks with 5 to 10 hidden layers. And, it turns out that these perform far better on many problems than shallow neural networks

If you’re not crazy about mathematics you may be tempted to skip the chapter, and to treat backpropagation as a black box whose details you’re willing to ignore. Why take the time to study those details? The reason, of course, is understanding

implementing matrix multiplication, vector addition, and vectorization
Brian
35 reviews, 4 followers
March 11, 2019
Simple and informative introduction to many of the most important concepts in this rapidly developing field.

While I read this for the technical details, I was also struck by some of the more abstract discussions on intelligence. For example, this quote stuck with me:

"In the early days of AI research people hoped that the effort to build an AI would also help us understand the principles behind intelligence and, maybe, the functioning of the human brain. But perhaps the outcome will be that we end up understanding neither the brain nor how artificial intelligence works!"

Along these lines, if you have any interest in such musings but not in the technical specifics of neural networks, I suggest reading the short appendix titled "Is there a simple algorithm for intelligence?"
Ray
45 reviews, 5 followers
August 25, 2019
While reading, I knew that I liked this text, and also that I was learning a lot from it, but I couldn't quite pin down what exactly made it so effective as a didactic tool. It was in reading the last section of the last chapter ("Deep Learning"), and the subsequent appendix that I put my finger on how Nielsen was managing to communicate so potently.

How should one go about teaching an emerging field or technology, while developments are still ongoing? What I learned from the way "Neural Networks and Deep Learning" is written is that it helps to think like a science historian. This text spends a lot of time developing the key ideas upon which mountains of research are based. It gives the reader the tools to understand what it means when some new research introduces a new architecture, or learning algorithm, or activation function. I am deeply impressed by how Nielsen managed to write about something that was and is the subject of active research, and do so in a way that is enduring and useful even years later.

Of course, there are many other merits to the content and writing of this book. Nielsen is very formidable at constructing arguments. There's a certain mathematical and philosophical maturity that this text has that impresses at every turn. Gaps in arguments are dutifully pointed out, and assumptions are teased and unearthed at every turn. Because of this, I can't imagine a better introduction to neural networks for those with the proper prerequisite knowledge.
22 reviews
August 31, 2016
The book's most redeeming quality is that the author anticipates the follow-up questions the reader might have and keeps on answering them. As a result, one gets a much more complete idea of the individual chapters and the book as a whole.

On top of that, the exposition is relatively accessible and easy to follow, although there are a few weak spots. The book does get technical (calculus), but it also explains most formulas in plain English and makes it easy to fast-forward and just get a rough idea of what's going on.

Reading this book will teach you about neural networks and how they learn first, about deep networks (including convolutional networks) and how their learning differs second, and then give you a tour of recent results and approaches.
3 reviews
September 12, 2021
Reading this book was nothing but suffering for me.

1- The code is written in Python 2.7 and won't run in Python 3 without extensive changes.
2- Nope! Copying and pasting the code doesn't work. You need the complete serialized resources locally stored.
3- Why all the noise? The whole book could have been written in less than 50 pages, but no, reading it feels like talking to Grandpa Simpson. Nothing is clear, and the sentence constructions leave a lot to be desired.

4+ Useful for total beginners in feedforward deep learning (who don't care about other kinds of neural networks or the Bayesian side of things)
5+ Publicly available.

Thanks for the work.
Josh
153 reviews, 11 followers
December 16, 2016
Good introduction to neural networks that is more beginner-friendly than most. You'll still need a good understanding of calculus if you want to understand the more technical bits (chapter 2 in particular). Another good (non-book) source for this material is the excellent online Stanford course CS231n.
Sowmitra Das
10 reviews, 12 followers
October 26, 2017
Literally, the BEST book I have come across for learning about neural networks and deep learning. It is an absolute treat for any beginner, if not essential. The amount of prior knowledge required is close to null. Each section and concept is extremely well constructed, motivated and explained. I couldn't suggest a better book to anyone starting to learn about neural networks.
Chad
446 reviews, 75 followers
October 30, 2018
OK, so I write book reviews for my textbooks. Hope that's cool here. I'm getting prepared for the upcoming winter quarter, where I will be teaching my first course ("Data Science Methods for Clean Energy Research") pretty much solo. I'm super stoked, but also feeling a tad of imposter syndrome: I only took a short intro course in data science, and the rest is self-taught using Python in my own research. I have never taken a formal course on the material I will be teaching. OK, so some of it, yes: the first few weeks focus on common statistical methods like regression. This book is a free online textbook covering neural networks that the course material is based on. I decided to read the textbook through so I could formalize the stuff I've learned on the go. I'm pretty glad I did, too!

The textbook builds a neural network almost from scratch using nothing but the Python package NumPy. I was pretty impressed: I had been relying on the pre-built methods in scikit-learn. The approach is useful, even if you do end up using more complex pre-built packages later on, because it gives you the bottom-up view: which blocks are absolutely essential to neural networks, and which are performance boosters.

The book uses the MNIST data set to demonstrate the power of neural networks. MNIST is a collection of handwritten numbers (0-9). The basic idea is: can we build a neural network that can correctly identify each number? This is the technology that makes possible text recognition programs such as the apps that allow you to deposit checks with your phone. By the end of the book, the reader can build a neural network that reads handwritten numbers with 99.6% accuracy. Pretty impressive! And that's just scratching the surface. The last two chapters give an introduction to convolutional neural networks that can be used for image recognition programs.

The book is a very clear text that has a perfect balance of what I think are the three essential elements of data science learning: intuition, math, and code. The author takes a conversational tone; you could imagine being in the classroom. I felt I had a pretty good grasp of the basics of neural networks by the end, and now I won't be using them like a black box anymore.
Liudas
33 reviews, 2 followers
February 1, 2020
The nuts and bolts of neural networks: how they work, and how they can be used to solve pattern recognition problems.

A deep dive into ANNs, some CNNs, and basic concepts of RNNs/LSTMs/Boltzmann machines. The book also contains very valuable links to the original papers of deep learning's "godfathers" and current "gods", as well as tips and tricks on network performance optimisation techniques.

I suppose you would read this to understand the fine details of neural nets and deep learning; therefore, before reading this I suggest revisiting the essence of calculus and linear algebra (mine was rusty, having gone largely unused for 15+ years after my CS studies). I found these videos helpful:
- Essence of Calculus
- Essence of Linear Algebra
- Neural Nets

If you want "surface scratcher/introduction" to deep learning and neural nets I suggest to take some course on Udemy or other e-learning platform before reading this. I did it myself and missed "deeper" parts on how they actually work - "all the math and why" - and this book is perfect for that.

As an additional benefit, there are good philosophical insights on AI in general.
188 reviews, 4 followers
March 23, 2020
I read this book at about the right time in my personal development. It provides a good balance between, on the one hand, description and contemplation, and on the other hand, very concrete examples, problems and code. It is possible to read the book without programming, but actually doing the programs (and the occasional maths) really changed the experience.

All the exercises and nearly all the themes revolve around the classic example of recognizing handwritten digits. This is a good choice because it avoids getting lost in the complexity of the problem itself, plus it permits a fairly objective evaluation of the different techniques such as stochastic gradient descent, several approaches to regularization, softmax output, the cross-entropy cost and various activation functions.

Nielsen expresses an interest in so-called deep networks, but warns at the same time that the goal is not to have as many layers as possible, but to solve problems. If a problem can be solved to a high accuracy by 5 convolutional layers and 2 fully connected ones, then going any deeper is an exercise in futility.

The author studiously avoids discussing artificial intelligence, relegating all wildly ambitious goals to an appendix (which happens to form very interesting reading in its own right).
7 reviews
May 12, 2024
Michael Nielsen's "Neural Networks and Deep Learning" provides a remarkably clear and thorough introduction to the fundamentals of neural networks and deep learning. The book is targeted towards individuals with a background in undergraduate-level linear algebra and calculus, as well as some familiarity with Python. If you fit this description and are eager to delve into the world of neural networks and deep learning, this book is an excellent starting point.

One of the book's greatest strengths lies in its focus on the core principles of machine learning rather than fleeting trends. This approach ensures the material remains relevant and insightful, even though the book was published some time ago. However, it's worth noting that the code examples in the Deep Learning chapter rely on Theano, a library that has fallen out of favor in recent years. While this doesn't detract from the theoretical understanding gained, readers should be prepared to adapt the code to more contemporary frameworks like TensorFlow or PyTorch.

Overall, "Neural Networks and Deep Learning" is a highly recommended read for beginners who possess the necessary mathematical and programming background. Nielsen's lucid explanations and emphasis on fundamental concepts offer a strong foundation for further exploration in this rapidly evolving field.
Christian
46 reviews
January 20, 2022
Who's this for:

If you want a hands-on introduction to neural networks, this book is a good start. Ideally you already have some experience with machine learning and the commonly used math behind it. I think the sweet spot for interested readers is if you have a) a general foundation in machine learning and b) practical experience implementing at least simple networks. The book then connects both of these parts well.

You can find the whole book here: http://neuralnetworksanddeeplearning....

I liked:

- Online, free and with interactive parts
- It's nice that there are complete examples with the added explanations
- It introduces many current concepts in NNs and puts them into the right context to then be able to dig deeper (cross-entropy, overfitting, regularization, hyperparameter tuning, CNNs, RNNs etc.)

I didn't like:

Nothing big comes to mind. The website could use some tweaks (width of text, index numbering, etc.) but content-wise it delivers what it promises.

Things I learned:

It's a bit out of scope to retell everything here for such a book, but generally it helped to connect some of the more theoretical contexts I knew with the more applied side (which a tutorial for a NN library would deliver as well).
Giorgi
26 reviews
July 30, 2024
These days AI is everywhere, and it is easy to be overwhelmed by all the hype around it. Once these first impressions subside you might ask: but how does this work? Better prepare for another layer of hype and buzzwords: transformers, attention is all you need, billions of parameters, RAG, and who knows what. At the end of your research you might also come to the sad conclusion that it is not quite obvious how LLMs (or what we call AI today) work. But that should not deter you from trying to understand as much as possible, and to do so you should start with the fundamentals.
This book is exactly that kind of fundamental knowledge, the sort any solid understanding must be built upon. It walks through simple shallow feedforward networks, their history, and why and how they work, and builds on that until it reaches deep learning.
What I really enjoyed about it is how accessible it is, even for people without much prior knowledge - it also comes with code examples you can experiment with.
It is truly remarkable that neural networks can learn patterns in data as well as they do, and this book is remarkable for making these algorithms simple to understand.
Max
50 reviews, 3 followers
April 11, 2018
This stands out among the common introductions to machine learning with artificial neural networks because the author's presentation of the mathematical formalism is refreshingly clear and precise. Symbols do not suddenly change their meanings and indices are well defined. The author does not gloss over the mathematics, but enhances derivations with heuristic explanations and interactive illustrations that both serve well to strengthen intuitions. I guess Nielsen's background in theoretical physics shows. I also enjoyed the parts that were more philosophical in spirit.

This is a great tutorial for the basics of deep learning and a recommendable survey of seminal results up to about 2015. Naturally, the most recent developments of this very fast moving field could not be included. I would have loved to find more content geared towards other applications than classifying images (recognizing digits mostly), like natural language processing.
Yass Fuentes
83 reviews, 9 followers
March 22, 2019
I found this a very interesting book; it is easy to understand and lays out the foundations of neural networks in an enjoyable, simple way. I give it a five because it deserves it: the author has made the book openly available, it is clearly the product of a lot of work, and the content is high quality.

As a criticism, the last chapter perhaps feels a bit rushed. I understand there are many things to talk about, and that once the general framework has been established and the book is coming to an end, when you want to convey how open the field is, you can only give a brief introduction to the state of the art.

Even so, it is very good, and I recommend it to anyone who wants to get into the fundamentals of neural networks.
Venkatesh-Prasad
223 reviews
February 3, 2019
A nice introduction to the foundational aspects of NN and DL. The use of interactive graphics helps convey the intuition behind the workings of NN and DL. Even so, being familiar with math (calculus and linear algebra) and being patient will help you get the most out of the book. That said, I applaud the author's attempt to simplify the topic. A good online resource to start learning about NN and DL.
1 review
December 27, 2020
Very well written book! It doesn't assume any prerequisites. I enjoyed the writing style, fun and deep at once. It provides the reader with a cohesive, clear picture of the field. And most importantly, it answers questions like "where did the idea originate and how did it evolve?". Questions we don't get easily from other resources but which, from my point of view, are crucial for a research-oriented understanding of the field! Highly recommended book!
Alex Esteban
111 reviews, 1 follower
August 4, 2023
3 ★

Exercises without solutions are a big con. I would rather not have exercises than have them unsolved.

Problems of deep networks such as the unstable gradient problem were not answered in Chapter 6, as the author claimed they would be; only one or two paragraphs are dedicated to it, saying that we don't have the knowledge to answer it. The hype was deceiving.

I think this book is nice for getting started with neural networks, but some explanations may be too shallow for developing useful knowledge.
Mario Aburto
16 reviews
May 13, 2017
A very good book to understand the basics of neural networks, the problems that come with them and some improvements to address those problems. The explanations are very clear, and the text is easy to follow.

Additionally, once read, the text is useful for future reference.
