Foundations of Statistical Natural Language Processing

Name: Foundations of Statistical Natural Language Processing
Rating: 4.12 (18 reviews)
ISBN: 9780262133609

Christopher D. Manning, Hinrich Schütze

Statistical approaches to processing natural language text have become dominant in recent years. This foundational text is the first comprehensive introduction to statistical natural language processing (NLP) to appear. The book contains all the theory and algorithms needed for building NLP tools. It provides broad but rigorous coverage of mathematical and linguistic foundations, as well as detailed discussion of statistical methods, allowing students and researchers to construct their own implementations. The book covers collocation finding, word sense disambiguation, probabilistic parsing, information retrieval, and other applications.

Genres: Computer Science, Linguistics, Nonfiction, Science, Artificial Intelligence, Language, Programming

679 pages, Hardcover

First published June 18, 1999

39 people are currently reading

908 people want to read

About the author

Christopher D. Manning

8 books, 16 followers

Professor of Linguistics and Computer Science, Natural Language Processing Group, Stanford University

Community Reviews

5 stars: 97 (36%)
4 stars: 115 (43%)
3 stars: 42 (15%)
2 stars: 9 (3%)
1 star: 2 (<1%)

Ushan

801 reviews, 79 followers

December 29, 2010

As the great American anthropologist-linguist Edward Sapir put it, all grammars leak. Some sentences are obviously grammatical, some are obviously ungrammatical, but there are gray areas; native speakers of English disagree on whether sentences such as "Who did Jo think said John saw him?" and "The boys read Mary's stories about each other" are grammatical. A way of resolving this difficulty is to look at a large corpus of texts; sentence structures that occur there often are grammatical, sentence structures that never occur are ungrammatical, and those that occur rarely are in a gray area. We will also need to assign a nonzero probability to sentence structures that we have never seen before, higher if they resemble ones that we've seen before than if they don't. Before Noam Chomsky invented them in 1957, neither "Colorless green ideas sleep furiously" nor "Furiously sleep ideas green colorless" ever occurred in an English text, but sentences like the former occurred much more frequently than sentences like the latter. This book discusses various algorithms used in corpus-based linguistics: parsing text, aligning text in two languages, deciding on the meaning of ambiguous words such as "plant" (a living organism from the kingdom Plantae, or a factory) and "interest" (curiosity, or share in a company). These algorithms do not always work correctly, but they work well enough to be used in the real world.
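
The point about giving unseen sentences a nonzero probability can be illustrated with a small sketch. It is not taken from the book or from this review: it swaps sentence structures for plain word bigrams, uses add-one (Laplace) smoothing as one simple way to avoid zero probabilities, and relies on a made-up four-sentence corpus. The names `train_bigram`, `log_prob`, and `corpus` are hypothetical.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count unigrams and bigrams over tokenized sentences with boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams


def log_prob(tokens, unigrams, bigrams):
    """Add-one smoothed log-probability of a token sequence under the bigram counts."""
    vocab_size = len(unigrams)  # number of distinct tokens, including the markers
    padded = ["<s>"] + tokens + ["</s>"]
    total = 0.0
    for prev, cur in zip(padded, padded[1:]):
        # Adding 1 to every bigram count keeps unseen bigrams above zero.
        total += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
    return total


# Toy corpus (an assumption for illustration only); neither test sentence occurs in it.
corpus = [
    "colorless liquids evaporate".split(),
    "green ideas spread".split(),
    "old ideas sleep".split(),
    "storms rage furiously".split(),
]
unigrams, bigrams = train_bigram(corpus)

forward = log_prob("colorless green ideas sleep furiously".split(), unigrams, bigrams)
backward = log_prob("furiously sleep ideas green colorless".split(), unigrams, bigrams)

# Both sentences are unseen, yet the one that reuses observed word pairs scores higher.
print(forward > backward)
```

Both sentences receive a finite log-probability, and the word order that shares bigrams with the corpus is ranked above its reversal. A realistic model would need a far larger corpus and a better smoothing method than add-one.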

computer-science linguistics

Michael Shaw

3 reviews, 2 followers

December 7, 2011

A must-read for anyone looking to get into NLP. It teaches from first principles, including briefly touching on information theory and entropy. I felt it was well grounded and proceeded at a good pace. No prior knowledge is required.

I picked this up at the same time as "Speech and Language Processing" (Jurafsky & Martin), and while Foundations is a much narrower book (making up for it with depth), I think it's for the better, as I found SLP far too broad and thin.

Emmi

135 reviews

December 14, 2017

The explanation of the basic ideas of NLP is very good, but this book alone is not enough to get the whole picture of NLP. It is better to read "Speech and Language Processing" (by Dan Jurafsky and James H. Martin) as well.

Anna Kiepura

6 reviews

Read

August 6, 2023

Brutal.

Terran M

78 reviews, 107 followers

May 19, 2018

A classic on natural language processing. If you know nothing about natural language processing, or have a piecemeal understanding, this book will give you an overview of the field in a rigorous and yet comprehensible way.

Note that this book was written in 1999, so it far predates the current practice of using recursive neural networks for natural language. This book will give you exactly what it says in the title, Foundations, not “modern best practices.”

You may also be interested in Introduction to Information Retrieval by the same authors.

Daniel Smith

6 reviews

February 4, 2018

A bit dated now, but still a solid introduction to NLP.

Ane

200 reviews, 2 followers

February 22, 2019

A very good introductory book on NLP, though a little outdated.

nonfiction science

Luuk Suurmeijer

1 review, 1 follower

October 21, 2020

This book is an exceptional introduction to the world of statistical methods for NLP tasks. The math is fairly accessible, and it continues to be my main reference in this field.

Farzam

12 reviews, 4 followers

May 26, 2021

A little bit outdated and also tough to read, but I liked it. I only needed to cover a few chapters; however, I think some more practical examples and coding snippets would be extremely helpful.

Douglas Summers-Stay

Author of 1 book, 50 followers

June 14, 2015

This 1999 book does a good job of explaining the different areas of statistical NLP. It was easy to read and very clear, even the formula-heavy sections. The sections on collocations (multi-word phrases) and verb subcategorization were largely new to me.
The problems that natural-language research has faced are similar to the ones computer vision faces, but easier. As a result, NLP researchers have made a lot more progress on the higher-level organization of concepts, instead of getting stuck at the level of simple features and object recognition as computer vision has.

non-fiction

David

Author of 20 books, 406 followers

December 4, 2011

This and Speech and Language Processing by Jurafsky and Martin are the two big introductory texts in natural language processing. I prefer the Jurafsky book; it goes into more detail, has more examples, and is written more for use as a class text. The Manning and Schütze book is much more mathematically oriented and goes into more detail on algorithms, so if you're focusing on the statistical aspect more than the language aspect, refer to this book. Ideally, you probably want both.

computer-science linguistics machine-learning

Rachid El guerrab

1 review

May 10, 2012

The book needs more integrated walk-through examples, not just simple illustrations for specific paragraphs.

It could also benefit from a discussion of NLP software and possible architectures for the domain.