Python 3 Text Processing with NLTK 3 Cookbook

This book will show you the essential techniques of text and language processing. Starting with tokenization, stemming, and the WordNet dictionary, you'll progress to part-of-speech tagging, phrase chunking, and named entity recognition. You'll learn how various text corpora are organized, as well as how to create your own custom corpus. Then, you'll move on to text classification with a focus on sentiment analysis. And because NLP can be computationally expensive on large bodies of text, you'll try a few methods for distributed text processing. Finally, you'll be introduced to a number of other small but complementary Python libraries for text analysis, cleaning, and parsing.

This cookbook provides simple, straightforward examples so you can quickly learn text processing with Python and NLTK.

305 pages, Kindle Edition

First published January 1, 2014

13 people are currently reading
47 people want to read

About the author

Jacob Perkins is an open source programmer, NLP hacker, and startup entrepreneur. He is currently the CTO & co-founder of Weotta, a semantic search engine for local events, activities, restaurants and more. His major open source contributions are to NLTK, a Python toolkit for natural language processing, and Seahorse, the Gnome encryption key application. He contributed a chapter to the Bad Data Handbook and runs a site for NLTK demos and APIs.

Ratings & Reviews

Community Reviews

5 stars: 3 (17%)
4 stars: 10 (58%)
3 stars: 3 (17%)
2 stars: 0 (0%)
1 star: 1 (5%)
Displaying 1 - 3 of 3 reviews
15 reviews, 3 followers
October 23, 2014
In its introduction, the Python 3 Text Processing with NLTK 3 Cookbook claims to skip the preamble and ignore pedagogy, letting you jump straight into text processing. Although it does skip the preamble, I would argue that this statement is false – it definitely does not skip the pedagogy. The examples this book shows you are practical, understandable and well-explained.

The book is intended for those familiar with Python who want to use it in order to process natural language. Following this credo, there is no discussion about software design and no attempt to make especially elegant code. I tend to nitpick at code quality, and although there was nothing that upset me in the code examples here, they didn’t awe me with their subtle beauty. However, the raw power of NLTK, combined with the flexibility of Python, impressed me deeply.

The author takes you on a trip through a large section of natural language processing, starting with text tokenization and using WordNet. I really enjoyed ideas on computing the semantic “distance” between different words by traversing subset trees. It then continues on to show you how to replace and correct words, tag parts of speech in texts, chunk texts and transform text chunks, and how to classify text. The whole thing is rounded off by a discussion on distributed processing with some nice examples of how to use execnet as a simple but effective message passing interface.
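
To make the idea of tree-based semantic "distance" concrete, here is a minimal sketch, not the book's code and not NLTK's API. It assumes a tiny hand-made "is-a" tree (the `HYPERNYM` table and every function name are hypothetical), counts the edges between two words through their nearest shared ancestor, and turns that into a score of 1 / (distance + 1), the usual form of a path-based similarity.

```python
# Toy "is-a" tree: child -> parent. Hypothetical data for illustration only;
# a real application would look these relations up in WordNet instead.
HYPERNYM = {
    "dog": "canine",
    "wolf": "canine",
    "canine": "mammal",
    "cat": "feline",
    "feline": "mammal",
    "mammal": "animal",
    "bird": "animal",
}


def ancestors(word):
    """Return [word, parent, grandparent, ...] up to the root of the tree."""
    chain = [word]
    while chain[-1] in HYPERNYM:
        chain.append(HYPERNYM[chain[-1]])
    return chain


def path_distance(a, b):
    """Count edges between a and b via their nearest shared ancestor.

    Returns None when the two words share no ancestor in the toy tree.
    """
    steps_from_a = {node: i for i, node in enumerate(ancestors(a))}
    for steps_from_b, node in enumerate(ancestors(b)):
        if node in steps_from_a:
            return steps_from_a[node] + steps_from_b
    return None


def path_similarity(a, b):
    """Map distance to a score in (0, 1]; identical words score 1.0."""
    distance = path_distance(a, b)
    return None if distance is None else 1.0 / (distance + 1)


if __name__ == "__main__":
    for pair in [("dog", "wolf"), ("dog", "cat"), ("dog", "bird")]:
        print(pair, path_distance(*pair), path_similarity(*pair))
```

In this toy tree, "dog" and "wolf" sit closer together than "dog" and "cat", because the first pair meets at "canine" while the second has to climb to "mammal".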

Reading all these examples made me want to go out and write a search engine or a text classifier – with NLTK, daunting tasks in this field become easy.

Above and beyond the practical text processing material in this book, what I enjoyed most was its coverage of various machine learning algorithms. The book is definitely not about machine learning, but it gives you a glimpse into that world, so that you can understand what you’re doing even if you’re just using what different libraries give you out of the box. I appreciated these more extended explanations, which I often miss in texts involving machine learning.
Petr
437 reviews
February 6, 2017
It is great when the author of a tool/library/software takes you on a tour through their creation. This book even more so. Perkins wrote a wonderful guide for his Natural Language Toolkit. The examples are clear, he explains the mechanisms but also keeps the book well structured for just quick reference or "cooking" (it is a cookbook after all).