Derive useful insights from your data using Python. You will learn both basic and advanced concepts, including text and language syntax, structure, and semantics. You will focus on algorithms and techniques, such as text classification, clustering, topic modeling, and text summarization.
Text Analytics with Python teaches you the techniques related to natural language processing and text analytics, and you will gain the skills to know which technique is best suited to solve a particular problem. You will look at each technique and algorithm with both a bird's eye view to understand how it can be used as well as with a microscopic view to understand the mathematical concepts and to implement them to solve your own problems.
What You Will Learn:
Understand the major concepts and techniques of natural language processing (NLP) and text analytics, including syntax and structure
Build a text classification system to categorize news articles, analyze app or game reviews using topic modeling and text summarization, and cluster popular movie synopses and analyze the sentiment of movie reviews
Implement Python and popular open source libraries in NLP and text analytics, such as the natural language toolkit (nltk), gensim, scikit-learn, spaCy and Pattern
Who This Book Is For: IT professionals, analysts, developers, linguistic experts, data scientists, and anyone with a keen interest in linguistics, analytics, and generating insights from textual data
If you are new to text analytics, natural language processing and all that jazz, this will be a good introductory book to guide you through the concepts, the high-level ideas, and what will or should be included in the pipeline. Plus it comes with a Python implementation (source code available on GitHub), which is great. However, bear in mind that the material here is fairly simple and conventional and does not go in depth on any one technique, which is not unusual for this kind of "introductory" book. I do wish the author had dived a bit deeper into discussing the results of the implementation rather than just showing the results and visualisations. Still a good read, though.
It's a very useful book, though it's a little confusing for beginners. In the "lemmatization" section, for example, the code was too complex; it can be written in three lines with the pattern library:

    from pattern.en import lemma
    def lemmatization(text):
        return " ".join([lemma(wd) for wd in text.split()])
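
For readers without Pattern installed, the same token-by-token idea (split on whitespace, lemmatize each word, rejoin) can be sketched with a pluggable lemma function. This is only an illustration: `TOY_LEMMAS`, `toy_lemma`, and `lemmatize_text` are hypothetical names invented here, and the tiny lookup table is a stand-in for a real lemmatizer such as `pattern.en.lemma`, not a substitute for one.

```python
from typing import Callable

# Hypothetical toy lookup table; a real lemmatizer would replace this.
TOY_LEMMAS = {"running": "run", "ran": "run", "books": "book", "were": "be"}


def toy_lemma(word: str) -> str:
    """Return the lemma from the toy table, or the lowercased word if unknown."""
    key = word.lower()
    return TOY_LEMMAS.get(key, key)


def lemmatize_text(text: str, lemma_fn: Callable[[str], str] = toy_lemma) -> str:
    """Split on whitespace, lemmatize each token, and rejoin with single spaces."""
    return " ".join(lemma_fn(token) for token in text.split())


print(lemmatize_text("The books were running"))  # prints: the book be run
```

Because the text is split on whitespace only, punctuation stays attached to its word (for example "books," would not match the table), so a proper tokenizer would likely be needed for real text.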