Doug Lautzenheiser

45%
Flag icon
While stemming can create non-real words, such as 'thu' (from 'thus'), as shown in the previous example, a technique called lemmatization aims to obtain the canonical (grammatically correct) forms of individual words—the so-called lemmas. However, lemmatization is computationally more difficult and expensive compared to stemming and, in practice, it has been observed that stemming and lemmatization have little impact on the performance of text classification (Influence
Python Machine Learning: Machine Learning and Deep Learning with Python, scikit-learn, and TensorFlow
Rate this book
Clear rating
Open Preview