Text Processing with Ruby

Most information in the world is in text format, and programmers often find themselves needing to make sense of the data hiding within. You want to do this efficiently, avoiding labor-intensive, manual work—and Ruby is ideally suited to this task.

Text Processing with Ruby takes a practical approach to working with text:

First, Acquire: Explore Ruby’s core and standard library, and what’s possible with IO and its derived classes like File. Extract text into your Ruby programs from the file system and standard input. Process delimited files such as CSVs, and write utilities that interact with other programs in text-processing pipelines. Process web pages with Nokogiri to pull out information from even the messiest of HTML, and decipher character encoding mysteries.
Second, Transform: Use regular expressions to match, extract, and replace patterns in text. Write a parser using Ruby’s StringScanner library. Use Natural Language Processing techniques to extract keywords and implement fuzzy searching.
Finally, Load: Write the transformed text and data to standard output, files, and other processes. Serialize text into JSON, XML, and CSV, and use ERB to create more complex formats.
You’ll soon be able to tackle even the most enormous and entangled text with ease, scything through gigabytes of data and effortlessly extracting the bits that matter.
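As a rough illustration of the Acquire, Transform, Load sequence described above, here is a minimal sketch using only the csv and json libraries that ship with Ruby. It is not taken from the book: the sample data, the `RAW` constant, and the method names `acquire`, `transform`, and `load_json` are all hypothetical, chosen only to make the three stages visible.

```ruby
require "csv"
require "json"

# Hypothetical sample input; the description above does not specify any data.
RAW = <<~TEXT
  name,note
  Ada,"Joined 2015-11-15, left 2016-01-02"
  Bob,"No dates here"
TEXT

# Acquire: parse delimited text into an array of hashes keyed by header.
def acquire(text)
  CSV.parse(text, headers: true).map(&:to_h)
end

# Transform: use a regular expression to pull ISO-style dates out of each note.
def transform(rows)
  rows.map do |row|
    { "name" => row["name"], "dates" => row["note"].scan(/\d{4}-\d{2}-\d{2}/) }
  end
end

# Load: serialize the transformed records as JSON.
def load_json(records)
  JSON.generate(records)
end

# Write the result to standard output so it can feed another program.
puts load_json(transform(acquire(RAW)))
```

Keeping each stage in its own method means the input source or output format can be swapped (for example, reading from standard input instead of a string) without touching the other two stages.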

What You Need

This book requires a passing familiarity with the Ruby programming language, and assumes that you already have Ruby installed on your computer.

200 pages, ebook

First published November 15, 25

About the author

Rob Miller

24 books, 2 followers
