Offers both students and professionals guidance on large-scale information systems. This resource describes a new generation of techniques for compressing, storing, and retrieving information--both machine-readable text and optically scanned documents. Appropriate for information science and information retrieval courses.
The title will make you laugh in 2020, so why would I recommend this citation from stanford.edu/~backrub over not only the gushing river of nonsense that is the last 5 years of arXiv.ml, but even the comparatively solid books from 2010 and 2000 on statistics and ML?
Because
[[draft review. goodreads does not have a save function.]]
Much of what machine learning is about *isn’t* on-the-fly computation; it’s about storing good representations which then index
In an online world where HN has way too much mindshare, it’s relaxing to step back to the days of payphones, cassette voice mails, and yellow page directories.
This is probably a bit dated due to advances in the state of the art, but it is still a great introduction to the topic of document storage and search.
Compressing and Indexing Documents and Images by Ian H. Witten is an essential resource for anyone dealing with large data volumes. This comprehensive guide covers the principles, techniques, and methodologies of data compression and indexing. The book strikes a perfect balance between theory and practical applications, with clear explanations and helpful examples. Its attention to detail and coverage of both textual documents and image data make it a valuable asset for professionals in information management. Highly recommended for those seeking efficient data handling strategies.
This book was instrumental in my research and writing process for an article I recently produced on global gigabyte pricing. It provided me with the necessary insights and knowledge to analyze and understand the complexities of data compression and indexing, enabling me to draw meaningful conclusions. If you're interested, you can find the article at https://hellosafe.pt/telecomunicacoes...
A refresher on Huffman codes, bitmaps, indexing, and the compression of images and textual images. The book is a bit old (the author is still concerned about gigabytes); nevertheless, many of its practices are still applicable today.