This book describes OCLC's contributions to the transformation of the Internet from a web of documents to a Web of Data. The new Web is a growing "cloud" of interconnected resources that identify the things people want to know about when they approach the Internet with an information need. The linked data architecture has achieved critical mass just as it has become clear that library standards for resource description are nearing obsolescence. Working for the world's largest library cooperative, OCLC researchers have been active participants in the development of next-generation standards for library resource description. By engaging with an international community of library and Web standards experts, they have published some of the most widely used RDF datasets representing library collections and librarianship. This book focuses on the conceptual and technical challenges involved in publishing linked data derived from traditional library metadata. This transformation is a high priority because most searches for information start not in the library, nor even in a Web-accessible library catalog, but elsewhere on the Internet. Modeling data in a form that the broader Web understands will project the value of libraries into the Digital Information Age. The exposition is aimed at librarians, archivists, computer scientists, and other professionals interested in modeling bibliographic descriptions as linked data. It aims to achieve a balanced treatment of theory, technical detail, and practical application.
You might be a metadata nerd if you're reading this in-depth analysis of OCLC's latest experiment over Christmas break, but that's okay; I did too. Though the book is a bit overwhelming at times, Godby does an excellent job of making incredibly complicated text mining and machine learning processes seem like just a step above what many of us are already doing to batch-update records. It's just a computer extracting patterns and co-locating records to form Work Clusters faster than we ever could. That's definitely a threatening thought for the "traditional" cataloger (unless you're in Special Collections!), but extremely exciting for the rest of us who have to work at scale. OCLC starts at the simplest level of documented and controlled relationships, those captured within authority records, and then builds on those relationships and URIs to generate Works. Then they go for the bigger goal: standardizing larger bodies of free-text data, such as publisher names, locations, and events, through text mining. Of course, anyone working in serials can tell you that there may be an upper limit to this work, because although corporations are slightly more stable than journal titles, someone still has to update the information behind the URI somewhere. Godby also does a really great job of selling the idea of extending schema.org, arguably better than OCLC did themselves in the webinar I attended last month. Considering the handful of major linked data projects at the national level, it's easy to get confused about which agency is doing what and why. BIBFRAME? BIBFLOW? BIBFRAME Lite? BiblioGraph? Schema Bib Extend? So it's awesome to see an industry leader try to work with the computing technology already underpinning the internet to expose our rich metadata instead of trying to hide, reduce, or eliminate it entirely.
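For readers who want something concrete, here is a minimal sketch of the two ideas the review touches on: grouping bibliographic records into Work clusters by a normalized key, and describing the result with schema.org vocabulary as RDF. This is a toy stand-in, not OCLC's actual text-mining pipeline; the records, URIs, field layout, and normalization rule are all invented for illustration, and it uses Python with the rdflib library.

```python
# Toy illustration of Work clustering plus a schema.org RDF description.
# NOT OCLC's algorithm: record fields, URI patterns, and the key function
# are invented purely for this example.

from rdflib import Graph, Namespace, URIRef, Literal, RDF

SCHEMA = Namespace("http://schema.org/")

# Hypothetical records: (record id, author string, title string)
records = [
    ("ocm001", "Tolkien, J. R. R.", "The Hobbit"),
    ("ocm002", "Tolkien, J.R.R.",  "The hobbit, or, There and back again"),
    ("ocm003", "Austen, Jane",     "Pride and Prejudice"),
]

def work_key(author, title):
    """Crude normalization: lowercase, drop punctuation, keep leading title words."""
    norm = lambda s: "".join(ch for ch in s.lower() if ch.isalnum() or ch.isspace())
    return norm(author).replace(" ", "") + "|" + " ".join(norm(title).split()[:2])

# Group records that share a normalized key into a Work cluster.
clusters = {}
for rec in records:
    clusters.setdefault(work_key(rec[1], rec[2]), []).append(rec)

# Express each cluster as a schema:CreativeWork linked to its member records,
# using the exampleOfWork / workExample properties schema.org provides.
g = Graph()
g.bind("schema", SCHEMA)
for i, recs in enumerate(clusters.values(), start=1):
    work = URIRef(f"http://example.org/work/{i}")   # invented URI pattern
    g.add((work, RDF.type, SCHEMA.CreativeWork))
    g.add((work, SCHEMA.name, Literal(recs[0][2])))
    for rec_id, _, _ in recs:
        item = URIRef(f"http://example.org/record/{rec_id}")
        g.add((item, SCHEMA.exampleOfWork, work))
        g.add((work, SCHEMA.workExample, item))

# rdflib 6+ returns the serialization as a string.
print(g.serialize(format="turtle"))
```

Running it prints a small Turtle graph in which both Hobbit records point at the same schema:CreativeWork, which is the basic shape of a Work cluster once you swap in real clustering logic and real identifiers.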
A challenging but worthwhile read. While I cannot claim any expertise, I now at least have a better understanding of the background for BIBFRAME (oh so exciting!).