A version of this post was published on the ALA TechSource Blog on July 31, 2009.
From Legacy Data to Linked Data: Preparing Libraries for Web 3.0
Monday, July 13, 8:00 to 10:00 a.m.
How can library cataloging data be transformed to function within “Web 3.0” and be understood by non-library web applications? Speakers from both the library and Semantic Web communities will explore the situation in a non-technical manner and describe current work underway to transform legacy library data into linked data.
Moderator: Corey A. Harper, Metadata Services Librarian, New York University
Speakers: Eric Miller, President, Zepheira, Inc.; Diane Hillmann, Director of Metadata Initiatives, Information Institute of Syracuse; Jennifer Bowen, Co-Principal Investigator, eXtensible Catalog Project, University of Rochester; Rebecca Guenther, Senior Networking and Standards Specialist, Network Development & MARC Standards Office, Library of Congress
This session was a highlight of the 2009 ALA Annual Conference. It brought together four recognized leaders to discuss the emergence of linked data on the Web and the role that the library community can play in realizing the Semantic Web. The session drew a standing-room-only crowd and offered a glimpse of the future of cataloging.
First up was Eric Miller, President and co-founder of Zepheira, who provided an overview of the current state of linked data development. Miller defined linked data as: “a term used to describe a recommended best practice for exposing, sharing, and connecting pieces of data, information, and knowledge on the Semantic Web using URIs.” He emphasized sharing and connecting data as the key elements. Thousands of organizations and individuals are currently participating in creating linked data, and the availability of linked data has increased tremendously over the past six months.
Linked data principles:
- URIs represent “things”: people, places, concepts, departments
- Using HTTP-compliant URIs makes data more accessible
- When serving URIs, deliver useful, reusable information
- Leverage standards (RDF, SKOS, etc.)
- Add context. It’s all about connecting, creating meaningful relationships between data.
Miller argued that the Web itself is becoming the basic architecture for building applications. Linked data applications don’t run ON the Web; they are applications OF the Web. Users increasingly want their data back, and they want it back their way. With linked data, users are no longer limited to searching based on relationships that have been pre-defined by application developers, database designers, or librarians; users can create and search based on relationships that are meaningful to them. Miller’s company Zepheira is currently working with the Library of Congress to create Recollection, a new platform intended to provide more useful tools and processes for sharing diverse content across the myriad collections covered by the LC Digital Preservation Program. This will empower users to create new views for existing data, combine data sets in customizable ways, and build communities around the data, allowing them to collaborate in curating and connecting collections in customized ways. Zepheira has also launched Freemix, a new social networking application designed to allow users to mix and share data.
In closing, Miller noted that in the linked data environment, credibility is more important than ever before. Libraries are trusted institutions with a wealth of experience in organizing and managing information resources. The library community needs to position itself to leverage this reputation and take a larger role in the development of linked data applications. Linked data has arrived, and the library community cannot afford to be left behind.
Diane Hillmann’s presentation addressed the question: Are Libraries Ready for Linked Data? Her answer: a resounding yes! Linked data is all about relationships. Libraries have been concerned with expressing relationships between information objects for a very long time, and we now understand that we must use machine-based methods if we want to do a really good job. Traditional cataloging provides attribute = value pairs, for example: Title = [value] or Author = [value]. These attributes are embedded within a record that has an identifier. Because they don’t have independent identifiers, attributes cannot be referenced outside the context of a record. Linked data is based upon a model of triples consisting of subject, predicate, and object, which permits the assignment of identifiers at the attribute level. Identifiers can also be assigned to relationships between attributes. Hillmann is currently involved in building a registry that maintains and serves relationship identifiers (http://metadataregistry.org/). A vocabulary based on RDA should be completely registered within a few weeks. It will be freely accessible to support linked data applications implemented by libraries and others. The registry, combined with the availability of applications and tools such as those being developed in conjunction with the eXtensible Catalog project, constitutes essential infrastructure required to enable the library community to become more actively engaged with both using and creating linked data.
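To make the contrast between record-bound attributes and triples a little more concrete, here is a minimal Python sketch. It is not taken from Hillmann's presentation or from the registry; the example.org URIs, the sample records, and the helper names record_to_triples and match are all assumptions made up for illustration.

```python
# Hypothetical base URIs, used only for illustration.
RECORD_BASE = "http://example.org/record/"
PROP_BASE = "http://example.org/property/"


def record_to_triples(record_id, attributes):
    """Turn one attribute = value record into (subject, predicate, object) triples."""
    subject = RECORD_BASE + record_id
    return [(subject, PROP_BASE + name, value)
            for name, value in attributes.items()]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [t for t in triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)]


# Two invented catalog records in the traditional attribute = value shape.
catalog = {
    "r1": {"title": "Example Title One", "author": "Example, Author"},
    "r2": {"title": "Example Title Two", "author": "Example, Author"},
}

triples = []
for record_id, attributes in catalog.items():
    triples.extend(record_to_triples(record_id, attributes))

# Each statement now stands on its own, so it can be queried by predicate
# and value without knowing how the original record was laid out.
by_author = match(triples, predicate=PROP_BASE + "author", obj="Example, Author")
for subject, _, _ in by_author:
    print(subject)
```

Because every predicate is itself a URI, a registry could in principle describe it and other data sets could reuse it, which is the point of assigning identifiers below the record level.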
Jennifer Bowen provided an overview of the eXtensible Catalog (XC) project, and described how XC supports linked data. One of the primary goals of the project is to build open source software that supports reuse of MARC-encoded library metadata in an extensible environment. Though it has added to the cost of development, XC has been designed specifically to support linked data. XC metadata is based on the FRBR model, and it supports a level of granularity similar to MARC. XC also facilitates metadata harvesting via OAI-PMH and transformation of Dublin Core (DC) metadata. The XC application profile is being developed in accordance with the guidelines for DC application profiles, though it does not mandate the inclusion of DC Metadata Initiative (DCMI) terms. XC requires that terms be defined in RDF, and it is designed to utilize metadataregistry.org. XC incorporates terms from several namespaces and defines 37 custom elements in its own namespace. Some of the custom elements mirror elements defined in other metadata schemes that are not yet registered, such as RDA and MARC. One of XC’s biggest strengths is that it enables experimentation. It provides Web-based tools that support harvesting, troubleshooting, transformation, and enhancement of metadata outside the context of existing legacy systems. Librarians can explore new approaches to managing metadata with no danger of permanently corrupting or destroying data stored in legacy systems.
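For readers unfamiliar with OAI-PMH, the following Python sketch shows the general shape of a harvesting request and of a simple Dublin Core payload. It is not XC code and says nothing about how XC is implemented. The repository address is a hypothetical placeholder, the sample record is invented, and the helper names are assumptions; only the verb, the metadataPrefix parameter, and the two namespaces come from the OAI-PMH and Dublin Core specifications. No network request is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace of the simple Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"

# An invented oai_dc payload of the kind an OAI-PMH repository returns
# inside each harvested record.
SAMPLE = """<oai_dc:dc
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Example Title</dc:title>
  <dc:creator>Example, Author</dc:creator>
</oai_dc:dc>"""


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL for a repository."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


def dc_fields(xml_text):
    """Collect the Dublin Core elements of one payload as name -> list of values."""
    root = ET.fromstring(xml_text)
    fields = {}
    for child in root:
        # ElementTree spells qualified names as "{namespace}localname".
        if child.tag.startswith("{" + DC_NS + "}"):
            name = child.tag.split("}", 1)[1]
            fields.setdefault(name, []).append(child.text)
    return fields


# Hypothetical repository address, for illustration only.
print(list_records_url("http://example.org/oai"))
print(dc_fields(SAMPLE))
```

Values are kept in lists because Dublin Core elements are repeatable, so a record may carry several creators or subjects.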
Next steps for XC:
- Finalize schema and registry of elements
- Publish application profile
- Identify and define metadata elements for user generated metadata
- Enable schema data to be harvested as RDF
Rebecca Guenther described efforts currently underway at the Library of Congress to make controlled vocabularies available as linked data. The Library of Congress Vocabularies Service is intended to facilitate development and maintenance of vocabularies maintained by LC and make them freely available to both libraries and the broader Web community. The service provides comprehensive information about the vocabularies in addition to exposing the vocabularies themselves as linked data. Most vocabularies will be represented using the Simple Knowledge Organization System (SKOS), an RDF application that was recently finalized by the W3C. Currently, LCSH is the only vocabulary available, but others will be offered in the future, including LC Name authorities. The service also offers bulk download of data in RDF format. Now that the service is officially up and running, LC plans to advocate for use and solicit user feedback more actively. Also still to come: a mechanism for updating data as changes are made in the underlying vocabularies and the development of an OWL schema for LCSH to provide greater granularity and a means for expressing facets, since SKOS lacks this capability.
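As a rough picture of what a SKOS representation looks like, the Python sketch below writes one concept as N-Triples lines. It does not reproduce LC's data or service. The example.org base URI, the concept identifiers, the labels, and the function name concept_to_ntriples are all hypothetical; the SKOS and RDF property URIs are the standard ones.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical base URI; real vocabularies publish their own identifiers.
BASE = "http://example.org/subjects/"


def concept_to_ntriples(concept_id, pref_label, broader=()):
    """Serialize one SKOS concept as a list of N-Triples lines."""
    subject = "<" + BASE + concept_id + ">"
    # Escape backslashes and double quotes so the literal stays well-formed.
    label = pref_label.replace("\\", "\\\\").replace('"', '\\"')
    lines = [
        f"{subject} <{RDF_TYPE}> <{SKOS}Concept> .",
        f'{subject} <{SKOS}prefLabel> "{label}"@en .',
    ]
    # skos:broader points from this concept to a more general one.
    for parent in broader:
        lines.append(f"{subject} <{SKOS}broader> <{BASE}{parent}> .")
    return lines


# Invented heading with one broader term, for illustration only.
for line in concept_to_ntriples("c2", "Ice skating", broader=["c1"]):
    print(line)
```

Each line is a complete subject, predicate, object statement, which is what lets a heading and its hierarchy be linked to from outside the library's own systems.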