LODE-BD Recommendations 2.0

As the Web advances into an era of open and linked data, the traditional approach of sharing data within silos seems to have reached its end. From governments and international organizations to local cities and institutions, there is a widespread effort to open up and interlink data. This report aims to provide bibliographic data providers of open repositories with a set of recommendations that support the selection of appropriate encoding strategies for producing meaningful Linked Open Data (LOD)-enabled bibliographical data (LODE-BD). Linked Data, a term coined by Tim Berners-Lee in his design note [1] on the Semantic Web architecture, refers to a set of best practices for publishing, sharing, and interlinking structured data on the Web. The key technologies that Linked Data builds on are: Uniform Resource Identifiers (URIs) for identifying entities or concepts in the world, the RDF model for structuring and linking descriptions of things, HTTP for retrieving resources or descriptions of resources [2], and links to other related URIs in the exposed data to improve discovery of related information on the Web.

1.1. Purpose of the LODE-BD Recommendations

In the bibliographic universe there is a clear paradigm shift from fixed records to re-combinable metadata statements. For anyone contributing to an open bibliographic data repository as a data provider or service provider, the processes and strategies for providing data as Linked Data are practical issues. Guidelines and recommendations on what standards to follow and how to prepare LOD-ready metadata are essential. There seems to be no one-size-fits-all approach, because a great number of metadata-related standards have been developed during the last two decades. They have been created by different communities for specific purposes, to guide the design, creation, and implementation of data structures, data values, data contents, and data exchanges in those communities. The operational metadata standards for data structures form a whole spectrum, ranging from independent ones (which do not reuse any metadata terms from a known namespace) to integrated ones (which fully employ and incorporate existing metadata terms from other namespaces, as is usually seen in newly developed metadata application profiles and ontologies). Decisions regarding what standard(s) to adopt will directly impact the degree of LOD-readiness of the bibliographic data. The approach of employing well-accepted metadata element sets and value vocabularies has already shown great benefits and potential in terms of resource discovery, data reuse, data sharing, and the creation of new content based on Linked Data. However, deciding to take this approach is only the first step for the data providers and service providers of an open bibliographic repository. In the context of producing LOD-enabled bibliographical data, data and service providers are likely to have many specific questions related to encoding strategies, for example:

  • What metadata standard(s) should we follow in order to publish any bibliographic data as Linked Data?
  • What is the minimal set of properties that a bibliographic dataset should include to ensure meaningful data sharing?
  • Is there any metadata model or application profile that can be directly adopted for producing bibliographical data (especially from our local database)?
  • If the controlled vocabulary we have used is available as Linked Data, what kind of values should we exchange through our repository, specifically, the literal form representing a concept or the URI identifying the concept?
  • How should we encode our data in order to move from a local database to a Linked Data dataset?
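The literal-versus-URI question in the list above can be made concrete with a small sketch. It serializes the same subject statement twice as N-Triples lines: once with the literal label, once with the concept's URI. The record and concept URIs under example.org are hypothetical placeholders, the helper names are invented for illustration, and the choice of the Dublin Core subject properties is an assumption made for this example, not a statement of what LODE-BD recommends.

```python
# Dublin Core property URIs (the pairing of properties with value types
# below is an illustrative assumption, not a LODE-BD prescription).
DC_SUBJECT = "http://purl.org/dc/elements/1.1/subject"
DCTERMS_SUBJECT = "http://purl.org/dc/terms/subject"


def literal_triple(subject, predicate, text, lang=None):
    """Build one N-Triples line whose object is a plain or language-tagged literal."""
    # Escape backslashes first, then double quotes, as N-Triples requires.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    tag = f"@{lang}" if lang else ""
    return f'<{subject}> <{predicate}> "{escaped}"{tag} .'


def uri_triple(subject, predicate, obj):
    """Build one N-Triples line whose object is another resource (a URI)."""
    return f"<{subject}> <{predicate}> <{obj}> ."


# Hypothetical identifiers, used only for this example.
record = "http://example.org/record/123"
concept = "http://example.org/vocab/c_0001"

# Option 1: exchange the literal form that represents the concept.
as_literal = literal_triple(record, DC_SUBJECT, "rice", lang="en")

# Option 2: exchange the URI that identifies the concept.
as_uri = uri_triple(record, DCTERMS_SUBJECT, concept)

print(as_literal)
print(as_uri)
```

The second form is the one a Linked Data consumer can follow to retrieve further information about the concept; the first is readable on its own but carries no link.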

This report was born in this context with the purpose of assisting data providers in selecting appropriate encoding strategies for producing LOD-enabled bibliographical data (directly or indirectly). In order to enhance the quality of the interoperability and effectiveness of information exchange, the LODE-BD Recommendations are built on five key principles:

  1. To promote the use of well-established metadata standards and the emerging LOD-enabled vocabularies proposed in the Linked Data community;
  2. To encourage the use of authority data, controlled vocabularies, and syntax encoding standards in metadata statements whenever possible;
  3. To encourage the use of resource URIs as data values when they are available;
  4. To facilitate the decision-making process regarding data encoding for the purpose of exchange and reuse;
  5. To provide a reference support that is open for suggestions of new properties and metadata terms according to the needs of the Linked Data community.
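Principles 2 and 3 can be read as a simple encoding rule, sketched below under stated assumptions: prefer a resource URI when the local vocabulary provides one, fall back to the literal otherwise, and normalize dates to a syntax encoding standard. The function names, the vocabulary lookup table, and the example.org URI are all hypothetical; ISO 8601 calendar dates are used here as one plausible syntax encoding choice.

```python
from datetime import date

# Hypothetical lookup from local subject labels to published concept URIs.
LOCAL_VOCABULARY = {
    "rice": "http://example.org/vocab/c_0001",
}


def encode_value(label, vocabulary=LOCAL_VOCABULARY):
    """Return ("uri", value) when a URI is known for the label, else ("literal", label)."""
    uri = vocabulary.get(label)
    if uri:
        return ("uri", uri)
    return ("literal", label)


def encode_date(raw):
    """Normalize a YYYY-MM-DD string to ISO 8601; raise ValueError for other input."""
    return date.fromisoformat(raw).isoformat()


print(encode_value("rice"))
print(encode_value("unlisted topic"))
print(encode_date("2012-06-01"))
```

Raising an error on a free-text date such as "June 2012", instead of passing it through, keeps values that do not conform to the chosen encoding out of the exchanged data.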

1.2. The LODE-BD Report Roadmap

LODE-BD Recommendations are presented as a whole package, encompassing the important components that a data provider may encounter when deciding to produce sharable LOD-ready structured data describing bibliographic resources (such as articles, monographs, theses, conference papers, presentation material, research reports, learning objects, etc. – in print or electronic format) from a local database. In the future the recommendations may be extended to accommodate other kinds of information resources.

The recommendations are included in Sections 2 and 3 of this report:

  • Section 2, general recommendations, presents nine groups of common properties identified by LODE-BD and the selected metadata terms to be used for describing bibliographic resources.
  • Section 3, decision trees, demonstrates how to select recommended properties according to local needs.

Table 1. The Roadmap of the LODE-BD Report

 

 


[1] Berners-Lee, Tim. 2007. Linked Data – Design Issues. http://www.w3.org/DesignIssues/LinkedData Last accessed: June 2012

[2] LOD2 Collaborative Project. 2010. Deliverable 12.5.1. Project fact sheet version 1. http://static.lod2.eu/Deliverables/LOD2_D12.5.1_Project_Fact_Sheet_Version.pdf Last accessed: June 2012