Linked Open Data: A Use Case in the Agricultural Domain

Photo: original source http://energise2-0.com/2013/06/20/an-export-support-programme-for-a-connected-world/

In the Linked Open Data paradigm, institutional repositories have the opportunity to enhance the shareability, extensibility, and re-usability of their data by ensuring:

  • stable, discoverable content that is readable by both machines and humans
  • use of appropriate, well-established metadata standards and emerging Linked Open Data-enabled vocabularies
  • use of controlled vocabularies, authority data and syntax encoding standards in metadata statements
  • use of resource URIs as data values when they are available

In the agricultural domain, the Food and Agriculture Organization of the United Nations (FAO) provides the agricultural information management community with standards, tools and good practices that help owners of open repositories take advantage of the new generation of web-based technologies to increase visibility. This work is facilitated through the global community on Agricultural Information Management Standards (AIMS)[1]. On the AIMS portal, the community advocates and promotes the use of semantic technologies and standards, the interoperability of agricultural information and systems, and recommendations on managing research outputs.

The AIMS community is actively working on providing support and tools for all the processes needed for the publication and consumption of bibliographic content as Linked Open Data in the agricultural domain:

1. Linking content by using widely-used controlled vocabularies

The use of meaningful metadata for bibliographic content description and the use of shared vocabularies are primary steps in facilitating interoperability. AIMS provides recommendations to facilitate this exchange of data and information sharing by encouraging the use of authority data, controlled vocabularies, and syntax encoding standards.

AGROVOC[2] is a subject vocabulary covering areas that include food, nutrition, agriculture, fisheries, forestry and environment. To date, AGROVOC contains over 32,000 concepts organized in a hierarchy, where each concept may have labels in up to 22 languages. AGROVOC is available as Linked Open Data, aligned with more than 10 vocabularies. The Linked Data version of AGROVOC is expressed in RDF/SKOS-XL. The data is accessible to machines through a SPARQL endpoint, and to humans by means of HTML pages generated with Pubby.
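As a rough illustration of machine access, the sketch below builds a SPARQL query that looks up concepts by preferred label in a SKOS-XL vocabulary, where each label is itself a resource carrying a `skosxl:literalForm`. The endpoint address is a placeholder and the function names are hypothetical; neither comes from AGROVOC documentation. The code only constructs the request URL and does not contact any server.

```python
from urllib.parse import urlencode

# Placeholder endpoint (an assumption): substitute the SPARQL endpoint
# address actually published for the vocabulary you want to query.
ENDPOINT = "http://example.org/agrovoc/sparql"

# Namespace of the W3C SKOS-XL vocabulary.
SKOSXL = "http://www.w3.org/2008/05/skos-xl#"


def build_label_query(term: str, lang: str = "en", limit: int = 10) -> str:
    """Return a SPARQL query finding concepts whose preferred label is `term`."""
    # Escape backslashes and double quotes so the term is a valid SPARQL string.
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return (
        f"PREFIX skosxl: <{SKOSXL}>\n"
        "SELECT ?concept ?label WHERE {\n"
        # In SKOS-XL the label is a resource, not a plain literal.
        "  ?concept skosxl:prefLabel ?xl .\n"
        "  ?xl skosxl:literalForm ?label .\n"
        f'  FILTER (str(?label) = "{escaped}" && lang(?label) = "{lang}")\n'
        "}\n"
        f"LIMIT {limit}"
    )


def build_request_url(query: str, endpoint: str = ENDPOINT) -> str:
    """Encode the query as a SPARQL Protocol GET request URL."""
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    print(build_request_url(build_label_query("maize")))
```

Sending the resulting URL with an HTTP client, together with an `Accept` header for the desired result format, would return the matching concept URIs, provided the endpoint follows the standard SPARQL Protocol.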

2. Selecting appropriate encoding strategies for producing metadata

Recommendations on which standards[3] to follow and how to prepare LOD-ready metadata for exposure to service providers are essential. Over the last two decades, different communities have developed a great number of metadata-related standards for specific purposes, to guide the design, creation, and implementation of data structures, data values, data contents, and data exchanges. This abundance makes it difficult to decide which standards to use.

Decisions regarding which standard(s) to adopt directly affect the degree of LOD-readiness of the bibliographic data. Employing well-accepted metadata element sets and value vocabularies has already shown great benefits and potential in terms of resource discovery, reuse, sharing, and the creation of new content based on Linked Data. However, when producing LOD-enabled bibliographic data, data and service providers are likely to have many specific questions about encoding strategies, e.g. "What metadata standard(s) to follow in order to publish bibliographic data as Linked Data? What minimal set of properties a bibliographic dataset needs to include to ensure meaningful data sharing? If the controlled vocabulary is available as Linked Data, what kind of values should be exchanged through our repository, the literal form representing a concept or the URI identifying the concept?" [4]
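The last of these questions, literal form versus concept URI, can be made concrete with a small sketch that serializes the same subject statement both ways as N-Triples. The record and concept URIs below are invented for illustration, and the pairing of `dc:subject` with a literal and `dcterms:subject` with a URI is one common Dublin Core convention, assumed here rather than taken from the recommendations.

```python
# Dublin Core properties: the legacy element (often used with literal values)
# and the DCMI Terms property (intended for non-literal values).
DC_SUBJECT = "http://purl.org/dc/elements/1.1/subject"
DCTERMS_SUBJECT = "http://purl.org/dc/terms/subject"


def literal_triple(subject: str, predicate: str, text: str, lang: str) -> str:
    """Serialize a statement whose value is a language-tagged string."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{subject}> <{predicate}> "{escaped}"@{lang} .'


def uri_triple(subject: str, predicate: str, obj: str) -> str:
    """Serialize a statement whose value is a resource URI."""
    return f"<{subject}> <{predicate}> <{obj}> ."


# Hypothetical identifiers, used only for this example.
record = "http://example.org/record/1"
concept = "http://example.org/vocab/c_0001"

# Option 1: the literal form representing the concept.
print(literal_triple(record, DC_SUBJECT, "maize", "en"))
# Option 2: the URI identifying the concept.
print(uri_triple(record, DCTERMS_SUBJECT, concept))
```

The literal is easy to read but ties the record to one language and one spelling; the URI lets a consumer follow the link to every label and relation the vocabulary publishes for that concept.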

LODE-BD[5] was born in this context with the purpose of assisting data providers in selecting appropriate encoding strategies for producing meaningful Linked Open Data (LOD)-enabled bibliographical data. The LODE-BD recommendations are applicable for structured data describing bibliographic resources such as articles, monographs, theses, conference papers, presentation materials, research reports, learning objects, etc. – in print or electronic format[6].

3. Integrating metadata standards, controlled vocabularies, authority data and syntax encoding standards in repository tools

The promotion of information management standards has shown that providing tools that implement good practices in the creation, management and exchange of metadata is a key factor for success. Providing metadata and vocabularies via Linked Open Data, Web services and file downloads is not enough. Customizing information management tools so that they come pre-packaged with such standards and services is fundamental to ensuring interoperability among information management systems.

Information system customizations based on two open source content and digital repository management systems, Drupal and DSpace, have been created under the umbrella of AIMS. These customizations - AgriDrupal[7] and AgriOceanDSpace[8] - facilitate the publication of interoperable and re-usable metadata that describe agricultural research information.

4. Discovering information services by registering them in directories

Once the decision to use tools that integrate widely used metadata standards and controlled vocabularies has been taken and repositories are in place, the discovery of the information services hosting Linked Open Data is essential for building aggregators. The CIARD Routemap to Information Nodes and Gateways (RING)[9], a global registry of web-based services, provides a space for information providers to register their services in various categories, with the objective of facilitating the discovery of sources of agriculture-related information across the world.

5. Aggregating information from different resources using mash-up web applications

AGRIS[10], one of the most important worldwide information systems in the agricultural sciences, benefits from all the steps described above. AGRIS uses AGROVOC as the backbone for indexing its records and linking them to external resources; aggregates information following the metadata recommendations described in LODE-BD; and consumes data exposed by data providers registered in the CIARD RING. To date, it hosts more than 7 million bibliographic records.

AGRIS uses Linked Open Data methodologies to link its bibliographic records with related datasets on the web, with the objective of enriching the information provided in the AGRIS records. AGRIS interlinks with datasets such as DBpedia, World Bank, FAO Geopolitical Ontology, Nature OpenSearch, Global Biodiversity Information Facility and Bioversity International[11], using AGROVOC[12]. More than 180 million triples have been generated so far.
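The general idea of interlinking through a shared vocabulary can be sketched as a simple join: records indexed with concept URIs are matched against external resources aligned to the same concepts. This is a minimal illustration with invented identifiers and a hypothetical `interlink` function; it is not a description of how AGRIS is actually implemented.

```python
from collections import defaultdict


def interlink(records, alignments):
    """Attach external resources to records via shared concept URIs.

    records:    record id -> list of concept URIs used to index the record
    alignments: concept URI -> list of external resource URIs for that concept
    Returns record id -> list of external URIs (order kept, no duplicates).
    Records without any match are left out of the result.
    """
    links = defaultdict(list)
    for record_id, concepts in records.items():
        for concept in concepts:
            for external in alignments.get(concept, []):
                if external not in links[record_id]:
                    links[record_id].append(external)
    return dict(links)


# Hypothetical data, for illustration only.
records = {
    "rec-1": ["http://example.org/vocab/c_0001"],
    "rec-2": ["http://example.org/vocab/c_0002"],
}
alignments = {
    "http://example.org/vocab/c_0001": [
        "http://example.org/encyclopedia/Maize",
        "http://example.org/species/zea-mays",
    ],
}

print(interlink(records, alignments))
```

Because the join key is the concept URI rather than a text label, a record indexed once with the vocabulary can pick up links to every dataset that has been aligned with that vocabulary.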




[1] Agricultural Information Management Standards. Available at http://aims.fao.org/, Accessed December 11 2013.

[2] FAO of the United Nations. AGROVOC. Available at http://aims.fao.org/standards/agrovoc/, Accessed December 11 2013.

[3] Subirats, Imma and Zeng, Marcia Lei, 2012. Meaningful Bibliographic Metadata (M2B): Recommendations of a set of metadata properties and encoding vocabularies. Available at http://aims.fao.org/advice/metadata-beta-version, Accessed December 11 2013.

[4] Purpose of the LODE-BD Recommendations.  Available at http://aims.fao.org/lode/bd-2/about, Accessed December 11 2013.

[5] Subirats, Imma and Zeng, Marcia Lei, 2012. LODE-BD Recommendations 2.0: How to select appropriate encoding strategies for producing Linked Open Data (LOD)-enabled bibliographic data. Available at http://aims.fao.org/lode/bd, Accessed December 11 2013.

[6] Zeng, Marcia; Subirats, Imma, 2013. How to select appropriate encoding strategies for producing LOD-enabled bibliographic data [Webinar]. Available at http://aims.fao.org/community/general-information/blogs/webinaraims-how-select-appropriate-encoding-strategies-producing, Accessed December 11 2013.

[7] FAO of the United Nations. AgriDrupal. Available at http://aims.fao.org/tools/agridrupal, Accessed December 11 2013.

[8] FAO of the United Nations. AgriOcean DSpace. Available at http://aims.fao.org/agriocean-dspace, Accessed December 11 2013.

[9] GFAR. CIARD RING. Available at http://ring.ciard.net/, Accessed December 11 2013.

[10] AGRIS: International Information System for the Agricultural science and technology. Available at http://agris.fao.org/agris-search/index.do, Accessed December 11 2013.

[11] AGRIS: How it works. Available at http://agris.fao.org/content/how-it-works, Accessed December 11 2013.

[12] Celli, Fabrizio; Keizer, Johannes, 2013. Release of AGRIS 2.0:  Searching agricultural bibliographic data [Webinar] Available at http://aims.fao.org/community/blogs/new-webinaraims-release-agris-20-searching-agricultural-bibliografic-data-interlinke, Accessed December 11 2013. 
