Enabling Multilingual Search through Controlled Vocabularies: the AGRIS Approach – A review

‘Enabling Multilingual Search through Controlled Vocabularies: the AGRIS Approach’ is the title of the paper that won the Best Paper Prize at this year’s 10th International Conference on Metadata and Semantics Research (MTSR 2016), where it was submitted and presented. The conference was held from the 22nd to the 25th of November 2016.

_____________________________________________________________________________________________________

This paper, authored by Fabrizio Celli and Johannes Keizer of the AGRIS team, presents the earlier development of the multilingual search implemented within the AGRIS system. In particular, it describes the lightweight approach adopted to enable cross-language information retrieval in AGRIS.

The article is grounded in the literature: it builds upon the two-decade-old debate on the usefulness of controlled vocabularies in information retrieval. Two schools of thought emerge: (i) supporters of controlled vocabularies, who believe they aid recall in search results, and (ii) those who advocate abandoning controlled vocabularies in favour of free-text searching.

The paper enters this debate by showing, through the AGRIS approach, how adopting a controlled vocabulary helps implement multilingual search in the AGRIS information system, so that content can be retrieved even when its language differs from the language of the query. The paper has three parts: (1) an introduction to AGRIS and the AGROVOC thesaurus; (2) the AGRIS approach to multilingual search; (3) analysis of the results and conclusions.

In this review we summarize the second and third parts in turn.

The AGRIS Approach to multilingual search

The paper describes the model behind this approach as follows: “AGRIS approach to multilingual search is based on the adoption of AGROVOC as an instrument to translate user search keywords. In this way, we demonstrate that a controlled vocabulary is not only good for document indexing, but it can be applied to other aspects of information retrieval, as enabling multilingual search through automatic query expansion (AQE). AQE has a 50-year history but, as the survey [3] states, only in recent years it has reached a good level of scientific maturity to lose the status of experimental technique”.

The description goes on to explain the algorithm used. When the user performs a keyword search in the AGRIS database, the system:

  • identifies the query pattern;
  • uses AGROVOC to translate keywords;
  • expands the user query, boosting keywords provided by the user; and
  • returns results in all available languages.
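
The steps above can be illustrated with a minimal sketch. It is not the AGRIS implementation: the thesaurus is a small hand-written dictionary standing in for AGROVOC, the function names are hypothetical, and the output uses a Lucene-style boost syntax (`term^2`) as an assumption about the search backend. Only the simplest query pattern is handled, namely space-separated keywords, and combining keywords with AND is likewise an assumption.

```python
# Toy stand-in for a multilingual thesaurus: concept id -> {language: label}.
# These entries are illustrative only and are not taken from AGROVOC.
TOY_THESAURUS = {
    "c_rice": {"en": "rice", "fr": "riz", "es": "arroz", "it": "riso"},
    "c_soil": {"en": "soil", "fr": "sol", "es": "suelo", "it": "suolo"},
}


def build_label_index(thesaurus):
    """Map each lower-cased label to the set of concept ids that use it."""
    index = {}
    for concept_id, labels in thesaurus.items():
        for label in labels.values():
            index.setdefault(label.lower(), set()).add(concept_id)
    return index


def expand_query(query, thesaurus, user_boost=2):
    """Expand a keyword query with translations found in the thesaurus.

    Each keyword typed by the user is boosted; its translations are added
    unboosted. Translations of one keyword are joined with OR, and the
    keywords themselves are joined with AND (an assumption of this sketch).
    """
    index = build_label_index(thesaurus)
    clauses = []
    for keyword in query.lower().split():
        variants = [f"{keyword}^{user_boost}"]
        for concept_id in sorted(index.get(keyword, ())):
            # Sort by language code so the output order is deterministic.
            for _lang, label in sorted(thesaurus[concept_id].items()):
                if label.lower() != keyword:
                    variants.append(label.lower())
        # Drop duplicate translations while preserving order.
        variants = list(dict.fromkeys(variants))
        clauses.append("(" + " OR ".join(variants) + ")")
    return " AND ".join(clauses)
```

With this toy data, `expand_query("rice soil", TOY_THESAURUS)` produces `(rice^2 OR arroz OR riz OR riso) AND (soil^2 OR suelo OR sol OR suolo)`, so a document indexed in any of the four languages can match, while documents containing the user's own wording are ranked higher. A keyword that is not in the thesaurus is passed through unchanged apart from the boost.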

The results section of the paper describes the output of this algorithm. The analysis shows how AGROVOC makes it possible to return a broader, multilingual set of results.

In conclusion, the paper states how this multilingual search algorithm can be further developed and enhanced in the AGRIS interface. At present the system can manage singular/plural variations only for English terms. The first proposed improvement is to let users select a subset of languages when enabling multilingual search, which would filter out languages they are not interested in. The second is to implement the Synonymous Query Expansion Module.

Read the full paper HERE