Reflecting on the SemaGrow achievements and the project highlights

Nikolaos Marianos is the Project Management Director at Agro-Know, a company that captures, organizes and adds value to the rich information available in agricultural and biodiversity sciences, in order to make it universally accessible, useful and meaningful. In the context of SemaGrow he works as the main project manager (PMP certified) of Agro-Know, supporting the project. Within SemaGrow he focuses on offering real data and integrating the SemaGrow technologies into a real product offered to the capacity building/education and research communities, testing them with real users and ensuring the exploitation of the project's main results. Nikolaos is an evaluation expert with a PhD on the evaluation of e-services, and he has contributed to the design and implementation of the piloting plan to validate the SemaGrow demonstrators with real users. The AIMS Editorial team asked him the following questions:

Question 1. The goal of SemaGrow is to develop reactive algorithms, infrastructures and methodologies for scaling data-intensive techniques up to extremely large data volumes and real-time performance. How does SemaGrow want to achieve this overarching goal?

SemaGrow is mainly a research project that aims to cope with big datasets in a Linked Open Data environment, providing a way to query and combine such datasets using algorithms and tools that allow real-time performance. To that end, SemaGrow is working on the federation of SPARQL endpoints, developing algorithms to combine them and to allow many customizations, such as the selection of the sources to be federated and the metrics for the combination. Of course, working on the infrastructure is also very important, with the possibility to run jobs on the GRID, to access developed applications through Web services, and to tune triple stores to cope with the project’s needs. Three use cases are provided to test and improve the outcomes of the project. As an example, the reactive resource discovery use case of Agro-Know (AK) aims to test how educators, trainers and researchers are able to find, reuse and exploit data resources created in one environment in very different contexts. SemaGrow is helping to achieve satisfactory precision with very fast response times, so that relevant multimedia results/objects from heterogeneous and diverse sources (educational, cultural and scientific/academic collections) are presented to the user when searching for relevant material to use in an educational activity.

Question 2. What is the current status of development of the SemaGrow project? What has been achieved until now?

We are now at the beginning of the 3rd and final year of the project, and so far the 1st version of the SemaGrow SPARQL endpoint has been developed and integrated into the demonstrators. In the case of the Agricultural Discovery Space (ADS) demonstrator, the first of two separate development phases has been completed. The first phase aimed to develop the enhanced Agro-Know data platform by including components that allow the tracking of queries and users, in order to draw the correct conclusions about the users’ actual needs. The new enhanced data platform was tested during the 2nd SemaGrow Hackathon, together with the SemaGrow SPARQL endpoint, and a number of improvements were recommended. Based on them, the 2nd deployment phase has started and the front-end of the ADS demonstrator is being implemented, in the form of a discovery service offered to the GFSP community.

Question 3. To what extent is the community involved in the project?

AK offers competitive solutions and customized services covering the needs of all stakeholders in the agricultural, food, environmental and biodiversity science spectrum, ranging from scientists and educators to farmers and citizens. Such potential clients are involved in major networks such as CIARD, GODAN, the Research Data Alliance and the GFSP, and we have tried to engage them from the beginning of the project to collect requirements, in order to develop a final solution that will fit their needs and solve real problems. Representatives of the community participated in consultation workshops, problem interviews, liaison meetings and hackathons. Their contribution was significant and we hope they will be rewarded by the end of the project with a number of new enhanced services offered by AK.

Question 4. Could you briefly highlight the most important outputs of SemaGrow that can be foreseen after the project has finished? How can the community benefit from them in the future?

The most important outputs of SemaGrow are the following:

  • A distributed infrastructure layer on top of existing data repositories and networks that will support the interoperable and transparent application of data-intensive techniques over heterogeneous data sources.
  • A toolkit and best practices guides for both data providers and consumers using the distributed infrastructure layer to support the interoperable and transparent integration of heterogeneous data sources and the application of data-intensive techniques over such sources.
  • A number of prototype demonstration services such as the ADS and AGRIS demonstrators deployed by AK and FAO respectively.

The distributed infrastructure, together with the toolkit and the best practices guides, will help innovative SMEs and other organizations (like SWC and AK) that consume (e.g. organize, process, visualize) agricultural data to have better access to a very large volume of data, and to develop sustainable services and tools using this data to offer to the community. This will improve the decision making of agricultural researchers, information officers and educators by providing them with access to timely and accurate information/data. AK will provide the identified communities with a real product offering based on the ADS demonstrator, including both a SemaGrow-powered technological solution and customized services. Examples of potential stakeholders in the community that could take advantage of such a product are individual researchers and research organizations like the CGIAR centers, and the beneficiaries of the Horizon 2020 programme, who will be supported in their effort to open their data as dictated by emerging Open Access (OA) mandates and EC funding rules (e.g. Article 29.2 of the Horizon 2020 Grant Agreement).