Trip Report from jks mission to the States, May 2011

Executive Summary
The goal of this trip was to discuss recent activities of OEK, especially the CIARD initiative (http://www.ciard.net) and the work on AGROVOC and linked open data (http://aims.fao.org), with partners in the USA and to agree on specific collaboration actions. The main results of the discussions were:

 

1.     AgNIC (http://www.agnic.org) confirmed its participation in CIARD and will substantiate it by registering AgNIC services on the CIARD RING (http://ring.ciard.net).

2.     Cornell's Mann Library confirmed its commitment to support CIARD by coordinating its tool development with the needs of the CIARD Content Management Taskforce and by dedicating staff time to the taskforce's activities.

3.     The VIVO program (Cornell Mann Library/NIH/multiple partners) (http://www.vivoweb.org) should become an integrated part of the CIARD toolbox for managing persons and expertise. The VIVO team will invite the CIARD CMTF to present this at the VIVO 2011 conference.

4.     FAO and NAL will strengthen their collaboration. NAL is interested in integrating all AGRIS data into its data service for USDA. NAL will deliver the AGRICOLA data to AGRIS. NAL and FAO will collaborate in the further development of the

5.     The World Bank department for agricultural and rural development is interested in further information on the CIARD initiative. The new director of OEK or another senior officer should go to the Bank for a presentation. This could coincide with the launch of the ICT in Agriculture sourcebook. Regarding the sourcebook, close integration with the e-Agriculture website was agreed.

6.     The World Bank Information Solutions Group is interested in intensifying contacts with OEKCS. The goal of the collaboration is to find synergies in the development of vocabularies and to make the Bank's document repository available through AGRIS.

7.     The AGRIS team and IFPRI will collaborate to integrate some datasets from IFPRI research directly into AGRIS records on the corresponding topics, to show the possibilities of data integration through “Linked Open Data” technologies.

 

 

 

Detailed Reports
Cornell

  Discussion with John Ferreira on the RING
  • Future role of the RING: when providers register their service on the RING, they should automatically get a wrapper that extracts triples from their system into the infrastructure.
  • The RING will be the central switch point for information providers and service developers.
  • There has to be a human service around the RING that interacts with service providers when they register their service.
  • Migration Part to Drupal 7 necessary
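
  To make the wrapper idea above more concrete, here is a minimal sketch of what such a wrapper could emit. It is only an illustration under stated assumptions, not the RING design: the record layout, the function names and the example.org record URI are hypothetical, and the AGROVOC concept URI is a placeholder. The only external vocabulary used is Dublin Core terms (title and subject), serialized as N-Triples.

```python
# Hypothetical sketch: turn one record from a provider's system into N-Triples.
# The record layout and all names here are assumptions made for illustration.

DCTERMS = "http://purl.org/dc/terms/"


def escape_literal(value):
    """Escape a string so it can be written as an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def record_to_triples(record):
    """Return a list of N-Triples lines for one provider record (a dict)."""
    subject = "<%s>" % record["uri"]
    lines = [
        '%s <%stitle> "%s" .' % (subject, DCTERMS, escape_literal(record["title"]))
    ]
    # Each indexing term is expected to be a concept URI already.
    for concept_uri in record.get("subjects", []):
        lines.append("%s <%ssubject> <%s> ." % (subject, DCTERMS, concept_uri))
    return lines


# Example record as a provider system might expose it (placeholder values).
record = {
    "uri": "http://example.org/records/1",
    "title": 'Maize "hybrids" in East Africa',
    "subjects": ["http://aims.fao.org/aos/agrovoc/c_12332"],
}

for line in record_to_triples(record):
    print(line)
```

  A real wrapper would of course read from the provider's database or OAI-PMH interface and would need an agreed mapping of local fields to properties; the sketch only shows the final serialization step.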

  Meeting with VIVO team

  • I met with the VIVO team to discuss the state of VIVO and the possibilities for collaboration. Since the 13 million grant that VIVO received through the NIH, VIVO has been expanding quickly (during the project more than 100 people were working on VIVO). To recap: VIVO is an ontology-based system to link people in science together and to find and track expertise. At the basis of VIVO is an ontology whose classes and properties have been defined by the VIVO team and which is populated through various mechanisms: direct linking of VIVO to databases, data harvesting, and input by the scientists themselves through specific interfaces. VIVO now has installations in various universities outside the USA, including two in China, and IICA has expressed interest in using it.
  • VIVO has also implemented SOLR indexes on different VIVO triple stores.
  • I have a VIVO presentation from Elly Cramer that is too large to attach to this post but explains VIVO quite well (http://dl.dropbox.com/u/27935178/VIVOdemo-current-20110509.pptx).
  • They are working on a national VIVO landing site in Drupal
  • I think that VIVO should become one of the main tools that we promote. It connects very well to AgriDrupal and has an architecture that fits our overall concept. It will be very interesting to do the author disambiguation for AGRIS, publish our author URIs and then link them to VIVO. The VIVO team is very interested in using our journal URIs and is looking forward to us publishing the list.
  • VIVO makes extensive use of the geopolitical ontology.
  • The VIVO grant ends in August, but there is now a strong open source developer community around VIVO.

  There is a very interesting map module for Drupal from Development Seed; are we using it?

 

  On Friday morning I gave my presentation on CIARD and a Linked Open Data infrastructure (I will put the presentation online this afternoon). Around 30 people attended, not only from Mann Library. Among them was the Dean of Library IT Services, who is responsible for all library IT services at Cornell. Dianne Hillmann was also present, and I received written comments from her on the AGROVOC LOD. I will now have a meeting with Mary Ochs, Director of Mann Library.

 

  Lunch with Mary Ochs (Director of Mann Library) and Jon Corson (leader of the VIVO project). Mary is committed to supporting CIARD, but ways have to be found to do this more efficiently and to bring it into the mainstream of Cornell's activities. Mary supports John's further involvement in the CIARD CMTF activities as long as this is in line with his work at Cornell. Cornell has no repository for the publications of its staff in agricultural science, and neither do many other land-grant universities. A project idea could be to set up a nationwide agricultural open archive, with the moral sponsorship of the CIARD initiative and AGRIS. Other possibilities for joint grant proposals should be explored. During the AgNIC meeting the involvement of NAL in this should be discussed with Simon Liu.

  Project idea: creating a USDA open archive with integration of AGRIS and VIVO through a LOD strategy. This project would invest in the further development of AgriOceanDspace as a producer of LOD. Mary will discuss this with Simon Liu.

  VIVO urgently needs our IAD list. We have to look into Science metrics in Montreal and their 20,000 journals. CrossRef has made DOIs available as LOD.

  TEAL: TEAL is still maintained with funds from the Rockefeller Foundation. TEAL needs to re-index part of its content. We tried the MIMOS implementation, with great success, on a long CTA document (20 MByte) sent to KL and got back 10 very decent AGROVOC URIs as content description. The person responsible for TEAL wants AgroTagger implemented in TEAL.

 

PhaseII Technology

 

PhaseII is a Drupal solution provider that now has 50 staff and is expanding rapidly. A large part of this expansion is due to the massive introduction of Drupal by the US government (Whitehouse.gov) and the House of Representatives.

Jeff Wallpole, CEO of PhaseII, pointed out that Drupal is now growing much faster than Joomla and WordPress. WordPress still drives 10% of all websites and Drupal 2%, but Drupal has definitely overtaken Joomla in importance. Joomla is fading, whereas Typo3 is important only in Germany.

 

This huge Drupal expansion has now led to a crisis, because there are too many Drupal applications out there and far too few good Drupal developers to maintain them. The danger is a major backlash from disenchanted users who cannot find support.

 

PhaseII, together with Acquia, is currently in discussions with DIGIT at the Commission about the use of Drupal by the Commission, but in big corporate environments open source solutions are still not widely accepted, and IT managers are keener to go for commercial solutions like Microsoft or Adobe. This is all the less understandable given the big advantages delivered by large open source communities like that of Drupal. An interesting case study is PAHO, which committed strongly to SharePoint, but where Drupal and openAtrium are now being introduced in parts of the organization because they cater much better for knowledge sharing and management. openAtrium was essential as a communication tool after the Haiti earthquake.

 

One of the important Drupal installation packages is openAtrium, used to drive community sites and intranets. PhaseII has taken over openAtrium from Development Seed and will continue to maintain and develop the package. It will be ported to Drupal 7 as soon as all necessary modules are available. In Italy there is an important openAtrium solution provider, “Nuvole”, which works between Parma and Brussels.

 

AgriDrupal: we will have to decide whether to invest in installation profiles or feature servers. Feature servers seem to be the more elegant solution, as they give enormous flexibility to the installing party.

 

AgNIC

 

In the AgNIC Year-in-Review, Martin Kesselman, AgNIC Chair (Rutgers, The State University of New Jersey), talked about the big changes AgNIC is undergoing, growing from 160,000 records one year ago to 5 million records now. AgNIC is moving away from a “boutique application” that collects only the repository data of AgNIC partners towards a big harvesting service. Many of the harvested records came from the veterinary sections of MedLine. It is quite obvious that AgNIC and AGRIS are thereby moving into the same market.

 

The two keynotes of the AgNIC meeting were given by Simon Liu, Director of the National Agricultural Library, and Anne Kenny, the Librarian of Cornell University.

Both speeches were impressive and contained important messages.

 

Simon Liu’s vision for NAL is to make it the PubMed of agriculture. Dr. Liu presented a detailed SWOT analysis of NAL and pointed out that its main weakness at the moment is the lack of any premium products and services. His goal is to make NAL a comprehensive access point for agricultural knowledge, first for USDA researchers but later on also worldwide.

Some numbers show this ambition. Indexing should increase from 50,000 publications to 500,000 publications a year, quickly reaching 100 million records in the digital library. The cornerstones of this endeavor are:

-        The further development of NALT and the introduction of NALT-based automatic indexing. Lori Finch, the thesaurus manager, is leading the automatic indexing project.

-        The setup of a high-technology framework (a big IT infrastructure with the Fedora Commons framework).

-        Mandated open access for all federally funded research.

 

 The next step will be the extension from managing publications to managing datasets. The Undersecretary of State from USDA has formed a scientific dataset committee, which is chaired by Simon Liu. Their goal is to publish massive data repositories and to develop a USDA scientific data management policy with a Digital Commons Architecture.

 

 Part of the new Information Management Vision of USDA is the full-scale implementation of VIVO.

 

The second keynote, by Anne Kenny, was about the RCUL project between the libraries of Columbia and Cornell. In short, the RCUL project aims to merge any activity of the two libraries that can be done more efficiently in cooperation. Anne pointed out that our habit of duplicating infrastructures comes from a time when we had only horse carts to move between them. In a globalized and web-linked world there is no need for this proliferation of costly infrastructures.

Instead of complaining about shrinking budgets, we need to be open to radical changes in culture and business processes. One of her examples: if books are no longer handed out physically, they can be stored wherever storage is cheap and scanned for delivery.

She also pointed out that the main reason for the success of RCUL is the sharing of cultural values and academic strengths between the two partners.

The project has been funded by the Mellon Foundation. Anne Kenny also made a strong statement in favor of open access, so as not to be held hostage by the publishers.

    

Meeting with Federico Sancho

 

During the AgNIC meeting I met Federico Sancho to discuss the following points:

 

a) WebAGRIS Latin America and CATIE.

CATIE wants to move away from WebAGRIS in order to have better technology and to integrate its systems. They had not done so until now in order not to send a negative signal about WebAGRIS globally. We made clear that FAO has absolutely no objections to CATIE moving away from WebAGRIS.

They had already discussed the use of Drupal with John Ferreira. We discussed that CATIE/IICA could be a full-fledged prototype implementation of AgriDrupal. A specific issue would be whether AgriDrupal can manage a catalogue of 1 million records, as held by the CATIE library. We should make a follow-up proposal.

 

b) AGROVOC in LA. It is very obvious that IICA is now also supporting NALT. I think this is more or less a consequence of our (wrong) decision to work with IIAP, which was quite outside the IICA network. Federico still wants a more active involvement in AGROVOC. We decided to hold a further AGROVOC workshop for Latin America (online, virtual), for which IICA will nominate about 15 experts. We will present and demo the VocBench, followed by a discussion. As a result of this eConference, interested colleagues from LA should apply to become AGROVOC editors. We were thinking of 6 editors and 1 editor-in-chief.

 

Meeting with Lori Finch

 

She now reports directly to Simon Liu. She is also responsible for indexing and is leading a big automatic indexing project.

She is still very interested in the collaboration with AGROVOC but has limited resources because of tough deadlines for the automatic indexing project.

NAL would most probably not be able to use the VocBench, because American government information cannot be stored outside the USA.

 

World Bank

 

For my talk in the morning more than 30 people showed up, and luckily most were not from the "library". The "library" at the Bank has been reduced in importance even more than at FAO. Some library services remain within the team "Library and Archives of Development" in the Information Solutions Group. The discussion after my presentation was with people from the "Development Data Group" (Neil Fantom), the "Information Advisory Services" (Luisita Guanlao) and the people who manage the document repository (Katie Bannon).

 

In the afternoon I had separate meetings with the people from "data.worldbank.org" and the "Information Advisory Services".

 

In the meeting with the data.worldbank.org people we discussed the information architecture, which is not a Linked Open Data application but links the Drupal front end to the data stores through connectors and APIs. I did not go into much detail, as Karl is also in direct contact with them. A LOD/RDF layer is foreseen for the next phase of data.worldbank.org, but this is still under discussion, as is the application of SDMX. They are very interested in my brokering a contact for them with DIGIT in Brussels, which is also working on linking data repositories to Drupal. Regarding CMS, the situation at the Bank is similar to that at FAO: whereas the data.worldbank.org group works with Drupal, the web people have just decided to buy ADOBE QS. I briefed them on Fedora Commons, which they did not know and were highly interested in.

 

More important was the meeting with the information advisory people. They (and not the library) are responsible for the Bank's document repository, the thesauri, taxonomies and similar tools. This is a large-scale operation: they have various vocabularies (the biggest one with 500,000 terms) and a document repository with about 150,000 documents, among them a lot of CGIAR material. They were enormously interested in the VocBench and our plans for maintenance. They have also developed methods of automatic indexing using their vocabularies. We discussed the project of a federated semantic search across our different repositories, and everyone was interested. But it quickly became clear that their main interest was in collaboration on maintenance, further development and mapping of the vocabularies. We decided to establish cooperation on this. As a first step both sides will make an inventory of existing vocabularies and we will exchange these vocabularies. Based on this, our team will make concrete proposals on how we could streamline and collaborate.

 

Meeting with Fionna Douglas:

 

At the end of the day I met Fionna Douglas (Program Manager, Agricultural and Rural Development Department). We talked about the sourcebook, e-Agriculture, e-learning and CIARD. Fionna wants an e-mail with more specific information on all three areas. She agreed completely with me on using the e-Agriculture platform to promote the sourcebook, and she is very interested in our e-learning activities. Regarding CIARD, she proposed bringing it to the "global donor platform". She also spoke about the CP program and possible funding for specific information management activities. It was only a 30-minute meeting, organized on the fly, but I think it was important. (She knew Anton and asked me who his successor would be; she also knew your name.)

 

Meeting with Eija Pehu

 

-  The URL of the ICT in Agriculture sourcebook will be www.ictinagriculture.org. It will be launched at the end of September. They will dedicate 80% of a full-time staff position to the content management of the website.

 

We agreed on the following points of collaboration on the sourcebook:

 

-  The sourcebook team will run a series of eFora on the individual chapters of the sourcebook, the first one in October after the launch. They will choose the subject experts but will use the e-Agriculture platform for these fora.

 

Eija was very interested in e-Agriculture and especially in the outreach possibilities through its large community.

 - The individual chapters of the sourcebook should be interlinked with the e-Agriculture knowledge base and with the CIARD pathways, which I showed to Eija. They will index the different chapters of the sourcebook. We will get in contact with the technical staff of the sourcebook to agree on how this interlinking through common indexes could be organized automatically between CIARD.net, e-Agriculture.org and ictinagriculture.org.

- The sourcebook will have an official launch in October, to which FAO will be invited to also present e-Agriculture.

Outside the sourcebook topic we discussed the following:

a) The World Bank needs to take a role in CIARD. The work of the "Agriculture and Rural Development Department" is strongly linked to the goals of CIARD. We should consider having the new OEK Director or someone else give a CIARD presentation at the Bank. This could also coincide with the launch of the sourcebook. Stephen should discuss this further with Eija when they meet in Finland.

b) They want to organize an e-Forum on Gender in Agriculture and want to know about FAO's activities in this area.

c) I got an e-learning application on "seed systems" produced by the Bank, which should go to AGP. I will hand this over to Andrew.

 

Library of Congress

 

About 25 participants from the “standards and MARC team” came to my talk. Talking to them about the necessity of standard vocabularies is like “carrying coals to Newcastle”: they simply agree.

 

So the discussion was dedicated to some of the open questions, i.e.

-        How can the mapping of vocabularies be maintained?

-        How can resources that are indexed with different vocabularies be aggregated?

-        Do we need more than dc.subject (e.g. coverage), or can this be handled through the properties of the URI, to reduce maintenance on the system side?
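
One possible answer to the aggregation question, sketched here only as an illustration and not as something agreed in the meeting: keep a mapping table between concept URIs of the different vocabularies and group records by a canonical concept. All URIs, names and the table layout below are hypothetical; in practice the table would come from maintained mappings between the vocabularies.

```python
from collections import defaultdict

# Hypothetical mapping table: concept URI in one vocabulary -> canonical URI
# in another. Concepts without an entry are kept as they are.
EXACT_MATCH = {
    "http://example.org/vocabB/maize": "http://example.org/vocabA/c_1",
}


def canonical(concept_uri):
    """Return the canonical concept URI for a subject URI."""
    return EXACT_MATCH.get(concept_uri, concept_uri)


def aggregate(records):
    """Group record ids by canonical concept.

    `records` is an iterable of (record_id, [subject URIs]) pairs.
    """
    groups = defaultdict(list)
    for record_id, subject_uris in records:
        # The set avoids listing a record twice when two of its subjects
        # map to the same canonical concept.
        for uri in sorted({canonical(u) for u in subject_uris}):
            groups[uri].append(record_id)
    return dict(groups)


# Three records indexed with two different (made-up) vocabularies.
records = [
    ("r1", ["http://example.org/vocabA/c_1"]),
    ("r2", ["http://example.org/vocabB/maize"]),
    ("r3", ["http://example.org/vocabB/rice"]),
]
print(aggregate(records))
```

The sketch also shows why the first question matters: the aggregation is only as good as the mapping table, so whoever owns the vocabularies has to keep that table current.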

 

After my presentation I joined a further meeting with Tom Baker on vocabulary presentation, in which the responsibilities of vocabulary owners were emphasized.

 

 

IFPRI

 

I had a talk with Chris Addison and Luz Marina Alvare on the situation of information management in IFPRI and in the CGIAR network at large. Luz Marina expressed her doubts about the direction in which the information centers in the CGIAR are going. In her opinion the emphasis is now too strong on packaging and communication and not strong enough on technical services for the researcher, which in her opinion should be the core of the work.

  IFPRI’s general strategy is to use cloud services as much as possible, e.g. ContentDM from OCLC for digital collection management. They have started to publish their own datasets as RDF/Linked Open Data, of which the World Hunger Index is already used by the FAO Country Profiles.

  We agreed to select and test some datasets for the possibility of direct linking to specific AGRIS records.

 

 

 



 

 

List of follow-up actions
·        Contacting Federico Sancho about the AGROVOC eConference in LA (Caterina Caracciolo)

·        Bringing the World Bank into contact with DIGIT in Brussels (Johannes Keizer)

·        Writing a statement on WebAGRIS for the Latin American AGRIS centers (Johannes Keizer)

·        Examining the use of openAtrium for AIMS and contacting Nuvole (Valeria Pesce)

·        Studying the AgNIC harvesting strategy and policies for AGRIS (Stefano Anibaldi)

·        Preparing a presentation for the VIVO meeting in August (Johannes Keizer)

·        Sending the journal JAD list to VIVO (Stefano Anibaldi)

·        Liaising with Mary Ochs on an AGRIS/USDA/VIVO repository interface (Johannes Keizer)

·        Liaising with John Ferreira on AgroTagger for TEAL (Johannes Keizer)

·        Contacting Lori Finch on further collaboration with NALT (Yves Jaques)

·        Contacting NAL about the AGRIS/USDA data exchange (Stefano Anibaldi)

·        Organizing the ICTinAgriculture fora on e-Agriculture (Michael Riggs)

·        Contacting Luisita Guanlao to get an inventory of the World Bank's vocabularies (Johannes Keizer)

·        Checking IFPRI datasets and contacting Chris Addison (Stefano Anibaldi)

·        Checking the importance of ContentDM for CIARD content management (Imma Subirats)

·        Following up with LoC on LODE-BD (Imma Subirats)

 

 

Annexes
01    http://www.slideshare.net/Keizer/agnic-2011-5

02    http://www.slideshare.net/Keizer/world-bank-201105

03    http://www.slideshare.net/Keizer/lo-c-20110518

04    http://www.slideshare.net/Keizer/cornell-2011-0513

05    http://www.slideshare.net/Keizer/nal-2011-0519