-- duplicates because they also harvest Scielo
-- their schema is custom (but we already have a conversion XSL)
-- harvest, convert and dedupe (approx. 100k records) as a one-off (do not build a configurable multi-use de-duping component)
-- in future, stop harvesting from Scielo since its records are in DOAJ anyway (true for other providers as well)
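The one-off de-duplication pass noted above could be sketched roughly as follows. This is only an illustration under assumptions: the record fields ("title", "year") and the matching rule (normalized title plus year) are hypothetical and not taken from the actual AGRIS or Scielo schema.

```python
import re
import unicodedata


def normalize_title(title):
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", title)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = re.sub(r"[^a-z0-9]+", " ", no_accents.lower())
    return cleaned.strip()


def dedupe_key(record):
    # Assumption: each converted record is a dict with "title" and "year".
    return (normalize_title(record.get("title", "")), record.get("year"))


def dedupe(records):
    """Keep the first record seen for each key; drop later duplicates."""
    seen = set()
    unique = []
    for record in records:
        key = dedupe_key(record)
        if key in seen:
            continue
        seen.add(key)
        unique.append(record)
    return unique
```

A set of keys is enough for roughly 100k records and keeps this a throwaway script rather than a configurable component.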
Will we publish one in 2012? (Johannes decision)
-- need to get other kinds of data in there before May
-- Stefano will research good APIs, whether SPARQL or other
-- Stefano is documenting the AGRIS business process so we can better evaluate possible ARIADNE cooperation
-- Fabrizio is investigating the state of the code and documentation in Ariadne
-- Team is working on a document that defines a set of principles and contains numerous real user scenarios with recommended courses of action derived from those principles
-- Does LODE-BD recommend following DC encoding guidelines? Example: DFID data has multiple creators in the same element, which is not recommended DC practice. Can we therefore reject such data and keep it out of AGRIS, or do we accept it because LODE-BD does not specify it as a bad practice? (Note that by this rule we should also not accept AGRIS AP, as it nests elements, another DC bad practice.)
-- Should we accept any keywords in dc:subject elements? We think yes.
-- Should we inform providers of poor practice in their data export? What is the limit after which we will not accept their data? Examples:
-- Incorrect use of CDATA blocks. DFID encodes dc:description in CDATA blocks, so the content remains unparsed and HTML markup in these blocks is displayed incorrectly on the web, for example as a literal <br/>. We suggest warning them but otherwise displaying the content in AGRIS as it is encoded, even though it renders incorrectly.
-- Broken dc:identifier links. DFID again: the links are broken. We suggest warning providers and adding a script at import time that checks whether a link actually produces a resource.
-- Domain specificity. We feel a provider must at least be able to guarantee that its records fit into the AGRIS domain, which is described on the website. DFID again is providing records that have nothing to do with agriculture, such as "road building". We think a provider should be rejected in this case.
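The import-time link check suggested for dc:identifier values could look roughly like the sketch below. It is an assumption-laden illustration, not the actual import script: the function name link_resolves and the injectable opener parameter are hypothetical, and it treats any 2xx or 3xx response to a HEAD request as "produces a resource".

```python
import urllib.error
import urllib.request


def link_resolves(url, timeout=10, opener=urllib.request.urlopen):
    """Return True if the URL answers a HEAD request with a 2xx/3xx status."""
    try:
        # Request() raises ValueError for strings that are not URLs.
        request = urllib.request.Request(url, method="HEAD")
        with opener(request, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, ValueError, OSError):
        # HTTP errors (4xx/5xx), DNS failures and timeouts all count as broken.
        return False
```

Some servers reject HEAD requests, so a real script might retry with GET before flagging a link to the provider.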
-- Stefano is preparing a list of repositories and websites for INFN