FAIR Principles – review in context of 4TU.ResearchData
The Dutch archive for the technical sciences - 4TU.ResearchData - has recently analysed itself in the context of the FAIR (Findable, Accessible, Interoperable and Re-usable) data principles. Below is an overview of the metadata that describes each dataset according to the following attributes: FAIR Principles (as scoring matrix) - 4TU.ResearchData Policy - General Comments.
___________________________________________________________________________________________________
To be Findable:
Principle | 4TU.ResearchData Policy | General Comment |
(meta)data are assigned a globally unique and eternally persistent identifier. | Yes, DOIs are employed for each dataset | To the extent that anything is eternally persistent |
data are described with rich metadata | Yes, multiple metadata fields are included | Rich metadata is a vague term |
(meta)data are registered or indexed in a searchable resource | Yes, our data is crawlable and we provide OAI-PMH sets (e.g., NARCIS, Thomson Reuters DCI) as well as presenting (part of) our metadata to the DataCite Metadata Store. There’s also a (hidden) SPARQL endpoint. | |
metadata specify the data identifier | Yes | Slightly opaque English; perhaps ‘metadata include the data identifier’ |
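As a minimal sketch of how a harvester might use the OAI-PMH sets mentioned above, the snippet below builds a standard `ListRecords` request URL. The base URL and the set name `datasets` are assumptions for illustration; the text does not give the repository's actual endpoint or set names.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the repository's real OAI-PMH base URL is not given in the text.
BASE_URL = "https://repository.example.org/oai"


def build_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL.

    `verb`, `metadataPrefix` and `set` are standard OAI-PMH request arguments;
    `oai_dc` is the Dublin Core format every OAI-PMH repository must support.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec  # restrict the harvest to one set
    return f"{base_url}?{urlencode(params)}"


# "datasets" is a made-up set name used only for this example.
url = build_list_records_url(BASE_URL, set_spec="datasets")
print(url)
```

The function only constructs the URL; sending the request and paging through resumption tokens are left out to keep the sketch short.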
To be Accessible:
Principle | 4TU.ResearchData Policy | General Comment |
(meta)data are retrievable by their identifier using a standardized communications protocol. | Yes, HTTP is used | |
the protocol is open, free, and universally implementable. | Yes, HTTP is open etc. | |
the protocol allows for an authentication and authorization procedure, where necessary. | Yes, users need to authenticate themselves | |
metadata are accessible, even when the data are no longer available. | Yes, this is part of our policy | This principle seems to be policy driven rather than technical and sits a little bit awkwardly in this section |
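To illustrate retrieval by identifier over HTTP, the sketch below prepares (without sending) a request that resolves a DOI through the doi.org resolver. The DOI is a placeholder, and the `Accept` media type assumes that the DOI is registered with DataCite and that DataCite JSON is offered through DOI content negotiation; neither detail comes from the text above.

```python
from urllib.parse import quote
from urllib.request import Request


def doi_request(doi, accept="application/vnd.datacite.datacite+json"):
    """Prepare an HTTP GET request for a DOI, asking for metadata rather than the landing page.

    The default media type is an assumption (DataCite JSON via content negotiation).
    """
    url = "https://doi.org/" + quote(doi, safe="/")
    return Request(url, headers={"Accept": accept})


# Placeholder DOI, not a real dataset.
req = doi_request("10.1234/example-dataset")
print(req.get_method(), req.full_url)
print(req.get_header("Accept"))
```

Passing `req` to `urllib.request.urlopen` would perform the actual lookup; for restricted data the repository's authentication step would come on top of this.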
To be Interoperable:
Principle | 4TU.ResearchData Policy | General Comment |
(meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation. * | Yes, data uses elements of DC, and can also be exposed as ORE RDF/XML | If you are not a metadata expert this language is opaque |
(meta)data use vocabularies that follow FAIR principles. | We use established vocabs where we can, but they might not follow the FAIR principles. If we develop our own, we publish the ontology. | The FAIR facet is a bit vague. E.g. we use ORCID, which can be seen as a vocabulary. We do not have an ontology or controlled vocabulary in place yet. |
(meta)data include qualified references to other (meta)data. | Sometimes, e.g. links to publications or ORCID identifiers (ORCID is an identifier) | This is vague. Does it mean links to other vocabs and thesauri? |
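For readers who are not metadata experts, a small example may make the Dublin Core (DC) elements and the references to other (meta)data more concrete. The record below is invented for illustration (placeholder title, DOI and ORCID iD) and is not taken from 4TU.ResearchData; only the DC element namespace is standard.

```python
import xml.etree.ElementTree as ET

# Standard namespace of the Dublin Core element set.
DC = "http://purl.org/dc/elements/1.1/"

# Invented record: every value below is a placeholder.
RECORD = """<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Example dataset</dc:title>
  <dc:creator>https://orcid.org/0000-0002-1825-0097</dc:creator>
  <dc:identifier>https://doi.org/10.1234/example-dataset</dc:identifier>
  <dc:relation>https://doi.org/10.1234/example-article</dc:relation>
</record>"""


def dc_values(xml_text, element):
    """Return the text of every direct child DC element with the given local name."""
    root = ET.fromstring(xml_text)
    return [node.text for node in root.findall(f"{{{DC}}}{element}")]


print(dc_values(RECORD, "identifier"))  # the dataset's own identifier
print(dc_values(RECORD, "relation"))    # a link to a related publication
```

Here `dc:identifier` carries the data identifier, while `dc:relation` and the ORCID iD in `dc:creator` are the kind of links to other (meta)data that the third principle appears to have in mind.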
To be Re-usable:
Principle | 4TU.ResearchData Policy | General Comment |
(meta)data have a plurality of accurate and relevant attributes | Yes, we employ many fields | How is this different from Principle F2? |
(meta)data are released with a clear and accessible data usage license. | Yes, metadata is released with a CC0 licence. But for data, we currently have our own bespoke licence. We are working on changing this. | This is a worthy aim, but difficult to achieve without much more policy development (for us) |
(meta)data are associated with their provenance. | The source of data is included in the metadata records, but the records do not show the file processing or how the final data was created. | It’s difficult to display this type of provenance information in metadata. If free-text documentation counts as metadata, then we could say we meet the principle |
(meta)data meet domain-relevant community standards | Partially. Difficult to have subject specific metadata when we cover so many different subjects. However, some data formats are tailored for particular domains. | This is difficult to achieve for non subject-specific repositories, especially for descriptive metadata |
Source: FAIR Principles - review in context of 4TU.ResearchData
Related content:
- Evaluation of data repositories based on the FAIR Principles for IDCC 2017 practice paper (the corresponding Excel Spreadsheet with the evaluation overview of 37 data repositories, statistical analysis and graphical figures)
- Core Trustworthy Data Repositories Requirements (The DSA Board)
- The ‘Guidelines on FAIR Data Management in Horizon 2020’ by the European Commission.
- The short version of the ‘FAIR data principles’
- The extended version of the ‘FAIR data principles’
- The Nature article ‘The FAIR Guiding Principles for scientific data management and stewardship’
- Put FAIR principles into practice and enjoy your data!
- Online support tools for Fair Use analysis, such as: (1) ALA's interactive Fair Use Evaluator Tool, (2) University of Minnesota Libraries' Thinking Through Fair Use interactive website, and (3) (for video and film creation) The Fair Use App from New Media Rights