In a previous blog post, I described the work that Silvio Peroni and I undertook in May 2011 to map the main terms from the DataCite Metadata Kernel v2.0 to RDF.
To enable that, we created a ‘proto-ontology’, the DataCite Ontology version 0.2, that contained just the following four object properties:
datacite:hasPrimaryIdentifier
datacite:hasAlternateIdentifier
datacite:hasRelatedIdentifier
datacite:hasPersonalIdentifier
These properties permitted us to provide the identifier descriptions required by DataCite, which could not be expressed using other ontologies. We did this using the following type of construction:
:this-dataset datacite:hasPrimaryIdentifier [ a prism:doi ; literal:hasLiteralValue "***" ] .
in which the object property relates to a blank node defining something that is a DOI and that has the particular literal value specified.
In July 2012, to update and expand the DataCite2RDF mapping so that it conforms to the DataCite Metadata Kernel v2.2 (published in July 2012), we undertook a complete revision of the DataCite Ontology. The revision lets us create mappings for all the DataCite metadata elements, not just the core ones.
As a result, the latest version of the DataCite Ontology, version 0.6.1, created on 18th July 2012, now has 11 new classes and 5 new object properties, as shown in the following table:
Note that the four original specific object properties have been replaced by the single object property datacite:hasIdentifier, and that the method of defining the identifier has been changed. Now, rather than the object property relating to a blank node in which the identifier is defined as a literal, the object property datacite:hasIdentifier has as its object a member of the class datacite:Identifier, or of one of its three sub-classes, datacite:PersonalIdentifier, datacite:FunderIdentifier or datacite:ResourceIdentifier, as shown in the following diagram:
The exact nature of the identifier is then defined using the second DataCite object property datacite:usesIdentifierScheme that has as its object the class datacite:IdentifierScheme or one of its three sub-classes: datacite:PersonalIdentifierScheme, datacite:FunderIdentifierScheme or datacite:ResourceIdentifierScheme.
This provides a robust method for defining identifiers, since each specific identifier is defined as an individual member of its appropriate identifier scheme class. Using the new DataCite Ontology, these three types of identifier scheme can be used as follows:
:this-dataset a fabio:Dataset ;
    datacite:hasIdentifier [
        a datacite:PrimaryResourceIdentifier ;
        literal:hasLiteralValue "XXX" ;
        datacite:usesIdentifierScheme datacite:doi
    ] ;
    dcterms:creator [
        a foaf:Person ;
        foaf:name "Smith, Jane" ;
        datacite:hasIdentifier [
            a datacite:PersonalIdentifier ;
            literal:hasLiteralValue "YYY" ;
            datacite:usesIdentifierScheme datacite:orcid
        ]
    ] ;
    foaf:fundedBy [
        a foaf:Organization ;
        foaf:name "Wellcome Trust" ;
        datacite:hasIdentifier [
            a datacite:FunderIdentifier ;
            literal:hasLiteralValue "ZZZ" ;
            datacite:usesIdentifierScheme datacite:fundref
        ]
    ] .
where datacite:doi is an individual member of the class datacite:ResourceIdentifierScheme specifying a DataCite Digital Object Identifier, datacite:orcid is an individual member of the class datacite:PersonalIdentifierScheme specifying an Open Researcher and Contributor Identifier, and datacite:fundref is an individual member of the class datacite:FunderIdentifierScheme specifying a FundRef Funder Identifier.
As the need arises, new identifier schemes can be added later as new members of each class, without having to modify the structure of the DataCite Ontology. We have already extended the DataCite specification with three such members, datacite:local-resource-identifier-scheme, datacite:local-personal-identifier-scheme and datacite:local-funder-identifier-scheme, to permit the use of local identifiers, and have requested that DataCite include such local identifier schemes in their next release (version 2.3) of the DataCite Metadata Kernel.
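To make this extension mechanism concrete, here is a minimal sketch in Turtle. It assumes the prefix IRIs shown (please check them against the published ontologies), and it introduces a purely hypothetical scheme, ex:lab-accession-scheme, in an example namespace. The literal values are placeholders, in the same spirit as "XXX" above.

```turtle
# Prefix IRIs are assumptions; verify them against the published ontologies.
@prefix datacite: <http://purl.org/spar/datacite/> .
@prefix fabio:    <http://purl.org/spar/fabio/> .
@prefix literal:  <http://www.essepuntato.it/2010/06/literalreification/> .
@prefix ex:       <http://example.org/> .

# A hypothetical new scheme, added as an individual of an existing class.
# No change to the ontology's classes or properties is needed.
ex:lab-accession-scheme a datacite:ResourceIdentifierScheme .

ex:this-dataset a fabio:Dataset ;
    # An identifier that is meaningful only within the holding institution.
    datacite:hasIdentifier [
        a datacite:ResourceIdentifier ;
        literal:hasLiteralValue "local-0001" ;   # placeholder value
        datacite:usesIdentifierScheme datacite:local-resource-identifier-scheme
    ] ;
    # An identifier drawn from the hypothetical scheme declared above.
    datacite:hasIdentifier [
        a datacite:ResourceIdentifier ;
        literal:hasLiteralValue "ACC-42" ;       # placeholder value
        datacite:usesIdentifierScheme ex:lab-accession-scheme
    ] .
```

Both identifiers use exactly the same pattern as the DOI example; only the object of datacite:usesIdentifierScheme differs, which is what allows new schemes to be introduced without touching the ontology itself.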
Version 2.2 of the DataCite Metadata Kernel has a property “Description”, with four permitted values: ‘abstract’, ‘other’, ‘series information’ and ‘table of content’. The DataCite team recognises that this rather rag-bag collection of values makes use of the Description property highly problematic, and has referred this matter to its metadata committee for re-consideration.
Nevertheless, to complete our development of the DataCite Ontology, thus permitting mapping of all the DataCite Metadata Kernel v2.2 metadata properties, we have created a new class, datacite:DescriptionType, and two final DataCite object properties, datacite:hasDescription and datacite:hasDescriptionType. These allow us to link an entity to another item that represents a description of that entity. The type of description is specified using the property datacite:hasDescriptionType, whose object must be one of the members of the class datacite:DescriptionType, i.e. datacite:abstract, datacite:other, datacite:series-information or datacite:table-of-content. In this way it is possible to associate written documents (e.g. journal articles or ‘data articles’) as descriptions of datasets, as shown in the following excerpt:
:this-dataset a fabio:Dataset ; datacite:hasDescription [ a fabio:JournalArticle ; datacite:hasDescriptionType datacite:other ] .
We expect the membership of the class datacite:DescriptionType to expand once the DataCite Metadata Kernel v2.3 is published.
The following blog post describes the revised DataCite2RDF mapping created using this revised DataCite Ontology. We commend the use of this mapping to all who wish to encode DataCite metadata in RDF.