[Unfortunately, most of the links are obsolete, referring to servers that don’t exist anymore.]
One of the deliverables of the ODaP (Open Data and Publications) project is to have 40 Resource Maps with linked datasets and publications.
Our original plan was to use the ORE gateway that we developed a few years ago, see http://ore.place.pukurin.uvt.nl/. This turned out to be too cumbersome. We have now implemented support for OAI ORE Resource Maps directly in our search engine, which is based on Meresco from CQ2. I didn’t use the RDF and triple store support of Meresco itself: there was not enough time to get familiar with it, and what we want is rather simple. There is not even a need to store the triples; they are generated dynamically from the XML “parts” that the search engine stores for each document. Moreover, we can make use of the record processing that is already implemented. Perhaps at a later stage we can make use of an RDF library. The sources of our Meresco variant, which is called bzv, are here: https://svn.non-gnu.uvt.nl/uvt-dev/trunk/sources/meresco/bzv/. For the ORE support we added a new module, see https://svn.non-gnu.uvt.nl/uvt-dev/trunk/sources/meresco/bzv/src/ore.py (ugly but effective).
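To make the idea of generating triples on the fly from a stored XML part more concrete, here is a minimal sketch. It is not the code in ore.py: the sample record, the element names, the FIELD_TO_PREDICATE mapping and the function triples_from_part are all hypothetical and only illustrate the approach with the Python standard library.

```python
import xml.etree.ElementTree as ET

DCTERMS = "http://purl.org/dc/terms/"

# Hypothetical, simplified stand-in for an XML "part" stored per document.
PART = """<record>
  <title>Example publication</title>
  <creator>A. Author</creator>
  <creator>B. Author</creator>
</record>"""

# Hypothetical mapping from element names in the part to RDF predicates.
FIELD_TO_PREDICATE = {
    "title": DCTERMS + "title",
    "creator": DCTERMS + "creator",
}


def triples_from_part(subject_uri, part_xml):
    """Yield (subject, predicate, literal) tuples without storing them."""
    root = ET.fromstring(part_xml)
    for child in root:
        predicate = FIELD_TO_PREDICATE.get(child.tag)
        if predicate and child.text:
            yield (subject_uri, predicate, child.text.strip())


triples = list(triples_from_part("http://example.org/pub/1", PART))
for triple in triples:
    print(triple)
```

Because the triples are derived from the part at request time, nothing beyond the existing record storage is needed; the price is that every request repeats the conversion.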
There are two test servers that have implemented the ore support:
- http://evs.uvt.nl
- http://bzv.uvt.nl
These are test servers, so they can be unavailable or broken. Resource Maps can be generated for all documents stored by a search engine. For the Get It! server (search.uvt.nl) this will mean 6,000,000 Resource Maps (there is a bug in generating Resource Maps for our 700,000 catalog records; this will be fixed).
The following URL templates are supported:
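- http(s)://{host}/ore/{identifier} – Aggregation
- http(s)://{host}/ore/{identifier}.rdf – Resource Map
- http(s)://{host}/pub/{identifier} – Publication
- http(s)://{host}/mods/{identifier}.xml – MODS
- http(s)://{host}/ddi/{identifier}.xml – DDI, version 2
- http(s)://{host}/ddi3/{identifier}.xml – DDI, version 3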
The first four URLs will always work, but the URLs for the DDI work only when there is a DDI document available.
Requests for the Aggregation and Publication URLs are redirected with a 303 See Other response to the corresponding Resource Map.
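A minimal sketch of how these URL patterns and the 303 redirects could be dispatched. It is not taken from ore.py: the names ROUTES, resolve and has_ddi are hypothetical, and it assumes that the record identifier sits between the path prefix and the suffix, as in the example URLs.

```python
import re

# The Resource Map pattern is listed before the Aggregation pattern so that
# a path ending in ".rdf" is not mistaken for an Aggregation URL.
ROUTES = [
    (re.compile(r"^/ore/(?P<id>.+)\.rdf$"), "resource_map"),
    (re.compile(r"^/ore/(?P<id>.+)$"), "aggregation"),
    (re.compile(r"^/pub/(?P<id>.+)$"), "publication"),
    (re.compile(r"^/mods/(?P<id>.+)\.xml$"), "mods"),
    (re.compile(r"^/ddi/(?P<id>.+)\.xml$"), "ddi2"),
    (re.compile(r"^/ddi3/(?P<id>.+)\.xml$"), "ddi3"),
]


def resolve(path, has_ddi=lambda identifier, kind: False):
    """Return (status, value): a redirect target for 303, a kind for 200."""
    for pattern, kind in ROUTES:
        match = pattern.match(path)
        if not match:
            continue
        identifier = match.group("id")
        if kind in ("aggregation", "publication"):
            # Non-information resources redirect to the Resource Map.
            return 303, "/ore/%s.rdf" % identifier
        if kind in ("ddi2", "ddi3") and not has_ddi(identifier, kind):
            # DDI URLs only work when a DDI document is available.
            return 404, None
        return 200, kind
    return 404, None


print(resolve("/ore/evs-uvt-nl:oai:evs.uvt.nl:3256420"))
print(resolve("/ore/evs-uvt-nl:oai:evs.uvt.nl:3256420.rdf"))
```

The has_ddi callback stands in for whatever lookup decides whether a DDI document exists for a record; by default it reports that none is available.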
Examples are:
http://evs.uvt.nl/search?displayType=single&query=evs-uvt-nl:oai:evs.uvt.nl:3256420 (Human Start Page)
http://evs.uvt.nl/ore/evs-uvt-nl:oai:evs.uvt.nl:3256420.rdf (Resource Map)
and
https://bzv.place.pukurin.uvt.nl/search?displayType=single&query=ir-uvt-nl:oai:wo.uvt.nl:167602 (Human Start Page)
https://bzv.place.pukurin.uvt.nl/ore/ir-uvt-nl:oai:wo.uvt.nl:167602.rdf (Resource Map)
The SURFshare community has an application profile under development for Resource Maps in RDF XML, see http://wiki.surffoundation.nl/display/vp/Resource+Maps+in+RDF+XML
I will now indicate where I have not followed this document. I refer to version 0.9.
- No dcterms:issued for Aggregations. It is unclear when Aggregations are issued. They are generated according to rules (computer code). “Issued” implies a human act, but in our case there is no such act.
- Eprint as it is used in the SURFshare document is overloaded. On the one hand it stands for the publication (that is enhanced) and on the other hand it stands for a corresponding object file. It is possible to have a publication without a corresponding object file, but with an enrichment in the form of a dataset or the operationalisations of the concepts used in the study. In the European Values Study, this is often the case. It is also possible to have several object files that are versions of each other, e.g. files in different formats, or to have separate files for parts of the publication (chapters). The metadata of a publication (work) are not the same as the metadata of a file. In the Aggregation we have one resource that is the publication, with the metadata of the publication, and we have zero, one or more object file resources. Our arguments for making this distinction between the publication and the object files are the same as in Use of MODS for institutional repositories. There is a describedBy relation from the publication resource to the MODS metadata resource. Both resources are in the Aggregation. The object files are treated as … files. They are also part of the Aggregation.
The type of a publication resource is one of the publication types from info:eu-repo/semantics/. The type of the object file resources is info:eu-repo/semantics/objectFile.
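The structure described above (one publication resource, its MODS resource, and zero or more object files, all aggregated) can be sketched as RDF/XML with the Python standard library. This is an illustration under assumptions, not the output of ore.py: the function build_resource_map, the example.org URIs and in particular the ex:describedBy predicate are hypothetical placeholders, since the actual predicate used for that relation is not stated here.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ORE = "http://www.openarchives.org/ore/terms/"
EX = "http://example.org/terms/"  # hypothetical namespace for describedBy

ET.register_namespace("rdf", RDF)
ET.register_namespace("ore", ORE)
ET.register_namespace("ex", EX)


def q(namespace, name):
    """Build an ElementTree qualified name."""
    return "{%s}%s" % (namespace, name)


def describe(root, about, rdf_type=None):
    """Add an rdf:Description for a resource, optionally with an rdf:type."""
    node = ET.SubElement(root, q(RDF, "Description"), {q(RDF, "about"): about})
    if rdf_type:
        ET.SubElement(node, q(RDF, "type"), {q(RDF, "resource"): rdf_type})
    return node


def link(node, namespace, name, target):
    """Add a property whose value is another resource."""
    ET.SubElement(node, q(namespace, name), {q(RDF, "resource"): target})


def build_resource_map(base, identifier, pub_type, object_files):
    aggregation = "%s/ore/%s" % (base, identifier)
    publication = "%s/pub/%s" % (base, identifier)
    mods = "%s/mods/%s.xml" % (base, identifier)

    root = ET.Element(q(RDF, "RDF"))
    rem = describe(root, aggregation + ".rdf", ORE + "ResourceMap")
    link(rem, ORE, "describes", aggregation)

    agg = describe(root, aggregation, ORE + "Aggregation")
    for member in [publication, mods] + list(object_files):
        link(agg, ORE, "aggregates", member)

    pub = describe(root, publication, "info:eu-repo/semantics/" + pub_type)
    link(pub, EX, "describedBy", mods)  # placeholder predicate

    for object_file in object_files:
        describe(root, object_file, "info:eu-repo/semantics/objectFile")
    return ET.tostring(root, encoding="unicode")


rdf_xml = build_resource_map(
    "http://example.org", "rec:1", "article", ["http://example.org/files/1.pdf"]
)
print(rdf_xml)
```

Passing an empty list for object_files gives the case of a publication without an object file, where the Aggregation contains only the publication and its MODS resource.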
Still to do and issues:
- lots of details
- use file details from the DIDL container in the metadata for the object file resources.
- add data object resources. The data objects are described in the DDI. This can be compared to the Data and supplementary material tab of https://bzv.uvt.nl/search?displayType=single&query=ir-uvt-nl:oai:wo.uvt.nl:167602 that is also based on information from the DDI document, i.e. from https://bzv.uvt.nl/ddi/ir-uvt-nl:oai:wo.uvt.nl:167602. Note that we distinguish between datasets, files with supplementary materials, and combination files with both datasets and supplementary materials.
- add operationalisations (European Values Study) to the Aggregation. This is the kind of information that is displayed in the Operationalisations tab of http://evs.uvt.nl/search?displayType=single&query=evs-uvt-nl:oai:evs.uvt.nl:3256420 and that comes from this DDI: http://evs.uvt.nl/ddi3/evs-uvt-nl:oai:evs.uvt.nl:3256420. The same goes for the information about the waves and the countries, see the corresponding tab. I am not sure how to represent this in RDF. One idea is to represent the operationalisations used in a study as a separate Aggregation.
- expand the publication resource description with more metadata (extracted from the MODS). In the end the MODS becomes redundant because all the information in the MODS record is expressed as rdf triples.
- dcterms:isPartOf is used to relate a publication to what is called in MODS a related item (type=”host” or type=”series”). At the moment the related item is represented by a literal, which is not the intention of dcterms:isPartOf. This literal is in the form of the traditional “source” field which we use in the user interface of the search engine. For “human consumption” this is very clear. Should the RDF description of the publication have the same granularity as the MODS? And how to link to the related items when we have no URIs for them? (I packed a lot of issues in one bullet point 😉)
- I couldn’t find an ontology or vocabulary to express that a MODS resource “describes” a publication.
- How to express relations between publications and object files?
- Relate the publication resource to the DOI and other publisher-controlled identifiers. The integration of the ‘place’ locator that knows about DOIs, etc. is at the moment at the level of JavaScript (in the browser). That’s a problem.
- Add extra information from Webwijs to the foaf descriptions of authors.
- How to publish the Resource Maps? I have a preference for sitemaps. Very easy for me to implement.
- What is the added value of the InContext visualizer for the Aggregations of the European Values Study or of ODaP?
- And so on.
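The sitemap option from the list above could look roughly like the following sketch, which lists the Resource Map URL of each record in a sitemap file. The function build_sitemap, the base URL and the sample identifiers are hypothetical; only the sitemap namespace and the urlset/url/loc elements come from the sitemap protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Serialize the sitemap namespace as the default namespace (no prefix).
ET.register_namespace("", SITEMAP_NS)


def build_sitemap(base, identifiers):
    """Return a sitemap document with one Resource Map URL per identifier."""
    urlset = ET.Element("{%s}urlset" % SITEMAP_NS)
    for identifier in identifiers:
        url = ET.SubElement(urlset, "{%s}url" % SITEMAP_NS)
        loc = ET.SubElement(url, "{%s}loc" % SITEMAP_NS)
        loc.text = "%s/ore/%s.rdf" % (base, identifier)
    return ET.tostring(urlset, encoding="unicode")


sitemap_xml = build_sitemap("http://example.org", ["set:oai:a:1", "set:oai:a:2"])
print(sitemap_xml)
```

The sitemap protocol allows at most 50,000 URLs per sitemap file, so a collection with millions of Resource Maps would have to be split over many such files, tied together by a sitemap index.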