On the 4th & 5th of June 2013 we held a workshop at the Botanics on using stable HTTP URIs (sometimes called URLs) for specimens. This was the result of a paper we published last year and then presented at a meeting of CETAF-ISTC in Belgium in March this year. The approach we were taking seemed attractive to several institutions so we organised this workshop to iron out any issues and see how we could get the approach more widely adopted.
Summary for Humans
Natural history collections and herbaria contain many millions of specimens that are used for research. When scientists publish their results they cite which specimens they used so that other scientists can both check the work and build on what has been achieved.
Institutions that hold specimens are publishing increasing amounts of data about (and images of) their specimens on-line. We need to have a way for scientists to reference specimens so that someone reading research results can simply click a link to see the supporting data and perhaps an image. To make this happen we need stable web links to the specimens that the holding institutions commit to maintain for the long term and that are implemented in a similar way across many institutions. Once this mechanism is widely adopted machines will be able to exploit the links to specimens to help do entirely new kinds of research.
This meeting was about establishing a consistent mechanism that will work across institutions.
Thirteen people attended the workshop. In order of appearance in the photograph (left to right, top to bottom), they were:
- Martin Pullan – Royal Botanic Garden Edinburgh, UK
- Ayco Holleman – Naturalis, The Netherlands
- Anton Güntsch – Botanischer Garten und Botanisches Museum Berlin-Dahlem, Germany
- Dominik Röpert – Botanischer Garten und Botanisches Museum Berlin-Dahlem, Germany
- Nicola Nicolson – Royal Botanic Gardens Kew, UK
- Simon Chagnoux – Muséum national d’histoire naturelle, Paris, France
- Robyn Drinkwater – Royal Botanic Garden Edinburgh, UK
- Rob Cubey – Royal Botanic Garden Edinburgh, UK
- Falko Glöckler – Museum für Naturkunde, Berlin, Germany
- Ben Scott – Natural History Museum, London, UK
- Joerg Lange – Staatliches Museum für Naturkunde Stuttgart, Germany
- Roger Hyam – Royal Botanic Garden Edinburgh, UK
- Elspeth Haston – Royal Botanic Garden Edinburgh, UK
The following presentations were given:
- Introduction and Scoping – Roger Hyam (with contribution from Elspeth Haston)
- Digitisation and stable URIs at MfN Berlin – Falko Glöckler
- NHM London – Ben Scott
- How the specimen data is organised and published at BGBM – Dominik Röpert
- Specimen Data at RBGE – Martin Pullan
- Online Specimens at MNHN – Simon Chagnoux
- Level 1: Stable HTTP URIs – Roger Hyam
- Level 1: What if? – Roger Hyam (slightly modified to include an aside on 3rd party suppliers)
- Level 2 – Linked data – Roger Hyam (degrades into demos at the end)
- Web Service Registration – Anton Güntsch
It was agreed that having stable HTTP URIs, at least to implementation level 1, was desirable. There was some discussion of the details of linked data implementation at level 2, particularly around the bookmarking of pages by users. If a user is viewing an HTML document that they reached via a 303 redirect, they feel as though they are looking at the specimen itself and might bookmark the page URL as if it were the specimen URI. The severity of this issue, and strategies to mitigate or remove it, will continue to be discussed.
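To make the bookmarking issue concrete, here is a minimal sketch of the kind of 303 redirect behaviour being discussed. The base URL, the `/page/` and `/data/` paths and the function name are all assumptions made up for illustration; nothing here was agreed at the workshop, and the content negotiation is deliberately naive (it ignores q-values).

```python
from typing import Tuple

# Hypothetical base URL and path layout, for illustration only.
BASE = "http://data.example.org"


def resolve_specimen(specimen_id: str, accept: str) -> Tuple[int, str]:
    """Return (status, Location) for a request to a specimen URI.

    The specimen URI itself never returns a document. It always answers
    with 303 See Other, pointing at a document *about* the specimen.
    """
    # Naive check: machines asking for RDF get the data document,
    # everything else (e.g. a browser) gets the HTML page.
    if "application/rdf+xml" in accept.lower():
        return 303, f"{BASE}/data/{specimen_id}.rdf"
    return 303, f"{BASE}/page/{specimen_id}.html"


# A browser asking for the specimen ends up on the HTML page URL.
status, location = resolve_specimen("E00000001", "text/html")
```

In this sketch the browser's address bar ends up showing the `/page/` URL rather than the specimen URI, which is exactly why a bookmark made at that point captures the document and not the specimen.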
Anton presented ideas for how we could expose the existence of the URIs to users. It was agreed that we would try two things:
- Registering the HTTP URIs for an institution with the BiodiversityCatalogue, which is a service providing a curated catalogue of Biodiversity Web Services.
- Providing a single RDF download file of all the HTTP URIs provided by each institution, similar to a sitemap used by Google and other search engines. This would be the simplest method for syndicating data between sites and allowing people to start building services.
Anton agreed to create a specification document for the web services registration and index file.
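Until that specification exists the format of the index file is open, but as a rough sketch it could be as simple as one RDF statement per specimen URI. The example below writes N-Triples, which is one possible RDF serialisation; the class URI, the example specimen URIs and the function name are hypothetical placeholders, not part of any agreed specification.

```python
from typing import Iterable

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
# Hypothetical class URI; the real one would come from the specification.
SPECIMEN_CLASS = "http://example.org/terms/Specimen"


def build_index(uris: Iterable[str]) -> str:
    """Return an N-Triples document with one type statement per URI."""
    lines = [f"<{uri}> <{RDF_TYPE}> <{SPECIMEN_CLASS}> ." for uri in uris]
    return "\n".join(lines) + "\n"


# Made-up specimen URIs, for illustration only.
index = build_index([
    "http://data.example.org/E00000001",
    "http://data.example.org/E00000002",
])
```

A consumer could then read the file line by line and pick out the subject of each statement, much as a search engine reads the entries of a sitemap.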
Of the eight institutions represented in the workshop, five are either in the process of implementing stable HTTP URIs or are planning to do so. All five committed to having working implementations in place and registered as web services by October this year. These five institutions were, in no particular order:
- Royal Botanic Garden Edinburgh, UK
- Botanischer Garten und Botanisches Museum Berlin-Dahlem, Germany
- Muséum national d’histoire naturelle, Paris, France
- Museum für Naturkunde, Berlin, Germany
- Royal Botanic Gardens Kew, UK
The three other institutions were at different stages in the development of their infrastructure and couldn’t commit to this approach for now, but did express an interest in following this pattern in the future.
We plan to have a workshop in Berlin around the 8th to 11th October in the context of a larger pro-iBiosphere meeting, where we will develop a test application to confirm that these five institutions’ systems are working. We might also have some fun developing demonstration applications to show how the system can be used.
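The test application has not been designed yet, but one check it might perform can be sketched as follows, assuming the level 2 behaviour of answering with a 303 redirect for both HTML and RDF requests. The function names are hypothetical, the fetcher only handles plain `http` URIs, and an institution implementing level 1 only would need a different check.

```python
import http.client
from typing import Callable, Dict, Optional, Tuple
from urllib.parse import urlsplit

Fetcher = Callable[[str, str], Tuple[int, Optional[str]]]


def fetch_once(uri: str, accept: str) -> Tuple[int, Optional[str]]:
    """Make one GET request without following redirects."""
    parts = urlsplit(uri)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        conn.request("GET", path, headers={"Accept": accept})
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def check_specimen_uri(uri: str, fetch: Fetcher = fetch_once) -> Dict[str, bool]:
    """Report whether the URI redirects (303) for HTML and for RDF."""
    results = {}
    for label, accept in (("html", "text/html"), ("rdf", "application/rdf+xml")):
        status, location = fetch(uri, accept)
        # Pass only if we got a 303 and were told where to go next.
        results[label] = status == 303 and bool(location)
    return results
```

Passing the fetcher in as a parameter means the check can be tried out against a fake fetcher before pointing it at a live institutional server.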
As we demonstrate the value in this way of working we hope to encourage other institutions to adopt HTTP URIs for their specimens. TDWG in Florence in November and the pro-iBiosphere projects may be suitable platforms for this. We have made no attempt so far to engage with institutions outside of Europe.
There has been overlapping/complementary work going on under the pro-iBiosphere banner on stable identifiers and a useful page discussing stable HTTP URIs is available there: Best Practices for Stable URIs.
For general reading on Linked Data there is a Linked Data Book that can be read for free on-line.