OAI-PMH

Motivation

In 2012, Leipzig University Library (UB Leipzig) needed a URN server that, each night, collects all URNs created during the day and makes them available for retrieval by the Deutsche Nationalbibliothek (DNB), so that they can be included in the DNB index. Then as now, the DNB used OAI clients to retrieve the URNs. Stefan Freitag, now head of software development at Leipzig University Library, first implemented an OAI-PMH server for URNs using the xepicur format. In 2018, it was reimplemented with more current technologies (including Java 11 and the Vert.x framework).

At the same time, the idea arose to make METS/MODS files publicly available as well. METS/MODS files are created during the Kitodo digitization workflow and are used for IIIF processing. IIIF is a wonderful technology for exchanging and presenting image files, but it lacks the functionality to present detailed bibliographic information, which is aggregated in the METS/MODS data instead. For this reason, Stefan Freitag modified the OAI server and set up a new instance. Since then, the METS/MODS data have been made available via this server. To make the interface more efficient, the data is cached in a database, which is in turn supplied with data at intervals from an in-house interface (igiL backend). This gives Leipzig University Library high-performance caching for the many and sometimes very large METS/MODS descriptions.

Usage and example of use

The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is a standardized web-based protocol for metadata harvesting that provides an application-independent interoperability framework. Metadata is collected with a harvester (client application).

OAI-PMH is used primarily to reconcile data between databases. However, this does not rule out obtaining metadata via the interface without a separate database. As mentioned at the beginning, the interface is mainly used for data synchronization with the DNB. In this way, records can be exchanged automatically between the institutions in a defined standard format.

The simplest such client is a web browser. However, programmers can also build requests into their own software, which sends them to the server, receives the response and processes it accordingly.

In the following, we show how to use the interface. A detailed description of the protocol can be found at https://www.openarchives.org/OAI/openarchivesprotocol.html

Request

OAI-PMH requests must be sent using either HTTP GET or HTTP POST. POST has the advantage that the length of the arguments is not restricted. The arguments must be URL-encoded.
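As a rough illustration of URL encoding and of a POST request, the following sketch uses Python's standard library. The base URL is a hypothetical placeholder, and the argument values are chosen only to show how reserved characters such as the colon are encoded; they are not taken from the UB Leipzig server.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical placeholder; substitute the repository's real base URL.
BASE_URL = "https://oai.example.org/oai"

# Illustrative arguments; the timestamp contains colons that must be encoded.
args = {
    "verb": "ListRecords",
    "metadataPrefix": "oai_dc",
    "from": "2020-01-01T00:00:00Z",
}

# urlencode percent-encodes reserved characters and joins the pairs with "&".
encoded = urlencode(args)

# For POST, the encoded arguments travel in the request body, not in the URL.
post_request = Request(
    BASE_URL,
    data=encoded.encode("ascii"),
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)

print(encoded)
print(post_request.get_method())
```

The request object is only constructed here, not sent; passing it to urllib.request.urlopen would perform the actual call.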

A request consists of a base URL followed by keyword arguments. In a GET request, the base URL and the keyword arguments are separated by a question mark [?].

Base URL of UB Leipzig

Keyword Arguments

In addition to the base URL, every query carries a list of keyword arguments in the form of key-value pairs (key=value). The arguments can be given in any order. Multiple arguments are separated by an ampersand [&]. Please note that each OAI-PMH request must contain at least one key-value pair, namely the verb argument.
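The rules above can be sketched in a few lines of Python. The base URL is a hypothetical placeholder and build_url is a helper name invented for this example.

```python
from urllib.parse import urlencode

# Hypothetical placeholder; substitute the repository's real base URL.
BASE_URL = "https://oai.example.org/oai"


def build_url(base_url, arguments):
    """Join base URL and key=value pairs with "?" and "&"."""
    if "verb" not in arguments:
        # Every OAI-PMH request needs at least the verb argument.
        raise ValueError("an OAI-PMH request requires a 'verb' argument")
    return base_url + "?" + urlencode(arguments)


identify_url = build_url(BASE_URL, {"verb": "Identify"})
print(identify_url)
```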

Example: retrieving the repository's self-description from the OAI server using the value “Identify”:

Base URL:

Keyword arguments:

By default, repositories publish their base URL as the value of the baseURL element in the Identify response.
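A client could read that element as in the following sketch. The sample response is invented for illustration and shortened; only the OAI-PMH namespace and the element names come from the protocol specification.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Invented, shortened sample; not actual output of the UB Leipzig server.
SAMPLE_IDENTIFY = b"""<?xml version="1.0" encoding="UTF-8"?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2020-01-01T00:00:00Z</responseDate>
  <request verb="Identify">https://oai.example.org/oai</request>
  <Identify>
    <repositoryName>Example Repository</repositoryName>
    <baseURL>https://oai.example.org/oai</baseURL>
    <protocolVersion>2.0</protocolVersion>
  </Identify>
</OAI-PMH>
"""


def read_base_url(xml_bytes):
    """Return the text of Identify/baseURL, or None if it is missing."""
    root = ET.fromstring(xml_bytes)
    return root.findtext("oai:Identify/oai:baseURL", namespaces=OAI_NS)


print(read_base_url(SAMPLE_IDENTIFY))
```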

Example: restricting the records to a specific time period:

Base URL:

Keyword arguments:
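A time-restricted request of this kind could be assembled as in the sketch below. It rests on assumptions: the base URL is a placeholder, list_records_url is a helper name invented here, and the metadataPrefix value "mets" is a guess that should be checked against the server's ListMetadataFormats response. The from and until arguments are the protocol's standard way of selecting by date.

```python
from datetime import date
from urllib.parse import urlencode

# Hypothetical placeholder; substitute the repository's real base URL.
BASE_URL = "https://oai.example.org/oai"


def list_records_url(base_url, metadata_prefix, start, end):
    """Build a ListRecords URL limited to the days from start to end."""
    if start > end:
        raise ValueError("'from' must not be later than 'until'")
    args = {
        "verb": "ListRecords",
        "metadataPrefix": metadata_prefix,  # assumed value, see lead-in
        "from": start.isoformat(),          # YYYY-MM-DD
        "until": end.isoformat(),
    }
    return base_url + "?" + urlencode(args)


january_url = list_records_url(BASE_URL, "mets", date(2020, 1, 1), date(2020, 1, 31))
print(january_url)
```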

Response

All responses to OAI-PMH requests are well-formed XML documents encoded in UTF-8.

Available formats:

METS (Metadata Encoding & Transmission Standard): format for describing digital collections of objects with metadata
MODS (Metadata Object Description Schema): format for bibliographic metadata

Below is an excerpt from the response to example 2. It contains elements and attributes that are defined in the schema definitions and filled with data as those definitions require. In the following section, the element names are largely unambiguous. Here is the title of the work:
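The title can also be read programmatically. The following is a minimal sketch, assuming the response embeds a MODS record inside a METS wrapper. The sample XML is invented for illustration and is not output of the UB Leipzig server; only the METS, MODS and OAI-PMH namespaces are taken from the respective standards.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by the OAI-PMH, METS and MODS standards.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "mets": "http://www.loc.gov/METS/",
    "mods": "http://www.loc.gov/mods/v3",
}

# Invented, heavily shortened sample record.
SAMPLE_RESPONSE = b"""<?xml version="1.0" encoding="UTF-8"?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <mets:mets xmlns:mets="http://www.loc.gov/METS/"
                   xmlns:mods="http://www.loc.gov/mods/v3">
          <mets:dmdSec ID="DMD1">
            <mets:mdWrap MDTYPE="MODS">
              <mets:xmlData>
                <mods:mods>
                  <mods:titleInfo>
                    <mods:title>Example title</mods:title>
                  </mods:titleInfo>
                </mods:mods>
              </mets:xmlData>
            </mets:mdWrap>
          </mets:dmdSec>
        </mets:mets>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def extract_titles(xml_bytes):
    """Return the text of every mods:titleInfo/mods:title in the response."""
    root = ET.fromstring(xml_bytes)
    return [el.text for el in root.iterfind(".//mods:titleInfo/mods:title", NS)]


print(extract_titles(SAMPLE_RESPONSE))
```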

Contact
For questions and suggestions on this topic, please contact us.