Using SeedDMS as an OAI-PMH repository

The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is a mechanism by which a data provider exposes meta data for harvesting. SeedDMS can act as such a data provider, delivering the meta data of the documents stored in it.

OAI-PMH is widely used by libraries, repositories and publishers to provide meta data, e.g. for discovery systems like VuFind. It does not prescribe a particular meta data standard, but requires Dublin Core (dc) as the bare minimum.

In order to turn SeedDMS into an OAI-PMH data provider, you need to install the oai-pmh extension, which is available at https://codeberg.org/SeedDMS/oai-pmh. The extension implements all six OAI-PMH requests by extending the REST API with the endpoint /restapi/index.php/oai-pmh/.

The extension has some configuration options, but it already works with the defaults. Just point your browser to

https://your-domain/restapi/index.php/oai-pmh/?verb=identify

and you should see a table with some information about the repository.
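
The same request can also be made from a script. The following is a minimal sketch in Python, not part of the extension: it builds request URLs for the endpoint named above and extracts a few fields from an Identify response. The helper names and the sample response are made up for illustration; the sample only follows the general shape of an OAI-PMH 2.0 Identify response and is not output captured from SeedDMS.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespace of OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def build_request_url(base_url, verb, **arguments):
    """Return the request URL for an OAI-PMH verb and its arguments."""
    return f"{base_url}?{urlencode({'verb': verb, **arguments})}"


def parse_identify(xml_text):
    """Extract repository name and base URL from an Identify response."""
    root = ET.fromstring(xml_text)
    identify = root.find("oai:Identify", OAI_NS)
    return {
        "repositoryName": identify.findtext("oai:repositoryName", namespaces=OAI_NS),
        "baseURL": identify.findtext("oai:baseURL", namespaces=OAI_NS),
    }


# Hand-written sample response (hypothetical values), used instead of a
# live request so the sketch runs without a SeedDMS installation.
SAMPLE_IDENTIFY = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <request verb="Identify">https://your-domain/restapi/index.php/oai-pmh/</request>
  <Identify>
    <repositoryName>Example repository</repositoryName>
    <baseURL>https://your-domain/restapi/index.php/oai-pmh/</baseURL>
    <protocolVersion>2.0</protocolVersion>
  </Identify>
</OAI-PMH>
"""

if __name__ == "__main__":
    # The OAI-PMH specification spells the verb "Identify".
    print(build_request_url("https://your-domain/restapi/index.php/oai-pmh/", "Identify"))
    print(parse_identify(SAMPLE_IDENTIFY))
```

To query a real installation, the text returned by fetching the built URL (for example with urllib.request.urlopen) would be passed to parse_identify in place of the sample.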

The repository name is constructed by adding ‘OAI-PMH’ to the configured site name of your SeedDMS installation. The base URL is also detected automatically unless a base URL is configured in SeedDMS. The repository identifier is the domain part of the base URL. The record identifier is formed from that repository identifier by prefixing it with ‘oai:’ and adding ‘:’ to the end. A record contains the meta data representing a document in SeedDMS. The actual location/identifier of the document itself is part of the meta data. In the case of Dublin Core it is stored in the element <dc:identifier>.
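
The construction of the identifier can be sketched in a few lines of Python. This is an illustration of the rule described above, not the extension's code. Two points are assumptions: "domain part" is taken to mean the host name of the base URL, and the local part appended after the final ‘:’ is taken to be a document id, which follows the common oai:repository:local-id pattern but is not stated in this article.

```python
from urllib.parse import urlparse


def repository_identifier(base_url):
    """Return the domain part (host name) of the base URL."""
    return urlparse(base_url).hostname


def record_identifier_prefix(base_url):
    """Prefix the repository identifier with 'oai:' and append ':'."""
    return f"oai:{repository_identifier(base_url)}:"


def record_identifier(base_url, local_id):
    """Assumption: a local id (e.g. a document id) follows the prefix."""
    return f"{record_identifier_prefix(base_url)}{local_id}"


if __name__ == "__main__":
    base = "https://dms.example.org/restapi/index.php/oai-pmh/"  # hypothetical host
    print(record_identifier_prefix(base))  # oai:dms.example.org:
    print(record_identifier(base, 42))     # oai:dms.example.org:42
```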

Sets

OAI-PMH has the notion of sets. Sets group resources by whatever criteria the operator of the repository finds useful. Harvesting of meta data can be restricted to individual sets.

The oai-pmh extension derives sets from either the folder structure or the categories. If the folder structure is used, a document is part of the set formed by its direct parent folder, and also of the sets formed by all ancestor folders up to the root folder. If categories are used, each set consists of the documents that have that particular category assigned.
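
The two ways of deriving sets can be illustrated with a small Python sketch. It is not taken from the extension. The document structure, the function names and the set names are hypothetical, and joining folder names with ‘:’ is an assumption borrowed from the way OAI-PMH expresses set hierarchies; the extension may name its sets differently.

```python
def folder_sets(folder_path):
    """Return one set per folder on the path from the root folder
    down to the direct parent folder of a document."""
    return [":".join(folder_path[:depth]) for depth in range(1, len(folder_path) + 1)]


def group_by_set(documents, mode="folders"):
    """Map each set name to the names of the documents it contains."""
    sets = {}
    for doc in documents:
        if mode == "folders":
            names = folder_sets(doc["folder_path"])
        else:
            names = doc["categories"]
        for name in names:
            sets.setdefault(name, []).append(doc["name"])
    return sets


# Hypothetical documents; folder_path lists the folders from the root
# folder down to the direct parent folder.
EXAMPLE_DOCUMENTS = [
    {"name": "a.pdf", "folder_path": ["DMS", "Reports"], "categories": ["Invoice"]},
    {"name": "b.pdf", "folder_path": ["DMS"], "categories": ["Invoice", "Contract"]},
]

if __name__ == "__main__":
    print(group_by_set(EXAMPLE_DOCUMENTS, mode="folders"))
    print(group_by_set(EXAMPLE_DOCUMENTS, mode="categories"))
```

With folders, a.pdf ends up in the set of its parent folder and in the set of the root folder, while b.pdf is only in the root folder's set. With categories, each document appears once per assigned category.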

Meta data

As already mentioned, the oai-pmh extension uses Dublin Core for meta data representation. It is a very simple schema and sufficient to hold the meta data already stored in SeedDMS. The following list explains the dc fields and how they are filled by the extension.

  • dc:identifier: The full URL to the document's details page
  • dc:creator: The owner of the document
  • dc:subject: The keywords of a document
  • dc:description: The comment of the document
  • dc:date: The creation date of the last version
  • dc:format: The mime type of the last version
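
To make the mapping concrete, here is a minimal Python sketch that turns a dictionary describing a document into an oai_dc element with the fields listed above. It only illustrates the mapping and is not the extension's implementation. The dictionary keys and the sample values are hypothetical, and treating the keywords as a single dc:subject value is an assumption.

```python
import xml.etree.ElementTree as ET

# Namespaces of the oai_dc container and the Dublin Core elements.
OAI_DC = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("oai_dc", OAI_DC)
ET.register_namespace("dc", DC)

# Dublin Core field -> key in the (hypothetical) document dictionary.
FIELD_MAP = [
    ("identifier", "details_url"),    # full URL of the details page
    ("creator", "owner"),             # owner of the document
    ("subject", "keywords"),          # keywords of the document
    ("description", "comment"),       # comment of the document
    ("date", "last_version_date"),    # creation date of the last version
    ("format", "last_version_mime"),  # mime type of the last version
]


def build_oai_dc(doc):
    """Return an oai_dc XML string for a document dictionary."""
    root = ET.Element(f"{{{OAI_DC}}}dc")
    for dc_field, key in FIELD_MAP:
        value = doc.get(key)
        if value:  # skip fields the document does not have
            ET.SubElement(root, f"{{{DC}}}{dc_field}").text = str(value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    example = {  # made-up values for illustration only
        "details_url": "https://your-domain/out/out.ViewDocument.php?documentid=42",
        "owner": "Jane Doe",
        "keywords": "invoice 2023",
        "comment": "Scanned invoice",
        "last_version_date": "2023-05-17",
        "last_version_mime": "application/pdf",
    }
    print(build_oai_dc(example))
```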

It is possible to extend this meta data or even use a different meta data standard, but that would be too complex to lay out in this article. Just contact us if you have questions.

Testing your OAI-PMH repository

A very convenient way to test your repository is the tool at https://validator.oaipmh.com/. All you need to do is enter your URL on that page and let it check your repository.