Using SeedDMS as an OAI-PMH repository
The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is a mechanism by which a data provider exposes meta data. SeedDMS may act as such a data provider, delivering the meta data of the documents stored in SeedDMS.
OAI-PMH is widely used by libraries, repositories and publishers to provide meta data, e.g. for discovery systems like VuFind. It does not prefer a particular meta data standard, but requires Dublin Core (dc) as the bare minimum.
In order to turn SeedDMS into an OAI-PMH data provider, you need to install
the oai-pmh extension, which is available at https://codeberg.org/SeedDMS/oai-pmh
The extension implements all six OAI-PMH requests by extending the REST API with
the endpoint /restapi/index.php/oai-pmh/.
Though there is some configuration for the extension, it already works with the defaults. Just point your browser to
https://your-domain/restapi/index.php/oai-pmh/?verb=identify
and you should see a table with some information about the repository.
Update for 1.1.0: This release adds a new page at
https://your-domain/restapi/index.php/oai-pmh/
with HTML forms for testing each verb.
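The requests can also be assembled by hand. The following is a minimal sketch, not part of the extension: it builds request URLs for the endpoint named above, using the six verb names as spelled in the OAI-PMH specification. `your-domain` is the placeholder from this article, and `build_request_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Endpoint from this article; "your-domain" is a placeholder.
BASE_URL = "https://your-domain/restapi/index.php/oai-pmh/"

# The six verbs defined by OAI-PMH 2.0.
VERBS = (
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
)


def build_request_url(verb, **arguments):
    """Return the request URL for a verb plus optional OAI-PMH arguments."""
    if verb not in VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    query = urlencode({"verb": verb, **arguments})
    return f"{BASE_URL}?{query}"


if __name__ == "__main__":
    print(build_request_url("Identify"))
    # oai_dc is the metadata prefix OAI-PMH defines for Dublin Core.
    print(build_request_url("ListRecords", metadataPrefix="oai_dc"))
```

Opening the printed URLs in a browser, or fetching them with any HTTP client, should return the XML response for the respective verb.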
The repository name is constructed by appending ‘OAI-PMH’ to the configured
site name of your SeedDMS installation. The base URL is detected
automatically unless a base URL is configured in SeedDMS. The
repository identifier is the domain part of the base URL. A record
identifier is formed from the repository identifier by prefixing it with ‘oai:’ and appending
‘:’ followed by the record's local identifier. A record contains the meta data representing
a document in SeedDMS. The actual location/identifier of the document itself
is part of that meta data. In the case of Dublin Core it is stored in the element <dc:identifier>.
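To illustrate the naming scheme, here is a small sketch of how such identifiers could be derived from a base URL. It is not the extension's code. It assumes that the part after the second colon is a local identifier such as a numeric document id, which this article does not specify; the function names and the example URL are made up.

```python
from urllib.parse import urlparse


def repository_identifier(base_url):
    """Return the domain part of the base URL, e.g. 'dms.example.org'."""
    return urlparse(base_url).hostname


def record_identifier(base_url, local_id):
    """Build 'oai:<domain>:<local id>'; the local id is an assumption."""
    return f"oai:{repository_identifier(base_url)}:{local_id}"


if __name__ == "__main__":
    # Hypothetical base URL and document id, for illustration only.
    print(record_identifier("https://dms.example.org/seeddms/", 42))
```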
Sets
OAI-PMH introduces the notion of sets. Sets group resources in whatever way the operator of the repository finds useful. Harvesting of meta data can be restricted to those sets.
The oai-pmh extension derives sets from either the folder structure or the categories. If the folder structure is used, a document is part of the set formed by its direct parent folder and of the sets formed by all ancestor folders up to the root folder. If categories are used, each set consists of the documents that have that particular category assigned.
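The two ways of deriving sets can be sketched as follows. This only illustrates the membership rules described above. The colon-separated set names follow the hierarchy convention of the OAI-PMH specification, but how the extension actually names its sets is an assumption here, as are the function names and the sample data.

```python
def folder_sets(folder_path):
    """Return all folder sets a document belongs to, nearest folder first.

    folder_path lists the folder names from the root folder down to the
    document's direct parent folder.
    """
    return [
        ":".join(folder_path[:depth])
        for depth in range(len(folder_path), 0, -1)
    ]


def category_sets(documents):
    """Map each category to the documents that carry it.

    documents maps a document name to the list of its categories.
    """
    sets = {}
    for name, categories in documents.items():
        for category in categories:
            sets.setdefault(category, []).append(name)
    return sets


if __name__ == "__main__":
    # Hypothetical folder path and documents, for illustration only.
    print(folder_sets(["DMS", "Projects", "2024"]))
    print(category_sets({"report.pdf": ["Finance"], "memo.pdf": ["Finance", "Internal"]}))
```

A document in the folder `2024` therefore shows up when harvesting the set of `2024`, of `Projects`, or of the root folder.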
Meta data
As already mentioned, the oai-pmh extension uses Dublin Core for meta data representation. It is a very simple schema and sufficient to hold the meta data already stored in SeedDMS. The following list explains the dc fields and how they are filled by the extension.
dc:identifier
: The full URL to the document's details page

dc:creator
: The owner of the document

dc:subject
: The keywords of a document

dc:description
: The comment of the document

dc:date
: The creation date of the last version

dc:format
: The mime type of the last version
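The mapping above could be serialized roughly like this. The sketch uses Python's standard library to build an `oai_dc:dc` element with the six fields listed. The namespace URIs are those of OAI-PMH and Dublin Core, while the function name and all sample values (including the document URL) are made up and do not come from the extension.

```python
import xml.etree.ElementTree as ET

OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Use the customary prefixes when serializing.
ET.register_namespace("oai_dc", OAI_DC_NS)
ET.register_namespace("dc", DC_NS)

# The dc fields listed in this article, in the same order.
DC_FIELDS = ("identifier", "creator", "subject", "description", "date", "format")


def build_dc_record(document):
    """Serialize a dict of document properties as an oai_dc:dc element."""
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for field in DC_FIELDS:
        value = document.get(field)
        if value:  # skip fields the document does not have
            ET.SubElement(root, f"{{{DC_NS}}}{field}").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical document, for illustration only.
    print(build_dc_record({
        "identifier": "https://your-domain/details?documentid=42",
        "creator": "Jane Doe",
        "subject": "invoice, 2024",
        "description": "Scanned invoice",
        "date": "2024-03-01",
        "format": "application/pdf",
    }))
```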
It is possible to extend this meta data or even use a different meta data standard, but that would be too complex to lay out in this article. Just contact us if you have questions.
Testing your OAI-PMH repository
A very convenient way to test your repository is the tool at https://validator.oaipmh.com/. All you need to do is enter your URL on that page and let it check your repository.