Commit

Merge pull request #227 from tkohr/harvesting-ods
Add doc for harvesting simple URL
fgravin authored Mar 2, 2023
2 parents 5383211 + e24eb6b commit 4b5d466
Showing 2 changed files with 55 additions and 0 deletions.
54 changes: 54 additions & 0 deletions source/user-guide/harvesting/harvesting-simpleurl.rst
@@ -0,0 +1,54 @@
.. _harvesting-simpleurl-services:

Simple URL harvesting (opendata)
################################

This harvester connects to a remote server via a simple URL to retrieve metadata records. This allows harvesting opendata catalogs such as opendatasoft, ESRI, DKAN and more.

Adding a simple URL harvester
`````````````````````````````

- **Site** - Options about the remote site.

- *Name* - This is a short description of the remote site. It will be shown in the harvesting main page as the name for this instance of the harvester.
- *Service URL* - The URL of the server to be harvested. This can include pagination parameters like ``?start=0&rows=20``.
- *loopElement* - Property/element containing the list of record entries, given as an absolute path from the document root, e.g. ``/datasets``.
- *numberOfRecordPath* - Property indicating the total count of record entries, given as an absolute path from the document root, e.g. ``/nhits``.
- *recordIdPath* - Property containing the record id, e.g. ``datasetid``.
- *pageFromParam* - Name of the pagination parameter indicating the first record of the current "page", e.g. ``start``.
- *pageSizeParam* - Name of the pagination parameter indicating the number of records contained in the current "page", e.g. ``rows``.
- *toISOConversion* - Name of the conversion schema to use, which must be available as XSL on the GeoNetwork instance, e.g. ``OPENDATASOFT-to-ISO19115-3-2018``.

Note: GeoNetwork looks for schemas by name in https://github.com/geonetwork/core-geonetwork/tree/4.0.x/web/src/main/webapp/xsl/conversion/import. These schemas might internally include schemas from other locations, such as https://github.com/geonetwork/core-geonetwork/tree/4.0.x/schemas/iso19115-3.2018/src/main/plugin/iso19115-3.2018/convert. To reference a schema from the latter location directly in the admin UI, for example ``fromJsonOpenDataSoft``, use the following syntax: ``schema:iso19115-3.2018:convert/fromJsonOpenDataSoft``.
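
To make the role of each option more concrete, the following Python sketch walks a paginated JSON response using the option names listed above. It is an illustration only and not GeoNetwork's actual implementation: the helper names (``resolve``, ``page_url``, ``harvest``), the in-memory ``fake_fetch`` server and the sample data are all hypothetical, and ``recordIdPath`` is assumed here to be a plain key of each record entry.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def resolve(doc, path):
    """Follow an absolute path such as /result/0 through nested dicts and lists."""
    node = doc
    for part in path.strip("/").split("/"):
        if part == "":
            continue
        node = node[int(part)] if isinstance(node, list) else node[part]
    return node


def page_url(service_url, page_from_param, page_size_param, start, rows):
    """Set (or overwrite) the two pagination parameters on the service URL."""
    parts = urlsplit(service_url)
    query = dict(parse_qsl(parts.query))
    query[page_from_param] = str(start)
    query[page_size_param] = str(rows)
    return urlunsplit(parts._replace(query=urlencode(query)))


def harvest(fetch, service_url, config, rows=2):
    """Collect all record entries page by page, keyed by record id."""
    records = {}
    start = 0
    total = None
    while total is None or start < total:
        url = page_url(service_url, config["pageFromParam"],
                       config["pageSizeParam"], start, rows)
        doc = fetch(url)
        total = int(resolve(doc, config["numberOfRecordPath"]))
        entries = resolve(doc, config["loopElement"])
        if not entries:
            break  # defensive stop if the server returns an empty page
        for entry in entries:
            # Assumption: recordIdPath is a simple key of the entry.
            records[str(entry[config["recordIdPath"]])] = entry
        start += rows
    return records


# Hypothetical in-memory "server" standing in for a real HTTP request.
DATA = [{"datasetid": f"ds-{i}"} for i in range(5)]


def fake_fetch(url):
    query = dict(parse_qsl(urlsplit(url).query))
    start, rows = int(query["start"]), int(query["rows"])
    return {"nhits": len(DATA), "datasets": DATA[start:start + rows]}


CONFIG = {
    "loopElement": "/datasets",
    "numberOfRecordPath": "/nhits",
    "recordIdPath": "datasetid",
    "pageFromParam": "start",
    "pageSizeParam": "rows",
}

harvested = harvest(fake_fetch, "https://example.org/api?start=0&rows=20", CONFIG)
print(sorted(harvested))
```

In this sketch the pagination values given in the service URL are only a starting point: the two parameters named by ``pageFromParam`` and ``pageSizeParam`` are overwritten for each request until the count found at ``numberOfRecordPath`` is reached.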


**Sample configuration for opendatasoft**

- *loopElement* - ``/datasets``
- *numberOfRecordPath* - ``/nhits``
- *recordIdPath* - ``datasetid``
- *pageFromParam* - ``start``
- *pageSizeParam* - ``rows``
- *toISOConversion* - ``OPENDATASOFT-to-ISO19115-3-2018``


**Sample configuration for ESRI**

- *loopElement* - ``/dataset``
- *numberOfRecordPath* - ``/result/count``
- *recordIdPath* - ``landingPage``
- *pageFromParam* - ``start``
- *pageSizeParam* - ``rows``
- *toISOConversion* - ``ESRIDCAT-to-ISO19115-3-2018``

**Sample configuration for DKAN**

- *loopElement* - ``/result/0``
- *numberOfRecordPath* - ``/result/count``
- *recordIdPath* - ``id``
- *pageFromParam* - ``start``
- *pageSizeParam* - ``rows``
- *toISOConversion* - ``DKAN-to-ISO19115-3-2018``

- **Privileges** - Assign privileges to harvested metadata.

1 change: 1 addition & 0 deletions source/user-guide/harvesting/index.rst
@@ -20,6 +20,7 @@ The following sources can be harvested:
harvesting-geonetwork.rst
harvesting-csw.rst
harvesting-ogcwxs.rst
harvesting-simpleurl.rst
harvesting-filesystem.rst
harvesting-webdav.rst
harvesting-oaipmh.rst