| 1 | +.. _harvesting-simpleurl-services: |
| 2 | + |
| 3 | +Simple URL harvesting (opendata) |
| 4 | +################################ |
| 5 | + |
| 6 | +This harvester connects to a remote server via a simple URL to retrieve metadata records. This allows harvesting opendata catalogs such as opendatasoft, ESRI, DKAN and more. |
| 7 | + |
| 8 | +Adding a simple URL harvester |
| 9 | +````````````````````````````` |
| 10 | + |
| 11 | +- **Site** - Options about the remote site. |
| 12 | + |
| 13 | + - *Name* - This is a short description of the remote site. It will be shown in the harvesting main page as the name for this instance of the harvester. |
| 14 | + - *Service URL* - The URL of the server to be harvested. It can include pagination parameters such as ``?start=0&rows=20``. |
| 15 | + - *loopElement* - Property/element containing the list of record entries, given as an absolute path from the document root, e.g. ``/datasets`` |
| 16 | + - *numberOfRecordPath* - Property indicating the total count of record entries, given as an absolute path from the document root, e.g. ``/nhits`` |
| 17 | + - *recordIdPath* - Property containing the record id, e.g. ``datasetid`` |
| 18 | + - *pageFromParam* - Property indicating the first record item on the current "page", e.g. ``start`` |
| 19 | + - *pageSizeParam* - Property indicating the number of records contained in the current "page", e.g. ``rows`` |
| 20 | + - *toISOConversion* - Name of the conversion schema to use, which must be available as XSL on the GN instance, e.g. ``OPENDATASOFT-to-ISO19115-3-2018`` |
| 21 | + |
| 22 | + Note: GN looks for schemas by name in https://github.com/geonetwork/core-geonetwork/tree/4.0.x/web/src/main/webapp/xsl/conversion/import. These schemas might internally include schemas from other locations, such as https://github.com/geonetwork/core-geonetwork/tree/4.0.x/schemas/iso19115-3.2018/src/main/plugin/iso19115-3.2018/convert. To reference a schema from the latter location directly in the admin UI, for example ``fromJsonOpenDataSoft``, use the following syntax: ``schema:iso19115-3.2018:convert/fromJsonOpenDataSoft``. |
| 23 | + |
| 24 | + |
| 25 | + **Sample configuration for opendatasoft** |
| 26 | + |
| 27 | + - *loopElement* - ``/datasets`` |
| 28 | + - *numberOfRecordPath* - ``/nhits`` |
| 29 | + - *recordIdPath* - ``datasetid`` |
| 30 | + - *pageFromParam* - ``start`` |
| 31 | + - *pageSizeParam* - ``rows`` |
| 32 | + - *toISOConversion* - ``OPENDATASOFT-to-ISO19115-3-2018`` |
| 33 | + |
| 34 | + |
| 35 | + **Sample configuration for ESRI** |
| 36 | + |
| 37 | + - *loopElement* - ``/dataset`` |
| 38 | + - *numberOfRecordPath* - ``/result/count`` |
| 39 | + - *recordIdPath* - ``landingPage`` |
| 40 | + - *pageFromParam* - ``start`` |
| 41 | + - *pageSizeParam* - ``rows`` |
| 42 | + - *toISOConversion* - ``ESRIDCAT-to-ISO19115-3-2018`` |
| 43 | + |
| 44 | + **Sample configuration for DKAN** |
| 45 | + |
| 46 | + - *loopElement* - ``/result/0`` |
| 47 | + - *numberOfRecordPath* - ``/result/count`` |
| 48 | + - *recordIdPath* - ``id`` |
| 49 | + - *pageFromParam* - ``start`` |
| 50 | + - *pageSizeParam* - ``rows`` |
| 51 | + - *toISOConversion* - ``DKAN-to-ISO19115-3-2018`` |
| 52 | + |
| 53 | +- **Privileges** - Assign privileges to harvested metadata. |
| 54 | + |
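To illustrate how the **Site** options above might fit together, here is a minimal Python sketch. It is not GeoNetwork's implementation. It assumes that the remote server answers with JSON, that *pageFromParam* and *pageSizeParam* name URL query parameters, and that paths are slash-separated with numeric steps indexing into lists. The helper names, the ``example.org`` URL and the sample payload are all hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def build_page_url(service_url, page_from_param, page_size_param, start, rows):
    """Return service_url with the two paging parameters set (hypothetical helper)."""
    parts = urlsplit(service_url)
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query[page_from_param] = str(start)
    query[page_size_param] = str(rows)
    return urlunsplit(parts._replace(query=urlencode(query)))


def resolve_path(document, path):
    """Walk a slash-separated path such as /result/0 through parsed JSON."""
    node = document
    for step in path.strip("/").split("/"):
        if step == "":
            continue  # an empty path refers to the node itself
        # Assumption: a step applied to a list is a numeric index.
        node = node[int(step)] if isinstance(node, list) else node[step]
    return node


def extract_records(document, loop_element, record_id_path):
    """Yield (record id, record) pairs from one response page."""
    for record in resolve_path(document, loop_element):
        yield resolve_path(record, record_id_path), record


# Invented payload shaped like the opendatasoft sample configuration.
page = {"nhits": 2, "datasets": [{"datasetid": "a"}, {"datasetid": "b"}]}

url = build_page_url("https://example.org/search?rows=20", "start", "rows", 0, 10)
total = resolve_path(page, "/nhits")
ids = [record_id for record_id, _ in extract_records(page, "/datasets", "datasetid")]
```

With the DKAN sample values, ``resolve_path(document, "/result/0")`` would select the first element of the ``result`` list under the same assumptions.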