Commit e24eb6b

add doc for simple URL harvesting
1 parent 5383211 commit e24eb6b

2 files changed: +55 -0 lines changed

Lines changed: 54 additions & 0 deletions
@@ -0,0 +1,54 @@
.. _harvesting-simpleurl-services:

Simple URL harvesting (opendata)
################################

This harvester connects to a remote server via a simple URL to retrieve metadata records. This allows harvesting opendata catalogs such as opendatasoft, ESRI, DKAN and more.

Adding a simple URL harvester
`````````````````````````````

- **Site** - Options about the remote site.

- *Name* - A short description of the remote site. It is shown on the main harvesting page as the name of this harvester instance.
- *Service URL* - The URL of the server to be harvested. This can include pagination parameters such as ``?start=0&rows=20``.
- *loopElement* - Property/element containing the list of record entries, given as an absolute path from the document root, e.g. ``/datasets``.
- *numberOfRecordPath* - Property indicating the total count of record entries, given as an absolute path from the document root, e.g. ``/nhits``.
- *recordIdPath* - Property containing the record id, e.g. ``datasetid``.
- *pageFromParam* - Name of the URL parameter indicating the first record on the current "page", e.g. ``start``.
- *pageSizeParam* - Name of the URL parameter indicating the number of records contained in the current "page", e.g. ``rows``.
- *toISOConversion* - Name of the conversion schema to use, which must be available as XSL on the GN instance, e.g. ``OPENDATASOFT-to-ISO19115-3-2018``.

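The options above suggest a paging loop roughly like the one below. This is a minimal Python sketch, not GeoNetwork's actual implementation: the helper names ``resolve``, ``page_url`` and ``harvest``, the ``fetch`` callback, the default page size, and the assumption that paths are followed key by key from the document root are all illustrative.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def resolve(doc, path):
    """Follow a path such as '/result/count' or 'datasetid' from the given node."""
    node = doc
    for part in path.strip("/").split("/"):
        if part:
            # Assumption: a segment applied to a list is a numeric index.
            node = node[int(part)] if isinstance(node, list) else node[part]
    return node


def page_url(service_url, page_from_param, page_size_param, start, size):
    """Return service_url with the two pagination parameters set or overwritten."""
    parts = urlsplit(service_url)
    query = dict(parse_qsl(parts.query))
    query[page_from_param] = str(start)
    query[page_size_param] = str(size)
    return urlunsplit(parts._replace(query=urlencode(query)))


def harvest(fetch, service_url, cfg, page_size=20):
    """Yield (record_id, record) pairs; fetch(url) must return parsed JSON."""
    start, total = 0, None
    while total is None or start < total:
        url = page_url(service_url, cfg["pageFromParam"],
                       cfg["pageSizeParam"], start, page_size)
        doc = fetch(url)
        total = int(resolve(doc, cfg["numberOfRecordPath"]))
        records = resolve(doc, cfg["loopElement"])
        if not records:
            break  # stop early if the server returns an empty page
        for record in records:
            yield resolve(record, cfg["recordIdPath"]), record
        start += page_size
```

Passing ``fetch`` in as a callback keeps the sketch independent of any HTTP library and makes it easy to try against canned responses.
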
Note: GN looks up conversion schemas by name in https://github.com/geonetwork/core-geonetwork/tree/4.0.x/web/src/main/webapp/xsl/conversion/import. These schemas may internally include schemas from other locations, such as https://github.com/geonetwork/core-geonetwork/tree/4.0.x/schemas/iso19115-3.2018/src/main/plugin/iso19115-3.2018/convert. To reference a schema from the latter location directly in the admin UI, for example ``fromJsonOpenDataSoft``, use the syntax ``schema:iso19115-3.2018:convert/fromJsonOpenDataSoft``.


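The two ways of naming a conversion can be told apart mechanically. The following Python sketch only illustrates the two forms described in the note; ``parse_conversion`` is a hypothetical helper, and how GN resolves the resulting parts to files internally is not shown here.

```python
def parse_conversion(name):
    """Split a toISOConversion value into (schema_id, relative_name).

    schema_id is None for a plain name, i.e. one looked up in the
    conversion/import folder rather than inside a schema plugin.
    """
    prefix = "schema:"
    if name.startswith(prefix):
        schema_id, _, relative = name[len(prefix):].partition(":")
        if not schema_id or not relative:
            raise ValueError(f"expected 'schema:<schema id>:<path>', got {name!r}")
        return schema_id, relative
    return None, name


print(parse_conversion("OPENDATASOFT-to-ISO19115-3-2018"))
print(parse_conversion("schema:iso19115-3.2018:convert/fromJsonOpenDataSoft"))
```
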
**Sample configuration for opendatasoft**

- *loopElement* - ``/datasets``
- *numberOfRecordPath* - ``/nhits``
- *recordIdPath* - ``datasetid``
- *pageFromParam* - ``start``
- *pageSizeParam* - ``rows``
- *toISOConversion* - ``OPENDATASOFT-to-ISO19115-3-2018``


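To see what these values select, the Python sketch below applies them to a small response body. The body is a hypothetical, heavily trimmed example written for illustration; a real opendatasoft response carries many more fields.

```python
import json

# Hypothetical response body, reduced to the fields the configuration refers to.
body = '{"nhits": 2, "datasets": [{"datasetid": "roads"}, {"datasetid": "parks"}]}'
doc = json.loads(body)

total = doc["nhits"]                             # numberOfRecordPath: /nhits
records = doc["datasets"]                        # loopElement: /datasets
record_ids = [r["datasetid"] for r in records]   # recordIdPath: datasetid

print(total, record_ids)
```
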
**Sample configuration for ESRI**

- *loopElement* - ``/dataset``
- *numberOfRecordPath* - ``/result/count``
- *recordIdPath* - ``landingPage``
- *pageFromParam* - ``start``
- *pageSizeParam* - ``rows``
- *toISOConversion* - ``ESRIDCAT-to-ISO19115-3-2018``

**Sample configuration for DKAN**

- *loopElement* - ``/result/0``
- *numberOfRecordPath* - ``/result/count``
- *recordIdPath* - ``id``
- *pageFromParam* - ``start``
- *pageSizeParam* - ``rows``
- *toISOConversion* - ``DKAN-to-ISO19115-3-2018``

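The ``/result/0`` value reads naturally if a numeric path segment selects an element of an array, in the style of JSON Pointer. That interpretation is an assumption made for this Python sketch, as are the ``resolve`` helper and the sample document, which is not a real DKAN response.

```python
def resolve(doc, path):
    """Follow a '/'-separated path; a segment applied to a list is an index."""
    node = doc
    for part in path.strip("/").split("/"):
        if part:
            node = node[int(part)] if isinstance(node, list) else node[part]
    return node


# Hypothetical document in which the records sit in the first element of "result".
doc = {"result": [[{"id": "x1"}, {"id": "y2"}]]}

records = resolve(doc, "/result/0")            # loopElement: /result/0
ids = [resolve(r, "id") for r in records]      # recordIdPath: id

print(ids)
```
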
- **Privileges** - Assign privileges to harvested metadata.

source/user-guide/harvesting/index.rst

Lines changed: 1 addition & 0 deletions
@@ -20,6 +20,7 @@ The following sources can be harvested:
  harvesting-geonetwork.rst
  harvesting-csw.rst
  harvesting-ogcwxs.rst
+ harvesting-simpleurl.rst
  harvesting-filesystem.rst
  harvesting-webdav.rst
  harvesting-oaipmh.rst
