CSW (Catalogue Service for the Web) is an OGC (Open Geospatial Consortium) specification that defines common interfaces to discover, browse, and query metadata about data, services, and other potential resources.
Data.gov exposes its catalog of harvested data, services and applications through two CSW endpoints: a first-order endpoint and an all-metadata endpoint. The first-order CSW endpoint provides collection-level filtering of all metadata records. The all-metadata CSW endpoint provides metadata at every level of granularity.
Any client supporting CSW (desktop, GIS, web application, client library, etc.) can integrate with the Data.gov CSW endpoints.
The Data.gov CSW endpoints are implemented using pycsw. pycsw is an OGC CSW server implementation written in Python, enabling the publishing and discovery of data and services, providing a standards-based metadata and catalogue component of spatial data infrastructures. pycsw is Certified OGC Compliant and is an OGC Reference Implementation.
The Data.gov CSW endpoints support the OGC CSW 2.0.2 standard as well as the ISO Metadata Application 1.0.0 Profile. The CSW endpoints operate over HTTP GET, POST (XML) and SOAP.
While making HTTP GET requests is relatively straightforward, HTTP POST (XML) requests require the client to open the connection and send the request XML as the payload. Below are a few examples of how to run HTTP POST (XML) requests on the command line:
# assuming XML request is saved to csw-request.xml
# curl
curl -X POST -H "Content-Type: application/xml" --data-binary @csw-request.xml http://catalog.data.gov/csw
# lwp-request
cat csw-request.xml | POST http://catalog.data.gov/csw
# wget
wget http://catalog.data.gov/csw --post-file=csw-request.xml
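The same POST can also be scripted. The following is a minimal sketch using only the Python standard library, not an official Data.gov client: the helper names `build_csw_post` and `send` are hypothetical, and the `Content-Type: application/xml` header is an assumption about what the server accepts.

```python
import urllib.request

CSW_URL = "http://catalog.data.gov/csw"  # first-order endpoint from this page


def build_csw_post(xml_payload: str, url: str = CSW_URL) -> urllib.request.Request:
    """Build (but do not send) an HTTP POST carrying a CSW XML request."""
    return urllib.request.Request(
        url,
        data=xml_payload.encode("utf-8"),
        # Assumption: the server accepts XML payloads labelled this way.
        headers={"Content-Type": "application/xml"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> str:
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the request XML saved to csw-request.xml, a call would look like `send(build_csw_post(open("csw-request.xml").read()))`. Building and sending are kept separate so the request can be inspected without touching the network.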
First-order metadata: http://catalog.data.gov/csw
All metadata: http://catalog.data.gov/csw-all
We will use the first-order CSW endpoint for the examples below.
The GetCapabilities operation provides CSW clients with service metadata about the CSW service as an XML document.
Examples:
HTTP GET: http://catalog.data.gov/csw?service=CSW&version=2.0.2&request=GetCapabilities
HTTP POST (XML):
<csw:GetCapabilities xmlns:csw="http://www.opengis.net/cat/csw/2.0.2" xmlns:ows="http://www.opengis.net/ows" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.opengis.net/cat/csw/2.0.2 http://schemas.opengis.net/csw/2.0.2/CSW-discovery.xsd" service="CSW">
<ows:AcceptVersions>
<ows:Version>2.0.2</ows:Version>
</ows:AcceptVersions>
<ows:AcceptFormats>
<ows:OutputFormat>application/xml</ows:OutputFormat>
</ows:AcceptFormats>
</csw:GetCapabilities>
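HTTP GET requests like the one above all share the same `service`, `version` and `request` parameters, so they can be assembled programmatically. This is a rough sketch with a hypothetical helper, `csw_get_url`; it only builds the URL and does not contact the server.

```python
from urllib.parse import urlencode

CSW_URL = "http://catalog.data.gov/csw"  # first-order endpoint from this page


def csw_get_url(request: str, url: str = CSW_URL, **params: str) -> str:
    """Return a CSW 2.0.2 key-value-pair GET URL for the given operation."""
    query = {"service": "CSW", "version": "2.0.2", "request": request}
    query.update(params)
    # safe=":/" leaves namespace prefixes (dc:type) and schema URLs unescaped
    return url + "?" + urlencode(query, safe=":/")


print(csw_get_url("GetCapabilities"))
# prints http://catalog.data.gov/csw?service=CSW&version=2.0.2&request=GetCapabilities
```

Extra keyword arguments become additional query parameters, for example `csw_get_url("GetDomain", propertyname="dc:type")`.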
The DescribeRecord operation provides CSW clients with elements of supported information models of the CSW service.
Examples:
HTTP GET: http://catalog.data.gov/csw?service=CSW&version=2.0.2&request=DescribeRecord
HTTP POST (XML):
<csw:DescribeRecord service="CSW" version="2.0.2" outputFormat="application/xml" schemaLanguage="http://www.w3.org/XML/Schema" xmlns="http://www.opengis.net/cat/csw/2.0.2" xmlns:csw="http://www.opengis.net/cat/csw/2.0.2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.opengis.net/cat/csw/2.0.2 http://schemas.opengis.net/csw/2.0.2/CSW-discovery.xsd">
<csw:TypeName>csw:Record</csw:TypeName>
</csw:DescribeRecord>
The GetDomain operation provides an interface to return all possible values for a given metadata property/queryable or parameter.
Examples:
HTTP GET:
http://catalog.data.gov/csw?service=CSW&version=2.0.2&request=GetDomain&propertyname=dc:type
http://catalog.data.gov/csw?service=CSW&version=2.0.2&request=GetDomain&parametername=GetRecords.outputSchema
HTTP POST (XML):
<csw:GetDomain xmlns:csw="http://www.opengis.net/cat/csw/2.0.2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.opengis.net/cat/csw/2.0.2 http://schemas.opengis.net/csw/2.0.2/CSW-discovery.xsd" version="2.0.2" service="CSW">
<csw:PropertyName>dc:type</csw:PropertyName>
</csw:GetDomain>
<csw:GetDomain xmlns:csw="http://www.opengis.net/cat/csw/2.0.2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.opengis.net/cat/csw/2.0.2 http://schemas.opengis.net/csw/2.0.2/CSW-discovery.xsd" version="2.0.2" service="CSW">
<csw:ParameterName>GetRecords.outputSchema</csw:ParameterName>
</csw:GetDomain>
The GetRecords operation provides a query interface to search for data using spatial predicates, attribute / temporal queries, or a combination of these.
GetRecords queries are best invoked as HTTP POST (XML). Examples:
Notes:
- adjust the startPosition and maxRecords parameters to customize / page responses
- adjust the optional outputSchema parameter to customize output format (default is Dublin Core [http://www.opengis.net/cat/csw/2.0.2], use http://www.isotc211.org/2005/gmd for ISO)
- adjust the optional csw:ElementSetName parameter (brief, summary [default], full) to adjust verbosity of metadata record responses
- bounding box queries and responses always use axis order latitude longitude
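To make the paging and verbosity parameters concrete, here is a minimal sketch that generates a GetRecords request body with Python's standard library. The function name `build_get_records` and its argument names are hypothetical; the attribute and element names are the ones used in the examples on this page.

```python
import xml.etree.ElementTree as ET

CSW_NS = "http://www.opengis.net/cat/csw/2.0.2"   # Dublin Core output schema
ISO_NS = "http://www.isotc211.org/2005/gmd"       # ISO output schema

# Serialize with the conventional "csw" prefix so typeNames="csw:Record" resolves.
ET.register_namespace("csw", CSW_NS)


def build_get_records(start_position: int = 1, max_records: int = 10,
                      element_set: str = "summary",
                      output_schema: str = CSW_NS) -> str:
    """Return a GetRecords request (no filter) as an XML string."""
    if element_set not in ("brief", "summary", "full"):
        raise ValueError("element_set must be brief, summary or full")
    root = ET.Element(f"{{{CSW_NS}}}GetRecords", {
        "service": "CSW",
        "version": "2.0.2",
        "resultType": "results",
        "startPosition": str(start_position),
        "maxRecords": str(max_records),
        "outputFormat": "application/xml",
        "outputSchema": output_schema,
    })
    query = ET.SubElement(root, f"{{{CSW_NS}}}Query", {"typeNames": "csw:Record"})
    ET.SubElement(query, f"{{{CSW_NS}}}ElementSetName").text = element_set
    return ET.tostring(root, encoding="unicode")


# Second page of 15 full records: positions 16 to 30.
page_two = build_get_records(start_position=16, max_records=15, element_set="full")
```

Passing `output_schema=ISO_NS` would request ISO output instead of Dublin Core.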
Query all records, return records 1 - 15:
<csw:GetRecords xmlns:csw="http://www.opengis.net/cat/csw/2.0.2" xmlns:ogc="http://www.opengis.net/ogc" service="CSW" version="2.0.2" resultType="results" startPosition="1" maxRecords="15" outputFormat="application/xml" outputSchema="http://www.opengis.net/cat/csw/2.0.2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.opengis.net/cat/csw/2.0.2 http://schemas.opengis.net/csw/2.0.2/CSW-discovery.xsd">
<csw:Query typeNames="csw:Record">
<csw:ElementSetName>full</csw:ElementSetName>
</csw:Query>
</csw:GetRecords>
Query records with a bounding box:
<csw:GetRecords xmlns:csw="http://www.opengis.net/cat/csw/2.0.2" xmlns:ogc="http://www.opengis.net/ogc" service="CSW" version="2.0.2" resultType="results" startPosition="1" maxRecords="5" outputFormat="application/xml" outputSchema="http://www.opengis.net/cat/csw/2.0.2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.opengis.net/cat/csw/2.0.2 http://schemas.opengis.net/csw/2.0.2/CSW-discovery.xsd" xmlns:gml="http://www.opengis.net/gml">
<csw:Query typeNames="csw:Record">
<csw:ElementSetName>brief</csw:ElementSetName>
<csw:Constraint version="1.1.0">
<ogc:Filter>
<ogc:BBOX>
<ogc:PropertyName>ows:BoundingBox</ogc:PropertyName>
<gml:Envelope>
<gml:lowerCorner>47 -5</gml:lowerCorner>
<gml:upperCorner>55 20</gml:upperCorner>
</gml:Envelope>
</ogc:BBOX>
</ogc:Filter>
</csw:Constraint>
</csw:Query>
</csw:GetRecords>
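Because many tools hand out bounding boxes in longitude, latitude order, it is easy to swap the axes when filling in the envelope above. The following sketch shows one way to guard against that; `envelope_corners` is a hypothetical helper, and its west/south/east/north argument order is an assumption about the caller's data.

```python
def envelope_corners(min_lon: float, min_lat: float,
                     max_lon: float, max_lat: float) -> tuple[str, str]:
    """Return (lowerCorner, upperCorner) text in latitude longitude order."""
    for lat in (min_lat, max_lat):
        # A latitude outside this range usually means the axes were swapped.
        if not -90 <= lat <= 90:
            raise ValueError(f"latitude out of range: {lat}")
    if min_lat > max_lat:
        raise ValueError("min_lat must not exceed max_lat")
    return f"{min_lat} {min_lon}", f"{max_lat} {max_lon}"


lower, upper = envelope_corners(min_lon=-5, min_lat=47, max_lon=20, max_lat=55)
print(lower)  # prints 47 -5
print(upper)  # prints 55 20
```

The two strings correspond to the gml:lowerCorner and gml:upperCorner values in the request above.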
Query records by attribute:
<csw:GetRecords xmlns:csw="http://www.opengis.net/cat/csw/2.0.2" xmlns:ogc="http://www.opengis.net/ogc" service="CSW" version="2.0.2" resultType="results" startPosition="1" maxRecords="10" outputFormat="application/xml" outputSchema="http://www.opengis.net/cat/csw/2.0.2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.opengis.net/cat/csw/2.0.2 http://schemas.opengis.net/csw/2.0.2/CSW-discovery.xsd" xmlns:gmd="http://www.isotc211.org/2005/gmd" xmlns:apiso="http://www.opengis.net/cat/csw/apiso/1.0">
<csw:Query typeNames="csw:Record">
<csw:ElementSetName>brief</csw:ElementSetName>
<csw:Constraint version="1.1.0">
<ogc:Filter>
<ogc:PropertyIsLike matchCase="false" wildCard="%" singleChar="_" escapeChar="\">
<ogc:PropertyName>dc:title</ogc:PropertyName>
<ogc:Literal>roads%</ogc:Literal>
</ogc:PropertyIsLike>
</ogc:Filter>
</csw:Constraint>
</csw:Query>
</csw:GetRecords>
Query records by full text search:
<csw:GetRecords xmlns:csw="http://www.opengis.net/cat/csw/2.0.2" xmlns:ogc="http://www.opengis.net/ogc" service="CSW" version="2.0.2" resultType="results" startPosition="1" maxRecords="5" outputFormat="application/xml" outputSchema="http://www.opengis.net/cat/csw/2.0.2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.opengis.net/cat/csw/2.0.2 http://schemas.opengis.net/csw/2.0.2/CSW-discovery.xsd">
<csw:Query typeNames="csw:Record">
<csw:ElementSetName>brief</csw:ElementSetName>
<csw:Constraint version="1.1.0">
<ogc:Filter>
<ogc:PropertyIsEqualTo>
<ogc:PropertyName>csw:AnyText</ogc:PropertyName>
<ogc:Literal>roads</ogc:Literal>
</ogc:PropertyIsEqualTo>
</ogc:Filter>
</csw:Constraint>
</csw:Query>
</csw:GetRecords>
and part of the response:
<!-- pycsw 1.8.0 -->
<csw:GetRecordsResponse version="2.0.2" xsi:schemaLocation="http://www.opengis.net/cat/csw/2.0.2 http://schemas.opengis.net/csw/2.0.2/CSW-discovery.xsd">
<csw:SearchStatus timestamp="2014-03-17T18:09:43Z"/>
<csw:SearchResults nextRecord="6" numberOfRecordsMatched="27724" numberOfRecordsReturned="5" recordSchema="http://www.opengis.net/cat/csw/2.0.2" elementSet="brief">
......
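The attributes on csw:SearchResults are what a client needs for paging. Below is a minimal sketch of reading them with Python's standard library. The helper name `read_search_results` is hypothetical, and `SAMPLE` is a trimmed stand-in for the response above with the `xmlns:csw` declaration added so that it parses on its own.

```python
import xml.etree.ElementTree as ET

NS = {"csw": "http://www.opengis.net/cat/csw/2.0.2"}


def read_search_results(response_xml: str) -> dict:
    """Extract the paging counters from a GetRecords response."""
    root = ET.fromstring(response_xml)
    results = root.find("csw:SearchResults", NS)
    if results is None:
        raise ValueError("no csw:SearchResults element in response")
    return {
        "matched": int(results.get("numberOfRecordsMatched")),
        "returned": int(results.get("numberOfRecordsReturned")),
        "next_record": int(results.get("nextRecord")),
    }


SAMPLE = """<csw:GetRecordsResponse xmlns:csw="http://www.opengis.net/cat/csw/2.0.2" version="2.0.2">
  <csw:SearchStatus timestamp="2014-03-17T18:09:43Z"/>
  <csw:SearchResults nextRecord="6" numberOfRecordsMatched="27724" numberOfRecordsReturned="5" elementSet="brief"/>
</csw:GetRecordsResponse>"""

print(read_search_results(SAMPLE))
# prints {'matched': 27724, 'returned': 5, 'next_record': 6}
```

A client could use `next_record` as the startPosition of its next request and stop once all matched records have been retrieved.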
Query records by combined bounding box and full text search:
<csw:GetRecords xmlns:csw="http://www.opengis.net/cat/csw/2.0.2" xmlns:gml="http://www.opengis.net/gml" xmlns:ogc="http://www.opengis.net/ogc" service="CSW" version="2.0.2" resultType="results" startPosition="1" maxRecords="5" outputFormat="application/xml" outputSchema="http://www.opengis.net/cat/csw/2.0.2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.opengis.net/cat/csw/2.0.2 http://schemas.opengis.net/csw/2.0.2/CSW-discovery.xsd">
<csw:Query typeNames="csw:Record">
<csw:ElementSetName>brief</csw:ElementSetName>
<csw:Constraint version="1.1.0">
<ogc:Filter>
<ogc:And>
<ogc:PropertyIsEqualTo>
<ogc:PropertyName>csw:AnyText</ogc:PropertyName>
<ogc:Literal>roads</ogc:Literal>
</ogc:PropertyIsEqualTo>
<ogc:BBOX>
<ogc:PropertyName>ows:BoundingBox</ogc:PropertyName>
<gml:Envelope>
<gml:lowerCorner>47 -5</gml:lowerCorner>
<gml:upperCorner>55 20</gml:upperCorner>
</gml:Envelope>
</ogc:BBOX>
</ogc:And>
</ogc:Filter>
</csw:Constraint>
</csw:Query>
</csw:GetRecords>
Query the total number of records in the catalogue (HTTP GET):
http://catalog.data.gov/csw?service=CSW&version=2.0.2&request=GetRecords&typenames=csw:Record&elementsetname=brief
The GetRecordById operation returns detailed information for specific metadata records.
Examples:
HTTP GET: http://catalog.data.gov/csw?service=CSW&version=2.0.2&request=GetRecordById&id=16bbf4f8-8e88-45c6-a76b-6af51b2b3555&elementsetname=full
HTTP GET (ISO 19139):
http://catalog.data.gov/csw?service=CSW&version=2.0.2&request=GetRecordById&id=16bbf4f8-8e88-45c6-a76b-6af51b2b3555&elementsetname=full&outputSchema=http://www.isotc211.org/2005/gmd
HTTP GET (ISO 19139 brief):
http://catalog.data.gov/csw?service=CSW&version=2.0.2&request=GetRecordById&id=16bbf4f8-8e88-45c6-a76b-6af51b2b3555&elementsetname=brief&outputSchema=http://www.isotc211.org/2005/gmd
HTTP POST (XML):
<csw:GetRecordById service="CSW" version="2.0.2" outputFormat="application/xml" outputSchema="http://www.opengis.net/cat/csw/2.0.2" xmlns:csw="http://www.opengis.net/cat/csw/2.0.2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.opengis.net/cat/csw/2.0.2 http://schemas.opengis.net/csw/2.0.2/CSW-discovery.xsd">
<csw:Id>16bbf4f8-8e88-45c6-a76b-6af51b2b3555</csw:Id>
<csw:ElementSetName>full</csw:ElementSetName>
</csw:GetRecordById>