Python scripts and Jupyter notebooks to import the G-NAF dataset into Elasticsearch via PostgreSQL, along with a script that generates dummy Australian demographic data and imports it into Elasticsearch.
The Geocoded National Address File (G-NAF) is Australia's authoritative, geocoded address file.
G-NAF is produced by PSMA Australia, and its G-NAF page includes a Getting Started Guide that you can follow to import G-NAF into a relational database. The guide covers almost everything you need, but one thing is missing: the actual method for loading the data files into the database.
I import G-NAF into PostgreSQL first, then from PostgreSQL into Elasticsearch.
`copy_gnaf_to_postgres.ipynb`
generates PostgreSQL's `COPY` command for loading the data files into Postgres. Just copy the generated command and run it in PostgreSQL.
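For illustration, here is a minimal sketch of the idea. The directory layout and the file-naming convention are assumptions based on a standard G-NAF extract; the notebook is the authoritative version:

```python
# A minimal sketch of what the notebook generates, assuming a standard G-NAF
# extract of pipe-separated .psv files named like "VIC_ADDRESS_DETAIL_psv.psv".
# The target tables must already exist (the Getting Started Guide provides the
# CREATE TABLE scripts).
from pathlib import Path

GNAF_DIR = Path("G-NAF/Standard")  # hypothetical path to the extracted files

for psv in sorted(GNAF_DIR.glob("*.psv")):
    # Strip the state prefix and the "_psv" suffix to recover the table name,
    # e.g. "VIC_ADDRESS_DETAIL_psv.psv" -> "address_detail".
    table = psv.stem.split("_", 1)[1].removesuffix("_psv").lower()
    # G-NAF files are pipe-delimited with a header row.
    print(
        f"\\copy {table} FROM '{psv.resolve()}' "
        "WITH (FORMAT csv, HEADER true, DELIMITER '|');"
    )
```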
`elastic_gnaf.py`
imports G-NAF from PostgreSQL into Elasticsearch.
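The overall shape of that step looks roughly like the sketch below. The connection settings, the index name, and `address_view` (a hypothetical flattened view joining the G-NAF tables) are illustrative assumptions, not the script's actual names:

```python
# A minimal sketch of the Postgres -> Elasticsearch step, not the script
# itself. The DSN, index name, and "address_view" are assumptions.
import psycopg2
from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk

es = Elasticsearch("http://localhost:9200")
conn = psycopg2.connect("dbname=gnaf user=postgres")

def actions():
    # A named (server-side) cursor streams rows instead of loading the
    # whole result set into memory, which matters at G-NAF's scale.
    with conn.cursor(name="gnaf_export") as cur:
        cur.execute(
            "SELECT address_detail_pid, full_address, latitude, longitude "
            "FROM address_view"
        )
        for pid, address, lat, lon in cur:
            yield {
                "_index": "gnaf",
                "_id": pid,
                "_source": {
                    "address": address,
                    "location": {"lat": float(lat), "lon": float(lon)},
                },
            }

bulk(es, actions())
conn.close()
```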
`elastic_australian_people.py`
is not related to G-NAF; it generates dummy Australian demographic data and imports it into Elasticsearch, in case you need that. 😊
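One way to sketch that is with the Faker library's Australian locale; the index name and field set here are illustrative assumptions, not necessarily the script's own:

```python
# A minimal sketch of generating dummy Australian demographic records and
# bulk-indexing them into Elasticsearch.
from faker import Faker
from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk

fake = Faker("en_AU")  # Australian-flavoured names, addresses, phone numbers
es = Elasticsearch("http://localhost:9200")

def people(n=1000):
    for i in range(n):
        yield {
            "_index": "australian_people",
            "_id": i,
            "_source": {
                "name": fake.name(),
                "date_of_birth": fake.date_of_birth(
                    minimum_age=18, maximum_age=90
                ).isoformat(),
                "address": fake.address().replace("\n", ", "),
                "phone": fake.phone_number(),
            },
        }

bulk(es, people())
```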
Related projects:
- data61/gnaf: A set of utilities developed by CSIRO's Data61 to import G-NAF into a relational database, build an Apache Lucene index (which Elasticsearch uses under the hood), and more.
- Building real-time address search with the Australian G-NAF dataset: A blog post from Elastic with a similar goal to this repository, but using Microsoft F#.
- aus-search: Uses Node.js and MongoDB to import G-NAF into Elasticsearch.