# Build release

Because the original release files have encoding problems, we need to process both the MDB and the CSV releases. The dataset itself is built from the CSV version of the current edition.

Tools needed: MDBTools and CSVKit.

Download the current edition from UNECE and put it into the root directory of the repository. Then execute `bash scripts/prepare_edition_mdb.sh loc{ed}mdb.zip`, where `{ed}` identifies the release (for example, `232`).
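Before running the shell script, it can help to confirm that the required command-line tools are installed. The following is a minimal sketch, not part of this repository: the function name `missing_tools` is hypothetical, and the executable names checked (`mdb-export` from MDBTools, `csvcut` from CSVKit) are an assumption about which commands the script calls.

```python
import shutil


def missing_tools(names):
    """Return the executables from `names` that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # Assumed executable names; adjust to whatever the shell script really calls.
    required = ["mdb-export", "csvcut"]
    missing = missing_tools(required)
    if missing:
        print("Missing tools: " + ", ".join(missing))
    else:
        print("All required tools were found.")
```

An empty list means every listed executable was found on `PATH`.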

To integrate the data from the CSV release, run the Python script as follows.

Prerequisites:

pip install pandas titlecase

Run:

python scripts/integrate.py loc232csv.zip
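To illustrate the kind of encoding problem mentioned above, here is a minimal sketch of reading CSV files out of an edition archive with a fallback encoding. It is not the logic of `integrate.py`: the helper names `decode_with_fallback` and `read_csv_members` are hypothetical, and the candidate encodings (UTF-8, then Windows-1252, then Latin-1) are an assumption about how the upstream files might be encoded.

```python
import csv
import io
import zipfile


def decode_with_fallback(raw, encodings=("utf-8", "cp1252", "latin-1")):
    """Decode bytes with the first candidate encoding that does not raise."""
    for encoding in encodings:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode the data")


def read_csv_members(zip_path):
    """Return {member name: list of rows} for every .csv file in the archive."""
    tables = {}
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue
            text, _ = decode_with_fallback(archive.read(name))
            # newline="" lets the csv module handle line endings itself.
            tables[name] = list(csv.reader(io.StringIO(text, newline="")))
    return tables
```

Calling `read_csv_members("loc232csv.zip")` would return the rows of each CSV member as lists of strings. A fallback chain like this only avoids decode errors; it cannot tell whether the chosen encoding is the right one, so the output should still be checked by eye.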

The provided prepare.py script will work on its own once the original CSV file is fixed upstream.