Scripts to download information from the online version of the Dictionary of the Spanish Language of the Royal Spanish Academy (RAE).
Please use these scripts responsibly and never cause a DoS attack against the rae.es services.
You'll need Ruby, PostgreSQL, and a Redis server.
Set up the project with bin/install: this script installs the Ruby dependencies required to run the scripts and creates the database where the words and their definitions will be stored. It also creates a .env file with some configuration parameters. You may need to adjust your PostgreSQL and Redis connection details there.
The database contains a words table with the following fields:
- word: the string with the word
- data: JSON data fetched with the nebrija gem
- defined_at: timestamp indicating when the word got defined (i.e., when the data column got updated)
- created_at and updated_at: the usual record creation and update timestamps
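As a rough illustration of the table described above, one row can be pictured as a plain Ruby object. This is only a sketch: WordRow and its undefined? helper are hypothetical names, not part of the project, and the JSON payload shape is invented for the example.

```ruby
require "json"
require "time"

# Hypothetical in-memory stand-in for one row of the words table.
# The field names mirror the columns listed above.
WordRow = Struct.new(:word, :data, :defined_at, :created_at, :updated_at, keyword_init: true) do
  # A word counts as undefined until defined_at has been set.
  def undefined?
    defined_at.nil?
  end
end

# Builds a row as it might look right after seeding: no data yet.
def new_word_row(word, now: Time.now.utc)
  WordRow.new(word: word, data: nil, defined_at: nil, created_at: now, updated_at: now)
end
```

A freshly seeded row reports undefined? as true; once data holds a JSON string and defined_at is set, it reports false.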
To insert the words from data/lemario.txt into the database, run:
bin/seed
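The seeding step can be pictured roughly as follows. This sketch assumes the word list holds one word per line, which is not stated here, and parse_lemario is a hypothetical helper rather than the code behind bin/seed. It only prepares the list of words; the database insert is left out.

```ruby
# Hypothetical helper: turns the raw text of a word list into a clean,
# duplicate-free array of words. Assumes one word per line.
def parse_lemario(text)
  text.each_line
      .map(&:strip)      # drop surrounding whitespace and newlines
      .reject(&:empty?)  # skip blank lines
      .uniq              # keep the first occurrence of each word
end
```

Deduplicating before inserting avoids creating the same word twice if the list repeats an entry.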
To list the number of existing words, the number of undefined ones, and the number of workers scheduled, run:
bin/stats
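The two word counts can be sketched over in-memory rows as shown below. word_stats is a hypothetical helper, and rows are modeled as plain hashes; the count of scheduled workers is omitted because it depends on the job queue rather than on the table.

```ruby
# Hypothetical helper: rows are hashes with :word and :defined_at keys.
# A row is undefined while its :defined_at is nil.
def word_stats(rows)
  undefined = rows.count { |row| row[:defined_at].nil? }
  { total: rows.size, undefined: undefined, defined: rows.size - undefined }
end
```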
Start Sidekiq so that workers can be scheduled and executed:
bin/sidekiq
While Sidekiq runs in one terminal, open another one and schedule workers for all the undefined words by running:
bin/schedule
Each worker uses nebrija to download the data for the word and stores it in the word's data field.
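What a single worker does can be sketched as below. This is a simplified illustration, not the project's worker: define_word is a hypothetical function, the row is a plain hash, and fetcher is any callable standing in for the nebrija lookup, whose real API is not shown here.

```ruby
require "json"
require "time"

# Hypothetical worker body. `fetcher` stands in for the nebrija lookup and
# must respond to #call(word), returning a Hash or nil.
def define_word(row, fetcher, now: Time.now.utc)
  result = fetcher.call(row[:word])
  return row if result.nil? # leave the word undefined so it can be retried

  # Store the fetched data as JSON and mark the word as defined.
  row.merge(data: JSON.generate(result), defined_at: now, updated_at: now)
end
```

Passing the fetcher in as an argument keeps the sketch testable without network access; a lambda such as ->(word) { { "word" => word } } is enough to exercise it.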
Show the data for a given word with:
bin/show palabra
You can load the library in an IRB session for debugging with:
bin/console
- The lemario.txt file comes from the collection compiled by Ismael Olea.