Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fix][xl] Fixing the workflow in the dataset #33

Merged
merged 7 commits into from
Oct 29, 2024
Merged

Conversation

gradedSystem
Copy link
Member

Changes made

  • Fixed down the workflow of getting the files from unlocode website
  • adjusted the scripts
  • added new data
  • fixed bash script
  • added Makefile

cc @anuveyatsu @sabas

@gradedSystem
Copy link
Member Author

hi @sabas can you confirm this looks fine?

@sabas
Copy link
Contributor

sabas commented Oct 15, 2024

Thanks, questions:

  • are the quotation marks necessary for all the fields in the output? (if I remember I used csvclean to remove them)
  • csvkit and mdbtools are prerequisites for the bash script I wonder if you need to tell this somewhere to allow for automation
  • subdivision codes don't work, as it's missing the SUType column

@gradedSystem
Copy link
Member Author

Thank you for reviewing @sabas :

  1. Yes i can remove the quotation marks
  2. As for the automation it could be done in .github/workflows I will adjust this PR to the latest changes [add][m] Adding github workflow #31
  3. I will fix the script and add subdivision codes

@gradedSystem
Copy link
Member Author

@sabas With my latest script I handled the case that you mentioned

  1. Quotation marks removed from .csv files
  2. I will update the script [add][m] Adding github workflow #31 and automate updating the mdbtools and others
  3. Subdivision codes now should have SUType column

Copy link
Contributor

@sabas sabas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The remove_double_quotes doesn't work for code-list.csv, it needs to be replaced by a csv handler

scripts/prepare.py Show resolved Hide resolved
data/subdivision-codes.csv Outdated Show resolved Hide resolved
@gradedSystem gradedSystem requested a review from sabas October 25, 2024 10:39
@sabas
Copy link
Contributor

sabas commented Oct 25, 2024

@gradedSystem please fix the latest outstanding issue in subdivision file and it'd be good to merge
Build failed currently as requirements.txt is in this PR

@gradedSystem
Copy link
Member Author

@sabas Fixed down the code with the latest oustanding issue #32

@sabas
Copy link
Contributor

sabas commented Oct 29, 2024

I don't get it, I asked to change subdivision NA issue..

@gradedSystem
Copy link
Member Author

Hi @sabas, I thought that when you said latest oustanding issue it was one that was mentioned here: #32
If I understood you correctly you are saying to fix the issues when SUCode has a missing value?

@sabas
Copy link
Contributor

sabas commented Oct 29, 2024

@gradedSystem see the notes I added in subdivision-codes, the script omits NA in the fields, and the name has spaces that can be trimmed

@gradedSystem
Copy link
Member Author

@sabas should I keep the function which addresses latest oustanding issue? this one #32 that converts coordinates.

btw I fixed down the subdivision NA issue just waiting your response for above question.

@gradedSystem
Copy link
Member Author

@sabas fixed the issue

@sabas
Copy link
Contributor

sabas commented Oct 29, 2024

@sabas should I keep the function which addresses latest oustanding issue? this one #32 that converts coordinates.

btw I fixed down the subdivision NA issue just waiting your response for above question.

Sorry, I got it now, No please keep them as in the original dataset, conversion shall be done downstream

@gradedSystem
Copy link
Member Author

@sabas Ok got it i will remove the function for conversion and I think it is good to go

@gradedSystem
Copy link
Member Author

@sabas should be fixed now

@sabas sabas merged commit 6dee4f0 into datasets:main Oct 29, 2024
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants