[Bug] Error when Python Model Goes To Write To Database #628
Comments
@Avinash-1394 maybe you have some clue here, since you implemented the python models feature?
I will try to recreate this issue.
@Avinash-1394 do you think this could be the issue? https://docs.aws.amazon.com/athena/latest/ug/notebooks-spark-troubleshooting-tables.html#notebooks-spark-troubleshooting-tables-illegal-argument-exception My guess is that the database was initially created by an Athena SQL model, which didn't set the location field at the database level. The user then tries to write to the same database with a python model, and it throws the error.
@iamrobo That does make sense. I didn't test the source feature when I added python models, so I will have to recreate this to fully understand what is happening. I just assumed it would be very similar to ref, but it looks like there are some differences I didn't foresee.
This is correct! I went in and modified the table directly and added a location and it did work.
@cuff-links Glad you found a resolution so quickly. Do you mind providing the steps you took to fix it, so that they are documented in the GH issue itself? I will let @nicor88 decide if the issue should be closed or kept open.
@Avinash-1394 if no action is needed in terms of code changes, we can close it.
@nicor88 I would say that it's more of a workaround. The issue still persists. I was following the dialog here and saw the comments about the location field on the database. It was empty, so I filled it out in AWS and reran the model, and it worked. But I think there should still be a fix, since this would take manual intervention if the same situation happens again. I think the models should be usable interchangeably.
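For anyone hitting the same thing, the manual workaround above (filling in the empty location field on the database) could probably be scripted. The following is only a rough sketch, assuming the database lives in the AWS Glue Data Catalog and that a boto3 Glue client is passed in; the function name, the database name, and the S3 path are hypothetical and not part of dbt-athena.

```python
def ensure_database_location(glue_client, database_name, location_uri):
    """Set LocationUri on a Glue database only if it is currently empty.

    `glue_client` is expected to behave like boto3.client("glue").
    Returns True if the database was updated, False if it already had a location.
    """
    database = glue_client.get_database(Name=database_name)["Database"]

    # Leave databases that already have a location untouched.
    if database.get("LocationUri"):
        return False

    # update_database replaces the definition, so carry over existing fields.
    database_input = {"Name": database["Name"], "LocationUri": location_uri}
    for key in ("Description", "Parameters"):
        if key in database:
            database_input[key] = database[key]

    glue_client.update_database(Name=database_name, DatabaseInput=database_input)
    return True


# Hypothetical usage (names are placeholders, not taken from this issue):
#   import boto3
#   ensure_database_location(boto3.client("glue"), "my_database", "s3://my-bucket/my_database/")
```

This only touches the database when the location is missing, which matches the concern raised in this thread about not modifying databases that already exist with a location.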
@cuff-links Let me get this right: without a location on the database, this won't work? Also, I really recommend handling database creation outside dbt; this applies to whatever system you use. I'm not sure, then, that we need to fix the issue here.
@nicor88, would it make sense to add the location to the database creation in the SQL materialization macros, or to add the location if it doesn't already exist in the python materialization macros, or just throw a better exception?
I believe that we should throw a better error, and then add a section about python models to the documentation (in the readme), specifying that the location must be set. I'm not sure that we should add the location in the SQL materialization macro. If the db is created from scratch it could make sense, but if the db already exists we shouldn't touch it. @cuff-links in your case the source db already existed, right?
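As an illustration of the "better error" idea, here is a minimal sketch of a pre-flight check, assuming the adapter has access to the Glue database definition (the dict under the "Database" key of a Glue get_database response). The exception class and function names are hypothetical; this is not how dbt-athena implements it.

```python
class MissingDatabaseLocationError(Exception):
    """Raised when a database has no location and a python model needs one."""


def require_database_location(database):
    """Return the database location, or raise a descriptive error if it is empty.

    `database` is assumed to be a dict shaped like the "Database" entry
    of a Glue get_database response.
    """
    location = (database.get("LocationUri") or "").strip()
    if not location:
        raise MissingDatabaseLocationError(
            f"Database '{database.get('Name')}' has no location set. "
            "Python models need a database-level location; "
            "set one on the database before running the model."
        )
    return location
```

Failing early with a message like this would point users at the database location instead of the Spark stack trace.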
Yes, that is correct. I agree that a better error and some documentation would be good for this.
Should this still be open?
@cuff-links yes, none of the actions above have been taken.
Is this a new bug in dbt-athena?
Current Behavior
When I run dbt run for a python model, there is a crash when the data is being inserted into the output. This does not happen with an equivalent SQL model.
Expected Behavior
We expect the database and tables to be created without error, as they are with the SQL model.
Steps To Reproduce
1.) With the config and models provided in the additional context section, attempt to run dbt run with that single model.
Environment
Additional Context
Our Python Model
Our Equivalent SQL Model
Tables and Database list in AWS
Stack Trace from Spark Session
Output from console with debug flag on
Output from console