-
Notifications
You must be signed in to change notification settings - Fork 103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Implement iceberg retry logic #657
Conversation
Code looks good, I'm wondering if we can use the test suite for functional testing to attempt to run few incremental models with append behavior to simulate a concurrent insert scenario. |
@svdimchenko if you have some urgency on merging this, I'm fine to to merge as it is, because you did the functional testing in your environment. It will be a nice to have to add the functional testing of running the same "model" concurrently, but it's not really necessary before the merging. |
I've managed to add functional tests, but somehow functional tests are not run in ci as they are marked as skipped( |
@svdimchenko I noticed in other prs too that the functional tests were skipped, they should kick in now. I don't know what lead to that issue, but I removed the label and added it again |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great job, specially on the functional tests
This PR contains the following updates: | Package | Update | Change | |---|---|---| | [dbt-athena-community](https://togithub.com/dbt-athena/dbt-athena) | minor | `==1.7.2` -> `==1.8.2` | --- ### Release Notes <details> <summary>dbt-athena/dbt-athena (dbt-athena-community)</summary> ### [`v1.8.2`](https://togithub.com/dbt-athena/dbt-athena/releases/tag/v1.8.2) [Compare Source](https://togithub.com/dbt-athena/dbt-athena/compare/v1.8.1...v1.8.2) ### What's Changed #### Fixes - fix: Add wait_random_exponential for query retries by [@​svdimchenko](https://togithub.com/svdimchenko) in [https://github.com/dbt-athena/dbt-athena/pull/655](https://togithub.com/dbt-athena/dbt-athena/pull/655) - fix: Resolve error when cloning Python models ([#​645](https://togithub.com/dbt-athena/dbt-athena/issues/645)) by [@​jeancochrane](https://togithub.com/jeancochrane) in [https://github.com/dbt-athena/dbt-athena/pull/651](https://togithub.com/dbt-athena/dbt-athena/pull/651) - fix: Fixed table_type for GOVERNED tables by [@​svdimchenko](https://togithub.com/svdimchenko) in [https://github.com/dbt-athena/dbt-athena/pull/661](https://togithub.com/dbt-athena/dbt-athena/pull/661) #### Features - feat: Set unique table suffix to allow parallel incremental executions by [@​pierrebzl](https://togithub.com/pierrebzl) in [https://github.com/dbt-athena/dbt-athena/pull/650](https://togithub.com/dbt-athena/dbt-athena/pull/650) - feat: Allow custom schema def for tmp tables generated by incremental by [@​pierrebzl](https://togithub.com/pierrebzl) in [https://github.com/dbt-athena/dbt-athena/pull/659](https://togithub.com/dbt-athena/dbt-athena/pull/659) - feat: Implement iceberg retry logic by [@​svdimchenko](https://togithub.com/svdimchenko) in [https://github.com/dbt-athena/dbt-athena/pull/657](https://togithub.com/dbt-athena/dbt-athena/pull/657) #### Dependencies - chore: Update moto requirement from ~=5.0.7 to ~=5.0.8 by [@​dependabot](https://togithub.com/dependabot) in [https://github.com/dbt-athena/dbt-athena/pull/660](https://togithub.com/dbt-athena/dbt-athena/pull/660) - chore: Bumped version to 1.8.2 for release by [@​svdimchenko](https://togithub.com/svdimchenko) in [https://github.com/dbt-athena/dbt-athena/pull/663](https://togithub.com/dbt-athena/dbt-athena/pull/663) #### Docs - docs: Cleanup README grammar, punctuation, and capitalisation by [@​dfsnow](https://togithub.com/dfsnow) in [https://github.com/dbt-athena/dbt-athena/pull/654](https://togithub.com/dbt-athena/dbt-athena/pull/654) #### New Contributors - [@​jeancochrane](https://togithub.com/jeancochrane) made their first contribution in [https://github.com/dbt-athena/dbt-athena/pull/651](https://togithub.com/dbt-athena/dbt-athena/pull/651) - [@​pierrebzl](https://togithub.com/pierrebzl) made their first contribution in [https://github.com/dbt-athena/dbt-athena/pull/650](https://togithub.com/dbt-athena/dbt-athena/pull/650) **Full Changelog**: dbt-labs/dbt-athena@v1.8.1...v1.8.2 ### [`v1.8.1`](https://togithub.com/dbt-athena/dbt-athena/releases/tag/v1.8.1) [Compare Source](https://togithub.com/dbt-athena/dbt-athena/compare/v1.7.2...v1.8.1) #### What's Changed ##### Relevant notes⚠️ 1.8.1 version is equivalent to 1.8.0 in term of features and fixes. You can install the changes from this release via pip install dbt-athena-community==1.8.1 ##### Features - feat: Add column meta to glue column parameters by [@​SoumayaMauthoorMOJ](https://togithub.com/SoumayaMauthoorMOJ) in [https://github.com/dbt-athena/dbt-athena/pull/644](https://togithub.com/dbt-athena/dbt-athena/pull/644) ##### Dependencies - chore: Update moto requirement from ~=5.0.6 to ~=5.0.7 by [@​dependabot](https://togithub.com/dependabot) in [https://github.com/dbt-athena/dbt-athena/pull/648](https://togithub.com/dbt-athena/dbt-athena/pull/648) ##### Docs - docs: Cleanup Python models section of README by [@​dfsnow](https://togithub.com/dfsnow) in [https://github.com/dbt-athena/dbt-athena/pull/643](https://togithub.com/dbt-athena/dbt-athena/pull/643) #### New Contributors - [@​dfsnow](https://togithub.com/dfsnow) made their first contribution in [https://github.com/dbt-athena/dbt-athena/pull/643](https://togithub.com/dbt-athena/dbt-athena/pull/643) </details> --- ### Configuration 📅 **Schedule**: Branch creation - "before 4am on the first day of the month" (UTC), Automerge - At any time (no schedule defined). 🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied. ♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox. 🔕 **Ignore**: Close this PR and you won't be reminded about this update again. --- - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box --- This PR has been generated by [Renovate Bot](https://togithub.com/renovatebot/renovate). <!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNy4zODMuMCIsInVwZGF0ZWRJblZlciI6IjM3LjM4My4wIiwidGFyZ2V0QnJhbmNoIjoibWFpbiIsImxhYmVscyI6WyJhdXRvbWVyZ2UiXX0=-->
expected__post_seed_name = "expected_target_post" | ||
run_dbt(["seed", "--select", expected__post_seed_name, "--full-refresh"]) | ||
|
||
run, log = run_dbt_and_capture(["run", "--select", "tag:src"], expect_pass=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@svdimchenko @nicor88. Doing a bit of PR archeology here bc of some recent issues with this file 😁 What's the precondition that would cause this test to reliably fail once, but pass, once you retry it? Unless I'm missing something, the logic up to this point is identical to the one in TestIcebergRetriesEnabled.test__retries_iceberg()
, save for the configured number of retries.
This test causes issues and seems quite flaky now -- I'm trying to determine whether it's other changes that impacted this or the test itself is flaky. Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@VanTudor I can not reproduce this error locally. Can you ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@svdimchenko it's very inconsistent - I think I managed to reproduce it locally like three or four times, but I don't have a way to trigger it reliably. The problem is that this test expects at least one error to appear when you run a command (TestIcebergRetriesDisabled
has profile["num_iceberg_retries"] = 0
). So if this error appears, the test passes. However, sometimes the query executes successfully from the first try, which causes this test to fail.
I'm failing to see why the initial expectation was that it would throw an error on the first run, but not on the second (note that TestIcebergRetriesEnabled
runs the same thing, but with profile["num_iceberg_retries"] = 1
).
The way I see it now, whether the query fails on the first run depends on chance, which makes the success of this test inconsistent .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's start to define why the retry logic was added.
Iceberg doesn't support natively concurrent transactions - this means that if x inserts statements that happen in parallel or concurrently to the same table, they will fail. One way to solve this is via a retry mechanism.
Now, as you can imagine reproducing that scenario is tricky.
The only assumption that I can make is that if num_iceberg_retries are high enough considering the amount of transactions, the test shouldn't never fail - that doesn't seems to be the case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, it's great that the test fails 🙃
This means that queries are run successfully even without extra retries for iceberg, that's why I support the decision to mark this test as skipped here.
Description
Implemented num_iceberg_retries parameters
Checklist