Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Implement iceberg retry logic #657

Merged
merged 14 commits into from
May 28, 2024
Merged

feat: Implement iceberg retry logic #657

merged 14 commits into from
May 28, 2024

Conversation

svdimchenko
Copy link
Contributor

Description

Implemented num_iceberg_retries parameters

Checklist

  • You followed contributing section
  • You kept your Pull Request small and focused on a single feature or bug fix.
  • You added unit testing when necessary
  • You added functional testing when necessary

@svdimchenko svdimchenko changed the title Implement iceberg retry logic feat: Implement iceberg retry logic May 22, 2024
@nicor88
Copy link
Contributor

nicor88 commented May 22, 2024

Code looks good, I'm wondering if we can use the test suite for functional testing to attempt to run few incremental models with append behavior to simulate a concurrent insert scenario.

@nicor88
Copy link
Contributor

nicor88 commented May 28, 2024

@svdimchenko if you have some urgency on merging this, I'm fine to to merge as it is, because you did the functional testing in your environment. It will be a nice to have to add the functional testing of running the same "model" concurrently, but it's not really necessary before the merging.

nicor88
nicor88 previously approved these changes May 28, 2024
@svdimchenko
Copy link
Contributor Author

@svdimchenko if you have some urgency on merging this, I'm fine to to merge as it is, because you did the functional testing in your environment. It will be a nice to have to add the functional testing of running the same "model" concurrently, but it's not really necessary before the merging.

I've managed to add functional tests, but somehow functional tests are not run in ci as they are marked as skipped(

@nicor88
Copy link
Contributor

nicor88 commented May 28, 2024

@svdimchenko I noticed in other prs too that the functional tests were skipped, they should kick in now. I don't know what lead to that issue, but I removed the label and added it again

Copy link
Contributor

@nicor88 nicor88 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job, specially on the functional tests

@svdimchenko svdimchenko merged commit 76f6d5f into main May 28, 2024
10 checks passed
@svdimchenko svdimchenko deleted the feat/iceberg-retries branch May 28, 2024 17:14
kodiakhq bot referenced this pull request in cloudquery/policies Jun 5, 2024
This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [dbt-athena-community](https://togithub.com/dbt-athena/dbt-athena) | minor | `==1.7.2` -> `==1.8.2` |

---

### Release Notes

<details>
<summary>dbt-athena/dbt-athena (dbt-athena-community)</summary>

### [`v1.8.2`](https://togithub.com/dbt-athena/dbt-athena/releases/tag/v1.8.2)

[Compare Source](https://togithub.com/dbt-athena/dbt-athena/compare/v1.8.1...v1.8.2)

### What's Changed

#### Fixes

-   fix: Add wait_random_exponential for query retries by [@&#8203;svdimchenko](https://togithub.com/svdimchenko) in [https://github.com/dbt-athena/dbt-athena/pull/655](https://togithub.com/dbt-athena/dbt-athena/pull/655)
-   fix: Resolve error when cloning Python models ([#&#8203;645](https://togithub.com/dbt-athena/dbt-athena/issues/645)) by [@&#8203;jeancochrane](https://togithub.com/jeancochrane) in [https://github.com/dbt-athena/dbt-athena/pull/651](https://togithub.com/dbt-athena/dbt-athena/pull/651)
-   fix: Fixed table_type for GOVERNED tables by [@&#8203;svdimchenko](https://togithub.com/svdimchenko) in [https://github.com/dbt-athena/dbt-athena/pull/661](https://togithub.com/dbt-athena/dbt-athena/pull/661)

#### Features

-   feat: Set unique table suffix to allow parallel incremental executions by [@&#8203;pierrebzl](https://togithub.com/pierrebzl) in [https://github.com/dbt-athena/dbt-athena/pull/650](https://togithub.com/dbt-athena/dbt-athena/pull/650)
-   feat: Allow custom schema def for tmp tables generated by incremental by [@&#8203;pierrebzl](https://togithub.com/pierrebzl) in [https://github.com/dbt-athena/dbt-athena/pull/659](https://togithub.com/dbt-athena/dbt-athena/pull/659)
-   feat: Implement iceberg retry logic by [@&#8203;svdimchenko](https://togithub.com/svdimchenko) in [https://github.com/dbt-athena/dbt-athena/pull/657](https://togithub.com/dbt-athena/dbt-athena/pull/657)

#### Dependencies

-   chore: Update moto requirement from ~=5.0.7 to ~=5.0.8 by [@&#8203;dependabot](https://togithub.com/dependabot) in [https://github.com/dbt-athena/dbt-athena/pull/660](https://togithub.com/dbt-athena/dbt-athena/pull/660)
-   chore: Bumped version to 1.8.2 for release by [@&#8203;svdimchenko](https://togithub.com/svdimchenko) in [https://github.com/dbt-athena/dbt-athena/pull/663](https://togithub.com/dbt-athena/dbt-athena/pull/663)

#### Docs

-   docs: Cleanup README grammar, punctuation, and capitalisation by [@&#8203;dfsnow](https://togithub.com/dfsnow) in [https://github.com/dbt-athena/dbt-athena/pull/654](https://togithub.com/dbt-athena/dbt-athena/pull/654)

#### New Contributors

-   [@&#8203;jeancochrane](https://togithub.com/jeancochrane) made their first contribution in [https://github.com/dbt-athena/dbt-athena/pull/651](https://togithub.com/dbt-athena/dbt-athena/pull/651)
-   [@&#8203;pierrebzl](https://togithub.com/pierrebzl) made their first contribution in [https://github.com/dbt-athena/dbt-athena/pull/650](https://togithub.com/dbt-athena/dbt-athena/pull/650)

**Full Changelog**: dbt-labs/dbt-athena@v1.8.1...v1.8.2

### [`v1.8.1`](https://togithub.com/dbt-athena/dbt-athena/releases/tag/v1.8.1)

[Compare Source](https://togithub.com/dbt-athena/dbt-athena/compare/v1.7.2...v1.8.1)

#### What's Changed

##### Relevant notes

⚠️ 1.8.1 version is equivalent to 1.8.0 in term of features and fixes.

You can install the changes from this release via

    pip install dbt-athena-community==1.8.1

##### Features

-   feat: Add column meta to glue column parameters by [@&#8203;SoumayaMauthoorMOJ](https://togithub.com/SoumayaMauthoorMOJ) in [https://github.com/dbt-athena/dbt-athena/pull/644](https://togithub.com/dbt-athena/dbt-athena/pull/644)

##### Dependencies

-   chore: Update moto requirement from ~=5.0.6 to ~=5.0.7 by [@&#8203;dependabot](https://togithub.com/dependabot) in [https://github.com/dbt-athena/dbt-athena/pull/648](https://togithub.com/dbt-athena/dbt-athena/pull/648)

##### Docs

-   docs: Cleanup Python models section of README by [@&#8203;dfsnow](https://togithub.com/dfsnow) in [https://github.com/dbt-athena/dbt-athena/pull/643](https://togithub.com/dbt-athena/dbt-athena/pull/643)

#### New Contributors

-   [@&#8203;dfsnow](https://togithub.com/dfsnow) made their first contribution in [https://github.com/dbt-athena/dbt-athena/pull/643](https://togithub.com/dbt-athena/dbt-athena/pull/643)

</details>

---

### Configuration

📅 **Schedule**: Branch creation - "before 4am on the first day of the month" (UTC), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box

---

This PR has been generated by [Renovate Bot](https://togithub.com/renovatebot/renovate).
<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNy4zODMuMCIsInVwZGF0ZWRJblZlciI6IjM3LjM4My4wIiwidGFyZ2V0QnJhbmNoIjoibWFpbiIsImxhYmVscyI6WyJhdXRvbWVyZ2UiXX0=-->
expected__post_seed_name = "expected_target_post"
run_dbt(["seed", "--select", expected__post_seed_name, "--full-refresh"])

run, log = run_dbt_and_capture(["run", "--select", "tag:src"], expect_pass=False)
Copy link

@VanTudor VanTudor Jan 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@svdimchenko @nicor88. Doing a bit of PR archeology here bc of some recent issues with this file 😁 What's the precondition that would cause this test to reliably fail once, but pass, once you retry it? Unless I'm missing something, the logic up to this point is identical to the one in TestIcebergRetriesEnabled.test__retries_iceberg(), save for the configured number of retries.

This test causes issues and seems quite flaky now -- I'm trying to determine whether it's other changes that impacted this or the test itself is flaky. Thanks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@VanTudor I can not reproduce this error locally. Can you ?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@svdimchenko it's very inconsistent - I think I managed to reproduce it locally like three or four times, but I don't have a way to trigger it reliably. The problem is that this test expects at least one error to appear when you run a command (TestIcebergRetriesDisabled has profile["num_iceberg_retries"] = 0). So if this error appears, the test passes. However, sometimes the query executes successfully from the first try, which causes this test to fail.

I'm failing to see why the initial expectation was that it would throw an error on the first run, but not on the second (note that TestIcebergRetriesEnabled runs the same thing, but with profile["num_iceberg_retries"] = 1).

The way I see it now, whether the query fails on the first run depends on chance, which makes the success of this test inconsistent .

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's start to define why the retry logic was added.

Iceberg doesn't support natively concurrent transactions - this means that if x inserts statements that happen in parallel or concurrently to the same table, they will fail. One way to solve this is via a retry mechanism.

Now, as you can imagine reproducing that scenario is tricky.

The only assumption that I can make is that if num_iceberg_retries are high enough considering the amount of transactions, the test shouldn't never fail - that doesn't seems to be the case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, it's great that the test fails 🙃
This means that queries are run successfully even without extra retries for iceberg, that's why I support the decision to mark this test as skipped here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants