[Feature] Concurrency issues in snowflake incremental models. #1123

Open
@rattata2me

Description

Is this a new bug in dbt-snowflake?

  • I believe this is a new bug in dbt-snowflake
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

When running an incremental model with the on_schema_change policy set to append_new_columns, a race condition can occur. If two jobs perform the column check at the same time, both generate the same ALTER TABLE statement, and the slower executor fails with a SQL compilation error:

SQL Compilation error:
ERROR
  column '<new_column>' already exists

The core issue is that we cannot ensure the column schema remains unchanged between the column check operation and the ALTER TABLE statement.
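To make the check-then-act gap concrete, here is a minimal, self-contained simulation in Python. It does not touch dbt or Snowflake: `FakeTable`, `run_job`, and `simulate` are hypothetical names invented for illustration, and a `threading.Barrier` forces both jobs to finish the column check before either one alters the table, which is the interleaving described above.

```python
import threading


class FakeTable:
    """In-memory stand-in for a warehouse table (hypothetical, not a Snowflake API)."""

    def __init__(self, columns):
        self.columns = set(columns)
        self._lock = threading.Lock()

    def add_column(self, name):
        # Mimics ALTER TABLE ... ADD COLUMN: fails if the column already exists.
        with self._lock:
            if name in self.columns:
                raise RuntimeError(f"column '{name}' already exists")
            self.columns.add(name)


def run_job(table, model_columns, barrier, errors):
    # Step 1: column check. Both jobs see the same set of missing columns.
    missing = [c for c in model_columns if c not in table.columns]
    # Force both jobs to complete the check before either alters the table.
    barrier.wait()
    # Step 2: alter the table based on a check that may now be stale.
    for col in missing:
        try:
            table.add_column(col)
        except RuntimeError as exc:
            errors.append(str(exc))


def simulate():
    table = FakeTable(["id"])
    barrier = threading.Barrier(2)
    errors = []
    jobs = [
        threading.Thread(
            target=run_job,
            args=(table, ["id", "new_column"], barrier, errors),
        )
        for _ in range(2)
    ]
    for job in jobs:
        job.start()
    for job in jobs:
        job.join()
    return table.columns, errors


if __name__ == "__main__":
    columns, errors = simulate()
    print(sorted(columns))
    print(errors)
```

In this sketch exactly one job adds the column and the other records the "already exists" error, regardless of which thread runs first.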

Expected Behavior

To avoid this issue, the process should either use the IF NOT EXISTS and IF EXISTS clauses that Snowflake provides for ALTER TABLE, or use transactions so that the column check and the update are atomic.
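As a rough sketch of the first option, the statements could be generated with the guards already in place, so that a second job re-running the same statement becomes a no-op instead of an error. The helpers below are hypothetical and are not the dbt-snowflake macros; they only build SQL strings, assume identifiers are already safely quoted, and assume plain column definitions without defaults or constraints.

```python
def add_columns_sql(relation, columns):
    """Build one guarded ALTER TABLE statement per (name, data_type) pair."""
    return [
        f"ALTER TABLE {relation} ADD COLUMN IF NOT EXISTS {name} {data_type}"
        for name, data_type in columns
    ]


def drop_columns_sql(relation, names):
    """Build one guarded ALTER TABLE statement per column to drop."""
    return [
        f"ALTER TABLE {relation} DROP COLUMN IF EXISTS {name}"
        for name in names
    ]


if __name__ == "__main__":
    # Hypothetical relation and column names, for illustration only.
    for statement in add_columns_sql("analytics.orders", [("new_column", "VARCHAR")]):
        print(statement)
    for statement in drop_columns_sql("analytics.orders", ["old_column"]):
        print(statement)
```

Because each statement is idempotent, the result no longer depends on whether the earlier column check is still accurate when the statement runs.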

Steps To Reproduce

  1. Create an Incremental Table:
    Define and create an incremental table in your dbt project.
  2. Add a Column:
    Add a new column entry to the model.
  3. Concurrently Execute dbt run:
    Execute the dbt process in two separate jobs concurrently.

Relevant log output

No response

Environment

- OS: Ubuntu 22.04
- Python: 3.9
- dbt-core: 1.5.7
- dbt-snowflake: 1.5.7

Additional Context

No response
