Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify dataflow to SQL logic for JoinOverTimeRangeNode #1540

Merged
merged 1 commit into from
Dec 9, 2024

Conversation

courtneyholcomb
Copy link
Contributor

There should be no functional changes in this commit, only cleanup and readability improvements. Mostly involves moving complex logic to helper functions.



@dataclass(frozen=True)
class AnnotatedSqlDataSet:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class should be unchanged. I needed to move it to resolve circular imports.

@@ -1390,6 +1361,21 @@ def visit_semi_additive_join_node(self, node: SemiAdditiveJoinNode) -> SqlDataSe
),
)

def _choose_instance_for_time_spine_join(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This helper function is only used once at this point, but it will be used again for the JoinToTimeSpineNode farther up the stack.

@@ -803,6 +803,7 @@ def transform(self, instance_set: InstanceSet) -> SelectColumnSet: # noqa: D102
)


# TODO: delete this class & all uses. It doesn't do anything.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class is supposed to change the column names, but if the specs didn't change, then the column names shouldn't either, so it seems like it doesn't do anything. LMK if I'm overlooking something here!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This used to be needed but maybe things have changed. There were cases where nodes did not output data sets where the column name in the generated SQL did not match the defined format. Trying to remember what that was though, but there was a bug fix that required this transform.

@courtneyholcomb courtneyholcomb marked this pull request as ready for review November 21, 2024 00:03
Copy link
Contributor

@plypaul plypaul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some comments, but LGTM.

"No appropriate agg_time_dimension was found to join to the time spine. "
"This indicates that the dataflow plan was configured incorrectly."
)
agg_time_dimension_instances.sort(key=lambda instance: instance.spec.time_granularity.base_granularity.to_int())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not related to this PR, but we probably should just make the TimeGranularity enum orderable in DSI at some point since this operation has come up a few times.

@@ -803,6 +803,7 @@ def transform(self, instance_set: InstanceSet) -> SelectColumnSet: # noqa: D102
)


# TODO: delete this class & all uses. It doesn't do anything.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This used to be needed but maybe things have changed. There were cases where nodes did not output data sets where the column name in the generated SQL did not match the defined format. Trying to remember what that was though, but there was a bug fix that required this transform.

Copy link
Contributor

@plypaul plypaul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgot to approve earlier.

Base automatically changed from court/simp1 to main December 9, 2024 19:20
There should be no functional changes in this commit, only cleanup and readability improvements. Mostly involves moving complex logic to helper functions.
@courtneyholcomb courtneyholcomb enabled auto-merge (squash) December 9, 2024 19:23
@courtneyholcomb courtneyholcomb merged commit 2b49b34 into main Dec 9, 2024
11 checks passed
@courtneyholcomb courtneyholcomb deleted the court/simp2 branch December 9, 2024 19:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants