Skip to content

Commit

Permalink
remove snapshot query best practices--not relevant anymore for 1.9 an…
Browse files Browse the repository at this point in the history
…d beyond (dbt-labs#6448)

## What are you changing in this pull request and why?
<!--
Describe your changes and why you're making them. If related to an open
issue or a pull request on dbt Core or another repository, then link to
them here!

To learn more about the writing conventions used in the dbt Labs docs,
see the [Content style
guide](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/content-style-guide.md).
-->

The Snapshot Query best practices page feels out of date to me
mentioning things like "avoid joins in your query" , "include as many
columns as possible", etc. Given that there's already a configs best
practice above this section, it feels like this section should get
deleted altogether.

## Checklist
- [ ] I have reviewed the [Content style
guide](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/content-style-guide.md)
so my content adheres to these guidelines.
- [ ] The topic I'm writing about is for specific dbt version(s) and I
have versioned it according to the [version a whole
page](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#adding-a-new-version)
and/or [version a block of
content](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#versioning-blocks-of-content)
guidelines.
- [ ] I have added checklist item(s) to this list for anything anything
that needs to happen before this PR is merged, such as "needs technical
review" or "change base branch."
- [ ] The content in this PR requires a dbt release note, so I added one
to the [release notes
page](https://docs.getdbt.com/docs/dbt-versions/dbt-cloud-release-notes).
<!--
PRE-RELEASE VERSION OF dbt (if so, uncomment):
- [ ] Add a note to the prerelease version [Migration
Guide](https://github.com/dbt-labs/docs.getdbt.com/tree/current/website/docs/docs/dbt-versions/core-upgrade)
-->
<!-- 
ADDING OR REMOVING PAGES (if so, uncomment):
- [ ] Add/remove page in `website/sidebars.js`
- [ ] Provide a unique filename for new pages
- [ ] Add an entry for deleted pages in `website/vercel.json`
- [ ] Run link testing locally with `npm run build` to update the links
that point to deleted pages
-->
  • Loading branch information
faithebear authored Nov 9, 2024
1 parent 63c55da commit dc911a3
Showing 1 changed file with 0 additions and 23 deletions.
23 changes: 0 additions & 23 deletions website/docs/docs/build/snapshots.md
Original file line number Diff line number Diff line change
Expand Up @@ -390,29 +390,6 @@ snapshots:

</VersionBlock>

## Snapshot query best practices

This section outlines some best practices for writing snapshot queries:

- #### Snapshot source data
Your models should then select from these snapshots, treating them like regular data sources. As much as possible, snapshot your source data in its raw form and use downstream models to clean up the data

- #### Use the `source` function in your query
This helps when understanding <Term id="data-lineage">data lineage</Term> in your project.

- #### Include as many columns as possible
In fact, go for `select *` if performance permits! Even if a column doesn't feel useful at the moment, it might be better to snapshot it in case it becomes useful – after all, you won't be able to recreate the column later.

- #### Avoid joins in your snapshot query
Joins can make it difficult to build a reliable `updated_at` timestamp. Instead, snapshot the two tables separately, and join them in downstream models.

- #### Limit the amount of transformation in your query
If you apply business logic in a snapshot query, and this logic changes in the future, it can be impossible (or, at least, very difficult) to apply the change in logic to your snapshots.

Basically – keep your query as simple as possible! Some reasonable exceptions to these recommendations include:
* Selecting specific columns if the table is wide.
* Doing light transformation to get data into a reasonable shape, for example, unpacking a <Term id="json" /> blob to flatten your source data into columns.

## Snapshot meta-fields

Snapshot <Term id="table">tables</Term> will be created as a clone of your source dataset, plus some additional meta-fields*.
Expand Down

0 comments on commit dc911a3

Please sign in to comment.