Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add macro to get columns #516

Merged
merged 13 commits into from
Mar 28, 2022
50 changes: 47 additions & 3 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,7 @@ For compatibility details between versions of dbt-core and dbt-utils, [see this

- [Introspective macros](#introspective-macros):
- [get_column_values](#get_column_values-source)
- [get_filtered_columns_in_relation](#get_filtered_columns_in_relation-source)
- [get_relations_by_pattern](#get_relations_by_pattern-source)
- [get_relations_by_prefix](#get_relations_by_prefix-source)
- [get_query_results_as_dict](#get_query_results_as_dict-source)
Expand Down Expand Up @@ -544,7 +545,7 @@ These macros run a query and return the results of the query as objects. They ar
#### get_column_values ([source](macros/sql/get_column_values.sql))
This macro returns the unique values for a column in a given [relation](https://docs.getdbt.com/docs/writing-code-in-dbt/class-reference/#relation) as an array.

Arguments:
**Args:**
- `table` (required): a [Relation](https://docs.getdbt.com/reference/dbt-classes#relation) (a `ref` or `source`) that contains the list of columns you wish to select from
- `column` (required): The name of the column you wish to find the column values of
- `order_by` (optional, default=`'count(*) desc'`): How the results should be ordered. The default is to order by `count(*) desc`, i.e. decreasing frequency. Setting this as `'my_column'` will sort alphabetically, while `'min(created_at)'` will sort by when thevalue was first observed.
Expand Down Expand Up @@ -585,6 +586,30 @@ Arguments:
...
```

#### get_filtered_columns_in_relation ([source](macros/sql/get_filtered_columns_in_relation.sql))
This macro returns an iterable Jinja list of columns for a given [relation](https://docs.getdbt.com/docs/writing-code-in-dbt/class-reference/#relation), (i.e. not from a CTE)
- optionally exclude columns
- the input values are not case-sensitive (input uppercase or lowercase and it will work!)
- declare whether you want lowercase output values with `output_lower=TRUE` (as Snowflake by default will output values in UPPERCASE)
> Note: The native [`adapter.get_columns_in_relation` macro](https://docs.getdbt.com/reference/dbt-jinja-functions/adapter#get_columns_in_relation) allows you
to pull column names in a non-filtered fashion, also bringing along with it other (potentially unwanted) information, such as dtype, char_size, numeric_precision, etc.

**Args:**
- `from` (required): a [Relation](https://docs.getdbt.com/reference/dbt-classes#relation) (a `ref` or `source`) that contains the list of columns you wish to select from
- `except` (optional, default=`[]`): The name of the columns you wish to exclude. (case-insensitive)
- `output_lower` (optional, default=`False`): True if you want lowercase outputs. False if you want outputs to be database default (which may be upper or lower depending on database)
patkearns10 marked this conversation as resolved.
Show resolved Hide resolved

**Usage:**
```sql
-- Returns a list of the columns from a relation, so you can then iterate in a for loop
{% set column_names = dbt_utils.get_filtered_columns_in_relation(from=ref('your_model'), except=["field_1", "field_2"]) %}
...
{% for column_name in column_names %}
max({{ column_name }}) ... as max_'{{ column_name }}',
{% endfor %}
...
```

#### get_relations_by_pattern ([source](macros/sql/get_relations_by_pattern.sql))
Returns a list of [Relations](https://docs.getdbt.com/docs/writing-code-in-dbt/class-reference/#relation)
that match a given schema- or table-name pattern.
Expand Down Expand Up @@ -748,9 +773,21 @@ group by 1,2,3
```

#### star ([source](macros/sql/star.sql))
This macro generates a comma-separated list of all fields that exist in the `from` relation, excluding any fields listed in the `except` argument. The construction is identical to `select * from {{ref('my_model')}}`, replacing star (`*`) with the star macro. This macro also has an optional `relation_alias` argument that will prefix all generated fields with an alias (`relation_alias`.`field_name`).
This macro generates a comma-separated list of all fields that exist in the `from` relation, excluding any fields
listed in the `except` argument. The construction is identical to `select * from {{ref('my_model')}}`, replacing star (`*`) with
the star macro.
This macro also has an optional `relation_alias` argument that will prefix all generated fields with an alias (`relation_alias`.`field_name`).
The macro also has optional `prefix` and `suffix` arguments. When one or both are provided, they will be concatenated onto each field's alias
in the output (`prefix` ~ `field_name` ~ `suffix`). NB: This prevents the output from being used in any context other than a select statement.

**Args:**
- `from` (required): a [Relation](https://docs.getdbt.com/reference/dbt-classes#relation) (a `ref` or `source`) that contains the list of columns you wish to select from
- `except` (optional, default=`[]`): The name of the columns you wish to exclude. (case-insensitive)
- `relation_alias` (optional, default=`''`): will prefix all generated fields with an alias (`relation_alias`.`field_name`).
- `prefix` (optional, default=`''`): will prefix the output `field_name` (`field_name as prefix_field_name`).
- `suffix` (optional, default=`''`): will suffix the output `field_name` (`field_name as field_name_suffix`).
patkearns10 marked this conversation as resolved.
Show resolved Hide resolved
- `output_lower` (optional, default=`False`): True if you want lowercase outputs. False if you want outputs to be database default (which may be upper or lower depending on database)
patkearns10 marked this conversation as resolved.
Show resolved Hide resolved

The macro also has optional `prefix` and `suffix` arguments. When one or both are provided, they will be concatenated onto each field's alias in the output (`prefix` ~ `field_name` ~ `suffix`). NB: This prevents the output from being used in any context other than a select statement.

**Usage:**
```sql
Expand All @@ -767,6 +804,13 @@ from {{ ref('my_model') }}

```

```sql
select
{{ dbt_utils.star(from=ref('my_model'), except=["exclude_field_1", "exclude_field_2"], prefix="max_", output_lower=True) }}
from {{ ref('my_model') }}

```

#### union_relations ([source](macros/sql/union.sql))

This macro unions together an array of [Relations](https://docs.getdbt.com/docs/writing-code-in-dbt/class-reference/#relation),
Expand Down
4 changes: 4 additions & 0 deletions integration_tests/data/sql/data_star_uppercase_columns.csv
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
FIELD_1,FIELD_2,FIELD_3
A,B,C
D,E,F
G,H,I
32 changes: 32 additions & 0 deletions integration_tests/macros/assert_equal_values.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,32 @@
{% macro assert_equal_values(actual_object, expected_object) %}
{% if not execute %}

{# pass #}

{% elif actual_object != expected_object %}

{% set msg %}
Expected did not match actual

-----------
Actual:
-----------
--->{{ actual_object }}<---

-----------
Expected:
-----------
--->{{ expected_object }}<---
patkearns10 marked this conversation as resolved.
Show resolved Hide resolved

{% endset %}

{{ log(msg, info=True) }}

select 'fail'

{% else %}

select 'ok' limit 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be {{ limit_zero() }} - not all databases support limit clauses (notably T-SQL)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!
@joellabes -- this is no longer required, as I switched to using the seed/model method. Should I remove this file then?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that makes sense - the Core team are working on a wider testing suite that I hope will also cover packages - if that wasn't in the pipeline then I'd suggest we kept this around because "we’d need it one day" but I think we'll have a more long term solution soon.


{% endif %}
{% endmacro %}
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
{% set actual_output = dbt_utils.get_filtered_columns_in_relation(from= ref('data_star_uppercase_columns'), except=['field_1'], output_lower=True) %}

{% set expected_output = ['field_2', 'field_3'] %}

{{ assert_equal_values (actual_output | trim, expected_output | trim) }}
patkearns10 marked this conversation as resolved.
Show resolved Hide resolved
joellabes marked this conversation as resolved.
Show resolved Hide resolved
7 changes: 7 additions & 0 deletions integration_tests/tests/sql/test_star_output_lower.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
{% set actual_output = dbt_utils.star(from=ref('data_star_uppercase_columns'), except=['field_1', 'field_2'], prefix='test_', output_lower=True ) %}

{% set expected_output %}
"field_3" as "test_field_3"
{% endset %}

{{ assert_equal_values (actual_output | trim, expected_output | trim) }}
patkearns10 marked this conversation as resolved.
Show resolved Hide resolved
29 changes: 29 additions & 0 deletions macros/sql/get_filtered_columns_in_relation.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
{% macro get_filtered_columns_in_relation(from, except=[], output_lower=False) -%}
{{ return(adapter.dispatch('get_filtered_columns_in_relation', 'dbt_utils')(from, except, output_lower)) }}
{% endmacro %}

{% macro default__get_filtered_columns_in_relation(from, except=[], output_lower=False) -%}
{%- do dbt_utils._is_relation(from, 'get_filtered_columns_in_relation') -%}
{%- do dbt_utils._is_ephemeral(from, 'get_filtered_columns_in_relation') -%}

{# -- Prevent querying of db in parsing mode. This works because this macro does not create any new refs. #}
{%- if not execute -%}
{{ return('') }}
{% endif %}

{%- set include_cols = [] %}
{%- set cols = adapter.get_columns_in_relation(from) -%}
{%- set except = except | map("lower") | list %}
{%- for col in cols -%}
{%- if col.column|lower not in except -%}
{%- if not output_lower %}
{% do include_cols.append(col.column) %}
{% else %}
{% do include_cols.append(col.column|lower) %}
{%- endif -%}
{%- endif %}
{%- endfor %}

{{ return(include_cols) }}

{%- endmacro %}
21 changes: 5 additions & 16 deletions macros/sql/star.sql
Original file line number Diff line number Diff line change
@@ -1,8 +1,8 @@
{% macro star(from, relation_alias=False, except=[], prefix='', suffix='') -%}
{{ return(adapter.dispatch('star', 'dbt_utils')(from, relation_alias, except, prefix, suffix)) }}
{% macro star(from, relation_alias=False, except=[], prefix='', suffix='', output_lower=False) -%}
{{ return(adapter.dispatch('star', 'dbt_utils')(from, relation_alias, except, prefix, suffix, output_lower)) }}
{% endmacro %}

{% macro default__star(from, relation_alias=False, except=[], prefix='', suffix='') -%}
{% macro default__star(from, relation_alias=False, except=[], prefix='', suffix='', output_lower=False) -%}
{%- do dbt_utils._is_relation(from, 'star') -%}
{%- do dbt_utils._is_ephemeral(from, 'star') -%}

Expand All @@ -11,20 +11,9 @@
{{ return('') }}
{% endif %}

{%- set include_cols = [] %}
{%- set cols = adapter.get_columns_in_relation(from) -%}
{%- set except = except | map("lower") | list %}
{%- for col in cols -%}
{%- for col in dbt_utils.get_filtered_columns_in_relation(from, except, output_lower) %}

{%- if col.column|lower not in except -%}
{% do include_cols.append(col.column) %}

{%- endif %}
{%- endfor %}

{%- for col in include_cols %}

{%- if relation_alias %}{{ relation_alias }}.{% else %}{%- endif -%}{{ adapter.quote(col)|trim }} {%- if prefix!='' or suffix!='' -%} as {{ adapter.quote(prefix ~ col ~ suffix)|trim }} {%- endif -%}
{%- if relation_alias %}{{ relation_alias }}.{% else %}{%- endif -%}{{ adapter.quote(col)|trim }} {%- if prefix!='' or suffix!='' %} as {{ adapter.quote(prefix ~ col ~ suffix)|trim }} {%- endif -%}
{%- if not loop.last %},{{ '\n ' }}{% endif %}

{%- endfor -%}
Expand Down