Skip to content
This repository was archived by the owner on Sep 2, 2025. It is now read-only.

Conversation

@antonysouthworth-halter
Copy link
Contributor

@antonysouthworth-halter antonysouthworth-halter commented Apr 29, 2024

Description

When using a catalog of type LAMBDA (e.g. federated query) you will get errors like the following:

03:51:33    Parameter validation failed:
Invalid type for parameter CatalogId, value: None, type: <class 'NoneType'>, valid types: <class 'str'>

It's because we set the catalog ID to None when we actually need to not-pass it.

You would probably get the same problem using HIVE type but I haven't tested that.

Models used to test - Optional

Trying to show some kind of reprex here, the full config is a bit much for a PR description. Let me know if you want to see more.

Create some Lambda function that host the Athena federated query app:

resource "aws_lambda_function" "athena_postgres" {
  function_name    = "my_postgres_fdw"
  handler          = "com.amazonaws.athena.connectors.postgresql.PostGreSqlMuxCompositeHandler"
  runtime          = "java11"
  timeout          = 900
  memory_size      = 10240
  role             = var.iam_role_arn

  // https://github.com/awslabs/aws-athena-query-federation/releases
  filename = "${path.module}/../terraform/athena-postgresql-2024.15.2.jar" 

  reserved_concurrent_executions = 10 // don't blow up your account limit!

  vpc_config {
    subnet_ids         = var.subnet_ids
    security_group_ids = var.security_group_ids
  }

  environment {
    variables = merge({
      disable_spill_encryption = "false"
      spill_bucket             = var.spill_bucket_name
      spill_prefix             = "fdw-spill/postgres"
      default                  = "postgres://jdbc:postgresql://${var.db_host}:${var.db_port}/?$${${var.db_credentials_secret_name}}"
      default_scale            = 0
    })
  }
}

Create the Athena catalog:

resource "aws_athena_data_catalog" "postgres" {
  name        = "postgres"
  type        = "LAMBDA"

  parameters = {
    "function" = aws_lambda_function.athena_postgres.arn
  }
}

Create the source:

version: 2
sources:
  - name: postgres
    description: |
      Foreign data wrapper implemented by some Lambda.
    database: postgres # athena catalog name
    schema: public # athena database name, postgres schema name

Create a model:

select * from {{ source('postgres', 'orders') }} // or whatever table in public schema

Checklist

  • You followed contributing section
  • You kept your Pull Request small and focused on a single feature or bug fix.
  • [?] You added unit testing when necessary
  • [?] You added functional testing when necessary

Dubious if this is worth functional+unit test or not, it's quite a lot of setup for a very small change and obviously not a lot of people are using dbt-athena + athena federated query otherwise they would have bumped into this.

@antonysouthworth-halter antonysouthworth-halter changed the title fix get_columns_in_relation for LAMBDA catalogs fix: get_columns_in_relation for LAMBDA catalogs Apr 29, 2024
@antonysouthworth-halter antonysouthworth-halter marked this pull request as ready for review April 29, 2024 05:36
@nicor88
Copy link
Contributor

nicor88 commented Apr 29, 2024

@antonysouthworth-halter nice catch.

Thanks to providing the tf infra to make the setup reproducible.

@nicor88 nicor88 merged commit ac66fca into dbt-labs:main Apr 29, 2024
@antonysouthworth-halter antonysouthworth-halter deleted the patch-1 branch April 30, 2024 01:17
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants