Authentication failure occurs in Azure Databricks when using dbt-databricks 1.6 #609

Closed
hazvk opened this issue Mar 8, 2024 · 1 comment
Labels
bug Something isn't working

hazvk commented Mar 8, 2024

Describe the bug

Authentication with an Azure app registration fails when dbt opens its connection, returning invalid_client: Client authentication failed. Reproduced by running dbt --debug debug in an Azure Databricks job.

Steps To Reproduce

  1. Set up Databricks SQL warehouse
  2. Follow steps here to: set up App Registration; set up profiles.yml in your dbt project
  3. Set up job that runs one task
    1. Task has one step: dbt --debug debug
    2. Link project repo where dbt project is set up
    3. Job cluster uses the default configuration, with one library added: dbt-databricks==1.6.8
  4. Run job

Expected behavior

The job should run successfully and report that the connection test passed.

Screenshots and log output

Output of job:

00:31:45  Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'start', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x7f7bb888e9e0>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x7f7bb642fee0>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x7f7bb642ff10>]}
00:31:45  Running with dbt=1.6.10
00:31:45  running dbt with arguments {'printer_width': '80', 'indirect_selection': 'eager', 'log_cache_events': 'False', 'write_json': 'True', 'partial_parse': 'True', 'cache_selected_only': 'False', 'profiles_dir': '/tmp/tmp-dbt-run-477395276437665/src/custom_profile', 'debug': 'True', 'fail_fast': 'False', 'log_path': '/tmp/tmp-dbt-run-477395276437665/src/logs', 'warn_error': 'None', 'version_check': 'True', 'use_colors': 'True', 'use_experimental_parser': 'False', 'no_print': 'None', 'quiet': 'False', 'warn_error_options': 'WarnErrorOptions(include=[], exclude=[])', 'invocation_command': 'dbt --debug --log-level-file error debug', 'introspect': 'True', 'log_format': 'default', 'target_path': 'None', 'static_parser': 'True', 'send_anonymous_usage_stats': 'True'}
00:31:45  dbt version: 1.6.10
00:31:45  python version: 3.10.12
00:31:45  python path: /local_disk0/.ephemeral_nfs/cluster_libraries/python/bin/python
00:31:45  os info: Linux-5.15.0-1056-azure-x86_64-with-glibc2.35
00:31:47  Using profiles dir at /tmp/tmp-dbt-run-477395276437665/src/custom_profile
00:31:47  Using profiles.yml file at /tmp/tmp-dbt-run-477395276437665/src/custom_profile/profiles.yml
00:31:47  Using dbt_project.yml file at /tmp/tmp-dbt-run-477395276437665/src/dbt_project.yml
00:31:47  adapter type: databricks
00:31:47  adapter version: 1.6.8
00:31:47  Configuration:
00:31:47    profiles.yml file [OK found and valid]
00:31:47    dbt_project.yml file [OK found and valid]
00:31:47  Required dependencies:
00:31:47  Executing "git --help"
00:31:47  STDOUT: "b"usage: git [--version] [--help] [-C <path>] [-c <name>=<value>]\n           [--exec-path[=<path>]] [--html-path] [--man-path] [--info-path]\n           [-p | --paginate | -P | --no-pager] [--no-replace-objects] [--bare]\n           [--git-dir=<path>] [--work-tree=<path>] [--namespace=<name>]\n           [--super-prefix=<path>] [--config-env=<name>=<envvar>]\n           <command> [<args>]\n\nThese are common Git commands used in various situations:\n\nstart a working area (see also: git help tutorial)\n   clone     Clone a repository into a new directory\n   init      Create an empty Git repository or reinitialize an existing one\n\nwork on the current change (see also: git help everyday)\n   add       Add file contents to the index\n   mv        Move or rename a file, a directory, or a symlink\n   restore   Restore working tree files\n   rm        Remove files from the working tree and from the index\n\nexamine the history and state (see also: git help revisions)\n   bisect    Use binary search to find the commit that introduced a bug\n   diff      Show changes between commits, commit and working tree, etc\n   grep      Print lines matching a pattern\n   log       Show commit logs\n   show      Show various types of objects\n   status    Show the working tree status\n\ngrow, mark and tweak your common history\n   branch    List, create, or delete branches\n   commit    Record changes to the repository\n   merge     Join two or more development histories together\n   rebase    Reapply commits on top of another base tip\n   reset     Reset current HEAD to the specified state\n   switch    Switch branches\n   tag       Create, list, delete or verify a tag object signed with GPG\n\ncollaborate (see also: git help workflows)\n   fetch     Download objects and refs from another repository\n   pull      Fetch from and integrate with another repository or a local branch\n   push      Update remote refs along with associated objects\n\n'git help -a' and 
'git help -g' list available subcommands and some\nconcept guides. See 'git help <command>' or 'git help <concept>'\nto read about a specific subcommand or concept.\nSee 'git help git' for an overview of the system.\n""
00:31:47  STDERR: "b''"
00:31:47   - git [OK found]

00:31:47  Connection:
00:31:47    host: adb-5040045441632685.5.azuredatabricks.net
00:31:47    http_path: sql/1.0/warehouses/9c4ad4393c0fccd2
00:31:47    schema: dbt
00:31:47  Registered adapter: databricks=1.6.8
00:31:47  Acquiring new databricks connection 'debug'
00:31:47  Using databricks connection "debug"
00:31:47  On debug: select 1 as id
00:31:47  Opening a new connection, currently in state init
00:31:47  Databricks adapter: Error while running:
select 1 as id
00:31:47  Databricks adapter: invalid_client: Client authentication failed
00:31:47    Connection test: [ERROR]

00:31:47  1 check failed:
00:31:47  dbt was unable to connect to the specified database.
The database returned the following error:

  >Runtime Error
  invalid_client: Client authentication failed

Check your database credentials and try again. For more information, visit:
https://docs.getdbt.com/docs/configure-your-profile


00:31:47  Command `dbt debug` failed at 00:31:47.909246 after 1.95 seconds
00:31:47  Connection 'debug' was properly closed.
00:31:47  Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'end', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x7f7bb888e9e0>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x7f7b9e06eec0>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x7f7b9e0641c0>]}
00:31:47  Flushing usage events

System information

The output of dbt --version:
`1.6.10` (see above output)

Output logs from dbt debug
As above

The operating system you're using:
DBR 13.3 LTS
Apache Spark 3.4.1

The output of python --version:
Python 3.10.12

Additional context

  • This issue started happening sometime between Jan 24 and Jan 25 this year, which coincides with the release of databricks-sdk 0.18.0
  • This issue happens when pinning any of several different dbt-core==1.6.x versions
  • This issue does not occur when using dbt-databricks==1.7.8
  • My analysis points to a change in databricks-sdk as the root cause

As of v1.6.8, setup.py allows any databricks-sdk>=0.9.0 (even though requirements.txt pins databricks-sdk==0.9.0), so a fresh install can resolve to 0.18.0. Hence my proposed workarounds:

  1. Upgrade to the latest version, dbt-databricks==1.7.8
  2. For those not ready to upgrade to the latest minor version of dbt: pin databricks-sdk==0.17.0. Adding this library to the cluster keeps dependency resolution within the allowed range, and auth succeeds.
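
To confirm which databricks-sdk version a cluster actually resolved, something like the following could be run on the job cluster. This is only a sketch: the 0.18.0 threshold is an assumption taken from the timing described in this issue, and the helper names (`parse_release`, `is_suspect`, `installed_sdk_version`) are made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumption: 0.18.0 is the first databricks-sdk release associated with the
# failure, based only on the release timing described in this issue.
FIRST_SUSPECT = (0, 18, 0)


def parse_release(text):
    """Turn '0.17.0' into (0, 17, 0); a non-numeric suffix such as 'rc1' is ignored."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_suspect(sdk_version):
    """True when the version is at or above the assumed suspect release."""
    return parse_release(sdk_version) >= FIRST_SUSPECT


def installed_sdk_version():
    """Return the installed databricks-sdk version, or None if it is absent."""
    try:
        return version("databricks-sdk")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_sdk_version()
    if found is None:
        print("databricks-sdk is not installed")
    elif is_suspect(found):
        print(f"databricks-sdk {found}: at or above 0.18.0, the range this issue suspects")
    else:
        print(f"databricks-sdk {found}: below 0.18.0")
```

If the check reports 0.18.0 or later with dbt-databricks 1.6.x, adding databricks-sdk==0.17.0 as a cluster library (workaround 2) is the change to try.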

Note: I have not tested for other side effects of pinning the databricks-sdk version. Other issues came up, and I was not sure whether they were related to this fix, so I gave up and upgraded to dbt 1.7.

@hazvk hazvk added the bug Something isn't working label Mar 8, 2024
Collaborator

benc-db commented Mar 8, 2024

Thanks for this report. We suspected that 0.18.0 could break auth, so later versions on the 1.7.x branch limit databricks-sdk to less than 0.18.0. I'll backport this pin to the 1.6.x branch.
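
For illustration, the difference between the open-ended requirement reported for v1.6.8 and an upper-bounded one can be sketched as below. The exact specifier string used in the backport is an assumption (only "less than 0.18.0" is stated here), and `satisfies` is a deliberately minimal, hypothetical matcher, not pip's resolver.

```python
import operator

# The first string is what this issue reports for setup.py in v1.6.8.
# The second is an ASSUMED form of the "less than 0.18.0" limit.
UNBOUNDED = "databricks-sdk>=0.9.0"
BOUNDED = "databricks-sdk>=0.9.0,<0.18.0"

# Two-character operators are listed first so ">=" is not mistaken for ">".
OPERATORS = [
    (">=", operator.ge),
    ("<=", operator.le),
    ("==", operator.eq),
    ("<", operator.lt),
    (">", operator.gt),
]


def as_tuple(release):
    """Convert a plain dotted release such as '0.18.0' into (0, 18, 0)."""
    return tuple(int(part) for part in release.split("."))


def satisfies(candidate, requirement):
    """Check a plain dotted version against a comma-separated specifier list."""
    # Drop the package name: the specifiers start at the first comparison character.
    start = min(i for i, ch in enumerate(requirement) if ch in "<>=")
    for clause in requirement[start:].split(","):
        clause = clause.strip()
        for symbol, compare in OPERATORS:
            if clause.startswith(symbol):
                bound = clause[len(symbol):].strip()
                if not compare(as_tuple(candidate), as_tuple(bound)):
                    return False
                break
        else:
            raise ValueError(f"unsupported specifier: {clause!r}")
    return True


if __name__ == "__main__":
    for candidate in ("0.9.0", "0.17.0", "0.18.0"):
        print(candidate, satisfies(candidate, UNBOUNDED), satisfies(candidate, BOUNDED))
```

Under the open-ended requirement, 0.18.0 is an acceptable resolution; with the upper bound it is excluded while 0.17.0 remains allowed.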
