AzureMLAssetDataset's credentials are not passed correctly when used in a dataset factory #160

Open
@AlexandreOuellet

Description

When the dataset is used in a dataset factory, the AzureML credentials are not set up correctly, which causes a crash. For instance, the following catalog entry crashes with missing credentials:

"{name}_csv":
    type: utils.azure_ml_dataset.AzureMLDataset
    azureml_dataset: raw
    root_dir: data/01_raw/
    dataset:
        type: pandas.CSVDataset
        filepath: "{name}.csv"

The issue is that when the catalog is created, dataset factories are not instantiated as actual datasets. They are stored as dataset patterns, which are resolved afterward. The after_catalog_created hook is then called, but at that point there is still no actual dataset for the dataset pattern, so the hook has nothing to pass the credentials to.
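To make the ordering problem concrete, here is a minimal, self-contained sketch. It is a toy model, not Kedro's or the plugin's actual implementation: ToyCatalog, its get method, the stand-in after_catalog_created function, and the dummy credentials are all hypothetical names used for illustration only.

```python
import re


class ToyCatalog:
    """Toy stand-in for a data catalog (hypothetical, not Kedro's real class)."""

    def __init__(self, config):
        self.datasets = {}  # concrete entries, built as soon as the catalog exists
        self.patterns = {}  # factory patterns, kept as raw config until requested
        for name, conf in config.items():
            if "{" in name:
                self.patterns[name] = conf
            else:
                self.datasets[name] = dict(conf)

    def get(self, name):
        """Return a dataset, materializing it from a pattern on first access."""
        if name not in self.datasets:
            for pattern, conf in self.patterns.items():
                # Turn "{name}_csv" into the regex "(?P<name>.+)_csv".
                regex = re.sub(r"\{(\w+)\}", r"(?P<\1>.+)", pattern)
                match = re.fullmatch(regex, name)
                if match:
                    fields = match.groupdict()
                    self.datasets[name] = {
                        key: value.format(**fields) for key, value in conf.items()
                    }
                    break
            else:
                raise KeyError(name)
        return self.datasets[name]


def after_catalog_created(catalog, credentials):
    """Toy hook: attaches credentials to every dataset that already exists."""
    for dataset in catalog.datasets.values():
        dataset["credentials"] = credentials


catalog = ToyCatalog(
    {
        "plain_csv": {"filepath": "plain.csv"},
        "{name}_csv": {"filepath": "{name}.csv"},
    }
)
after_catalog_created(catalog, {"token": "dummy"})

# The concrete entry received credentials; the pattern-based one did not,
# because it was only materialized after the hook had already run.
print("credentials" in catalog.get("plain_csv"))
print("credentials" in catalog.get("sales_csv"))
```

In this toy model, the hook only sees plain_csv. The sales_csv entry is created later, on first access, so it never receives credentials.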

I've discussed this with some people on Kedro's Slack channels, and they seem to agree that the best way to pass the credentials would be with an after_context_created hook.

I'll open a pull request soon to fix this, using the following pattern instead:

"{name}_csv":
    type: utils.azure_ml_dataset.AzureMLDataset
    azureml_dataset: raw
    root_dir: data/01_raw/
    credentials: azureml # added a credentials key that implicitly refers to the azureml credentials
    dataset:
        type: pandas.CSVDataset
        filepath: "{name}.csv"
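
As a rough illustration of why a credentials key inside the pattern helps, the sketch below resolves such an entry at the moment the pattern is materialized, so no earlier hook needs to have seen the dataset. This is an assumption-laden sketch, not the code of the planned pull request: resolve_entry, fill_placeholders, and the contents of credentials_store are hypothetical.

```python
import re


def fill_placeholders(value, fields):
    """Recursively substitute {placeholder} fields in strings inside nested config."""
    if isinstance(value, str):
        return value.format(**fields)
    if isinstance(value, dict):
        return {key: fill_placeholders(item, fields) for key, item in value.items()}
    return value


def resolve_entry(pattern, config, dataset_name, credentials_store):
    """Materialize one pattern entry and swap its credentials name for real values."""
    # Turn "{name}_csv" into the regex "(?P<name>.+)_csv".
    regex = re.sub(r"\{(\w+)\}", r"(?P<\1>.+)", pattern)
    match = re.fullmatch(regex, dataset_name)
    if match is None:
        raise KeyError(dataset_name)
    resolved = fill_placeholders(config, match.groupdict())
    key = resolved.get("credentials")
    if isinstance(key, str):
        # Look the credentials up by name only now, at resolution time.
        resolved["credentials"] = credentials_store[key]
    return resolved


# Mirrors the catalog entry above as a plain dict.
pattern_config = {
    "type": "utils.azure_ml_dataset.AzureMLDataset",
    "azureml_dataset": "raw",
    "root_dir": "data/01_raw/",
    "credentials": "azureml",
    "dataset": {"type": "pandas.CSVDataset", "filepath": "{name}.csv"},
}
# Hypothetical credentials content, for illustration only.
credentials_store = {"azureml": {"subscription_id": "<placeholder>"}}

entry = resolve_entry("{name}_csv", pattern_config, "sales_csv", credentials_store)
print(entry["dataset"]["filepath"])
print(entry["credentials"])
```

Because the lookup happens when the pattern is resolved, every dataset created from the factory carries its credentials, regardless of when it is first requested.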
