Create and manage Dataplex business glossaries

This document describes how to create and use business glossary content in Dataplex.

Business glossary provides a single place to maintain and manage business-related terminology and definitions across the organization. It lets you attach terms to the columns of cataloged data entries.

You can use Dataplex business glossary to do the following:

  • Create and maintain business glossaries, categories and terms.
  • Use categories to organize terms and other categories into hierarchical structures.
  • Establish relationships between terms as synonyms or related terms.
  • Attach terms to data entry columns.
  • Browse and filter terms within business glossary.
  • Display related terms and data entries for a given term.
  • Search for data entries tagged by a particular term.
  • Display related terms for a given data entry.

Terminology

This section describes the terminology used in this document.

Business glossary

A business glossary is a repository that provides the vocabulary of an enterprise and governs its business terms, along with the associated term definitions and the relationships between terms. It provides a single place to maintain and manage business terms and their definitions. A business glossary has the following property displayed in the console:

  • Overview: a free-form rich-text description of business glossary purpose and content.

Category

A category is defined within a business glossary. It lets you organize and structure various categories and terms. You can nest categories up to three levels. The following are the properties of a category:

  • Category: a relationship used to establish the container of the current category, if any.
  • Description: a free-form rich text business definition of the category.
  • Data Steward: specifies the person responsible for maintaining the category. This is a descriptive property and it doesn't affect the permissions on the category.

Term

A term is defined within a business glossary either in the glossary directly or within any category found in the glossary. It describes a concept used in a particular branch of business within an enterprise. A term has the following properties displayed in the console:

  • Category: a relationship used to establish the container of the current term, if any.
  • Description: a free-form rich text business definition of the term.
  • Data Steward: specifies the person responsible for maintaining the term. This is a descriptive property and it doesn't affect the permissions on the term.

You can attach terms to the columns of a data entry to indicate that the columns are described by those terms.

Synonym

A synonym is a relationship used to indicate semantic similarity or equivalence between two different terms. It is used when two similar terms are defined by different teams in different glossaries. For example, earnings and income can be marked as synonyms.

Related term

A related term is a relationship used to indicate that two terms are semantically related, yet different. For example, profit and income can be marked as related terms.
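
The terminology above can be pictured as a small data model. The following Python sketch is a conceptual illustration only, not a Dataplex API (the feature is available only in the Google Cloud console). The class names, field names, and the `add_category` helper are hypothetical, and the sketch assumes that "up to three levels" means a chain of at most three categories.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# The document states that categories can be nested up to three levels.
MAX_CATEGORY_DEPTH = 3


@dataclass
class Category:
    name: str
    description: str = ""
    data_stewards: List[str] = field(default_factory=list)
    parent: Optional["Category"] = None  # None: directly under the glossary

    def depth(self) -> int:
        """Return 1 for a top-level category, 2 for its child, and so on."""
        level, node = 1, self.parent
        while node is not None:
            level += 1
            node = node.parent
        return level


@dataclass
class Term:
    name: str
    description: str = ""
    data_stewards: List[str] = field(default_factory=list)
    category: Optional[Category] = None  # None: directly in the glossary
    synonyms: List[str] = field(default_factory=list)       # names of synonym terms
    related_terms: List[str] = field(default_factory=list)  # names of related terms


def add_category(name: str, parent: Optional[Category] = None) -> Category:
    """Create a category, refusing to nest deeper than MAX_CATEGORY_DEPTH."""
    category = Category(name=name, parent=parent)
    if category.depth() > MAX_CATEGORY_DEPTH:
        raise ValueError(f"cannot nest {name!r} deeper than {MAX_CATEGORY_DEPTH} levels")
    return category


# Example content that mirrors the examples in this section.
finance = add_category("Finance")
revenue = add_category("Revenue", parent=finance)
recurring = add_category("Recurring", parent=revenue)
income = Term("income", "Money received", category=revenue,
              synonyms=["earnings"], related_terms=["profit"])
```

In this sketch, calling `add_category` with `recurring` as the parent raises `ValueError`, because the new category would sit at a fourth level.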

Before you begin

Limitations

  • The bulk import of terms into a business glossary isn't supported.
  • The business glossary feature is only supported in the Google Cloud console.
  • When you search for data entries on the Dataplex Search page, the search results include data entries that have at least one attached term matching the search term. However, the search results don't show a reference to the matched term.
  • BigQuery Omni regions aren't supported. You can't create business glossary content or attach terms to data entries in these regions.

Required roles and permissions

To get the permissions that you need to create and manage glossaries, ask your administrator to grant you the following IAM roles on the project:

  • Full access to glossaries, categories, and terms: DataCatalog Glossary Owner (roles/datacatalog.glossaryOwner)
  • Read glossaries, categories, and terms, create attachments between terms, and create attachments between terms and data entries: DataCatalog Glossary User (roles/datacatalog.glossaryUser)
  • Read-only access to glossaries, categories, and terms: DataCatalog Entry Viewer (roles/datacatalog.entryViewer)

For more information about granting roles, see Manage access to projects, folders, and organizations.

These predefined roles contain the permissions required to create and manage glossaries. To see the exact permissions that are required, expand the Required permissions section:

Required permissions

The following permissions are required to create and manage glossaries:

  • Full access to glossaries, categories, and terms:
    • datacatalog.entries.*
    • datacatalog.relationships.*
  • Read glossaries, categories, and terms, create attachments between terms, and create attachments between terms and data entries:
    • datacatalog.entries.get
    • datacatalog.entries.list
    • datacatalog.relationships.*
  • Read-only access to glossaries, categories, and terms:
    • datacatalog.entries.get
    • datacatalog.entries.list
    • datacatalog.entryGroups.get
    • datacatalog.relationships.list
    • resourcemanager.projects.list
    • resourcemanager.projects.get
  • Create or delete glossaries: datacatalog.entryGroups.create

You might also be able to get these permissions with custom roles or other predefined roles.

For more information, see Predefined roles for Data Catalog.
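
To read the permission list above more easily, the following Python sketch checks whether a permission name is covered by one of the three access levels, including the `*` wildcards. It is only a local reading aid: IAM performs the real evaluation, the `ACCESS_LEVELS` keys and the `covers` helper are hypothetical names, and `datacatalog.relationships.create` is used purely as an illustrative name that matches the wildcard.

```python
from fnmatch import fnmatchcase
from typing import Dict, List

# Permission patterns copied from the "Required permissions" list.
ACCESS_LEVELS: Dict[str, List[str]] = {
    "full_access": [
        "datacatalog.entries.*",
        "datacatalog.relationships.*",
    ],
    "read_and_attach": [
        "datacatalog.entries.get",
        "datacatalog.entries.list",
        "datacatalog.relationships.*",
    ],
    "read_only": [
        "datacatalog.entries.get",
        "datacatalog.entries.list",
        "datacatalog.entryGroups.get",
        "datacatalog.relationships.list",
        "resourcemanager.projects.list",
        "resourcemanager.projects.get",
    ],
}


def covers(level: str, permission: str) -> bool:
    """Return True if any pattern of the access level matches the permission."""
    return any(fnmatchcase(permission, pattern) for pattern in ACCESS_LEVELS[level])


# Illustrative permission name (an assumption, chosen to match the wildcard).
can_attach = covers("read_and_attach", "datacatalog.relationships.create")
can_attach_read_only = covers("read_only", "datacatalog.relationships.create")
```

Here `can_attach` is `True` and `can_attach_read_only` is `False`, which reflects the difference between the second and third access levels in the list.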

Create and manage business glossaries

Create a business glossary

  1. In the Google Cloud console, go to the Dataplex Glossaries page.

    Go to Glossaries

  2. Click Create business glossary.

  3. On the Create business glossary page, specify a name, location, and description for the business glossary.

    The name and location fields are mandatory. You cannot change the location after you create the glossary.

  4. Click Create.

    The glossary is created under the current project and the glossary page is displayed.

View the list of available glossaries

  • In the Google Cloud console, go to the Dataplex Glossaries page.

    Go to Glossaries

    The My glossaries pane displays all the organization's glossaries you have permission to view, along with their descriptions and last modified timestamps.

Modify a glossary

You can modify the name and description of a glossary.

  1. In the Google Cloud console, go to the Dataplex Glossaries page.

    Go to Glossaries

  2. Click the glossary you want to modify.

    • To edit the name, click Edit, enter a new name, and click Save.

    • To edit the description, click Edit description, enter a new description, and click Save.

Delete a glossary

  1. In the Google Cloud console, go to the Dataplex Glossaries page.

    Go to Glossaries

  2. Click the glossary you want to delete.

  3. Click Delete.

    To confirm your action, enter the glossary name and click Confirm.

Create and manage categories

Create a category under a glossary

  1. In the Google Cloud console, go to the Dataplex Glossaries page.

    Go to Glossaries

  2. Select a glossary in which you want to create a category, and click Create category.

  3. On the Create category page, enter a name and a description for the category.

  4. Click Create.

    The category details page opens.

Create a category under a category

  1. In the Google Cloud console, go to the Dataplex Glossaries page.

    Go to Glossaries

  2. Expand the glossary in which you want to create a category.

  3. Select the parent category in which you want to create another category, and click Create category. You can nest a category up to three levels.

  4. On the Create category page, enter a name and a description for the category.

  5. Click Create.

    The category details page opens.

View the list of available categories

  1. In the Google Cloud console, go to the Dataplex Glossaries page.

    Go to Glossaries

  2. To view the list of available categories in a glossary, expand the glossary.

  3. To view the list of available nested categories, expand the category.

Update a category

You can modify the name, description, data steward, and parent category for the current category.

  1. In the Google Cloud console, go to the Dataplex Glossaries page.

    Go to Glossaries

  2. Expand the glossary and click the category you want to modify.

    1. To edit the name, click Edit and enter a new name.

    2. To edit the description, click Edit description and enter a new description.

    3. To edit the list of data stewards, click Edit next to Steward, provide a list of data steward emails, and click Save.

    4. To edit the category, click Edit next to Category.

      • To change the parent category, in the Select a category field, enter the name of a new parent category from the current glossary, and select it.

      • To remove the current parent category, in the Select a category field, select ---None---.

      Click Save.

Delete a category

  1. In the Google Cloud console, go to the Dataplex Glossaries page.

    Go to Glossaries

  2. Expand the glossary and click the category you want to delete permanently.

    A warning is displayed stating that the deletion is irreversible and all terms and categories under the current category will be moved up to the next level.

    To confirm the deletion, enter the category name and click Confirm.
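
The effect described in the warning, where terms and categories under the deleted category move up to the next level, can be sketched as a small tree operation. This Python example is a conceptual model under stated assumptions, not Dataplex code: the `parents` map and the `delete_category` helper are hypothetical, and each item is assumed to have at most one parent.

```python
from typing import Dict, Optional


def delete_category(parents: Dict[str, Optional[str]], category: str) -> Dict[str, Optional[str]]:
    """Return a new parent map without `category`; its children move up one level."""
    if category not in parents:
        raise KeyError(category)
    new_parent = parents[category]  # the level directly above the deleted category
    return {
        item: (new_parent if parent == category else parent)
        for item, parent in parents.items()
        if item != category
    }


# Maps each category or term to its parent (None: directly under the glossary).
tree = {
    "Finance": None,
    "Revenue": "Finance",
    "income": "Revenue",     # a term inside Revenue
    "Recurring": "Revenue",  # a category nested inside Revenue
}
after = delete_category(tree, "Revenue")
```

After the call, both `income` and `Recurring` have `Finance` as their parent in `after`, and `tree` itself is unchanged.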

Create and manage terms

Create a term

  1. In the Google Cloud console, go to the Dataplex Glossaries page.

    Go to Glossaries

  2. Select a glossary in which you want to create a term, and click Create term.

  3. On the Create term page, enter a name and a description for the term.

  4. Click Create.

    The term details page opens.

View the list of available terms

  1. In the Google Cloud console, go to the Dataplex Glossaries page.

    Go to Glossaries

  2. To view the list of available terms in a glossary, expand the glossary.

Access the details of a term

  1. In the Google Cloud console, go to the Dataplex Glossaries page.

    Go to Glossaries

  2. To view the list of available terms in a glossary, expand the glossary.

  3. Click a term.

    The term details page displays the term name, description, related terms, synonyms, and related entries.

Update a term

You can modify the name, description, data steward, and parent category for a term.

  1. In the Google Cloud console, go to the Dataplex Glossaries page.

    Go to Glossaries

  2. Expand the glossary and click the term you want to modify.

    1. To edit the name, click Edit and enter a new name.

    2. To edit the description, click Edit description and enter a new description.

    3. To edit the list of data stewards, click Edit next to Steward, provide a list of data steward emails, and click Save.

    4. To edit the category, click Edit next to Category.

      • To change the parent category, in the Select a category field, enter the name of a new parent category from the current glossary, and select it.

      • To remove the term from its current category, in the Select a category field, select ---None---.

      Click Save.

Delete a term

  1. In the Google Cloud console, go to the Dataplex Glossaries page.

    Go to Glossaries

  2. Expand the glossary and click the term you want to delete permanently.

    A warning is displayed stating that the deletion is irreversible and all relationships associated with the term will be removed.

    To confirm the deletion, enter the term name and click Confirm.
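
As the warning states, deleting a term also removes every relationship associated with it. The following Python sketch models that cleanup on an in-memory relationship map. The map layout and the `delete_term` helper are hypothetical, and the sketch doesn't distinguish synonyms from related terms.

```python
from typing import Dict, Set


def delete_term(relationships: Dict[str, Set[str]], term: str) -> Dict[str, Set[str]]:
    """Return a copy of the map without `term` and without any link that points to it."""
    return {
        name: linked - {term}
        for name, linked in relationships.items()
        if name != term
    }


# Each term name maps to the names of the terms it is linked to.
links = {
    "income": {"earnings", "profit"},
    "earnings": {"income"},
    "profit": {"income"},
}
remaining = delete_term(links, "income")
```

In `remaining`, the `income` key is gone and the sets for `earnings` and `profit` are empty, because their only relationship was with the deleted term.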

Manage relationships between terms

You can add relationships between terms by marking them as related terms or synonyms. The lists of related terms and synonyms are displayed on the term details page.

Add a synonym or a related term

  1. In the Google Cloud console, go to the Dataplex Glossaries page.

    Go to Glossaries

  2. Expand the glossary and click the term to which you want to attach a synonym or a related term.

  3. Click Edit for Related terms or Synonyms.

  4. In the Search glossary terms field, enter the term you want to attach, and select the term from the search result.

    The term gets attached immediately when you select it.

  5. Click Done.

Remove a synonym or a related term

  1. In the Google Cloud console, go to the Dataplex Glossaries page.

    Go to Glossaries

  2. Expand the glossary and click the term for which you want to remove a synonym or a related term.

  3. To remove a related term, follow these steps:

    1. Click Edit next to Related terms.

    2. For the related term you want to delete, click Delete. The attachment is deleted immediately.

    3. Click Done.

  4. To remove a synonym, follow these steps:

    1. Click Edit next to Synonyms.

    2. For the synonym you want to delete, click Delete. The attachment is deleted immediately.

    3. Click Done.

Manage column level terms

You can attach terms to the columns of Data Catalog entries that have schemas (for example, tables and filesets), to provide more context about the column data.

Attach terms to columns

  1. Navigate to a Data Catalog entry that has a schema.

  2. Click the Schema tab.

  3. In the Business Terms column, for the field you want to attach terms to, click Add.

  4. In the Term search field, enter the term name and select it from the search result. You can select multiple terms.

  5. Click Attach terms.

View the details of terms attached to columns

  1. Navigate to a Data Catalog entry that has a schema.

  2. Click the Schema tab.

  3. In the Business Terms column, click the term to view the term description. You can also remove the term or navigate to the term details page using the options in this window.

Remove terms attached to columns

  1. Navigate to a Data Catalog entry that has a schema.

  2. Click the Schema tab.

  3. In the Business Terms column, for the term that you want to remove, click Delete.

    You can also remove the attached term by clicking the term, and then clicking Remove business term > Confirm.

Locate glossaries and terms

To navigate across your business glossary content, use the glossary and terms tree on the Dataplex Glossaries page.

You can use the filter bar on the Dataplex Glossaries page to locate categories by display name, or to locate terms that match filter criteria. The filter returns items with any attributes matching the filter query. You can also use the following qualifiers:

  • parent:VALUE: highlights the terms whose parent glossary name or description matches VALUE. For example, parent:Finance highlights the terms whose parent glossary name or description contains the phrase Finance.

  • contact:VALUE: highlights the terms where the data steward is VALUE. For example, a contact: query with a data steward's email address as VALUE highlights the terms whose data steward matches that address.

  • synonym:VALUE: highlights the terms that have at least one synonym with name or description matching VALUE. For example, synonym:interest highlights the terms that have synonyms with name or description containing the phrase interest.

  • related_to:VALUE: highlights the terms that have at least one related term with name or description matching VALUE. For example, related_to:discounting highlights the terms that have related terms with name or description containing the phrase discounting.
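
The qualifier semantics above can be sketched as a matching function over an in-memory list of terms. This Python example is illustrative only and is not how the console implements filtering. The `Term` fields and the `matches` helper are hypothetical, the email address is a made-up example, and case-insensitive substring matching (exact matching for `contact`) is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Term:
    name: str
    description: str = ""
    glossary_name: str = ""
    glossary_description: str = ""
    data_stewards: List[str] = field(default_factory=list)
    synonyms: List[str] = field(default_factory=list)       # names of synonym terms
    related_terms: List[str] = field(default_factory=list)  # names of related terms


def _contains(text: str, value: str) -> bool:
    return value.lower() in text.lower()


def _linked_term_matches(name: str, value: str, index: Dict[str, Term]) -> bool:
    """Check the linked term's name, and its description when the term is known."""
    linked = index.get(name)
    description = linked.description if linked else ""
    return _contains(name, value) or _contains(description, value)


def matches(term: Term, query: str, index: Dict[str, Term]) -> bool:
    """Return True if the filter query would highlight the term."""
    qualifier, sep, value = query.partition(":")
    if sep and qualifier == "parent":
        return _contains(term.glossary_name, value) or _contains(term.glossary_description, value)
    if sep and qualifier == "contact":
        return any(steward.lower() == value.lower() for steward in term.data_stewards)
    if sep and qualifier == "synonym":
        return any(_linked_term_matches(n, value, index) for n in term.synonyms)
    if sep and qualifier == "related_to":
        return any(_linked_term_matches(n, value, index) for n in term.related_terms)
    # No recognized qualifier: compare the query with every attribute of the term.
    attributes = [term.name, term.description, *term.data_stewards]
    return any(_contains(text, query) for text in attributes)


terms = [
    Term("income", "Money received for goods or services", "Finance glossary",
         data_stewards=["steward@example.com"],  # hypothetical address
         synonyms=["earnings"], related_terms=["profit"]),
    Term("earnings", "Money earned in a period", "Accounting glossary"),
    Term("profit", "Income minus expenses", "Finance glossary"),
]
index = {t.name: t for t in terms}
highlighted = [t.name for t in terms if matches(t, "parent:Finance", index)]
```

With this sample data, `highlighted` contains `income` and `profit`, the two terms whose parent glossary name contains Finance.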

Search for data entries using terms

You can use the Dataplex business glossary content when searching for entries on the Dataplex Search page.

The following search scenarios are supported:

  • Simple search: The search query is free-form text that captures a term name or description. Search results include entries connected to a term whose name or description matches the search query. These results are displayed alongside results that are matched by other means, unrelated to business glossary content.

  • Search using qualifiers: The search query is specified as term:VALUE. The search result includes entries connected to a term where a substring of name, description, or data steward matches VALUE.
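
The term:VALUE scenario can be sketched as a lookup over entries and the terms attached to their columns. This Python example is a local model, not the Dataplex search implementation. The `Entry` and `Term` classes and the `search_by_term` helper are hypothetical, and case-insensitive matching is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Term:
    name: str
    description: str = ""
    data_stewards: List[str] = field(default_factory=list)


@dataclass
class Entry:
    name: str
    # Maps a column name to the terms attached to that column.
    column_terms: Dict[str, List[Term]] = field(default_factory=dict)


def search_by_term(entries: List[Entry], query: str) -> List[str]:
    """Return names of entries with an attached term that matches a term:VALUE query."""
    prefix = "term:"
    if not query.startswith(prefix):
        raise ValueError("expected a query of the form term:VALUE")
    value = query[len(prefix):].lower()

    def term_matches(term: Term) -> bool:
        # VALUE can be a substring of the name, description, or data steward.
        haystack = [term.name, term.description, *term.data_stewards]
        return any(value in text.lower() for text in haystack)

    return [
        entry.name
        for entry in entries
        if any(term_matches(t) for attached in entry.column_terms.values() for t in attached)
    ]


income = Term("income", "Money received for goods or services")
entries = [
    Entry("orders", {"amount": [income]}),
    Entry("customers"),  # no terms attached
]
found = search_by_term(entries, "term:incom")
```

With this sample data, `found` contains only `orders`, because `customers` has no attached terms.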

What's next