Data Catalog

Data Catalog lets you search all data from a single place, regardless of access controls. It displays a catalog of the following:

  • all data sources that have been crawled, ingested and synchronized
  • all data pipeline targets that have been created using transformation pipelines
  • all cubes and accelerated data models created using Infoworks

To search the catalog:

  • Click the Data Catalog menu. The Data Catalog page is displayed. This page allows you to search data by Name, Description, Tags, Categories (source, table, cubes) and Favorites.
  • Enter the search keyword and click the search icon. A list of records matching the keyword is displayed.
  • In the search results, click the Name of the required item. You are redirected to its details page.

The following searches are supported:

Search by Name

A regex search is performed on the name of the sources and tables.

Search by Description

A regex search is performed on the description provided for the sources, tables and pipelines.

Search by Tag

A regex search is performed on all tags, and the sources and tables associated with the matching tags are displayed.
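
The name, description and tag searches described above can be pictured with a small Python sketch. It is only an illustration: the CatalogEntry record, the search function and the sample entries are hypothetical, and the case-insensitive, match-anywhere behavior is an assumption rather than documented Infoworks behavior.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class CatalogEntry:
    # Hypothetical record shape used for illustration only.
    name: str
    description: str = ""
    tags: List[str] = field(default_factory=list)


def search(entries, pattern, by="name"):
    """Return the entries whose chosen field matches the regex pattern.

    by is one of "name", "description" or "tag".
    """
    # Assumption: matching ignores case and may match anywhere in the value.
    regex = re.compile(pattern, re.IGNORECASE)
    results = []
    for entry in entries:
        # A tag search checks every tag; the other searches check one field.
        values = entry.tags if by == "tag" else [getattr(entry, by)]
        if any(regex.search(value) for value in values):
            results.append(entry)
    return results


entries = [
    CatalogEntry("sales_orders", "Daily order feed", ["finance", "daily"]),
    CatalogEntry("customer_master", "Customer reference data", ["crm"]),
    CatalogEntry("sales_returns", "Returned orders", ["finance"]),
]

by_name = [e.name for e in search(entries, r"^sales_")]
by_description = [e.name for e in search(entries, r"order", by="description")]
by_tag = [e.name for e in search(entries, r"fin.*", by="tag")]
print(by_name, by_description, by_tag)
```

In this sketch, a tag search returns the entries that carry a matching tag, which mirrors how the catalog displays the sources and tables associated with the tag rather than the tag itself.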

Search by Category

During global search, you can filter the data by the following categories:

  • Data Sources: Includes all sources regardless of access control.
  • Data Models: Includes all tables regardless of access control.
  • Accelerated Data Models: Includes all cubes associated with a pipeline.
  • Favorites: See Search by Favorites

Search by Favorites

During global search, you can filter data by favorites, which includes sources and tables.
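
The category and favorites filters can be sketched in the same spirit. The Item record, the kind labels and the filter_by_category function below are hypothetical names chosen for illustration; only the category names and what each one includes come from the descriptions above.

```python
from dataclasses import dataclass


@dataclass
class Item:
    # Hypothetical record; kind is "source", "table" or "cube".
    name: str
    kind: str
    favorite: bool = False


# Category names as listed above, mapped to the (assumed) kinds they include.
CATEGORY_KINDS = {
    "Data Sources": {"source"},
    "Data Models": {"table"},
    "Accelerated Data Models": {"cube"},
}


def filter_by_category(items, category):
    """Return the items that belong to the given catalog category."""
    if category == "Favorites":
        # Favorites cover sources and tables that were starred.
        return [i for i in items if i.favorite and i.kind in {"source", "table"}]
    kinds = CATEGORY_KINDS[category]
    return [i for i in items if i.kind in kinds]


items = [
    Item("crm_db", "source", favorite=True),
    Item("customers", "table"),
    Item("orders", "table", favorite=True),
    Item("orders_cube", "cube"),
]

favorites = [i.name for i in filter_by_category(items, "Favorites")]
models = [i.name for i in filter_by_category(items, "Data Models")]
print(favorites, models)
```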

Source Detail View

Clicking the source displays a detailed view of the source.

It includes the following details:

Configure Source Link

Data Catalog only allows you to view data. To edit or configure the data source, click the edit icon next to the name of the source. You are redirected to the sources settings page.

Global Search in Data Source

You can perform a global search from the Data Sources page. A regex search is performed on the name, table, description and tags. This search does not support filtering by categories or favorites.

Description Field

The Description section displays the following details:

  • Description: Description for the Source.
  • Owner: User who created the source.
  • Status: The status of the source.
  • Record Count: Number of tables associated with the source.
  • Last Refreshed: Last modified time of the source.
  • New Records Added: Number of tables added in the previous week.

Favorite

  • Click the star icon to add the source as a favorite. A black star indicates that the source has been added as a favorite.

Tags

  • This option allows you to add, remove and view the tags associated with the source.
  • Add Tag: Click the + icon, enter the new tag and press Enter.
  • Remove Tag: Click the x icon to remove the tag from the source.

Tables

  • List of all tables associated with the source.
  • Click the name of a table to navigate to the table detail view.

Table Detail View

Clicking the table displays a detailed view of the table.

The following fields are displayed:

  • Column Name: Name of the column in the Hive table.

  • Column Name at Source: Name of the column in the source database table.

  • Column Target Datatype: Datatype of the column in the Hive table.

  • Options (Precision, Scale): The precision/scale of the column in the Hive table.

  • Column Description: Editable description of the column.

Configure Table Link

Clicking the edit icon navigates to the source configuration page if the table is associated with a source, or to the pipeline configuration page if the table is associated with a pipeline.

Navigation Tabs

You can view the schema, sample data, data and lineage of the table by clicking the respective tabs.

  • Sample Data: Includes the sample data of the source table and the frequency histogram of the data.
  • Data: Visual representation of all the Hive tables and views.
  • Lineage: Displays all the instances where the data of the table is being used.

Description Field

The Description section displays the following details:

  • Status: Status of the table.
  • Columns: Number of columns in the table.
  • Records: Number of records in the table.
  • Last Refreshed: Time when the table was last ingested.
  • New Records Added (Week): Number of rows added in the previous week.