Hive Ingestion
Hive crawling stores the metadata of an existing Hive schema in MongoDB so that Data Transformation can use it to build pipelines on top of that data.
NOTE: Currently, Hive crawling is supported only if the Hive schema is on the same cluster where Infoworks DataFoundry is deployed.
Creating Hive Source
For creating a Hive source, see Source Creation. Ensure that the Source Type selected is Hive.
Configuring Hive Source
For configuring a Hive source, see Source Configuration.
Hive Configurations
Field | Description
---|---
Fetch Data Using | The mechanism through which Infoworks fetches data from the database. The available option is JDBC.
Connection URL | The connection URL through which Infoworks connects to the database. The URL must be in the following format: jdbc:hive:///TMODE=,database= |
Username | The username for the connection to the database. |
Password | The password for the username provided. |
Source Schema | The schema in the database to be crawled. The schema value is case-sensitive.
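The placeholders in the connection URL format above appear truncated in this page. As an illustration only, the sketch below assembles a Hive JDBC URL using the common `jdbc:hive2://<host>:<port>/<database>` convention; the helper name, host, port, and database values are assumptions, not values mandated by Infoworks.

```python
# Sketch: assemble a Hive JDBC connection URL from its parts.
# The jdbc:hive2:// prefix and all sample values are illustrative
# assumptions; use the format your Hive deployment actually expects.

def build_hive_jdbc_url(host, port, database, **properties):
    """Return a URL of the form jdbc:hive2://host:port/database;key=value;..."""
    url = f"jdbc:hive2://{host}:{port}/{database}"
    if properties:
        # Session properties are appended as semicolon-separated key=value pairs.
        url += ";" + ";".join(f"{k}={v}" for k, v in sorted(properties.items()))
    return url

print(build_hive_jdbc_url("hive-server.example.com", 10000, "sales_db"))
```

Keeping the URL construction in one place makes it easy to test the connection with different session properties before saving the source settings.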
Once the settings are saved, you can test the connection or navigate to the Source Configuration page to crawl the metadata.
NOTE: Ensure that the source schema you provide already exists in Hive.
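One way to verify this before crawling is to list the schema from Beeline on the cluster. The server address and schema name below are placeholders for illustration, not values from this document.

```shell
# Check that the source schema exists in Hive before crawling.
# <hive-server>, port 10000, and sales_db are placeholder values.
beeline -u "jdbc:hive2://<hive-server>:10000/default" \
        -e "SHOW DATABASES LIKE 'sales_db';"
```

If the schema is not listed, create it in Hive first; crawling a nonexistent schema will fail.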
Crawling Hive Source Metadata
For crawling a Hive source, see Crawling Metadata.