Hive Ingestion
Hive crawling stores the metadata of an existing Hive schema in MongoDB so that Data Transformation can use it to build pipelines on top of that data.
NOTE: Currently, Hive crawling is supported only if the Hive schema is on the same cluster where Infoworks DataFoundry is deployed.
Creating Hive Source
For creating a Hive source, see Source Creation. Ensure that the Source Type selected is Hive.
Configuring Hive Source
For configuring a Hive source, see Source Configuration.
Hive Configurations
Field | Description
---|---
Fetch Data Using | The mechanism through which Infoworks fetches data from the database. The available option is JDBC.
Connection URL | The connection URL through which Infoworks connects to the database. The URL must be in the following format: jdbc:hive:///TMODE=,database= |
Username | The username for the connection to the database. |
Password | The password for the username provided. |
Source Schema | The schema in the database to be crawled. The schema value is case-sensitive.
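The placeholders in the connection URL format above appear truncated in this page. As an illustration only, the sketch below assembles a Hive JDBC URL using the common `jdbc:hive2://<host>:<port>/<database>` convention; the helper name, host, port, and database values are assumptions, not values mandated by Infoworks.

```python
# Sketch: assemble a Hive JDBC connection URL from its parts.
# The jdbc:hive2:// prefix and all sample values are illustrative
# assumptions; use the format your Hive deployment actually expects.

def build_hive_jdbc_url(host, port, database, **properties):
    """Return a URL of the form jdbc:hive2://host:port/database;key=value;..."""
    url = f"jdbc:hive2://{host}:{port}/{database}"
    if properties:
        # Session properties are appended as semicolon-separated key=value pairs.
        url += ";" + ";".join(f"{k}={v}" for k, v in sorted(properties.items()))
    return url

print(build_hive_jdbc_url("hive-server.example.com", 10000, "sales_db"))
```

Keeping the URL construction in one place makes it easy to test the connection with different session properties before saving the source settings.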
Once the settings are saved, you can test the connection or navigate to the Source Configuration page to crawl the metadata.
NOTE: Ensure that the source schema you provide already exists in Hive.
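One way to verify this before crawling is to list the schema from Beeline on the cluster. The server address and schema name below are placeholders for illustration, not values from this document.

```shell
# Check that the source schema exists in Hive before crawling.
# <hive-server>, port 10000, and sales_db are placeholder values.
beeline -u "jdbc:hive2://<hive-server>:10000/default" \
        -e "SHOW DATABASES LIKE 'sales_db';"
```

If the schema is not listed, create it in Hive first; crawling a nonexistent schema will fail.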
Crawling Hive Source Metadata
For crawling a Hive source, see Crawling Metadata.