How to Read Hive Table Data Ingested in Parquet Format through Spark Shell

Issue

Reading Hive table data ingested in Parquet format through the Spark shell returns no results.

Cause

Infoworks can ingest data into Hive in Parquet format. The data is stored recursively in nested HDFS directories, and if a user tries to read it through the Spark shell, no results are displayed.

Solution

To read Hive table data stored in recursive HDFS directories through the Spark shell, set the following configurations in the df_spark-defaults.conf file in the $IW_HOME/conf directory, and then run the Spark shell command:

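The original configuration block did not survive extraction, so its exact contents are unknown. As a sketch, the standard Spark/Hive properties that allow Spark to traverse nested subdirectories when reading a Hive table are shown below; verify against your Infoworks documentation before applying:

```properties
# Allow Hive input formats to read data from subdirectories of the table location
spark.hive.mapred.supports.subdirectories true
# Make Hadoop input formats list input files recursively
spark.hadoop.mapreduce.input.fileinputformat.input.dir.recursive true
```

Both properties must be set; the first enables subdirectory support at the Hive layer, while the second makes the underlying Hadoop input format descend into nested directories.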

Spark Shell Command

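The original command block is also missing. A plausible sketch, assuming the spark-shell launcher is on the PATH: pass the df_spark-defaults.conf file mentioned above via --properties-file, then query the table (the database and table names below are placeholders):

```shell
# Launch the Spark shell with the Infoworks Spark defaults file
spark-shell --properties-file $IW_HOME/conf/df_spark-defaults.conf

# Inside the shell, query the Hive table; my_db.my_parquet_table is a placeholder
# scala> spark.sql("SELECT * FROM my_db.my_parquet_table LIMIT 10").show()
```

With the recursive-read properties in effect, the query should now return rows from the nested HDFS directories instead of an empty result.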