On-premise Installation
Prerequisites
Supported Operating Systems
- CentOS - Versions 6.6+, 7.3
- Red Hat Enterprise Linux - Versions 6.6+, 7.3
- Ubuntu - Version 16.04 (supported for HDInsight only)
- Debian - Version 8.1 (supported for Dataproc only)
Supported Hadoop Distributions
- HDP - Versions 2.5.5, 2.6.4
- MapR - Version 6.0.1
- Cloudera - Version 5.13
Installation Procedure
The installation logs are available in <path_to_Infoworks_home>/iw-installer/logs/installer.log.
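For example, to follow installation progress from another terminal while the installer runs (a minimal sketch; substitute your actual Infoworks home path):

# Follow the installer log as it is written
tail -f <path_to_Infoworks_home>/iw-installer/logs/installer.log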
Perform the following:
Download and Extract Installer
- Download the installer tarball:
wget <link-to-download>
- Extract the installer:
tar -xf deploy_<version_number>.tar.gz
- Navigate to installer directory:
cd iw-installer
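Optionally, confirm that the scripts used in the next steps were extracted (a quick sanity check, assuming the default installer layout):

# Both scripts are invoked in the steps below
ls -l configure_install.sh install.sh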
Configure Installation
- Run the following command:
./configure_install.sh
Enter the details for each prompt (a connectivity check you can run beforehand is sketched after this list):
- Hadoop distro name and installation path (If not auto-detected)
- Infoworks user
- Infoworks user group
- Infoworks installation path
- Infoworks HDFS home (path of home folder for Infoworks artifacts)
- Hive schema for Infoworks sample data
- IP address for accessing Infoworks UI (when in doubt, use the FQDN of the Infoworks host)
- HiveServer2 thrift server hostname
- Hive user name
- Hive user password
If Hadoop distro is Cloudera (CDH):
- Impala hostname
- Impala port number
- Impala user name
- Impala password
- Is Impala Kerberized?
If Impala is Kerberized:
- Kerberos Realm
- Kerberos host FQDN
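Before entering these details, you can confirm that the HiveServer2 endpoint (and, on CDH, the Impala daemon) is reachable with the credentials you plan to provide. This is a minimal sketch using the standard beeline and impala-shell clients; the hostnames, ports, and credentials are illustrative placeholders:

# Verify HiveServer2 connectivity (10000 is the default thrift port; adjust for your cluster)
beeline -u "jdbc:hive2://<hiveserver2-host>:10000" -n <hive-user> -p <hive-password> -e "show databases;"

# CDH only: verify the Impala daemon responds (21000 is the default shell port)
impala-shell -i <impala-host>:<impala-port> -q "show databases;"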
Run Installation
- Install Infoworks:
./install.sh -v <version_number>
NOTE: For CentOS/RHEL6, replace <version_number> with 2.7.2. For CentOS/RHEL7, replace <version_number> with 2.7.2-rhel7.
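For example, on a CentOS/RHEL7 host the command becomes:

# CentOS/RHEL7 hosts use the -rhel7 build of this release
./install.sh -v 2.7.2-rhel7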
Post Installation
If the target machine is Kerberos enabled, perform the following post-installation steps:
- Go to <IW_HOME>/conf/conf.properties.
- Edit the Kerberos security settings, ensuring they are uncommented.
- Restart the Infoworks services.
NOTE: Kerberos tickets are renewed before every Infoworks DataFoundry job runs. The Infoworks DataFoundry platform supports a single Kerberos principal per Kerberized cluster; hence, all Infoworks DataFoundry jobs run as the same Kerberos principal, which must have access to all the artifacts in Hive, Spark, and HDFS.
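After restarting, a quick way to confirm that the shared principal can authenticate is to request and inspect a ticket with the standard Kerberos client tools (a sketch; the keytab path and principal are illustrative placeholders):

# Obtain a ticket for the Infoworks service principal (keytab path is illustrative)
kinit -kt /etc/security/keytabs/<infoworks>.keytab <principal>@<KERBEROS_REALM>

# Confirm the ticket cache holds a valid ticket for that principal
klist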