On-premise Installation
Prerequisites
Supported Operating Systems
- CentOS - Versions 6.6+, 7.3
- Red Hat Enterprise Linux - Versions 6.6+, 7.3
- Ubuntu - Version 16.04 (supported for HDInsight only)
- Debian - Version 8.1 (supported for Dataproc only)
Supported Hadoop Distributions
- HDP - Versions 2.5.5, 2.6.4
- MapR - Version 6.0.1
- Cloudera - Version 5.13
Installation Procedure
The installation logs are available in <path_to_Infoworks_home>/iw-installer/logs/installer.log.
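For example, to follow installation progress from another terminal while the installer runs (a minimal sketch; substitute your actual Infoworks home path):

# Follow the installer log as it is written
tail -f <path_to_Infoworks_home>/iw-installer/logs/installer.log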
Perform the following:
Download and Extract Installer
- Download the installer tarball:
wget <link-to-download>
- Extract the installer:
tar -xf deploy_<version_number>.tar.gz
- Navigate to installer directory:
cd iw-installer
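Optionally, confirm that the scripts used in the next steps were extracted (a quick sanity check, assuming the default installer layout):

# Both scripts are invoked in the steps below
ls -l configure_install.sh install.sh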
Configure Installation
- Run the following command:
./configure_install.sh
Enter the details for each prompt (a connectivity check you can run beforehand is sketched after this list):
- Hadoop distro name and installation path (If not auto-detected)
- Infoworks user
- Infoworks user group
- Infoworks installation path
- Infoworks HDFS home (path of home folder for Infoworks artifacts)
- Hive schema for Infoworks sample data
- IP address for accessing Infoworks UI (when in doubt, use the FQDN of the Infoworks host)
- HiveServer2 thrift server hostname
- Hive user name
- Hive user password
If Hadoop distro is Cloudera (CDH):
- Impala hostname
- Impala port number
- Impala user name
- Impala password
- Is Impala Kerberized?
If Impala is Kerberized:
- Kerberos Realm
- Kerberos host FQDN
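Before entering these details, you can confirm that the HiveServer2 endpoint (and, on CDH, the Impala daemon) is reachable with the credentials you plan to provide. This is a minimal sketch using the standard beeline and impala-shell clients; the hostnames, ports, and credentials are illustrative placeholders:

# Verify HiveServer2 connectivity (10000 is the default thrift port; adjust for your cluster)
beeline -u "jdbc:hive2://<hiveserver2-host>:10000" -n <hive-user> -p <hive-password> -e "show databases;"

# CDH only: verify the Impala daemon responds (21000 is the default shell port)
impala-shell -i <impala-host>:<impala-port> -q "show databases;"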
Run Installation
- Install Infoworks:
./install.sh -v <version_number>
NOTE: For CentOS/RHEL6, replace <version_number> with 2.7.2. For CentOS/RHEL7, replace <version_number> with 2.7.2-rhel7.
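For example, on a CentOS/RHEL7 host the command becomes:

# CentOS/RHEL7 hosts use the -rhel7 build of this release
./install.sh -v 2.7.2-rhel7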
Post Installation
If the target machine is Kerberos enabled, perform the following post-installation steps:
- Go to <IW_HOME>/conf/conf.properties.
- Edit the Kerberos security settings, ensuring they are uncommented.
- Restart the Infoworks services.
NOTE: Kerberos tickets are renewed before every Infoworks DataFoundry job runs. The Infoworks DataFoundry platform supports a single Kerberos principal per Kerberized cluster; hence, all Infoworks DataFoundry jobs run as the same Kerberos principal, which must have access to all the artifacts in Hive, Spark, and HDFS.
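After restarting, a quick way to confirm that the shared principal can authenticate is to request and inspect a ticket with the standard Kerberos client tools (a sketch; the keytab path and principal are illustrative placeholders):

# Obtain a ticket for the Infoworks service principal (keytab path is illustrative)
kinit -kt /etc/security/keytabs/<infoworks>.keytab <principal>@<KERBEROS_REALM>

# Confirm the ticket cache holds a valid ticket for that principal
klist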