On-premise Installation

Prerequisites

Supported Operating Systems

  • CentOS - Versions 6.6+, 7.3
  • Red Hat Enterprise Linux - Versions 6.6+, 7.3
  • Ubuntu - Version 16.04 (supported for HDInsight only)
  • Debian - 8.1 (supported for Dataproc only)

Supported Hadoop Distributions

  • HDP - Versions 2.5.5, 2.6.4, 3.x
  • MapR - Version 6.0.1
  • Cloudera - Version 5.13
  • Azure - HDI 3.6
  • GCP - Dataproc 1.2 (unsecured), 1.3 (secured)
  • EMR - Version 5.17.0

Installation Procedure

The installation logs are available in <path_to_Infoworks_home>/iw-installer/logs/installer.log.

Perform the following:

Download and Extract Installer

  • Download the installer tarball: wget <link-to-download>
  • Extract the installer: tar -xf deploy_<version_number>.tar.gz
  • Navigate to installer directory: cd iw-installer
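The steps above can be sketched as a short script. The download link is the placeholder from this guide, and the version shown is just one of the values listed under Run Installation; a small helper composes the tarball name from the version.

```shell
#!/bin/sh
# Sketch of the download-and-extract steps; <link-to-download> is the
# placeholder from this guide, and 2.9.0-hdp-rhel7 is an example version.
iw_tarball_name() {
  # The installer ships as deploy_<version_number>.tar.gz
  printf 'deploy_%s.tar.gz' "$1"
}

# wget "<link-to-download>"                      # download the installer
# tar -xf "$(iw_tarball_name 2.9.0-hdp-rhel7)"   # extract the installer
# cd iw-installer                                # enter the installer dir
```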

Configure Installation

  • Run the following command: ./configure_install.sh

Enter the details for each prompt:

  • Hadoop distro name and installation path (if not auto-detected)
  • Infoworks user
  • Infoworks user group
  • Infoworks installation path
  • Infoworks HDFS home (path of home folder for Infoworks artifacts)
  • Hive schema for Infoworks sample data
  • IP address for accessing Infoworks UI (when in doubt use the FQDN of the Infoworks host)
  • HiveServer2 thrift server hostname
  • Hive user name
  • Hive user password

If Hadoop distro is Cloudera (CDH):

  • Impala hostname
  • Impala port number
  • Impala user name
  • Impala password
  • Is Impala Kerberized?

If Impala is Kerberized:

  • Kerberos Realm
  • Kerberos host FQDN

If Hadoop distro is GCP:

  • Managed Mongo URL
  • Are infoworks directories already extracted in IW_HOME?
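As a rough illustration, a configuration session for an HDP cluster might look like the following. Every value here is an example for illustration only, not a default or a recommendation; the exact prompt wording may differ between releases.

```shell
# ./configure_install.sh
# Example answers (all values are illustrative):
#   Hadoop distro name / installation path:  hdp  /usr/hdp/current
#   Infoworks user:                          infoworks
#   Infoworks user group:                    infoworks
#   Infoworks installation path:             /opt/infoworks
#   Infoworks HDFS home:                     /user/infoworks
#   Hive schema for sample data:             iw_sample
#   IP address for Infoworks UI:             iw-host.example.com
#   HiveServer2 thrift server hostname:      hs2.example.com
#   Hive user name / password:               hive / ******
```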

Run Installation

  • Install Infoworks: ./install.sh -v <version_number>

NOTE: For machines without a certificate setup, you can set the --certificate-check parameter to false using the following syntax: ./install.sh -v <version_number> --certificate-check <true/false>. The default value is true. Setting it to false performs insecure request calls and is not a recommended setup.

NOTE: Replace <version_number> according to your distribution and operating system:

  • HDP on CentOS/RHEL 6: 2.9.0-hdp-rhel6
  • HDP on CentOS/RHEL 7: 2.9.0-hdp-rhel7
  • MapR or Cloudera on CentOS/RHEL 6: 2.9.0-rhel6
  • MapR or Cloudera on CentOS/RHEL 7: 2.9.0-rhel7
  • Azure: 2.9.0-azure
  • GCP: 2.9.0-gcp
  • EMR: 2.9.0-emr
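The version-number mapping above can be captured in a small helper, sketched below. The function name is hypothetical; the strings it returns are the values from the note.

```shell
#!/bin/sh
# Derives the <version_number> argument for install.sh from the Hadoop
# distro and OS family, following the mapping in the note above.
# The helper name iw_version_number is illustrative, not part of the product.
iw_version_number() {
  distro="$1"
  os="$2"   # rhel6 or rhel7; ignored for azure/gcp/emr
  case "$distro" in
    hdp)           echo "2.9.0-hdp-${os}" ;;
    mapr|cloudera) echo "2.9.0-${os}" ;;
    azure)         echo "2.9.0-azure" ;;
    gcp)           echo "2.9.0-gcp" ;;
    emr)           echo "2.9.0-emr" ;;
    *)             echo "unknown distro: ${distro}" >&2; return 1 ;;
  esac
}

# Example: ./install.sh -v "$(iw_version_number hdp rhel7)"
```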

Post Installation

If the target machine is Kerberos enabled, perform the following post-installation steps:

  • Go to <IW_HOME>/conf/conf.properties
  • Edit the Kerberos security settings in this file, ensuring these settings are uncommented.
  • Restart the Infoworks services.

NOTE: Kerberos tickets are renewed before running all the Infoworks DataFoundry jobs. Infoworks DataFoundry platform supports single Kerberos principal for a Kerberized cluster. Hence, all Infoworks DataFoundry jobs work using the same Kerberos principal, which must have access to all the artifacts in Hive, Spark, and HDFS.
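The single-principal model described in the note can be illustrated with a standard ticket renewal: each job authenticates as the one shared principal before running. The keytab path and principal name below are hypothetical examples, and the helper function is for illustration only.

```shell
#!/bin/sh
# Hedged sketch: building the standard kinit command used to renew a
# ticket for the single Infoworks Kerberos principal. The keytab path
# and principal are hypothetical examples, not product defaults.
kinit_cmd() {
  keytab="$1"
  principal="$2"
  printf 'kinit -kt %s %s' "$keytab" "$principal"
}

# Renew the ticket before a job, then verify it:
#   $(kinit_cmd /etc/security/keytabs/infoworks.keytab infoworks@EXAMPLE.COM)
#   klist
```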
