On-premise Installation

Prerequisites

Supported Operating Systems

  • CentOS - Versions 6.6+, 7.3
  • Red Hat Enterprise Linux - Version 7.5
  • Ubuntu - Version 16.04 (supported for HDInsight only)
  • Debian - Version 8.1 (supported for Dataproc only)
  • SUSE Linux Enterprise Server - Version 12
  • EMR Operating System - Amazon Linux

Supported Hadoop Distributions

  • HDP - Versions 2.5.5, 2.6.4, 3.x
  • MapR - Version 6.0.1
  • Cloudera - Version 5.14
  • Azure - HDI 3.6
  • GCP - Dataproc Versions 1.2 (Unsecured), 1.3 (Secured)
  • EMR - Version 5.17.0

Installation Procedure

Perform the following:

Step 1: Download the installer

  • Navigate to a temporary directory: cd <temporary_installer_directory>
  • Download the installer tarball by running the following command: wget <link-to-download>

NOTE: Contact support@infoworks.io to get the <link-to-download>.

Step 2: Extract the installer by running the following command: tar -xf deploy_<version_number>.tar.gz

This creates a directory named iw-installer.

Step 3: Navigate to the installer directory by running the following command: cd iw-installer
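The extraction step can be wrapped in a small check so that a failed or incomplete download is caught before configuration starts. The following is a minimal sketch, not part of the Infoworks installer: the function name extract_installer is hypothetical, and the only facts taken from this guide are the tar -xf command and the iw-installer directory name.

```shell
# Hypothetical helper (not shipped with the installer).
# Extracts the installer archive into a working directory and confirms that
# the iw-installer directory exists afterwards.
# Usage: extract_installer <path-to-deploy-tarball> <working-directory>
extract_installer() {
    local tarball="$1"
    local workdir="$2"

    # Fail early if the archive was never downloaded.
    [ -f "$tarball" ] || { echo "archive not found: $tarball" >&2; return 1; }

    mkdir -p "$workdir" || return 1
    tar -xf "$tarball" -C "$workdir" || return 1

    # The guide states that extraction creates a directory named iw-installer.
    [ -d "$workdir/iw-installer" ] || {
        echo "iw-installer directory missing after extraction" >&2
        return 1
    }

    # Print the installer directory so the caller can cd into it.
    echo "$workdir/iw-installer"
}
```

For example, cd "$(extract_installer deploy_<version_number>.tar.gz <temporary_installer_directory>)" would cover Steps 2 and 3 in one line, with the placeholders replaced by real values.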

Step 4: Configure installation

  • Run the following command: ./configure_install.sh

Enter the following details when prompted:

  • Hadoop distribution name and installation path (if not auto-detected).
  • Infoworks user
  • Infoworks user group
  • Infoworks installation path, that is, the location where Infoworks will be installed. This location is referred to as IW_HOME.
  • Infoworks HDFS home (path of home folder for Infoworks artifacts)
  • Hive schema for Infoworks sample data
  • IP address for accessing Infoworks UI (when in doubt use the FQDN of the Infoworks host)
  • HiveServer2 thrift server hostname: Hostname of the instance where the HiveServer2 service is running.
  • Hive user name
  • Hive user password

If the Hadoop distribution is Cloudera (CDH):

  • Impala hostname
  • Impala port number
  • Impala user name
  • Impala password
  • Is Impala Kerberized?

If Impala is Kerberized:

  • Kerberos Realm
  • Kerberos host FQDN

If the Hadoop distribution is GCP:

  • Managed MongoDB URL, if the MongoDB is not managed on the same machine.
  • Are Infoworks directories already extracted in IW_HOME?

Run Installation

Run the following command to install Infoworks: ./install.sh -v <version_number>

NOTE: For machines without a certificate setup, the --certificate-check parameter can be set to false using the following syntax: ./install.sh -v <version_number> --certificate-check <true/false>. The default value is true. Setting it to false makes the installer perform insecure requests, which is not recommended.

To exclude a particular service, add the --exclude-services option followed by the service name. For example, to exclude Cube engine, use ./install.sh -v <version_number> --exclude-services cube-engine

  • For HDP on CentOS/RHEL 6, replace <version_number> with 2.9.0-hdp-rhel6
  • For HDP on CentOS/RHEL 7, replace <version_number> with 2.9.0-hdp-rhel7
  • For MapR or Cloudera on CentOS/RHEL 6, replace <version_number> with 2.9.0-rhel6
  • For MapR or Cloudera on CentOS/RHEL 7, replace <version_number> with 2.9.0-rhel7
  • For Azure, replace <version_number> with 2.9.0-azure
  • For GCP, replace <version_number> with 2.9.0-gcp
  • For EMR, replace <version_number> with 2.9.0-emr

NOTE: To find the RHEL version, run the following command: cat /etc/os-release or lsb_release -r
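The mapping from platform to <version_number> listed above can be expressed as a lookup, which helps when the installation is scripted for several clusters. This is a sketch only: the function name iw_version_number and the lowercase platform keywords (hdp, mapr, cloudera, azure, gcp, emr) are assumptions made for illustration, while the version strings are the ones given in this guide.

```shell
# Hypothetical helper (not part of the installer).
# Prints the <version_number> value for a platform and, where it applies,
# the CentOS/RHEL major version (6 or 7).
# Usage: iw_version_number <platform> [rhel-major-version]
iw_version_number() {
    local platform="$1"
    local rhel_major="${2:-}"

    case "$platform" in
        hdp|mapr|cloudera)
            # These platforms have separate builds for RHEL/CentOS 6 and 7.
            case "$rhel_major" in
                6|7) ;;
                *)
                    echo "expected major version 6 or 7, got: '$rhel_major'" >&2
                    return 1
                    ;;
            esac
            if [ "$platform" = "hdp" ]; then
                echo "2.9.0-hdp-rhel${rhel_major}"
            else
                echo "2.9.0-rhel${rhel_major}"
            fi
            ;;
        azure|gcp|emr)
            # Cloud builds do not depend on the RHEL version.
            echo "2.9.0-${platform}"
            ;;
        *)
            echo "unknown platform: '$platform'" >&2
            return 1
            ;;
    esac
}
```

With this in place, the install command for HDP on RHEL 7 could be written as ./install.sh -v "$(iw_version_number hdp 7)".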

The installation logs are available in <temporary_installer_directory>/iw-installer/logs/installer.log.

Silent Installation Procedure

To perform the installation offline, follow the steps below:

Step 1: Obtain the installer tarball and copy it to the target machine.

Step 2: Extract the installer by running the following command: tar -xf deploy_<version_number>.tar.gz

Step 3: Obtain the Infoworks DataFoundry tarball.

Step 4: Run the following commands to place the Infoworks DataFoundry tarball in the correct location:

mkdir iw-installer/downloads

cp infoworks-x.tar.gz iw-installer/downloads/

Step 5: Navigate to the installer directory by running the following command: cd iw-installer

Step 6: Continue from Step 4 (Configure installation) of the Installation Procedure.
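Step 4 of the offline procedure can be sketched as a small function that checks its inputs before copying. The function name stage_offline_tarball is hypothetical. The downloads directory under iw-installer comes from the commands above; the sketch uses mkdir -p instead of mkdir so that re-running it does not fail when the directory already exists.

```shell
# Hypothetical helper (not part of the installer).
# Copies the Infoworks DataFoundry tarball into <iw-installer>/downloads.
# Usage: stage_offline_tarball <infoworks-tarball> <iw-installer-directory>
stage_offline_tarball() {
    local tarball="$1"
    local installer_dir="$2"

    [ -f "$tarball" ] || { echo "tarball not found: $tarball" >&2; return 1; }
    [ -d "$installer_dir" ] || {
        echo "installer directory not found: $installer_dir" >&2
        return 1
    }

    # -p makes the step safe to repeat if downloads/ already exists.
    mkdir -p "$installer_dir/downloads" || return 1
    cp "$tarball" "$installer_dir/downloads/" || return 1

    # Print the staged path for confirmation.
    echo "$installer_dir/downloads/$(basename "$tarball")"
}
```

A call such as stage_offline_tarball infoworks-x.tar.gz iw-installer corresponds to the mkdir and cp commands in Step 4.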

Post Installation

If the target machine is Kerberos enabled, perform the following post-installation steps:

  • Go to <IW_HOME>/conf/conf.properties
  • Edit the Kerberos security settings as follows (ensure that these settings are uncommented):

NOTE: Kerberos tickets are renewed before every Infoworks DataFoundry job runs. The Infoworks DataFoundry platform supports a single Kerberos principal per Kerberized cluster. All Infoworks DataFoundry jobs therefore run under the same Kerberos principal, which must have access to all the artifacts in Hive, Spark, and HDFS.
