GCP Deployment

Infoworks automates the creation and operation of big data workflows from source to consumption, both on-premises and in the cloud, helping customers reach production in days or weeks with 5x fewer people. Infoworks has introduced the world's first Autonomous Data Engine (ADE), which applies an unprecedented level of automation to data workflows and data engineering to eliminate big data complexity on-premises and in the cloud.

This article explains how to install and configure Infoworks DataFoundry (DF) on the Google Cloud Platform.

Prerequisites

  • The CPU core quota must be greater than 72 for the region in which Infoworks is being deployed.
  • APIs must be enabled for the Dataproc, Compute Engine, Deployment Manager, and Runtime Configuration services.
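Both prerequisites can be checked from the Cloud SDK. The sketch below prints the equivalent gcloud commands rather than running them, since they require an authenticated gcloud environment; `us-central1` is a placeholder region, not a recommendation.

```shell
# Sketch: gcloud commands for the prerequisites above. Printed, not executed,
# because they require an authenticated Cloud SDK session in your project.
APIS="dataproc.googleapis.com compute.googleapis.com deploymentmanager.googleapis.com runtimeconfig.googleapis.com"
echo "gcloud services enable $APIS"

# Inspect the regional quotas (the CPUs quota must exceed 72 cores).
REGION="us-central1"   # placeholder; use the region you deploy to
echo "gcloud compute regions describe $REGION --format='value(quotas)'"
```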
Infoworks Deployment Architecture


Installing and Configuring Infoworks

Setting Up a Google Account

A Google account is required to work with GCP.

Deploying Infoworks Autonomous Data Engine

  • Choose a name for your deployment instance.
  • Configure the settings for the following machines:
      • Dataproc - The Dataproc Hadoop cluster on which the jobs will run.
      • Infoworks Node - The host on which the Infoworks services are installed. Users access the Infoworks ADE, which runs on this node.
      • Metadata Server - The machine that runs the service storing Infoworks metadata.
      • Metadata Server Boot Disk - The disk on which the operating system and the metadata server process reside.
      • Metadata Server Persistent Disk - The disk on which the actual metadata is persisted.

NOTE: The Dataproc cluster might require significant resources and an extension of the quota for the zone used for the deployment.

  • Click Deploy.

Once the deployment is successful, a confirmation window is displayed.

From this page, you can select further steps, including the following:

  • SSH - allows you to log in to the Infoworks server.
  • Web Interface - allows you to access the Infoworks ADE via the web interface.

Accessing Infoworks ADE

On the web interface, enter the login credentials and click Sign In. The email ID is admin@infoworks.io, and the password can be obtained by writing to cloud@infoworks.io.

The Infoworks dashboard is displayed on successful login. It is recommended that you change the password, as described below.

Changing Password

  • In the Infoworks ADE, click Infoworks Admin > Settings and click the Change Password icon.
  • Enter the old and new passwords and click Update Password. The password will be updated.

License Key

To get started with Infoworks ADE on the Google Cloud Platform, you will need a license key. Log in to the Infoworks ADE and navigate to Admin > License Manager.

For instructions, see the License Management document and write to cloud@infoworks.io for the license key.

Configuration

For a deeper understanding of the system, you can view the settings in the configuration files as follows:

  • Click the SSH option. The command prompt of the Infoworks server is displayed.
  • Switch to the default Infoworks Unix user (infoworks) using the command: sudo su - infoworks
  • Navigate to the Infoworks configuration directory using the command: cd /opt/infoworks/conf
  • The configuration files are located in this directory. The basic configurations are included in the conf.properties file.

NOTE: It is recommended to add or overwrite configuration parameters only through the Infoworks web interface (Admin > Configuration).
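The console steps above can be collected into one short session. Because these commands are meant to run on the Infoworks node itself (reached via the SSH option), they are shown here as a printed transcript rather than executed:

```shell
# Sketch of an inspection session on the Infoworks node. Shown as a
# transcript; run these commands after connecting via the SSH option.
SESSION='sudo su - infoworks      # switch to the default Infoworks Unix user
cd /opt/infoworks/conf            # the Infoworks configuration directory
ls                                # list the configuration files
cat conf.properties               # view the basic configuration file'
printf '%s\n' "$SESSION"
```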

Accessing REST API and Cube Server

Click the Infoworks REST API and Infoworks Cube Server links to access the REST API and cube server respectively.

Viewing VM Instances

Log in to the Google Cloud Console. Click the menu icon at the top left and click Compute Engine > VM Instances.

The list of VM instances in the project is displayed, including the instances used by Infoworks.
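The same list can be retrieved from the Cloud SDK. The sketch below prints the command (it requires an authenticated gcloud session to actually run); the `name~infoworks` filter is a hypothetical pattern for narrowing the list to Infoworks instances, assuming their names contain "infoworks".

```shell
# Sketch: list the project's VM instances from the CLI. Printed rather
# than executed, since it requires an authenticated Cloud SDK session.
LIST_CMD="gcloud compute instances list --filter='name~infoworks'"
echo "$LIST_CMD"
```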

Scaling the Dataproc Cluster

Following are the instructions to scale the Dataproc cluster up or down:

  • Click the menu icon at the top left and click Dataproc > Clusters.
  • Click the cluster you want to scale.
  • Click the Configuration tab and click the Edit button.
  • Increase or decrease the number of worker nodes.
  • Click Save. The worker node count is updated.
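The same scaling operation is available from the Cloud SDK via `gcloud dataproc clusters update`. In this sketch the cluster name, region, and worker count are placeholders, and the command is printed rather than run because it requires an authenticated gcloud environment:

```shell
# Sketch: scale a Dataproc cluster from the CLI. All values below are
# placeholders; the command is printed, not executed.
CLUSTER="infoworks-dataproc"   # hypothetical cluster name
REGION="us-central1"           # placeholder region
WORKERS=4                      # desired number of worker nodes
SCALE_CMD="gcloud dataproc clusters update $CLUSTER --region=$REGION --num-workers=$WORKERS"
echo "$SCALE_CMD"
```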

Metadata Server Access Credentials

The Infoworks Metadata Server is hosted on a dedicated VM instance. It runs a MongoDB instance with two users: admin and infoworks.

Write to cloud@infoworks.io to obtain the default passwords of these users.
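Once you have the passwords, you can connect with the standard mongo shell. The sketch below prints the command rather than running it; the hostname is a placeholder, 27017 is MongoDB's default port, and the assumption that users authenticate against the admin database is mine, not stated in the original.

```shell
# Sketch: connect to the metadata MongoDB with the mongo shell. Printed,
# not executed; hostname and auth database are assumptions.
MDS_HOST="metadata-server"     # placeholder hostname of the metadata VM
MONGO_CMD="mongo --host $MDS_HOST --port 27017 -u infoworks -p <password> --authenticationDatabase admin"
echo "$MONGO_CMD"
```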

Getting Help

You can contact Infoworks support for any queries at support@infoworks.io.
