GCP Deployment

Infoworks automates the creation and operation of big data workflows from source to consumption, both on-premises and in the cloud, helping customers reach production in days or weeks with 5x fewer people. Infoworks has introduced the world's first Autonomous Data Engine (ADE), which applies an unprecedented level of automation to data workflows and data engineering to eliminate big data complexity on-premises and in the cloud.

This article explains how to install and configure Infoworks DataFoundry (DF) on the Google Cloud Platform.

Prerequisites

  • The CPU core quota must be greater than 72 for the region in which Infoworks is being deployed.
  • APIs must be enabled for the Dataproc, Compute Engine, Deployment Manager, and Runtime Configuration services.
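Both prerequisites can be checked from the Cloud SDK. The sketch below prints the equivalent gcloud commands rather than running them, since they require an authenticated gcloud environment; `us-central1` is a placeholder region, not a recommendation.

```shell
# Sketch: gcloud commands for the prerequisites above. Printed, not executed,
# because they require an authenticated Cloud SDK session in your project.
APIS="dataproc.googleapis.com compute.googleapis.com deploymentmanager.googleapis.com runtimeconfig.googleapis.com"
echo "gcloud services enable $APIS"

# Inspect the regional quotas (the CPUs quota must exceed 72 cores).
REGION="us-central1"   # placeholder; use the region you deploy to
echo "gcloud compute regions describe $REGION --format='value(quotas)'"
```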
Infoworks Deployment Architecture


Installing and Configuring Infoworks

Setting Up a Google Account

A Google account is required to work with GCP.

Deploying Infoworks Autonomous Data Engine

  • Choose a name for your deployment instance.
  • Configure the settings for the following machines:
      • Dataproc - The Dataproc Hadoop cluster on which the jobs will run.
      • Infoworks Node - The host on which the Infoworks services are installed. Users access the Infoworks ADE, which runs on this node.
      • Metadata Server - The machine that runs the service storing Infoworks metadata.
      • Metadata Server Boot Disk - The disk on which the operating system and the metadata server process reside.
      • Metadata Server Persistent Disk - The disk on which the actual metadata is persisted.

NOTE: The Dataproc cluster might require significant resources and an extension of the quota for the zone used for the deployment.

  • Click Deploy.

Once the deployment is successful, a confirmation window is displayed.

From this page, you can select further steps, including the following:

  • SSH - allows you to log in to the Infoworks server.
  • Web Interface - allows you to access the Infoworks ADE via the web interface.

Accessing Infoworks ADE

On the web interface, enter the login credentials and click Sign In. The email ID is admin@infoworks.io, and the password can be obtained by writing to cloud@infoworks.io.

The Infoworks dashboard is displayed on successful login. It is recommended that you change the password, as described below.

Changing Password

  • In the Infoworks ADE, click Infoworks Admin > Settings and click the Change Password icon.
  • Enter the old and new passwords and click Update Password. The password will be updated.

License Key

To get started with Infoworks ADE on the Google Cloud Platform, you will need a license key. Log in to the Infoworks ADE and navigate to Admin > License Manager.

For instructions, see the License Management document and write to cloud@infoworks.io for the license key.

Configuration

For a deeper understanding of the system, you can view the settings in the configuration files as follows:

  • Click the SSH option. The command prompt of the Infoworks server is displayed.
  • Switch to the default Infoworks Unix user (infoworks) using the command: sudo su - infoworks
  • Navigate to the Infoworks configuration directory using the command: cd /opt/infoworks/conf
  • The configuration files are located in this directory. The basic configurations are included in the conf.properties file.

NOTE: It is recommended to add or overwrite configuration parameters only through the Infoworks web interface (Admin > Configuration).
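The console steps above can be collected into one short session. Because these commands are meant to run on the Infoworks node itself (reached via the SSH option), they are shown here as a printed transcript rather than executed:

```shell
# Sketch of an inspection session on the Infoworks node. Shown as a
# transcript; run these commands after connecting via the SSH option.
SESSION='sudo su - infoworks      # switch to the default Infoworks Unix user
cd /opt/infoworks/conf            # the Infoworks configuration directory
ls                                # list the configuration files
cat conf.properties               # view the basic configuration file'
printf '%s\n' "$SESSION"
```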

Accessing REST API and Cube Server

Click the Infoworks REST API and Infoworks Cube Server links to access the REST API and cube server respectively.

Viewing VM Instances

Log in to the Google Cloud Console. Click the menu icon at the top left and click Compute Engine > VM Instances.

The list of VM instances in the project is displayed, including the instances used by Infoworks.
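The same list can be retrieved from the Cloud SDK. The sketch below prints the command (it requires an authenticated gcloud session to actually run); the `name~infoworks` filter is a hypothetical pattern for narrowing the list to Infoworks instances, assuming their names contain "infoworks".

```shell
# Sketch: list the project's VM instances from the CLI. Printed rather
# than executed, since it requires an authenticated Cloud SDK session.
LIST_CMD="gcloud compute instances list --filter='name~infoworks'"
echo "$LIST_CMD"
```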

Scaling the Dataproc Cluster

Following are the instructions to scale the Dataproc cluster up or down:

  • Click the menu icon at the top left and click Dataproc > Clusters.
  • Click the cluster you want to scale.
  • Click the Configuration tab and click the Edit button.
  • Increase or decrease the number of worker nodes.
  • Click Save. The worker node count is updated.
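The same scaling operation is available from the Cloud SDK via `gcloud dataproc clusters update`. In this sketch the cluster name, region, and worker count are placeholders, and the command is printed rather than run because it requires an authenticated gcloud environment:

```shell
# Sketch: scale a Dataproc cluster from the CLI. All values below are
# placeholders; the command is printed, not executed.
CLUSTER="infoworks-dataproc"   # hypothetical cluster name
REGION="us-central1"           # placeholder region
WORKERS=4                      # desired number of worker nodes
SCALE_CMD="gcloud dataproc clusters update $CLUSTER --region=$REGION --num-workers=$WORKERS"
echo "$SCALE_CMD"
```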

Metadata Server Access Credentials

The Infoworks Metadata Server is hosted on a dedicated VM instance. It runs a MongoDB instance with two users: admin and infoworks.

Write to cloud@infoworks.io to obtain the default passwords of these users.
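Once you have the passwords, you can connect with the standard mongo shell. The sketch below prints the command rather than running it; the hostname is a placeholder, 27017 is MongoDB's default port, and the assumption that users authenticate against the admin database is mine, not stated in the original.

```shell
# Sketch: connect to the metadata MongoDB with the mongo shell. Printed,
# not executed; hostname and auth database are assumptions.
MDS_HOST="metadata-server"     # placeholder hostname of the metadata VM
MONGO_CMD="mongo --host $MDS_HOST --port 27017 -u infoworks -p <password> --authenticationDatabase admin"
echo "$MONGO_CMD"
```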

Getting Help

You can contact Infoworks support for any queries at support@infoworks.io.
