Setting Up an EMR Cluster with Infoworks EdgeNode

NOTES:

  • The AWS Marketplace solution is still being built and is not yet available.
  • Currently, only EMR version 5.17 is supported.

Prerequisites

  • The customer's AWS Account ID must be available so that it can be whitelisted for access to the Infoworks EdgeNode.
  • The EMR cluster and the Infoworks EdgeNode must use the same Subnet ID.
  • The Security Group must include inbound rules that allow internal traffic between the EMR cluster and the Infoworks EdgeNode.
  • The private IP address of the EMR cluster master node must be available.
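
The prerequisites above can be checked before any resources are created. The following is a minimal sketch, not part of the Infoworks tooling: the function name `check_prerequisites` and its parameters are hypothetical, and it only validates the format of the values (12-digit account ID, matching subnet IDs, a private master node address).

```python
import ipaddress


def check_prerequisites(account_id, emr_subnet_id, edge_subnet_id, master_private_ip):
    """Return a list of problems; an empty list means all checks passed."""
    problems = []
    # AWS account IDs are 12-digit numbers.
    if not (account_id.isascii() and account_id.isdigit() and len(account_id) == 12):
        problems.append("AWS Account ID must be a 12-digit number")
    # The EMR cluster and the EdgeNode must share one subnet.
    if emr_subnet_id != edge_subnet_id:
        problems.append("EMR cluster and EdgeNode must use the same Subnet ID")
    # The master node address must parse as an IP address in a private range.
    try:
        if not ipaddress.ip_address(master_private_ip).is_private:
            problems.append("master node IP address is not private")
    except ValueError:
        problems.append("master node IP address is not valid")
    return problems
```

An empty result means the values are at least well-formed; it does not confirm that the account has actually been whitelisted.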

Installing and Configuring EMR Cluster in AWS

  • Log in to the AWS Console.
  • Search for EMR under Find Services on the AWS Console dashboard.
  • In the EMR dashboard, select Create cluster, then switch to Advanced Options at the top of the cluster parameters section. This option lets you select the applications that Infoworks requires.

Software and Steps

  • Select 5.17 as the EMR release version from the drop-down list, then select the applications that Infoworks requires.
  • Click Next at the bottom of the page to proceed to the next blade.

Hardware

The hardware blade includes the networking, AZ/Subnet and instance type sections.

  • Select the VPC from the drop-down list in the Network section and select the appropriate subnet in the Subnet section.
  • Root device EBS volume size: set the root device volume size to a value between 10 and 100 GB.
  • Configure the machine type of the Master, Core, and Task nodes. Infoworks recommends a minimum of m4.4xlarge for the Master node and m4.2xlarge for Core and Task nodes.

The recommended minimum machine types for the Master, Core, and Task nodes are as follows:

m4.4xlarge or similar, with 16 vCPUs and 64 GB of RAM, for the Master node.

m4.2xlarge or similar, with 8 vCPUs and 32 GB of RAM, for Core and Task nodes.

  • Click Next to proceed to the General Cluster Configurations blade.

General Cluster Configurations

  • Enter a name for the cluster in the Cluster name section, following your naming convention.
  • Termination protection: enable this option to prevent accidental termination of the instances.
  • Create tags for the resources in the Tags section.
  • Click Next to proceed to the final blade, Security.

Security

  • In the Security options section, select the key pair to use for SSH access to the cluster.
  • By default, EMR assigns the required roles and security groups.
  • Click Create Cluster.
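
The console steps above can also be scripted. The following is a minimal sketch that builds the keyword arguments for the boto3 EMR `run_job_flow` call. It is an illustration under stated assumptions, not an Infoworks-provided script: the function name `build_emr_cluster_request` is hypothetical, release 5.17 is assumed to map to the release label `emr-5.17.0`, the default EMR roles are assumed, the core node count of 2 is a placeholder, and the list of applications must be supplied by you because this guide does not enumerate them.

```python
def build_emr_cluster_request(name, applications, subnet_id, key_name, tags,
                              core_count=2, root_volume_gb=10):
    """Build keyword arguments for boto3's EMR run_job_flow call."""
    # The Hardware blade allows a root device volume of 10-100 GB.
    if not 10 <= root_volume_gb <= 100:
        raise ValueError("root device volume must be between 10 and 100 GB")
    return {
        "Name": name,
        "ReleaseLabel": "emr-5.17.0",  # assumed label for EMR 5.17
        "Applications": [{"Name": app} for app in applications],
        "EbsRootVolumeSize": root_volume_gb,
        "Instances": {
            "InstanceGroups": [
                {"Name": "Master", "InstanceRole": "MASTER",
                 "InstanceType": "m4.4xlarge", "InstanceCount": 1},
                {"Name": "Core", "InstanceRole": "CORE",
                 "InstanceType": "m4.2xlarge", "InstanceCount": core_count},
            ],
            "Ec2SubnetId": subnet_id,
            "Ec2KeyName": key_name,
            "KeepJobFlowAliveWhenNoSteps": True,
            "TerminationProtected": True,
        },
        # Default roles, assumed to exist in the account.
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
        "Tags": [{"Key": k, "Value": v} for k, v in tags.items()],
    }


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   request = build_emr_cluster_request(...)
#   boto3.client("emr").run_job_flow(**request)
```

Keeping the request as a plain dictionary makes it possible to review the settings before anything is created.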

Deploying Infoworks EdgeNode

  • Log in to the AWS Console.
  • Search for EC2 under Find Services on the AWS Console dashboard.

Choose AMI

  • Select Launch Instance from the EC2 Dashboard, then select the image from the My AMIs section.
  • NOTE: The AMI ID might be different.

Choose Instance Type

  • Select the machine type for the Infoworks EdgeNode. The recommended minimum is m4.4xlarge.

Configure Instance

  • Set the number of instances to 1.
  • Select the same VPC and Subnet ID as the EMR cluster.

Add Storage

  • Set the root volume storage size in GB, for example, 300 GB.

Add Tags

  • Add tags for the resource according to your naming convention or environment.

Configure Security Group

  • Create a new security group that allows the Infoworks (IW) ports and SSH.

Review

  • Review the configuration, select an existing key pair or create a new one, and then launch the instance.
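
The EdgeNode launch settings above can be captured as arguments for the boto3 EC2 `run_instances` call. The following is a minimal sketch under stated assumptions: the function name `build_edgenode_request` is hypothetical, the AMI ID and security group ID must be supplied by you, and the root device name `/dev/xvda` is an assumption that depends on the AMI.

```python
def build_edgenode_request(ami_id, subnet_id, security_group_id, key_name,
                           tags, root_volume_gb=300, root_device="/dev/xvda"):
    """Build keyword arguments for boto3's EC2 run_instances call."""
    return {
        "ImageId": ami_id,
        "InstanceType": "m4.4xlarge",
        # Exactly one EdgeNode instance is launched.
        "MinCount": 1,
        "MaxCount": 1,
        # Use the same subnet as the EMR cluster.
        "SubnetId": subnet_id,
        "SecurityGroupIds": [security_group_id],
        "KeyName": key_name,
        "BlockDeviceMappings": [
            # root_device is an assumption; check the AMI's actual root device.
            {"DeviceName": root_device, "Ebs": {"VolumeSize": root_volume_gb}},
        ],
        "TagSpecifications": [
            {"ResourceType": "instance",
             "Tags": [{"Key": k, "Value": v} for k, v in tags.items()]},
        ],
    }


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("ec2").run_instances(**build_edgenode_request(...))
```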

IMPORTANT: Add an inbound rule to the EMR Security Group that allows all traffic from the EdgeNode Security Group ID.
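
The inbound rule described in this note can be expressed as arguments for the boto3 EC2 `authorize_security_group_ingress` call. This is a minimal sketch: the function name `build_emr_ingress_rule` is hypothetical, and both security group IDs are values you supply. If the EMR cluster uses separate security groups for master and core/task nodes, the rule would presumably need to be applied to each of them.

```python
def build_emr_ingress_rule(emr_sg_id, edge_sg_id):
    """Build arguments that allow all traffic from the EdgeNode group into the EMR group."""
    return {
        "GroupId": emr_sg_id,
        "IpPermissions": [
            # IpProtocol "-1" means all protocols and all ports.
            {"IpProtocol": "-1", "UserIdGroupPairs": [{"GroupId": edge_sg_id}]},
        ],
    }


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("ec2").authorize_security_group_ingress(
#       **build_emr_ingress_rule("<EMR security group ID>", "<EdgeNode security group ID>"))
```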

SSH to EdgeNode

Switch to root user using the following commands:

Perform a sanity check by running HDFS commands and the Hive shell on the EdgeNode.
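
The sanity check can be automated with a short script run on the EdgeNode. The following is a minimal sketch: the two commands (`hdfs dfs -ls /` and `hive -e "show databases;"`) are assumed examples of an HDFS command and a Hive query, not commands prescribed by this guide, and the names `SANITY_COMMANDS` and `run_sanity_checks` are hypothetical.

```python
import subprocess

# Assumed example commands; replace them with the checks you need.
SANITY_COMMANDS = [
    ["hdfs", "dfs", "-ls", "/"],
    ["hive", "-e", "show databases;"],
]


def run_sanity_checks(commands=SANITY_COMMANDS, runner=subprocess.run):
    """Run each command and return (command, succeeded) pairs."""
    results = []
    for cmd in commands:
        try:
            completed = runner(cmd, capture_output=True, text=True)
            ok = completed.returncode == 0
        except FileNotFoundError:
            # The executable is not installed or not on the PATH.
            ok = False
        results.append((" ".join(cmd), ok))
    return results


if __name__ == "__main__":
    for command, ok in run_sanity_checks():
        print(("OK    " if ok else "FAILED"), command)
```

A failed entry usually indicates that the Hadoop or Hive client is missing from the EdgeNode or that it cannot reach the EMR master node.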

Install Infoworks manually on the EdgeNode.
