Prerequisites

NOTE: In the following points, Infoworks edge node refers to the Infoworks installation where the replicator will run for the migration. This can be on the source side, the destination side, or some other edge node.

Ensure that, as the Infoworks user on the Infoworks edge node, you are able to perform the following:

  • Run hadoop dfs -ls on the source directories to be migrated.
  • Run hadoop dfs -cat on a few files present in the source directories to be migrated.
  • Run hadoop dfs -mkdir on the destination paths where the migrated data will reside on the destination cluster.
  • Run hadoop dfs -put into the directory created with -mkdir in the previous step.
  • Run hadoop dfs -cat on a file created with -put in the previous step.
  • Query the source cluster's Hive server from the Infoworks edge node.
  • Query the destination cluster's Hive server from the Infoworks edge node.
  • Create entities in the destination Hive.
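The edge-node checks above can be sketched as a single script run as the Infoworks user. Everything cluster-specific here is a placeholder: the source and destination directories, the Hive JDBC URLs, and the database name are illustrative, and beeline stands in for whichever Hive client your distribution ships.

```shell
#!/usr/bin/env bash
# Illustrative precheck; SRC_DIR, DST_DIR, and the JDBC URLs are placeholders.
SRC_DIR=${SRC_DIR:-/data/to_migrate}
DST_DIR=${DST_DIR:-/data/migrated}

precheck() {
    if command -v hadoop >/dev/null 2>&1; then
        hadoop dfs -ls "$SRC_DIR" || return 1               # list source dirs
        # pick the first plain file in the listing and read it
        sample=$(hadoop dfs -ls "$SRC_DIR" | awk '$1 ~ /^-/ {print $NF; exit}')
        hadoop dfs -cat "$sample" >/dev/null || return 1    # read source data
        hadoop dfs -mkdir "$DST_DIR/_precheck" || return 1  # write on destination
        echo precheck-ok > /tmp/_precheck &&
            hadoop dfs -put /tmp/_precheck "$DST_DIR/_precheck/" &&
            hadoop dfs -cat "$DST_DIR/_precheck/_precheck"  # read back what was written
    else
        echo "SKIP: hadoop CLI not on PATH"
    fi
    if command -v beeline >/dev/null 2>&1; then
        beeline -u "jdbc:hive2://source-hive-host:10000" -e "SHOW DATABASES;"
        beeline -u "jdbc:hive2://dest-hive-host:10000" \
                -e "CREATE DATABASE IF NOT EXISTS iw_precheck_db;"
    else
        echo "SKIP: beeline CLI not on PATH"
    fi
}
precheck
```

The script skips any check whose CLI is not on the PATH, so it can also be dropped unchanged onto a data node for the verification described below.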

Also ensure that all of the above steps, which were run from the edge node, are executable from the data nodes of the cluster where the migration will be executed. This can be verified by testing the steps on at least one data node.
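One way to repeat the edge-node checks across data nodes is to copy a precheck script to each node over SSH and run it there. The hostnames and the script path below are hypothetical; substitute the data nodes of your cluster and wherever you saved your checks.

```shell
#!/usr/bin/env bash
# Re-run the edge-node precheck on each data node.
# DATA_NODES and /tmp/iw_precheck.sh are placeholders.
DATA_NODES=${DATA_NODES:-"datanode01 datanode02"}
SSH_OPTS="-o BatchMode=yes -o ConnectTimeout=5"

check_nodes() {
    if ! command -v ssh >/dev/null 2>&1; then
        echo "SKIP: ssh not available"
        return 0
    fi
    for host in $DATA_NODES; do
        echo "checking $host"
        scp $SSH_OPTS /tmp/iw_precheck.sh "$host:/tmp/" &&
            ssh $SSH_OPTS "$host" "bash /tmp/iw_precheck.sh" ||
            echo "WARN: precheck failed on $host"
    done
}
check_nodes
```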

Required Ports

The replicator uses the following ports for communication:

NOTE: Ensure that these ports are open on the cluster on which the replicator is not running, so that the replicator can reach that cluster.

Component         Service    Configuration                    Port       Access Requirement  Qualifier  Comments
Hive Thrift Port  Metastore  hive.metastore.uris              9083       External            -          -
Hadoop NameNode   NameNode   fs.default.name or fs.defaultFS  8020       External            -          -
Datanode Ports    DataNode   dfs.datanode.address             1004/1019  External            Secure     See the cluster configuration to obtain the correct port.
Datanode Ports    DataNode   dfs.datanode.address             50010      External            -          -
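Reachability of these ports can be probed from the cluster that must connect to them. The hostnames below are placeholders for your metastore, NameNode, and DataNode hosts; substitute the secure DataNode ports (1004/1019) where applicable.

```shell
#!/usr/bin/env bash
# Probe the replicator ports from the remote cluster.
# Hostnames are placeholders; substitute your own.
probe() {  # usage: probe <host> <port>
    if timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null; then
        echo "$1:$2 open"
    else
        echo "$1:$2 unreachable"
    fi
}
probe metastore-host 9083   # Hive metastore (hive.metastore.uris)
probe namenode-host  8020   # NameNode (fs.default.name / fs.defaultFS)
probe datanode-host  50010  # DataNode (dfs.datanode.address)
```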