Prerequisites

NOTE: In the following points, Infoworks edge node refers to the Infoworks installation where the replicator will run for the migration. This can be on the source side, the destination side, or some other edge node.

Ensure that, as the Infoworks user on the Infoworks edge node, you are able to perform the following:

  • Run hadoop dfs -ls on the source directories to be migrated.
  • Run hadoop dfs -cat on a few files present in the source directories to be migrated.
  • Run hadoop dfs -mkdir on the destination paths where the migrated data will reside on the destination cluster.
  • Run hadoop dfs -put into the directory created with -mkdir in the previous step.
  • Run hadoop dfs -cat on a file created with -put in the previous step.
  • Query the source cluster's Hive server from the Infoworks edge node.
  • Query the destination cluster's Hive server from the Infoworks edge node.
  • Create entities in the destination Hive.
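The edge-node checks above can be sketched as a single script run as the Infoworks user. Everything cluster-specific here is a placeholder: the source and destination directories, the Hive JDBC URLs, and the database name are illustrative, and beeline stands in for whichever Hive client your distribution ships.

```shell
#!/usr/bin/env bash
# Illustrative precheck; SRC_DIR, DST_DIR, and the JDBC URLs are placeholders.
SRC_DIR=${SRC_DIR:-/data/to_migrate}
DST_DIR=${DST_DIR:-/data/migrated}

precheck() {
    if command -v hadoop >/dev/null 2>&1; then
        hadoop dfs -ls "$SRC_DIR" || return 1               # list source dirs
        # pick the first plain file in the listing and read it
        sample=$(hadoop dfs -ls "$SRC_DIR" | awk '$1 ~ /^-/ {print $NF; exit}')
        hadoop dfs -cat "$sample" >/dev/null || return 1    # read source data
        hadoop dfs -mkdir "$DST_DIR/_precheck" || return 1  # write on destination
        echo precheck-ok > /tmp/_precheck &&
            hadoop dfs -put /tmp/_precheck "$DST_DIR/_precheck/" &&
            hadoop dfs -cat "$DST_DIR/_precheck/_precheck"  # read back what was written
    else
        echo "SKIP: hadoop CLI not on PATH"
    fi
    if command -v beeline >/dev/null 2>&1; then
        beeline -u "jdbc:hive2://source-hive-host:10000" -e "SHOW DATABASES;"
        beeline -u "jdbc:hive2://dest-hive-host:10000" \
                -e "CREATE DATABASE IF NOT EXISTS iw_precheck_db;"
    else
        echo "SKIP: beeline CLI not on PATH"
    fi
}
precheck
```

The script skips any check whose CLI is not on the PATH, so it can also be dropped unchanged onto a data node for the verification described below.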

Also ensure that all of the above steps, which were run from the edge node, are executable from the data nodes of the cluster where the migration will be executed. This can be verified by testing the steps on at least one data node.
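One way to repeat the edge-node checks across data nodes is to copy a precheck script to each node over SSH and run it there. The hostnames and the script path below are hypothetical; substitute the data nodes of your cluster and wherever you saved your checks.

```shell
#!/usr/bin/env bash
# Re-run the edge-node precheck on each data node.
# DATA_NODES and /tmp/iw_precheck.sh are placeholders.
DATA_NODES=${DATA_NODES:-"datanode01 datanode02"}
SSH_OPTS="-o BatchMode=yes -o ConnectTimeout=5"

check_nodes() {
    if ! command -v ssh >/dev/null 2>&1; then
        echo "SKIP: ssh not available"
        return 0
    fi
    for host in $DATA_NODES; do
        echo "checking $host"
        scp $SSH_OPTS /tmp/iw_precheck.sh "$host:/tmp/" &&
            ssh $SSH_OPTS "$host" "bash /tmp/iw_precheck.sh" ||
            echo "WARN: precheck failed on $host"
    done
}
check_nodes
```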

Required Ports

The replicator uses the following ports for communication:

NOTE: Ensure that these ports are open on the cluster on which the replicator is not running, so that the replicator can reach that cluster.

Component         Service    Configuration                    Port       Access Requirement  Qualifier  Comments
Hive Thrift Port  Metastore  hive.metastore.uris              9083       External            -          -
Hadoop NameNode   NameNode   fs.default.name or fs.defaultFS  8020       External            -          -
Datanode Ports    DataNode   dfs.datanode.address             1004/1019  External            Secure     See the cluster configuration to obtain the correct port.
Datanode Ports    DataNode   dfs.datanode.address             50010      External            -          -
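Reachability of these ports can be probed from the cluster that must connect to them. The hostnames below are placeholders for your metastore, NameNode, and DataNode hosts; substitute the secure DataNode ports (1004/1019) where applicable.

```shell
#!/usr/bin/env bash
# Probe the replicator ports from the remote cluster.
# Hostnames are placeholders; substitute your own.
probe() {  # usage: probe <host> <port>
    if timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null; then
        echo "$1:$2 open"
    else
        echo "$1:$2 unreachable"
    fi
}
probe metastore-host 9083   # Hive metastore (hive.metastore.uris)
probe namenode-host  8020   # NameNode (fs.default.name / fs.defaultFS)
probe datanode-host  50010  # DataNode (dfs.datanode.address)
```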