The Information Asset team recently brought Oracle Enterprise Metadata Manager (OEMM) into our big data lab. OEMM supports metadata integration with several repositories, but we specifically wanted to test its Hadoop integration.

Our use case was to harvest the Hive tables from Cloudera’s distribution of Apache Hadoop into OEMM. We first installed the drivers needed to connect to Hive and then copied the dependent jar files from the Cloudera installation (see Figure 1).

Figure 1: Configuring the Hive drivers within OEMM

I won’t bother you with every detail, but the setup came down to three steps. After installing the Hive drivers, we set the classpath in the environment variables, selected Cloudera Impala (Hadoop Hive) in the bridge section of OEMM, and specified the connection string along with the credentials to connect to the Hive server.
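If the bridge fails to connect, it can help to check the classpath and the connection string outside OEMM first. The following is a minimal sketch of such a check, not something OEMM itself requires. It assumes the Apache Hive JDBC driver (`org.apache.hive.jdbc.HiveDriver`) and the standard HiveServer2 URL form `jdbc:hive2://host:port/database`; the class name `HiveConnectionCheck`, the host name, and the database name are hypothetical placeholders, and a Cloudera-supplied driver may use a different class name.

```java
// Sanity check for the Hive driver classpath and JDBC URL (illustrative sketch).
final class HiveConnectionCheck {

    // Assumption: the Apache Hive JDBC driver; other drivers use other class names.
    static final String HIVE_DRIVER_CLASS = "org.apache.hive.jdbc.HiveDriver";

    // Builds a HiveServer2-style JDBC URL from its parts.
    static String buildUrl(String host, int port, String database) {
        if (host == null || host.isEmpty()) {
            throw new IllegalArgumentException("host must not be empty");
        }
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
        String db = (database == null) ? "" : database;
        return "jdbc:hive2://" + host + ":" + port + "/" + db;
    }

    // Returns true only if the driver class and its dependencies can be loaded.
    static boolean driverOnClasspath() {
        try {
            Class.forName(HIVE_DRIVER_CLASS);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Driver jar missing, or one of its dependent jars is missing.
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical host and database; 10000 is the default HiveServer2 port.
        String url = buildUrl("hive-host.example.com", 10000, "default");
        System.out.println("JDBC URL: " + url);
        System.out.println("Hive driver loadable: " + driverOnClasspath());
    }
}
```

Running this with the same classpath that OEMM uses shows whether the driver and its dependent jars can be loaded before any bridge import is attempted. It does not open a connection, so the credentials still have to be verified through the bridge itself.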

Once the import process completed successfully, we opened the model in OEMM to view the imported Hive database and tables (see Figure 2).

Figure 2: Imported Hive database and tables in OEMM