Hands-on Big Data Governance with Cloudera Navigator

The Information Asset team brought Cloudera Navigator into our Big Data lab. Cloudera Navigator provides metadata capabilities within Hadoop. In Figure 1, we were able to view data lineage that includes a Sqoop job (Supplies), Cloudera, an output file (part-m-00000), an HDFS file (Contacts.csv), a Hive table (contact_details), and a Hive job. Figure 1: Hadoop data lineage with […]


Integrating Oracle Enterprise Metadata Manager with Hadoop

The Information Asset team recently brought Oracle Enterprise Metadata Manager (OEMM) into our big data lab. Although OEMM supports metadata integration with several repositories, we wanted to test the Hadoop integration. Our use case was to harvest the Hive tables from Cloudera’s distribution of Apache Hadoop into OEMM. We first installed the drivers to connect […]


First Take – InfoSphere Stewardship Center

The Information Asset team has been working closely with the IBM InfoSphere tooling at a number of clients. We had a chance to view a demo of the new business process management (BPM) capabilities within IBM InfoSphere Stewardship Center. IBM InfoSphere Stewardship Center is a newly released capability that is integrated with IBM InfoSphere Information Governance […]


Hands-on Big Data Governance with Waterline Data Science

We recently brought Waterline Data Science into the Information Asset Big Data Lab for hands-on testing. Waterline is a VC-funded startup. The company is run by some of my former IBM colleagues, including Alex Gorelik and Oliver Claude, so I was interested in their newly released product. Waterline has positioned itself as the “Amazon of Big […]
