Hands-on Big Data Governance with Cloudera Navigator

The Information Asset team brought Cloudera Navigator into our Big Data lab. Cloudera Navigator supports metadata capabilities within Hadoop. As shown in Figure 1, we were able to view data lineage that includes a Sqoop job (Supplies), Cloudera, an output file (part-m-00000), an HDFS file (Contacts.csv), a Hive table (contact_details) and a Hive job. (Figure 1: Hadoop data lineage with […])

Integrating Oracle Enterprise Metadata Manager with Hadoop

The Information Asset team recently brought Oracle Enterprise Metadata Manager (OEMM) into our big data lab. Although OEMM supports metadata integration with several repositories, we wanted to test the Hadoop integration. Our use case was to harvest the Hive tables from Cloudera’s distribution of Apache Hadoop into OEMM. We first installed the drivers to connect […]

First Take – InfoSphere Stewardship Center

The Information Asset team has been working closely with the IBM InfoSphere tooling at a number of clients. We had a chance to view a demo of the new business process management (BPM) capabilities within IBM InfoSphere Stewardship Center. IBM InfoSphere Stewardship Center is a newly released capability that is integrated with IBM InfoSphere Information Governance […]

Hands-on Big Data Governance with Waterline Data Science

We recently brought Waterline Data Science into the Information Asset Big Data Lab for hands-on testing. Waterline is a VC-funded startup run by some of my former IBM colleagues, including Alex Gorelik and Oliver Claude, so I was interested in their newly released product. Waterline has positioned itself as the “Amazon of Big […]

Hands-on Big Data Governance with Dataguise DgSecure

The Information Asset team brought Dataguise DgSecure into our Big Data lab. We love the product and are already recommending Dataguise to our clients to support their Big Data Governance programs. Define Policy: As shown in Figure 1, we created a fine-grained policy called Demo_Policy by assembling multiple out-of-the-box expressions for Address, Social Security Number, […]

Producing Eye-Catching Data Quality Dashboards with Tableau

By Sunil Soares, Umang Sukhia. Data Quality and Data Governance programs are often perceived to be boring and tend to take a backseat to analytics projects. In this post, we discuss how to add some much-needed flair to your data quality program using Tableau. As a first step, we assume that you have completed […]

Forthcoming book – Data Governance Tools

I recently released a research report on Data Governance tools. Data governance programs have traditionally been focused on people and process. Cost has historically been a key consideration because data governance programs have often started from scratch with little to no funding. As a result, Microsoft Excel and SharePoint are the tools of choice to […]
