Big Data Governance for RDARR in Financial Services

Risk Data Aggregation and Risk Reporting (RDARR) is critical to compliance in financial services and is reflected in several regulatory initiatives such as BCBS 239, DFAST and CCAR. I will be co-hosting a webinar with Zaloni on this topic, titled “Governance of the Big Data Lake with a focus on RDARR for Financial Services,” on Tuesday, […]

The Chief Data Governance Officer is Here!

I am starting to notice several executives with the title of “Chief Data Governance Officer (CDGO).” By now, we are all familiar with the role of the Chief Data Officer (CDO), who has overall accountability for data within an organization. In contrast, the CDGO has executive-level ownership of Data Governance and may report to the […]

Data Governance in the Self-Service Data Blending World

Some of our clients’ Data Science teams are using Alteryx for Data Blending (Data Munging). I downloaded a trial version of Alteryx Designer because I was eager to get my hands on the tool. I was also interested in understanding how Data Governance might intersect with a day in the life of a data scientist who uses Alteryx […]

End-to-end data lineage with DAG MetaCenter & Cloudera Navigator

In this blog post, we will discuss how to establish end-to-end data lineage across SQL Server and Hadoop using DAG MetaCenter and Cloudera Navigator. Cloudera Navigator provides lineage within the Hadoop environment itself. However, if you want end-to-end data lineage (including non-Hadoop data sources), then you need to work with an enterprise metadata repository like […]

Importing Cloudera Navigator metadata into Collibra

The Information Asset team has been working with Cloudera Navigator and Collibra. Cloudera Navigator provides rich Hadoop metadata around artifacts like Hive tables and Sqoop jobs. Collibra provides tooling to govern these data artifacts. In this blog post, we will discuss how we imported the metadata from Cloudera Navigator into Collibra so that it can be […]

Hands-on Big Data Governance with Cloudera Navigator

The Information Asset team brought Cloudera Navigator into our Big Data lab. Cloudera Navigator supports metadata capabilities within Hadoop. In Figure 1, we were able to view data lineage that includes a Sqoop job (Supplies), Cloudera, output file (part-m-00000), HDFS file (Contacts.csv), Hive table (contact_details) and a Hive job. Figure 1: Hadoop data lineage with […]

Integrating Oracle Enterprise Metadata Manager with Hadoop

The Information Asset team recently brought Oracle Enterprise Metadata Manager (OEMM) into our big data lab. Although OEMM supports metadata integration with several repositories, we wanted to test the Hadoop integration. Our use case was to harvest the Hive tables from Cloudera’s distribution of Apache Hadoop into OEMM. We first installed the drivers to connect […]

First Take – InfoSphere Stewardship Center

The Information Asset team has been working closely with the IBM InfoSphere tooling at a number of clients. We had a chance to view a demo of the new business process management (BPM) capabilities within IBM InfoSphere Stewardship Center. IBM InfoSphere Stewardship Center is a newly released capability that is integrated with IBM InfoSphere Information Governance […]

Hands-on Big Data Governance with Waterline Data Science

We recently brought Waterline Data Science into the Information Asset Big Data Lab for hands-on testing. Waterline is a VC-funded startup. The company is run by some of my former IBM colleagues, including Alex Gorelik and Oliver Claude, so I was interested in their newly released product. Waterline has positioned itself as the “Amazon of Big […]
