By Sunil Soares, Umang Sukhia
Data Quality and Data Governance programs are often perceived as boring and tend to take a backseat to analytics projects. In this post, we discuss how to add some much-needed flair to your data quality program using Tableau.
As a first step, we assume that you have completed data profiling using manual SQL queries or automated tools such as IBM Information Analyzer, Informatica Data Quality, Trillium TS Discovery or SAP Information Steward. Figure 1 shows an overall Data Quality Scorecard for the Client Entity. The scorecard breaks down the percentage of compliant and non-compliant records by data quality dimension: completeness, conformity, consistency and uniqueness. Additional dimensions such as accuracy, timeliness and synchronization exist, but we do not cover them in this dashboard. The scorecard is color-coded red, yellow and green to show where results fall relative to acceptable thresholds.
Figure 1: Tableau Data Quality Scorecard – Overall Client Entity and by Data Quality Dimension
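To make the scorecard logic concrete, here is a minimal Python sketch of how profiling results might be rolled up by dimension and assigned a color before they are loaded into Tableau. The record counts, the rule IDs other than BR-0009, and the 95/85 percent thresholds are illustrative assumptions, not values from the dashboard above; your own thresholds should come from your data governance team.

```python
from collections import defaultdict

# Hypothetical profiling output: (dimension, rule_id, compliant_count, total_count).
# Only BR-0009 is named in this post; the other rule IDs and all counts are made up.
PROFILE_RESULTS = [
    ("completeness", "BR-0009", 900, 1000),
    ("conformity", "BR-X001", 970, 1000),
    ("consistency", "BR-X002", 800, 1000),
    ("uniqueness", "BR-X003", 990, 1000),
]


def status(pct, green=95.0, yellow=85.0):
    """Map a compliance percentage to a color (thresholds are assumptions)."""
    if pct >= green:
        return "green"
    if pct >= yellow:
        return "yellow"
    return "red"


def scorecard(results, green=95.0, yellow=85.0):
    """Aggregate compliant/total counts per dimension and attach a color."""
    totals = defaultdict(lambda: [0, 0])
    for dimension, _rule_id, compliant, total in results:
        totals[dimension][0] += compliant
        totals[dimension][1] += total

    card = {}
    for dimension, (compliant, total) in totals.items():
        pct = 100.0 * compliant / total if total else 0.0
        card[dimension] = (round(pct, 1), status(pct, green, yellow))
    return card


if __name__ == "__main__":
    for dimension, (pct, color) in scorecard(PROFILE_RESULTS).items():
        print(f"{dimension}: {pct}% compliant ({color})")
```

Writing the aggregated rows to a CSV file or database table gives Tableau a simple source to connect to; the color assignment could equally be done inside Tableau with a calculated field.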
In Figure 2, we drill down to view the percentage of compliant and non-compliant records by business rule. For example, BR-0009 is a business rule relating to the completeness of email address. Tableau shows that 90 percent of the email addresses are populated while 10 percent are null. Ideally, these business rules should be documented in a business glossary.
Figure 2: Drill-down into Data Quality Results by Business Rule
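A completeness rule such as BR-0009 can be sketched in a few lines of Python. This is only an illustration of the kind of check that produces the percentages in the drill-down: the `email` field name and the sample records are hypothetical, and treating blank strings as missing (in addition to nulls) is our assumption rather than part of the rule as stated.

```python
def is_populated(value):
    """Treat None and whitespace-only strings as missing (an assumption)."""
    return value is not None and str(value).strip() != ""


def completeness(records, field, rule_id):
    """Return compliant and non-compliant percentages for one field."""
    total = len(records)
    if total == 0:
        return {"rule": rule_id, "compliant_pct": 0.0, "non_compliant_pct": 0.0}
    populated = sum(1 for record in records if is_populated(record.get(field)))
    compliant_pct = round(100.0 * populated / total, 1)
    return {
        "rule": rule_id,
        "compliant_pct": compliant_pct,
        "non_compliant_pct": round(100.0 - compliant_pct, 1),
    }


if __name__ == "__main__":
    # Hypothetical client records: nine with an email address, one without.
    clients = [{"email": f"client{i}@example.com"} for i in range(9)]
    clients.append({"email": None})
    print(completeness(clients, "email", "BR-0009"))
```

One result row per business rule, in this shape, is enough to drive a drill-down view like the one in Figure 2.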