Ensuring trust and reliability in enterprise data through Data Quality
By Wes Shah | @intelia | October 9
Introduction
With the rise in the creation and consumption of data in enterprises, and the proliferation of technologies that create data as a byproduct (such as logs), business and technical teams are looking for data quality solutions that can improve the reliability of, and trust in, their data for operational and analytical purposes. This has become even more significant recently due to serious concerns around cyber security threats leading to the exposure of personally identifiable information.
Traditional Data Quality Methods – What is usually seen in an organisation?
Traditional data quality methods, such as custom SQL scripts or Excel spreadsheets, have long been used to tackle data quality issues. While these methods may offer a quick way of launching data quality checks on datasets, they cannot cope with large data volumes or ever-changing data structures. They also require ongoing manual maintenance, which can quickly become a burden for ETL developers, whose efforts would be better directed towards building robust data pipelines for data ingestion.
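To make the maintenance burden concrete, below is a minimal sketch of the kind of hand-written check such scripts typically encode, here in Python with pandas. The file path, column names and thresholds are hypothetical; the point is that every new dataset or schema change means another hand-edited variant of this code.

```python
import pandas as pd

# A hand-rolled data quality check of the kind teams often script ad hoc.
# The file path, column names, and thresholds below are illustrative only.
df = pd.read_csv("customers.csv")

issues = []

# Completeness: flag columns whose null ratio exceeds a hard-coded threshold.
for col in ["customer_id", "email", "country"]:
    null_ratio = df[col].isna().mean()
    if null_ratio > 0.01:
        issues.append(f"{col}: {null_ratio:.1%} nulls exceeds 1% threshold")

# Uniqueness: the primary-key column must not contain duplicates.
dup_count = df["customer_id"].duplicated().sum()
if dup_count:
    issues.append(f"customer_id: {dup_count} duplicate values")

print("\n".join(issues) if issues else "All checks passed")
```

Each new table, column or threshold change requires editing and redeploying scripts like this by hand, which is exactly where the approach stops scaling.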
Implementation of Data Quality Function – How can things be improved?
Establishing a data quality function driven by a well-thought-out data strategy will help to improve the quality of, and trust in, data across the business. By creating new roles such as Data Lead, Data Steward and Data Quality Lead/Analyst, teams can work with more confidence to identify and resolve data quality issues in a timely manner.
Collibra Data Quality and Observability makes it easy to scan and profile data and then view the results for analysis. The out-of-the-box (OOTB), machine-learning-generated rules can be added to the Rulebook, and users can be assigned as approvers. This gives the business control to set data quality standards that conform to its requirements.
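The sketch below is not Collibra's API; it is a simplified, generic illustration of the profiling-to-rule idea, where the observed values in a column are turned into a candidate rule that then awaits human approval. The function name, tolerance parameter and column are invented for the example.

```python
import pandas as pd

def suggest_range_rule(series: pd.Series, tolerance: float = 0.1) -> dict:
    """Derive a candidate min/max rule from observed values.

    A simplified illustration of profiling-driven rule suggestion;
    a generated rule would sit in a rulebook pending approval.
    """
    low, high = series.min(), series.max()
    span = (high - low) * tolerance  # widen bounds slightly beyond observed range
    return {
        "column": series.name,
        "rule": f"{series.name} BETWEEN {low - span} AND {high + span}",
        "status": "pending_approval",  # an assigned approver accepts or rejects it
    }

df = pd.DataFrame({"order_amount": [12.5, 40.0, 99.9, 55.2]})
print(suggest_range_rule(df["order_amount"]))
```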
Benefits of using Collibra Data Quality and Observability over traditional data quality methods
Collibra Data Quality and Observability stands out as a “Visionary” in the Gartner Magic Quadrant for data quality tools.
Automated rule generation reduces the manual effort of creating and maintaining rules. Continuous monitoring, another OOTB feature, allows anomalies to be detected quickly and resolved according to assigned priority, in contrast to traditional monitoring and mitigation methods.
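As a rough illustration of what automated anomaly detection amounts to (not Collibra's implementation), the sketch below flags a monitored metric, such as a table's daily row count, when it drifts well outside its recent history; the z-score threshold and sample figures are made up.

```python
import statistics

def is_anomalous(history: list[float], latest: float, z_threshold: float = 3.0) -> bool:
    """Flag the latest observation if it deviates strongly from history.

    A minimal z-score check of the kind continuous monitoring runs
    automatically against metrics such as row counts or null ratios.
    """
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        return latest != mean
    return abs(latest - mean) / stdev > z_threshold

# Daily row counts for a table; today's load is suspiciously small.
daily_rows = [10_120, 9_980, 10_340, 10_050, 10_210]
print(is_anomalous(daily_rows, latest=6_200))  # True -> raise an alert
```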
Other data quality steps, such as data profiling, data quality assessment, and data cleansing and standardisation, can soon become arduous when carried out manually.
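For instance, manual standardisation often means hand-curating mapping tables like the hypothetical one below, and revisiting them every time a new variant appears in the data.

```python
import pandas as pd

# Manual standardisation: mapping free-text country values to ISO codes.
# The mapping table is illustrative and must be curated by hand, which is
# why this step scales poorly without tooling support.
COUNTRY_MAP = {"australia": "AU", "aus": "AU", "united states": "US", "usa": "US"}

df = pd.DataFrame({"country": ["Australia", "AUS", "usa", "U.S.A."]})
df["country_std"] = (
    df["country"]
    .str.lower()
    .str.replace(r"[^a-z ]", "", regex=True)  # strip punctuation variants
    .map(COUNTRY_MAP)
)
print(df)  # unmapped values surface as NaN for manual review
```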
Collibra DQ&O brings scalability and automation to these activities and extends them with essential steps such as data issue management, data issue remediation, and data quality performance and impact reporting.
The tool also provides a business-friendly way of creating scorecards and adding them to the catalog, so that issues and their impact can be communicated more widely. Notifications are sent via workflows, which can be customised as needed to raise issues and tag the appropriate owners.
Conclusion
When it comes to managing enterprise data, ensuring its security and reliability, and assigning ownership, Collibra Data Quality and Observability is a wise option compared to other data quality tools. Its compatibility with other Collibra products, such as Collibra Edge for harvesting lineage, is one of the key strengths of the overall offering.
References
Whitepaper: Create an Enterprise Vision for Data Quality and Observability
Gartner report: Gartner Reprint