Measuring Data Quality for Ongoing Improvement, 1st Edition
A Data Quality Assessment Framework
Laura Sebastian-Coleman

Publisher: Morgan Kaufmann
ISBN (print): 9780123970336
ISBN (eBook): 9780123977540
Pages: 376
Dimensions: 235 x 191

A ready-to-use framework for data quality measurement!

Formats and pricing:

    • Print book (paperback, in stock): USD 49.95
    • eBook (VitalSource Bookshelf format; DRM-free EPUB, Mobi for Kindle, and PDF included): USD 49.95
    • Print book + eBook bundle: USD 59.94 (40% off the combined list price of USD 99.90)
 
 

Key Features

    • Demonstrates how to leverage a technology-independent data quality measurement framework for your specific business priorities and data quality challenges
    • Enables discussions between business and IT with a non-technical vocabulary for data quality measurement
    • Describes how to measure data quality on an ongoing basis with generic measurement types that can be applied to any situation

    Description

    The Data Quality Assessment Framework shows you how to measure and monitor data quality, ensuring quality over time. You’ll start with general concepts of measurement and work your way through a detailed framework of more than three dozen measurement types related to five objective dimensions of quality: completeness, timeliness, consistency, validity, and integrity. Ongoing measurement, rather than one-time activities, will help your organization reach a new level of data quality.

    This plain-language approach to measuring data can be understood by both business and IT, and it provides practical guidance on how to apply the DQAF within any organization, enabling you to prioritize measurements and effectively report on results. Strategies for using data measurement to govern and improve the quality of data, as well as guidelines for applying the framework within a data asset, are included. You’ll come away able to prioritize which measurement types to implement, knowing where to place them in a data flow and how frequently to measure.

    The book also includes common conceptual models for defining and storing data quality results for purposes of trend analysis, along with generic business requirements for ongoing measuring and monitoring, including the calculations and comparisons that make the measurements meaningful, help you understand trends, and detect anomalies.
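    To give a flavor of the kind of ongoing measurement the book describes, here is a minimal sketch, not taken from the book itself: all field names, sample records, and the three-sigma threshold are hypothetical. It computes a field-completeness percentage and compares it against historical measurements to flag anomalies, in the spirit of a simple control-chart-style check.

    ```python
    # Illustrative sketch only: a field-completeness measurement compared
    # against its own history. Names and thresholds are hypothetical.
    from statistics import mean, stdev

    def field_completeness(records, field):
        """Percentage of records where `field` is present and non-null."""
        populated = sum(1 for r in records if r.get(field) not in (None, ""))
        return 100.0 * populated / len(records)

    def is_anomalous(measurement, history, sigmas=3.0):
        """Flag a measurement more than `sigmas` standard deviations
        from the historical mean (a simple control-chart-style test)."""
        if len(history) < 2:
            return False  # not enough history to judge
        m, s = mean(history), stdev(history)
        return s > 0 and abs(measurement - m) > sigmas * s

    records = [
        {"member_id": "A1", "dob": "1980-01-02"},
        {"member_id": "A2", "dob": None},
        {"member_id": "A3", "dob": "1975-06-30"},
        {"member_id": "A4", "dob": "1990-11-11"},
    ]
    history = [99.1, 98.7, 99.4, 98.9, 99.2]  # past completeness percentages

    today = field_completeness(records, "dob")  # 75.0 for this sample
    print(f"dob completeness: {today:.1f}%, anomalous: {is_anomalous(today, history)}")
    ```

    Storing each day's result and comparing it to the accumulated history, rather than checking a single snapshot, is what makes the measurement useful for trend analysis.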

    Readership

    Data quality engineers, managers, and analysts; application program managers and developers; data stewards; data managers and analysts; compliance analysts; business intelligence professionals; database designers and administrators; business and IT managers

    Laura Sebastian-Coleman

    Laura Sebastian-Coleman, a data quality architect at Optum Insight, has worked on data quality in large health care data warehouses since 2003. Optum Insight specializes in improving the performance of the health system by providing analytics, technology, and consulting services. Laura has implemented data quality metrics and reporting, launched and facilitated Optum Insight’s Data Quality Community, contributed to data consumer training programs, and led efforts to establish data standards and manage metadata. In 2009, she led a group of analysts from Optum and UnitedHealth Group in developing the original Data Quality Assessment Framework (DQAF), which is the basis for Measuring Data Quality for Ongoing Improvement. An active professional, Laura has delivered papers at MIT’s Information Quality Conferences and at conferences sponsored by the International Association for Information and Data Quality (IAIDQ) and the Data Governance Organization (DGO). From 2009 to 2010, she served as IAIDQ’s Director of Member Services. Before joining Optum Insight, she spent eight years in internal communications and information technology roles in the commercial insurance industry. She holds the IQCP (Information Quality Certified Professional) designation from IAIDQ, a Certificate in Information Quality from MIT, a B.A. in English and History from Franklin & Marshall College, and a Ph.D. in English Literature from the University of Rochester (NY).

    Affiliations and Expertise

    Laura Sebastian-Coleman is a data quality architect at Optum Insight.

    Table of Contents

    Dedication

    Acknowledgments

    Foreword

    Author Biography

    Introduction: Measuring Data Quality for Ongoing Improvement

    Data Quality Measurement: the Problem we are Trying to Solve

    Recurring Challenges in the Context of Data Quality

    DQAF: the Data Quality Assessment Framework

    Overview of Measuring Data Quality for Ongoing Improvement

    Intended Audience

    What Measuring Data Quality for Ongoing Improvement Does Not Do

    Why I Wrote Measuring Data Quality for Ongoing Improvement

    Section 1. Concepts and Definitions

    Chapter 1. Data

    Purpose

    Data

    Data as Representation

    Data as Facts

    Data as a Product

    Data as Input to Analyses

    Data and Expectations

    Information

    Concluding Thoughts

    Chapter 2. Data, People, and Systems

    Purpose

    Enterprise or Organization

    IT and the Business

    Data Producers

    Data Consumers

    Data Brokers

    Data Stewards and Data Stewardship

    Data Owners

    Data Ownership and Data Governance

    IT, the Business, and Data Owners, Redux

    Data Quality Program Team

    Stakeholder

    Systems and System Design

    Concluding Thoughts

    Chapter 3. Data Management, Models, and Metadata

    Purpose

    Data Management

    Database, Data Warehouse, Data Asset, Dataset

    Source System, Target System, System of Record

    Data Models

    Types of Data Models

    Physical Characteristics of Data

    Metadata

    Metadata as Explicit Knowledge

    Data Chain and Information Life Cycle

    Data Lineage and Data Provenance

    Concluding Thoughts

    Chapter 4. Data Quality and Measurement

    Purpose

    Data Quality

    Data Quality Dimensions

    Measurement

    Measurement as Data

    Data Quality Measurement and the Business/IT Divide

    Characteristics of Effective Measurements

    Data Quality Assessment

    Data Quality Dimensions, DQAF Measurement Types, Specific Data Quality Metrics

    Data Profiling

    Data Quality Issues and Data Issue Management

    Reasonability Checks

    Data Quality Thresholds

    Process Controls

    In-line Data Quality Measurement and Monitoring

    Concluding Thoughts

    Section 2. DQAF Concepts and Measurement Types

    Chapter 5. DQAF Concepts

    Purpose

    The Problem the DQAF Addresses

    Data Quality Expectations and Data Management

    The Scope of the DQAF

    DQAF Quality Dimensions

    Defining DQAF Measurement Types

    Metadata Requirements

    Objects of Measurement and Assessment Categories

    Functions in Measurement: Collect, Calculate, Compare

    Concluding Thoughts

    Chapter 6. DQAF Measurement Types

    Purpose

    Consistency of the Data Model

    Ensuring the Correct Receipt of Data for Processing

    Inspecting the Condition of Data upon Receipt

    Assessing the Results of Data Processing

    Assessing the Validity of Data Content

    Assessing the Consistency of Data Content

    Comments on the Placement of In-line Measurements

    Periodic Measurement of Cross-table Content Integrity

    Assessing Overall Database Content

    Assessing Controls and Measurements

    The Measurement Types: Consolidated Listing

    Concluding Thoughts

    Section 3. Data Assessment Scenarios

    Purpose

    Assessment Scenarios

    Metadata: Knowledge before Assessment

    Chapter 7. Initial Data Assessment

    Purpose

    Initial Assessment

    Input to Initial Assessments

    Data Expectations

    Data Profiling

    Column Property Profiling

    Structure Profiling

    Profiling an Existing Data Asset

    From Profiling to Assessment

    Deliverables from Initial Assessment

    Concluding Thoughts

    Chapter 8. Assessment in Data Quality Improvement Projects

    Purpose

    Data Quality Improvement Efforts

    Measurement in Improvement Projects

    Chapter 9. Ongoing Measurement

    Purpose

    The Case for Ongoing Measurement

    Example: Health Care Data

    Inputs for Ongoing Measurement

    Criticality and Risk

    Automation

    Controls

    Periodic Measurement

    Deliverables from Ongoing Measurement

    In-Line versus Periodic Measurement

    Concluding Thoughts

    Section 4. Applying the DQAF to Data Requirements

    Context

    Chapter 10. Requirements, Risk, Criticality

    Purpose

    Business Requirements

    Data Quality Requirements and Expected Data Characteristics

    Data Quality Requirements and Risks to Data

    Factors Influencing Data Criticality

    Specifying Data Quality Metrics

    Concluding Thoughts

    Chapter 11. Asking Questions

    Purpose

    Asking Questions

    Understanding the Project

    Learning about Source Systems

    Your Data Consumers’ Requirements

    The Condition of the Data

    The Data Model, Transformation Rules, and System Design

    Measurement Specification Process

    Concluding Thoughts

    Section 5. A Strategic Approach to Data Quality

    Chapter 12. Data Quality Strategy

    Purpose

    The Concept of Strategy

    Systems Strategy, Data Strategy, and Data Quality Strategy

    Data Quality Strategy and Data Governance

    Decision Points in the Information Life Cycle

    General Considerations for Data Quality Strategy

    Concluding Thoughts

    Chapter 13. Directives for Data Quality Strategy

    Purpose

    Directive 1: Obtain Management Commitment to Data Quality

    Directive 2: Treat Data as an Asset

    Directive 3: Apply Resources to Focus on Quality

    Directive 4: Build Explicit Knowledge of Data

    Directive 5: Treat Data as a Product of Processes that can be Measured and Improved

    Directive 6: Recognize Quality is Defined by Data Consumers

    Directive 7: Address the Root Causes of Data Problems

    Directive 8: Measure Data Quality, Monitor Critical Data

    Directive 9: Hold Data Producers Accountable for the Quality of their Data (and Knowledge about that Data)

    Directive 10: Provide Data Consumers with the Knowledge they Require for Data Use

    Directive 11: Data Needs and Uses will Evolve—Plan for Evolution

    Directive 12: Data Quality Goes beyond the Data—Build a Culture Focused on Quality

    Concluding Thoughts: Using the Current State Assessment

    Section 6. The DQAF in Depth

    Functions for Measurement: Collect, Calculate, Compare

    Features of the DQAF Measurement Logical Data Model

    Facets of the DQAF Measurement Types

    Chapter 14. Functions of Measurement: Collection, Calculation, Comparison

    Purpose

    Functions in Measurement: Collect, Calculate, Compare

    Collecting Raw Measurement Data

    Calculating Measurement Data

    Comparing Measurements to Past History

    Statistics

    The Control Chart: A Primary Tool for Statistical Process Control

    The DQAF and Statistical Process Control

    Concluding Thoughts

    Chapter 15. Features of the DQAF Measurement Logical Model

    Purpose

    Metric Definition and Measurement Result Tables

    Optional Fields

    Denominator Fields

    Automated Thresholds

    Manual Thresholds

    Emergency Thresholds

    Manual or Emergency Thresholds and Results Tables

    Additional System Requirements

    Support Requirements

    Concluding Thoughts

    Chapter 16. Facets of the DQAF Measurement Types

    Purpose

    Facets of the DQAF

    Organization of the Chapter

    Measurement Type #1: Dataset Completeness—Sufficiency of Metadata and Reference Data

    Measurement Type #2: Consistent Formatting in One Field

    Measurement Type #3: Consistent Formatting, Cross-table

    Measurement Type #4: Consistent Use of Default Value in One Field

    Measurement Type #5: Consistent Use of Default Values, Cross-table

    Measurement Type #6: Timely Delivery of Data for Processing

    Measurement Type #7: Dataset Completeness—Availability for Processing

    Measurement Type #8: Dataset Completeness—Record Counts to Control Records

    Measurement Type #9: Dataset Completeness—Summarized Amount Field Data

    Measurement Type #10: Dataset Completeness—Size Compared to Past Sizes

    Measurement Type #11: Record Completeness—Length

    Measurement Type #12: Field Completeness—Non-Nullable Fields

    Measurement Type #13: Dataset Integrity—De-Duplication

    Measurement Type #14: Dataset Integrity—Duplicate Record Reasonability Check

    Measurement Type #15: Field Content Completeness—Defaults from Source

    Measurement Type #16: Dataset Completeness Based on Date Criteria

    Measurement Type #17: Dataset Reasonability Based on Date Criteria

    Measurement Type #18: Field Content Completeness—Received Data is Missing Fields Critical to Processing

    Measurement Type #19: Dataset Completeness—Balance Record Counts Through a Process

    Measurement Type #20: Dataset Completeness—Reasons for Rejecting Records

    Measurement Type #21: Dataset Completeness Through a Process—Ratio of Input to Output

    Measurement Type #22: Dataset Completeness Through a Process—Balance Amount Fields

    Measurement Type #23: Field Content Completeness—Ratio of Summed Amount Fields

    Measurement Type #24: Field Content Completeness—Defaults from Derivation

    Measurement Type #25: Data Processing Duration

    Measurement Type #26: Timely Availability of Data for Access

    Measurement Type #27: Validity Check, Single Field, Detailed Results

    Measurement Type #28: Validity Check, Roll-up

    Measurement Logical Data Model

    Measurement Type #29: Validity Check, Multiple Columns within a Table, Detailed Results

    Measurement Type #30: Consistent Column Profile

    Measurement Type #31: Consistent Dataset Content, Distinct Count of Represented Entity, with Ratios to Record Counts

    Measurement Type #32: Consistent Dataset Content, Ratio of Distinct Counts of Two Represented Entities

    Measurement Type #33: Consistent Multicolumn Profile

    Measurement Type #34: Chronology Consistent with Business Rules within a Table

    Measurement Type #35: Consistent Time Elapsed (hours, days, months, etc.)

    Measurement Type #36: Consistent Amount Field Calculations Across Secondary Fields

    Measurement Type #37: Consistent Record Counts by Aggregated Date

    Measurement Type #38: Consistent Amount Field Data by Aggregated Date

    Measurement Type #39: Parent/Child Referential Integrity

    Measurement Type #40: Child/Parent Referential Integrity

    Measurement Type #41: Validity Check, Cross Table, Detailed Results

    Measurement Type #42: Consistent Cross-table Multicolumn Profile

    Measurement Type #43: Chronology Consistent with Business Rules Across Tables

    Measurement Type #44: Consistent Cross-table Amount Column Calculations

    Measurement Type #45: Consistent Cross-Table Amount Columns by Aggregated Dates

    Measurement Type #46: Consistency Compared to External Benchmarks

    Measurement Type #47: Dataset Completeness—Overall Sufficiency for Defined Purposes

    Measurement Type #48: Dataset Completeness—Overall Sufficiency of Measures and Controls

    Concluding Thoughts: Know Your Data

    Glossary

    Bibliography

    Index

    Online Materials

    Appendix A. Measuring the Value of Data

    Appendix B. Data Quality Dimensions

    Purpose

    Richard Wang and Diane Strong’s Data Quality Framework, 1996

    Thomas Redman’s Dimensions of Data Quality, 1996

    Larry English’s Information Quality Characteristics and Measures, 1999

    Appendix C. Completeness, Consistency, and Integrity of the Data Model

    Purpose

    Process Input and Output

    High-Level Assessment

    Detailed Assessment

    Quality of Definitions

    Summary

    Appendix D. Prediction, Error, and Shewhart’s Lost Disciple, Kristo Ivanov

    Purpose

    Limitations of the Communications Model of Information Quality

    Error, Prediction, and Scientific Measurement

    What Do We Learn from Ivanov?

    Ivanov’s Concept of the System as Model

    Appendix E. Quality Improvement and Data Quality

    Purpose

    A Brief History of Quality Improvement

    Process Improvement Tools

    Implications for Data Quality

    Limitations of the Data as Product Metaphor

    Concluding Thoughts: Building Quality in Means Building Knowledge in

    Quotes and reviews

    "This book provides a very well-structured introduction to the fundamental issue of data quality, making it a very useful tool for managers, practitioners, analysts, software developers, and systems engineers. It also helps explain what data quality management entails and provides practical approaches aimed at actual implementation. I positively recommend reading it…"--ComputingReviews.com, January 30, 2014
    "The framework she describes is a set of 48 generic measurement types based on five dimensions of data quality: completeness, timeliness, validity, consistency, and integrity. The material is for people who are charged with improving, monitoring, or ensuring data quality."--Reference and Research Book News, August 2013
    "If you are intent on improving the quality of the data at your organization you would do well to read Measuring Data Quality for Ongoing Improvement and adopt the DQAF offered up in this fine book."--Data and Technology Today blog, July 2, 2013

     
     