I have five comments on your ‘data downtime’ KPI as a metric for measuring data quality. First, it appears to measure the performance of a process, so it can hardly be considered a basis for measuring the intrinsic quality of the data. Consistency, completeness, accuracy, uniqueness, conformity and integrity are intrinsic data quality dimensions (aka variables in your post), while timeliness and accessibility are extrinsic dimensions, as they depend more on the infrastructure than on the data themselves. This delimitation is important because the two perspectives imply different types of actions and approaches. It makes sense to split them when considering KPIs.

Secondly, downtime refers to the time data is out of action or unavailable for use. Unless that is addressed by design, namely the data with defects are withheld from consumers, the defective data remain available, with all the consequences that derive from this. The metric can therefore easily create confusion.

Thirdly, unless you are referring to a data product or system, I don’t think the KPI is a good measure because it is sensitive to data growth: if the data volume increases, the metric’s value will most likely increase considerably as well.

Fourthly, multiplying the number of incidents by a factor that can take large values risks the incident count disappearing into those large values. A few outliers are enough to skew the metric considerably, even if overall data quality is acceptable for the business.
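To make the point concrete, here is a minimal sketch, assuming the KPI is computed roughly as the sum over incidents of time to detect plus time to resolve (the way data downtime is usually described); the incident figures are made up for illustration:

```python
# Minimal sketch: how a few outlier incidents can dominate a
# downtime-style metric of the form
#   downtime = sum(time_to_detect + time_to_resolve) over incidents
# Hours per incident; all numbers are hypothetical.

typical_incidents = [(1, 2) for _ in range(20)]   # 20 minor incidents: 1h to detect, 2h to resolve
outlier_incidents = [(48, 72), (24, 96)]          # 2 outliers with long detection/resolution times

def downtime(incidents):
    """Sum of (time to detect + time to resolve) across all incidents."""
    return sum(ttd + ttr for ttd, ttr in incidents)

print(downtime(typical_incidents))                      # 60 hours from 20 incidents
print(downtime(typical_incidents + outlier_incidents))  # 300 hours; the 2 outliers contribute 240
```

Two incidents out of twenty-two account for 80% of the metric, so the value says little about how many defects the business actually faced.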

This brings me to the fifth remark: data quality is best defined as “fitness for use”, and this is context dependent. KPIs need to consider this aspect, otherwise the metric doesn’t have much meaning for the business.

I’m not sure what you mean by ‘traditional methods’, because there seems to be no generally accepted approach to measuring data quality, even if Six Sigma provides a good basis to build upon, since it considers defects in relation to opportunities. That approach handles data growth better and allows the use of all kinds of statistical and non-statistical tools that come with Six Sigma. It can be time consuming and prone to rule changes, though that’s the reality.
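For completeness, a minimal sketch of the defects-per-opportunities idea applied to data: counting opportunities as record-level checks keeps the measure comparable as the data volume grows. The record and rule counts below are purely illustrative:

```python
# Minimal sketch: Six Sigma style defects-per-million-opportunities (DPMO)
# applied to data quality checks. All numbers are hypothetical.

def dpmo(defects: int, units: int, opportunities_per_unit: int) -> float:
    """Defects per million opportunities: normalises the defect count
    by how many chances there were to get a value wrong."""
    return defects * 1_000_000 / (units * opportunities_per_unit)

# e.g. 1,200 failed checks over 100,000 records, each validated by 8 rules
print(dpmo(defects=1_200, units=100_000, opportunities_per_unit=8))     # 1500.0

# the rate stays comparable when the data volume grows tenfold
print(dpmo(defects=12_000, units=1_000_000, opportunities_per_unit=8))  # 1500.0
```

Because it is a rate rather than an absolute count, the measure doesn’t inflate simply because the dataset got bigger, which is exactly the property the downtime KPI lacks.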
