The Cost of Bad Data

Bad data is duplicated, outdated, incorrect, or incomplete information that resides in multiple IT systems across a company’s data center. IBM estimated that poor-quality data cost the US alone $3.1 trillion in 2016[i].

The reason bad data costs so much is that decision makers, managers, knowledge workers and others must accommodate it in their everyday work. And doing so is both time-consuming and expensive. The data they need has plenty of errors, and in the face of a critical deadline, many individuals simply make corrections themselves to complete the task at hand[ii].

Importantly, the benefits of improving data quality go far beyond reduced costs. It is hard to imagine any sort of future in data when so much is so bad. Thus, improving data quality is a gift that keeps giving — it enables you to take out costs permanently and to more easily pursue other data strategies. For all but a few, there is no better opportunity in data[iii].

The business cost of bad data may be as high as 10-25% of an organization’s revenue. The cost of bad data in healthcare has been estimated to be around $314 billion. As much as 50% of a typical IT budget may be spent on “information scrap and rework.”[iv]

Healthcare Data is Increasingly Important (But Especially Challenging)

Healthcare organizations are confronted with bad data every day – although it may often be ignored or overlooked. Why?

  • Healthcare data is created by multiple systems in multiple formats.
    These systems are developed by multiple vendors for multiple purposes (e.g. EHR, practice management, human resources, risk management, etc.). Data is growing at a rate of 40% per year. More and more data is coming from outside the organization (e.g. patient satisfaction surveys, state & market databases, personal devices, etc.).
  • Healthcare data has inconsistent and multiple definitions.
    Consider readmission data – an important metric for hospitals with implications on quality, patient satisfaction and revenue. The definition varies by context – CMS, state agencies and payors often differ in their inclusions/exclusions. All are important for different reasons.
  • Healthcare data is complicated.
    While developing standard processes that improve quality is one of the goals in healthcare, the number of data variables involved makes it far more challenging. You’re not working with a finite number of identical parts to create identical outcomes. Instead, you’re looking at an amalgam of individual systems that are so complex we don’t even begin to profess we understand how they work together (that is to say, the human body). Managing the data related to each of those systems (which is often being captured in disparate applications), and turning it into something usable across a population, requires a far more sophisticated set of tools than is needed for other industries like manufacturing[v].
  • Requirements for healthcare data are constantly changing.
    ACOs, value-based purchasing, Meaningful Use, MACRA/MIPS, and the Joint Commission are just a few of the recent sources of new reporting demands and operational changes. Health systems need flexibility and agility to keep pace.
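
To make the point about inconsistent definitions concrete, here is a minimal Python sketch of how one set of admission records can produce different readmission rates depending on which rule is applied. Everything in it is hypothetical: the `Admission` record, the `planned` flag, the sample stays, and the simplified rules are illustrative assumptions and do not reproduce the actual CMS, state, or payor definitions.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Admission:
    """Hypothetical, simplified inpatient stay record."""
    patient_id: str
    admitted: date
    discharged: date
    planned: bool = False  # assumed flag: the stay was scheduled in advance


def readmission_rate(admissions, window_days=30, exclude_planned=False):
    """Share of stays followed by another stay within window_days.

    Toy rule: every stay counts as an index stay. Real definitions
    apply many more inclusions and exclusions.
    """
    by_patient = {}
    for stay in sorted(admissions, key=lambda s: s.admitted):
        by_patient.setdefault(stay.patient_id, []).append(stay)

    index_stays = 0
    readmitted = 0
    for stays in by_patient.values():
        for i, stay in enumerate(stays):
            index_stays += 1
            if i + 1 == len(stays):
                continue  # no later stay for this patient
            nxt = stays[i + 1]
            if exclude_planned and nxt.planned:
                continue  # this definition ignores planned returns
            if (nxt.admitted - stay.discharged).days <= window_days:
                readmitted += 1
    return readmitted / index_stays if index_stays else 0.0


# Made-up sample data: three patients, five stays.
stays = [
    Admission("A", date(2016, 1, 1), date(2016, 1, 5)),
    Admission("A", date(2016, 1, 20), date(2016, 1, 22), planned=True),
    Admission("B", date(2016, 3, 1), date(2016, 3, 3)),
    Admission("B", date(2016, 3, 25), date(2016, 3, 28)),
    Admission("C", date(2016, 2, 1), date(2016, 2, 4)),
]

all_cause_30 = readmission_rate(stays, window_days=30)
unplanned_30 = readmission_rate(stays, window_days=30, exclude_planned=True)
all_cause_14 = readmission_rate(stays, window_days=14)

print(f"30-day, all returns counted:      {all_cause_30:.0%}")
print(f"30-day, planned returns excluded: {unplanned_30:.0%}")
print(f"14-day, all returns counted:      {all_cause_14:.0%}")
```

The same five records yield three different rates, which is why a metric such as readmissions needs its definition stored and managed alongside the data.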

Healthcare has a unique data challenge that requires a unique solution. A solution that is centralized in the enterprise to aggregate the data and manage the definitions. A solution that understands healthcare analytics and overcomes the complexities while nimbly keeping pace with changing regulations.


[i] Bad Data Costs the U.S. $3 Trillion Per Year, Thomas C. Redman, September 22, 2016, Harvard Business
[ii] Excerpt from
[iii] Excerpt from
[iv] Excerpt from
[v] Excerpt from 5-reasons-healthcare-data-is-difficult-to-measure, by Dan LeSueur, HealthCatalyst
