{"id":32613,"date":"2023-06-13T06:00:00","date_gmt":"2023-06-13T13:00:00","guid":{"rendered":"https:\/\/insidebigdata.com\/?p=32613"},"modified":"2023-06-09T10:56:53","modified_gmt":"2023-06-09T17:56:53","slug":"state-of-data-quality-report","status":"publish","type":"post","link":"https:\/\/insidebigdata.com\/2023\/06\/13\/state-of-data-quality-report\/","title":{"rendered":"State of Data Quality Report"},"content":{"rendered":"<div class=\"wp-block-image\">\n<figure class=\"alignright size-full\"><img decoding=\"async\" loading=\"lazy\" width=\"300\" height=\"230\" src=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2023\/06\/BigEye_report.png\" alt=\"\" class=\"wp-image-32614\" srcset=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2023\/06\/BigEye_report.png 300w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2023\/06\/BigEye_report-150x115.png 150w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><\/figure><\/div>\n\n\n<p><a href=\"https:\/\/www.bigeye.com\/\" target=\"_blank\" rel=\"noreferrer noopener\">Bigeye<\/a>, the data observability company, announced the results of its <a href=\"https:\/\/assets.ctfassets.net\/jefo7j3ehpck\/3RJQQQsDcGiWpy6oLbRrPJ\/f85084c5285873d913dcd6f8d3aa70f4\/Bigeye_Data-Quality-Survey_Report-2023.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">2023 State of Data Quality<\/a> survey. The report sheds light on the most pervasive problems in data quality today.<\/p>\n\n\n\n<p>The report, which was researched and authored by Bigeye, consisted of answers from 100 survey respondents. At least 63 came from mid-to-large cloud data warehouse customers (with a spend of more than $500k per annum) who have some form of data monitoring in place, whether third-party or built in-house.<\/p>\n\n\n\n<p><strong>First line of defense against data issues<\/strong><\/p>\n\n\n\n<p>Bigeye\u2019s survey found that data engineers are the first line of defense in managing data issues, followed closely by software engineers. 
The role of data engineer is now on par with software engineering. Like software engineers, data engineers are in charge of a product &#8211; the data product &#8211; that increasingly demands software-like levels of process, maintenance, and code review.<\/p>\n\n\n\n<p><strong>Desire for automation<\/strong><\/p>\n\n\n\n<p>Respondents who used third-party data monitoring solutions found about a 2x to 3x ROI over in-house solutions. They also noted that, at full utilization, third-party data monitoring solved two issues: fractured infrastructure and anomalous data. They further reported that third-party data monitoring solutions had better test libraries and a broader perspective on data problems.<\/p>\n\n\n\n<p><strong>Data incident frequency<\/strong><\/p>\n\n\n\n<p>Research revealed that companies experience a median of five to ten data incidents over a period of three months. These incidents range from those severe enough to impact the company&#8217;s bottom line to those that reduce engineer productivity. They take an average of 48 hours to troubleshoot.<\/p>\n\n\n\n<p>Organizations with more than five data incidents a month are essentially lurching from incident to incident, with little ability to trust data or invest in larger data infrastructure projects. 
They are largely performing reactive rather than proactive data quality work.<\/p>\n\n\n\n<p><strong>Other important findings from the survey<\/strong><\/p>\n\n\n\n<p>The survey results revealed other interesting insights, including:<\/p>\n\n\n\n<ul>\n<li>Respondents told us it takes 37,500 man hours to build an in-house data quality monitoring solution<\/li>\n\n\n\n<li>That equates to roughly one year of work for 20 engineers<\/li>\n\n\n\n<li>70% of respondents reported at least two data incidents that diminished the productivity of their teams<\/li>\n\n\n\n<li>Data issues most commonly take ~1-2 days to spot and fix, but with a long tail lasting up to weeks and months<\/li>\n\n\n\n<li>Respondents reported at least two \u201csevere\u201d data incidents in the last six months, which damaged the business\/bottom line and were visible at the C-level<\/li>\n<\/ul>\n\n\n\n<blockquote class=\"wp-block-quote\">\n<p>\u201cComing from a data team before starting Bigeye, I knew anecdotally how much of a burden data quality and pipeline reliability issues were. These survey results confirmed my experience: data quality issues are the biggest blockers preventing data teams from being successful,\u201d said Kyle Kirwan, Bigeye\u2019s CEO and co-founder. 
\u201cWe\u2019ve heard that around 250-500 hours are lost every quarter, just dealing with data pipeline issues.\u201d<\/p>\n<\/blockquote>\n\n\n\n<p>To read the full report, click <a href=\"https:\/\/assets.ctfassets.net\/jefo7j3ehpck\/3RJQQQsDcGiWpy6oLbRrPJ\/f85084c5285873d913dcd6f8d3aa70f4\/Bigeye_Data-Quality-Survey_Report-2023.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">HERE<\/a>.<\/p>\n\n\n\n<p><em>Sign up for the free insideBIGDATA&nbsp;<a href=\"http:\/\/inside-bigdata.com\/newsletter\/\" target=\"_blank\" rel=\"noreferrer noopener\">newsletter<\/a>.<\/em><\/p>\n\n\n\n<p><em>Join us on Twitter:&nbsp;<a href=\"https:\/\/twitter.com\/InsideBigData1\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/twitter.com\/InsideBigData1<\/a><\/em><\/p>\n\n\n\n<p><em>Join us on LinkedIn:&nbsp;<a href=\"https:\/\/www.linkedin.com\/company\/insidebigdata\/\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/www.linkedin.com\/company\/insidebigdata\/<\/a><\/em><\/p>\n\n\n\n<p><em>Join us on Facebook:&nbsp;<a href=\"https:\/\/www.facebook.com\/insideBIGDATANOW\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/www.facebook.com\/insideBIGDATANOW<\/a><\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Bigeye, the data observability company, announced the results of its 2023 State of Data Quality survey. The report sheds light on the most pervasive problems in data quality today. 
The report, which was researched and authored by Bigeye, consisted of answers from 100 survey respondents.<\/p>\n","protected":false},"author":10513,"featured_media":27298,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"footnotes":""},"categories":[115,63,64,180,268,56,84,1],"tags":[280,237,96],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v20.6 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>State of Data Quality Report - insideBIGDATA<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/insidebigdata.com\/2023\/06\/13\/state-of-data-quality-report\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"State of Data Quality Report - insideBIGDATA\" \/>\n<meta property=\"og:description\" content=\"Bigeye, the data observability company, announced the results of its 2023 State of Data Quality survey. The report sheds light on the most pervasive problems in data quality today. 
The report, which was researched and authored by Bigeye, consisted of answers from 100 survey respondents.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/insidebigdata.com\/2023\/06\/13\/state-of-data-quality-report\/\" \/>\n<meta property=\"og:site_name\" content=\"insideBIGDATA\" \/>\n<meta property=\"article:publisher\" content=\"http:\/\/www.facebook.com\/insidebigdata\" \/>\n<meta property=\"article:published_time\" content=\"2023-06-13T13:00:00+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2023-06-09T17:56:53+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/10\/data_quality_shutterstock_243064750.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"300\" \/>\n\t<meta property=\"og:image:height\" content=\"283\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Editorial Team\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@insideBigData\" \/>\n<meta name=\"twitter:site\" content=\"@insideBigData\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Editorial Team\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/insidebigdata.com\/2023\/06\/13\/state-of-data-quality-report\/\",\"url\":\"https:\/\/insidebigdata.com\/2023\/06\/13\/state-of-data-quality-report\/\",\"name\":\"State of Data Quality Report - insideBIGDATA\",\"isPartOf\":{\"@id\":\"https:\/\/insidebigdata.com\/#website\"},\"datePublished\":\"2023-06-13T13:00:00+00:00\",\"dateModified\":\"2023-06-09T17:56:53+00:00\",\"author\":{\"@id\":\"https:\/\/insidebigdata.com\/#\/schema\/person\/2949e412c144601cdbcc803bd234e1b9\"},\"breadcrumb\":{\"@id\":\"https:\/\/insidebigdata.com\/2023\/06\/13\/state-of-data-quality-report\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/insidebigdata.com\/2023\/06\/13\/state-of-data-quality-report\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/insidebigdata.com\/2023\/06\/13\/state-of-data-quality-report\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/insidebigdata.com\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"State of Data Quality Report\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/insidebigdata.com\/#website\",\"url\":\"https:\/\/insidebigdata.com\/\",\"name\":\"insideBIGDATA\",\"description\":\"Your Source for AI, Data Science, Deep Learning &amp; Machine Learning Strategies\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/insidebigdata.com\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\/\/insidebigdata.com\/#\/schema\/person\/2949e412c144601cdbcc803bd234e1b9\",\"name\":\"Editorial 
Team\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/insidebigdata.com\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/e137ce7ea40e38bd4d25bb7860cfe3e4?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/e137ce7ea40e38bd4d25bb7860cfe3e4?s=96&d=mm&r=g\",\"caption\":\"Editorial Team\"},\"sameAs\":[\"http:\/\/www.insidebigdata.com\"],\"url\":\"https:\/\/insidebigdata.com\/author\/editorial\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"State of Data Quality Report - insideBIGDATA","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/insidebigdata.com\/2023\/06\/13\/state-of-data-quality-report\/","og_locale":"en_US","og_type":"article","og_title":"State of Data Quality Report - insideBIGDATA","og_description":"Bigeye, the data observability company, announced the results of its 2023 State of Data Quality survey. The report sheds light on the most pervasive problems in data quality today. The report, which was researched and authored by Bigeye, consisted of answers from 100 survey respondents.","og_url":"https:\/\/insidebigdata.com\/2023\/06\/13\/state-of-data-quality-report\/","og_site_name":"insideBIGDATA","article_publisher":"http:\/\/www.facebook.com\/insidebigdata","article_published_time":"2023-06-13T13:00:00+00:00","article_modified_time":"2023-06-09T17:56:53+00:00","og_image":[{"width":300,"height":283,"url":"https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/10\/data_quality_shutterstock_243064750.jpg","type":"image\/jpeg"}],"author":"Editorial Team","twitter_card":"summary_large_image","twitter_creator":"@insideBigData","twitter_site":"@insideBigData","twitter_misc":{"Written by":"Editorial Team","Est. 
reading time":"3 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/insidebigdata.com\/2023\/06\/13\/state-of-data-quality-report\/","url":"https:\/\/insidebigdata.com\/2023\/06\/13\/state-of-data-quality-report\/","name":"State of Data Quality Report - insideBIGDATA","isPartOf":{"@id":"https:\/\/insidebigdata.com\/#website"},"datePublished":"2023-06-13T13:00:00+00:00","dateModified":"2023-06-09T17:56:53+00:00","author":{"@id":"https:\/\/insidebigdata.com\/#\/schema\/person\/2949e412c144601cdbcc803bd234e1b9"},"breadcrumb":{"@id":"https:\/\/insidebigdata.com\/2023\/06\/13\/state-of-data-quality-report\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/insidebigdata.com\/2023\/06\/13\/state-of-data-quality-report\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/insidebigdata.com\/2023\/06\/13\/state-of-data-quality-report\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/insidebigdata.com\/"},{"@type":"ListItem","position":2,"name":"State of Data Quality Report"}]},{"@type":"WebSite","@id":"https:\/\/insidebigdata.com\/#website","url":"https:\/\/insidebigdata.com\/","name":"insideBIGDATA","description":"Your Source for AI, Data Science, Deep Learning &amp; Machine Learning Strategies","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/insidebigdata.com\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/insidebigdata.com\/#\/schema\/person\/2949e412c144601cdbcc803bd234e1b9","name":"Editorial 
Team","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/insidebigdata.com\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/e137ce7ea40e38bd4d25bb7860cfe3e4?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/e137ce7ea40e38bd4d25bb7860cfe3e4?s=96&d=mm&r=g","caption":"Editorial Team"},"sameAs":["http:\/\/www.insidebigdata.com"],"url":"https:\/\/insidebigdata.com\/author\/editorial\/"}]}},"jetpack_featured_media_url":"https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/10\/data_quality_shutterstock_243064750.jpg","jetpack_shortlink":"https:\/\/wp.me\/p9eA3j-8u1","jetpack-related-posts":[{"id":29701,"url":"https:\/\/insidebigdata.com\/2022\/06\/28\/what-is-data-reliability-engineering\/","url_meta":{"origin":32613,"position":0},"title":"What Is Data Reliability Engineering?","date":"June 28, 2022","format":false,"excerpt":"In this contributed article, Kyle Kirwan, CEO and co-founder of Bigeye, discusses Data Reliability Engineering (DRE), the work done to keep data pipelines delivering fresh and high-quality input data to the users and applications that depend on them. The goal of DRE is to allow for iteration on data infrastructure,\u2026","rel":"","context":"In &quot;Big Data&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":31210,"url":"https:\/\/insidebigdata.com\/2022\/12\/23\/data-leaders-looking-to-data-observability-to-overcome-data-quality-cost-and-pipeline-concerns\/","url_meta":{"origin":32613,"position":1},"title":"Data Leaders Looking to Data Observability to Overcome Data Quality, Cost and Pipeline Concerns","date":"December 23, 2022","format":false,"excerpt":"Acceldata, a market leader in data observability, conducted a survey of 200 data executives and found that they have major concerns including the lack of data pipeline visibility. 
Below are additional highlights from the results.","rel":"","context":"In &quot;Big Data&quot;","img":{"alt_text":"","src":"https:\/\/i0.wp.com\/insidebigdata.com\/wp-content\/uploads\/2022\/09\/Observability_shutterstock_152448146.jpg?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]},{"id":32062,"url":"https:\/\/insidebigdata.com\/2023\/04\/08\/grafana-labs-observability-survey-2023-finds-centralization-saves-time-and-money-for-an-industry-plagued-by-tool-and-data-source-overload\/","url_meta":{"origin":32613,"position":2},"title":"Grafana Labs Observability Survey 2023 Finds Centralization Saves Time and Money for an Industry Plagued by Tool and Data Source Overload","date":"April 8, 2023","format":false,"excerpt":"Grafana Labs, the company behind the open and composable operational dashboards, announced the findings of the\u00a0Grafana Labs Observability Survey 2023. The report, which focused on the state of observability, found that organizations are challenged by tool sprawl and data source overload, with 52% of respondents reporting that their companies use\u2026","rel":"","context":"In &quot;Analytics&quot;","img":{"alt_text":"","src":"https:\/\/i0.wp.com\/insidebigdata.com\/wp-content\/uploads\/2022\/09\/Observability_shutterstock_152448146.jpg?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]},{"id":31948,"url":"https:\/\/insidebigdata.com\/2023\/03\/26\/dbt-labs-report-opportunities-and-challenges-for-analytics-engineers\/","url_meta":{"origin":32613,"position":3},"title":"dbt Labs Report &#8211; Opportunities and Challenges for Analytics Engineers","date":"March 26, 2023","format":false,"excerpt":"The practice of analytics engineering (made popular by dbt Labs) took the data world by storm last year as the novel approach changed how data professionals work. Today, dbt Labs officially launched the inaugural State of Analytics Engineering report. 
The report assessed the analytics engineering practice and gathered insights from\u2026","rel":"","context":"In &quot;Analytics&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":2684,"url":"https:\/\/insidebigdata.com\/2013\/04\/02\/survey-sheds-additional-light-on-big-data-trends\/","url_meta":{"origin":32613,"position":4},"title":"Survey Sheds Additional Light on Big Data Trends","date":"April 2, 2013","format":false,"excerpt":"Talend, a purveyor of open source middleware solutions for, among other things, Big Data integration, recently released the results of a small data survey it conducted to get a better perspective on how data professionals are implementing the technology at their companies. The 231 professionals surveyed were drawn from North\u2026","rel":"","context":"In &quot;Machine Learning&quot;","img":{"alt_text":"","src":"https:\/\/dl.dropbox.com\/u\/5192443\/talendCover.jpg","width":350,"height":200},"classes":[]},{"id":15233,"url":"https:\/\/insidebigdata.com\/2016\/06\/22\/big-data-or-bad-data-survey-shows-enterprises-struggle-to-manage-big-data-flows\/","url_meta":{"origin":32613,"position":5},"title":"Big Data or Bad Data?  
Survey Shows Enterprises Struggle to Manage Big Data Flows","date":"June 22, 2016","format":false,"excerpt":"StreamSets, the company that delivers performance management for data flows, today announced results from a global survey of more than 300 data management professionals conducted by independent research firm Dimensional Research\u00ae.","rel":"","context":"In &quot;Big Data&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"_links":{"self":[{"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/posts\/32613"}],"collection":[{"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/users\/10513"}],"replies":[{"embeddable":true,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/comments?post=32613"}],"version-history":[{"count":0,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/posts\/32613\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/media\/27298"}],"wp:attachment":[{"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/media?parent=32613"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/categories?post=32613"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/tags?post=32613"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}