{"id":33778,"date":"2023-10-31T03:00:00","date_gmt":"2023-10-31T10:00:00","guid":{"rendered":"https:\/\/insidebigdata.com\/?p=33778"},"modified":"2023-10-30T12:17:37","modified_gmt":"2023-10-30T19:17:37","slug":"who-done-it-3-possible-suspects-in-this-halloweens-bad-data-horror-movie-and-how-data-teams-can-make-it-out-alive","status":"publish","type":"post","link":"https:\/\/insidebigdata.com\/2023\/10\/31\/who-done-it-3-possible-suspects-in-this-halloweens-bad-data-horror-movie-and-how-data-teams-can-make-it-out-alive\/","title":{"rendered":"Who Done It? 3 Possible Suspects in this Halloween\u2019s Bad Data Horror Movie, And How Data Teams Can Make It Out Alive"},"content":{"rendered":"\n<p>We all know the tell-tale signs that something is about to go horribly awry in a horror movie.<\/p>\n\n\n\n<p>The getaway car won\u2019t start. Every entrance in the mansion is locked aside from the backdoor. The basement light flickers and the room goes silent.&nbsp;<\/p>\n\n\n\n<p>While moviegoers can hide behind their buckets of popcorn or yell at the protagonist to \u201cget away from the door!\u201d data engineers are not so lucky when horror strikes. And for data engineers, that \u201chorror\u201d is more often than not bad data.&nbsp;<\/p>\n\n\n\n<p>According to a recent study from <a href=\"https:\/\/www.montecarlodata.com\/blog-2022-data-quality-survey\/\" target=\"_blank\" rel=\"noreferrer noopener\">Wakefield Research<\/a>, data teams spend 40 percent or more of their time tackling poor data quality, impacting 26 percent of their company\u2019s total revenue. Talk about a horror story.<\/p>\n\n\n\n<p>What constitutes a data horror story, you might ask? 
Here are a few examples.<\/p>\n\n\n\n<p>On a dark and stormy night in 2022 (just kidding, we don\u2019t know what time of day it was), gaming software company Unity Technologies\u2019 Audience Pinpoint tool, designed to aid game developers in targeted player acquisition and advertising, ingested bad data from a large customer, causing major inaccuracies in the training sets for its predictive ML algorithms and a subsequent dip in performance. The data downtime incident <a href=\"https:\/\/www.marketwatch.com\/story\/unity-software-loses-5-billion-in-market-cap-after-apples-changes-lead-to-self-inflicted-wound-11652291876\" target=\"_blank\" rel=\"noreferrer noopener\">sent the company\u2019s stock plummeting by 36%<\/a>, costing the company upwards of $110 million in lost revenue.\u00a0<\/p>\n\n\n\n<p>Or take <a href=\"https:\/\/www.cnn.com\/2022\/08\/03\/business\/equifax-wrong-credit-scores\/index.html\" target=\"_blank\" rel=\"noreferrer noopener\">Equifax<\/a>, which issued inaccurate credit scores to millions of its customers back in the summer of 2022, all due to a problem with bad data on a legacy on-prem server.<\/p>\n\n\n\n<p>So, how can you evade your own data quality horror stories? We share three common causes of data downtime and walk through how you can escape them.&nbsp;<\/p>\n\n\n\n<p><strong>Is the Call Coming From Inside The House? 3 Root Causes of Data Downtime<\/strong><\/p>\n\n\n\n<p>In the 2023 edition of the same <a href=\"https:\/\/www.montecarlodata.com\/blog-data-quality-survey\" target=\"_blank\" rel=\"noreferrer noopener\">Wakefield data quality survey<\/a>, the time it takes to detect and resolve a given data incident rose by an astounding 166% year-over-year. In the case of our horror movie, that\u2019s like taking an additional week to figure out who the killer is.<\/p>\n\n\n\n<p>To trim that time to resolution (and save some fictional lives), it\u2019s critical to understand more about the root causes of data anomalies. 
And while there are a near-infinite number of root causes for each type of anomaly, they all stem from issues across three layers of your data infrastructure.&nbsp;<\/p>\n\n\n\n<p>Understanding these layers and how they produce data anomalies can provide structure to your incident resolution process.<\/p>\n\n\n\n<p><strong>System root causes<\/strong><\/p>\n\n\n\n<p>System or operational issues arise when an error is introduced by the systems or tools applied to the data during the extraction, loading, and transformation processes. An example could be an Airflow check that took too long to run, causing a data freshness anomaly. Another could be a job that relies on a particular schema in Snowflake but doesn\u2019t have the right permissions to access it.<\/p>\n\n\n\n<p><strong>Code root causes<\/strong><\/p>\n\n\n\n<p>The second type of data incident root cause is code-related. Is there anything wrong with your SQL or engineering code? An improper JOIN statement resulting in unwanted or unfiltered rows, perhaps? Or did a dbt model accidentally pick up a very restrictive WHERE clause that reduced the number of output rows, triggering a volume anomaly?<\/p>\n\n\n\n<p><strong>Data root causes<\/strong><\/p>\n\n\n\n<p>System and code issues are also typical in software engineering, but in the wonderful world of data engineering, issues can also arise in the data itself, making it a more dynamic variable. For example, it could be a consumer application where the customer input is just wacky. 
Let\u2019s say you are an online pet retailer and someone enters that their dog weighs 500 pounds instead of just 50, which results in a field health anomaly.&nbsp;<\/p>\n\n\n\n<p><strong>Your Ticket to Data Horror Story Survival<\/strong><\/p>\n\n\n\n<p>Since data anomalies can originate in any component of your data environment, as well as in the data itself, incident resolution gets messy, and it becomes trickier to nab our killer.&nbsp;<\/p>\n\n\n\n<p>Data teams may have tabs open for Fivetran, Databricks, Snowflake, Airflow, and dbt, while simultaneously reviewing logs and error traces in their ETL engine and running multiple queries to segment the data. And on top of all of this, the massive pressure on data teams to focus on generative AI has sent the production of data and data pipelines into hyperdrive, only exacerbating the shakiness of manual and reactive data quality processes.&nbsp;<\/p>\n\n\n\n<p>Proactive data monitoring and observability can help consolidate and automate these processes by allowing you to see any changes in your data stack, whether the cause lies in code, data, or systems. Not only that, it provides the lineage of the issues &#8211; down to the field &#8211; at just the click of a mouse.&nbsp;<\/p>\n\n\n\n<p>No cliffhangers in this data horror movie! 
Unless\u2026\u2026\u2026<\/p>\n\n\n\n<p><strong>About the Author<\/strong><\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"alignleft size-full\"><img decoding=\"async\" loading=\"lazy\" width=\"150\" height=\"150\" src=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2023\/10\/Lior_montecarlodata.com_.jpeg\" alt=\"\" class=\"wp-image-33779\" srcset=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2023\/10\/Lior_montecarlodata.com_.jpeg 150w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2023\/10\/Lior_montecarlodata.com_-300x300.jpeg 300w\" sizes=\"(max-width: 150px) 100vw, 150px\" \/><\/figure><\/div>\n\n\n<p><em>Lior Gavish is CTO and Co-Founder of\u00a0<a href=\"http:\/\/www.montecarlodata.com\/\" target=\"_blank\" rel=\"noreferrer noopener\">Monte Carlo<\/a>, a data reliability company backed by Accel, Redpoint Ventures, GGV, ICONIQ Growth, Salesforce Ventures, and IVP. Prior to Monte Carlo, Lior co-founded cybersecurity startup Sookasa, which was acquired by Barracuda in 2016. At Barracuda, Lior was SVP of Engineering, launching award-winning ML products for fraud prevention. 
Lior holds an MBA from Stanford and an MSC in Computer Science from Tel-Aviv University.<\/em><\/p>\n\n\n\n<p><em>Sign up for the free insideBIGDATA&nbsp;<a href=\"http:\/\/inside-bigdata.com\/newsletter\/\" target=\"_blank\" rel=\"noreferrer noopener\">newsletter<\/a>.<\/em><\/p>\n\n\n\n<p><em>Join us on Twitter:&nbsp;<a href=\"https:\/\/twitter.com\/InsideBigData1\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/twitter.com\/InsideBigData1<\/a><\/em><\/p>\n\n\n\n<p><em>Join us on LinkedIn:&nbsp;<a href=\"https:\/\/www.linkedin.com\/company\/insidebigdata\/\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/www.linkedin.com\/company\/insidebigdata\/<\/a><\/em><\/p>\n\n\n\n<p><em>Join us on Facebook:&nbsp;<a href=\"https:\/\/www.facebook.com\/insideBIGDATANOW\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/www.facebook.com\/insideBIGDATANOW<\/a><\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this contributed article, Lior Gavish, CTO and Co-Founder of\u00a0Monte Carlo, outlines some of the ways companies can erase themselves from ever appearing in these bad data horror stories, ranging from simple tips to bolster governance within their organization, to tools and best practices that will save data teams the time, hassle, and headache that comes with dealing with bad data.<\/p>\n","protected":false},"author":10531,"featured_media":33238,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"footnotes":""},"categories":[115,180,268,56,97,1],"tags":[1400,237,96],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v20.6 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Who Done It? 
3 Possible Suspects in this Halloween\u2019s Bad Data Horror Movie, And How Data Teams Can Make It Out Alive - insideBIGDATA<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/insidebigdata.com\/2023\/10\/31\/who-done-it-3-possible-suspects-in-this-halloweens-bad-data-horror-movie-and-how-data-teams-can-make-it-out-alive\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Who Done It? 3 Possible Suspects in this Halloween\u2019s Bad Data Horror Movie, And How Data Teams Can Make It Out Alive - insideBIGDATA\" \/>\n<meta property=\"og:description\" content=\"In this contributed article, Lior Gavish, CTO and Co-Founder of\u00a0Monte Carlo, outlines some of the ways companies can erase themselves from ever appearing in these bad data horror stories, ranging from simple tips to bolster governance within their organization, to tools and best practices that will save data teams the time, hassle, and headache that comes with dealing with bad data.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/insidebigdata.com\/2023\/10\/31\/who-done-it-3-possible-suspects-in-this-halloweens-bad-data-horror-movie-and-how-data-teams-can-make-it-out-alive\/\" \/>\n<meta property=\"og:site_name\" content=\"insideBIGDATA\" \/>\n<meta property=\"article:publisher\" content=\"http:\/\/www.facebook.com\/insidebigdata\" \/>\n<meta property=\"article:published_time\" content=\"2023-10-31T10:00:00+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2023-10-30T19:17:37+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2023\/08\/Data_shutterstock_1055190668_special.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1100\" \/>\n\t<meta property=\"og:image:height\" content=\"550\" \/>\n\t<meta 
property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Contributor\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@insideBigData\" \/>\n<meta name=\"twitter:site\" content=\"@insideBigData\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Contributor\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"5 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/insidebigdata.com\/2023\/10\/31\/who-done-it-3-possible-suspects-in-this-halloweens-bad-data-horror-movie-and-how-data-teams-can-make-it-out-alive\/\",\"url\":\"https:\/\/insidebigdata.com\/2023\/10\/31\/who-done-it-3-possible-suspects-in-this-halloweens-bad-data-horror-movie-and-how-data-teams-can-make-it-out-alive\/\",\"name\":\"Who Done It? 
3 Possible Suspects in this Halloween\u2019s Bad Data Horror Movie, And How Data Teams Can Make It Out Alive - insideBIGDATA\",\"isPartOf\":{\"@id\":\"https:\/\/insidebigdata.com\/#website\"},\"datePublished\":\"2023-10-31T10:00:00+00:00\",\"dateModified\":\"2023-10-30T19:17:37+00:00\",\"author\":{\"@id\":\"https:\/\/insidebigdata.com\/#\/schema\/person\/35a290930284d4cdbf002d457f3d5d87\"},\"breadcrumb\":{\"@id\":\"https:\/\/insidebigdata.com\/2023\/10\/31\/who-done-it-3-possible-suspects-in-this-halloweens-bad-data-horror-movie-and-how-data-teams-can-make-it-out-alive\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/insidebigdata.com\/2023\/10\/31\/who-done-it-3-possible-suspects-in-this-halloweens-bad-data-horror-movie-and-how-data-teams-can-make-it-out-alive\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/insidebigdata.com\/2023\/10\/31\/who-done-it-3-possible-suspects-in-this-halloweens-bad-data-horror-movie-and-how-data-teams-can-make-it-out-alive\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/insidebigdata.com\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Who Done It? 
3 Possible Suspects in this Halloween\u2019s Bad Data Horror Movie, And How Data Teams Can Make It Out Alive\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/insidebigdata.com\/#website\",\"url\":\"https:\/\/insidebigdata.com\/\",\"name\":\"insideBIGDATA\",\"description\":\"Your Source for AI, Data Science, Deep Learning &amp; Machine Learning Strategies\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/insidebigdata.com\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\/\/insidebigdata.com\/#\/schema\/person\/35a290930284d4cdbf002d457f3d5d87\",\"name\":\"Contributor\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/insidebigdata.com\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/36bffd267e38ed3f525205f67270e91b?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/36bffd267e38ed3f525205f67270e91b?s=96&d=mm&r=g\",\"caption\":\"Contributor\"},\"url\":\"https:\/\/insidebigdata.com\/author\/contributor\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Who Done It? 3 Possible Suspects in this Halloween\u2019s Bad Data Horror Movie, And How Data Teams Can Make It Out Alive - insideBIGDATA","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/insidebigdata.com\/2023\/10\/31\/who-done-it-3-possible-suspects-in-this-halloweens-bad-data-horror-movie-and-how-data-teams-can-make-it-out-alive\/","og_locale":"en_US","og_type":"article","og_title":"Who Done It? 
3 Possible Suspects in this Halloween\u2019s Bad Data Horror Movie, And How Data Teams Can Make It Out Alive - insideBIGDATA","og_description":"In this contributed article, Lior Gavish, CTO and Co-Founder of\u00a0Monte Carlo, outlines some of the ways companies can erase themselves from ever appearing in these bad data horror stories, ranging from simple tips to bolster governance within their organization, to tools and best practices that will save data teams the time, hassle, and headache that comes with dealing with bad data.","og_url":"https:\/\/insidebigdata.com\/2023\/10\/31\/who-done-it-3-possible-suspects-in-this-halloweens-bad-data-horror-movie-and-how-data-teams-can-make-it-out-alive\/","og_site_name":"insideBIGDATA","article_publisher":"http:\/\/www.facebook.com\/insidebigdata","article_published_time":"2023-10-31T10:00:00+00:00","article_modified_time":"2023-10-30T19:17:37+00:00","og_image":[{"width":1100,"height":550,"url":"https:\/\/insidebigdata.com\/wp-content\/uploads\/2023\/08\/Data_shutterstock_1055190668_special.jpg","type":"image\/jpeg"}],"author":"Contributor","twitter_card":"summary_large_image","twitter_creator":"@insideBigData","twitter_site":"@insideBigData","twitter_misc":{"Written by":"Contributor","Est. reading time":"5 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/insidebigdata.com\/2023\/10\/31\/who-done-it-3-possible-suspects-in-this-halloweens-bad-data-horror-movie-and-how-data-teams-can-make-it-out-alive\/","url":"https:\/\/insidebigdata.com\/2023\/10\/31\/who-done-it-3-possible-suspects-in-this-halloweens-bad-data-horror-movie-and-how-data-teams-can-make-it-out-alive\/","name":"Who Done It? 
3 Possible Suspects in this Halloween\u2019s Bad Data Horror Movie, And How Data Teams Can Make It Out Alive - insideBIGDATA","isPartOf":{"@id":"https:\/\/insidebigdata.com\/#website"},"datePublished":"2023-10-31T10:00:00+00:00","dateModified":"2023-10-30T19:17:37+00:00","author":{"@id":"https:\/\/insidebigdata.com\/#\/schema\/person\/35a290930284d4cdbf002d457f3d5d87"},"breadcrumb":{"@id":"https:\/\/insidebigdata.com\/2023\/10\/31\/who-done-it-3-possible-suspects-in-this-halloweens-bad-data-horror-movie-and-how-data-teams-can-make-it-out-alive\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/insidebigdata.com\/2023\/10\/31\/who-done-it-3-possible-suspects-in-this-halloweens-bad-data-horror-movie-and-how-data-teams-can-make-it-out-alive\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/insidebigdata.com\/2023\/10\/31\/who-done-it-3-possible-suspects-in-this-halloweens-bad-data-horror-movie-and-how-data-teams-can-make-it-out-alive\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/insidebigdata.com\/"},{"@type":"ListItem","position":2,"name":"Who Done It? 
3 Possible Suspects in this Halloween\u2019s Bad Data Horror Movie, And How Data Teams Can Make It Out Alive"}]},{"@type":"WebSite","@id":"https:\/\/insidebigdata.com\/#website","url":"https:\/\/insidebigdata.com\/","name":"insideBIGDATA","description":"Your Source for AI, Data Science, Deep Learning &amp; Machine Learning Strategies","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/insidebigdata.com\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/insidebigdata.com\/#\/schema\/person\/35a290930284d4cdbf002d457f3d5d87","name":"Contributor","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/insidebigdata.com\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/36bffd267e38ed3f525205f67270e91b?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/36bffd267e38ed3f525205f67270e91b?s=96&d=mm&r=g","caption":"Contributor"},"url":"https:\/\/insidebigdata.com\/author\/contributor\/"}]}},"jetpack_featured_media_url":"https:\/\/insidebigdata.com\/wp-content\/uploads\/2023\/08\/Data_shutterstock_1055190668_special.jpg","jetpack_shortlink":"https:\/\/wp.me\/p9eA3j-8MO","jetpack-related-posts":[{"id":29612,"url":"https:\/\/insidebigdata.com\/2022\/06\/18\/great-expectations-study-reveals-77-of-organizations-have-data-quality-issues\/","url_meta":{"origin":33778,"position":0},"title":"Great Expectations Study Reveals 77% of Organizations have Data Quality Issues\u00a0","date":"June 18, 2022","format":false,"excerpt":"Great Expectations, a leading open-source platform for data quality, announced the results of a survey highlighting top pain points and consequences of poor data quality within organizations. 
Insights from 500 data practitioners (engineers, analysts, and scientists) showed that 77% have data quality issues and 91% said it\u2019s impacting their company\u2019s\u2026","rel":"","context":"In &quot;Big Data&quot;","img":{"alt_text":"","src":"https:\/\/i0.wp.com\/insidebigdata.com\/wp-content\/uploads\/2021\/10\/data_quality_shutterstock_243064750.jpg?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]},{"id":27686,"url":"https:\/\/insidebigdata.com\/2021\/11\/21\/data-and-analytics-leaders-report-wasting-funds-on-bad-data\/","url_meta":{"origin":33778,"position":1},"title":"Data and Analytics Leaders Report Wasting Funds on Bad Data","date":"November 21, 2021","format":false,"excerpt":"As enterprises fiercely compete for data engineers, a new global poll out today by Wakefield Research and Fivetran, a leading provider of automated data integration, shows that, on average, 44 percent of their time is wasted building and rebuilding data pipelines, which connect data lakes and warehouses with databases and\u2026","rel":"","context":"In &quot;Analytics&quot;","img":{"alt_text":"","src":"https:\/\/i0.wp.com\/insidebigdata.com\/wp-content\/uploads\/2021\/11\/Wakefield-Research-State-of-Data-Management-survey-infographic-11.15.21.jpg?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]},{"id":32613,"url":"https:\/\/insidebigdata.com\/2023\/06\/13\/state-of-data-quality-report\/","url_meta":{"origin":33778,"position":2},"title":"State of Data Quality Report","date":"June 13, 2023","format":false,"excerpt":"Bigeye, the data observability company, announced the results of its 2023 State of Data Quality survey. The report sheds light on the most pervasive problems in data quality today. 
The report, which was researched and authored by Bigeye, consisted of answers from 100 survey respondents.","rel":"","context":"In &quot;Big Data&quot;","img":{"alt_text":"","src":"https:\/\/i0.wp.com\/insidebigdata.com\/wp-content\/uploads\/2021\/10\/data_quality_shutterstock_243064750.jpg?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]},{"id":24821,"url":"https:\/\/insidebigdata.com\/2020\/08\/02\/infographic-data-engineering-evolved\/","url_meta":{"origin":33778,"position":3},"title":"Infographic: Data Engineering Evolved","date":"August 2, 2020","format":false,"excerpt":"Ascend.io, the data engineering company, announced results from a new research study about the work conditions of data scientists, data engineers, and enterprise architects in the U.S. Conducted in June 2020, findings from over 300 professionals reveal key insights on their teams\u2019 current workload, productivity bottlenecks, and perspectives on automation\u2026","rel":"","context":"In &quot;Big Data&quot;","img":{"alt_text":"","src":"https:\/\/i0.wp.com\/insidebigdata.com\/wp-content\/uploads\/2020\/07\/ascend-data-eng-survey-infographic-scaled.jpg?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]},{"id":25579,"url":"https:\/\/insidebigdata.com\/2021\/02\/03\/data-engineering-survey-2021-impact-report\/","url_meta":{"origin":33778,"position":4},"title":"Data Engineering Survey: 2021 Impact Report","date":"February 3, 2021","format":false,"excerpt":"This Data Engineering Survey: 2021 Impact Report summarizes key findings from the inaugural survey and provides a glimpse into the current and future state of data engineering and DataOps. 
The report highlights some of the major trends uncovered in this year\u2019s survey including the adoption of cloud data platforms, what\u2026","rel":"","context":"In &quot;AI Deep Learning&quot;","img":{"alt_text":"","src":"https:\/\/i0.wp.com\/insidebigdata.com\/wp-content\/uploads\/2021\/02\/Immuta_fig1.png?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]},{"id":31429,"url":"https:\/\/insidebigdata.com\/2023\/01\/19\/interview-david-willingham-principal-product-manager-mathworks\/","url_meta":{"origin":33778,"position":5},"title":"Interview: David Willingham, Principal Product Manager,\u00a0MathWorks","date":"January 19, 2023","format":false,"excerpt":"I recently caught up with David Willingham, Principal Product Manager, MathWorks\u00a0to discuss the evolution of data-centric AI and how engineers can best navigate \u2013 and benefit from \u2013 the transition to data-focused models within deep learning environments. As research into data-centric AI continues, we can expect best practice to evolve\u2026","rel":"","context":"In &quot;AI Deep 
Learning&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"_links":{"self":[{"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/posts\/33778"}],"collection":[{"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/users\/10531"}],"replies":[{"embeddable":true,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/comments?post=33778"}],"version-history":[{"count":0,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/posts\/33778\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/media\/33238"}],"wp:attachment":[{"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/media?parent=33778"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/categories?post=33778"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/tags?post=33778"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}