{"id":32621,"date":"2023-06-23T03:00:00","date_gmt":"2023-06-23T10:00:00","guid":{"rendered":"https:\/\/insidebigdata.com\/?p=32621"},"modified":"2023-07-06T12:53:18","modified_gmt":"2023-07-06T19:53:18","slug":"book-review-the-kaggle-book-workbook","status":"publish","type":"post","link":"https:\/\/insidebigdata.com\/2023\/06\/23\/book-review-the-kaggle-book-workbook\/","title":{"rendered":"Book Review: The Kaggle Book\/Workbook"},"content":{"rendered":"<div class=\"wp-block-image\">\n<figure class=\"alignright size-full is-resized\"><img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2023\/06\/Packt_KaggleBook.png\" alt=\"\" class=\"wp-image-32696\" width=\"213\" height=\"270\" srcset=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2023\/06\/Packt_KaggleBook.png 357w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2023\/06\/Packt_KaggleBook-236x300.png 236w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2023\/06\/Packt_KaggleBook-118x150.png 118w\" sizes=\"(max-width: 213px) 100vw, 213px\" \/><\/figure><\/div>\n\n\n<p><a href=\"https:\/\/www.kaggle.com\/\" target=\"_blank\" rel=\"noreferrer noopener\">Kaggle<\/a> (<a href=\"https:\/\/techcrunch.com\/2017\/03\/07\/google-is-acquiring-data-science-community-kaggle\/\" target=\"_blank\" rel=\"noreferrer noopener\">acquired by Google<\/a> in 2017) is an incredible resource for all data scientists. The company promotes itself as &#8220;the home of data science.&#8221; I advise my Intro to Data Science students at UCLA to take advantage of Kaggle by first completing the venerable <a href=\"https:\/\/www.kaggle.com\/c\/titanic\" target=\"_blank\" rel=\"noreferrer noopener\">Titanic<\/a> <em>Getting Started Prediction Challenge<\/em>, and then moving on to active challenges. Kaggle is a great way to gain valuable experience with data science and machine learning. Now, there are two excellent books to lead you through the Kaggle process. 
<a href=\"https:\/\/www.kaggle.com\/general\/320574\" target=\"_blank\" rel=\"noreferrer noopener\">The Kaggle Book<\/a> by Konrad Banachewicz and Luca Massaron, published in 2022, and <a href=\"https:\/\/www.packtpub.com\/product\/the-kaggle-workbook\/9781804611210\" target=\"_blank\" rel=\"noreferrer noopener\">The Kaggle Workbook<\/a> by the same authors, published in 2023, are both from UK-based Packt Publishing. <\/p>\n\n\n\n<p>Let&#8217;s start with <em>The Kaggle Book<\/em>. The book is an invaluable learning resource for anyone participating in a Kaggle competition, as well as pretty much any data scientist wishing to sharpen his\/her skills. Reading the book is like doing a Vulcan mind-meld with Kaggle Masters and Grandmasters; you get an instant appreciation for how these experts have done so well in the Kaggle ecosystem. This is achieved in a number of ways: through their winning Python code, through their detailed interview sidebars spread throughout the book, and through selected links pointing to important Kaggle discussions. This last feature of the book may be the most useful, as some of the discussions offer insights that you won&#8217;t find anywhere else. For example, Grandmaster Michael Jahrer&#8217;s <a href=\"https:\/\/www.kaggle.com\/c\/porto-seguro-safe-driver-prediction\/discussion\/44629\" target=\"_blank\" rel=\"noreferrer noopener\">famous post<\/a> on denoising autoencoders is featured in Chapter 7. Reading his detailed explanation of how he won 1st place in the Porto Seguro&#8217;s Safe Driver Prediction competition is an excellent way to add to your own data science toolbox. Chapter 7 also includes an insightful interview with well-known Kaggler Bojan Tunguz, who is a huge proponent of XGBoost on <a href=\"https:\/\/twitter.com\/tunguz\" target=\"_blank\" rel=\"noreferrer noopener\">Twitter<\/a>. 
<\/p>\n\n\n\n<p>The book also offers strategic references to many Kaggle competitions that illustrate critical methods for ensuring machine learning success. For example, Chapter 5 includes references to a number of competitions in which AUC was used as the evaluation metric for classification. You can have fun bouncing from project to project to better understand important principles in machine learning. The book serves as a guide map for such explorations. The result is a much better understanding of how to approach projects moving forward. <\/p>\n\n\n\n<p>One of my favorite chapters is Chapter 5 on Metrics since, when it comes right down to it, you need a firm stable of techniques with which to judge the performance of your ML solutions. Another favorite is Chapter 8 on Hyperparameter Optimization. Using the best and most powerful algorithms is one thing, but knowing how to optimize a model&#8217;s many hyperparameters is quite another. Although the book doesn&#8217;t address the mathematical foundations for the algorithms and their hyperparameters, it does provide insights into finding the best hyperparameters for your models. Seeing how Grandmasters address the hyperparameter problem is quite valuable. I also enjoyed Chapter 7 on modeling tabular data, i.e. business data. Here are discussions of important topics like dimensionality reduction, feature engineering, and using neural networks for tabular data. <\/p>\n\n\n\n<p>The balance of the book includes useful topics like: an introduction to Kaggle datasets, working with Kaggle notebooks, leveraging Kaggle discussion forums, as well as popular topics like computer vision and NLP. This book is a great way to tame the complex Kaggle infrastructure, and I can&#8217;t imagine proceeding with a Kaggle competition without this book by your side. 
<\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"alignleft size-full is-resized\"><img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2023\/06\/Packt_KaggleWorkbook.png\" alt=\"\" class=\"wp-image-32694\" width=\"236\" height=\"292\" srcset=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2023\/06\/Packt_KaggleWorkbook.png 327w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2023\/06\/Packt_KaggleWorkbook-242x300.png 242w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2023\/06\/Packt_KaggleWorkbook-121x150.png 121w\" sizes=\"(max-width: 236px) 100vw, 236px\" \/><\/figure><\/div>\n\n\n<p>A very nice adjunct to The Kaggle Book is <em>The Kaggle Workbook<\/em>, which contains just four chapters. Each chapter is a thorough review of a past Kaggle challenge that can be viewed as a self-learning exercise containing valuable insights for Kaggle data science competitions. Each of the four chapters includes Python source code for the solution. The code is designed to run on a Kaggle notebook. Here is a list of the projects:<\/p>\n\n\n\n<ul>\n<li><a href=\"https:\/\/www.kaggle.com\/c\/porto-seguro-safe-driver-prediction\" target=\"_blank\" rel=\"noreferrer noopener\">Porto Seguro&#8217;s Safe Driver Prediction<\/a> &#8211; Predict if a driver will file an insurance claim next year. The project includes using the LightGBM model, building a denoising autoencoder and using it to feed a neural network, and blending models. <\/li>\n\n\n\n<li><a href=\"https:\/\/www.kaggle.com\/competitions\/m5-forecasting-uncertainty\" target=\"_blank\" rel=\"noreferrer noopener\">M5 on Kaggle for Accuracy and Uncertainty<\/a> &#8211; Based on Walmart&#8217;s daily sales time series of items hierarchically arranged into departments, categories, and stores spread across three U.S. states, the solution demonstrates how to use LightGBM for this time series problem. 
<\/li>\n\n\n\n<li><a href=\"https:\/\/www.kaggle.com\/competitions\/cassava-leaf-disease-classification\" target=\"_blank\" rel=\"noreferrer noopener\">Cassava Leaf Disease Classification<\/a> &#8211; Classify crowdsourced photos of cassava plants. This multiclass problem demonstrates how to build a complete pipeline for image classification. <\/li>\n\n\n\n<li><a href=\"https:\/\/www.kaggle.com\/competitions\/google-quest-challenge\" target=\"_blank\" rel=\"noreferrer noopener\">Google Quest Q&amp;A Labeling<\/a> &#8211; Predict human responders&#8217; evaluations of subjective aspects of a question\/answer pair where an understanding of context was crucial. Cast as a multiclass classification problem, the solution explores the semantic characteristics of a corpus. <\/li>\n<\/ul>\n\n\n\n<p><strong>Conclusion<\/strong><\/p>\n\n\n\n<p>If you&#8217;re thinking of competing in a Kaggle challenge, or if you just want to push forward with your data science skills, I would highly recommend this Kaggle book tandem. I can&#8217;t see investing in one book and not the other. You need them both. They represent an excellent one-two punch for gaining valuable experience with solving machine learning problems. <\/p>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"alignleft size-full is-resized\"><img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2018\/12\/Daniel_2018_pic.png\" alt=\"\" class=\"wp-image-21778\" width=\"107\" height=\"122\" srcset=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2018\/12\/Daniel_2018_pic.png 200w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2018\/12\/Daniel_2018_pic-131x150.png 131w\" sizes=\"(max-width: 107px) 100vw, 107px\" \/><\/figure><\/div>\n\n\n<p><em>Contributed by Daniel D. Gutierrez, Editor-in-Chief and Resident Data Scientist for insideBIGDATA. 
In addition to being a tech journalist, Daniel is also a data science consultant, author, and educator, and sits on a number of advisory boards for various start-up companies.&nbsp;<\/em><\/p>\n\n\n\n<p><em>Sign up for the free insideBIGDATA&nbsp;<a href=\"http:\/\/inside-bigdata.com\/newsletter\/\" target=\"_blank\" rel=\"noreferrer noopener\">newsletter<\/a>.<\/em><\/p>\n\n\n\n<p><em>Join us on Twitter:&nbsp;<a href=\"https:\/\/twitter.com\/InsideBigData1\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/twitter.com\/InsideBigData1<\/a><\/em><\/p>\n\n\n\n<p><em>Join us on LinkedIn:&nbsp;<a href=\"https:\/\/www.linkedin.com\/company\/insidebigdata\/\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/www.linkedin.com\/company\/insidebigdata\/<\/a><\/em><\/p>\n\n\n\n<p><em>Join us on Facebook:&nbsp;<a href=\"https:\/\/www.facebook.com\/insideBIGDATANOW\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/www.facebook.com\/insideBIGDATANOW<\/a><\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Kaggle is an incredible resource for all data scientists. I advise my Intro to Data Science students at UCLA to take advantage of Kaggle by first completing the venerable Titanic Getting Started Prediction Challenge, and then moving on to active challenges. Kaggle is a great way to gain valuable experience with data science and machine learning. Now, there are two excellent books to lead you through the Kaggle process. The Kaggle Book by Konrad Banachewicz and Luca Massaron published in 2022, and The Kaggle Workbook by the same authors published in 2023, both from UK-based Packt Publishing, are excellent learning resources. 
<\/p>\n","protected":false},"author":37,"featured_media":32634,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"footnotes":""},"categories":[526,115,92,182,270,90,180,67,268,56,1],"tags":[437,133,264,814,277,95],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v20.6 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Book Review: The Kaggle Book\/Workbook - insideBIGDATA<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/insidebigdata.com\/2023\/06\/23\/book-review-the-kaggle-book-workbook\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Book Review: The Kaggle Book\/Workbook - insideBIGDATA\" \/>\n<meta property=\"og:description\" content=\"Kaggle is an incredible resource for all data scientists. I advise my Intro to Data Science students at UCLA to take advantage of Kaggle by first completing the venerable Titanic Getting Started Prediction Challenge, and then moving on to active challenges. Kaggle is a great way to gain valuable experience with data science and machine learning. Now, there are two excellent books to lead you through the Kaggle process. 
The Kaggle Book by Konrad Banachewicz and Luca Massaron published in 2022, and The Kaggle Workbook by the same authors published in 2023, both from UK-based Packt Publishing, are excellent learning resources.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/insidebigdata.com\/2023\/06\/23\/book-review-the-kaggle-book-workbook\/\" \/>\n<meta property=\"og:site_name\" content=\"insideBIGDATA\" \/>\n<meta property=\"article:publisher\" content=\"http:\/\/www.facebook.com\/insidebigdata\" \/>\n<meta property=\"article:published_time\" content=\"2023-06-23T10:00:00+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2023-07-06T19:53:18+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2023\/06\/Machine_Learning_shutterstock_1110900704_special.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1100\" \/>\n\t<meta property=\"og:image:height\" content=\"550\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Daniel Gutierrez\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@AMULETAnalytics\" \/>\n<meta name=\"twitter:site\" content=\"@insideBigData\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Daniel Gutierrez\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"5 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/insidebigdata.com\/2023\/06\/23\/book-review-the-kaggle-book-workbook\/\",\"url\":\"https:\/\/insidebigdata.com\/2023\/06\/23\/book-review-the-kaggle-book-workbook\/\",\"name\":\"Book Review: The Kaggle Book\/Workbook - insideBIGDATA\",\"isPartOf\":{\"@id\":\"https:\/\/insidebigdata.com\/#website\"},\"datePublished\":\"2023-06-23T10:00:00+00:00\",\"dateModified\":\"2023-07-06T19:53:18+00:00\",\"author\":{\"@id\":\"https:\/\/insidebigdata.com\/#\/schema\/person\/2540da209c83a68f4f5922848f7376ed\"},\"breadcrumb\":{\"@id\":\"https:\/\/insidebigdata.com\/2023\/06\/23\/book-review-the-kaggle-book-workbook\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/insidebigdata.com\/2023\/06\/23\/book-review-the-kaggle-book-workbook\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/insidebigdata.com\/2023\/06\/23\/book-review-the-kaggle-book-workbook\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/insidebigdata.com\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Book Review: The Kaggle Book\/Workbook\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/insidebigdata.com\/#website\",\"url\":\"https:\/\/insidebigdata.com\/\",\"name\":\"insideBIGDATA\",\"description\":\"Your Source for AI, Data Science, Deep Learning &amp; Machine Learning Strategies\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/insidebigdata.com\/?s={search_term_string}\"},\"query-input\":\"required 
name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\/\/insidebigdata.com\/#\/schema\/person\/2540da209c83a68f4f5922848f7376ed\",\"name\":\"Daniel Gutierrez\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/insidebigdata.com\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/5780282e7e567e2a502233e948464542?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/5780282e7e567e2a502233e948464542?s=96&d=mm&r=g\",\"caption\":\"Daniel Gutierrez\"},\"description\":\"Daniel D. Gutierrez is a Data Scientist with Los Angeles-based AMULET Analytics, a service division of AMULET Development Corp. He's been involved with data science and Big Data long before it came in vogue, so imagine his delight when the Harvard Business Review recently deemed \\\"data scientist\\\" as the sexiest profession for the 21st century. Previously, he taught computer science and database classes at UCLA Extension for over 15 years, and authored three computer industry books on database technology. He also served as technical editor, columnist and writer at a major computer industry monthly publication for 7 years. Follow his data science musings at @AMULETAnalytics.\",\"sameAs\":[\"http:\/\/www.insidebigdata.com\",\"https:\/\/twitter.com\/@AMULETAnalytics\"],\"url\":\"https:\/\/insidebigdata.com\/author\/dangutierrez\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Book Review: The Kaggle Book\/Workbook - insideBIGDATA","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/insidebigdata.com\/2023\/06\/23\/book-review-the-kaggle-book-workbook\/","og_locale":"en_US","og_type":"article","og_title":"Book Review: The Kaggle Book\/Workbook - insideBIGDATA","og_description":"Kaggle is an incredible resource for all data scientists. I advise my Intro to Data Science students at UCLA to take advantage of Kaggle by first completing the venerable Titanic Getting Started Prediction Challenge, and then moving on to active challenges. Kaggle is a great way to gain valuable experience with data science and machine learning. Now, there are two excellent books to lead you through the Kaggle process. The Kaggle Book by Konrad Banachewicz and Luca Massaron published in 2022, and The Kaggle Workbook by the same authors published in 2023, both from UK-based Packt Publishing, are excellent learning resources.","og_url":"https:\/\/insidebigdata.com\/2023\/06\/23\/book-review-the-kaggle-book-workbook\/","og_site_name":"insideBIGDATA","article_publisher":"http:\/\/www.facebook.com\/insidebigdata","article_published_time":"2023-06-23T10:00:00+00:00","article_modified_time":"2023-07-06T19:53:18+00:00","og_image":[{"width":1100,"height":550,"url":"https:\/\/insidebigdata.com\/wp-content\/uploads\/2023\/06\/Machine_Learning_shutterstock_1110900704_special.jpg","type":"image\/jpeg"}],"author":"Daniel Gutierrez","twitter_card":"summary_large_image","twitter_creator":"@AMULETAnalytics","twitter_site":"@insideBigData","twitter_misc":{"Written by":"Daniel Gutierrez","Est. 
reading time":"5 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/insidebigdata.com\/2023\/06\/23\/book-review-the-kaggle-book-workbook\/","url":"https:\/\/insidebigdata.com\/2023\/06\/23\/book-review-the-kaggle-book-workbook\/","name":"Book Review: The Kaggle Book\/Workbook - insideBIGDATA","isPartOf":{"@id":"https:\/\/insidebigdata.com\/#website"},"datePublished":"2023-06-23T10:00:00+00:00","dateModified":"2023-07-06T19:53:18+00:00","author":{"@id":"https:\/\/insidebigdata.com\/#\/schema\/person\/2540da209c83a68f4f5922848f7376ed"},"breadcrumb":{"@id":"https:\/\/insidebigdata.com\/2023\/06\/23\/book-review-the-kaggle-book-workbook\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/insidebigdata.com\/2023\/06\/23\/book-review-the-kaggle-book-workbook\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/insidebigdata.com\/2023\/06\/23\/book-review-the-kaggle-book-workbook\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/insidebigdata.com\/"},{"@type":"ListItem","position":2,"name":"Book Review: The Kaggle Book\/Workbook"}]},{"@type":"WebSite","@id":"https:\/\/insidebigdata.com\/#website","url":"https:\/\/insidebigdata.com\/","name":"insideBIGDATA","description":"Your Source for AI, Data Science, Deep Learning &amp; Machine Learning Strategies","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/insidebigdata.com\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/insidebigdata.com\/#\/schema\/person\/2540da209c83a68f4f5922848f7376ed","name":"Daniel 
Gutierrez","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/insidebigdata.com\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/5780282e7e567e2a502233e948464542?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/5780282e7e567e2a502233e948464542?s=96&d=mm&r=g","caption":"Daniel Gutierrez"},"description":"Daniel D. Gutierrez is a Data Scientist with Los Angeles-based AMULET Analytics, a service division of AMULET Development Corp. He's been involved with data science and Big Data long before it came in vogue, so imagine his delight when the Harvard Business Review recently deemed \"data scientist\" as the sexiest profession for the 21st century. Previously, he taught computer science and database classes at UCLA Extension for over 15 years, and authored three computer industry books on database technology. He also served as technical editor, columnist and writer at a major computer industry monthly publication for 7 years. Follow his data science musings at @AMULETAnalytics.","sameAs":["http:\/\/www.insidebigdata.com","https:\/\/twitter.com\/@AMULETAnalytics"],"url":"https:\/\/insidebigdata.com\/author\/dangutierrez\/"}]}},"jetpack_featured_media_url":"https:\/\/insidebigdata.com\/wp-content\/uploads\/2023\/06\/Machine_Learning_shutterstock_1110900704_special.jpg","jetpack_shortlink":"https:\/\/wp.me\/p9eA3j-8u9","jetpack-related-posts":[{"id":12583,"url":"https:\/\/insidebigdata.com\/2015\/01\/10\/data-science-101-lessons-kaggle-competitions\/","url_meta":{"origin":32621,"position":0},"title":"Data Science 101: Lessons Learned from Kaggle Competitions","date":"January 10, 2015","format":false,"excerpt":"In the video presentation below, \"Machine learning best practices we've learned from hundreds of competitions,\" Ben Hamner, Chief Scientist at Kaggle, discusses some very intriguing insights into how find success in data science projects.","rel":"","context":"In &quot;Data 
Science&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":2187,"url":"https:\/\/insidebigdata.com\/2012\/12\/16\/kaggle-startup-lets-you-host-data-science-competitions\/","url_meta":{"origin":32621,"position":1},"title":"Kaggle Startup Lets You Host Data Science Competitions","date":"December 16, 2012","format":false,"excerpt":"\u00a0 Could your Startup use some help from Data Scientists? Kaggle is a new site for hosting and competing in data science competitions. Kaggle is an innovative solution for statistical\/analytics outsourcing. We are the leading platform for predictive modeling competitions. Companies, governments and researchers present datasets and problems - the\u2026","rel":"","context":"In &quot;Machine Learning&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":6467,"url":"https:\/\/insidebigdata.com\/2013\/12\/22\/skills-top-10-kaggle-competitors\/","url_meta":{"origin":32621,"position":2},"title":"The Superstars of Kaggle Competitions","date":"December 22, 2013","format":false,"excerpt":"Kaggle is the de facto standard for data science competitions. 
The organization has made a huge splash in this space by providing a platform for pushing the limits of machine learning technology by assisting all types of enterprises in getting more value from their data assets.","rel":"","context":"In &quot;Machine Learning&quot;","img":{"alt_text":"","src":"https:\/\/i0.wp.com\/insidebigdata.com\/wp-content\/uploads\/2013\/12\/kaggle_monster.png?resize=350%2C200","width":350,"height":200},"classes":[]},{"id":29717,"url":"https:\/\/insidebigdata.com\/2022\/07\/04\/video-highlights-the-rise-of-deberta-for-nlp-downstream-tasks\/","url_meta":{"origin":32621,"position":3},"title":"Video Highlights: The Rise of DeBERTa for NLP Downstream Tasks","date":"July 4, 2022","format":false,"excerpt":"In episode seven of the NVIDIA Grandmaster Series, you\u2019ll learn from four members of the Kaggle Grandmasters of NVIDIA (KGMON) team. Watch this video to learn how they used natural language processing to analyze argumentative writing elements from students and identified key phrases in patient notes from medical licensing exams.","rel":"","context":"In &quot;AI Deep Learning&quot;","img":{"alt_text":"","src":"https:\/\/i0.wp.com\/insidebigdata.com\/wp-content\/uploads\/2019\/10\/NLP_shutterstock_299138114.jpg?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]},{"id":24945,"url":"https:\/\/insidebigdata.com\/2020\/09\/05\/video-highlights-bigquery-notebooks-building-an-analytics-pipeline-on-kaggle\/","url_meta":{"origin":32621,"position":4},"title":"Video Highlights: BigQuery + Notebooks: Building an Analytics Pipeline on Kaggle","date":"September 5, 2020","format":false,"excerpt":"Your architecture choices impact how efficiently you\u2019re able to use your data. In this \"Snapshots\" video produced by Kaggle, Data Scientist Wendy Kan demonstrates how she incorporates BigQuery and Kaggle Notebooks into her workflow. 
Watch her create an interactive network analysis graph that explores the most commonly installed Python packages!","rel":"","context":"In &quot;Big Data&quot;","img":{"alt_text":"","src":"https:\/\/i0.wp.com\/img.youtube.com\/vi\/Zm79aqARusw\/0.jpg?resize=350%2C200","width":350,"height":200},"classes":[]},{"id":6053,"url":"https:\/\/insidebigdata.com\/2013\/11\/25\/pointers-kaggle-success\/","url_meta":{"origin":32621,"position":5},"title":"Some Pointers for Kaggle Success","date":"November 25, 2013","format":false,"excerpt":"Kaggle competitions using machine learning techniques have become the fascination of data scientists worldwide. In the video below, Jeremy Howard presents to the Melbourne R meetup group, where he gave a brief overview of his Data Scientist's Toolbox (using a few Kaggle competitions as practical examples).","rel":"","context":"In &quot;Education \/ Training&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"_links":{"self":[{"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/posts\/32621"}],"collection":[{"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/users\/37"}],"replies":[{"embeddable":true,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/comments?post=32621"}],"version-history":[{"count":0,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/posts\/32621\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/media\/32634"}],"wp:attachment":[{"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/media?parent=32621"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/categories?post=32621"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/tags?post=32621"}],"curies":[{"name":"wp","hre
f":"https:\/\/api.w.org\/{rel}","templated":true}]}}