{"id":26685,"date":"2021-07-19T06:00:00","date_gmt":"2021-07-19T13:00:00","guid":{"rendered":"https:\/\/insidebigdata.com\/?p=26685"},"modified":"2021-07-20T08:49:53","modified_gmt":"2021-07-20T15:49:53","slug":"best-of-arxiv-org-for-ai-machine-learning-and-deep-learning-june-2021","status":"publish","type":"post","link":"https:\/\/insidebigdata.com\/2021\/07\/19\/best-of-arxiv-org-for-ai-machine-learning-and-deep-learning-june-2021\/","title":{"rendered":"Best of arXiv.org for AI, Machine Learning, and Deep Learning \u2013 June 2021"},"content":{"rendered":"\n<div class=\"wp-block-image is-style-default\"><figure class=\"alignright size-large is-resized\"><img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2013\/12\/arxiv.jpg\" alt=\"\" class=\"wp-image-6361\" width=\"231\" height=\"195\" srcset=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2013\/12\/arxiv.jpg 450w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2013\/12\/arxiv-150x127.jpg 150w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2013\/12\/arxiv-300x253.jpg 300w\" sizes=\"(max-width: 231px) 100vw, 231px\" \/><\/figure><\/div>\n\n\n\n<p>In this recurring monthly feature, we filter recent research papers appearing on the <a rel=\"noreferrer noopener\" href=\"https:\/\/arxiv.org\/\" target=\"_blank\">arXiv.org<\/a> preprint server for compelling subjects relating to AI, machine learning and deep learning \u2013 from disciplines including statistics, mathematics and computer science \u2013 and provide you with a useful \u201cbest of\u201d list for the past month. Researchers from all over the world contribute to this repository as a prelude to the peer review process for publication in traditional journals. arXiv contains a veritable treasure trove of statistical learning methods you may use one day in the solution of data science problems. The articles listed below represent a small fraction of all articles appearing on the preprint server. 
They are listed in no particular order with a link to each paper along with a brief overview. Links to GitHub repos are provided when available. Especially relevant articles are marked with a \u201cthumbs up\u201d icon. Consider that these are academic research papers, typically geared toward graduate students, post docs, and seasoned professionals. They generally contain a high degree of mathematics so be prepared. Enjoy!<\/p>\n\n\n\n<p><a href=\"https:\/\/arxiv.org\/pdf\/2106.11959v1.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">Revisiting Deep Learning Models for Tabular Data<\/a><\/p>\n\n\n\n<div class=\"wp-block-image is-style-default\"><figure class=\"alignleft size-large\"><img decoding=\"async\" loading=\"lazy\" width=\"100\" height=\"100\" src=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2019\/04\/ThumbsUP_shutterstock_325452782.jpg\" alt=\"\" class=\"wp-image-22440\" srcset=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2019\/04\/ThumbsUP_shutterstock_325452782.jpg 100w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2019\/04\/ThumbsUP_shutterstock_325452782-50x50.jpg 50w\" sizes=\"(max-width: 100px) 100vw, 100px\" \/><\/figure><\/div>\n\n\n\n<p>The necessity of deep learning for tabular data is still an unanswered question addressed by a large number of research efforts. The recent literature on tabular DL proposes several deep architectures reported to be superior to traditional &#8220;shallow&#8221; models like Gradient Boosted Decision Trees. However, since existing works often use different benchmarks and tuning protocols, it is unclear if the proposed models universally outperform GBDT. Moreover, the models are often not compared to each other, therefore, it is challenging to identify the best deep model for practitioners. This paper starts with a thorough review of the main families of DL models recently developed for tabular data. 
The authors carefully tune and evaluate them on a wide range of datasets and reveal two significant findings. First, it is shown that the choice between GBDT and DL models highly depends on data and there is still no universally superior solution. Second, it&#8217;s demonstrated that a simple ResNet-like architecture is a surprisingly effective baseline, which outperforms most of the sophisticated models from the DL literature. Finally, the authors design a simple adaptation of the Transformer architecture for tabular data that becomes a new strong DL baseline and reduces the gap between GBDT and DL models on datasets where GBDT dominates. The GitHub repo associated with this paper can be found <a href=\"https:\/\/github.com\/yandex-research\/rtdl\" target=\"_blank\" rel=\"noreferrer noopener\">HERE<\/a>. <\/p>\n\n\n\n<div class=\"wp-block-image is-style-default\"><figure class=\"aligncenter size-large\"><img decoding=\"async\" loading=\"lazy\" width=\"700\" height=\"387\" src=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/07\/arXiv_2021_06_1.png\" alt=\"\" class=\"wp-image-26701\" srcset=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/07\/arXiv_2021_06_1.png 700w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/07\/arXiv_2021_06_1-150x83.png 150w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/07\/arXiv_2021_06_1-300x166.png 300w\" sizes=\"(max-width: 700px) 100vw, 700px\" \/><\/figure><\/div>\n\n\n\n<p><a href=\"https:\/\/arxiv.org\/pdf\/2106.01342v1.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">SAINT: Improved Neural Networks for Tabular Data via Row Attention and Contrastive Pre-Training<\/a><\/p>\n\n\n\n<div class=\"wp-block-image is-style-default\"><figure class=\"alignleft size-large\"><img decoding=\"async\" loading=\"lazy\" width=\"100\" height=\"100\" src=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2019\/04\/ThumbsUP_shutterstock_325452782.jpg\" alt=\"\" 
class=\"wp-image-22440\" srcset=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2019\/04\/ThumbsUP_shutterstock_325452782.jpg 100w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2019\/04\/ThumbsUP_shutterstock_325452782-50x50.jpg 50w\" sizes=\"(max-width: 100px) 100vw, 100px\" \/><\/figure><\/div>\n\n\n\n<p>Tabular data underpins numerous high-impact applications of machine learning from fraud detection to genomics and healthcare. Classical approaches to solving tabular problems, such as gradient boosting and random forests, are widely used by practitioners. However, recent deep learning methods have achieved a degree of performance competitive with popular techniques. This paper devises a hybrid deep learning approach to solving tabular data problems. The proposed method, SAINT, performs attention over both rows and columns, and it includes an enhanced embedding method. The paper also studies a new contrastive self-supervised pre-training method for use when labels are scarce. SAINT consistently improves performance over previous deep learning methods, and it even outperforms gradient boosting methods, including XGBoost, CatBoost, and LightGBM, on average over a variety of benchmark tasks. The GitHub repo associated with this paper can be found <a href=\"https:\/\/github.com\/somepago\/saint\" target=\"_blank\" rel=\"noreferrer noopener\">HERE<\/a>. 
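As a rough illustration of the two attention directions described above, the sketch below applies single-head self-attention first across a row's features (column attention) and then across the rows of a batch (intersample, or row, attention). This is a minimal numpy sketch under simplifying assumptions: no learned projections, no multi-head splitting, and the function name `saint_style_block` is ours for illustration, not the paper's API.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    # x: (tokens, dim); single-head scaled dot-product attention
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)
    return softmax(scores, axis=-1) @ x

def saint_style_block(batch):
    # batch: (n_rows, n_features, dim) of embedded tabular cells
    n, f, d = batch.shape
    # column attention: each row attends across its own features
    col = np.stack([self_attention(row) for row in batch])
    # row (intersample) attention: flatten each row to one token,
    # attend across the rows of the batch, then reshape back
    flat = col.reshape(n, f * d)
    return self_attention(flat).reshape(n, f, d)

out = saint_style_block(np.random.rand(8, 5, 4))
```

Attending over rows lets one sample borrow information from similar samples in the batch, which is the intuition behind SAINT's intersample attention.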
<\/p>\n\n\n\n<div class=\"wp-block-image is-style-default\"><figure class=\"aligncenter size-large\"><img decoding=\"async\" loading=\"lazy\" width=\"700\" height=\"535\" src=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/07\/arXiv_2021_06_2.png\" alt=\"\" class=\"wp-image-26702\" srcset=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/07\/arXiv_2021_06_2.png 700w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/07\/arXiv_2021_06_2-150x115.png 150w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/07\/arXiv_2021_06_2-300x229.png 300w\" sizes=\"(max-width: 700px) 100vw, 700px\" \/><\/figure><\/div>\n\n\n\n<p><a href=\"https:\/\/arxiv.org\/pdf\/2106.06561v1.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">GANs N&#8217; Roses: Stable, Controllable, Diverse Image to Image Translation (works for videos too!)<\/a><\/p>\n\n\n\n<p>The research detailed in this paper shows how to learn a map that takes a content code, derived from a face image, and a randomly chosen style code to an anime image. An adversarial loss is derived from the authors&#8217; simple and effective definitions of style and content. This adversarial loss guarantees the map is diverse &#8212; a very wide range of anime can be produced from a single content code. Under plausible assumptions, the map is not just diverse, but also correctly represents the probability of an anime, conditioned on an input face. In contrast, current multimodal generation procedures cannot capture the complex styles that appear in anime. Extensive quantitative experiments support the idea that the map is correct. Extensive qualitative results show that the method can generate a much more diverse range of styles than SOTA comparisons. Finally, the paper shows that the formalization of content and style makes it possible to perform video to video translation without ever training on videos. 
The GitHub repo associated with this paper can be found <a href=\"https:\/\/github.com\/mchong6\/GANsNRoses\" target=\"_blank\" rel=\"noreferrer noopener\">HERE<\/a>. <\/p>\n\n\n\n<div class=\"wp-block-image is-style-default\"><figure class=\"aligncenter size-large\"><img decoding=\"async\" loading=\"lazy\" width=\"449\" height=\"539\" src=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/07\/arXiv_2021_06_3.png\" alt=\"\" class=\"wp-image-26703\" srcset=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/07\/arXiv_2021_06_3.png 449w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/07\/arXiv_2021_06_3-125x150.png 125w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/07\/arXiv_2021_06_3-250x300.png 250w\" sizes=\"(max-width: 449px) 100vw, 449px\" \/><\/figure><\/div>\n\n\n\n<p><a href=\"https:\/\/arxiv.org\/pdf\/2106.01345v2.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">Decision Transformer: Reinforcement Learning via Sequence Modeling<\/a><\/p>\n\n\n\n<p>This paper introduces a framework that abstracts Reinforcement Learning (RL) as a sequence modeling problem. This makes it possible to draw upon the simplicity and scalability of the Transformer architecture, and associated advances in language modeling such as GPT-x and BERT. In particular, the paper presents Decision Transformer, an architecture that casts the problem of RL as conditional sequence modeling. Unlike prior approaches to RL that fit value functions or compute policy gradients, Decision Transformer simply outputs the optimal actions by leveraging a causally masked Transformer. By conditioning an autoregressive model on the desired return (reward), past states, and actions, the Decision Transformer model can generate future actions that achieve the desired return. Despite its simplicity, Decision Transformer matches or exceeds the performance of state-of-the-art model-free offline RL baselines on Atari, OpenAI Gym, and Key-to-Door tasks. 
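The conditioning scheme described above can be sketched concretely: each timestep contributes a (return-to-go, state, action) triple, where the return-to-go is the sum of future rewards from that step onward. The snippet below is an illustrative sketch of that sequence construction only (the tuple tokens and function names are ours), not the Transformer model itself.

```python
import numpy as np

def returns_to_go(rewards):
    # R_t = sum of rewards from step t to the end of the trajectory
    return np.cumsum(rewards[::-1])[::-1]

def interleave_tokens(rtg, states, actions):
    # Decision Transformer feeds (return-to-go, state, action) triples,
    # one per timestep, to a causally masked Transformer; at test time
    # the desired return is supplied and actions are generated.
    seq = []
    for r, s, a in zip(rtg, states, actions):
        seq.extend([("rtg", float(r)), ("state", s), ("action", a)])
    return seq

rewards = [1.0, 0.0, 2.0]
rtg = returns_to_go(np.array(rewards))          # [3., 2., 2.]
seq = interleave_tokens(rtg, states=["s0", "s1", "s2"], actions=[0, 1, 0])
```

Conditioning on the return-to-go is what lets the model trade value estimation for plain sequence prediction.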
The GitHub repo associated with this paper can be found <a href=\"https:\/\/github.com\/kzl\/decision-transformer\" target=\"_blank\" rel=\"noreferrer noopener\">HERE<\/a>. <\/p>\n\n\n\n<figure class=\"wp-block-image size-large is-style-default\"><img decoding=\"async\" loading=\"lazy\" width=\"700\" height=\"421\" src=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/07\/arXiv_2021_06_4.png\" alt=\"\" class=\"wp-image-26704\" srcset=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/07\/arXiv_2021_06_4.png 700w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/07\/arXiv_2021_06_4-150x90.png 150w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/07\/arXiv_2021_06_4-300x180.png 300w\" sizes=\"(max-width: 700px) 100vw, 700px\" \/><\/figure>\n\n\n\n<p><a href=\"https:\/\/arxiv.org\/pdf\/2106.15561v2.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">A Survey on Neural Speech Synthesis<\/a><\/p>\n\n\n\n<p>Text to speech (TTS), or speech synthesis, which aims to synthesize intelligible and natural speech given text, is a hot research topic in the speech, language, and machine learning communities and has broad applications in industry. With the development of deep learning and artificial intelligence, neural network-based TTS has significantly improved the quality of synthesized speech in recent years. This paper conducts a comprehensive survey on neural TTS, aiming to provide a good understanding of current research and future trends. The focus is on the key components in neural TTS, including text analysis, acoustic models, and vocoders, and several advanced topics, including fast TTS, low-resource TTS, robust TTS, expressive TTS, and adaptive TTS. The paper further summarizes resources related to TTS (e.g., datasets, open-source implementations) and discusses future research directions. 
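The component structure the survey covers can be sketched as a three-stage dataflow. Every function below is a placeholder stub of ours, just to make the stage boundaries concrete; real systems plug in models such as Tacotron 2 or FastSpeech for the acoustic stage and WaveNet or HiFi-GAN as vocoders.

```python
# Stub pipeline mirroring the neural-TTS components the survey discusses:
# text analysis -> acoustic model -> vocoder.
def text_analysis(text):
    # normalize the input and split into pseudo-phoneme tokens
    return text.lower().split()

def acoustic_model(phonemes, frames_per_token=4, n_mels=3):
    # map linguistic tokens to a mel-spectrogram-like frame matrix
    return [[0.0] * n_mels for _ in range(len(phonemes) * frames_per_token)]

def vocoder(mel, hop=2):
    # expand spectrogram frames into waveform samples
    return [0.0] * (len(mel) * hop)

wave = vocoder(acoustic_model(text_analysis("Hello world")))
```

Fast TTS, low-resource TTS, and the other advanced topics in the survey largely concern how these two neural stages are trained and accelerated.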
<\/p>\n\n\n\n<div class=\"wp-block-image is-style-default\"><figure class=\"aligncenter size-large\"><img decoding=\"async\" loading=\"lazy\" width=\"606\" height=\"103\" src=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/07\/arXiv_2021_06_5.png\" alt=\"\" class=\"wp-image-26705\" srcset=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/07\/arXiv_2021_06_5.png 606w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/07\/arXiv_2021_06_5-150x25.png 150w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/07\/arXiv_2021_06_5-300x51.png 300w\" sizes=\"(max-width: 606px) 100vw, 606px\" \/><\/figure><\/div>\n\n\n\n<p><a href=\"https:\/\/arxiv.org\/pdf\/2106.15481v1.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">Interactive Dimensionality Reduction for Comparative Analysis<\/a><\/p>\n\n\n\n<p>Finding the similarities and differences between two or more groups of datasets is a fundamental analysis task. For high-dimensional data, dimensionality reduction (DR) methods are often used to find the characteristics of each group. However, existing DR methods provide limited capability and flexibility for such comparative analysis as each method is designed only for a narrow analysis target, such as identifying factors that most differentiate groups. This paper introduces an interactive DR framework where a new DR method, called ULCA (unified linear comparative analysis), is integrated with an interactive visual interface. ULCA unifies two DR schemes, discriminant analysis and contrastive learning, to support various comparative analysis tasks. To provide flexibility for comparative analysis, an optimization algorithm was developed that enables analysts to interactively refine ULCA results. Additionally, an interactive visualization interface is provided to examine ULCA results with a rich set of analysis libraries. 
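To make the comparative-analysis idea above concrete, the sketch below finds linear directions that emphasize variance present in a target group but not in a background group, the contrastive-PCA flavor of the analysis. ULCA itself unifies this with discriminant analysis in one interactively refinable objective; the function here is only our simplified illustration, not the paper's algorithm.

```python
import numpy as np

def comparative_directions(target, background, alpha=1.0, k=2):
    # Directions maximizing variance in `target` while suppressing
    # variance in `background`: top eigenvectors of C_t - alpha * C_b.
    Ct = np.cov(target, rowvar=False)
    Cb = np.cov(background, rowvar=False)
    vals, vecs = np.linalg.eigh(Ct - alpha * Cb)
    order = np.argsort(vals)[::-1]          # descending eigenvalues
    return vecs[:, order[:k]]               # (dim, k) projection matrix

rng = np.random.default_rng(0)
A = rng.normal(size=(100, 4))
A[:, 0] *= 5.0                              # extra variance only in dim 0
B = rng.normal(size=(100, 4))
W = comparative_directions(A, B, alpha=1.0, k=1)
```

Because dimension 0 carries variance unique to the target group, the leading comparative direction aligns with it; sweeping `alpha` interactively is what a ULCA-style interface exposes to the analyst.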
<\/p>\n\n\n\n<div class=\"wp-block-image is-style-default\"><figure class=\"aligncenter size-large\"><img decoding=\"async\" loading=\"lazy\" width=\"700\" height=\"399\" src=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/07\/arXiv_2021_06_6.png\" alt=\"\" class=\"wp-image-26706\" srcset=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/07\/arXiv_2021_06_6.png 700w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/07\/arXiv_2021_06_6-150x86.png 150w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/07\/arXiv_2021_06_6-300x171.png 300w\" sizes=\"(max-width: 700px) 100vw, 700px\" \/><\/figure><\/div>\n\n\n\n<p><a href=\"https:\/\/arxiv.org\/pdf\/2106.13200v1.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">Software for Dataset-wide XAI: From Local Explanations to Global Insights with Zennit, CoRelAy, and ViRelAy<\/a><\/p>\n\n\n\n<p>Deep Neural Networks (DNNs) are known to be strong predictors, but their prediction strategies can rarely be understood. With recent advances in Explainable AI, approaches are available to explore the reasoning behind those complex models&#8217; predictions. One class of approaches is post-hoc attribution methods, among which Layer-wise Relevance Propagation (LRP) shows high performance. However, the attempt at understanding a DNN&#8217;s reasoning often stops at the attributions obtained for individual samples in input space, leaving the potential for deeper quantitative analyses untouched. 
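Zennit implements LRP and related rules in PyTorch; as a library-agnostic illustration of what LRP computes, the epsilon rule for a single linear layer can be sketched in a few lines of numpy. The function name and shapes below are ours, not Zennit's API.

```python
import numpy as np

def lrp_linear(a, w, relevance_out, eps=1e-6):
    # Epsilon-rule LRP for one linear layer: redistribute the relevance
    # of each output unit k back to inputs j in proportion to a_j * w_jk.
    z = a @ w                                    # pre-activations z_k
    s = relevance_out / (z + eps * np.sign(z))   # stabilized ratio
    return a * (w @ s)                           # relevance per input

a = np.array([1.0, 2.0, 0.5])           # input activations
w = np.array([[0.5, -1.0],
              [1.0,  0.5],
              [2.0,  1.0]])             # weights: 3 inputs, 2 outputs
R = lrp_linear(a, w, relevance_out=np.array([1.0, 0.0]))
```

A key property visible here is conservation: up to the epsilon stabilizer, the input relevances sum to the output relevance, so attribution mass is neither created nor destroyed as it propagates backward through the network.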
As a manual analysis without the right tools is often unnecessarily labor intensive, this paper introduces three software packages targeted at scientists to explore model reasoning using attribution approaches and beyond: (1) Zennit &#8211; a highly customizable and intuitive attribution framework implementing LRP and related approaches in PyTorch, (2) CoRelAy &#8211; a framework to easily and quickly construct quantitative analysis pipelines for dataset-wide analyses of explanations, and (3) ViRelAy &#8211; a web-application to interactively explore data, attributions, and analysis results.<\/p>\n\n\n\n<div class=\"wp-block-image is-style-default\"><figure class=\"aligncenter size-large\"><img decoding=\"async\" loading=\"lazy\" width=\"700\" height=\"496\" src=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/07\/arXiv_2021_06_7.png\" alt=\"\" class=\"wp-image-26707\" srcset=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/07\/arXiv_2021_06_7.png 700w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/07\/arXiv_2021_06_7-150x106.png 150w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/07\/arXiv_2021_06_7-300x213.png 300w\" sizes=\"(max-width: 700px) 100vw, 700px\" \/><\/figure><\/div>\n\n\n\n<p><a href=\"https:\/\/arxiv.org\/pdf\/2106.12790v1.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">Towards Automatic Speech to Sign Language Generation<\/a><\/p>\n\n\n\n<p>The research discussed in this paper aims to solve the highly challenging task of generating continuous sign language videos solely from speech segments for the first time. Recent efforts in this space have focused on generating such videos from human-annotated text transcripts without considering other modalities. However, replacing speech with sign language proves to be a practical solution while communicating with people suffering from hearing loss. 
Therefore, the research eliminates the need for text as input and designs techniques that work for more natural, continuous, freely uttered speech covering an extensive vocabulary. Since the current datasets are inadequate for generating sign language directly from speech, the paper discusses the collection and release of the first Indian sign language dataset comprising speech-level annotations, text transcripts, and the corresponding sign-language videos. Also proposed is a multi-tasking transformer network trained to generate a signer&#8217;s poses from speech segments. With speech-to-text as an auxiliary task and an additional cross-modal discriminator, the model learns to generate continuous sign pose sequences in an end-to-end manner. Extensive experiments and comparisons with other baselines demonstrate the effectiveness of the approach. <\/p>\n\n\n\n<div class=\"wp-block-image is-style-default\"><figure class=\"aligncenter size-large\"><img decoding=\"async\" loading=\"lazy\" width=\"416\" height=\"397\" src=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/07\/arXiv_2021_06_8.png\" alt=\"\" class=\"wp-image-26708\" srcset=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/07\/arXiv_2021_06_8.png 416w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/07\/arXiv_2021_06_8-150x143.png 150w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/07\/arXiv_2021_06_8-300x286.png 300w\" sizes=\"(max-width: 416px) 100vw, 416px\" \/><\/figure><\/div>\n\n\n\n<p><a href=\"https:\/\/arxiv.org\/pdf\/2106.12605v1.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">Deep Fake Detection: Survey of Facial Manipulation Detection Solutions<\/a><\/p>\n\n\n\n<p>Deep Learning as a field has been successfully used to solve a plethora of complex problems, the likes of which we could not have imagined a few decades back. But for all the benefits it brings, there are still ways in which it can be used to bring harm to our society. 
Deep fakes have proven to be one such problem, and now more than ever, when any individual can create a fake image or video simply by using an application on a smartphone, we need countermeasures to detect whether an image or video is fake or real and to counter this threat to the trustworthiness of online information. Although deep fakes created by neural networks may seem as real as a genuine image or video, they still leave behind spatial and temporal traces or signatures after manipulation; these signatures, while invisible to the human eye, can be detected by a neural network trained to specialize in deep fake detection. This paper analyzes several such state-of-the-art neural networks (MesoNet, ResNet-50, VGG-19, and Xception Net) and compares them against each other to find an optimal solution for various scenarios, such as real-time deep fake detection deployed on online social media platforms, where classification should be made as fast as possible, or a small news agency, where classification need not be in real-time but requires utmost accuracy.<\/p>\n\n\n\n<div class=\"wp-block-image is-style-default\"><figure class=\"aligncenter size-large\"><img decoding=\"async\" loading=\"lazy\" width=\"553\" height=\"422\" src=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/07\/arXiv_2021_06_9.png\" alt=\"\" class=\"wp-image-26709\" srcset=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/07\/arXiv_2021_06_9.png 553w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/07\/arXiv_2021_06_9-150x114.png 150w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/07\/arXiv_2021_06_9-300x229.png 300w\" sizes=\"(max-width: 553px) 100vw, 553px\" \/><\/figure><\/div>\n\n\n\n<p><a href=\"https:\/\/arxiv.org\/pdf\/2106.13097v1.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">Understanding the Spread of COVID-19 Epidemic: A Spatio-Temporal Point Process 
View<\/a><\/p>\n\n\n\n<p>Since the first coronavirus case was identified in the U.S. on Jan. 21, more than 1 million people in the U.S. have had confirmed cases of COVID-19. This infectious respiratory disease has spread rapidly across more than 3000 counties and 50 states in the U.S. and has exhibited evolutionary clustering and complex triggering patterns. It is essential to understand the complex, intertwined spatio-temporal propagation of this disease so that accurate prediction or smart external intervention can be carried out. This paper models the propagation of COVID-19 as a spatio-temporal point process and proposes a generative and intensity-free model to track the spread of the disease. Further adopted is a generative adversarial imitation learning framework to learn the model parameters. In comparison with traditional likelihood-based learning methods, this imitation learning framework does not need to prespecify an intensity function, which alleviates model misspecification. Moreover, the adversarial learning procedure bypasses the difficult-to-evaluate integral involved in the likelihood evaluation, which makes the model inference more scalable with the data and variables. The authors showcase the dynamic learning performance on the COVID-19 confirmed cases in the U.S. 
and evaluate the social distancing policy based on the learned generative model.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large is-style-default\"><img decoding=\"async\" loading=\"lazy\" width=\"700\" height=\"243\" src=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/07\/arXiv_2021_06_10.png\" alt=\"\" class=\"wp-image-26710\" srcset=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/07\/arXiv_2021_06_10.png 700w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/07\/arXiv_2021_06_10-150x52.png 150w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/07\/arXiv_2021_06_10-300x104.png 300w\" sizes=\"(max-width: 700px) 100vw, 700px\" \/><\/figure>\n\n\n\n<p><em>Sign up for the free insideBIGDATA&nbsp;<a rel=\"noreferrer noopener\" href=\"http:\/\/insidebigdata.com\/newsletter\/\" target=\"_blank\">newsletter<\/a>.<\/em><\/p>\n\n\n\n<p><em>Join us on Twitter:&nbsp;@InsideBigData1 \u2013 <a href=\"https:\/\/twitter.com\/InsideBigData1\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/twitter.com\/InsideBigData1<\/a><\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this recurring monthly feature, we will filter all the recent research papers appearing in the arXiv.org preprint server for subjects relating to AI, machine learning and deep learning \u2013 from disciplines including statistics, mathematics and computer science \u2013 and provide you with a useful \u201cbest of\u201d list for the month.<\/p>\n","protected":false},"author":37,"featured_media":6361,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"footnotes":""},"categories":[526,65,115,182,87,180,67,56,77,1],"tags":[437,741,280,133,264,277,96],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v20.6 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Best of arXiv.org for AI, Machine Learning, and Deep Learning \u2013 June 
2021 - insideBIGDATA<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/insidebigdata.com\/2021\/07\/19\/best-of-arxiv-org-for-ai-machine-learning-and-deep-learning-june-2021\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Best of arXiv.org for AI, Machine Learning, and Deep Learning \u2013 June 2021 - insideBIGDATA\" \/>\n<meta property=\"og:description\" content=\"In this recurring monthly feature, we will filter all the recent research papers appearing in the arXiv.org preprint server for subjects relating to AI, machine learning and deep learning \u2013 from disciplines including statistics, mathematics and computer science \u2013 and provide you with a useful \u201cbest of\u201d list for the month.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/insidebigdata.com\/2021\/07\/19\/best-of-arxiv-org-for-ai-machine-learning-and-deep-learning-june-2021\/\" \/>\n<meta property=\"og:site_name\" content=\"insideBIGDATA\" \/>\n<meta property=\"article:publisher\" content=\"http:\/\/www.facebook.com\/insidebigdata\" \/>\n<meta property=\"article:published_time\" content=\"2021-07-19T13:00:00+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2021-07-20T15:49:53+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2013\/12\/arxiv.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"450\" \/>\n\t<meta property=\"og:image:height\" content=\"380\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Daniel Gutierrez\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@AMULETAnalytics\" \/>\n<meta name=\"twitter:site\" content=\"@insideBigData\" \/>\n<meta name=\"twitter:label1\" 
content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Daniel Gutierrez\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"12 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/insidebigdata.com\/2021\/07\/19\/best-of-arxiv-org-for-ai-machine-learning-and-deep-learning-june-2021\/\",\"url\":\"https:\/\/insidebigdata.com\/2021\/07\/19\/best-of-arxiv-org-for-ai-machine-learning-and-deep-learning-june-2021\/\",\"name\":\"Best of arXiv.org for AI, Machine Learning, and Deep Learning \u2013 June 2021 - insideBIGDATA\",\"isPartOf\":{\"@id\":\"https:\/\/insidebigdata.com\/#website\"},\"datePublished\":\"2021-07-19T13:00:00+00:00\",\"dateModified\":\"2021-07-20T15:49:53+00:00\",\"author\":{\"@id\":\"https:\/\/insidebigdata.com\/#\/schema\/person\/2540da209c83a68f4f5922848f7376ed\"},\"breadcrumb\":{\"@id\":\"https:\/\/insidebigdata.com\/2021\/07\/19\/best-of-arxiv-org-for-ai-machine-learning-and-deep-learning-june-2021\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/insidebigdata.com\/2021\/07\/19\/best-of-arxiv-org-for-ai-machine-learning-and-deep-learning-june-2021\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/insidebigdata.com\/2021\/07\/19\/best-of-arxiv-org-for-ai-machine-learning-and-deep-learning-june-2021\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/insidebigdata.com\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Best of arXiv.org for AI, Machine Learning, and Deep Learning \u2013 June 2021\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/insidebigdata.com\/#website\",\"url\":\"https:\/\/insidebigdata.com\/\",\"name\":\"insideBIGDATA\",\"description\":\"Your Source for AI, Data Science, Deep Learning &amp; Machine 
Learning Strategies\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/insidebigdata.com\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\/\/insidebigdata.com\/#\/schema\/person\/2540da209c83a68f4f5922848f7376ed\",\"name\":\"Daniel Gutierrez\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/insidebigdata.com\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/5780282e7e567e2a502233e948464542?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/5780282e7e567e2a502233e948464542?s=96&d=mm&r=g\",\"caption\":\"Daniel Gutierrez\"},\"description\":\"Daniel D. Gutierrez is a Data Scientist with Los Angeles-based AMULET Analytics, a service division of AMULET Development Corp. He's been involved with data science and Big Data long before it came in vogue, so imagine his delight when the Harvard Business Review recently deemed \\\"data scientist\\\" as the sexiest profession for the 21st century. Previously, he taught computer science and database classes at UCLA Extension for over 15 years, and authored three computer industry books on database technology. He also served as technical editor, columnist and writer at a major computer industry monthly publication for 7 years. Follow his data science musings at @AMULETAnalytics.\",\"sameAs\":[\"http:\/\/www.insidebigdata.com\",\"https:\/\/twitter.com\/@AMULETAnalytics\"],\"url\":\"https:\/\/insidebigdata.com\/author\/dangutierrez\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Best of arXiv.org for AI, Machine Learning, and Deep Learning \u2013 June 2021 - insideBIGDATA","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/insidebigdata.com\/2021\/07\/19\/best-of-arxiv-org-for-ai-machine-learning-and-deep-learning-june-2021\/","og_locale":"en_US","og_type":"article","og_title":"Best of arXiv.org for AI, Machine Learning, and Deep Learning \u2013 June 2021 - insideBIGDATA","og_description":"In this recurring monthly feature, we will filter all the recent research papers appearing in the arXiv.org preprint server for subjects relating to AI, machine learning and deep learning \u2013 from disciplines including statistics, mathematics and computer science \u2013 and provide you with a useful \u201cbest of\u201d list for the month.","og_url":"https:\/\/insidebigdata.com\/2021\/07\/19\/best-of-arxiv-org-for-ai-machine-learning-and-deep-learning-june-2021\/","og_site_name":"insideBIGDATA","article_publisher":"http:\/\/www.facebook.com\/insidebigdata","article_published_time":"2021-07-19T13:00:00+00:00","article_modified_time":"2021-07-20T15:49:53+00:00","og_image":[{"width":450,"height":380,"url":"https:\/\/insidebigdata.com\/wp-content\/uploads\/2013\/12\/arxiv.jpg","type":"image\/jpeg"}],"author":"Daniel Gutierrez","twitter_card":"summary_large_image","twitter_creator":"@AMULETAnalytics","twitter_site":"@insideBigData","twitter_misc":{"Written by":"Daniel Gutierrez","Est. 
reading time":"12 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/insidebigdata.com\/2021\/07\/19\/best-of-arxiv-org-for-ai-machine-learning-and-deep-learning-june-2021\/","url":"https:\/\/insidebigdata.com\/2021\/07\/19\/best-of-arxiv-org-for-ai-machine-learning-and-deep-learning-june-2021\/","name":"Best of arXiv.org for AI, Machine Learning, and Deep Learning \u2013 June 2021 - insideBIGDATA","isPartOf":{"@id":"https:\/\/insidebigdata.com\/#website"},"datePublished":"2021-07-19T13:00:00+00:00","dateModified":"2021-07-20T15:49:53+00:00","author":{"@id":"https:\/\/insidebigdata.com\/#\/schema\/person\/2540da209c83a68f4f5922848f7376ed"},"breadcrumb":{"@id":"https:\/\/insidebigdata.com\/2021\/07\/19\/best-of-arxiv-org-for-ai-machine-learning-and-deep-learning-june-2021\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/insidebigdata.com\/2021\/07\/19\/best-of-arxiv-org-for-ai-machine-learning-and-deep-learning-june-2021\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/insidebigdata.com\/2021\/07\/19\/best-of-arxiv-org-for-ai-machine-learning-and-deep-learning-june-2021\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/insidebigdata.com\/"},{"@type":"ListItem","position":2,"name":"Best of arXiv.org for AI, Machine Learning, and Deep Learning \u2013 June 2021"}]},{"@type":"WebSite","@id":"https:\/\/insidebigdata.com\/#website","url":"https:\/\/insidebigdata.com\/","name":"insideBIGDATA","description":"Your Source for AI, Data Science, Deep Learning &amp; Machine Learning Strategies","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/insidebigdata.com\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/insidebigdata.com\/#\/schema\/person\/2540da209c83a68f4f5922848f7376ed","name":"Daniel 
Gutierrez","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/insidebigdata.com\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/5780282e7e567e2a502233e948464542?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/5780282e7e567e2a502233e948464542?s=96&d=mm&r=g","caption":"Daniel Gutierrez"},"description":"Daniel D. Gutierrez is a Data Scientist with Los Angeles-based AMULET Analytics, a service division of AMULET Development Corp. He's been involved with data science and Big Data long before it came in vogue, so imagine his delight when the Harvard Business Review recently deemed \"data scientist\" as the sexiest profession for the 21st century. Previously, he taught computer science and database classes at UCLA Extension for over 15 years, and authored three computer industry books on database technology. He also served as technical editor, columnist and writer at a major computer industry monthly publication for 7 years. Follow his data science musings at @AMULETAnalytics.","sameAs":["http:\/\/www.insidebigdata.com","https:\/\/twitter.com\/@AMULETAnalytics"],"url":"https:\/\/insidebigdata.com\/author\/dangutierrez\/"}]}},"jetpack_featured_media_url":"https:\/\/insidebigdata.com\/wp-content\/uploads\/2013\/12\/arxiv.jpg","jetpack_shortlink":"https:\/\/wp.me\/p9eA3j-6Wp","jetpack-related-posts":[{"id":21994,"url":"https:\/\/insidebigdata.com\/2019\/01\/16\/best-of-arxiv-org-for-ai-machine-learning-and-deep-learning-december-2018\/","url_meta":{"origin":26685,"position":0},"title":"Best of arXiv.org for AI, Machine Learning, and Deep Learning \u2013 December 2018","date":"January 16, 2019","format":false,"excerpt":"In this recurring monthly feature, we will filter all the recent research papers appearing in the arXiv.org preprint server for subjects relating to AI, machine learning and deep learning \u2013 from disciplines including statistics, mathematics and computer science \u2013 and provide you with a 
useful \u201cbest of\u201d list for the\u2026","rel":"","context":"In &quot;AI Deep Learning&quot;","img":{"alt_text":"","src":"https:\/\/i0.wp.com\/insidebigdata.com\/wp-content\/uploads\/2013\/12\/arxiv.jpg?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]},{"id":22953,"url":"https:\/\/insidebigdata.com\/2019\/07\/18\/best-of-arxiv-org-for-ai-machine-learning-and-deep-learning-june-2019\/","url_meta":{"origin":26685,"position":1},"title":"Best of arXiv.org for AI, Machine Learning, and Deep Learning \u2013 June 2019","date":"July 18, 2019","format":false,"excerpt":"In this recurring monthly feature, we will filter all the recent research papers appearing in the arXiv.org preprint server for subjects relating to AI, machine learning and deep learning \u2013 from disciplines including statistics, mathematics and computer science \u2013 and provide you with a useful \u201cbest of\u201d list for the\u2026","rel":"","context":"In &quot;AI Deep Learning&quot;","img":{"alt_text":"","src":"https:\/\/i0.wp.com\/insidebigdata.com\/wp-content\/uploads\/2013\/12\/arxiv.jpg?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]},{"id":22263,"url":"https:\/\/insidebigdata.com\/2019\/03\/15\/best-of-arxiv-org-for-ai-machine-learning-and-deep-learning-february-2019\/","url_meta":{"origin":26685,"position":2},"title":"Best of arXiv.org for AI, Machine Learning, and Deep Learning \u2013 February 2019","date":"March 15, 2019","format":false,"excerpt":"In this recurring monthly feature, we will filter all the recent research papers appearing in the arXiv.org preprint server for subjects relating to AI, machine learning and deep learning \u2013 from disciplines including statistics, mathematics and computer science \u2013 and provide you with a useful \u201cbest of\u201d list for the\u2026","rel":"","context":"In &quot;AI Deep 
Learning&quot;","img":{"alt_text":"","src":"https:\/\/i0.wp.com\/insidebigdata.com\/wp-content\/uploads\/2013\/12\/arxiv.jpg?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]},{"id":21252,"url":"https:\/\/insidebigdata.com\/2018\/10\/17\/best-arxiv-org-ai-machine-learning-deep-learning-september-2018\/","url_meta":{"origin":26685,"position":3},"title":"Best of arXiv.org for AI, Machine Learning, and Deep Learning \u2013 September 2018","date":"October 17, 2018","format":false,"excerpt":"In this recurring monthly feature, we will filter all the recent research papers appearing in the arXiv.org preprint server for subjects relating to AI, machine learning and deep learning \u2013 from disciplines including statistics, mathematics and computer science \u2013 and provide you with a useful \u201cbest of\u201d list for the\u2026","rel":"","context":"In &quot;AI Deep Learning&quot;","img":{"alt_text":"","src":"https:\/\/i0.wp.com\/insidebigdata.com\/wp-content\/uploads\/2013\/12\/arxiv.jpg?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]},{"id":22431,"url":"https:\/\/insidebigdata.com\/2019\/04\/09\/best-of-arxiv-org-for-ai-machine-learning-and-deep-learning-march-2019\/","url_meta":{"origin":26685,"position":4},"title":"Best of arXiv.org for AI, Machine Learning, and Deep Learning \u2013 March 2019","date":"April 9, 2019","format":false,"excerpt":"In this recurring monthly feature, we will filter all the recent research papers appearing in the arXiv.org preprint server for subjects relating to AI, machine learning and deep learning \u2013 from disciplines including statistics, mathematics and computer science \u2013 and provide you with a useful \u201cbest of\u201d list for the\u2026","rel":"","context":"In &quot;AI Deep 
Learning&quot;","img":{"alt_text":"","src":"https:\/\/i0.wp.com\/insidebigdata.com\/wp-content\/uploads\/2013\/12\/arxiv.jpg?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]},{"id":20910,"url":"https:\/\/insidebigdata.com\/2018\/08\/13\/best-arxiv-org-ai-machine-learning-deep-learning-july-2018\/","url_meta":{"origin":26685,"position":5},"title":"Best of arXiv.org for AI, Machine Learning, and Deep Learning \u2013 July 2018","date":"August 13, 2018","format":false,"excerpt":"In this recurring monthly feature, we will filter all the recent research papers appearing in the arXiv.org preprint server for subjects relating to AI, machine learning and deep learning \u2013 from disciplines including statistics, mathematics and computer science \u2013 and provide you with a useful \u201cbest of\u201d list for the\u2026","rel":"","context":"In &quot;AI Deep Learning&quot;","img":{"alt_text":"","src":"https:\/\/i0.wp.com\/insidebigdata.com\/wp-content\/uploads\/2013\/12\/arxiv.jpg?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]}],"_links":{"self":[{"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/posts\/26685"}],"collection":[{"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/users\/37"}],"replies":[{"embeddable":true,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/comments?post=26685"}],"version-history":[{"count":0,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/posts\/26685\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/media\/6361"}],"wp:attachment":[{"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/media?parent=26685"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/categories?post=26685"},{"taxonomy":"post_tag","embeddable":true,"h
ref":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/tags?post=26685"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}