{"id":26882,"date":"2021-08-16T06:00:00","date_gmt":"2021-08-16T13:00:00","guid":{"rendered":"https:\/\/insidebigdata.com\/?p=26882"},"modified":"2021-08-17T08:53:20","modified_gmt":"2021-08-17T15:53:20","slug":"best-of-arxiv-org-for-ai-machine-learning-and-deep-learning-july-2021","status":"publish","type":"post","link":"https:\/\/insidebigdata.com\/2021\/08\/16\/best-of-arxiv-org-for-ai-machine-learning-and-deep-learning-july-2021\/","title":{"rendered":"Best of arXiv.org for AI, Machine Learning, and Deep Learning \u2013 July 2021"},"content":{"rendered":"\n<div class=\"wp-block-image is-style-default\"><figure class=\"alignright size-large is-resized\"><img decoding=\"async\" loading=\"lazy\" src=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2013\/12\/arxiv.jpg\" alt=\"\" class=\"wp-image-6361\" width=\"236\" height=\"199\" srcset=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2013\/12\/arxiv.jpg 450w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2013\/12\/arxiv-150x127.jpg 150w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2013\/12\/arxiv-300x253.jpg 300w\" sizes=\"(max-width: 236px) 100vw, 236px\" \/><\/figure><\/div>\n\n\n\n<p>In this recurring monthly feature, we filter recent research papers appearing on the <a rel=\"noreferrer noopener\" href=\"https:\/\/arxiv.org\/\" target=\"_blank\">arXiv.org<\/a> preprint server for compelling subjects relating to AI, machine learning and deep learning \u2013 from disciplines including statistics, mathematics and computer science \u2013 and provide you with a useful \u201cbest of\u201d list for the past month. Researchers from all over the world contribute to this repository as a prelude to the peer review process for publication in traditional journals. arXiv contains a veritable treasure trove of statistical learning methods you may use one day in the solution of data science problems. The articles listed below represent a small fraction of all articles appearing on the preprint server. 
They are listed in no particular order, with a link to each paper and a brief overview. Links to GitHub repos are provided when available. Especially relevant articles are marked with a \u201cthumbs up\u201d icon. Bear in mind that these are academic research papers, typically geared toward graduate students, postdocs, and seasoned professionals. They generally contain a high degree of mathematics, so be prepared. Enjoy!<\/p>\n\n\n\n<p><a href=\"https:\/\/arxiv.org\/pdf\/2107.13171v1.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">Learning with Multiclass AUC: Theory and Algorithms<\/a><\/p>\n\n\n\n<p>The Area under the ROC curve (AUC) is a well-known ranking metric for problems such as imbalanced learning and recommender systems. The vast majority of existing AUC-optimization-based machine learning methods focus only on the binary case, leaving multiclass cases unconsidered. This paper makes an early attempt at the problem of learning multiclass scoring functions by optimizing multiclass AUC metrics. The foundation is the M metric, a well-known multiclass extension of AUC. The paper revisits this metric, showing that it can eliminate the imbalance issue arising from minority class pairs. Motivated by this, an empirical surrogate risk minimization framework is proposed to approximately optimize the M metric. Theoretically, it is shown that: (i) optimizing most of the popular differentiable surrogate losses suffices to reach the Bayes optimal scoring function asymptotically; (ii) the training framework enjoys an imbalance-aware generalization error bound, which pays more attention to the bottleneck samples of minority classes compared with the traditional O(\u221a(1\/N)) result. 
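For concreteness, the M metric underlying this framework can be sketched in a few lines of numpy. This is a hypothetical illustration in the spirit of the Hand-and-Till multiclass AUC, not the paper's code; it assumes classes are labeled 0..K-1 and that `scores` is a matrix with one column of scores per class.

```python
import numpy as np
from itertools import combinations

def pairwise_auc(scores, labels, i, j):
    """One-vs-one AUC: how well the class-i score column ranks true class-i
    samples above true class-j samples (ties count as half)."""
    s_i = scores[labels == i, i]          # class-i scores of class-i samples
    s_j = scores[labels == j, i]          # class-i scores of class-j samples
    diff = s_i[:, None] - s_j[None, :]    # all (i, j) sample pairs
    return (diff > 0).mean() + 0.5 * (diff == 0).mean()

def m_metric(scores, labels):
    """M metric: average the two one-vs-one AUCs over every unordered
    class pair, so each minority-class pair contributes equally."""
    classes = np.unique(labels)
    aucs = [0.5 * (pairwise_auc(scores, labels, i, j) +
                   pairwise_auc(scores, labels, j, i))
            for i, j in combinations(classes, 2)]
    return float(np.mean(aucs))
```

Because the metric averages over class pairs rather than pooling all samples, a small minority class carries the same weight as a large majority class, which is the imbalance-resistance property the paper builds on.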
Practically, to address the poor scalability of the underlying computations, acceleration methods are proposed for three popular surrogate loss functions, including the exponential loss, squared loss, and hinge loss, to speed up loss and gradient evaluations. Finally, experimental results on 11 real-world datasets demonstrate the effectiveness of the proposed framework.<\/p>\n\n\n\n<div class=\"wp-block-image is-style-default\"><figure class=\"aligncenter size-large\"><img decoding=\"async\" loading=\"lazy\" width=\"700\" height=\"337\" src=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/08\/arXiv_2021_07_7.png\" alt=\"\" class=\"wp-image-26899\" srcset=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/08\/arXiv_2021_07_7.png 700w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/08\/arXiv_2021_07_7-150x72.png 150w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/08\/arXiv_2021_07_7-300x144.png 300w\" sizes=\"(max-width: 700px) 100vw, 700px\" \/><\/figure><\/div>\n\n\n\n<p><a href=\"https:\/\/arxiv.org\/pdf\/2107.03374v2.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">Evaluating Large Language Models Trained on Code<\/a><\/p>\n\n\n\n<p>This paper introduces Codex, a GPT language model fine-tuned on publicly available code from GitHub, and studies its Python code-writing capabilities. A distinct production version of Codex powers <a href=\"https:\/\/copilot.github.com\/\" target=\"_blank\" rel=\"noreferrer noopener\">GitHub Copilot<\/a>. On HumanEval, a new evaluation set released to measure functional correctness for synthesizing programs from docstrings, the model solves 28.8% of the problems, while GPT-3 solves 0% and GPT-J solves 11.4%. Furthermore, repeated sampling from the model is a surprisingly effective strategy for producing working solutions to difficult prompts. Using this method, 70.2% of the problems are solved with 100 samples per problem. 
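The repeated-sampling numbers above are reported with the unbiased pass@k estimator introduced in the paper: draw n samples per problem, count the c that pass the unit tests, and estimate the probability that at least one of k randomly chosen samples passes. A minimal sketch:

```python
import math

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem: 1 - C(n-c, k) / C(n, k),
    i.e. one minus the probability that k draws (without replacement)
    from n samples all come from the n-c failing ones."""
    if n - c < k:
        return 1.0  # too few failures to fill k draws: some draw must pass
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)
```

The benchmark-level score is the mean of this quantity over all problems; computing the combinatorial ratio directly (rather than estimating 1-(1-c/n)^k) avoids the bias of the naive plug-in estimate.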
Careful investigation of the model reveals its limitations, including difficulty with docstrings describing long chains of operations and with binding operations to variables. Finally, the potential broader impacts of deploying powerful code generation technologies, covering safety, security, and economics, are discussed.<\/p>\n\n\n\n<div class=\"wp-block-image is-style-default\"><figure class=\"aligncenter size-large\"><img decoding=\"async\" loading=\"lazy\" width=\"473\" height=\"534\" src=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/08\/arxiv_2021_07_1.png\" alt=\"\" class=\"wp-image-26884\" srcset=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/08\/arxiv_2021_07_1.png 473w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/08\/arxiv_2021_07_1-133x150.png 133w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/08\/arxiv_2021_07_1-266x300.png 266w\" sizes=\"(max-width: 473px) 100vw, 473px\" \/><\/figure><\/div>\n\n\n\n<p><a href=\"https:\/\/arxiv.org\/pdf\/2107.06825v2.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">A Generalized Lottery Ticket Hypothesis<\/a><\/p>\n\n\n\n<p>This paper introduces a generalization of the lottery ticket hypothesis in which the notion of &#8220;sparsity&#8221; is relaxed by choosing an arbitrary basis in the space of parameters. Evidence is presented that the original results reported for the canonical basis continue to hold in this broader setting. Structured pruning methods, including pruning units or factorizing fully-connected layers into products of low-rank matrices, are shown to be particular instances of this &#8220;generalized&#8221; lottery ticket hypothesis. 
<\/p>\n\n\n\n<div class=\"wp-block-image is-style-default\"><figure class=\"aligncenter size-large\"><img decoding=\"async\" loading=\"lazy\" width=\"692\" height=\"377\" src=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/08\/arXiv_2021_07_2.png\" alt=\"\" class=\"wp-image-26886\" srcset=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/08\/arXiv_2021_07_2.png 692w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/08\/arXiv_2021_07_2-150x82.png 150w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/08\/arXiv_2021_07_2-300x163.png 300w\" sizes=\"(max-width: 692px) 100vw, 692px\" \/><\/figure><\/div>\n\n\n\n<p><a href=\"https:\/\/arxiv.org\/pdf\/2107.08430v2.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">YOLOX: Exceeding YOLO Series in 2021<\/a><\/p>\n\n\n\n<p>This paper presents some experienced improvements to YOLO series, forming a new high-performance detector &#8212; YOLOX. The YOLO detector is switched to an anchor-free manner and conduct other advanced detection techniques, i.e., a decoupled head and the leading label assignment strategy SimOTA to achieve state-of-the-art results across a large scale range of models: For YOLO-Nano with only 0.91M parameters and 1.08G FLOPs, 25.3% AP on COCO is found, surpassing NanoDet by 1.8% AP; for YOLOv3, one of the most widely used detectors in industry, and is boosted to 47.3% AP on COCO, outperforming the current best practice by 3.0% AP; for YOLOX-L with roughly the same amount of parameters as YOLOv4-CSP, YOLOv5-L, 50.0% AP on COCO is achieved at a speed of 68.9 FPS on Tesla V100, exceeding YOLOv5-L by 1.8% AP. The GitHub repo associated with this paper can be found <a href=\"https:\/\/github.com\/Megvii-BaseDetection\/YOLOX\" target=\"_blank\" rel=\"noreferrer noopener\">HERE<\/a>. 
<\/p>\n\n\n\n<div class=\"wp-block-image is-style-default\"><figure class=\"aligncenter size-large\"><img decoding=\"async\" loading=\"lazy\" width=\"700\" height=\"300\" src=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/08\/arXiv_2021_07_3.png\" alt=\"\" class=\"wp-image-26890\" srcset=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/08\/arXiv_2021_07_3.png 700w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/08\/arXiv_2021_07_3-150x64.png 150w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/08\/arXiv_2021_07_3-300x129.png 300w\" sizes=\"(max-width: 700px) 100vw, 700px\" \/><\/figure><\/div>\n\n\n\n<p><a href=\"https:\/\/arxiv.org\/pdf\/2107.00420v6.pdf\">CBNetV2: A Composite Backbone Network Architecture for Object Detection<\/a><\/p>\n\n\n\n<p>Modern top-performing object detectors depend heavily on backbone networks, whose advances bring consistent performance gains through exploring more effective network structures. This paper proposes a novel and flexible backbone framework, namely CBNetV2, to construct high-performance detectors using existing open-sourced pre-trained backbones under the pre-training fine-tuning paradigm. In particular, CBNetV2 architecture groups multiple identical backbones, which are connected through composite connections. Specifically, it integrates the high- and low-level features of multiple backbone networks and gradually expands the receptive field to more efficiently perform object detection. Also proposed is a better training strategy with assistant supervision for CBNet-based detectors. Without additional pre-training of the composite backbone, CBNetV2 can be adapted to various backbones (CNN-based vs. Transformer-based) and head designs of most mainstream detectors (one-stage vs. two-stage, anchor-based vs. anchor-free-based). 
The GitHub repo associated with this paper can be found <a href=\"https:\/\/github.com\/VDIGPKU\/CBNetV2\" target=\"_blank\" rel=\"noreferrer noopener\">HERE<\/a>.<\/p>\n\n\n\n<div class=\"wp-block-image is-style-default\"><figure class=\"aligncenter size-large\"><img decoding=\"async\" loading=\"lazy\" width=\"700\" height=\"260\" src=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/08\/arXiv_2021_07_4.png\" alt=\"\" class=\"wp-image-26892\" srcset=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/08\/arXiv_2021_07_4.png 700w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/08\/arXiv_2021_07_4-150x56.png 150w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/08\/arXiv_2021_07_4-300x111.png 300w\" sizes=\"(max-width: 700px) 100vw, 700px\" \/><\/figure><\/div>\n\n\n\n<p><a href=\"https:\/\/arxiv.org\/pdf\/2107.00645v1.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">Global Filter Networks for Image Classification<\/a><\/p>\n\n\n\n<p>Recent advances in self-attention and pure multi-layer perceptron (MLP) models for vision have shown great potential in achieving promising performance with fewer inductive biases. These models are generally based on learning interactions among spatial locations from raw data. The complexity of self-attention and MLPs grows quadratically as the image size increases, which makes these models hard to scale up when high-resolution features are required. This paper presents the Global Filter Network (GFNet), a conceptually simple yet computationally efficient architecture that learns long-term spatial dependencies in the frequency domain with log-linear complexity. The architecture replaces the self-attention layer in vision transformers with three key operations: a 2D discrete Fourier transform, an element-wise multiplication between frequency-domain features and learnable global filters, and a 2D inverse Fourier transform. 
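The three operations just listed can be sketched in a few lines of numpy. This is a simplified, hypothetical illustration, not the paper's implementation (which uses a real-valued FFT and learns the filter as a network parameter):

```python
import numpy as np

def global_filter(x, filt):
    """One GFNet-style global filter step on a feature map x of shape
    (H, W, C): 2D FFT over the spatial axes, element-wise multiplication
    by a (complex) global filter of the same shape, inverse 2D FFT."""
    X = np.fft.fft2(x, axes=(0, 1))          # 1) to the frequency domain
    Y = X * filt                             # 2) global filtering
    return np.fft.ifft2(Y, axes=(0, 1)).real # 3) back to the spatial domain

# An all-ones filter is the identity; zeroing high-frequency entries of
# `filt` instead would act as a learned low-pass filter over the image.
```

Because the multiplication is element-wise in the frequency domain, every output location mixes information from all spatial locations, yet the cost is dominated by the FFTs, i.e. O(HW log HW) rather than the quadratic cost of self-attention.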
Favorable accuracy\/complexity trade-offs of the models on both ImageNet and downstream tasks are exhibited. The results demonstrate that GFNet can be a very competitive alternative to transformer-style models and CNNs in efficiency, generalization ability and robustness. The GitHub repo associated with this paper can be found <a href=\"https:\/\/github.com\/raoyongming\/GFNet\" target=\"_blank\" rel=\"noreferrer noopener\">HERE<\/a>. <\/p>\n\n\n\n<div class=\"wp-block-image is-style-default\"><figure class=\"aligncenter size-large\"><img decoding=\"async\" loading=\"lazy\" width=\"700\" height=\"485\" src=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/08\/arXiv_2021_07_5.png\" alt=\"\" class=\"wp-image-26894\" srcset=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/08\/arXiv_2021_07_5.png 700w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/08\/arXiv_2021_07_5-150x104.png 150w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/08\/arXiv_2021_07_5-300x208.png 300w\" sizes=\"(max-width: 700px) 100vw, 700px\" \/><\/figure><\/div>\n\n\n\n<p><a href=\"https:\/\/arxiv.org\/pdf\/2107.14795v2.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">Perceiver IO: A General Architecture for Structured Inputs &amp; Outputs<\/a><\/p>\n\n\n\n<p>The recently-proposed Perceiver model obtains good results on several domains (images, audio, multimodal, point clouds) while scaling linearly in compute and memory with the input size. While the Perceiver supports many kinds of inputs, it can only produce very simple outputs such as class scores. Perceiver IO overcomes this limitation without sacrificing the original&#8217;s appealing properties by learning to flexibly query the model&#8217;s latent space to produce outputs of arbitrary size and semantics. Perceiver IO still decouples model depth from data size and still scales linearly with data size, but now with respect to both input and output sizes. 
The full Perceiver IO model achieves strong results on tasks with highly structured output spaces, such as natural language and visual understanding, StarCraft II, and multi-task and multi-modal domains. As highlights, Perceiver IO matches a Transformer-based BERT baseline on the GLUE language benchmark without the need for input tokenization and achieves state-of-the-art performance on Sintel optical flow estimation. The GitHub repo associated with this paper can be found <a href=\"https:\/\/github.com\/deepmind\/deepmind-research\/tree\/master\/perceiver\" target=\"_blank\" rel=\"noreferrer noopener\">HERE<\/a>. <\/p>\n\n\n\n<div class=\"wp-block-image is-style-default\"><figure class=\"aligncenter size-large\"><img decoding=\"async\" loading=\"lazy\" width=\"693\" height=\"372\" src=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/08\/arXiv_2021_07_6.png\" alt=\"\" class=\"wp-image-26896\" srcset=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/08\/arXiv_2021_07_6.png 693w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/08\/arXiv_2021_07_6-150x81.png 150w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/08\/arXiv_2021_07_6-300x161.png 300w\" sizes=\"(max-width: 693px) 100vw, 693px\" \/><\/figure><\/div>\n\n\n\n<p><a href=\"https:\/\/arxiv.org\/pdf\/2107.14000v1.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">Resisting Out-of-Distribution Data Problem in Perturbation of XAI<\/a><\/p>\n\n\n\n<p>With the rapid development of eXplainable Artificial Intelligence (XAI), perturbation-based XAI algorithms have become quite popular due to their effectiveness and ease of implementation. The vast majority of perturbation-based XAI techniques face the challenge of Out-of-Distribution (OoD) data \u2013 an artifact of randomly perturbed data becoming inconsistent with the original dataset. OoD data leads to the over-confidence problem in model predictions, making the existing XAI approaches unreliable. 
The OoD data problem in perturbation-based XAI algorithms has not been adequately addressed in the literature. This paper tackles the problem by designing an additional module that quantifies the affinity between the perturbed data and the original dataset distribution, which is integrated into the aggregation process. The solution is shown to be compatible with the most popular perturbation-based XAI algorithms, such as RISE, OCCLUSION, and LIME. Experiments confirm that the proposed methods deliver significant improvements in general cases on both computational and cognitive metrics. <\/p>\n\n\n\n<div class=\"wp-block-image is-style-default\"><figure class=\"aligncenter size-large\"><img decoding=\"async\" loading=\"lazy\" width=\"478\" height=\"298\" src=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/08\/arXiv_2021_07_08.png\" alt=\"\" class=\"wp-image-26901\" srcset=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/08\/arXiv_2021_07_08.png 478w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/08\/arXiv_2021_07_08-150x94.png 150w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2021\/08\/arXiv_2021_07_08-300x187.png 300w\" sizes=\"(max-width: 478px) 100vw, 478px\" \/><\/figure><\/div>\n\n\n\n<p><a href=\"https:\/\/arxiv.org\/pdf\/2107.11059v1.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">LocalGLMnet: interpretable deep learning for tabular data<\/a><\/p>\n\n\n\n<p>Deep learning models have gained great popularity in statistical modeling because they lead to very competitive regression models, often outperforming classical statistical models such as generalized linear models. The disadvantage of deep learning models is that their solutions are difficult to interpret and explain, and variable selection is not easily possible because deep learning models handle feature engineering and variable selection internally in a nontransparent way. 
Inspired by the appealing structure of generalized linear models, this paper proposes a new network architecture that shares similar features with generalized linear models but provides superior predictive power, benefiting from the art of representation learning. The new architecture allows for variable selection on tabular data and for interpretation of the calibrated deep learning model; in fact, the approach provides an additive decomposition in the spirit of Shapley values and integrated gradients.<\/p>\n\n\n\n<p><em>Sign up for the free insideBIGDATA&nbsp;<a rel=\"noreferrer noopener\" href=\"http:\/\/insidebigdata.com\/newsletter\/\" target=\"_blank\">newsletter<\/a>.<\/em><\/p>\n\n\n\n<p><em>Join us on Twitter:&nbsp;@InsideBigData1 \u2013 <a href=\"https:\/\/twitter.com\/InsideBigData1\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/twitter.com\/InsideBigData1<\/a><\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this recurring monthly feature, we will filter all the recent research papers appearing in the arXiv.org preprint server for subjects relating to AI, machine learning and deep learning \u2013 from disciplines including statistics, mathematics and computer science \u2013 and provide you with a useful \u201cbest of\u201d list for the month.<\/p>\n","protected":false},"author":37,"featured_media":6361,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"footnotes":""},"categories":[526,115,182,87,180,67,56,77,84,1],"tags":[437,741,133,264,277,933,96],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v20.6 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Best of arXiv.org for AI, Machine Learning, and Deep Learning \u2013 July 2021 - insideBIGDATA<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" 
href=\"https:\/\/insidebigdata.com\/2021\/08\/16\/best-of-arxiv-org-for-ai-machine-learning-and-deep-learning-july-2021\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Best of arXiv.org for AI, Machine Learning, and Deep Learning \u2013 July 2021 - insideBIGDATA\" \/>\n<meta property=\"og:description\" content=\"In this recurring monthly feature, we will filter all the recent research papers appearing in the arXiv.org preprint server for subjects relating to AI, machine learning and deep learning \u2013 from disciplines including statistics, mathematics and computer science \u2013 and provide you with a useful \u201cbest of\u201d list for the month.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/insidebigdata.com\/2021\/08\/16\/best-of-arxiv-org-for-ai-machine-learning-and-deep-learning-july-2021\/\" \/>\n<meta property=\"og:site_name\" content=\"insideBIGDATA\" \/>\n<meta property=\"article:publisher\" content=\"http:\/\/www.facebook.com\/insidebigdata\" \/>\n<meta property=\"article:published_time\" content=\"2021-08-16T13:00:00+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2021-08-17T15:53:20+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2013\/12\/arxiv.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"450\" \/>\n\t<meta property=\"og:image:height\" content=\"380\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Daniel Gutierrez\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@AMULETAnalytics\" \/>\n<meta name=\"twitter:site\" content=\"@insideBigData\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Daniel Gutierrez\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"9 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/insidebigdata.com\/2021\/08\/16\/best-of-arxiv-org-for-ai-machine-learning-and-deep-learning-july-2021\/\",\"url\":\"https:\/\/insidebigdata.com\/2021\/08\/16\/best-of-arxiv-org-for-ai-machine-learning-and-deep-learning-july-2021\/\",\"name\":\"Best of arXiv.org for AI, Machine Learning, and Deep Learning \u2013 July 2021 - insideBIGDATA\",\"isPartOf\":{\"@id\":\"https:\/\/insidebigdata.com\/#website\"},\"datePublished\":\"2021-08-16T13:00:00+00:00\",\"dateModified\":\"2021-08-17T15:53:20+00:00\",\"author\":{\"@id\":\"https:\/\/insidebigdata.com\/#\/schema\/person\/2540da209c83a68f4f5922848f7376ed\"},\"breadcrumb\":{\"@id\":\"https:\/\/insidebigdata.com\/2021\/08\/16\/best-of-arxiv-org-for-ai-machine-learning-and-deep-learning-july-2021\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/insidebigdata.com\/2021\/08\/16\/best-of-arxiv-org-for-ai-machine-learning-and-deep-learning-july-2021\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/insidebigdata.com\/2021\/08\/16\/best-of-arxiv-org-for-ai-machine-learning-and-deep-learning-july-2021\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/insidebigdata.com\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Best of arXiv.org for AI, Machine Learning, and Deep Learning \u2013 July 2021\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/insidebigdata.com\/#website\",\"url\":\"https:\/\/insidebigdata.com\/\",\"name\":\"insideBIGDATA\",\"description\":\"Your Source for AI, Data Science, Deep Learning &amp; Machine Learning 
Strategies\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/insidebigdata.com\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\/\/insidebigdata.com\/#\/schema\/person\/2540da209c83a68f4f5922848f7376ed\",\"name\":\"Daniel Gutierrez\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/insidebigdata.com\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/5780282e7e567e2a502233e948464542?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/5780282e7e567e2a502233e948464542?s=96&d=mm&r=g\",\"caption\":\"Daniel Gutierrez\"},\"description\":\"Daniel D. Gutierrez is a Data Scientist with Los Angeles-based AMULET Analytics, a service division of AMULET Development Corp. He's been involved with data science and Big Data long before it came in vogue, so imagine his delight when the Harvard Business Review recently deemed \\\"data scientist\\\" as the sexiest profession for the 21st century. Previously, he taught computer science and database classes at UCLA Extension for over 15 years, and authored three computer industry books on database technology. He also served as technical editor, columnist and writer at a major computer industry monthly publication for 7 years. Follow his data science musings at @AMULETAnalytics.\",\"sameAs\":[\"http:\/\/www.insidebigdata.com\",\"https:\/\/twitter.com\/@AMULETAnalytics\"],\"url\":\"https:\/\/insidebigdata.com\/author\/dangutierrez\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Best of arXiv.org for AI, Machine Learning, and Deep Learning \u2013 July 2021 - insideBIGDATA","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/insidebigdata.com\/2021\/08\/16\/best-of-arxiv-org-for-ai-machine-learning-and-deep-learning-july-2021\/","og_locale":"en_US","og_type":"article","og_title":"Best of arXiv.org for AI, Machine Learning, and Deep Learning \u2013 July 2021 - insideBIGDATA","og_description":"In this recurring monthly feature, we will filter all the recent research papers appearing in the arXiv.org preprint server for subjects relating to AI, machine learning and deep learning \u2013 from disciplines including statistics, mathematics and computer science \u2013 and provide you with a useful \u201cbest of\u201d list for the month.","og_url":"https:\/\/insidebigdata.com\/2021\/08\/16\/best-of-arxiv-org-for-ai-machine-learning-and-deep-learning-july-2021\/","og_site_name":"insideBIGDATA","article_publisher":"http:\/\/www.facebook.com\/insidebigdata","article_published_time":"2021-08-16T13:00:00+00:00","article_modified_time":"2021-08-17T15:53:20+00:00","og_image":[{"width":450,"height":380,"url":"https:\/\/insidebigdata.com\/wp-content\/uploads\/2013\/12\/arxiv.jpg","type":"image\/jpeg"}],"author":"Daniel Gutierrez","twitter_card":"summary_large_image","twitter_creator":"@AMULETAnalytics","twitter_site":"@insideBigData","twitter_misc":{"Written by":"Daniel Gutierrez","Est. 
reading time":"9 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/insidebigdata.com\/2021\/08\/16\/best-of-arxiv-org-for-ai-machine-learning-and-deep-learning-july-2021\/","url":"https:\/\/insidebigdata.com\/2021\/08\/16\/best-of-arxiv-org-for-ai-machine-learning-and-deep-learning-july-2021\/","name":"Best of arXiv.org for AI, Machine Learning, and Deep Learning \u2013 July 2021 - insideBIGDATA","isPartOf":{"@id":"https:\/\/insidebigdata.com\/#website"},"datePublished":"2021-08-16T13:00:00+00:00","dateModified":"2021-08-17T15:53:20+00:00","author":{"@id":"https:\/\/insidebigdata.com\/#\/schema\/person\/2540da209c83a68f4f5922848f7376ed"},"breadcrumb":{"@id":"https:\/\/insidebigdata.com\/2021\/08\/16\/best-of-arxiv-org-for-ai-machine-learning-and-deep-learning-july-2021\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/insidebigdata.com\/2021\/08\/16\/best-of-arxiv-org-for-ai-machine-learning-and-deep-learning-july-2021\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/insidebigdata.com\/2021\/08\/16\/best-of-arxiv-org-for-ai-machine-learning-and-deep-learning-july-2021\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/insidebigdata.com\/"},{"@type":"ListItem","position":2,"name":"Best of arXiv.org for AI, Machine Learning, and Deep Learning \u2013 July 2021"}]},{"@type":"WebSite","@id":"https:\/\/insidebigdata.com\/#website","url":"https:\/\/insidebigdata.com\/","name":"insideBIGDATA","description":"Your Source for AI, Data Science, Deep Learning &amp; Machine Learning Strategies","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/insidebigdata.com\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/insidebigdata.com\/#\/schema\/person\/2540da209c83a68f4f5922848f7376ed","name":"Daniel 
Gutierrez","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/insidebigdata.com\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/5780282e7e567e2a502233e948464542?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/5780282e7e567e2a502233e948464542?s=96&d=mm&r=g","caption":"Daniel Gutierrez"},"description":"Daniel D. Gutierrez is a Data Scientist with Los Angeles-based AMULET Analytics, a service division of AMULET Development Corp. He's been involved with data science and Big Data long before it came in vogue, so imagine his delight when the Harvard Business Review recently deemed \"data scientist\" as the sexiest profession for the 21st century. Previously, he taught computer science and database classes at UCLA Extension for over 15 years, and authored three computer industry books on database technology. He also served as technical editor, columnist and writer at a major computer industry monthly publication for 7 years. Follow his data science musings at @AMULETAnalytics.","sameAs":["http:\/\/www.insidebigdata.com","https:\/\/twitter.com\/@AMULETAnalytics"],"url":"https:\/\/insidebigdata.com\/author\/dangutierrez\/"}]}},"jetpack_featured_media_url":"https:\/\/insidebigdata.com\/wp-content\/uploads\/2013\/12\/arxiv.jpg","jetpack_shortlink":"https:\/\/wp.me\/p9eA3j-6ZA","jetpack-related-posts":[{"id":21994,"url":"https:\/\/insidebigdata.com\/2019\/01\/16\/best-of-arxiv-org-for-ai-machine-learning-and-deep-learning-december-2018\/","url_meta":{"origin":26882,"position":0},"title":"Best of arXiv.org for AI, Machine Learning, and Deep Learning \u2013 December 2018","date":"January 16, 2019","format":false,"excerpt":"In this recurring monthly feature, we will filter all the recent research papers appearing in the arXiv.org preprint server for subjects relating to AI, machine learning and deep learning \u2013 from disciplines including statistics, mathematics and computer science \u2013 and provide you with a 