Video Highlights: Attention Is All You Need – Paper Explained

(insideBIGDATA, February 15, 2023)

In this video presentation, Mohammad Namvarpour presents a comprehensive study of "Attention Is All You Need" (https://papers.nips.cc/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf), the renowned paper by Ashish Vaswani and his coauthors. The paper marked a major turning point in deep learning research: the transformer architecture it introduced is now used in a variety of state-of-the-art models in natural language processing and beyond, and transformers are the basis of the large language models (LLMs) we're seeing today.

If you wish to get up to speed with transformers, here is an excellent learning resource complete with tutorials, videos, course notes, links to seminal papers, and much more. Visit the GitHub repo at https://github.com/dair-ai/Transformers-Recipe.
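The core idea covered in both videos is the paper's scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k))V. As a quick companion to the presentations, here is a minimal NumPy sketch of that formula; the function name and the toy shapes are our own illustration, not code from the paper or the videos.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Scaled dot-product attention from the paper:
    Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    """
    d_k = Q.shape[-1]
    # Similarity of each query to each key, scaled by sqrt(d_k)
    scores = Q @ K.T / np.sqrt(d_k)
    # Row-wise softmax (subtract the max for numerical stability)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output row is a weighted sum of the value vectors
    return weights @ V

# Toy example: 3 queries/keys/values of dimension 4
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 4))
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 4)
```

In a full transformer this runs per attention head on learned projections of the input, but the ten lines above are the mechanism the paper's title refers to.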
[Embedded video: "Attention Is All You Need" paper explained, by Mohammad Namvarpour – https://www.youtube.com/watch?v=XowwKOAWYoQ]

Here is another video presentation drilling down on the "Attention Is All You Need" paper, this one by Yannic Kilcher.

[Embedded video: https://www.youtube.com/watch?v=iDulhoQ2pro]

Sign up for the free insideBIGDATA newsletter: http://inside-bigdata.com/newsletter/

Join us on Twitter: https://twitter.com/InsideBigData1

Join us on LinkedIn: https://www.linkedin.com/company/insidebigdata/

Join us on Facebook: https://www.facebook.com/insideBIGDATANOW