{"id":25335,"date":"2020-12-08T06:00:00","date_gmt":"2020-12-08T14:00:00","guid":{"rendered":"https:\/\/insidebigdata.com\/?p=25335"},"modified":"2020-12-09T10:25:37","modified_gmt":"2020-12-09T18:25:37","slug":"have-a-goal-in-mind-gpt-3-pegasus-and-new-frameworks-for-text-summarization-in-healthcare-and-bfsi","status":"publish","type":"post","link":"https:\/\/insidebigdata.com\/2020\/12\/08\/have-a-goal-in-mind-gpt-3-pegasus-and-new-frameworks-for-text-summarization-in-healthcare-and-bfsi\/","title":{"rendered":"Have a Goal in Mind: GPT-3, PEGASUS, and New Frameworks for Text Summarization in Healthcare and BFSI"},"content":{"rendered":"\n<p>OpenAI\u2019s <a href=\"https:\/\/github.com\/openai\/gpt-3\" target=\"_blank\" rel=\"noreferrer noopener\">GPT-3<\/a> has been grabbing headlines almost as fast as the neural-network language model can generate them. Since its private beta release in July 2020, natural language processing (NLP) experts have been blown away by the sheer scale and complexity of the project.<\/p>\n\n\n\n<p>Yet flying under the radar is another approach to NLP that could overcome a significant bottleneck faced by GPT-3 and other large-scale generalized NLP projects. <a href=\"https:\/\/ai.googleblog.com\/2020\/06\/pegasus-state-of-art-model-for.html\" target=\"_blank\" rel=\"noreferrer noopener\">Google\u2019s PEGASUS model<\/a> not only shows remarkable promise when it comes to text summarization and synthesis, but its non-generalized approach could push industries such as healthcare to embrace NLP much earlier than was once supposed.<\/p>\n\n\n\n<p><strong>Generalized vs. Specific Pre-Training Objectives<\/strong><\/p>\n\n\n\n<p>GPT-3\u2019s decoder-only transformer model is essentially an autocomplete tool with billions of weighted connections, or parameters, between words that can predict the likelihood of one word following another. The power of OpenAI\u2019s latest solution is in its astounding size. 
The first GPT had only 117 million parameters, GPT-2 had 1.5 billion, and GPT-3 has 175 billion.<\/p>\n\n\n\n<p>This is an order of magnitude larger than its next closest competitor, and it\u2019s allowed GPT-3 to stay highly generalized while still being easily applied to specific tasks. This means that pre-training \u2013 the early modeling of the neural network against vast datasets \u2013 is done without a specific goal in mind, yet once generally trained, the model can learn a new task from only a handful of examples rather than the tens of thousands that other models require. Want it to translate a book into Korean? Feed GPT-3 a few words of English and the corresponding Korean words, and it will do the rest.<\/p>\n\n\n\n<p>Yet bigger might not always be better, and the OpenAI team concedes that its model might be running into the limits of generalized pre-training in a 75-page paper called <a href=\"https:\/\/arxiv.org\/abs\/2005.14165\" target=\"_blank\" rel=\"noreferrer noopener\">Language Models are Few-Shot Learners<\/a>. \u201cOur current objective weights every token equally and lacks a notion of what is most important to predict and what is less important,\u201d they explain, highlighting the importance of having specific pre-training objectives.<\/p>\n\n\n\n<p><strong>Abstractive Text Summarization and Synthesis<\/strong><\/p>\n\n\n\n<p>This means that a massive yet generalized approach to pre-training, while impressive and remarkably flexible, might not be the answer for many tasks. In fact, the OpenAI team mentions in the paper\u2019s limitations section that GPT-3 still has \u201cnotable weaknesses in text synthesis.\u201d<\/p>\n\n\n\n<p>A team at Google has created the PEGASUS model to fix weaknesses in text synthesis and abstractive text summarization \u2013 one of the most challenging tasks in NLP because, unlike extractive summarization, it doesn\u2019t merely highlight key passages but generates entirely new text. 
<a href=\"https:\/\/ai.googleblog.com\/2020\/06\/pegasus-state-of-art-model-for.html\" target=\"_blank\" rel=\"noreferrer noopener\">As they explain in their blog<\/a>, \u201cour hypothesis is that the closer the pre-training self-supervised objective is to the final down-stream task, the better the fine-tuning performance.\u201d<\/p>\n\n\n\n<p>PEGASUS aligns its self-supervised pre-training objective with the downstream task of text summarization through gap-sentence generation (GSG): whole sentences are removed from a document, and the model is trained to predict those missing sentences.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-large\"><img decoding=\"async\" loading=\"lazy\" width=\"635\" height=\"124\" src=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2020\/12\/Persistent_pic1.png\" alt=\"\" class=\"wp-image-25336\" srcset=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2020\/12\/Persistent_pic1.png 635w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2020\/12\/Persistent_pic1-150x29.png 150w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2020\/12\/Persistent_pic1-300x59.png 300w\" sizes=\"(max-width: 635px) 100vw, 635px\" \/><figcaption>Self-Supervised Pre-training in PEGASUS. From Google AI Blog<\/figcaption><\/figure><\/div>\n\n\n\n<p>By pre-training with GSG and then fine-tuning the model on text summarization datasets, PEGASUS was able to achieve state-of-the-art results. 
Impressively, it reached near human-like accuracy with under 1,000 fine-tuning examples.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-large\"><img decoding=\"async\" loading=\"lazy\" width=\"572\" height=\"586\" src=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2020\/12\/Persistent_pic2.png\" alt=\"\" class=\"wp-image-25337\" srcset=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2020\/12\/Persistent_pic2.png 572w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2020\/12\/Persistent_pic2-146x150.png 146w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2020\/12\/Persistent_pic2-293x300.png 293w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2020\/12\/Persistent_pic2-50x50.png 50w\" sizes=\"(max-width: 572px) 100vw, 572px\" \/><figcaption>The dotted line represents a baseline transformer encoder-decoder model that used the full supervised data, which in some cases had many orders of magnitude more examples than PEGASUS. From Google AI Blog<\/figcaption><\/figure><\/div>\n\n\n\n<p>To measure performance, three variants of the ROUGE score were used (higher is better). In some cases, PEGASUS outperformed the baseline with under 100 examples, approaching the efficiency of GPT-3 with <a href=\"https:\/\/arxiv.org\/pdf\/1912.08777.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">0.3% of its parameters<\/a>.&nbsp;<\/p>\n\n\n\n<p>Although the ROUGE score is useful, the validity of a summary must ultimately be tested by human experts. The Google team had people evaluate PEGASUS text summaries against human-written summaries without knowing which was which. The results were indistinguishable. PEGASUS was even able to pull out ideas that had only been implied in the original text, such as the specific number of ships in a particular passage, although it didn\u2019t always do this perfectly. 
This highlights that pre-training with specific objectives might be the future of abstractive text summarization.<\/p>\n\n\n\n<p><strong>Healthcare and BFSI Applications<\/strong><\/p>\n\n\n\n<p>With this new model for text summarization and others that embrace a non-generalized pre-training objective framework, several key use cases in healthcare and banking, financial services and insurance (BFSI) are growing in importance:<\/p>\n\n\n\n<ol type=\"1\"><li><strong>Streamlining Research:<\/strong> Healthcare and financial data is growing exponentially, and researchers and clinicians alike are already struggling to process and understand the vast amounts of information being produced daily. Text summarization could drastically reduce the time they spend poring over individual papers, reports, and datasets to find key highlights, while also identifying possible trends across multiple documents.<\/li><\/ol>\n\n\n\n<ol type=\"1\" start=\"2\"><li><strong>Q&amp;A Chatbots: <\/strong>One of the most important areas for efficiency and a positive patient\/customer experience is ensuring that questions are answered quickly. Text summarization tools could help customers resolve their most pressing banking issues and help patients triage their symptoms to find the help they need as quickly as possible, all while reducing the burden on financial and healthcare professionals.<\/li><\/ol>\n\n\n\n<ol type=\"1\" start=\"3\"><li><strong>Scripting and Summarizing Telemeetings: <\/strong>As telehealth balloons during the pandemic, more patient data than ever is digital. Similarly, more customers are making use of remote meeting technology to interact with their financial institutions. 
Paired with speech-to-text technology, text summarization could help healthcare and financial organizations transcribe these meetings, take notes, and then summarize that information in an easily digestible fashion.<\/li><\/ol>\n\n\n\n<ol type=\"1\" start=\"4\"><li><strong>Supporting Decisions: <\/strong>Text summarization could also combine the information from these transcribed notes with other electronic health records or financial data that is often unstructured and narrative in form. This could help physicians make decisions about drug dosing, remind them about patient-specific allergies, and alert them to presentations of specific diseases, while also helping financial professionals approve loans, evaluate stocks, and quickly derive market signals.<\/li><\/ol>\n\n\n\n<ol type=\"1\" start=\"5\"><li><strong>Creating Legal, Insurance, and Billing Efficiencies: <\/strong>Healthcare and BFSI come with significant legal requirements, and text summarization could streamline compliance through contract analysis that highlights and summarizes the riskier sections of long legal documents so that people can understand them easily. Additionally, the healthcare system is full of complicated medical codes for insurance and billing, and text summarization could quickly review and summarize physicians\u2019 notes to pull out the proper designations.<\/li><\/ol>\n\n\n\n<p><strong>Have a Goal in Mind<\/strong><\/p>\n\n\n\n<p>Regardless of the hype or potential limitations of GPT-3, it is an amazing tool that will only grow more sophisticated over time. We\u2019re living in a golden age of NLP, and text summarization is opening up efficiencies in healthcare and BFSI that couldn\u2019t have been imagined even five years ago. 
Healthcare and financial organizations should explore how they can apply text summarization, keeping in mind the potential power of models that are pre-trained with clear objectives aimed at downstream applications.<\/p>\n\n\n\n<p><strong>About the Author<\/strong><\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"alignleft size-large\"><img decoding=\"async\" loading=\"lazy\" width=\"150\" height=\"150\" src=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2020\/12\/dattaraj-rao-1.jpg\" alt=\"\" class=\"wp-image-25338\" srcset=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2020\/12\/dattaraj-rao-1.jpg 150w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2020\/12\/dattaraj-rao-1-110x110.jpg 110w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2020\/12\/dattaraj-rao-1-50x50.jpg 50w\" sizes=\"(max-width: 150px) 100vw, 150px\" \/><\/figure><\/div>\n\n\n\n<p><em>Dattaraj Rao, Innovation and R&amp;D Architect at&nbsp;<a href=\"https:\/\/www.persistent.com\/\" target=\"_blank\" rel=\"noreferrer noopener\">Persistent Systems<\/a>, is the author of the book \u201cKeras to Kubernetes: The Journey of a Machine Learning Model to Production.\u201d At Persistent Systems, Dattaraj leads the AI Research Lab, which explores state-of-the-art algorithms in Computer Vision, Natural Language Understanding, Probabilistic Programming, Reinforcement Learning, and Explainable AI, and demonstrates their applicability in the Healthcare, Banking and Industrial domains. Dattaraj has 11 patents in Machine Learning and Computer Vision.<\/em><\/p>\n\n\n\n<p><em>Sign up for the free insideBIGDATA&nbsp;<a rel=\"noreferrer noopener\" href=\"http:\/\/insidebigdata.com\/newsletter\/\" target=\"_blank\">newsletter<\/a>.<\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this contributed article, Dattaraj Rao, Innovation and R&#038;D Architect at Persistent Systems, discusses the rise in interest for neural network language models, specifically the recent Google PEGASUS model. 
This model not only shows remarkable promise when it comes to text summarization and synthesis, but its non-generalized approach could push industries such as healthcare to embrace NLP much earlier than was once supposed.<\/p>\n","protected":false},"author":10513,"featured_media":23389,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"footnotes":""},"categories":[526,115,87,180,56,97,1],"tags":[437,264,947,948,949,95],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v20.6 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Have a Goal in Mind: GPT-3, PEGASUS, and New Frameworks for Text Summarization in Healthcare and BFSI - insideBIGDATA<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/insidebigdata.com\/2020\/12\/08\/have-a-goal-in-mind-gpt-3-pegasus-and-new-frameworks-for-text-summarization-in-healthcare-and-bfsi\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Have a Goal in Mind: GPT-3, PEGASUS, and New Frameworks for Text Summarization in Healthcare and BFSI - insideBIGDATA\" \/>\n<meta property=\"og:description\" content=\"In this contributed article, Dattaraj Rao, Innovation and R&amp;D Architect at Persistent Systems, discusses the rise in interest for neural network language models, specifically the recent Google PEGASUS model. 
This model not only shows remarkable promise when it comes to text summarization and synthesis, but its non-generalized approach could push industries such as healthcare to embrace NLP much earlier than was once supposed.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/insidebigdata.com\/2020\/12\/08\/have-a-goal-in-mind-gpt-3-pegasus-and-new-frameworks-for-text-summarization-in-healthcare-and-bfsi\/\" \/>\n<meta property=\"og:site_name\" content=\"insideBIGDATA\" \/>\n<meta property=\"article:publisher\" content=\"http:\/\/www.facebook.com\/insidebigdata\" \/>\n<meta property=\"article:published_time\" content=\"2020-12-08T14:00:00+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2020-12-09T18:25:37+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2019\/10\/NLP_shutterstock_299138114.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"300\" \/>\n\t<meta property=\"og:image:height\" content=\"200\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Editorial Team\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@insideBigData\" \/>\n<meta name=\"twitter:site\" content=\"@insideBigData\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Editorial Team\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"6 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/insidebigdata.com\/2020\/12\/08\/have-a-goal-in-mind-gpt-3-pegasus-and-new-frameworks-for-text-summarization-in-healthcare-and-bfsi\/\",\"url\":\"https:\/\/insidebigdata.com\/2020\/12\/08\/have-a-goal-in-mind-gpt-3-pegasus-and-new-frameworks-for-text-summarization-in-healthcare-and-bfsi\/\",\"name\":\"Have a Goal in Mind: GPT-3, PEGASUS, and New Frameworks for Text Summarization in Healthcare and BFSI - insideBIGDATA\",\"isPartOf\":{\"@id\":\"https:\/\/insidebigdata.com\/#website\"},\"datePublished\":\"2020-12-08T14:00:00+00:00\",\"dateModified\":\"2020-12-09T18:25:37+00:00\",\"author\":{\"@id\":\"https:\/\/insidebigdata.com\/#\/schema\/person\/2949e412c144601cdbcc803bd234e1b9\"},\"breadcrumb\":{\"@id\":\"https:\/\/insidebigdata.com\/2020\/12\/08\/have-a-goal-in-mind-gpt-3-pegasus-and-new-frameworks-for-text-summarization-in-healthcare-and-bfsi\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/insidebigdata.com\/2020\/12\/08\/have-a-goal-in-mind-gpt-3-pegasus-and-new-frameworks-for-text-summarization-in-healthcare-and-bfsi\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/insidebigdata.com\/2020\/12\/08\/have-a-goal-in-mind-gpt-3-pegasus-and-new-frameworks-for-text-summarization-in-healthcare-and-bfsi\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/insidebigdata.com\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Have a Goal in Mind: GPT-3, PEGASUS, and New Frameworks for Text Summarization in Healthcare and BFSI\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/insidebigdata.com\/#website\",\"url\":\"https:\/\/insidebigdata.com\/\",\"name\":\"insideBIGDATA\",\"description\":\"Your Source for 
AI, Data Science, Deep Learning &amp; Machine Learning Strategies\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/insidebigdata.com\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\/\/insidebigdata.com\/#\/schema\/person\/2949e412c144601cdbcc803bd234e1b9\",\"name\":\"Editorial Team\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/insidebigdata.com\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/e137ce7ea40e38bd4d25bb7860cfe3e4?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/e137ce7ea40e38bd4d25bb7860cfe3e4?s=96&d=mm&r=g\",\"caption\":\"Editorial Team\"},\"sameAs\":[\"http:\/\/www.insidebigdata.com\"],\"url\":\"https:\/\/insidebigdata.com\/author\/editorial\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Have a Goal in Mind: GPT-3, PEGASUS, and New Frameworks for Text Summarization in Healthcare and BFSI - insideBIGDATA","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/insidebigdata.com\/2020\/12\/08\/have-a-goal-in-mind-gpt-3-pegasus-and-new-frameworks-for-text-summarization-in-healthcare-and-bfsi\/","og_locale":"en_US","og_type":"article","og_title":"Have a Goal in Mind: GPT-3, PEGASUS, and New Frameworks for Text Summarization in Healthcare and BFSI - insideBIGDATA","og_description":"In this contributed article, Dattaraj Rao, Innovation and R&D Architect at Persistent Systems, discusses the rise in interest for neutral network language models, specifically the recent Google PEGASUS model. 
This model not only shows remarkable promise when it comes to text summarization and synthesis, but its non-generalized approach could push industries such as healthcare to embrace NLP much earlier than was once supposed.","og_url":"https:\/\/insidebigdata.com\/2020\/12\/08\/have-a-goal-in-mind-gpt-3-pegasus-and-new-frameworks-for-text-summarization-in-healthcare-and-bfsi\/","og_site_name":"insideBIGDATA","article_publisher":"http:\/\/www.facebook.com\/insidebigdata","article_published_time":"2020-12-08T14:00:00+00:00","article_modified_time":"2020-12-09T18:25:37+00:00","og_image":[{"width":300,"height":200,"url":"https:\/\/insidebigdata.com\/wp-content\/uploads\/2019\/10\/NLP_shutterstock_299138114.jpg","type":"image\/jpeg"}],"author":"Editorial Team","twitter_card":"summary_large_image","twitter_creator":"@insideBigData","twitter_site":"@insideBigData","twitter_misc":{"Written by":"Editorial Team","Est. reading time":"6 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/insidebigdata.com\/2020\/12\/08\/have-a-goal-in-mind-gpt-3-pegasus-and-new-frameworks-for-text-summarization-in-healthcare-and-bfsi\/","url":"https:\/\/insidebigdata.com\/2020\/12\/08\/have-a-goal-in-mind-gpt-3-pegasus-and-new-frameworks-for-text-summarization-in-healthcare-and-bfsi\/","name":"Have a Goal in Mind: GPT-3, PEGASUS, and New Frameworks for Text Summarization in Healthcare and BFSI - 
insideBIGDATA","isPartOf":{"@id":"https:\/\/insidebigdata.com\/#website"},"datePublished":"2020-12-08T14:00:00+00:00","dateModified":"2020-12-09T18:25:37+00:00","author":{"@id":"https:\/\/insidebigdata.com\/#\/schema\/person\/2949e412c144601cdbcc803bd234e1b9"},"breadcrumb":{"@id":"https:\/\/insidebigdata.com\/2020\/12\/08\/have-a-goal-in-mind-gpt-3-pegasus-and-new-frameworks-for-text-summarization-in-healthcare-and-bfsi\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/insidebigdata.com\/2020\/12\/08\/have-a-goal-in-mind-gpt-3-pegasus-and-new-frameworks-for-text-summarization-in-healthcare-and-bfsi\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/insidebigdata.com\/2020\/12\/08\/have-a-goal-in-mind-gpt-3-pegasus-and-new-frameworks-for-text-summarization-in-healthcare-and-bfsi\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/insidebigdata.com\/"},{"@type":"ListItem","position":2,"name":"Have a Goal in Mind: GPT-3, PEGASUS, and New Frameworks for Text Summarization in Healthcare and BFSI"}]},{"@type":"WebSite","@id":"https:\/\/insidebigdata.com\/#website","url":"https:\/\/insidebigdata.com\/","name":"insideBIGDATA","description":"Your Source for AI, Data Science, Deep Learning &amp; Machine Learning Strategies","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/insidebigdata.com\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/insidebigdata.com\/#\/schema\/person\/2949e412c144601cdbcc803bd234e1b9","name":"Editorial 
Team","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/insidebigdata.com\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/e137ce7ea40e38bd4d25bb7860cfe3e4?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/e137ce7ea40e38bd4d25bb7860cfe3e4?s=96&d=mm&r=g","caption":"Editorial Team"},"sameAs":["http:\/\/www.insidebigdata.com"],"url":"https:\/\/insidebigdata.com\/author\/editorial\/"}]}},"jetpack_featured_media_url":"https:\/\/insidebigdata.com\/wp-content\/uploads\/2019\/10\/NLP_shutterstock_299138114.jpg","jetpack_shortlink":"https:\/\/wp.me\/p9eA3j-6AD","jetpack-related-posts":[{"id":32512,"url":"https:\/\/insidebigdata.com\/2023\/06\/03\/video-highlights-fine-tune-gpt-j-6b-in-under-3-hours-on-ipus\/","url_meta":{"origin":25335,"position":0},"title":"Video Highlights: Fine Tune GPT-J 6B in Under 3 Hours on IPUs","date":"June 3, 2023","format":false,"excerpt":"Did you know you can run GPT-J 6B on Graphcore IPU in the cloud? Following the now infamous leaked Google memo, there's been a real storm in the AI world recently around smaller, open source language models, like GPT-J, that are cheaper and faster to fine-tune, run and perform just\u2026","rel":"","context":"In &quot;AI Deep Learning&quot;","img":{"alt_text":"","src":"https:\/\/i0.wp.com\/insidebigdata.com\/wp-content\/uploads\/2019\/05\/Deep_Learning_shutterstock_386816095.jpg?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]},{"id":32861,"url":"https:\/\/insidebigdata.com\/2023\/07\/17\/brief-history-of-llms\/","url_meta":{"origin":25335,"position":1},"title":"Brief History of LLMs","date":"July 17, 2023","format":false,"excerpt":"The early days of natural language processing saw researchers experiment with many different approaches, including conceptual ontologies and rule-based systems. While some of these methods proved narrowly useful, none yielded robust results. 
That changed in the 2010s when NLP research intersected with the then-bustling field of neural networks. The collision\u2026","rel":"","context":"In &quot;AI Deep Learning&quot;","img":{"alt_text":"","src":"https:\/\/i0.wp.com\/insidebigdata.com\/wp-content\/uploads\/2023\/06\/GenerativeAI_shutterstock_2284999159_special.jpg?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]},{"id":23446,"url":"https:\/\/insidebigdata.com\/2019\/10\/18\/how-nlp-and-bert-will-change-the-language-game\/","url_meta":{"origin":25335,"position":2},"title":"How NLP and BERT Will Change the Language Game","date":"October 18, 2019","format":false,"excerpt":"In this contributed article, Rob Dalgety, Industry Specialist at Peltarion, discusses how the recent model open-sourced by Google in October 2018, BERT (Bidirectional Encoder Representations from Transformers, is now reshaping the NLP landscape. BERT is significantly more evolved in its understanding of word semantics given its context and has an\u2026","rel":"","context":"In &quot;AI Deep Learning&quot;","img":{"alt_text":"","src":"https:\/\/i0.wp.com\/insidebigdata.com\/wp-content\/uploads\/2019\/10\/Peltarion_pic1.jpg?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]},{"id":24908,"url":"https:\/\/insidebigdata.com\/2020\/08\/24\/interview-andy-horng-co-founder-and-head-of-ai-cultivate\/","url_meta":{"origin":25335,"position":3},"title":"Interview: Andy Horng, Co-Founder and Head of AI, Cultivate","date":"August 24, 2020","format":false,"excerpt":"I recently caught up with Andy Horng, Co-Founder and Head of AI at Cultivate, to get a sense for the technology underlying the company's AI-powered leadership development platform. 
NLP plays an important role, and as a result they're using the RoBERTa language model for very good results.","rel":"","context":"In &quot;AI Deep Learning&quot;","img":{"alt_text":"","src":"https:\/\/i0.wp.com\/insidebigdata.com\/wp-content\/uploads\/2020\/08\/Andy-Horng.jpg?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]},{"id":31837,"url":"https:\/\/insidebigdata.com\/2023\/03\/13\/data-science-bows-before-prompt-engineering-and-few-shot-learning\/","url_meta":{"origin":25335,"position":4},"title":"Data Science Bows Before Prompt Engineering and Few Shot Learning\u00a0","date":"March 13, 2023","format":false,"excerpt":"In this contributed article, editorial consultant Jelani Harper takes a new look at the GPT phenomenon by exploring how prompt engineering (stores, databases) coupled with few shot learning can constitute a significant adjunct to traditional data science.","rel":"","context":"In &quot;AI Deep Learning&quot;","img":{"alt_text":"","src":"https:\/\/i0.wp.com\/insidebigdata.com\/wp-content\/uploads\/2023\/02\/GPT4_shutterstock_2252419881_small.png?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]},{"id":31422,"url":"https:\/\/insidebigdata.com\/2023\/01\/18\/originality-ai-allows-users-to-quickly-detect-ai-written-content-with-a-chrome-extension\/","url_meta":{"origin":25335,"position":5},"title":"Originality.AI Allows Users to Quickly Detect AI Written Content With a Chrome Extension\u00a0","date":"January 18, 2023","format":false,"excerpt":"Originality.AI recently launched a tool that allows users to screen for content created by popular AI tools, such as ChatGPT. 
To increase efficiency for the user, Originality.AI has also launched a Google Chrome Extension to make it faster and easier to check content.","rel":"","context":"In &quot;AI Deep Learning&quot;","img":{"alt_text":"","src":"https:\/\/i0.wp.com\/img.youtube.com\/vi\/vYZy9giw9Fw\/0.jpg?resize=350%2C200","width":350,"height":200},"classes":[]}],"_links":{"self":[{"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/posts\/25335"}],"collection":[{"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/users\/10513"}],"replies":[{"embeddable":true,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/comments?post=25335"}],"version-history":[{"count":0,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/posts\/25335\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/media\/23389"}],"wp:attachment":[{"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/media?parent=25335"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/categories?post=25335"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/tags?post=25335"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}