{"id":32512,"date":"2023-06-03T06:00:00","date_gmt":"2023-06-03T13:00:00","guid":{"rendered":"https:\/\/insidebigdata.com\/?p=32512"},"modified":"2023-05-30T14:23:08","modified_gmt":"2023-05-30T21:23:08","slug":"video-highlights-fine-tune-gpt-j-6b-in-under-3-hours-on-ipus","status":"publish","type":"post","link":"https:\/\/insidebigdata.com\/2023\/06\/03\/video-highlights-fine-tune-gpt-j-6b-in-under-3-hours-on-ipus\/","title":{"rendered":"Video Highlights: Fine Tune GPT-J 6B in Under 3 Hours on IPUs"},"content":{"rendered":"<div class=\"wp-block-image\">\n<figure class=\"alignright size-full\"><img decoding=\"async\" loading=\"lazy\" width=\"361\" height=\"154\" src=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2023\/05\/Graphcore_logo.png\" alt=\"\" class=\"wp-image-32513\" srcset=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2023\/05\/Graphcore_logo.png 361w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2023\/05\/Graphcore_logo-300x128.png 300w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2023\/05\/Graphcore_logo-150x64.png 150w\" sizes=\"(max-width: 361px) 100vw, 361px\" \/><\/figure><\/div>\n\n\n<p>Did you know you can run GPT-J 6B on <a href=\"https:\/\/www.graphcore.ai\/\" target=\"_blank\" rel=\"noreferrer noopener\">Graphcore<\/a> IPU in the cloud? 
Following the now-infamous leaked Google memo, there&#8217;s been a real storm in the AI world recently around smaller, open-source language models such as GPT-J, which are cheaper and faster to fine-tune and run, and which perform just as well as larger models on many language tasks.\u00a0<\/p>\n\n\n\n<p>Graphcore offers two ready-made notebooks for the pre-trained GPT-J model, available to try today on IPUs in the Paperspace cloud, covering fine-tuning and inference:<\/p>\n\n\n\n<p><a href=\"https:\/\/console.paperspace.com\/github\/graphcore\/Gradient-HuggingFace?machine=IPU-POD16&amp;container=graphcore%2Fpytorch-jupyter%3A3.2.0-ubuntu-20.04-20230331&amp;file=gptj-text-generation%2Ffinetuning.ipynb\" target=\"_blank\" rel=\"noreferrer noopener\">Text entailment on IPU using GPT-J &#8211; Fine-tuning<\/a><\/p>\n\n\n\n<p><a href=\"https:\/\/console.paperspace.com\/github\/graphcore\/Gradient-HuggingFace?machine=IPU-POD16&amp;container=graphcore%2Fpytorch-jupyter%3A3.2.0-ubuntu-20.04-20230331&amp;file=gptj-text-generation%2FGPTJ-generative-inference.ipynb\" target=\"_blank\" rel=\"noreferrer noopener\">Text generation on IPU using GPT-J &#8211; Inference<\/a><\/p>\n\n\n\n<p>Working through the GPT-J fine-tuning notebook in its entirety takes around 2 hours 45 minutes on a 16-IPU platform. This will vary, of course, if you bring your own data for fine-tuning &#8211; useful to know when estimating costs on Paperspace.<\/p>\n\n\n\n<p>To use GPT-J in production, contact Graphcore for special pricing, or run it much faster on a 64-IPU system, coming soon to Paperspace.<\/p>\n\n\n\n<p>What is GPT-J? It is a powerful, efficient alternative to large language models (LLMs) such as GPT-3 and GPT-4 for many NLP tasks. Fine-tuning GPT-J lets you tailor the model to specific applications using a task-relevant dataset. 
<\/p>\n\n\n\n<p>In the video presentation below, Graphcore engineer Sofia Liguori walks through the process of fine-tuning GPT-J 6B on a Paperspace Gradient notebook (a Google Colab alternative), powered by Graphcore IPUs. Run the Paperspace Gradient Notebook &#8211; Fine-tuning: text entailment on IPU using GPT-J. You can also use GPT-J for text generation (inference) on Paperspace. Run the Paperspace Gradient Notebook &#8211; Inference: text generation on IPU using GPT-J. Read more about the benefits of GPT-J in the company&#8217;s blog <a href=\"https:\/\/www.graphcore.ai\/posts\/fine-tuned-gpt-j-a-cost-effective-alternative-to-gpt-4-for-nlp-tasks\" target=\"_blank\" rel=\"noreferrer noopener\">Fine-tune GPT-J: effective GPT-4 alternative for many NLP tasks<\/a>. <\/p>\n\n\n\n<figure class=\"wp-block-embed aligncenter is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-4-3 wp-has-aspect-ratio\"><div class=\"wp-block-embed__wrapper\">\n<iframe loading=\"lazy\"  id=\"_ytid_34363\"  width=\"480\" height=\"360\"  data-origwidth=\"480\" data-origheight=\"360\" src=\"https:\/\/www.youtube.com\/embed\/bH6gq0M9bZ0?enablejsapi=1&#038;autoplay=0&#038;cc_load_policy=0&#038;cc_lang_pref=&#038;iv_load_policy=1&#038;loop=0&#038;modestbranding=0&#038;rel=1&#038;fs=1&#038;playsinline=0&#038;autohide=2&#038;theme=dark&#038;color=red&#038;controls=1&#038;\" class=\"__youtube_prefs__  epyt-is-override  no-lazyload\" title=\"YouTube player\"  allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture\" allowfullscreen data-no-lazy=\"1\" data-skipgform_ajax_framebjll=\"\"><\/iframe>\n<\/div><\/figure>\n\n\n\n<p><em>Sign up for the free insideBIGDATA&nbsp;<a href=\"http:\/\/inside-bigdata.com\/newsletter\/\" target=\"_blank\" rel=\"noreferrer noopener\">newsletter<\/a>.<\/em><\/p>\n\n\n\n<p><em>Join us on Twitter:&nbsp;<a href=\"https:\/\/twitter.com\/InsideBigData1\" target=\"_blank\" rel=\"noreferrer 
noopener\">https:\/\/twitter.com\/InsideBigData1<\/a><\/em><\/p>\n\n\n\n<p><em>Join us on LinkedIn:&nbsp;<a href=\"https:\/\/www.linkedin.com\/company\/insidebigdata\/\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/www.linkedin.com\/company\/insidebigdata\/<\/a><\/em><\/p>\n\n\n\n<p><em>Join us on Facebook:&nbsp;<a href=\"https:\/\/www.facebook.com\/insideBIGDATANOW\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/www.facebook.com\/insideBIGDATANOW<\/a><\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Did you know you can run GPT-J 6B on Graphcore IPU in the cloud? Following the now infamous leaked Google memo, there&#8217;s been a real storm in the AI world recently around smaller, open source language models, like GPT-J, that are cheaper and faster to fine-tune, run and perform just as well as larger models for many language tasks.\u00a0<\/p>\n","protected":false},"author":10513,"featured_media":22568,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"footnotes":""},"categories":[526,182,90,180,67,268,56,1,85],"tags":[437,264,1248,1131,96],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v20.6 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Video Highlights: Fine Tune GPT-J 6B in Under 3 Hours on IPUs - insideBIGDATA<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/insidebigdata.com\/2023\/06\/03\/video-highlights-fine-tune-gpt-j-6b-in-under-3-hours-on-ipus\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Video Highlights: Fine Tune GPT-J 6B in Under 3 Hours on IPUs - insideBIGDATA\" \/>\n<meta property=\"og:description\" content=\"Did you know you can run GPT-J 6B on Graphcore IPU in 
the cloud? Following the now infamous leaked Google memo, there&#039;s been a real storm in the AI world recently around smaller, open source language models, like GPT-J, that are cheaper and faster to fine-tune, run and perform just as well as larger models for many language tasks.\u00a0\" \/>\n<meta property=\"og:url\" content=\"https:\/\/insidebigdata.com\/2023\/06\/03\/video-highlights-fine-tune-gpt-j-6b-in-under-3-hours-on-ipus\/\" \/>\n<meta property=\"og:site_name\" content=\"insideBIGDATA\" \/>\n<meta property=\"article:publisher\" content=\"http:\/\/www.facebook.com\/insidebigdata\" \/>\n<meta property=\"article:published_time\" content=\"2023-06-03T13:00:00+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2023-05-30T21:23:08+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2019\/05\/Deep_Learning_shutterstock_386816095.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"300\" \/>\n\t<meta property=\"og:image:height\" content=\"240\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Editorial Team\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@insideBigData\" \/>\n<meta name=\"twitter:site\" content=\"@insideBigData\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Editorial Team\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"2 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/insidebigdata.com\/2023\/06\/03\/video-highlights-fine-tune-gpt-j-6b-in-under-3-hours-on-ipus\/\",\"url\":\"https:\/\/insidebigdata.com\/2023\/06\/03\/video-highlights-fine-tune-gpt-j-6b-in-under-3-hours-on-ipus\/\",\"name\":\"Video Highlights: Fine Tune GPT-J 6B in Under 3 Hours on IPUs - insideBIGDATA\",\"isPartOf\":{\"@id\":\"https:\/\/insidebigdata.com\/#website\"},\"datePublished\":\"2023-06-03T13:00:00+00:00\",\"dateModified\":\"2023-05-30T21:23:08+00:00\",\"author\":{\"@id\":\"https:\/\/insidebigdata.com\/#\/schema\/person\/2949e412c144601cdbcc803bd234e1b9\"},\"breadcrumb\":{\"@id\":\"https:\/\/insidebigdata.com\/2023\/06\/03\/video-highlights-fine-tune-gpt-j-6b-in-under-3-hours-on-ipus\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/insidebigdata.com\/2023\/06\/03\/video-highlights-fine-tune-gpt-j-6b-in-under-3-hours-on-ipus\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/insidebigdata.com\/2023\/06\/03\/video-highlights-fine-tune-gpt-j-6b-in-under-3-hours-on-ipus\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/insidebigdata.com\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Video Highlights: Fine Tune GPT-J 6B in Under 3 Hours on IPUs\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/insidebigdata.com\/#website\",\"url\":\"https:\/\/insidebigdata.com\/\",\"name\":\"insideBIGDATA\",\"description\":\"Your Source for AI, Data Science, Deep Learning &amp; Machine Learning Strategies\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/insidebigdata.com\/?s={search_term_string}\"},\"query-input\":\"required 
name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\/\/insidebigdata.com\/#\/schema\/person\/2949e412c144601cdbcc803bd234e1b9\",\"name\":\"Editorial Team\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/insidebigdata.com\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/e137ce7ea40e38bd4d25bb7860cfe3e4?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/e137ce7ea40e38bd4d25bb7860cfe3e4?s=96&d=mm&r=g\",\"caption\":\"Editorial Team\"},\"sameAs\":[\"http:\/\/www.insidebigdata.com\"],\"url\":\"https:\/\/insidebigdata.com\/author\/editorial\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Video Highlights: Fine Tune GPT-J 6B in Under 3 Hours on IPUs - insideBIGDATA","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/insidebigdata.com\/2023\/06\/03\/video-highlights-fine-tune-gpt-j-6b-in-under-3-hours-on-ipus\/","og_locale":"en_US","og_type":"article","og_title":"Video Highlights: Fine Tune GPT-J 6B in Under 3 Hours on IPUs - insideBIGDATA","og_description":"Did you know you can run GPT-J 6B on Graphcore IPU in the cloud? 
Following the now infamous leaked Google memo, there's been a real storm in the AI world recently around smaller, open source language models, like GPT-J, that are cheaper and faster to fine-tune, run and perform just as well as larger models for many language tasks.\u00a0","og_url":"https:\/\/insidebigdata.com\/2023\/06\/03\/video-highlights-fine-tune-gpt-j-6b-in-under-3-hours-on-ipus\/","og_site_name":"insideBIGDATA","article_publisher":"http:\/\/www.facebook.com\/insidebigdata","article_published_time":"2023-06-03T13:00:00+00:00","article_modified_time":"2023-05-30T21:23:08+00:00","og_image":[{"width":300,"height":240,"url":"https:\/\/insidebigdata.com\/wp-content\/uploads\/2019\/05\/Deep_Learning_shutterstock_386816095.jpg","type":"image\/jpeg"}],"author":"Editorial Team","twitter_card":"summary_large_image","twitter_creator":"@insideBigData","twitter_site":"@insideBigData","twitter_misc":{"Written by":"Editorial Team","Est. reading time":"2 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/insidebigdata.com\/2023\/06\/03\/video-highlights-fine-tune-gpt-j-6b-in-under-3-hours-on-ipus\/","url":"https:\/\/insidebigdata.com\/2023\/06\/03\/video-highlights-fine-tune-gpt-j-6b-in-under-3-hours-on-ipus\/","name":"Video Highlights: Fine Tune GPT-J 6B in Under 3 Hours on IPUs - 
insideBIGDATA","isPartOf":{"@id":"https:\/\/insidebigdata.com\/#website"},"datePublished":"2023-06-03T13:00:00+00:00","dateModified":"2023-05-30T21:23:08+00:00","author":{"@id":"https:\/\/insidebigdata.com\/#\/schema\/person\/2949e412c144601cdbcc803bd234e1b9"},"breadcrumb":{"@id":"https:\/\/insidebigdata.com\/2023\/06\/03\/video-highlights-fine-tune-gpt-j-6b-in-under-3-hours-on-ipus\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/insidebigdata.com\/2023\/06\/03\/video-highlights-fine-tune-gpt-j-6b-in-under-3-hours-on-ipus\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/insidebigdata.com\/2023\/06\/03\/video-highlights-fine-tune-gpt-j-6b-in-under-3-hours-on-ipus\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/insidebigdata.com\/"},{"@type":"ListItem","position":2,"name":"Video Highlights: Fine Tune GPT-J 6B in Under 3 Hours on IPUs"}]},{"@type":"WebSite","@id":"https:\/\/insidebigdata.com\/#website","url":"https:\/\/insidebigdata.com\/","name":"insideBIGDATA","description":"Your Source for AI, Data Science, Deep Learning &amp; Machine Learning Strategies","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/insidebigdata.com\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/insidebigdata.com\/#\/schema\/person\/2949e412c144601cdbcc803bd234e1b9","name":"Editorial Team","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/insidebigdata.com\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/e137ce7ea40e38bd4d25bb7860cfe3e4?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/e137ce7ea40e38bd4d25bb7860cfe3e4?s=96&d=mm&r=g","caption":"Editorial 
Team"},"sameAs":["http:\/\/www.insidebigdata.com"],"url":"https:\/\/insidebigdata.com\/author\/editorial\/"}]}},"jetpack_featured_media_url":"https:\/\/insidebigdata.com\/wp-content\/uploads\/2019\/05\/Deep_Learning_shutterstock_386816095.jpg","jetpack_shortlink":"https:\/\/wp.me\/p9eA3j-8so","jetpack-related-posts":[{"id":33231,"url":"https:\/\/insidebigdata.com\/2023\/08\/28\/generative-ai-report-nutanix-simplifies-adoption-of-generative-ai-with-new-nutanix-gpt-in-a-box-solution\/","url_meta":{"origin":32512,"position":0},"title":"Generative AI Report: Nutanix Simplifies Adoption of Generative AI with New Nutanix GPT-in-a-Box Solution","date":"August 28, 2023","format":false,"excerpt":"Nutanix\u00a0(NASDAQ:\u00a0NTNX), a leader in hybrid multicloud computing, announced the Nutanix GPT-in-a-Box\u2122\u00a0solution for customers looking to jump-start their artificial intelligence (AI) and machine learning (ML) innovation, while maintaining control over their data. The new offering is a full-stack software-defined AI-ready platform, along with services to help organizations size and configure hardware\u2026","rel":"","context":"In &quot;AI Deep Learning&quot;","img":{"alt_text":"","src":"https:\/\/i0.wp.com\/insidebigdata.com\/wp-content\/uploads\/2023\/08\/Generative_AI_shutterstock_2273007347_special.jpg?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]},{"id":33355,"url":"https:\/\/insidebigdata.com\/2023\/09\/13\/insidebigdata-ai-news-briefs-9-13-2023\/","url_meta":{"origin":32512,"position":1},"title":"insideBIGDATA AI News Briefs \u2013 9\/13\/2023","date":"September 13, 2023","format":false,"excerpt":"Welcome insideBIGDATA AI News Briefs, our timely new feature bringing you the latest industry insights and perspectives surrounding the field of AI including deep learning, large language models, generative AI, and transformers. 
We\u2019re working tirelessly to dig up the most timely and curious tidbits underlying the day\u2019s most popular technologies.\u2026","rel":"","context":"In &quot;AI Deep Learning&quot;","img":{"alt_text":"","src":"https:\/\/i0.wp.com\/insidebigdata.com\/wp-content\/uploads\/2023\/07\/AI-News-Briefs-column-banner.png?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]},{"id":31422,"url":"https:\/\/insidebigdata.com\/2023\/01\/18\/originality-ai-allows-users-to-quickly-detect-ai-written-content-with-a-chrome-extension\/","url_meta":{"origin":32512,"position":2},"title":"Originality.AI Allows Users to Quickly Detect AI Written Content With a Chrome Extension\u00a0","date":"January 18, 2023","format":false,"excerpt":"Originality.AI recently launched a tool that allows users to screen for content created by popular AI tools, such as ChatGPT. To increase efficiency for the user, Originality.AI has also launched a Google Chrome Extension to make it faster and easier to check content.","rel":"","context":"In &quot;AI Deep Learning&quot;","img":{"alt_text":"","src":"https:\/\/i0.wp.com\/img.youtube.com\/vi\/vYZy9giw9Fw\/0.jpg?resize=350%2C200","width":350,"height":200},"classes":[]},{"id":32866,"url":"https:\/\/insidebigdata.com\/2023\/07\/21\/ultimate-guide-to-scaling-ml-models-megatron-lm-zero-deepspeed-mixed-precision\/","url_meta":{"origin":32512,"position":3},"title":"Video Highlights: Ultimate Guide To Scaling ML Models &#8211; Megatron-LM | ZeRO | DeepSpeed | Mixed Precision","date":"July 21, 2023","format":false,"excerpt":"In this video presentation, Aleksa Gordi\u0107\u00a0explains what it takes to scale ML models up to trillions of parameters! 
He covers the fundamental ideas behind all of the recent big ML models like Meta's OPT-175B, BigScience BLOOM 176B, EleutherAI's GPT-NeoX-20B, GPT-J, OpenAI's GPT-3, Google's PaLM, DeepMind's Chinchilla\/Gopher models, etc.","rel":"","context":"In &quot;AI Deep Learning&quot;","img":{"alt_text":"","src":"https:\/\/i0.wp.com\/insidebigdata.com\/wp-content\/uploads\/2019\/12\/Machine_Learning_shutterstock_344688470.jpg?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]},{"id":32878,"url":"https:\/\/insidebigdata.com\/2023\/07\/25\/video-highlights-generative-ai-with-large-language-models\/","url_meta":{"origin":32512,"position":4},"title":"Video Highlights: Generative AI with Large Language Models","date":"July 25, 2023","format":false,"excerpt":"At an unprecedented pace, Large Language Models like GPT-4 are transforming the world in general and the field of data science in particular. This two-hour training video presentation by Jon Krohn, Co-Founder and Chief Data Scientist at the machine learning company Nebula, introduces deep learning transformer architectures including LLMs.","rel":"","context":"In &quot;AI Deep Learning&quot;","img":{"alt_text":"","src":"https:\/\/i0.wp.com\/insidebigdata.com\/wp-content\/uploads\/2023\/06\/GenerativeAI_shutterstock_2313909647_special.jpg?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]},{"id":32725,"url":"https:\/\/insidebigdata.com\/2023\/06\/28\/generative-ai-report-mosaicml-releases-open-source-mpt-30b-llms-trained-on-h100s-to-power-generative-ai-applications\/","url_meta":{"origin":32512,"position":5},"title":"MosaicML Releases Open-Source MPT-30B LLMs, Trained on H100s to Power Generative AI Applications","date":"June 28, 2023","format":false,"excerpt":"MosaicML\u00a0announced the availability of MPT-30B Base, Instruct, and Chat, the most advanced models in their MPT (MosaicML Pretrained Transformer) series of open-source large language models. 
These state-of-the-art models - which were trained with an 8k token context window - surpass the quality of the original GPT-3 and can be used\u2026","rel":"","context":"In &quot;AI Deep Learning&quot;","img":{"alt_text":"","src":"https:\/\/i0.wp.com\/insidebigdata.com\/wp-content\/uploads\/2023\/06\/GenerativeAI_shutterstock_2313909647_special.jpg?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]}],"_links":{"self":[{"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/posts\/32512"}],"collection":[{"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/users\/10513"}],"replies":[{"embeddable":true,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/comments?post=32512"}],"version-history":[{"count":0,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/posts\/32512\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/media\/22568"}],"wp:attachment":[{"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/media?parent=32512"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/categories?post=32512"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/tags?post=32512"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}