{"id":34119,"date":"2023-12-08T03:00:00","date_gmt":"2023-12-08T11:00:00","guid":{"rendered":"https:\/\/insidebigdata.com\/?p=34119"},"modified":"2023-12-07T15:33:53","modified_gmt":"2023-12-07T23:33:53","slug":"hallucination-index-identifies-best-llms-for-most-popular-ai-use-cases","status":"publish","type":"post","link":"https:\/\/insidebigdata.com\/2023\/12\/08\/hallucination-index-identifies-best-llms-for-most-popular-ai-use-cases\/","title":{"rendered":"Hallucination Index Identifies Best LLMs for Most Popular AI Use Cases"},"content":{"rendered":"<div class=\"wp-block-image\">\n<figure class=\"alignright size-large\"><img decoding=\"async\" loading=\"lazy\" width=\"216\" height=\"117\" src=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2022\/05\/Galileo_logo.png\" alt=\"\" class=\"wp-image-29232\" srcset=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2022\/05\/Galileo_logo.png 216w, https:\/\/insidebigdata.com\/wp-content\/uploads\/2022\/05\/Galileo_logo-150x81.png 150w\" sizes=\"(max-width: 216px) 100vw, 216px\" \/><\/figure><\/div>\n\n\n<p><a href=\"https:\/\/www.rungalileo.io\/\" target=\"_blank\" rel=\"noreferrer noopener\">Galileo<\/a>, a leading machine learning (ML) company for unstructured data, released a Hallucination Index developed by its research arm, Galileo Labs, to help users of today\u2019s leading LLMs determine which model is least likely to hallucinate for their intended application. The findings can be viewed <a href=\"https:\/\/www.rungalileo.io\/blog\/hallucination-index\" target=\"_blank\" rel=\"noreferrer noopener\">HERE<\/a>. <\/p>\n\n\n\n<p><em>\u201c2023 has been the year of LLMs. While everyone from individual developers to Fortune 50 enterprises has been learning how to wrangle this novel new technology, two things are clear: first, LLMs are not one size fits all and second, hallucinations remain one of the greatest hurdles to LLM adoption,\u201d said Atindriyo Sanyal, Galileo\u2019s co-founder and CTO. 
\u201cTo help builders identify which LLMs to use for their applications, Galileo Labs created a ranking of the most popular LLMs based on their propensity to hallucinate using our proprietary hallucination evaluation metrics, Correctness and Context Adherence. We hope this effort sheds light on LLMs and helps teams pick the perfect LLM for their use case.\u201d<\/em><\/p>\n\n\n\n<p>While businesses of all sizes are building LLM-based applications, these efforts are being hindered by hallucinations that pose significant challenges in generating accurate and reliable responses. With hallucinations, AI generates information that appears realistic at first glance yet is ultimately incorrect or disconnected from the context.&nbsp;<\/p>\n\n\n\n<p>To help teams get a handle on hallucinations and identify the LLM that best suits their needs, Galileo Labs developed a Hallucination Index that takes 11 LLMs from OpenAI (GPT-4-0613, GPT-3.5-turbo-1106, GPT-3.5-turbo-0613 and GPT-3.5-turbo-instruct), Meta (Llama-2-70b, Llama-2-13b and Llama-2-7b), TII UAE (Falcon-40b-instruct), Mosaic ML (MPT-7b-instruct), Mistral.ai (Mistral-7b-instruct) and Hugging Face (Zephyr-7b-beta) and evaluates each LLM\u2019s likelihood to hallucinate in common generative AI task types.<\/p>\n\n\n\n<p>Key insights include:<\/p>\n\n\n\n<ul>\n<li>Question &amp; Answer without Retrieval-Augmented Generation (RAG):&nbsp;In this evaluation, OpenAI&#8217;s GPT-4 emerged as the top performer with a Correctness Score of 0.77, demonstrating remarkable accuracy and the least likelihood of hallucination in the Question &amp; Answer without RAG task, underscoring its dominance in general knowledge applications.<br><br>Among open source models, Meta&#8217;s Llama-2-70b leads (Correctness Score = 0.65), while other models like Meta\u2019s Llama-2-7b-chat and Mosaic ML\u2019s MPT-7b-instruct showed a higher propensity for hallucinations in similar tasks with Correctness Scores of 0.52 and 0.40 respectively.<br><br>The 
Index recommends GPT-4-0613 for reliable and accurate AI performance in this task type.<\/li>\n<\/ul>\n\n\n\n<ul>\n<li>Question &amp; Answer with RAG:&nbsp;OpenAI&#8217;s GPT-4-0613 excelled as the top contender with a Context Adherence Score of 0.76, while the more cost-effective and faster GPT-3.5-turbo-0613 and -1106 models matched its performance closely with Context Adherence Scores of 0.75 and 0.74 respectively.<br><br>Surprisingly, Hugging Face&#8217;s Zephyr-7b (Context Adherence Score = 0.71), an open source model, surpassed Meta&#8217;s much larger Llama-2-70b (Context Adherence Score = 0.68), challenging the notion that bigger models are inherently superior.<br><br>However, TII UAE&#8217;s Falcon-40b (Context Adherence Score = 0.60) and Mosaic ML&#8217;s MPT-7b (Context Adherence Score = 0.58) lagged for this task.<br><br>The Index recommends GPT-3.5-turbo-0613 for this task type.<br><\/li>\n\n\n\n<li>Long-form Text Generation:&nbsp;OpenAI&#8217;s GPT-4-0613 again emerged as a top performer (Correctness Score = 0.83), showing the least tendency to hallucinate, while GPT-3.5-turbo models (1106 and 0613) nearly matched it with Correctness Scores of 0.82 and 0.81 respectively, offering potential cost savings and enhanced performance.&nbsp;<br><br>Remarkably, Meta&#8217;s open source Llama-2-70b-chat rivaled GPT-4&#8217;s capabilities (Correctness Score = 0.82), presenting an efficient alternative for this task. Conversely, TII UAE&#8217;s Falcon-40b (Correctness Score = 0.65) and Mosaic ML&#8217;s MPT-7b (Correctness Score = 0.53) trailed behind.&nbsp;<br><br>The Index recommends Llama-2-70b-chat for an optimal balance of cost and performance in Long-form Text Generation.<\/li>\n\n\n\n<li>OpenAI Comes Out on Top:&nbsp;When it comes to hallucinations, OpenAI\u2019s models outperformed their peers across all task types. 
This, however, comes at a cost, as OpenAI\u2019s API-based pricing model can quickly drive up costs associated with building a Generative AI product.&nbsp;<\/li>\n\n\n\n<li>Open Source Cost Savings Opportunities:\u00a0Within OpenAI\u2019s model offerings, organizations can reduce spend by opting for lower-cost versions of their models, such as GPT-3.5-turbo. The biggest cost savings, however, come from going with open source models.\n<ul>\n<li>For Long-form Text Generation task types, Meta\u2019s open source Llama-2-13b-chat model is a worthy alternative to OpenAI\u2019s models.&nbsp;<\/li>\n\n\n\n<li>For Question &amp; Answer with RAG task types, users can confidently try the nimble but powerful Zephyr model from Hugging Face instead of OpenAI. Zephyr\u2019s inference cost is 10x lower than that of GPT-3.5 Turbo.&nbsp;<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<p>Supporting these analyses are Galileo\u2019s proprietary evaluation metrics, Correctness and Context Adherence. These metrics are powered by ChainPoll, a hallucination detection methodology developed by Galileo Labs. 
During the creation of the index, Galileo\u2019s evaluation metrics were shown to detect hallucinations with 87% accuracy, giving teams a reliable way to automatically detect hallucination risk and saving the time and cost typically spent on manual evaluation.<\/p>\n\n\n\n<p>By helping teams catch errors of stale knowledge, wrong knowledge, logical fallacies and mathematical errors, Galileo hopes to help organizations find the perfect LLM for their use case, move from sandbox to production and more quickly deploy reliable and trustworthy AI.&nbsp;<\/p>\n\n\n\n<p>Additional Resources:&nbsp;<\/p>\n\n\n\n<ul>\n<li>Read the paper \u201cChainPoll: A High Efficacy Method for LLM Hallucination Detection\u201d:\u00a0<a href=\"https:\/\/arxiv.org\/abs\/2310.18344\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/arxiv.org\/abs\/2310.18344<\/a><\/li>\n\n\n\n<li>Read the Hallucination Index blog:\u00a0<a href=\"https:\/\/www.rungalileo.io\/blog\/hallucination-index\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/www.rungalileo.io\/blog\/hallucination-index<\/a><\/li>\n<\/ul>\n\n\n\n<p><em>Sign up for the free insideBIGDATA\u00a0<a href=\"http:\/\/inside-bigdata.com\/newsletter\/\" target=\"_blank\" rel=\"noreferrer noopener\">newsletter<\/a>.<\/em><\/p>\n\n\n\n<p><em>Join us on Twitter:&nbsp;<a href=\"https:\/\/twitter.com\/InsideBigData1\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/twitter.com\/InsideBigData1<\/a><\/em><\/p>\n\n\n\n<p><em>Join us on LinkedIn:&nbsp;<a href=\"https:\/\/www.linkedin.com\/company\/insidebigdata\/\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/www.linkedin.com\/company\/insidebigdata\/<\/a><\/em><\/p>\n\n\n\n<p><em>Join us on Facebook:&nbsp;<a href=\"https:\/\/www.facebook.com\/insideBIGDATANOW\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/www.facebook.com\/insideBIGDATANOW<\/a><\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Galileo, a leading machine learning (ML) company for 
unstructured data, released a Hallucination Index developed by its research arm, Galileo Labs, to help users of today\u2019s leading LLMs determine which model is least likely to hallucinate for their intended application.<\/p>\n","protected":false},"author":10513,"featured_media":32645,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"footnotes":""},"categories":[526,180,67,268,56,84,1],"tags":[437,1245,1248,96],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v20.6 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Hallucination Index Identifies Best LLMs for Most Popular AI Use Cases - insideBIGDATA<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/insidebigdata.com\/2023\/12\/08\/hallucination-index-identifies-best-llms-for-most-popular-ai-use-cases\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Hallucination Index Identifies Best LLMs for Most Popular AI Use Cases - insideBIGDATA\" \/>\n<meta property=\"og:description\" content=\"Galileo, a leading machine learning (ML) company for unstructured data, released a Hallucination Index developed by its research arm, Galileo Labs, to help users of today\u2019s leading LLMs determine which model is least likely to hallucinate for their intended application.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/insidebigdata.com\/2023\/12\/08\/hallucination-index-identifies-best-llms-for-most-popular-ai-use-cases\/\" \/>\n<meta property=\"og:site_name\" content=\"insideBIGDATA\" \/>\n<meta property=\"article:publisher\" content=\"http:\/\/www.facebook.com\/insidebigdata\" \/>\n<meta property=\"article:published_time\" content=\"2023-12-08T11:00:00+00:00\" \/>\n<meta 
property=\"article:modified_time\" content=\"2023-12-07T23:33:53+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/insidebigdata.com\/wp-content\/uploads\/2023\/06\/AI_shutterstock_2287025875_special-1.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1100\" \/>\n\t<meta property=\"og:image:height\" content=\"550\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Editorial Team\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@insideBigData\" \/>\n<meta name=\"twitter:site\" content=\"@insideBigData\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Editorial Team\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"5 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/insidebigdata.com\/2023\/12\/08\/hallucination-index-identifies-best-llms-for-most-popular-ai-use-cases\/\",\"url\":\"https:\/\/insidebigdata.com\/2023\/12\/08\/hallucination-index-identifies-best-llms-for-most-popular-ai-use-cases\/\",\"name\":\"Hallucination Index Identifies Best LLMs for Most Popular AI Use Cases - 
insideBIGDATA\",\"isPartOf\":{\"@id\":\"https:\/\/insidebigdata.com\/#website\"},\"datePublished\":\"2023-12-08T11:00:00+00:00\",\"dateModified\":\"2023-12-07T23:33:53+00:00\",\"author\":{\"@id\":\"https:\/\/insidebigdata.com\/#\/schema\/person\/2949e412c144601cdbcc803bd234e1b9\"},\"breadcrumb\":{\"@id\":\"https:\/\/insidebigdata.com\/2023\/12\/08\/hallucination-index-identifies-best-llms-for-most-popular-ai-use-cases\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/insidebigdata.com\/2023\/12\/08\/hallucination-index-identifies-best-llms-for-most-popular-ai-use-cases\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/insidebigdata.com\/2023\/12\/08\/hallucination-index-identifies-best-llms-for-most-popular-ai-use-cases\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/insidebigdata.com\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Hallucination Index Identifies Best LLMs for Most Popular AI Use Cases\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/insidebigdata.com\/#website\",\"url\":\"https:\/\/insidebigdata.com\/\",\"name\":\"insideBIGDATA\",\"description\":\"Your Source for AI, Data Science, Deep Learning &amp; Machine Learning Strategies\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/insidebigdata.com\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\/\/insidebigdata.com\/#\/schema\/person\/2949e412c144601cdbcc803bd234e1b9\",\"name\":\"Editorial 
Team\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/insidebigdata.com\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/e137ce7ea40e38bd4d25bb7860cfe3e4?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/e137ce7ea40e38bd4d25bb7860cfe3e4?s=96&d=mm&r=g\",\"caption\":\"Editorial Team\"},\"sameAs\":[\"http:\/\/www.insidebigdata.com\"],\"url\":\"https:\/\/insidebigdata.com\/author\/editorial\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Hallucination Index Identifies Best LLMs for Most Popular AI Use Cases - insideBIGDATA","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/insidebigdata.com\/2023\/12\/08\/hallucination-index-identifies-best-llms-for-most-popular-ai-use-cases\/","og_locale":"en_US","og_type":"article","og_title":"Hallucination Index Identifies Best LLMs for Most Popular AI Use Cases - insideBIGDATA","og_description":"Galileo, a leading machine learning (ML) company for unstructured data, released a Hallucination Index developed by its research arm, Galileo Labs, to help users of today\u2019s leading LLMs determine which model is least likely to hallucinate for their intended application.","og_url":"https:\/\/insidebigdata.com\/2023\/12\/08\/hallucination-index-identifies-best-llms-for-most-popular-ai-use-cases\/","og_site_name":"insideBIGDATA","article_publisher":"http:\/\/www.facebook.com\/insidebigdata","article_published_time":"2023-12-08T11:00:00+00:00","article_modified_time":"2023-12-07T23:33:53+00:00","og_image":[{"width":1100,"height":550,"url":"https:\/\/insidebigdata.com\/wp-content\/uploads\/2023\/06\/AI_shutterstock_2287025875_special-1.jpg","type":"image\/jpeg"}],"author":"Editorial 
Team","twitter_card":"summary_large_image","twitter_creator":"@insideBigData","twitter_site":"@insideBigData","twitter_misc":{"Written by":"Editorial Team","Est. reading time":"5 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/insidebigdata.com\/2023\/12\/08\/hallucination-index-identifies-best-llms-for-most-popular-ai-use-cases\/","url":"https:\/\/insidebigdata.com\/2023\/12\/08\/hallucination-index-identifies-best-llms-for-most-popular-ai-use-cases\/","name":"Hallucination Index Identifies Best LLMs for Most Popular AI Use Cases - insideBIGDATA","isPartOf":{"@id":"https:\/\/insidebigdata.com\/#website"},"datePublished":"2023-12-08T11:00:00+00:00","dateModified":"2023-12-07T23:33:53+00:00","author":{"@id":"https:\/\/insidebigdata.com\/#\/schema\/person\/2949e412c144601cdbcc803bd234e1b9"},"breadcrumb":{"@id":"https:\/\/insidebigdata.com\/2023\/12\/08\/hallucination-index-identifies-best-llms-for-most-popular-ai-use-cases\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/insidebigdata.com\/2023\/12\/08\/hallucination-index-identifies-best-llms-for-most-popular-ai-use-cases\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/insidebigdata.com\/2023\/12\/08\/hallucination-index-identifies-best-llms-for-most-popular-ai-use-cases\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/insidebigdata.com\/"},{"@type":"ListItem","position":2,"name":"Hallucination Index Identifies Best LLMs for Most Popular AI Use Cases"}]},{"@type":"WebSite","@id":"https:\/\/insidebigdata.com\/#website","url":"https:\/\/insidebigdata.com\/","name":"insideBIGDATA","description":"Your Source for AI, Data Science, Deep Learning &amp; Machine Learning Strategies","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/insidebigdata.com\/?s={search_term_string}"},"query-input":"required 
name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/insidebigdata.com\/#\/schema\/person\/2949e412c144601cdbcc803bd234e1b9","name":"Editorial Team","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/insidebigdata.com\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/e137ce7ea40e38bd4d25bb7860cfe3e4?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/e137ce7ea40e38bd4d25bb7860cfe3e4?s=96&d=mm&r=g","caption":"Editorial Team"},"sameAs":["http:\/\/www.insidebigdata.com"],"url":"https:\/\/insidebigdata.com\/author\/editorial\/"}]}},"jetpack_featured_media_url":"https:\/\/insidebigdata.com\/wp-content\/uploads\/2023\/06\/AI_shutterstock_2287025875_special-1.jpg","jetpack_shortlink":"https:\/\/wp.me\/p9eA3j-8Sj","jetpack-related-posts":[{"id":29231,"url":"https:\/\/insidebigdata.com\/2022\/05\/03\/galileo-launches-to-give-data-scientists-the-superpowers-they-need-for-unstructured-data-machine-learning\/","url_meta":{"origin":34119,"position":0},"title":"Galileo Launches to Give Data Scientists the Superpowers They Need for Unstructured Data Machine Learning","date":"May 3, 2022","format":false,"excerpt":"Galileo emerged from stealth with the first machine learning (ML) data intelligence platform for unstructured data that gives data scientists the ability to inspect, discover and fix critical ML data errors 10x faster across the\u00a0entire\u00a0ML lifecycle \u2013 from pre-training to post-training to post-production. 
The platform is currently in private beta\u2026","rel":"","context":"In &quot;Big Data&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":33169,"url":"https:\/\/insidebigdata.com\/2023\/08\/23\/survey-more-than-75-of-enterprises-dont-plan-to-use-commercial-llms-in-production-citing-data-privacy-as-primary-concern\/","url_meta":{"origin":34119,"position":1},"title":"Survey: More than 75% of Enterprises Don\u2019t Plan to Use Commercial LLMs in Production Citing Data Privacy as Primary Concern\u00a0","date":"August 23, 2023","format":false,"excerpt":"Predibase, the commercially available low-code declarative ML platform for developers, today released a new report, \u201cBeyond the Buzz: A Look at Large Language Models in Production.\u201d Based on survey data from organizations experimenting with LLMs, the report offers insight into real-world concerns, opportunities, and priorities for organizations as they embrace\u2026","rel":"","context":"In &quot;AI Deep Learning&quot;","img":{"alt_text":"","src":"https:\/\/i0.wp.com\/insidebigdata.com\/wp-content\/uploads\/2023\/06\/GenerativeAI_shutterstock_2313909647_special.jpg?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]},{"id":32095,"url":"https:\/\/insidebigdata.com\/2023\/04\/12\/latest-version-of-zilliz-cloud-aims-to-cure-ai-hallucinations\/","url_meta":{"origin":34119,"position":2},"title":"Latest Version of Zilliz Cloud Aims to Cure AI \u2018Hallucinations\u2019","date":"April 12, 2023","format":false,"excerpt":"Zilliz Cloud is the managed service from\u00a0Zilliz, the inventors of Milvus, the open source vector database used by more than 1,000 enterprises around the world. It\u2019s purpose-built for AI and other applications powered by unstructured data. 
It represents data as high-dimensional vectors, or embeddings \u2014 the kind generated by machine-learning\u2026","rel":"","context":"In &quot;AI Deep Learning&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":33551,"url":"https:\/\/insidebigdata.com\/2023\/10\/01\/video-highlights-vicuna-gorilla-chatbot-arena-and-socially-beneficial-llms-with-prof-joey-gonzalez\/","url_meta":{"origin":34119,"position":3},"title":"Video Highlights: Vicu\u00f1a, Gorilla, Chatbot Arena and Socially Beneficial LLMs \u2014 with Prof. Joey Gonzalez","date":"October 1, 2023","format":false,"excerpt":"LLM Vicu\u00f1a, Chatbot Arena, and the race to increase LLM context windows: In this video presentation, guest Joey Gonzalez joins our good friend\u00a0Jon Krohn, Co-Founder and Chief Data Scientist at the machine learning company\u00a0Nebula,\u00a0to talk about developing models and platforms that leverage and improve LLMs, as well as the future\u2026","rel":"","context":"In &quot;AI Deep Learning&quot;","img":{"alt_text":"","src":"https:\/\/i0.wp.com\/insidebigdata.com\/wp-content\/uploads\/2023\/08\/Neural_net_shutterstock_1615182352_special.jpg?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]},{"id":33083,"url":"https:\/\/insidebigdata.com\/2023\/08\/08\/netspi-debuts-ml-ai-penetration-testing-a-holistic-approach-to-securing-machine-learning-models-and-llm-implementations\/","url_meta":{"origin":34119,"position":4},"title":"NetSPI Debuts ML\/AI Penetration Testing, a Holistic Approach to Securing Machine Learning Models and LLM Implementations","date":"August 8, 2023","format":false,"excerpt":"NetSPI, the global leader in offensive security, today debuted its ML\/AI Pentesting solution to bring a more holistic and proactive approach to safeguarding machine learning model implementations. 
The first-of-its-kind solution focuses on two core components: Identifying, analyzing, and remediating vulnerabilities on machine learning systems such as Large Language Models (LLMs)\u2026","rel":"","context":"In &quot;AI Deep Learning&quot;","img":{"alt_text":"","src":"https:\/\/i0.wp.com\/insidebigdata.com\/wp-content\/uploads\/2023\/08\/Data_center_shutterstock_1062915266_special.jpg?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]},{"id":32025,"url":"https:\/\/insidebigdata.com\/2023\/04\/15\/video-highlights-building-machine-learning-apps-with-hugging-face-llms-to-diffusion-modeling\/","url_meta":{"origin":34119,"position":5},"title":"Video Highlights: Building Machine Learning Apps with Hugging Face: LLMs to Diffusion Modeling","date":"April 15, 2023","format":false,"excerpt":"In this video presentation from our friends over at FourthBrain we have a timely presentation by Jeff Boudier, Product Director at Hugging Face, to discuss building machine learning apps with Hugging Face from LLMs to diffusion modeling.","rel":"","context":"In &quot;AI Deep 
Learning&quot;","img":{"alt_text":"","src":"https:\/\/i0.wp.com\/insidebigdata.com\/wp-content\/uploads\/2023\/02\/GPT4_shutterstock_2252419881_small.png?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]}],"_links":{"self":[{"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/posts\/34119"}],"collection":[{"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/users\/10513"}],"replies":[{"embeddable":true,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/comments?post=34119"}],"version-history":[{"count":0,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/posts\/34119\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/media\/32645"}],"wp:attachment":[{"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/media?parent=34119"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/categories?post=34119"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/insidebigdata.com\/wp-json\/wp\/v2\/tags?post=34119"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}