llm Archives - AI News
https://www.artificialintelligence-news.com/tag/llm/

Bob Briski, DEPT®: A dive into the future of AI-powered experiences
https://www.artificialintelligence-news.com/2023/10/25/bob-briski-dept-a-dive-into-future-ai-powered-experiences/
Wed, 25 Oct 2023 10:25:58 +0000

The post Bob Briski, DEPT®:  A dive into the future of AI-powered experiences appeared first on AI News.

AI News caught up with Bob Briski, CTO of DEPT®, to discuss the intricate fusion of creativity and technology that promises a new era in digital experiences.

At the core of DEPT®’s approach is the strategic utilisation of large language models. Briski articulated the delicate balance between the ‘pioneering’ and ‘boutique’ ethos encapsulated in their tagline: “pioneering work on a global scale with a boutique culture.”

While ‘pioneering’ and ‘boutique’ evoke innovation and personalised attention, ‘global scale’ signifies broad reach. DEPT® harnesses large language models to disseminate highly targeted, personalised messages to expansive audiences. These models, Briski pointed out, enable DEPT® to understand individuals at massive scale and foster meaningful, individualised interactions.

“The way that we have been using a lot of these large language models is really to deliver really small and targeted messages to a large audience,” says Briski.

However, the integration of AI into various domains – such as retail, sports, education, and healthcare – poses both opportunities and challenges. DEPT® navigates this complexity by leveraging generative AI and large language models trained on diverse datasets, including vast repositories like Wikipedia and the Library of Congress.

Briski emphasised the importance of marrying pre-trained data with DEPT®’s domain expertise to ensure precise contextual responses. This approach guarantees that clients receive accurate and relevant information tailored to their specific sectors.

“The pre-training of these models allows them to really expound upon a bunch of different domains,” explains Briski. “We can be pretty sure that the answer is correct and that we want to either send it back to the client or the consumer or some other system that is sitting in front of the consumer.”

One of DEPT®’s standout achievements lies in its foray into the web3 space and the metaverse. Briski shared details of the company’s collaboration with Roblox – a platform synonymous with interactive user experiences – which revolves around empowering users to create and enjoy user-generated content at an unprecedented scale.

DEPT®’s internal project, Prepare to Pioneer, epitomises its commitment to innovation by nurturing embryonic ideas within its ‘Greenhouse’. DEPT® hones concepts to withstand the rigours of the external world, ensuring only the most robust ideas reach their clients.

“We have this internal project called The Greenhouse where we take these seeds of ideas and try to grow them into something that’s tough enough to handle the external world,” says Briski. “The ones that do survive are much more ready to use with our clients.”

While the allure of AI-driven solutions is undeniable, Briski underscored the need for caution. He warned that AI is not inherently transparent or trustworthy, and emphasised the imperative of constructing robust foundations for quality assurance.

DEPT® employs automated testing to ensure responses align with expectations. Briski also stressed the importance of setting stringent parameters to guide AI conversations, ensuring alignment with the company’s ethos and desired consumer interactions.

“It’s important to really keep focused on the exact conversation you want to have with your consumer or your customer and put really strict guardrails around how you would like the model to answer those questions,” explains Briski.
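The guardrails Briski describes can be made concrete with automated checks that run before a model's reply ever reaches a consumer. The sketch below is illustrative only: the rule set, limits, and function names are assumptions, not DEPT®'s actual implementation.

```python
import re

# Hypothetical guardrail check: validate an LLM reply against simple
# rules (length cap, banned topics) before it is shown to a customer.
BANNED_PATTERNS = [r"\bmedical advice\b", r"\bguaranteed returns\b"]
MAX_LENGTH = 500  # characters; keep replies short and on-topic

def passes_guardrails(reply: str) -> bool:
    """Return True only if the reply stays within the configured guardrails."""
    if len(reply) > MAX_LENGTH:
        return False
    lowered = reply.lower()
    # Reject the reply if any banned pattern appears anywhere in it.
    return not any(re.search(p, lowered) for p in BANNED_PATTERNS)

print(passes_guardrails("Our store opens at 9am."))             # True
print(passes_guardrails("We promise guaranteed returns!"))      # False
```

In practice, such checks would sit alongside the automated response testing mentioned above, so that every generated answer is screened against the conversation the company actually wants to have.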

In December, DEPT® is sponsoring AI & Big Data Expo Global and will be in attendance to share its unique insights. Briski is a speaker at the event and will be providing a deep dive into business intelligence (BI), illuminating strategies to enhance responsiveness through large language models.

“I’ll be diving into how we can transform BI to be much more responsive to the business, especially with the help of large language models,” says Briski.

As DEPT® continues to redefine digital paradigms, we look forward to observing how the company’s innovations deliver a new era in AI-powered experiences.

DEPT® is a key sponsor of this year’s AI & Big Data Expo Global on 30 Nov – 1 Dec 2023. Swing by DEPT®’s booth to hear more about AI and LLMs from the company’s experts and watch Briski’s day one presentation.

Deutsche Telekom and SK Telecom partner on telco-focused LLM
https://www.artificialintelligence-news.com/2023/10/23/deutsche-telekom-and-sk-telecom-partner-telco-focused-llm/
Mon, 23 Oct 2023 14:31:39 +0000

The post Deutsche Telekom and SK Telecom partner on telco-focused LLM appeared first on AI News.

SK Telecom and Deutsche Telekom have officially inked a Letter of Intent (LOI) to collaborate on developing a specialised LLM (Large Language Model) tailored for telecommunication companies.

This momentous agreement – signed in a ceremony at SK Seorin Building, Seoul – marks the culmination of discussions initiated by the Global Telco AI Alliance, a consortium launched in July 2023 by SK Telecom, Deutsche Telekom, E&, and Singtel.

This innovative partnership aims to create a telco-specific LLM that empowers global telcos to effortlessly and rapidly construct generative AI models. With a focus on multilingual capabilities (including German, English, and Korean), this LLM is designed to enhance customer services—particularly in areas like AI-powered contact centres.

Claudia Nemat, Member of the Board of Management for Technology and Innovation at Deutsche Telekom, said:

“AI shows impressive potential to significantly enhance human problem-solving capabilities.

To maximise its use, especially in customer service, we need to adapt existing large language models and train them with our unique data. This will elevate our generative AI tools.”

The collaboration also involves key AI industry players, such as Anthropic (Claude 2) and Meta (Llama2), enabling the co-development of a sophisticated LLM.

Anticipated to debut in the first quarter of 2024, the new telco-focused LLM will offer a deeper understanding of telecommunication service-related areas and customer intent, surpassing the capabilities of general-purpose LLMs.

One of the primary objectives of this collaboration is to assist telcos worldwide in developing flexible generative AI services, including AI agents. By streamlining the process of building AI-driven solutions like contact centres, telcos can save time and costs and open new avenues for business growth and innovation.

Ryu Young-sang, CEO of SK Telecom, commented:

“Through our partnership with Deutsche Telekom, we have secured a strong opportunity and momentum to gain global AI leadership and drive new growth.

By combining the strengths and capabilities of the two companies in AI technology, platform, and infrastructure, we expect to empower enterprises in many different industries to deliver new and higher value to their customers.”

This collaboration signifies a proactive response to the escalating demand for AI solutions within the telco industry, promising a paradigm shift in the traditional telecommunications landscape. The announcement follows SK Telecom’s $100 million investment in Anthropic in August.

See also: UMG files landmark lawsuit against AI developer Anthropic

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with Digital Transformation Week.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

Jaromir Dzialo, Exfluency: How companies can benefit from LLMs
https://www.artificialintelligence-news.com/2023/10/20/jaromir-dzialo-exfluency-how-companies-can-benefit-from-llms/
Fri, 20 Oct 2023 15:13:43 +0000

The post Jaromir Dzialo, Exfluency: How companies can benefit from LLMs appeared first on AI News.

Can you tell us a little bit about Exfluency and what the company does?

Exfluency is a tech company providing hybrid intelligence solutions for multilingual communication. By harnessing AI and blockchain technology we provide tech-savvy companies with access to modern language tools. Our goal is to make linguistic assets as precious as any other corporate asset.

What tech trends have you noticed developing in the multilingual communication space?

As in every other walk of life, AI in general – and ChatGPT specifically – is dominating the agenda. Companies operating in the language space are either panicking or scrambling to play catch-up. The main challenge is the size of the tech deficit in this vertical. Innovation, and AI innovation in particular, is not a plug-in.

What are some of the benefits of using LLMs?

Off-the-shelf LLMs (ChatGPT, Bard, etc.) have a quick-fix attraction. Magically, it seems, well-formulated answers appear on your screen. One cannot fail to be impressed.

The true benefits of LLMs will be realised by the players who can provide immutable data with which to feed the models. They are what we feed them.

What do LLMs rely on when learning language?

Overall, LLMs learn language by analysing vast amounts of text data, understanding patterns and relationships, and using statistical methods to generate contextually appropriate responses. Their ability to generalise from data and generate coherent text makes them versatile tools for various language-related tasks.

Large Language Models (LLMs) like GPT-4 rely on a combination of data, pattern recognition, and statistical relationships to learn language. Here are the key components they rely on:

  1. Data: LLMs are trained on vast amounts of text data from the internet. This data includes a wide range of sources, such as books, articles, websites, and more. The diverse nature of the data helps the model learn a wide variety of language patterns, styles, and topics.
  2. Patterns and Relationships: LLMs learn language by identifying patterns and relationships within the data. They analyse the co-occurrence of words, phrases, and sentences to understand how they fit together grammatically and semantically.
  3. Statistical Learning: LLMs use statistical techniques to learn the probabilities of word sequences. They estimate the likelihood of a word appearing given the previous words in a sentence. This enables them to generate coherent and contextually relevant text.
  4. Contextual Information: LLMs focus on contextual understanding. They consider not only the preceding words but also the entire context of a sentence or passage. This contextual information helps them disambiguate words with multiple meanings and produce more accurate and contextually appropriate responses.
  5. Attention Mechanisms: Many LLMs, including GPT-4, employ attention mechanisms. These mechanisms allow the model to weigh the importance of different words in a sentence based on the context. This helps the model focus on relevant information while generating responses.
  6. Transfer Learning: LLMs use a technique called transfer learning. They are pretrained on a large dataset and then fine-tuned on specific tasks. This allows the model to leverage its broad language knowledge from pretraining while adapting to perform specialised tasks like translation, summarisation, or conversation.
  7. Encoder-Decoder Architecture: In certain tasks like translation or summarisation, LLMs use an encoder-decoder architecture. The encoder processes the input text and converts it into a context-rich representation, which the decoder then uses to generate the output text in the desired language or format.
  8. Feedback Loop: LLMs can learn from user interactions. When a user provides corrections or feedback on generated text, the model can adjust its responses based on that feedback over time, improving its performance.
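The statistical learning described in point 3 can be illustrated with a toy model far simpler than any real LLM: estimating the probability of the next word from counts of observed word pairs (bigrams). The corpus and numbers below are purely illustrative.

```python
from collections import Counter, defaultdict

# Toy corpus; a real LLM trains on billions of tokens, not nine words.
corpus = "the cat sat on the mat the cat ran".split()

# Count how often each word follows each other word.
bigram_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigram_counts[prev][nxt] += 1

def next_word_prob(prev: str, nxt: str) -> float:
    """Estimate P(nxt | prev) from the bigram counts."""
    total = sum(bigram_counts[prev].values())
    return bigram_counts[prev][nxt] / total if total else 0.0

# "the" is followed by "cat" twice and "mat" once, so P(cat | the) = 2/3.
print(next_word_prob("the", "cat"))
```

Modern LLMs replace these raw counts with neural networks conditioned on long contexts (points 4 and 5 above), but the underlying objective, predicting likely continuations, is the same.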

What are some of the challenges of using LLMs?

A fundamental issue, which has been there ever since we started giving away data to Google, Facebook and the like, is that “we” are the product. The big players are earning untold billions on our rush to feed their apps with our data. ChatGPT, for example, is enjoying the fastest growing onboarding in history. Just think how Microsoft has benefitted from the millions of prompts people have already thrown at it.

Open LLMs hallucinate and, because their answers are so well formulated, one can easily be duped into believing what they tell you. To make matters worse, there are no references or links to show where they sourced their answers.

How can these challenges be overcome?

LLMs are what we feed them. Blockchain technology allows us to create an immutable audit trail and with it immutable, clean data. No need to trawl the internet. In this manner we are in complete control of what data is going in, can keep it confidential, and support it with a wealth of useful meta data. It can also be multilingual!

Secondly, as this data is stored in our databases, we can also provide the necessary source links. If you can’t quite believe the answer to your prompt, open the source data directly to see who wrote it, when, in which language and which context.
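The core idea behind an immutable audit trail is a hash chain: each record stores the hash of the previous one, so any later edit to stored data breaks the chain and is detectable. Exfluency's actual implementation is not public; the field names and helper functions below are illustrative only.

```python
import hashlib
import json

def record_hash(record: dict) -> str:
    # Canonical JSON (sorted keys) so the same content always hashes the same.
    return hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()

def append_entry(chain: list, text: str, author: str, language: str) -> None:
    """Append a record that commits to the hash of the previous record."""
    prev = chain[-1]["hash"] if chain else "0" * 64
    entry = {"text": text, "author": author, "language": language, "prev": prev}
    entry["hash"] = record_hash({k: v for k, v in entry.items() if k != "hash"})
    chain.append(entry)

def chain_is_valid(chain: list) -> bool:
    """Recompute every hash and link; any tampering invalidates the chain."""
    for i, entry in enumerate(chain):
        body = {k: v for k, v in entry.items() if k != "hash"}
        if entry["hash"] != record_hash(body):
            return False
        if i and entry["prev"] != chain[i - 1]["hash"]:
            return False
    return True
```

Because each entry also carries metadata such as author and language, a doubtful answer can be traced back to exactly who wrote the source data, when, and in which context, which is the traceability described above.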

What advice would you give to companies that want to utilise private, anonymised LLMs for multilingual communication?

Make sure your data is immutable, multilingual, of a high quality – and stored for your eyes only. LLMs then become a true game changer.

What do you think the future holds for multilingual communication?

As in many other walks of life, language will embrace forms of hybrid intelligence. For example, in the Exfluency ecosystem, the AI-driven workflow takes care of 90% of the translation – our fantastic bilingual subject matter experts then only need to focus on the final 10%. This balance will change over time – AI will take an ever-increasing proportion of the workload. But the human input will remain crucial. The concept is encapsulated in our strapline: Powered by technology, perfected by people.

What plans does Exfluency have for the coming year?

Lots! We aim to roll out the tech to new verticals and build communities of SMEs to serve them. There is also great interest in our Knowledge Mining app, designed to leverage the information hidden away in the millions of linguistic assets. 2024 is going to be exciting!

  • Jaromir Dzialo is the co-founder and CTO of Exfluency, which offers affordable AI-powered language and security solutions with global talent networks for organisations of all sizes.

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with Digital Transformation Week.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

MLPerf Inference v3.1 introduces new LLM and recommendation benchmarks
https://www.artificialintelligence-news.com/2023/09/12/mlperf-inference-v3-1-new-llm-recommendation-benchmarks/
Tue, 12 Sep 2023 11:46:58 +0000

The post MLPerf Inference v3.1 introduces new LLM and recommendation benchmarks appeared first on AI News.

The latest release of MLPerf Inference introduces new LLM and recommendation benchmarks, marking a leap forward in the realm of AI testing.

The v3.1 iteration of the benchmark suite has seen record participation, boasting over 13,500 performance results and delivering up to a 40 percent improvement in performance. 

What sets this achievement apart is the diverse pool of 26 different submitters and over 2,000 power results, demonstrating the broad spectrum of industry players investing in AI innovation.

Among the list of submitters are tech giants like Google, Intel, and NVIDIA, as well as newcomers Connect Tech, Nutanix, Oracle, and TTA, who are participating in the MLPerf Inference benchmark for the first time.

David Kanter, Executive Director of MLCommons, highlighted the significance of this achievement:

“Submitting to MLPerf is not trivial. It’s a significant accomplishment, as this is not a simple point-and-click benchmark. It requires real engineering work and is a testament to our submitters’ commitment to AI, to their customers, and to ML.”

MLPerf Inference is a critical benchmark suite that measures the speed at which AI systems can execute models in various deployment scenarios. These scenarios span from the latest generative AI chatbots to the safety-enhancing features in vehicles, such as automatic lane-keeping and speech-to-text interfaces.
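In miniature, the kind of measurement MLPerf formalises looks like the harness below: time repeated calls to a model and report throughput and tail latency. The "model" here is a stand-in function, and real MLPerf scenarios (server, offline, single-stream) impose far stricter rules than this sketch.

```python
import statistics
import time

def model(x):
    # Placeholder workload standing in for real model inference.
    return sum(i * i for i in range(1000))

def benchmark(fn, runs: int = 50) -> dict:
    """Time `runs` calls to fn and summarise throughput and p90 latency."""
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(None)
        latencies.append(time.perf_counter() - start)
    return {
        "throughput_qps": runs / sum(latencies),
        # statistics.quantiles with n=10 returns 9 cut points; the last is p90.
        "p90_latency_s": statistics.quantiles(latencies, n=10)[-1],
    }

stats = benchmark(model)
print(sorted(stats))  # ['p90_latency_s', 'throughput_qps']
```

Tail latency matters here because the deployment scenarios named above, such as lane-keeping or speech-to-text, care about worst-case responsiveness, not just average speed.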

The spotlight of MLPerf Inference v3.1 shines on the introduction of two new benchmarks:

  • An LLM benchmark utilising the GPT-J reference model to summarise CNN news articles, which garnered submissions from 15 different participants – showcasing the rapid adoption of generative AI.
  • An updated recommender benchmark – refined to align more closely with industry practices – employing the DLRM-DCNv2 reference model and larger datasets, which attracted nine submissions.

These new benchmarks are designed to push the boundaries of AI, ensuring that industry-standard benchmarks remain aligned with the latest trends in AI adoption and serving as a valuable guide for customers, vendors, and researchers alike.

Mitchelle Rasquinha, co-chair of the MLPerf Inference Working Group, commented: “The submissions for MLPerf Inference v3.1 are indicative of a wide range of accelerators being developed to serve ML workloads.

“The current benchmark suite has broad coverage among ML domains, and the most recent addition of GPT-J is a welcome contribution to the generative AI space. The results should be very helpful to users when selecting the best accelerators for their respective domains.”

MLPerf Inference benchmarks primarily focus on datacenter and edge systems. The v3.1 submissions showcase various processors and accelerators across use cases in computer vision, recommender systems, and language processing.

The benchmark suite encompasses both open and closed submissions in the performance, power, and networking categories. Closed submissions employ the same reference model to ensure a level playing field across systems, while participants in the open division are permitted to submit a variety of models.

As AI continues to permeate various aspects of our lives, MLPerf’s benchmarks serve as vital tools for evaluating and shaping the future of AI technology.

Find the detailed results of MLPerf Inference v3.1 here.

(Photo by Mauro Sbicego on Unsplash)

See also: GitLab: Developers view AI as ‘essential’ despite concerns

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with Digital Transformation Week.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

Anthropic launches ChatGPT rival Claude 2
https://www.artificialintelligence-news.com/2023/07/12/anthropic-launches-chatgpt-rival-claude-2/
Wed, 12 Jul 2023 15:28:16 +0000

The post Anthropic launches ChatGPT rival Claude 2 appeared first on AI News.

Anthropic has launched Claude 2, an advanced large language model (LLM) that excels in coding, mathematics, and reasoning tasks.

Claude 2 is designed to simulate conversations with a helpful colleague or personal assistant. The latest version has been fine-tuned to deliver an improved user experience, with enhanced conversational abilities, clearer explanations, reduced production of harmful outputs, and extended memory.

In coding proficiency, Claude 2 outperforms its predecessor and achieves a higher score on the Codex HumanEval Python programming test. Its proficiency in solving grade-school math problems, evaluated through GSM8k, has also seen a notable improvement.
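Coding scores like the Codex HumanEval result are typically reported as pass@k: the probability that at least one of k sampled solutions passes the unit tests. The unbiased estimator below (n samples drawn, c of them correct) is the standard one from the HumanEval paper; the specific numbers in the example are illustrative.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate: 1 - C(n-c, k) / C(n, k)."""
    if n - c < k:
        # Too few incorrect samples to fill a draw of size k: some
        # correct sample is guaranteed to appear.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# With 3 correct solutions out of 10 samples, a single draw passes 30% of the time.
print(pass_at_k(10, 3, 1))  # 0.3
```

The estimator avoids the bias of simply sampling k solutions once: it averages over every possible size-k subset of the n samples.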

“When it comes to AI coding, devs need fast and reliable access to context about their unique codebase and a powerful LLM with a large context window and strong general reasoning capabilities,” says Quinn Slack, CEO & Co-founder of Sourcegraph.

“The slowest and most frustrating parts of the dev workflow are becoming faster and more enjoyable. Thanks to Claude 2, Cody’s helping more devs build more software that pushes the world forward.”

Claude 2 introduces expanded input and output length capabilities, allowing it to process prompts of up to 100,000 tokens. This enhancement enables the model to analyse lengthy documents such as technical guides or entire books, and generate longer compositions as outputs.

“We are really happy to be among the first to offer Claude 2 to our customers, bringing enhanced semantics, up-to-date knowledge training, improved reasoning for complex prompts, and the ability to effortlessly remix existing content with a 3X larger context window,” said Greg Larson, VP of Engineering at Jasper.

“We are proud to help our customers stay ahead of the curve through partnerships like this one with Anthropic.”

Anthropic has focused on minimising the generation of harmful or offensive outputs by Claude 2. While measuring such qualities is challenging, an internal evaluation showed that Claude 2 was twice as effective at providing harmless responses compared to its predecessor, Claude 1.3.

Anthropic acknowledges that while Claude 2 can analyse complex works, it is vital to recognise the limitations of language models. Users should exercise caution and not rely on them as factual references. Instead, Claude 2 should be utilised to process data provided by users who are already knowledgeable about the subject matter and can validate the results.

As users leverage Claude 2’s capabilities, it is crucial to understand its limitations and use it responsibly for tasks that align with its strengths, such as information summarisation and organisation.

Users can explore Claude 2 for free here.

(Image Credit: Anthropic)

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The event is co-located with Digital Transformation Week.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

Databricks acquires LLM pioneer MosaicML for $1.3B
https://www.artificialintelligence-news.com/2023/06/28/databricks-acquires-llm-pioneer-mosaicml-for-1-3b/
Wed, 28 Jun 2023 09:22:15 +0000

The post Databricks acquires LLM pioneer MosaicML for $1.3B appeared first on AI News.

Databricks has announced its definitive agreement to acquire MosaicML, a pioneer in large language models (LLMs).

This strategic move aims to make generative AI accessible to organisations of all sizes, allowing them to develop, possess, and safeguard their own generative AI models using their own data. 

The acquisition, valued at ~$1.3 billion – inclusive of retention packages – showcases Databricks’ commitment to democratising AI and reinforcing the company’s Lakehouse platform as a leading environment for building generative AI and LLMs.

Naveen Rao, Co-Founder and CEO at MosaicML, said:

“At MosaicML, we believe in a world where everyone is empowered to build and train their own models, imbued with their own opinions and viewpoints — and joining forces with Databricks will help us make that belief a reality.

We started MosaicML to solve the hard engineering and research problems necessary to make large-scale training more accessible to everyone. With the recent generative AI wave, this mission has taken centre stage.

Together with Databricks, we will tip the scales in the favour of many — and we’ll do it as kindred spirits: researchers turned entrepreneurs sharing a similar mission. We look forward to continuing this journey together with the AI community.”

MosaicML has gained recognition for its cutting-edge MPT large language models, with millions of downloads for MPT-7B and the recent release of MPT-30B.

The platform has demonstrated how organisations can swiftly construct and train their own state-of-the-art models cost-effectively by utilising their own data. Esteemed customers like AI2, Generally Intelligent, Hippocratic AI, Replit, and Scatter Labs have leveraged MosaicML for a diverse range of generative AI applications.

The primary objective of this acquisition is to provide organisations with a simple and rapid method to develop, own, and secure their models. By combining the capabilities of Databricks’ Lakehouse Platform with MosaicML’s technology, customers can maintain control, security, and ownership of their valuable data without incurring exorbitant costs.

MosaicML’s automatic optimisation of model training enables 2x-7x faster training compared to standard approaches, and the near linear scaling of resources allows for the training of multi-billion-parameter models within hours. Consequently, Databricks and MosaicML aim to reduce the cost of training and utilising LLMs from millions to thousands of dollars.
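A back-of-the-envelope calculation shows why such speedups translate directly into the cost reduction claimed above. The GPU-hour figures and hourly rate below are hypothetical, not MosaicML pricing.

```python
def training_cost(gpu_hours: float, rate_per_hour: float, speedup: float = 1.0) -> float:
    """Cost of a training run: hours shrink proportionally with speedup."""
    return gpu_hours / speedup * rate_per_hour

# Hypothetical run: 100,000 GPU-hours at $2/hour.
baseline = training_cost(100_000, 2.0)        # $200,000 with no optimisation
best_case = training_cost(100_000, 2.0, 7.0)  # ~$28,571 at a 7x speedup
print(baseline, round(best_case))
```

Near-linear scaling works the same way on wall-clock time: quadrupling the GPU count finishes a run in roughly a quarter of the time at roughly the same total cost, which is how multi-billion-parameter models can be trained within hours.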

The integration of Databricks’ unified Data and AI platform with MosaicML’s generative AI training capabilities will result in a robust and flexible platform capable of serving the largest organisations and addressing various AI use cases.

Upon the completion of the transaction, the entire MosaicML team – including its renowned research team – is expected to join Databricks.

MosaicML’s machine learning and neural networks experts are at the forefront of AI research, striving to enhance model training efficiency. They have contributed to popular open-source foundational models like MPT-30B, as well as the training algorithms powering MosaicML’s products.

The MosaicML platform will be progressively supported, scaled, and integrated to provide customers with a seamless unified platform where they can build, own, and secure their generative AI models. The partnership between Databricks and MosaicML empowers customers with the freedom to construct their own models, train them using their unique data, and develop differentiating intellectual property for their businesses.

The completion of the proposed acquisition is subject to customary closing conditions, including regulatory clearances.

(Photo by Glen Carrie on Unsplash)

See also: MosaicML’s latest models outperform GPT-3 with just 30B parameters

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The event is co-located with Digital Transformation Week.

MosaicML’s latest models outperform GPT-3 with just 30B parameters
https://www.artificialintelligence-news.com/2023/06/23/mosaicml-models-outperform-gpt-3-30b-parameters/
Fri, 23 Jun 2023 08:16:22 +0000

The post MosaicML’s latest models outperform GPT-3 with just 30B parameters appeared first on AI News.

]]>
Open-source LLM provider MosaicML has announced the release of its most advanced models to date, the MPT-30B Base, Instruct, and Chat.

These state-of-the-art models have been trained on the MosaicML Platform using NVIDIA’s latest-generation H100 accelerators and claim to offer superior quality compared to the original GPT-3 model.

With MPT-30B, businesses can leverage the power of generative AI while maintaining data privacy and security.

Since their launch in May 2023, the MPT-7B models have gained significant popularity, with over 3.3 million downloads. The newly released MPT-30B models provide even higher quality and open up new possibilities for various applications.

MosaicML’s MPT models are optimised for efficient training and inference, allowing developers to build and deploy enterprise-grade models with ease.

One notable achievement of MPT-30B is that it surpasses the quality of GPT-3 while using only 30 billion parameters, compared to GPT-3’s 175 billion. This makes MPT-30B far easier to run on local hardware and significantly cheaper to deploy for inference.
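
The practical impact of the smaller parameter count can be seen with a back-of-the-envelope memory estimate. This is only a sketch: it assumes 16-bit weights (two bytes per parameter) and ignores activations and KV-cache overhead, which add more on top.

```python
# Rough GPU memory needed just to hold model weights in 16-bit precision.
# Activations and the KV cache require additional memory at inference time.
def weight_memory_gb(params_billions: float, bytes_per_param: int = 2) -> float:
    """Approximate weight memory in GB for a model of the given size."""
    return params_billions * 1e9 * bytes_per_param / 1e9

print(weight_memory_gb(30))   # MPT-30B: 60.0 GB -> fits on a single 80 GB accelerator
print(weight_memory_gb(175))  # GPT-3:  350.0 GB -> requires several GPUs
```

Under these assumptions, MPT-30B’s weights fit within a single modern 80 GB accelerator, while a 175-billion-parameter model must be sharded across multiple devices.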

The cost of training custom models based on MPT-30B is also considerably lower than the estimates for training the original GPT-3, making it an attractive option for enterprises.

Furthermore, MPT-30B was trained on longer sequences of up to 8,000 tokens, enabling it to handle data-heavy enterprise applications. Its performance is underpinned by NVIDIA’s H100 GPUs, which deliver increased throughput and faster training times.

Several companies have already embraced MosaicML’s MPT models for their AI applications. 

Replit, maker of a web-based IDE, built a code generation model using its proprietary data and MosaicML’s training platform, resulting in improved code quality, speed, and cost-effectiveness.

Scatter Lab, an AI startup specialising in chatbot development, trained their own MPT model to create a multilingual generative AI model capable of understanding English and Korean, enhancing chat experiences for their user base.

Navan, a global travel and expense management software company, is leveraging the MPT foundation to develop custom LLMs for applications such as virtual travel agents and conversational business intelligence agents.

Ilan Twig, Co-Founder and CTO at Navan, said:

“At Navan, we use generative AI across our products and services, powering experiences such as our virtual travel agent and our conversational business intelligence agent.

MosaicML’s foundation models offer state-of-the-art language capabilities while being extremely efficient to fine-tune and serve inference at scale.” 

Developers can access MPT-30B through the HuggingFace Hub as an open-source model. They have the flexibility to fine-tune the model on their data and deploy it for inference on their infrastructure.

Alternatively, developers can use MosaicML’s managed endpoint, MPT-30B-Instruct, which offers hassle-free model inference at $0.005 per 1,000 tokens, a fraction of the cost of comparable endpoints.
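
At that rate, serving costs reduce to simple arithmetic. A minimal sketch follows: the per-token price is the one quoted above, while the workload figures are hypothetical.

```python
# Estimate monthly inference cost for a managed endpoint priced at
# $0.005 per 1,000 tokens (the MPT-30B-Instruct rate quoted in the article).
PRICE_PER_1K_TOKENS = 0.005  # USD

def monthly_cost(requests_per_day: int, tokens_per_request: int, days: int = 30) -> float:
    """Return the estimated cost in USD for a given workload."""
    total_tokens = requests_per_day * tokens_per_request * days
    return total_tokens / 1000 * PRICE_PER_1K_TOKENS

# Hypothetical workload: 10,000 requests/day, 1,500 tokens each (prompt + completion)
cost = monthly_cost(10_000, 1_500)
print(f"${cost:,.2f}")  # prints $2,250.00 for 450 million tokens per month
```

At this hypothetical volume, a month of inference comes to a few thousand dollars, which illustrates why per-token pricing at this level is attractive relative to hosting a large model in-house.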

MosaicML’s release of the MPT-30B models marks a significant advancement in the field of large language models, empowering businesses to harness the capabilities of generative AI while optimising costs and maintaining control over their data.

(Photo by Joshua Golde on Unsplash)

]]>
https://www.artificialintelligence-news.com/2023/06/23/mosaicml-models-outperform-gpt-3-30b-parameters/feed/ 0
Palantir demos how AI can be used in the military https://www.artificialintelligence-news.com/2023/04/28/palantir-demos-how-ai-can-used-military/ https://www.artificialintelligence-news.com/2023/04/28/palantir-demos-how-ai-can-used-military/#respond Fri, 28 Apr 2023 13:29:50 +0000 https://www.artificialintelligence-news.com/?p=12995 Palantir has demonstrated how AI can be used for national defense and other military purposes. The use of AI in the military is highly controversial. In this context, Large Language Models (LLMs) and algorithms must be implemented as ethically as possible. Palantir believes that’s where its AI Platform (AIP) comes in. AIP offers cutting-edge AI... Read more »

The post Palantir demos how AI can be used in the military appeared first on AI News.

]]>
Palantir has demonstrated how AI can be used for national defense and other military purposes.

The use of AI in the military is highly controversial. In this context, Large Language Models (LLMs) and algorithms must be implemented as ethically as possible.

Palantir believes that’s where its AI Platform (AIP) comes in. AIP offers cutting-edge AI capabilities and claims to ensure that the use of LLMs and AI in the military context is guided by ethical principles.

AIP is able to deploy LLMs and AI across any network, from classified networks to devices on the tactical edge. AIP connects highly sensitive and classified intelligence data to create a real-time representation of the environment.

The solution’s security features let operators define what LLMs and AI can and cannot see, and what they can and cannot do, using safe AI and handoff functions. This control and governance are crucial for mitigating the significant legal, regulatory, and ethical risks posed by LLMs and AI in sensitive and classified settings.

AIP also implements guardrails to control and govern AI use and to increase trust. As operators and AI take action on the platform, AIP generates a secure digital record of operations. These capabilities are essential for the responsible, effective, and compliant deployment of AI in the military.

In a demo showcasing AIP, a military operator responsible for monitoring activity within Eastern Europe receives an alert that military equipment is amassed in a field 30km from friendly forces.

AIP leverages large language models to let operators quickly ask questions and issue commands such as:

  • What enemy units are in the region?
  • Task new imagery for this location at a resolution of one metre or higher
  • Generate three courses of action to target this enemy equipment
  • Analyse the battlefield, considering a Stryker vehicle and a platoon-size unit
  • How many Javelin missiles does Team Omega have?
  • Assign jammers to each of the validated high-priority communications targets
  • Summarise the operational plan

As the operator poses these questions, the LLM draws on real-time information integrated from across public and classified sources. Data is automatically tagged and protected by classification markings, and AIP enforces which parts of the organisation the LLM has access to while respecting each individual’s permissions, role, and need to know.
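
The access-control behaviour described above can be sketched in a few lines. This is a purely hypothetical illustration: Palantir has not published AIP’s implementation, and the clearance levels, record fields, and helper names below are invented for the example.

```python
# Hypothetical sketch of classification-aware retrieval: the model is only
# shown records whose markings the requesting operator is cleared for.
CLEARANCE_ORDER = ["UNCLASSIFIED", "CONFIDENTIAL", "SECRET", "TOP SECRET"]

def visible_records(records, operator_clearance):
    """Filter records down to those at or below the operator's clearance."""
    limit = CLEARANCE_ORDER.index(operator_clearance)
    return [r for r in records if CLEARANCE_ORDER.index(r["marking"]) <= limit]

records = [
    {"id": 1, "marking": "UNCLASSIFIED", "text": "open-source report"},
    {"id": 2, "marking": "TOP SECRET", "text": "signals intelligence"},
]

# An operator cleared to SECRET never sees the TOP SECRET record.
print([r["id"] for r in visible_records(records, "SECRET")])  # [1]
```

The key design point is that filtering happens before the data reaches the model, so the LLM cannot leak what it was never shown, regardless of how it is prompted.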

Every response from AIP retains links back to the underlying data records to enable transparency for the user who can investigate as necessary.

AIP unleashes the power of large language models and cutting-edge AI for defense and military organisations while aiming to do so with the appropriate guardrails and high levels of ethics and transparency that are required for such sensitive applications.


(Image Credit: Palantir)

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

]]>
https://www.artificialintelligence-news.com/2023/04/28/palantir-demos-how-ai-can-used-military/feed/ 0
Alibaba unveils ChatGPT rival and custom LLMs https://www.artificialintelligence-news.com/2023/04/11/alibaba-unveils-chatgpt-rival-custom-llms/ https://www.artificialintelligence-news.com/2023/04/11/alibaba-unveils-chatgpt-rival-custom-llms/#respond Tue, 11 Apr 2023 12:40:51 +0000 https://www.artificialintelligence-news.com/?p=12910 Chinese tech giant Alibaba has unveiled a ChatGPT rival and the ability to create custom LLMs (Large Language Models) for customers. Alibaba’s ChatGPT rival is called Tongyi Qianwen and will be integrated across the company’s various businesses in the “near future,” but it is yet to give a rollout timeline. “We are at a technological... Read more »

The post Alibaba unveils ChatGPT rival and custom LLMs appeared first on AI News.

]]>
Chinese tech giant Alibaba has unveiled a ChatGPT rival and the ability to create custom LLMs (Large Language Models) for customers.

Alibaba’s ChatGPT rival is called Tongyi Qianwen and will be integrated across the company’s various businesses in the “near future,” though the company has yet to give a rollout timeline.

“We are at a technological watershed moment driven by generative AI and cloud computing, and businesses across all sectors have started to embrace intelligence transformation to stay ahead of the game,” said Daniel Zhang, Chairman and CEO of Alibaba Group and CEO of Alibaba Cloud Intelligence.

“As a leading global cloud computing service provider, Alibaba Cloud is committed to making computing and AI services more accessible and inclusive for enterprises and developers, enabling them to uncover more insights, explore new business models for growth, and create more cutting-edge products and services for society.”

Tongyi Qianwen roughly translates to “seeking an answer by asking a thousand questions” and will support both English and Chinese languages.

Alibaba has stated that the chatbot will first be added to DingTalk, its workplace messaging app. Tongyi Qianwen will be able to perform several tasks at launch, including taking notes in meetings, writing emails, and drafting business proposals.

The chatbot will also be integrated into Tmall Genie, Alibaba’s equivalent of Amazon’s Echo line of smart speakers. That integration will give Alibaba an advantage over Western counterparts such as Google, which has yet to integrate its own chatbot equivalent into its smart speakers.

Tongyi Qianwen is powered by an LLM that reportedly consists of ten trillion parameters, significantly more than GPT-4 (estimated to consist of around one trillion parameters).

The model will also serve as the foundation for a new Alibaba service that builds custom LLMs for customers. These LLMs will use “customers’ proprietary intelligence and industrial know-how,” allowing businesses to create AI-infused apps without developing a model from scratch. A beta version of a Tongyi Qianwen API is already available for Chinese developers.

“Generative AI powered by large language models is ushering in an unprecedented new phase. In this latest AI era, we can create additional value for our customers and broader communities through our resilient public cloud infrastructure and proven AI capabilities,” said Jingren Zhou, CTO of Alibaba Cloud Intelligence.

“We are witnessing a new paradigm of AI development where cloud and AI models play an essential role. By making this paradigm more inclusive, we hope to facilitate businesses from all industries with their intelligence transformation and, ultimately, help boost their business productivity and expand their expertise and capabilities while unlocking more exciting opportunities through innovations.”

Last month, a group of high-profile figures in the technology industry called for a pause on the training of powerful AI systems. Twitter CEO Elon Musk and Apple co-founder Steve Wozniak were among those who signed an open letter warning of potential risks and arguing that the race to develop AI systems is out of control.

A report by investment bank Goldman Sachs estimated that AI could replace the equivalent of 300 million full-time jobs. An AI think tank, meanwhile, called GPT-4 a risk to public safety.

Alibaba’s announcements were made at its Cloud Summit, which also featured the debut of three-month trials for its Infrastructure-as-a-Service (IaaS) and PolarDB services. The company is offering a 50 percent discount for its storage-as-a-service offering if users reserve capacity in a specific region for a year.

The company has not yet revealed the cost of using Tongyi Qianwen.

(Image Source: www.alibabagroup.com)

]]>
https://www.artificialintelligence-news.com/2023/04/11/alibaba-unveils-chatgpt-rival-custom-llms/feed/ 0