- The French AI startup is taking generative AI to new heights with its commercial and open source LLM.
- With strong backing from high-profile investors like Microsoft and Andreessen Horowitz, Mistral has reached a $5 billion valuation, positioning itself as a formidable competitor in the increasingly crowded generative AI market.
- By making its LLM more accessible than some of the most powerful AI companies, Mistral argues that “by training our own models, publishing them publicly, and facilitating community contributions, we can build a credible alternative to the emerging AI oligopoly.”
What is Mistral AI?
Mistral AI is a French artificial intelligence startup launched in 2023. It builds open source and commercial AI models, some of which achieve state-of-the-art performance on multiple industry benchmarks.
With strong backing from high-profile investors like Microsoft and Andreessen Horowitz, Mistral is valued at $5 billion, positioning itself as a formidable competitor in the increasingly crowded generative AI market. The company’s top commercial LLM outperforms those developed by incumbents such as Google and Anthropic on several industry benchmarks, and even makes OpenAI’s GPT-4, often considered the gold standard of AI model performance, worth the money.
The company has also made a set of open source models that anyone can use and modify for free. By making its LLM more accessible than some of the most powerful AI companies, Mistral argues that “by training our own models, publishing them publicly, and facilitating community contributions, we can build a credible alternative to the emerging AI oligopoly.”
What does Mistral AI offer?
Mistral AI offers several LLMs, both commercial and open source. Each has their own unique set of strengths and abilities.
All of Mistral’s commercial models are closed-source and only available through its API.
Mistral Large: The most advanced of Mistral AI’s models.
Ideal for complex tasks like synthetic text generation and code generation.
Ranks second to GPT-4 in several industry benchmarks.
Has a maximum context window of 32k tokens.
Natively fluent in English, French, Spanish, German and Italian, as well as code.
Mistral Small: Focused on efficient reasoning for low latency workloads.
Ideal for simple tasks that can be done in bulk, like text generation and text classification.
Has a maximum context window of 32k tokens.
Natively fluent in English, French, Spanish, German and Italian, as well as code.
Mistral Embed: Converts text into numerical representations (aka “embeddings”) so it can process and analyse words in a way that is understandable to a computer.
Ideal for tasks like sentiment analysis and text classification.
Currently available in English only.
All of Mistral’s open source models are available for free under Apache 2.0, a fully permissive license that allows anyone to use them anywhere, with no restrictions.
Mistral 7B:Designed for easy customisation and fast deployment.
Can handle high volumes of data faster and with minimal computational cost.
Trained on a dataset of about 7 billion parameters, but it outperforms Llama 2 (13 billion parameters) and matches models with up to 30 billion parameters.
Has a maximum context window of 32k tokens.
Can be used in English and code.
Mixtral 8x7B: Designed to perform well with minimal computational effort.
Uses a mixture of experts architecture; only uses about 12 billion of its potential 45 billion parameters for inference.
Outperforms both Llama 2 (70 billion parameters) and GPT-3.5 (175 billion parameters) on most benchmarks.
Has a maximum context window of 32k tokens.
Natively fluent in English, French, Spanish, German and Italian, as well as code.
Mixtral 8x22B: The most advanced of Mistral AI’s open source models
Ideal for tasks like summarising large documents or generating lots of text.
A bigger version of Mixtral 8x7B; only uses about 39 billion of its potential 141 billion parameters for inference.
Outperforms Llama 2 70B and Cohere’s Command R and R+ in cost-performance ratio.
Has a maximum context window of 64k tokens.
Natively fluent in English, French, Spanish, German and Italian, as well as code.
Le Chat: In addition to its LLMs, Mistral AI offers Le Chat, an AI chatbot that can generate content and carry on conversations with users — similar to platforms like ChatGPT, Gemini and Claude. Mistral AI also allows users to choose which of its models they want operating under the hood — Mistral Large for better reasoning, Mistral Small for speed and cost-effectiveness or Mistral Next, a prototype model that is designed to give brief and concise answers.
Le Chat does not have real-time access to the internet, though, so its answers may not always be up-to-date. And like any generative AI tool, it can produce biased responses and get things wrong. But Mistral says it is working to make its models as “useful and as little opinionated as possible.”
Also read: How to create a large language model (LLM)?
What are Mistral AI used for?
All LLMS of Mistral AI are base models, which means they can be fine-tuned and used for a wide range of natural language processing tasks, such as:
Chatbots: Enable chatbots to understand users’ natural language queries and respond in a more accurate and human way.
Text summary: Extract the essence of articles and documents and summarise their key points in a concise overview.
Content creation: Generate natural language text, including emails, social media copy, short stories, cover letters, and more.
Text classification: Categorises text into different categories, such as marking email as spam or non-spam based on content.
Code completion: Generate snippets of code, optimise existing code and suggest bug fixes to speed up the development process.
How to use Mistral AI model?
All models of Mistral AI can be found on its website. They are also available on platforms such as Amazon Bedrock, Databricks, Snowflake Cortex, and Azure AI.
To use these models directly on Mistral AI’s website, visit La Plateforme, its AI development and deployment platform. From there, you can set up guardrails and fine-tune models to your specifications before integrating them into your own applications and projects. The pricing range depends on the model you use. For example, the Mistral 7B costs $0.25 per million input tokens, while the Mistral Large costs $24 per million output tokens.
You can also interact with large and small models of Mistral via Le Chat, the company’s free AI chatbot.All models of Mistral AI can be found on its website. They are also available on platforms such as Amazon Bedrock, Databricks, Snowflake Cortex, and Azure AI.
To use these models directly on Mistral AI’s website, visit La Plateforme, its AI development and deployment platform. From there, you can set up guardrails and fine-tune models to your specifications before integrating them into your own applications and projects. The pricing range depends on the model you use. For example, the Mistral 7B costs $0.25 per million input tokens, while the Mistral Large costs $24 per million output tokens.
You can also interact with large and small models of Mistral via Le Chat, the company’s free AI chatbot.