Two years old, with each employee worth $21 million: why did MosaicML sell for $1.3 billion?

Big data giant Databricks acquired MosaicML for about $1.3 billion, roughly six times its previous valuation, making it the largest generative AI acquisition of the first half of this year. What has driven such a high valuation for a company that is only two years old and has just over 60 employees?

Recently, there has been a wave of investment and acquisitions in the AI field. Anthropic raised $450 million in a round whose investors included Salesforce Ventures, while Runway raised $141 million in funding. In addition, Snowflake announced the acquisition of Neeva, and Chinese giant Meituan acquired the AI company Light Year for 2.065 billion RMB.

However, the most eye-catching transaction is undoubtedly Databricks’ acquisition of the startup MosaicML.

Databricks acquires MosaicML, accelerating the democratization of generative AI technology

Databricks recently announced its acquisition of the generative AI startup MosaicML for about $1.3 billion (approximately 9.3 billion RMB), with the aim of helping enterprises build ChatGPT-like tools.

After the acquisition, MosaicML will become part of the Databricks Lakehouse platform, and the entire MosaicML team and technology will be incorporated into Databricks, providing enterprises with a unified platform to manage data assets and enabling them to build, own, and protect their generative AI models using their proprietary data.

MosaicML is a very young generative AI company, founded in San Francisco in 2021. It has disclosed only one round of financing and has just 62 employees. Its valuation in the previous financing round was $220 million, which means this acquisition values MosaicML at roughly six times that figure. The transaction is the largest acquisition in the generative AI field announced so far this year. Not long ago, cloud computing giant Snowflake also announced the acquisition of another generative AI company, Neeva. After several months of investment frenzy, large enterprises appear to be starting a wave of large-scale acquisitions of generative AI startups.
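
The headline figures follow from simple division. The short Python sketch below reproduces them from the numbers quoted in this article; the constant and function names are illustrative assumptions, not anything published by either company.

```python
# Figures quoted in the article, in US dollars. Names are illustrative.
ACQUISITION_PRICE = 1.3e9    # reported purchase price
PREVIOUS_VALUATION = 220e6   # valuation at the previous financing round
EMPLOYEES = 62               # reported headcount


def valuation_multiple(price, previous_valuation):
    """Return how many times the previous valuation the price represents."""
    return price / previous_valuation


def price_per_employee(price, headcount):
    """Return the purchase price divided evenly across the headcount."""
    return price / headcount


multiple = valuation_multiple(ACQUISITION_PRICE, PREVIOUS_VALUATION)
per_employee = price_per_employee(ACQUISITION_PRICE, EMPLOYEES)

print(f"Valuation multiple: {multiple:.1f}x")
print(f"Price per employee: ${per_employee / 1e6:.1f} million")
```

The multiple comes out just under six, and the per-employee figure rounds to the $21 million cited in the headline.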

Databricks grew out of UC Berkeley, where its founders developed the Apache Spark project. As a data storage and analytics giant, it was valued at $31 billion as of 2022 and helps large companies such as AT&T, Shell, and Walgreens process data. Recently, it open-sourced its large model Dolly, aiming to achieve ChatGPT-like results with fewer parameters. With the increasing popularity of cloud computing, the “lakehouse” concept proposed by the team behind Spark has deeply influenced a group of big data startups. Since its establishment in 2013, Databricks has rapidly grown into one of the world’s hottest data infrastructure companies. Last year, Databricks announced annual revenue exceeding $1 billion, and after completing a financing round in August 2021, its valuation reached $38 billion.

Advantages of MosaicML MPT Series Models

MosaicML’s MPT series models are subclassed from the HuggingFace PreTrainedModel base class and are fully compatible with the HuggingFace ecosystem. MPT-7B, with roughly 7 billion parameters, is one of MosaicML’s most popular models and is said to handle over 2,000 natural language processing tasks. Its optimized layers include FlashAttention and low-precision layer normalization, which make training 2-7 times faster than traditional methods. Near-linear scaling with added resources means that models with billions of parameters can be trained in a matter of hours instead of days. MosaicML has also released MPT-30B, a new open source large language model licensed for commercial use, which has 30 billion parameters and outperforms GPT-3.

Data source: Evaluation of mainstream models of MosaicML by MT-Bench

The advantages of the MPT series models lie in their efficiency and low cost. The complexity of AI models trained on large amounts of data has increased dramatically, and it now costs millions of dollars to train a model. This cost is beyond the reach of all but large companies. MosaicML’s MPT series models, however, allow businesses to train their own language models at lower cost and with higher efficiency, making it easier to apply generative AI technology and achieve better business performance. Most open source language models can only handle sequences of at most a few thousand tokens (see Figure 1). With the MosaicML platform and a single node of 8xA100-80GB, however, users can easily fine-tune MPT-7B to handle context lengths of up to 65k. The ability to handle such extreme context lengths comes from ALiBi, one of the key architectural choices in MPT-7B.
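
ALiBi (Attention with Linear Biases) replaces learned position embeddings with a penalty, added to each attention score, that grows linearly with the distance between the query and the key. The pure-Python sketch below illustrates the idea under stated assumptions: the head count is a power of two, the slope schedule is the geometric sequence described in the ALiBi paper, and the function names are hypothetical. It is not MosaicML’s implementation.

```python
import math


def alibi_slopes(n_heads):
    """Return one slope per attention head.

    Assumes n_heads is a power of two. The slopes form a geometric
    sequence that starts at 2^(-8/n_heads) and uses the same ratio.
    """
    start = 2.0 ** (-8.0 / n_heads)
    return [start ** (h + 1) for h in range(n_heads)]


def alibi_bias(seq_len, slope):
    """Return the bias matrix added to one head's attention scores.

    Entry [i][j] penalizes key j in proportion to its distance from
    query i. Future positions (j > i) get -inf, the usual causal mask.
    """
    return [
        [-slope * (i - j) if j <= i else -math.inf for j in range(seq_len)]
        for i in range(seq_len)
    ]


slopes = alibi_slopes(8)          # 1/2, 1/4, ..., 1/256
bias = alibi_bias(4, slopes[0])   # 4x4 bias matrix for the first head
for row in bias:
    print(row)
```

Because the bias depends only on relative distance, the same function can produce a matrix for any sequence length, including lengths longer than those seen in training. That property is what allows a model to be run on inputs beyond its fine-tuning context.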

For example, the full text of “The Great Gatsby” is just under 68k tokens. In one test, the model StoryWriter read “The Great Gatsby” and generated an ending. One of the endings generated by the model is shown in Figure 2. StoryWriter read “The Great Gatsby” in about 20 seconds (about 150,000 words per minute). Because of the longer sequence length, its “typing” speed is slower than that of other MPT-7B models, at about 105 words per minute. Although StoryWriter was fine-tuned with a context length of 65k, ALiBi enables the model to extrapolate to inputs longer than those it was trained on: up to 68k tokens in the case of “The Great Gatsby,” and up to 84k tokens in testing.
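
The reading and writing speeds quoted above can be cross-checked with a little arithmetic. The sketch below uses only the figures in this article, treats 68k tokens as an upper bound on the input size, and uses illustrative variable names.

```python
# Figures quoted in the article. Names are illustrative.
TOKENS_READ = 68_000             # upper bound on the novel's token count
READ_SECONDS = 20                # reported time to read the full text
READ_WORDS_PER_MINUTE = 150_000  # reported reading speed
WRITE_WORDS_PER_MINUTE = 105     # reported generation ("typing") speed

# Number of words implied by the reading speed and the reading time.
words_read = READ_WORDS_PER_MINUTE * READ_SECONDS / 60

# Input throughput in tokens, and the implied tokens-per-word ratio.
tokens_per_second = TOKENS_READ / READ_SECONDS
tokens_per_word = TOKENS_READ / words_read

# How much faster the model reads its input than it generates output.
read_to_write_ratio = READ_WORDS_PER_MINUTE / WRITE_WORDS_PER_MINUTE

print(f"Words read: {words_read:,.0f}")
print(f"Tokens per second while reading: {tokens_per_second:,.0f}")
print(f"Tokens per word: {tokens_per_word:.2f}")
print(f"Reading is about {read_to_write_ratio:,.0f} times faster than writing")
```

The implied length of about 50,000 words is in line with the size of the novel, so the quoted reading speed and reading time are consistent with each other.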

Figure 2: MPT-7B-StoryWriter-65k+ wrote an ending for "The Great Gatsby." The ending was produced by providing the full text of "The Great Gatsby" (approximately 68k tokens) as input to the model, followed by the word "ending," and letting the model continue generating.

The Popularization of Generative AI Technology

Generative AI technology is a branch of artificial intelligence that uses large amounts of data and deep learning algorithms to automatically generate original text, images, computer code, and other content. Its emergence has made it more convenient for people to process and analyze data. With the rapid development of big data and artificial intelligence, generative AI has been widely applied in natural language processing, image generation, virtual reality, and other fields. For example, in natural language processing, GPT-4 has become one of the most popular generative AI models, used for tasks such as writing articles, translating languages, and answering questions. In image generation, StyleGAN2 can produce high-quality images and is used in game development, film production, virtual reality, and other fields.

The CEO of MosaicML, Naveen Rao, previously stated that since 2018, the complexity of AI models trained on large amounts of data has increased dramatically. It now costs millions of dollars to train a model, which makes it unaffordable for all but large companies. After this acquisition, however, the joint product of Databricks’ Lakehouse platform and MosaicML’s technology will allow enterprises to train and build generative AI models on their proprietary data in a simple, fast, and cost-effective manner. Users will retain control and ownership of their data, enabling them to develop custom AI models. According to Databricks, with the support of its platform and technology, the cost for enterprises of training and using LLMs will decrease significantly and is expected to fall to the order of thousands of dollars. This paves the way for the popularization of generative AI.

Significance of Databricks’ Acquisition of MosaicML

The main purpose of Databricks’ acquisition of MosaicML is to accelerate the development and democratization of generative AI technology. By integrating the technologies and resources of both companies, Databricks can better meet customer needs and provide more efficient and convenient solutions. Specifically, this acquisition will bring the following changes:

1. More efficient large language models

After acquiring MosaicML, Databricks can integrate the MPT series models into its Lakehouse platform, providing customers with more efficient and cost-effective large language models. This will help enterprises better handle natural language processing tasks, improving business efficiency and accuracy.

2. Faster model training speed

MosaicML’s MPT series models train quickly, which will help Databricks provide faster model training services. This is particularly important for enterprises that need to respond quickly to market demands.

3. Higher degree of democratization

Databricks’ acquisition of MosaicML also means that the democratization of generative AI technology will be further enhanced. MosaicML’s MPT series models allow small and medium-sized enterprises to more easily train their own language models, enabling them to better apply generative AI technology and achieve better business performance. This will contribute to the development and application of generative AI technology, promoting the popularization and development of artificial intelligence technology.

Summary

Generative AI applications aim to generate original text, images, and computer code based on users’ natural language prompts. Since AI startup OpenAI launched its online generative AI chatbot ChatGPT in November last year, interest in this technology has surged. “Every organization should be able to benefit from the AI revolution and have more control over how they use their data. Databricks and MosaicML have an incredible opportunity to democratize AI and make Lakehouse the best place to build generative AI,” said Ali Ghodsi, co-founder and CEO of Databricks.

The significance of Databricks’ acquisition of MosaicML lies not only in accelerating the development and democratization of generative AI technology, but also in integrating the technology and resources of the two companies to provide customers with more efficient and convenient solutions. With the rapid development and application of artificial intelligence technology, generative AI technology will play an increasingly important role. Databricks’ acquisition of MosaicML reflects the attention and investment of various companies in this direction. Companies like Anthropic and OpenAI license their existing language models to enterprises, which then build generative AI applications on top of them. Driven by strong commercial demand for these models, opportunities have been created for startups like MosaicML. From the successive acquisitions by Snowflake and Databricks, we can see that large tech companies are gradually moving from independent research and strategic investments to the stage of mergers and acquisitions in generative AI technology.

Reference Source:

https://www.databricks.com/company/newsroom/press-releases/databricks-signs-definitive-agreement-acquire-mosaicml-leading-generative-ai-platform

This Week in AI: Databricks’ Acquisition of MosaicML

https://twitter.com/lmsysorg/status/1672077353533730817/photo/1

https://www.mosaicml.com/blog/mpt-7b#appendix-eval

https://www.mosaicml.com/blog/mpt-30b
