In 2017 Aidan Gomez (pictured, center) was the youngest of a very special group of eight researchers. All of them were part of Google Brain, the division that ended up merging with DeepMind. And together they published what became the most influential study in the field of artificial intelligence in years.
That publication, titled ‘Attention Is All You Need’, introduced the Transformer architecture. It was the trigger for the emergence of generative AI models and chatbots like ChatGPT.
Gomez would end up working with Geoffrey Hinton in his Toronto laboratory, but he did not stay there long. In 2019 he decided to found Cohere alongside Nick Frosst (pictured, right) and Ivan Zhang (left).
All three had studied at the University of Toronto, and that shared background brought them together to create an AI startup that is starting to make noise and could stand up to the big companies.
This is how Cohere differentiates itself from the AI giants
Cohere is a Canadian company with about 300 employees that has become an interesting alternative in this segment, because its approach is very different from that of its rivals, and especially from OpenAI, with which it is often compared. Why? Three reasons.
- SaaS model. Companies like OpenAI offer access to their AI models through an API and charge for each token their LLMs generate. These companies also run the queries on infrastructure they have set up in clouds such as those of Microsoft, Google or Amazon. Cohere proposes a SaaS (Software-as-a-Service) model in which the client runs Cohere’s models on its own infrastructure and Cohere charges a commission for it. As Gomez explained, “that allows us to have much higher margins, because we are not paying for computing.”
- The chatbot doesn’t matter (that much). The AI majors show off their chatbots, but Cohere does not have a product of that type. These assistants, which rely on the freemium model, require a huge computing infrastructure. They certainly help convince many users to switch to paid plans, but Cohere prefers to save heavily on inference costs (the cost of generating responses). “We are starting to reach the tipping point where the cost of computing for inference is higher than that of training [the models], which indicates the maturity of the market,” Gomez said.
- Ad hoc models for companies. Cohere’s other great differentiator is that it targets companies rather than end users, and it offers those companies models that are highly tuned for specific purposes. Nick Frosst told Business Insider that “we have discovered that fine-tuning small models with [specialized] data sets allows us to obtain great results.”
Small, precise and cheap models
That strategy seems to be gaining strength. In March 2024 the company launched Command R, a scalable generative model aimed specifically at companies. A month later it launched Command R+, a supercharged version of that model that also offers a 128k token context window.
Both can be used (with limits, and after obtaining a free trial API key) on platforms like OpenRouter or Hugging Face, but also on Cohere’s own platform through Coral, its user interface for evaluating its models.
In both cases Cohere promises something important: the models use a Retrieval Augmented Generation (RAG) system, which according to the company reduces the usual “hallucinations” of AI models. To do this, responses include citations and references (in the style of what Perplexity.ai does, for example), and above all the system tries to adapt to the specific needs of each company and each scenario in which the models are used.
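The general idea behind RAG with citations can be illustrated with a minimal sketch. This is not Cohere’s implementation or API: the document store, the word-overlap scoring and the function names (`retrieve`, `build_grounded_prompt`) are all assumptions made for illustration, and the step that would send the prompt to a model is left out.

```python
import re

# Hypothetical in-memory document store; a real deployment would use a
# company's own documents and a proper search index or embedding model.
DOCUMENTS = {
    "doc-1": "Command R was launched in March 2024 and targets enterprise use.",
    "doc-2": "Command R+ offers a 128k token context window.",
    "doc-3": "Aya 23 is available in 8B and 35B sizes and covers 23 languages.",
}


def tokenize(text):
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9+]+", text.lower()))


def retrieve(query, documents, top_k=2):
    """Rank documents by how many query words they share (toy scoring)."""
    query_words = tokenize(query)
    scored = []
    for doc_id, text in documents.items():
        overlap = len(query_words & tokenize(text))
        if overlap > 0:
            scored.append((overlap, doc_id))
    # Highest overlap first; ties broken by document id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:top_k]]


def build_grounded_prompt(query, documents, top_k=2):
    """Return a prompt that embeds the retrieved passages, plus their ids."""
    doc_ids = retrieve(query, documents, top_k)
    sources = "\n".join(f"[{doc_id}] {documents[doc_id]}" for doc_id in doc_ids)
    prompt = (
        "Answer using only the sources below and cite them by id.\n"
        f"{sources}\n"
        f"Question: {query}"
    )
    return prompt, doc_ids


prompt, citations = build_grounded_prompt(
    "What context window does Command R+ offer?", DOCUMENTS
)
print(prompt)
print("Citations:", citations)
```

Because the model is asked to answer only from the retrieved passages and to cite them by id, a reader can check each claim against its source, which is the mechanism usually credited with reducing hallucinations.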
According to Cohere’s executives, this brings important advantages. Internal tests indicated that fine-tuned versions of Command R were more accurate than their rivals when analyzing scientific and financial information.
When summarizing meetings, the accuracy of Command R was 80.2%, while GPT-4 reached 78.8% and Claude Opus 77.9%. In the analysis of financial data, Cohere’s model was 6.2% more accurate than OpenAI’s and 5.3% more accurate than Anthropic’s, although, we insist, these figures come from internal tests.
Even more important and interesting is the cost of running these fine-tuned Cohere models, the so-called inference cost, which is noticeably lower than with OpenAI’s models: generating a million tokens costs between 2 and 4 dollars with Cohere, but between 30 and 60 dollars with GPT-4.
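To put those prices in perspective, here is a small back-of-the-envelope calculation. The per-million-token price ranges are the ones quoted above; the monthly volume of 50 million tokens and the helper name `cost_range` are hypothetical, chosen only to illustrate the arithmetic.

```python
# Price ranges in US dollars per million generated tokens, as quoted above.
PRICE_PER_MILLION = {
    "cohere_fine_tuned": (2.0, 4.0),
    "gpt_4": (30.0, 60.0),
}


def cost_range(tokens, price_range):
    """Return the (low, high) cost in dollars for generating `tokens` tokens."""
    low, high = price_range
    millions = tokens / 1_000_000
    return (low * millions, high * millions)


# Hypothetical workload: 50 million generated tokens per month.
monthly_tokens = 50_000_000

for name, prices in PRICE_PER_MILLION.items():
    low, high = cost_range(monthly_tokens, prices)
    print(f"{name}: ${low:,.0f} to ${high:,.0f} per month")
```

At the quoted prices, the GPT-4 range is 15 times the Cohere range at both ends (30/2 and 60/4), so the gap scales linearly with the number of tokens generated.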
These are not Cohere’s only offerings. Just a few days ago the company presented its Aya 23 8B and 35B models, which take an approach close to that of Llama 3: open-weight models with apparently remarkable performance that are also available in 23 different languages.
But there is also uncertainty
In June 2023 Cohere announced a financing round of 270 million dollars led by the investment firm Inovia Capital, but in which giants such as NVIDIA, Oracle and Salesforce also participated.
This made the firm’s valuation rise to 2.1 billion euros, a remarkable figure. Even so, this investment is far from those that had been made until then in firms such as OpenAI (11.3 billion dollars according to TechCrunch), or Anthropic ($450 million).
In recent weeks there has been talk that a new investment round of another $500 million is being prepared that would bring its market valuation to $5 billion. That would provide even more room for growth for the company and would consolidate its commitment to a different model from other companies in this field, but there is also uncertainty about its future.
Above all because of its revenue, which is still very modest. According to The Information, Cohere brought in only $13 million in all of 2023, although things improved at the beginning of this year: by the end of the first quarter that figure had increased to $35 million.
Those figures lag far behind the projected revenue of Anthropic and OpenAI. Again according to The Information, Anthropic will generate more than 850 million dollars in 2024, while OpenAI is estimated to reach 5 billion dollars.
Competition is also accelerating, especially in the case of the large companies, which are investing huge amounts of money to develop their models and make them available to all audiences.
The rise of Open Source models (although they are not completely Open Source) such as Llama 3 also threatens Cohere’s proposal, especially because it makes customized deployments easier for companies that have experts who can train and fine-tune those models securely and privately.
Cohere undoubtedly has great assets to be relevant in this increasingly competitive market, but it will be interesting to see if its approach, notably different from that of its competitors, ends up bearing the fruits they expect.
Image | Cohere