Tired of AI doomsday tropes, Cohere CEO says his goal is technology that ‘contributes to humanity’

March 26, 2024

Aidan Gomez can take some credit for the ‘T’ at the end of ChatGPT. He was part of the group of Google engineers who first introduced a new artificial intelligence architecture called the transformer.

This helped lay the foundation for today’s generative AI boom, which ChatGPT maker OpenAI and others are building on. Gomez, one of eight co-authors of Google’s 2017 paper, was a 20-year-old intern at the time.

He is now the CEO and co-founder of Cohere, a Toronto-based startup that competes with other leading AI companies in providing large businesses and organizations with large language models and the chatbots they power.

Gomez spoke with The Associated Press about the future of generative artificial intelligence. The interview has been edited for length and clarity.

Q: What is a transformer?

A: The transformer is the architecture of a neural network; it’s the structure of the computation that happens inside the model. The reason transformers are special compared to their counterparts (other competing architectures, other ways of structuring neural networks) is that they scale really well. They can be trained on not just thousands, but tens of thousands of chips. They can be trained extremely quickly. They rely on exactly the kinds of operations that GPUs (graphics chips) are designed to accelerate, and they carry out that computation faster and more efficiently than anything that came before the transformer.
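
To make that concrete, here is a minimal transformer block in PyTorch. It is an illustrative toy with made-up dimensions, not Cohere’s or Google’s code; it only shows that the core computation is a handful of large matrix multiplications (self-attention plus a feed-forward layer), which is what lets the architecture spread so efficiently across GPUs.

```python
# Toy transformer block: self-attention followed by a feed-forward network,
# each wrapped in a residual connection and layer normalization.
import torch
import torch.nn as nn

class TransformerBlock(nn.Module):
    def __init__(self, d_model: int = 64, n_heads: int = 4):
        super().__init__()
        # Multi-head self-attention lets every token attend to every other
        # token; the work is batched matrix multiplies, which GPUs excel at.
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ff = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attn_out, _ = self.attn(x, x, x)   # tokens exchange information
        x = self.norm1(x + attn_out)       # residual connection + norm
        return self.norm2(x + self.ff(x))  # position-wise feed-forward

# Toy usage: a batch of 2 "sentences", 10 tokens each, 64-dimensional embeddings.
tokens = torch.randn(2, 10, 64)
print(TransformerBlock()(tokens).shape)  # torch.Size([2, 10, 64])
```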

Q: How important are these to what you do at Cohere?

A: Very important. We use the transformer architecture, like everyone else, when building large language models. Cohere’s focus is on scalability and production readiness for businesses. Some of the other models we compete with are very large and extremely inefficient. You can’t actually put those into production, because the moment you hit real users, the costs spike and the economics fall apart.

Q: What is a specific example of how a customer uses Cohere’s models?

A: I have a favorite example in health care. It comes from the surprising fact that 40% of a doctor’s working day is spent writing patient notes. What if doctors could wear a small passive listening device that follows them throughout the day, listens to the conversation during patient visits and fills out those notes in advance, so that instead of writing them from scratch, the first draft is already there? They can read it and just make edits. Suddenly a doctor’s capacity goes up dramatically.

Q: How do you address customer concerns about AI language models being prone to “hallucinations” (errors) and bias?

A: Customers are always concerned about hallucinations and bias. They lead to a bad product experience, so this is an issue we focus on heavily. For hallucinations, we focus on RAG, retrieval-augmented generation. We’ve released a new model called Command R that is explicitly targeted at RAG. It lets you connect the model to specific sources of reliable information. Those could be your organization’s internal documents or a particular employee’s emails. You’re giving the model access to information it wouldn’t otherwise have seen on the web during training. The important thing is that it also lets you fact-check the model, because now, instead of just taking text in and putting text out, the model actually references documents. It can cite where it got the information. You can check its work and build much more trust in the tool. It dramatically reduces hallucination.
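
The RAG workflow Gomez describes can be sketched in a few lines: retrieve trusted documents first, then ask the model to answer using only those documents and to cite them by number. The retrieval below is a deliberately naive keyword match, and call_llm is a hypothetical stand-in for whatever hosted model you use (for example, Command R through Cohere’s API); it is not a real client call.

```python
# Minimal RAG sketch: retrieve documents, build a grounded prompt, generate.

def retrieve(query: str, documents: list[dict], k: int = 2) -> list[dict]:
    """Rank documents by naive keyword overlap with the query (toy retriever)."""
    q_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(q_words & set(d["text"].lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query: str, sources: list[dict]) -> str:
    """Number the retrieved sources so the model can cite them, e.g. [1]."""
    numbered = "\n".join(
        f"[{i + 1}] {d['title']}: {d['text']}" for i, d in enumerate(sources)
    )
    return (
        "Answer the question using ONLY the sources below, citing them by number.\n\n"
        f"Sources:\n{numbered}\n\nQuestion: {query}\nAnswer:"
    )

def call_llm(prompt: str) -> str:
    # Hypothetical placeholder: swap in your actual model or API call here.
    return "(a grounded answer with citations like [1] would appear here)"

docs = [
    {"title": "HR policy", "text": "Employees accrue 20 vacation days per year."},
    {"title": "IT guide", "text": "Reset your password through the internal portal."},
]
question = "How many vacation days do employees get per year?"
print(call_llm(build_prompt(question, retrieve(question, docs))))
```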

Q: What are the biggest public misconceptions about generative AI?

A: The fear, held by some individuals and organizations, that this technology poses a terminal, existential risk. Those are stories humanity has been telling itself for decades: the technology arrives, takes over, replaces us, makes us subservient. They’re deeply ingrained in our cultural psyche. It’s a very compelling narrative, and it’s easier to capture people’s imagination and fears when you tell it. It gets a lot of attention because it’s a gripping story. But the truth is, I think this technology will be extremely good. There are plenty of arguments about how things could go badly; those of us building the technology are aware of those risks and are working to reduce them. We all want this to go well. We all want the technology to contribute to humanity, not threaten it.

Q: Not only OpenAI, but also a number of major tech companies are now openly saying they are trying to build artificial general intelligence (a term loosely used for better-than-human AI). Is AGI part of your mission?

A: No, I don’t see that as part of my mission. For me, AGI is not the end goal. The end goal is to create a profoundly positive impact on the world with this technology. It’s a very general technology. It’s reasoning, it’s intelligence, so it applies everywhere. And we want to make sure it’s put to the most effective use possible, as early as possible. This is not a pseudo-religious quest for an AGI whose definition we don’t even know.

Q: What’s next?

A: I think everyone should pay attention to tool use and more agent-like behavior. Models that, for the first time, you can hand a tool you created yourself. Maybe it’s a piece of software or an API (application programming interface). And you can say: ‘Hey model, I just built this. Here’s what it does. Here’s how to interact with it. It’s now part of your toolkit of things you can do.’ I think that general capability, giving a model a tool it has never seen before and having it adopt that tool effectively, is going to be very powerful. To do most things, you need access to external tools. The status quo today is that models can only write text back to you. If you give them access to tools, they can take action on your behalf in the real world.
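
The tool-use pattern Gomez points to looks roughly like this: you describe a tool you built, the model decides when to call it, and your code executes the call and hands the result back. The schema format and ask_model below are illustrative assumptions, not any specific vendor’s API.

```python
# Simplified tool-use loop: the model picks a tool and arguments; your code runs it.
import json

def get_order_status(order_id: str) -> dict:
    """Your own function or API, something the model has never seen before."""
    return {"order_id": order_id, "status": "shipped"}

TOOLS = {
    "get_order_status": {
        "fn": get_order_status,
        "description": "Look up the shipping status of an order by its ID.",
        "parameters": {"order_id": "string"},
    }
}

def ask_model(user_message: str, tools: dict) -> dict:
    # Hypothetical stand-in for a tool-using model. A real model would read the
    # tool descriptions and return which tool to call with which arguments;
    # here that decision is hard-coded for illustration.
    return {"tool": "get_order_status", "arguments": {"order_id": "A1234"}}

call = ask_model("Where is order A1234?", TOOLS)
result = TOOLS[call["tool"]]["fn"](**call["arguments"])
print(json.dumps(result))  # feed this back to the model so it can draft a reply
```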
