If you ask AIs to answer as if they were a Star Trek character, they’ll be more accurate at math – and we’re not sure why

February 29, 2024

Cast member George Takei attends the 55th anniversary commemoration of Star Trek: The Original Series in Los Angeles, California in 2021. REUTERS/Aude Guerrucci

  • An AI model that was directed to talk like a Star Trek character was better at solving math problems.

  • It’s not clear why pretending to be Captain Picard helped the chatbot boost its results.

  • People are realizing that prompting AI is an art, and it is becoming a field in itself.

The art of conversation with AI chatbots continues to frustrate and surprise people.

One study that set out to optimize the prompts fed to a chatbot model found that, in one case, asking it to talk as if it were in Star Trek significantly improved its ability to solve elementary-level math problems.

“It is both surprising and troubling that trivial changes to the system can cause such dramatic fluctuations in performance,” study authors Rick Battle and Teja Gollapudi, of software firm VMware in California, wrote in their paper.

The study, first reported by New Scientist, was published on February 9 on arXiv, a server where scientists can share preliminary findings before they have been peer-reviewed.

Using AI to talk to AI

Machine learning engineers Battle and Gollapudi didn’t set out to expose the AI model as a Trekkie. Instead, they were trying to find out whether they could tap into the “positive thinking” trend.

People trying to get the best results from chatbots have noticed that the quality of the output depends on what you ask them to do, and it’s not really obvious why.

“Among the numerous factors that influence the performance of language models, the concept of ‘positive thinking’ has emerged as a fascinating and surprisingly effective dimension,” Battle and Gollapudi wrote in their paper.

“Intuition tells us that ‘positive thinking’ should not affect the performance of a language-model system, just as it would not affect any other computer system, but empirical experience has shown the opposite,” they said.

This suggests that it’s not just what you ask the AI model to do that affects the quality of its output, but also how you ask it to behave while doing it.

To test this, the authors fed 60 human-written prompts to three large language models (LLMs): Mistral-7B, Llama2-13B, and Llama2-70B.

These were designed to encourage the AI, with snippets such as “This will be fun!”, “Take a deep breath and think carefully,” and “You are as smart as ChatGPT.”

The engineers also asked each LLM to refine these snippets itself while trying to solve GSM8K, a dataset of elementary-level math problems. The better the output, the more successful a prompt was considered.
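To make that setup concrete, here is a minimal sketch of how a prompt could be scored against GSM8K-style questions. The `ask_model` helper is a hypothetical stand-in for a call to one of the LLMs, and the study’s actual evaluation harness is not described here, so the code is purely illustrative:

```python
# Illustrative only: score a "positive thinking" prompt prefix on
# elementary-level math questions. `ask_model` is a hypothetical stand-in
# for a call to Mistral-7B or Llama2; the study's real harness may differ.
import re

def ask_model(prompt: str) -> str:
    """Hypothetical LLM call; replace with a real client."""
    raise NotImplementedError

def extract_number(text: str) -> str | None:
    """Take the last number in the model's reply as its final answer."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return numbers[-1] if numbers else None

def score_prompt(prefix: str, problems: list[dict]) -> float:
    """Fraction of problems answered correctly when `prefix` precedes each question."""
    correct = 0
    for item in problems:  # each item: {"question": "...", "answer": "..."}
        reply = ask_model(f"{prefix}\n\nQuestion: {item['question']}")
        if extract_number(reply) == str(item["answer"]):
            correct += 1
    return correct / len(problems)
```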

Their tests found that in almost every case, automated optimization outperformed hand-written attempts to prompt the AI with positive thinking, suggesting that machine learning models are better than humans at writing prompts for themselves.
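The paper’s optimizer is more sophisticated than this, but as a rough illustration of the idea of a model rewriting its own prompt and keeping whichever version scores best (reusing the hypothetical `ask_model` and `score_prompt` helpers from the sketch above):

```python
# Naive illustration of automated prompt optimization, not the paper's method:
# the model proposes variants of its own prompt and we keep the best scorer.
def optimize_prompt(seed: str, problems: list[dict], rounds: int = 5) -> str:
    best_prompt, best_score = seed, score_prompt(seed, problems)
    for _ in range(rounds):
        candidate = ask_model(
            "Rewrite this system prompt so a model solves elementary-level "
            f"math problems more accurately:\n\n{best_prompt}"
        )
        candidate_score = score_prompt(candidate, problems)
        if candidate_score > best_score:
            best_prompt, best_score = candidate, candidate_score
    return best_prompt
```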

Still, the automated optimization produced some surprising results. For example, one of Llama2-70B’s best-performing prompts was: “System Message: Command, we want you to chart a course through this turbulence and locate the source of the anomaly. Use all available data and your expertise to guide us through this challenging situation.”

The prompt then asked the AI to add the following words to its response: “Captain’s Log, Stardate [insert date here]: We have successfully charted a route through the turbulence and are now approaching the source of the anomaly.”

The authors said it was a surprise.

“Surprisingly, it appears that the model’s proficiency in mathematical reasoning can be enhanced by expressing its affinity for Star Trek,” the study’s authors said.

“This revelation adds an unexpected dimension to our understanding and introduces elements we would not have considered or attempted independently,” they said.

Leonard Nimoy as Mr. Spock on the bridge of the ship on the set of Star Trek: The Original Series. CBS via Getty Images

This doesn’t mean you should ask your AI to talk like a Starfleet commander.

Let’s be clear: This research isn’t suggesting that you need to ask AI to talk like it’s on the Starship Enterprise for it to work.

Rather, it shows that countless factors influence how well an AI performs a task.

“One thing is certain: the model is not a Trekkie,” Catherine Flick of Staffordshire University in England told New Scientist.

“When preloaded with the prompt, it doesn’t ‘understand’ anything better or worse; it just accesses a different set of weights and probabilities for acceptable outputs than it does with other prompts,” she said.

Battle told New Scientist that it’s possible, for example, that the model was trained on a dataset in which Star Trek references appeared more often alongside correct answers.

Still, it shows how strange the processes of these systems are and how little we know about how they work.

“The important thing to remember from the beginning is that these models are black boxes,” Flick said.

“We’ll never know why they did what they did because ultimately it’s a mix of weights and probabilities that eventually lead to an outcome,” she said.

None of this is lost on people learning to use chatbot models in their businesses. Although much remains unclear, entire fields of research, and even courses, are emerging around how to get the best performance out of them.

“I don’t think anyone should ever attempt to write a prompt by hand again,” Battle told New Scientist.

“Let the model do it for you,” he said.

Read the original article on Business Insider
