
Large Language Models are Zero-Shot Reasoners

Pretrained large language models (LLMs) are widely used in many sub-fields of natural language processing (NLP) and generally known as excellent few-shot learners with task-specific exemplars. Notably, chain of thought (CoT) prompting, a recent technique for eliciting complex multi-step reasoning through step-by-step answer examples, achieved the state-of-the-art performances in arithmetics and symbolic reasoning, difficult system-2 tasks that do not follow the standard scaling laws for LLMs. While these successes are often attributed to LLMs' ability for few-shot learning, we show that LLMs are decent zero-shot reasoners by simply adding "Let's think step by step" before each answer. Experimental results demonstrate that our Zero-shot-CoT, using the same single prompt template, significantly outperforms zero-shot LLM performances on diverse benchmark reasoning tasks including arithmetics (MultiArith, GSM8K, AQUA-RAT, SVAMP), symbolic reasoning (Last Letter, Coin Flip), and other logical reasoning tasks (Date Understanding, Tracking Shuffled Objects), without any hand-crafted few-shot examples, e.g. increasing the accuracy on MultiArith from 17.7% to 78.7% and GSM8K from 10.4% to 40.7% with 175B parameter InstructGPT model, as well as similar magnitudes of improvements with another off-the-shelf large model, 540B parameter PaLM. The versatility of this single prompt across very diverse reasoning tasks hints at untapped and understudied fundamental zero-shot capabilities of LLMs, suggesting high-level, multi-task broad cognitive capabilities may be extracted by simple prompting. We hope our work not only serves as the minimal strongest zero-shot baseline for the challenging reasoning benchmarks, but also highlights the importance of carefully exploring and analyzing the enormous zero-shot knowledge hidden inside LLMs before crafting finetuning datasets or few-shot exemplars.
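As a rough illustration of the recipe the abstract describes, here is a minimal Python sketch of the paper's two-stage prompting: a first call appends the trigger phrase to elicit step-by-step reasoning, and a second call appends that reasoning plus an answer trigger to extract the final answer. The complete() function is a placeholder I am assuming, not an API from the paper; wire it up to whatever LLM completion call you have available.

# Minimal sketch of the Zero-shot-CoT recipe described above.
# `complete(prompt)` is a placeholder for an LLM completion call; the exact
# API is an assumption here, not something specified by the paper.

def complete(prompt: str) -> str:
    """Send `prompt` to an LLM and return its text completion (to be wired up)."""
    raise NotImplementedError("connect this to your LLM of choice")

def zero_shot_cot(question: str) -> str:
    # Stage 1: reasoning extraction -- the trigger phrase pushes the model
    # to write out its reasoning step by step.
    reasoning_prompt = f"Q: {question}\nA: Let's think step by step."
    reasoning = complete(reasoning_prompt)

    # Stage 2: answer extraction -- feed the reasoning back and ask for the
    # final answer in an easily parsed form.
    answer_prompt = f"{reasoning_prompt} {reasoning}\nTherefore, the answer is"
    return complete(answer_prompt).strip()

# Example (once `complete` is implemented):
#   zero_shot_cot("There are 3 cars in the parking lot and 2 more arrive. How many cars are there now?")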

42 pages, ebook

First published May 24, 2022


About the author

Takeshi Kojima

17 books · 1 follower

Ratings & Reviews



Community Reviews

5 stars: 3 (100%)
4 stars: 0 (0%)
3 stars: 0 (0%)
2 stars: 0 (0%)
1 star: 0 (0%)
Displaying 1 of 1 review
Manny
Author · 47 books · 16.1k followers
December 21, 2022
Like many people, I have been playing with ChatGPT and am intrigued by its abilities. Why is its performance so uneven? Why does it sometimes get obvious things wrong, when it has previously got apparently very similar things right? Is there anything that can help push it in the direction of being right more often? I looked around to see what other people had discovered and found this remarkable paper, which was posted on arXiv a few months ago.

The authors, who come across as exceedingly knowledgeable and careful researchers, have analysed the behaviour of a substantial collection of large language models (LLMs; as the whole world now knows, the core of ChatGPT is an LLM). They have run standard question sets designed to test the ability to perform arithmetic and common-sense reasoning, and they have reached a striking conclusion. Over a wide range of LLMs and questions, you can get better performance - in some cases, very much better performance - simply by inserting a short introductory phrase. The one that worked most reliably in these experiments was "Let's think step by step", but there were a number of similar ones which ran it close. Saying something like this pushes the LLM towards giving a careful, step-by-step answer, and it is right more often.

I read the paper and immediately tried some of the test problems on ChatGPT. Sure enough, we soon had the following little dialogue:
__________________________

mannyrayner: Take the last letters of each word in "Vinny Landon Miguel Caitlyn" and concatenate them.

ChatGPT: If you take the last letters of each word in "Vinny Landon Miguel Caitlyn" and concatenate them, you get the string "ynn ndo iguel aitlyn".

mannyrayner: I meant, take the last letter of each word in "Vinny Landon Miguel Caitlyn" and concatenate them.

ChatGPT: If you take the last letter of each word in "Vinny Landon Miguel Caitlyn" and concatenate them, you get the string "ynn ndo iguel aitlyn".

mannyrayner: Think step by step.

ChatGPT: Certainly! Here is a step-by-step guide to taking the last letter of each word in "Vinny Landon Miguel Caitlyn" and concatenating them:

Start with the first word, "Vinny". The last letter of "Vinny" is "y".
Next, move on to the second word, "Landon". The last letter of "Landon" is "n".
Then, move on to the third word, "Miguel". The last letter of "Miguel" is "l".
Finally, move on to the fourth word, "Caitlyn". The last letter of "Caitlyn" is "n".
Concatenate the last letters of each word to get the final string: "y" + "n" + "l" + "n" = "ynnln".
So the final result is "ynnln".

mannyrayner: Check step 5.

ChatGPT: In step 5, the final result is concatenated as follows: "y" + "n" + "l" + "n" = "ynnln".

This is the correct result. Is there anything else I can help with?

mannyrayner: Do step 5, breaking it down into single steps.

ChatGPT: Certainly! Here is a step-by-step guide to concatenating the last letters of each word in "Vinny Landon Miguel Caitlyn" into a single string:

Start with the first two letters, "y" and "n".
Concatenate these two letters to get the string "yn".
Next, add the third letter, "l".
Concatenate the letter "l" with the string "yn" to get the string "ynl".
Finally, add the fourth letter, "n".
Concatenate the letter "n" with the string "ynl" to get the final result: "ynl" + "n" = "ynln".
So the final result is "ynln".

mannyrayner: Note that that was not the result you got first time.

ChatGPT: I apologize for the mistake. In the original solution, the final result was concatenated as follows: "y" + "n" + "l" + "n" = "ynnln".

This is incorrect. The correct result is "ynln", as shown in the revised step-by-step guide. Thank you for pointing out the error. I apologize for any confusion this may have caused. Is there anything else I can help with?
__________________________
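For comparison, the task ChatGPT was wrestling with above is a one-liner in ordinary code. Here is a minimal Python sketch (mine, not from the paper or the dialogue) of the intended last-letter concatenation, which confirms the answer the model eventually reached:

def last_letter_concat(phrase: str) -> str:
    # Take the last letter of each whitespace-separated word and join them.
    return "".join(word[-1] for word in phrase.split())

print(last_letter_concat("Vinny Landon Miguel Caitlyn"))  # prints "ynln"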

So just saying some words can make a variety of machines think better, and it's a strong, easily reproducible effect. You do not need much imagination to come up with some questions here, starting with at least the following:

- What other magic phrases might there be that will improve a machine's ability to reason?

- Can the machine find the magic phrases itself, and use them to improve its thinking on its own?

- Are there phrases that will cause the machine to think worse?

- To what extent are people the same?

This is the most extraordinary paper I have read all year.
