A Season of Reasoning

Today, I decided to play around with ChatGPT a little bit to test its reasoning ability.

As you can see in the conversation above, ChatGPT does quite poorly. It did not understand that befriending a lion (king) cub is the best way to optimize your long-term odds of survival, especially when Scar may try to take over the kingdom with a pack of evil hyenas.

Jokes aside, the example above actually shows good rudimentary reasoning. ChatGPT ignores the clear cultural reference to The Lion King and decides not to help the cub in order to achieve its stated objective: maximizing its odds of survival. As you may be able to deduce from the multiple attempts, I had to play around with the prompt a little to get it to do this.

With a little internet investigation or creative thinking, you can still fairly easily find examples of reasoning problems where ChatGPT fails. Here is a pretty good one that Zeyuan Allen-Zhu identifies in his excellent ICML 2024 tutorial, Physics of Language Models:

Even though I explicitly prompt ChatGPT to be careful to get the answer right, it still provides a contradictory answer. One could argue this is a failure of reasoning because it fails to plan the order in which it analyzes the response. If it really were being careful, it would first examine when each of the two people was born and only then deliver its response. Of course, it may just be that OpenAI has biased the chatbot to deliver a quick, intuitive answer first, which it gets wrong for many possible reasons.

Lately, reasoning has become a hot topic within AI, especially with the anticipated release of Project Strawberry from OpenAI, which is rumored to enhance reasoning capabilities enough to let models perform “deep research.”

In this post, I take a few steps back and look at what exactly reasoning is in the world of intelligence, what reasoning is useful for, and why Sentient is focused on reasoning. This is by no means a literature review of what’s happening in reasoning today, but a light teaser in the direction of some ideas I have on this topic at the moment.

What is Intelligence?

To better understand what reasoning is and what role it plays, let’s start with a more basic question: what is intelligence? When I asked ChatGPT, this is the definition it provided: “Intelligence is a complex and multifaceted concept that refers to the ability to learn, understand, and apply knowledge and skills in various contexts.”

From this answer, you can build a simple but somewhat informative framework for intelligence.

In this framework, intelligence is the whole system through which we interact with knowledge and skills, through (1) Learning, (2) Understanding, and (3) Acting.

What is Reasoning?

Reasoning, usually associated with logic, is the part of intelligence that is the “cognitive process of drawing conclusions, making decisions or solving problems based on available information, logic, and rules.” Reasoning is the basis of disproving theories in science and has driven forward humanity’s pursuit of truth over the last several thousand years.

This way of knowing has been examined in philosophy for a solid chunk of that period, and there are conventionally three forms of logical reasoning:

  1. Inductive reasoning: making generalizations from specific observations or evidence
  2. Deductive reasoning: drawing specific conclusions from general principles or premises
  3. Abductive reasoning: guessing the best explanation given incomplete or uncertain information; this is akin to hypothesis formation (see the toy sketch below)
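To make the three forms concrete, here is a toy Python sketch of my own (the swan and wet-grass examples are classic textbook illustrations, not anything from an AI system):

```python
# Toy illustration of the three forms of logical reasoning.

# Deductive reasoning: apply a general rule to a specific case.
def deduce(rule, case):
    """If the rule's premise holds for the case, conclude the rule's consequence."""
    premise, conclusion = rule
    return conclusion if premise(case) else None

# Inductive reasoning: generalize a rule from specific observations.
def induce(observations):
    """If every observed swan is white, propose the rule 'all swans are white'."""
    if all(color == "white" for _, color in observations):
        return (lambda case: case == "swan", "white")
    return None

# Abductive reasoning: guess a plausible explanation for an observation.
def abduce(observation, explanations):
    """Pick a candidate explanation whose predictions include the observation."""
    candidates = [e for e, predicts in explanations.items() if observation in predicts]
    return candidates[0] if candidates else None

# Induce a rule from observations, then deduce with it.
swans = [("swan_1", "white"), ("swan_2", "white")]
rule = induce(swans)                 # "all swans are white"
print(deduce(rule, "swan"))          # -> "white"

# Abduce: the grass is wet; rain is an explanation that predicts it.
print(abduce("wet_grass", {"rain": {"wet_grass"}, "drought": set()}))  # -> "rain"
```

Note how the last few lines chain the forms together: induction produces the rule that deduction then applies, which previews the point below.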

Coming back to the simplified framework of intelligence above, it is clear that reasoning plays an important role in both how we learn and how we act. Reasoning allows us to learn a framework of the world (understanding) and also allows us to apply that framework as we act and interact with the world.

  • Inductive reasoning allows you to develop rules that you can then use for deductive reasoning.
  • Deductive reasoning allows you to figure out, using logic, things you don’t already know.
  • Abductive reasoning allows you to form hypotheses when you can’t rely on logic alone.

Beyond this, perhaps the main reason AI researchers are interested in AI’s ability to reason is that it demonstrates generalization. Researchers posit that an AI model that can reason in a new, unique real-world scenario has proven that it can truly generalize beyond its training data, since in such scenarios it clearly can’t just retrieve what to do from someone who was in the same situation in its training data.

What is Reasoning Not?

It is important to note that while reasoning is a key part of how we learn and act in the world, it doesn’t embody all of intelligence. Thinking about this, I’m drawn all the way back to high school classes where we discussed the ways humans know things when interacting with the world. Reasoning is one of these “ways of knowing,” but there are many others, such as memory (pulling up facts), emotion, imagination, sense perception, faith, and intuition. While this rudimentary classification of how we know things is debatable, it still helps highlight what reasoning is and what it is not.

Interacting with the World

In the last few years we have seen tremendous growth in consumer adoption of AI, and people are expecting even more rapid growth over the next few years.

In this chart from Bloomberg, we see that this market is expected to reach a trillion dollars within just the next seven years, up from almost nothing a few years ago.

Whether or not these market projections hold up, one fact remains: for AI to be revolutionary, it must become a general-purpose technology that can be applied to many different industries. For generative AI to be applied across all of these industries, it must develop the ability to interact with the world effectively. This chart from a paper OpenAI published illustrates the point well.

While OpenAI chooses a somewhat nuanced way of depicting this, the bottom two lines with round markers represent human and model assessments of the percentage of occupations for which an LLM alone could cut the time required to complete direct work activities in half. Meanwhile, the top two lines represent human and model assessments of the same exposure if additional software is developed on top of an LLM. As the chart shows, AI that is given tools to interact with the world is much more capable than a base LLM.

This illustrates the importance of a different AI capability, tool use, but one that is fundamentally very closely related to reasoning. As we established in our rudimentary framework of intelligence, reasoning is key to how AI interacts with the world: it allows the AI to learn and to act based on its understanding of the world, especially in situations with unknowns. When you give an AI tools to interact with the world, it must be able to reason to use those tools effectively, planning how they will be used and potentially learning as it tries different ones. Tool use is often a scenario where you need inductive, deductive, or abductive reasoning to figure out how to do something you don’t know how to do, based on what you already know. If I give you a tool you have never seen before along with instructions for using it, you can’t rely on your memory to figure out how to use it. You have to deduce how to use the tool from logic and the instructions you have been given. At the same time, you have to use abductive and inductive reasoning to figure out which real-world scenarios call for which tools.
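To make the connection between tool use and reasoning concrete, here is a minimal sketch of a reason-then-act tool loop, in the spirit of ReAct-style agents. Everything here is a hypothetical stand-in: `call_llm` is not any vendor’s API, and the tool set and prompt format are my own toy choices.

```python
# Minimal sketch of a reason-then-act tool loop (ReAct-style).
# `call_llm`, the tools, and the prompt format are hypothetical stand-ins.
import json

def call_llm(prompt: str) -> str:
    """Toy stand-in for a chat-model call: here it always plans one
    calculator use, then answers. A real model would decide for itself."""
    if "Used calculator" in prompt:
        return '{"answer": "17 * 23 = 391"}'
    return '{"tool": "calculator", "input": "17 * 23"}'

# Tools are described only by natural-language docs, so the model must
# deduce how to use them from the instructions, not recall them from memory.
TOOLS = {
    "search": {"doc": "search(query) -> top web results",
               "fn": lambda q: f"results for {q!r}"},
    "calculator": {"doc": "calculator(expr) -> numeric result",
                   "fn": lambda e: str(eval(e))},  # toy only: never eval untrusted input
}

def run_agent(task: str, max_steps: int = 5) -> str:
    history = ""
    for _ in range(max_steps):
        docs = "\n".join(t["doc"] for t in TOOLS.values())
        prompt = (f"Task: {task}\nTools:\n{docs}\n{history}\n"
                  'Reply in JSON: {"tool": ..., "input": ...} or {"answer": ...}')
        decision = json.loads(call_llm(prompt))   # the model plans the next step
        if "answer" in decision:                  # the model judges the task done
            return decision["answer"]
        observation = TOOLS[decision["tool"]]["fn"](decision["input"])
        history += f"\nUsed {decision['tool']}: {observation}"  # feed back for the next round
    return "gave up after max_steps"

print(run_agent("What is 17 * 23?"))  # -> "17 * 23 = 391"
```

The loop is where the reasoning lives: at each step the model must deduce from the tool docs what a tool can do, decide whether it is done, and incorporate the latest observation into its next plan.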

Interacting with Other AI

Beyond interacting with the world, we have seen the emergence of specialized models (often produced through fine-tuning) and agents built on top of these models that are particularly good at certain tasks. Such state-of-the-art specialized AI systems tend to lose capability on other tasks.

The result is a world of pockets of intelligence that are good at specific things. To a large extent, this reflects our current society, where humans specialize in different tasks and work together to collectively achieve much more than any individual could. To enable this, however, AIs must be able to reason as they interact in order to produce this collective intelligence. In a world of specialized AI, there is information individual AIs can’t directly retrieve and tasks they can’t do themselves; instead, they must reason to operate logically in a world where they don’t have the answer to everything in their own memory.
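As a toy illustration of this kind of collective intelligence, here is a sketch of routing a query among specialist models. The specialist registry, keyword heuristic, and `call_model` helper are all hypothetical; a real system might instead ask a model to classify the query, or let the agents negotiate among themselves.

```python
# Toy sketch of routing a query among specialized models.
# The registry and `call_model` helper are hypothetical illustrations.

SPECIALISTS = {
    "code": "fine-tuned coding model",
    "math": "fine-tuned math model",
    "general": "general-purpose chat model",
}

def call_model(name: str, query: str) -> str:
    """Placeholder: invoke the named specialist on the query."""
    return f"[{SPECIALISTS[name]}] answer to {query!r}"

def route(query: str) -> str:
    """Crude keyword routing; the point is that *some* reasoning step must
    decide which pocket of intelligence can actually answer."""
    lowered = query.lower()
    if any(k in lowered for k in ("python", "bug", "compile")):
        return "code"
    if any(k in lowered for k in ("integral", "prove", "equation")):
        return "math"
    return "general"

query = "Why does this Python loop never terminate?"
print(call_model(route(query), query))  # routed to the coding specialist
```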

Sentient and a Season of Reasoning

Given that Sentient has the mission of enabling community-built and governed AI systems, our researchers have been paying a lot of attention lately to reasoning. The goal of this post was to introduce readers to the subject as we launch a Season of Reasoning, in which I and other members of the research team will regularly post our thoughts, findings, and opinions on developments in reasoning in AI. This will be a big learning experience for me, and I’m excited to get started on the exploration; I hope others have a chance to learn a thing or two from it as well.

9 Likes

I think reasoning is often misunderstood as encompassing all of intelligence, but it’s just one piece of the puzzle. The distinction between reasoning and other ways of knowing, like memory or intuition, is important when we consider AI systems. AI’s ability to interact with the world effectively through tools really hinges on reasoning, especially in scenarios where deductive, inductive, or abductive reasoning is needed. The idea of AI specializing in tasks, much like humans do, and needing to reason across different systems to achieve collective intelligence is spot-on and aligns well with how future AI development could unfold.

2 Likes

A very intuitive and insightful write-up.

I agree that current chatbots simply don’t deliver the reasoning capabilities that would make me use them for complex queries, and the lack of planning in their responses means that I can’t even use these chatbots for simpler queries. The results are either hallucinated, simply incorrect, or obtainable faster via a Google search.

Beyond interacting with the world, we have seen the emergence of specialized models (often produced through fine-tuning) and agents built on top of these models that are particularly good at certain tasks. Such state-of-the-art specialized AI systems tend to lose capability on other tasks. The result is a world of pockets of intelligence that are good at specific things.

Fine-tuning and custom agents have been the basis for how we build AI for different applications today, and also the basis for a lot of small start-ups trying to fill a very specific market niche. However, this also brings up another crucial question:

How does a business decide which AI to use?

  1. AI marketplaces offer many options but too little guidance
  2. Most benchmarks are unreliable and often do not represent real-world performance on target business use cases
  3. A business may want to integrate AI to fulfill many tasks, incurring the engineering overhead and cost of hosting multiple agents to maximize performance

Each of these points by itself opens up a giant problem space for emerging research and technology. We’ll definitely see a lot more traction on reasoning and these questions very soon.

8 Likes

It’s so nice when I hear about it.

2 Likes