July 7th, 2024 Open AGI Summit Brussels
Professor Narasimhan (Princeton University, Sierra AI) presented his thoughts on language agents and multi-agent interaction at the Open AGI Summit last week in Brussels; here are the key points he covered.
Full Talk:
Talk Notes:
- We can define a language agent as one that can understand and generate language while also being capable of taking action.
- This includes programming languages or even math.
- Humans are language agents.
An example of an agent:
- In Software Engineering, writing code is usually only 20% of the work. Most of the time and effort goes into debugging, refactoring, and maintaining the code.
- Professor Narasimhan's lab at Princeton recently developed SWE-agent, an agent that automates parts of software engineering: it takes an issue description and tries hundreds of actions until the issue is resolved. [1] (This is an open-source alternative to Devin.)
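The issue-resolution process can be pictured as an observe-act loop. The sketch below is purely illustrative; the class and method names are hypothetical stand-ins, not SWE-agent's actual interface.

```python
# Hypothetical observe-act loop for an issue-fixing agent, in the spirit of
# SWE-agent. All names here are illustrative, not the real SWE-agent API.

def run_agent(issue, env, model, max_steps=100):
    """Repeatedly ask the model for an action, execute it in the repository
    environment, and stop once the environment reports the issue resolved."""
    observation = issue
    for _ in range(max_steps):
        action = model.propose_action(observation)  # e.g. "edit file", "run tests"
        observation = env.execute(action)           # feedback from the repo/tests
        if env.issue_resolved():
            return True
    return False

# Toy stand-ins so the loop can be exercised end to end.
class ToyEnv:
    def __init__(self, steps_needed=3):
        self.steps_needed = steps_needed
        self.steps_taken = 0

    def execute(self, action):
        self.steps_taken += 1
        return f"output of {action}"

    def issue_resolved(self):
        return self.steps_taken >= self.steps_needed

class ToyModel:
    def propose_action(self, observation):
        return "run_tests"
```

The point of the loop structure is that most of the agent's effort, like a human engineer's, goes into iterating on feedback rather than writing code in one pass.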
Multi-agent interaction:
- To some extent, the entire world is a giant multi-agent playground.
- When you think about it, it's obvious that language is key to any multi-agent communication and collaboration.
- Language has been key to how we have built society and accelerated progress.
- Language is key for language agents because it:
- Allows for real-time communication with other agents (humans and AI)
- Allows for understanding the world by "reading" and "listening"
- In the near term, it's likely that we will have agents talking to each other, not just to humans.
Challenges for building AI Agents:
The primary challenges for building Agents in the days ahead include:
- Building good evaluations:
- Static benchmarks and datasets are unlikely to predict how agents will perform in dynamic, real-world settings.
- Developing and using principled frameworks for agent development (e.g., CoALA [Sumers et al., 2023]). [2]
- Ensuring trustworthiness and safety:
- Agents can be much more powerful and dynamic than one-pass models and have the potential to affect society in ways that we can't even imagine today.
Recent work at Sierra on building good evaluations:
At Professor Narasimhan's startup Sierra AI, they recently developed a new benchmark for evaluating agents [τ-bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains; Yao et al., 2024]: [3]
- τ-bench (Tool-Agent-User Benchmark). In this benchmark:
- An agent has to take on the role of a customer service representative
- The agent is interacting with a human and an environment.
- This is a challenging and realistic task:
- The agent has to deal with partial information
- The agent has to interact with tools
- Testing popular models on this benchmark, they find that current language agents are not up to this task.
- As depicted in the figure above, even the best models degrade in performance very quickly when the same scenario is run multiple times.
- While progress is being made towards a multi-agent future, we need to solve critical issues like reliability and dependability.
- If we can solve the crux of the issue, which is having them understand language and use language as a tool for reasoning or more complex computation, we can really have a collaborative multi-agent society in the future.
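One way to quantify the reliability degradation described above is a pass^k-style metric: the probability that an agent succeeds on all k independent runs of the same scenario. The sketch below is a simplified illustration of the idea (assuming independent trials with a fixed per-run success rate), not the benchmark's exact estimator.

```python
# Sketch of a pass^k-style reliability metric: the probability that an agent
# solves the same task on all k independent attempts. If a single attempt
# succeeds with probability p, then pass^k = p**k, so reliability decays
# quickly with k unless p is very close to 1.

def pass_k(p: float, k: int) -> float:
    """Probability of succeeding on all k independent trials."""
    return p ** k

# An agent that is 90% reliable on a single run succeeds on all 8 runs
# only about 43% of the time.
print(round(pass_k(0.9, 8), 2))  # 0.43
```

This is why a model that looks strong on a one-shot benchmark can still be far from dependable enough for real customer-facing deployment.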
References: