Automatic data generation with STRATEGIST

Here are some of my notes on Strategist: Learning Strategic Skills by LLMs via Bi-Level Tree Search by Light et al.

Full Article: [2408.10635] Strategist: Learning Strategic Skills by LLMs via Bi-Level Tree Search

Motivation:

  • Our interest lies in exploring how LLMs can acquire and refine skills autonomously, especially in multi-agent settings where strategic decision-making is crucial.

  • We aim to develop methods that improve agent reasoning through self-play and reflection, enabling them to learn without constant human supervision.

  • The intersection of action planning and dialogue generation in games can serve as a valuable testbed for improving reasoning and adaptability in real-world scenarios.

Method:

  • The core approach is a self-evolving system where agents improve through self-play simulations, guided by Monte Carlo tree search (MCTS) and LLM-based reflection.
  • The process minimizes human intervention by allowing agents to gather feedback, evaluate states, and learn strategic skills autonomously.
  • This self-improvement loop enables agents to refine both high-level strategy and low-level execution over time, optimizing their performance in complex environments.
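
To make the loop above concrete, here is a minimal sketch of the evaluate-and-reflect cycle. It is not the paper's implementation. It assumes a toy game (Nim: take 1 to 3 stones, whoever takes the last stone wins), it replaces the MCTS with plain deterministic self-play matches, and it replaces the LLM reflection step with a hypothetical stub, `reflect`, that simply proposes neighbouring strategies. All names (`choose_move`, `play`, `win_rate`, `reflect`, `self_improve`) are made up for illustration.

```python
from typing import List

# Hypothetical strategy encoding: an integer k meaning
# "try to leave the opponent a multiple of k stones".
Strategy = int


def choose_move(stones: int, k: Strategy) -> int:
    """Take 1-3 stones, preferring a move that leaves a multiple of k."""
    for take in (1, 2, 3):
        if take <= stones and (stones - take) % k == 0:
            return take
    return 1  # fallback when no such move exists


def play(first: Strategy, second: Strategy, stones: int) -> int:
    """Play one game; return 0 if `first` takes the last stone, else 1."""
    players = (first, second)
    turn = 0
    while True:
        stones -= choose_move(stones, players[turn])
        if stones == 0:
            return turn
        turn = 1 - turn


def win_rate(candidate: Strategy, opponent: Strategy) -> float:
    """Self-play evaluation: the candidate plays both seats from several starts."""
    wins, games = 0, 0
    for start in range(5, 25):
        wins += 1 if play(candidate, opponent, start) == 0 else 0
        wins += 1 if play(opponent, candidate, start) == 1 else 0
        games += 2
    return wins / games


def reflect(k: Strategy) -> List[Strategy]:
    """Stand-in for LLM reflection (assumption): propose nearby strategies."""
    return [c for c in (k - 1, k + 1) if c >= 2]


def self_improve(initial: Strategy, rounds: int = 5) -> Strategy:
    """Keep a proposed strategy only if it beats the current best in self-play."""
    best = initial
    for _ in range(rounds):
        for candidate in reflect(best):
            if win_rate(candidate, best) > 0.5:
                best = candidate
    return best


if __name__ == "__main__":
    print(self_improve(2))
```

Starting from `k = 2`, the loop climbs to `k = 4`, which is the known optimal rule for this variant of Nim, without any human feedback. In a fuller version, `win_rate` would be backed by tree search and `reflect` by an LLM call.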

Our Vision:

  • The paper demonstrates a promising direction in using LLMs for reasoning in complex environments, a concept we can expand upon by incorporating more diverse games and scenarios.
  • We envision enhancing this approach by integrating more sophisticated LLM-based reasoning frameworks that allow for better state representation and planning.
  • By focusing on continuous improvement, we can push LLMs to reason about more complex, real-world tasks beyond games, such as multi-agent collaboration in dynamic systems.

Interesting description, but the link to the full article doesn’t show for me :thinking: