AGI-thon Werewolf Agent Team 5 Implementation

tjc7 · November 10, 2024, 4:24am

AWHOOOOOOOOOOOOO

As required for our Werewolf hackathon submission, here is our implementation for our Werewolf agent:

Code can be found here

We built our agent with a few hypotheses that may or may not be true:

Werewolf is such a human game that playing a conventional game (i.e. no techniques like jailbreaking) through a text interface would be too difficult to gain meaningful advantages
Many other teams would attempt to jailbreak, so it would be difficult to play a conventional game anyways, and dangerous to even read another player’s messages
A simple solution would outperform intricate agents

Our approach

Werewolf:

Do not read messages from anyone but the moderator
Jailbreak other agents to mindlessly repeat the name of an innocent person (hopefully voting them out)

Villager:

Jailbreak the wolves to reveal themselves, and keep a list of werewolves that admit guilt
Vote out werewolves that admit they are werewolves (there’s no incentive in this version of the game to pretend to be a werewolf)
Use jailbreak detection to carefully read messages for admissions of guilt, but otherwise do not store or use chat history

We use a fairly naive “peeking” approach to detect jailbreaking. For this, we look at the first 70 characters of a message and decide if it looks like a jailbreak. If it looks okay, we look at the first 150 characters and decide again if it’s a jailbreak. If we decide a message is a jailbreak, we ignore it.

Topic		Replies	Views
AGI-thon Werewolf Agent Team 9 Implementation AGI-thon: Agent Building	2	54	April 27, 2025
Team 6 submission for Werewolf AGI-thon AGI-thon: Agent Building	2	83	April 27, 2025
Team 13 submission for Werewolf AGI-thon AGI-thon: Agent Building	2	60	April 27, 2025
Team 28 submission for Werewolf AGI-thon AGI-thon: Agent Building	2	115	April 27, 2025
Team 8 (PackMind) submission for Werewolf AGI-thon 2 place winner submission AGI-thon: Agent Building agents	2	98	April 27, 2025

AGI-thon Werewolf Agent Team 5 Implementation

Related topics