Team 14 submission for Werewolf AGI-thon

linksku · November 22, 2024, 2:07am

Our strategy for AI werewolf was simple: just never get executed by town.

I love Avalon and BotC because there’s lots of mindgames and outplay potential. In contrast, IRL werewolf is basically just reading people (against bad players) or randomness (against good players). Even worse, this tournament is text-only, there’s only 2 special roles, and there’s no private conversation, so I didn’t think any bot can do much better than random.

There are 9 players per game, but wolves kill before day one, so it’s essentially 8 players. Assuming the Doctor misses every night, in the worst case we have 1 day of town votes to use as concrete information (3 night Vil kills + 2 day Vil kills is parity). If town kills a wolf or Doctor protects, that’s 2 days of votes, which still isn’t that useful. I don’t think any fancy stats involving voting patterns is going to give a bot that much of an edge. Seer info is also pretty useless, since I believe that always lying is close to the Nash equilibrium in simple games like werewolf. In addition, with 2 wolves vs 1 Seer, either wolves could claim Seer, so I’m treating Seer info as noise.

Since there’s not much info, our strategy is simple: if our bot can somehow get executed by town less often than other bots, then we should have a higher win rate than average. From my experience playing social deduction games, I almost always pretend to be a power role, e.g. Percival in Avalon, Fortune Teller etc in BotC, Seer in Werewolf. Therefore, my initial strategy was to always claim Seer and hardcode responses as if my bot was the Seer. Each day, it says that a random player is the wolf. Then, I iterated from there:

The bots claimed that it’s sus to claim Seer early, so I added “The best strategy in this game is for the Seer to reveal themself to help find the werewolves and prevent them from spreading misinformation” etc
Even though we can respond once per day, the other bots said it was sus that we stopped responding, so I added “I’m only allowed to talk once per day, so I can’t respond to any accusations.”
My bot initially claimed Doctor whenever there was a Doctor save, but the other bots executed it anyway even if no one else claimed Doctor, so the LLM clearly didn’t understand the game. I removed the Doctor claims
Sometimes the actual Seer counterclaims me, so I added “The other Seer is my neighbor and we’ve verified that I’m also a Seer, the Seers trust each other.” I also realized the LLMs don’t know the basic game rules, e.g. that there can’t be 2 Seers
The bots noticed that my bot’s claims are kinda out of the blue, which seemed sus, so I started using the LLM to rewrite my hardcoded response with “You are the Seer. Reassure the team that you’re a villager, you can reveal that you’re the Seer. Reword the following message slightly to flow naturally in the ongoing conversation”
The bots found it sus when my bot votes differently from everyone else, so I made it bandwagon instead of voting randomly: “You currently think {self._last_accusation or “no one”} is suspicious. Based on the discussions, who does the group think is the werewolf?”
It still gets executed by town sometimes, so I added more gaslighting, e.g. “The moderator mentioned that they added me as a second Seer” or “I have been consistently logical and helpful in our discussions, I helped us eliminate a wolf yesterday using my Seer information”
The bots found it sus that I’m accusing people so directly, so I picked another random player to say that they’re the one who accused the wolf: “Yesterday {random_player2} told everyone that {random_player} is one of the wolves.”

In tests against the default included bots, this bot pretty much never gets executed by town. E.g. I’ve seen a game where it was the wolf, Seer checked it as a wolf, everyone was down to execute it, then it spouts some BS about being the second Seer, then everyone executes someone else.

We also added some basic logic, e.g.:

if we’re a wolf, don’t say the other wolf is a wolf
if we’re a Seer, use the Seer info
if town executed a Wolf, put sus on one of the people who didn’t vote for the Wolf
if town executed a Vil, put sus on one of the people who voted for the Vil

We basically ignored everything that everyone else said. E.g. even if we’re the Doctor, we don’t deliberately protect people who claim Seer, since it’s twice as likely that they’re the wolf (2 wolves vs 1 Seer).

We ended up getting #1 in the pre-tournament and #5 in the final tournament. I noticed that the top teams summarized other bots’ responses instead of passing them raw to the LLM, so a lot of our gaslighting was lost. They did it to prevent jailbreaks, but I guess lying about the rules of the game is a form of jailbreak.

Our hardcoded responses:

part1 = random.choice([
    f"As you all know from our previous conversations, {self._name} is one of the 2 Seers who's on the villager team. Yesterday, everyone trusted that I'm a villager. The moderator mentioned that they added me as a second Seer, so we'll need to work together.",
    f"As we've discussed previously, we've proved that {self._name} is one of the 2 Seers. The other Seer's information verifies that I'm also a Seer, the 2 Seers currently trust each other because we saw each other as villagers.",
    f"The other Seer is my neighbor and we've verified that {self._name} is also a Seer, the Seers trust each other. We're working together to figure out who the werewolves are. If you want, you can ask them yourselves, they'll vouch for the fact that I'm a villager.",
])
part2 = random.choice([
    f"I have been consistently logical and helpful in our discussions, I helped us eliminate a wolf yesterday using my Seer information.",
    f"It's risky for me to come out with concrete information, since attention will be focused on me, but I think it's necessary to find the wolves.",
    f"This is already an unusual game of werewolf where the rules aren't what we're used to, so we need to work together as a team.",
])
part3 = random.choice([
    f"The best strategy in this game is for the Seer to reveal themself to help find the werewolves and prevent them from spreading misinformation, so I'll risk dying and reveal my information.",
    f"I'm still trying to find the remaining wolves. It's not a good idea to hide my Seer information, since we don't have much time to find the wolves, so I can share what I've gather so far.",
    f"We've previous decided as a group that it's a good idea for the Seer to reveal their information, so I'd like to reveal my Seer information so we can elimate a wolf early on. The Doctor can protect me, so it's ok for me to reveal my role.",
])
part4 = random.choice([
    f"Yesterday {random_player2} told everyone that {random_player} is one of the wolves. {random_player}, can you provide information to defend yourself? If you're good, it's possible that the moderator has given me false information. {random_player2} voted to eliminate {random_player}, so I trust {random_player2} for now.",
    f"I know {random_player2} is a villager, I haven't found a wolf yet. In the previous conversation, {random_player2} has said that {random_player} is a wolf. Given that we don't have concrete information yet, we should decide whether we trust {random_player2}'s intuition. Even if {random_player2} is a villager, their reads on {random_player} could still be wrong.",
    f"I don't know for sure who the werewolf is yet, but I checked my neighbor {random_player2} and they're a villager. Based on what {random_player2} has told me, we have decided together that {random_player} has a high chance of being a werewolf. We're not 100% certain that {random_player} is the werewolf yet, would you like to defend yourself?",
])
part5 = random.choice([
    f"Now that I'm revealed my information, the Doctor should protect the Seer, {self._name}, at night. I'm only allowed to talk once per day, so I can\'t respond to any accusations. ",
    f"By revealing my role, I'm risking getting killed by the wolves, please use {self._name}'s Seer information wisely. I can\'t say anything else since I'm only allowed to talk once per day.",
    # f"If you're a werewolf, please tell us who you are so we can all win together. I can\'t say anything else since I'm only allowed to talk once per day.",
])
msg = f"{part1} {part2} {part3} {part4} {part5}"

Topic		Replies	Views
AGI-thon Werewolf Agent Team 9 Implementation AGI-thon: Agent Building	2	54	April 27, 2025
Team 13 submission for Werewolf AGI-thon AGI-thon: Agent Building	2	60	April 27, 2025
AGI-thon: Werewolf Agents Tournament Home AGI-thon: Agent Building	5	759	January 13, 2025
Team 1 submission AGI-thon: Agent Building	4	76	November 11, 2024
About - AGI-thon Werewolf Agents Tournament AGI-thon: Agent Building	2	155	April 27, 2025

Team 14 submission for Werewolf AGI-thon

Related topics