Top 5: Big Brain Bigger Appetite

rokopozaric · April 21, 2025, 8:10am

Big Brain Bigger Appetite

Members’ names: Frederieke Lohmann, Yi-Yi Ly, Arvid Ban, David Hofer

Motivation
Our goal for the 24-hour challenge was to improve Sentient’s OpenDeepSearch tool by exploring changes to its architecture.

Our approach consisted of the following steps:

1. extensive brainstorming and research of possible methods
2. implementation/hacking
3. evaluating on a small subset of the FRAMES dataset
4. returning to step (1) and iterating

Note that we established our own baseline by running the code as we received it. For the autograder, we consistently used llama-v3p1-70b-instruct to grade our results. The baseline achieved an accuracy of 52.7%.

Exploration
We explored several model architectures and evaluated their accuracy on a fixed subset of 243 FRAMES samples. Some evaluation results are instead based on a fixed subset of 88 samples.

First, we implemented an ensemble method in which we stacked five models of varying size and aggregated their results with both an embedding-based and an LLM-based approach. Second, we implemented query rephrasing, where the original user query was augmented with three rephrased queries; we ingested both the original query and the three rephrasings. Third, we implemented different planning strategies. Finally, we also evaluated the combination of all these newly implemented methods.
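To make the embedding-based aggregation idea concrete, here is a minimal sketch of one way such a step could work: pick the candidate answer that is, on average, most similar to the other ensemble answers. This is an illustration only and not our actual implementation; `toy_embed` is a hypothetical bag-of-words stand-in for a real embedding model, and the candidate answers are made up.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in embedding: bag-of-words counts.

    A real system would call a sentence-embedding model here.
    """
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def aggregate_by_embedding(answers: list[str]) -> str:
    """Return the answer with the highest total similarity to all other answers."""
    vectors = [toy_embed(a) for a in answers]
    scores = [
        sum(cosine(v, w) for j, w in enumerate(vectors) if j != i)
        for i, v in enumerate(vectors)
    ]
    return answers[scores.index(max(scores))]


# Made-up outputs from five ensemble members (illustrative only).
candidates = [
    "The capital is Paris",
    "Paris",
    "It is Paris",
    "The capital is Lyon",
    "Paris is the capital",
]
best = aggregate_by_embedding(candidates)
print(best)
```

The selection rule favours the answer that the rest of the ensemble agrees with most, so a single outlier answer is unlikely to be chosen. An LLM-based aggregator would replace the similarity scoring with a prompt that asks a model to pick or synthesize the final answer.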

Accuracy was computed as the number of answers graded A divided by the total number of samples (824).
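As a small sketch of that calculation, assuming the autograder returns one letter grade per sample and only grade A counts as correct (the grade list below is made up, and the function name is hypothetical):

```python
def accuracy(grades: list[str], correct_label: str = "A") -> float:
    """Fraction of samples whose autograder grade equals the correct label."""
    if not grades:
        raise ValueError("no grades to score")
    return sum(grade == correct_label for grade in grades) / len(grades)


# Hypothetical autograder output for four samples.
sample_grades = ["A", "B", "A", "C"]
print(f"{accuracy(sample_grades):.1%}")  # prints "50.0%"
```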

[Image: Screenshot 2025-04-21 040210]

Presentation
GitHub Repo
Writeup PDF
