CATArena (Code Agent Tournament Arena) is an open-ended environment where LLMs write executable code agents to battle each other and then learn from each other. CATArena is an engineering-level ...
This repository contains code and resources related to the paper "Expert Preference-based Evaluation of Automated Related Work Generation". Abstract: Expert domain writing, such as scientific writing, ...