Hui Soo Chae

Battle of the Transformer Models

Updated: Sep 2



Learning to critically analyze the output of large language models (LLMs) is an important part of developing A.I. literacy and using A.I. responsibly.


Check out the tools below to begin evaluating output from LLMs like OpenAI ChatGPT, Google Gemini, and Anthropic Claude:



LMSYS Chatbot Arena - https://lmarena.ai/
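If you want to run your own informal side-by-side comparisons outside of the Arena, one simple approach is to send the same prompt to two providers and read the responses next to each other. The sketch below assumes the official openai and anthropic Python SDKs are installed and that API keys are set in the environment; the model names are placeholders you would swap for current ones.

```python
# A minimal sketch for collecting side-by-side responses to one prompt.
# Assumes `pip install openai anthropic` and that OPENAI_API_KEY and
# ANTHROPIC_API_KEY are set in the environment. Model names are examples only.
from openai import OpenAI
import anthropic

PROMPT = "Explain what a transformer model is in two sentences."

openai_client = OpenAI()
anthropic_client = anthropic.Anthropic()

# Ask ChatGPT (OpenAI Chat Completions API).
openai_reply = openai_client.chat.completions.create(
    model="gpt-4o-mini",  # example model name
    messages=[{"role": "user", "content": PROMPT}],
).choices[0].message.content

# Ask Claude (Anthropic Messages API).
claude_reply = anthropic_client.messages.create(
    model="claude-3-5-sonnet-latest",  # example model name
    max_tokens=300,
    messages=[{"role": "user", "content": PROMPT}],
).content[0].text

# Print the two responses one after the other for manual review.
for name, reply in [("ChatGPT", openai_reply), ("Claude", claude_reply)]:
    print(f"=== {name} ===\n{reply}\n")
```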



For folks who are familiar with Git and Python, check out LLM Comparator, an "interactive visualization tool with a python library, for analyzing side-by-side LLM evaluation results." You can learn more in the video below or in the paper, "LLM Comparator: Visual Analytics for Side-by-Side Evaluation of Large Language Models."
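The core idea behind this kind of side-by-side evaluation is simple enough to sketch in plain Python: pair each prompt with the two models' responses, record which one a judge (human or LLM) preferred, and tally the results. The field names and file format below are illustrative only, not the LLM Comparator schema; consult the project's repository for the exact input format its visualization expects.

```python
# Illustrative bookkeeping for side-by-side evaluation (not the
# LLM Comparator schema; see the project's repo for its actual format).
import json
from collections import Counter

# Each record pairs one prompt with responses from models A and B,
# plus a judge verdict: "A", "B", or "tie".
records = [
    {"prompt": "Summarize the causes of WWI.", "response_a": "...",
     "response_b": "...", "verdict": "A"},
    {"prompt": "Write a haiku about autumn.", "response_a": "...",
     "response_b": "...", "verdict": "tie"},
    {"prompt": "Explain gradient descent.", "response_a": "...",
     "response_b": "...", "verdict": "B"},
]

# Tally how often each model wins across the evaluation set.
tally = Counter(r["verdict"] for r in records)
total = len(records)
for label in ("A", "B", "tie"):
    print(f"{label}: {tally[label]} / {total} ({tally[label] / total:.0%})")

# Save the records so they can be loaded into a visualization tool later.
with open("side_by_side_results.json", "w") as f:
    json.dump(records, f, indent=2)
```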



References

ACM SIGCHI. (2024, May 9). LLM Comparator: Visual Analytics for Side-by-Side Evaluation of Large Language Models [Video]. YouTube. https://www.youtube.com/watch?v=mnCvEHVc3ac


Kahng, M., Tenney, I., Pushkarna, M., Liu, M. X., Wexler, J., Reif, E., ... & Dixon, L. (2024, May). LLM Comparator: Visual analytics for side-by-side evaluation of large language models. In Extended Abstracts of the CHI Conference on Human Factors in Computing Systems (pp. 1-7).
