Hui Soo Chae

Battle of the Transformer Models

Updated: Sep 2



Learning to critically analyze the output of large language models (LLMs) is an important part of developing A.I. literacy and using A.I. responsibly.


Check out the tools below to begin evaluating output from LLMs like OpenAI ChatGPT, Google Gemini, and Anthropic Claude:



LMSYS Chatbot Arena - https://lmarena.ai/
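If you want to run your own informal side-by-side comparisons outside of the Arena, one simple approach is to send the same prompt to two providers and read the responses next to each other. The sketch below assumes the official openai and anthropic Python SDKs are installed and that API keys are set in the environment; the model names are placeholders you would swap for current ones.

```python
# A minimal sketch for collecting side-by-side responses to one prompt.
# Assumes `pip install openai anthropic` and that OPENAI_API_KEY and
# ANTHROPIC_API_KEY are set in the environment. Model names are examples only.
from openai import OpenAI
import anthropic

PROMPT = "Explain what a transformer model is in two sentences."

openai_client = OpenAI()
anthropic_client = anthropic.Anthropic()

# Ask ChatGPT (OpenAI Chat Completions API).
openai_reply = openai_client.chat.completions.create(
    model="gpt-4o-mini",  # example model name
    messages=[{"role": "user", "content": PROMPT}],
).choices[0].message.content

# Ask Claude (Anthropic Messages API).
claude_reply = anthropic_client.messages.create(
    model="claude-3-5-sonnet-latest",  # example model name
    max_tokens=300,
    messages=[{"role": "user", "content": PROMPT}],
).content[0].text

# Print the two responses one after the other for manual review.
for name, reply in [("ChatGPT", openai_reply), ("Claude", claude_reply)]:
    print(f"=== {name} ===\n{reply}\n")
```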



For folks who are familiar with Git and Python, check out LLM Comparator, an "interactive visualization tool with a python library, for analyzing side-by-side LLM evaluation results." You can learn more in the video below or in the paper, "LLM Comparator: Visual Analytics for Side-by-Side Evaluation of Large Language Models."
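The core idea behind this kind of side-by-side evaluation is simple enough to sketch in plain Python: pair each prompt with the two models' responses, record which one a judge (human or LLM) preferred, and tally the results. The field names and file format below are illustrative only, not the LLM Comparator schema; consult the project's repository for the exact input format its visualization expects.

```python
# Illustrative bookkeeping for side-by-side evaluation (not the
# LLM Comparator schema; see the project's repo for its actual format).
import json
from collections import Counter

# Each record pairs one prompt with responses from models A and B,
# plus a judge verdict: "A", "B", or "tie".
records = [
    {"prompt": "Summarize the causes of WWI.", "response_a": "...",
     "response_b": "...", "verdict": "A"},
    {"prompt": "Write a haiku about autumn.", "response_a": "...",
     "response_b": "...", "verdict": "tie"},
    {"prompt": "Explain gradient descent.", "response_a": "...",
     "response_b": "...", "verdict": "B"},
]

# Tally how often each model wins across the evaluation set.
tally = Counter(r["verdict"] for r in records)
total = len(records)
for label in ("A", "B", "tie"):
    print(f"{label}: {tally[label]} / {total} ({tally[label] / total:.0%})")

# Save the records so they can be loaded into a visualization tool later.
with open("side_by_side_results.json", "w") as f:
    json.dump(records, f, indent=2)
```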



References

ACM SIGCHI. (2024, May 9). LLM Comparator: Visual Analytics for Side-by-Side Evaluation of Large Language Models [Video]. YouTube. https://www.youtube.com/watch?v=mnCvEHVc3ac


Kahng, M., Tenney, I., Pushkarna, M., Liu, M. X., Wexler, J., Reif, E., ... & Dixon, L. (2024, May). LLM Comparator: Visual analytics for side-by-side evaluation of large language models. In Extended Abstracts of the CHI Conference on Human Factors in Computing Systems (pp. 1-7).
