The Center for Digital Trust (C4DT) hosted the hands‑on workshop “LLM Benchmarking” on May 19th, 2026. Thirteen attendees from our partners ELCA, Kudelski, SICPA and Swisscom gathered to learn how to use benchmarks to choose the best LLM for their use cases. The day‑long event started off with a lecture by Dr. Anna Sotnikova from EPFL’s NLP laboratory, aimed at decision makers and engineers alike. Lively discussions took place as participants seized the occasion to ask questions and share their perspectives. In the afternoon, the engineers took part in an exercise session prepared by C4DT.
Dr. Anna Sotnikova started her lecture by clearing up common misconceptions about LLM benchmarking, such as the informative value of public benchmarks. She then discussed the objectives of LLM evaluation and how they inform which approach to choose. Afterwards, she gave an overview of these different approaches and concluded by stressing the importance of benchmarks tailored to one’s use case.

In the afternoon, the participants worked in groups to develop benchmarks for predefined use cases. They used general benchmarks to preselect models, then formulated success and failure criteria together with additional requirements. The groups then identified the required model capabilities and formulated evaluation examples.

The exercise session concluded with each group presenting its results and discussing what it had learned about LLM benchmarking.