Lukas Fluri
Publications
Lukas Fluri, Leon Lang, Alessandro Abate, Patrick Forré, David Krueger, Joar Skalse (2025). The Perils of Optimizing Learned Reward Functions: Low Training Error Does Not Guarantee Low Regret. ICML 2025.
Lukas Fluri, Daniel Paleka, Florian Tramèr (2024). Evaluating Superhuman Models with Consistency Checks. SaTML 2024.