Increased efficiency can paradoxically lead to worse outcomes, a phenomenon captured by the strong version of Goodhart's law: when a measure becomes a target, over-optimizing the proxy does not merely stop helping the true goal, it actively makes it worse. This shows up across many fields; in machine learning it takes the familiar form of overfitting, where driving training loss ever lower eventually degrades performance on held-out data. Pushed to an extreme, proxy over-optimization motivates thought experiments like an artificial intelligence converting the solar system into paperclips in pursuit of a narrow objective. Mitigations for overfitting carry over to Goodhart's law more broadly: better aligning the proxy with the desired outcome, adding regularization penalties, injecting noise into the system, and stopping optimization early. Adjusting capacity cuts both ways: restricting a system's capacity can keep it from fitting noise, while in some regimes increasing capacity also reduces overfitting. Translating these mitigations beyond machine learning remains an open challenge and a research opportunity across numerous disciplines.
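The proxy-versus-goal dynamic is easy to reproduce in a toy setting. The sketch below (an illustrative assumption, not code from the post; the sine-curve data, polynomial degree, and penalty values are all made up for the example) fits a high-capacity polynomial to noisy samples. Training loss plays the role of the proxy and validation loss stands in for the true goal: as the L2 regularization penalty shrinks, the proxy keeps improving while the true goal eventually gets worse.

```python
# Minimal sketch of strong Goodhart as overfitting (assumed setup, not from the post).
import numpy as np

rng = np.random.default_rng(0)

def make_split(n):
    # Noisy samples of a smooth ground-truth function.
    x = rng.uniform(-1, 1, n)
    y = np.sin(3 * x) + rng.normal(0, 0.3, n)
    return x, y

def features(x, degree=15):
    # High-capacity polynomial basis: enough freedom to fit the noise.
    return np.vander(x, degree + 1)

x_tr, y_tr = make_split(30)    # small training set (the proxy is measured here)
x_va, y_va = make_split(200)   # validation set (stands in for the true goal)
A_tr, A_va = features(x_tr), features(x_va)

def ridge_fit(A, y, lam):
    # Closed-form ridge regression: w = (A^T A + lam I)^{-1} A^T y
    return np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ y)

# Weakening the regularization penalty "optimizes the proxy harder":
# training error falls monotonically, validation error is U-shaped.
for lam in [1e1, 1e-1, 1e-3, 1e-6, 1e-9]:
    w = ridge_fit(A_tr, y_tr, lam)
    train = np.mean((A_tr @ w - y_tr) ** 2)   # proxy: keeps improving
    valid = np.mean((A_va @ w - y_va) ** 2)   # true goal: eventually worsens
    print(f"lam={lam:g}  train={train:.4f}  valid={valid:.4f}")
```

The same plot read left to right is early stopping in disguise: picking the penalty (or stopping time) that minimizes validation error, rather than training error, is exactly the "don't over-optimize the proxy" advice above.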
https://sohl-dickstein.github.io/2022/11/06/strong-Goodhart.html