In this study, we explore the fractal nature of the boundary between stable and divergent training behavior in neural networks. Drawing a parallel between the iterative computation of fractals such as the Mandelbrot set and the iterative updates of neural network training, we investigate how sensitively training outcomes depend on small changes in hyperparameters. Surprisingly, we find that this boundary exhibits fractal characteristics across a wide range of scales in all tested configurations, suggesting that neural network training can display highly complex and unpredictable behavior, much like certain fractal-generating processes. This finding could have significant implications for optimizing and understanding neural network performance.
https://arxiv.org/abs/2402.06184
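
To illustrate the parallel, the sketch below is a minimal, hypothetical setup (not the paper's exact experiment): a tiny one-hidden-layer tanh network is trained with full-batch gradient descent while the two layer-wise learning rates are swept over a grid, and each (lr1, lr2) pair is marked as convergent or divergent. The boundary of such a stability map is the kind of object one would then examine at finer and finer scales.

```python
# Minimal sketch (assumed setup, not the paper's exact configuration):
# sweep two layer-wise learning rates and record whether full-batch
# gradient descent on a tiny tanh network converges or diverges.
import numpy as np

np.seterr(all="ignore")  # divergent runs overflow; we detect that explicitly


def diverges(lr1, lr2, steps=200, seed=0):
    rng = np.random.default_rng(seed)
    # Fixed tiny dataset and initialization, so only the learning rates vary.
    X = rng.standard_normal((16, 8))
    y = rng.standard_normal((16, 1))
    W1 = rng.standard_normal((8, 8)) / np.sqrt(8)
    W2 = rng.standard_normal((8, 1)) / np.sqrt(8)
    for _ in range(steps):
        h = np.tanh(X @ W1)              # hidden activations
        pred = h @ W2
        err = pred - y
        # Backpropagate the mean-squared-error loss.
        gW2 = h.T @ err / len(X)
        gh = (err @ W2.T) * (1.0 - h**2)
        gW1 = X.T @ gh / len(X)
        W1 -= lr1 * gW1
        W2 -= lr2 * gW2
        loss = float(np.mean(err**2))
        if not np.isfinite(loss) or loss > 1e6:
            return True                  # training blew up
    return False


# Build the 2D stability map; zooming into its stable/divergent boundary
# is how one would look for structure at ever finer scales.
lrs = np.linspace(0.01, 3.0, 64)
grid = np.array([[diverges(a, b) for b in lrs] for a in lrs])
print(f"divergent fraction: {grid.mean():.2f}")
```

In this sketch the per-layer learning rates play the role of the complex parameter c in the Mandelbrot iteration: both define a map that is iterated many times, and the convergent/divergent boundary is traced out by sweeping that parameter.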