In a recent post, the author revisits a decade-old lecture on natural language processing and the role of softmax in machine learning. He highlights softmax's benefits, including its maximum-entropy justification and the interpretability of its learning signal, and then introduces a harmonic formulation as an alternative. Gradient-based optimization of the harmonic formulation proves difficult, with extreme behavior near the origin, though the author suggests iterative optimization methods may mitigate these issues. The post closes on a lighthearted note: perhaps there are fixes, or perhaps it's best to simply enjoy a Friday evening.
https://kyunghyuncho.me/softmax-forever-or-why-i-like-softmax/
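As a minimal sketch of the contrast the summary describes: the snippet below compares softmax with a hypothetical harmonic formulation that normalizes reciprocal magnitudes (p_i proportional to 1/|z_i|). This exact form is an assumption for illustration, not necessarily the one in the post, but it exhibits the kind of blow-up near the origin mentioned above.

```python
import numpy as np

def softmax(z):
    # Shift by the max for numerical stability; softmax is smooth
    # everywhere and its gradients stay bounded.
    e = np.exp(z - z.max())
    return e / e.sum()

def harmonic(z, eps=1e-12):
    # Hypothetical harmonic alternative (assumed form): weights
    # proportional to 1/|z_i|. As any z_i approaches 0, its weight
    # diverges, illustrating the extreme behavior near the origin.
    w = 1.0 / (np.abs(z) + eps)
    return w / w.sum()

z = np.array([2.0, 0.01, -1.0])
print(softmax(z))   # smooth, well-spread probabilities
print(harmonic(z))  # nearly all mass collapses onto the entry closest to zero
```

Under this assumed form, the divergence at z_i = 0 makes gradient-based optimization ill-conditioned near the origin, which is consistent with the difficulty the post describes.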