The author attended PyData Berlin 2024, where Polars was a hot topic among attendees. Polars is best known for features like lazy execution and query optimization, but one of its innovations often goes unnoticed: non-elementary group-by aggregations, meaning aggregations that do more than apply a single built-in reduction to a column. The author explains group-by operations and compares elementary aggregations in pandas and Polars. They then show that pandas struggles to express more complex operations efficiently, while Polars handles non-elementary aggregations cleanly and efficiently through expressions. The takeaway is that dataframe libraries should innovate in their APIs to enable new possibilities rather than blindly copy existing ones. The article is a plea to future dataframe authors to prioritize syntax and functionality over imitation.
https://labs.quansight.org/blog/dataframe-group-by