Apache DataFusion is an extensible query engine written in Rust that uses Apache Arrow as its in-memory format for high-performance data processing. The core project provides libraries and binaries that developers can use to build fast, feature-rich database and analytic systems tailored to specific needs. The project also includes several subprojects: DataFusion Python for SQL and DataFrame queries from Python, DataFusion Ray for distributed scaling on Ray clusters, and DataFusion Comet as a Spark accelerator. DataFusion has built-in support for common data formats such as CSV, Parquet, JSON, and Avro, and an active community. It can be customized at every level, from additional data sources to custom operators. See the architecture section for more information, and the user guide for examples and developer resources.
https://datafusion.apache.org/