Fire-Flyer File System from DeepSeek

Fire-Flyer File System, or 3FS, is a high-performance distributed file system designed for AI training and inference workloads. Built on modern SSDs and RDMA networks, it gives distributed applications a shared storage layer so they do not have to manage data placement themselves. Its disaggregated architecture pools the throughput of many SSDs and the network bandwidth of many storage nodes, and applications can reach that storage in a locality-oblivious manner. Strong consistency is provided by Chain Replication with Apportioned Queries (CRAQ), and the stateless metadata services are backed by a transactional key-value store. The system targets several workloads: data preparation, random access to training samples, and parallel checkpointing. For inference, it can also hold the KVCache, which stores key and value vectors so they do not have to be recomputed. In GraySort tests, 3FS sorted 110.5 TiB of data across 8,192 partitions in just over 30 minutes, an average throughput of 3.66 TiB/min. The documentation provides a setup guide, design notes, and API references.
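
To make the CRAQ idea mentioned above more concrete, here is a minimal in-process sketch of the general technique. It is a toy model, not 3FS code or its API: the `Chain` class and its `forward`, `complete`, `write`, and `read` methods are hypothetical names invented for illustration, and the tail's commit and the acknowledgements that travel back up the chain are collapsed into a single step. The point it shows is that any replica may serve a read, and a replica holding an uncommitted ("dirty") version asks the tail which version is committed before answering.

```python
class Chain:
    """Toy model of a CRAQ replica chain (hypothetical, not a 3FS type)."""

    def __init__(self, size):
        # Per replica: key -> {version: value}. Index 0 is the head,
        # the last index is the tail.
        self.replicas = [{} for _ in range(size)]
        # Per replica: key -> newest version this replica knows is committed.
        self.committed = [{} for _ in range(size)]
        self.next_version = 1

    def forward(self, key, value, reach):
        """Send a write from the head through the first `reach` replicas.

        The new version is stored but stays dirty until `complete` runs,
        which lets us model a write that is still in flight.
        """
        version = self.next_version
        self.next_version += 1
        for replica in self.replicas[:reach]:
            replica.setdefault(key, {})[version] = value
        return version

    def complete(self, key, version):
        """Deliver the write to every replica and mark it committed.

        Simplification: the tail commit and the backward acknowledgements
        happen here as one atomic step.
        """
        value = self.replicas[0][key][version]
        for replica, committed in zip(self.replicas, self.committed):
            versions = replica.setdefault(key, {})
            versions[version] = value
            committed[key] = version
            # Older versions are no longer needed once this one is clean.
            for old in [v for v in versions if v < version]:
                del versions[old]

    def write(self, key, value):
        """A write that runs to completion without interruption."""
        version = self.forward(key, value, reach=1)
        self.complete(key, version)
        return version

    def read(self, key, index):
        """Read `key` from the replica at position `index`."""
        versions = self.replicas[index].get(key, {})
        if not versions:
            return None
        newest = max(versions)
        if self.committed[index].get(key) == newest:
            return versions[newest]  # clean: answer locally
        # Dirty: ask the tail which version is committed, then serve that one.
        tail_version = self.committed[-1].get(key)
        return versions.get(tail_version)


chain = Chain(3)
chain.write("sample", "v1")
in_flight = chain.forward("sample", "v2", reach=2)  # tail has not seen v2 yet
before = [chain.read("sample", i) for i in range(3)]
chain.complete("sample", in_flight)
after = [chain.read("sample", i) for i in range(3)]
```

In this sketch, `before` holds the old value at every replica even though two of them already store the new one, and `after` holds the new value everywhere, which is the strong-consistency property the design relies on while still spreading reads across the chain.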

https://github.com/deepseek-ai/3FS
