Finding bugs in distributed systems is hard: components interact nondeterministically, and a failure seen once is often difficult to reproduce. Deterministic Simulation Testing (DST) addresses this by isolating the nondeterministic parts of a distributed system so that tests run under controlled, seeded randomness. The technique, used by companies like FoundationDB and by individuals like Tyler Neely and Pekka Enberg, runs multiple systems that communicate with each other on a single thread. Because the simulation controls both randomness and time, a run that reaches a bad state can be recreated. DST is not a cure-all, however: it only finds bugs that its workloads exercise, so it requires creative workloads and a thorough understanding of system behavior. Jepsen, while useful, does not offer deterministic execution the way DST does.
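The idea of running several communicating systems on one thread, with randomness and time under the test's control, can be sketched roughly as follows. This is a toy illustration, not the approach of any particular project: the names `Simulator`, `PingPong`, and `run_simulation`, the 20% drop rate, and the 1 to 10 tick delays are all assumptions made up for this example. The only sources of nondeterminism are a seeded `random.Random` and a virtual clock, so the same seed should always produce the same trace.

```python
import heapq
import random


class Simulator:
    """Hypothetical single-threaded network simulator with a virtual clock."""

    def __init__(self, seed):
        self.rng = random.Random(seed)  # all randomness flows from this seed
        self.now = 0                    # virtual time, never the wall clock
        self.queue = []                 # heap of (deliver_at, seq, dest, message)
        self.seq = 0                    # tie-breaker so heap ordering is total
        self.nodes = {}
        self.trace = []                 # record of every drop and delivery

    def send(self, dest, message):
        # Fault injection: drop some messages (assumed 20% rate).
        if self.rng.random() < 0.2:
            self.trace.append((self.now, "drop", dest, message))
            return
        # Simulated network latency in virtual ticks (assumed 1 to 10).
        delay = self.rng.randint(1, 10)
        heapq.heappush(self.queue, (self.now + delay, self.seq, dest, message))
        self.seq += 1

    def run(self, max_events=1000):
        events = 0
        while self.queue and events < max_events:
            # Jump the virtual clock straight to the next delivery time.
            self.now, _, dest, message = heapq.heappop(self.queue)
            self.trace.append((self.now, "deliver", dest, message))
            self.nodes[dest].on_message(self, message)
            events += 1
        return self.trace


class PingPong:
    """Toy node: forwards an incremented counter to its peer until a limit."""

    def __init__(self, peer, limit):
        self.peer = peer
        self.limit = limit

    def on_message(self, sim, n):
        if n < self.limit:
            sim.send(self.peer, n + 1)


def run_simulation(seed):
    sim = Simulator(seed)
    sim.nodes["a"] = PingPong(peer="b", limit=20)
    sim.nodes["b"] = PingPong(peer="a", limit=20)
    sim.send("b", 0)  # kick off the exchange
    return sim.run()


if __name__ == "__main__":
    first = run_simulation(seed=42)
    second = run_simulation(seed=42)
    print("same seed, same trace:", first == second)
    for event in first:
        print(event)
```

If a particular seed leads to a bad state, rerunning with that seed replays the same sequence of drops and deliveries, which is what makes the failure debuggable. A real system would also need its own I/O, timers, and any other nondeterminism routed through the simulator for this property to hold.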
https://notes.eatonphil.com/2024-08-20-deterministic-simulation-testing.html