At Qovery, the engine-gateway service had been running smoothly until it crashed unexpectedly with an out-of-memory (OOM) error. Monitoring showed stable memory consumption, yet the service was abruptly restarted for exceeding its configured memory limit. After an investigation that included digging into kernel messages, the OOM was traced to a subtle difference in how errors were logged: the anyhow library was capturing backtraces and symbolizing them unnecessarily, which caused a sudden memory surge. The issue was resolved by profiling memory with jemalloc and changing environment variables. The incident highlights the importance of reading documentation closely and shows how monitoring can sometimes mislead.
https://www.qovery.com/blog/rust-investigating-a-strange-out-of-memory-error/
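The summary does not say which environment variables were changed. As a hedged illustration, the sketch below mirrors the rule the Rust standard library documents for `std::backtrace`, which anyhow's documentation points to for enabling backtraces: `RUST_LIB_BACKTRACE` takes precedence when set, otherwise `RUST_BACKTRACE` decides, and any value other than `0` enables capture. That these two variables are the ones involved in the incident is an assumption, and the function names are hypothetical.

```rust
use std::env;

/// Hypothetical helper mirroring the documented `std::backtrace` rule:
/// `RUST_LIB_BACKTRACE` wins when it is set; otherwise `RUST_BACKTRACE`
/// is consulted. Any value other than "0" enables capture.
fn lib_backtrace_enabled(rust_lib_backtrace: Option<&str>, rust_backtrace: Option<&str>) -> bool {
    match rust_lib_backtrace.or(rust_backtrace) {
        Some(value) => value != "0",
        None => false, // neither variable set: capture is disabled
    }
}

/// Applies the same rule to the current process environment.
/// Variables that are unset or not valid Unicode are treated as absent here.
fn lib_backtrace_enabled_from_env() -> bool {
    let lib = env::var("RUST_LIB_BACKTRACE").ok();
    let bt = env::var("RUST_BACKTRACE").ok();
    lib_backtrace_enabled(lib.as_deref(), bt.as_deref())
}
```

Under this rule, running a service with `RUST_BACKTRACE=1` and `RUST_LIB_BACKTRACE=0` keeps backtraces for panics while disabling them for errors created by libraries, so logging an error no longer triggers symbolization.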