Missing data is a problem across the social and biological sciences, and statisticians tackle it with randomness. Stef van Buuren highlights the limitations of standard statistical tools when data is incomplete. Simply discarding incomplete records is no fix, because it can bias the results. Donald Rubin introduced multiple imputation in the 1970s, aiming for a method that was both accurate and general. The technique fills in each missing value with several plausible guesses, producing several completed datasets. Each dataset is analyzed separately and the results are then combined, so that the spread among the guesses reflects the uncertainty about the missing values. While some researchers initially resisted the approach, it has become the go-to method for dealing with missing data in many fields, including medicine. Multiple imputation remains widespread, supporting more accurate and honest analyses.
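
The guess-several-times idea summarized above can be sketched in a few lines. This is a minimal illustration, not code from the article: the function names (`pool_rubin`, `impute_once`, `multiply_impute_mean`) are hypothetical, and the imputation model (a normal distribution fitted to a bootstrap resample of the observed values) is an assumption chosen for brevity. The pooling step follows the combining rules usually attributed to Rubin: average the per-dataset estimates, then add the between-dataset variance to the average within-dataset variance.

```python
import random
import statistics


def pool_rubin(estimates, variances):
    """Combine m per-dataset estimates and their variances (needs m >= 2)."""
    m = len(estimates)
    q_bar = statistics.fmean(estimates)   # pooled point estimate
    u_bar = statistics.fmean(variances)   # average within-imputation variance
    b = statistics.variance(estimates)    # between-imputation variance
    total_var = u_bar + (1 + 1 / m) * b   # total variance of the pooled estimate
    return q_bar, total_var


def impute_once(values, rng):
    """Return one completed copy of `values`, where None marks a missing entry.

    Assumption for this sketch: missing entries are drawn from a normal
    distribution fitted to a bootstrap resample of the observed entries,
    so that each completed dataset uses slightly different parameters.
    """
    observed = [v for v in values if v is not None]
    boot = [rng.choice(observed) for _ in observed]
    mu = statistics.fmean(boot)
    sigma = statistics.stdev(boot)
    return [rng.gauss(mu, sigma) if v is None else v for v in values]


def multiply_impute_mean(values, m=20, seed=0):
    """Estimate the mean of `values` from m imputed datasets."""
    rng = random.Random(seed)
    estimates, variances = [], []
    for _ in range(m):
        completed = impute_once(values, rng)
        estimates.append(statistics.fmean(completed))
        # Variance of the sample mean within this completed dataset.
        variances.append(statistics.variance(completed) / len(completed))
    return pool_rubin(estimates, variances)


if __name__ == "__main__":
    data = [1.0, 2.0, None, 4.0, None, 6.0]
    mean, var = multiply_impute_mean(data, m=20, seed=0)
    print(f"pooled mean = {mean:.3f}, standard error = {var ** 0.5:.3f}")
```

The between-dataset term is the point of the exercise: a single guess would be treated as if it were real data and would understate the uncertainty, whereas disagreement among the guesses widens the reported standard error.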
https://www.quantamagazine.org/when-data-is-missing-scientists-guess-then-guess-again-20241002/