Update: Added support for Stable Diffusion XL 1.0 Base, which now runs on a Raspberry Pi Zero 2. The original challenge was to run Stable Diffusion, which includes a large transformer model with almost 1 billion parameters, on a microcomputer with only 512MB of RAM, without adding swap space or offloading intermediate results to disk. To achieve this, the author developed OnnxStream, a small and hackable inference library focused on minimizing memory consumption. OnnxStream can use significantly less memory than OnnxRuntime, at the cost of somewhat slower inference. The author also implemented optimizations specific to Stable Diffusion XL 1.0, including attention slicing and tiled decoding, which further reduce peak memory consumption.
https://github.com/vitoplantamura/OnnxStream/tree/846da873570a737b49154e8f835704264864b0fe
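To make the two memory-saving ideas named above more concrete, here is a minimal pure-Python sketch. It is not OnnxStream's code (OnnxStream is a C++ library), and every function name below is hypothetical. Attention slicing processes the queries in small groups so that only a few rows of the attention score matrix exist at once. Tiled decoding runs the decoder on one tile of the latent at a time. The "decoder" here is a stand-in nearest-neighbour upscaler, chosen because it is purely local; a real VAE decoder uses convolutions that look across tile edges, so practical implementations typically have to handle tile borders (for example by overlapping tiles), which this sketch does not attempt.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q, k, v):
    """Plain attention: builds the full len(q) x len(k) score matrix."""
    scale = 1.0 / math.sqrt(len(q[0]))
    scores = [[scale * sum(a * b for a, b in zip(qi, kj)) for kj in k] for qi in q]
    weights = [softmax(row) for row in scores]
    return [[sum(w * vj[d] for w, vj in zip(row, v)) for d in range(len(v[0]))]
            for row in weights]


def sliced_attention(q, k, v, slice_size):
    """Same output as attention(), but at most slice_size score rows are live at once."""
    out = []
    for start in range(0, len(q), slice_size):
        out.extend(attention(q[start:start + slice_size], k, v))
    return out


def upscale(latent, factor):
    """Stand-in 'decoder' (an assumption): nearest-neighbour upscaling of a 2D grid."""
    return [[x for x in row for _ in range(factor)]
            for row in latent for _ in range(factor)]


def tiled_decode(latent, tile, factor):
    """Decode tile x tile blocks of the latent one at a time and stitch the results."""
    h, w = len(latent), len(latent[0])
    out = [[None] * (w * factor) for _ in range(h * factor)]
    for ty in range(0, h, tile):
        for tx in range(0, w, tile):
            block = [row[tx:tx + tile] for row in latent[ty:ty + tile]]
            decoded = upscale(block, factor)
            for dy, drow in enumerate(decoded):
                x0 = tx * factor
                out[ty * factor + dy][x0:x0 + len(drow)] = drow
    return out
```

In both cases the trade-off is the one the summary describes: peak memory drops because only a slice or a tile is materialized at a time, while total work stays the same or grows slightly due to the extra passes.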