Generative models such as VQGAN, StyleGAN-XL, and StyleGAN-v2 implicitly encode intrinsic scene properties, including surface normals, depth, albedo, and shading, according to INTRINSIC LoRA (I-LoRA), a method from researchers at the Toyota Technological Institute at Chicago and Adobe. The approach modulates key feature maps to extract scene intrinsics without adding any decoders. It is presented as a universal, plug-and-play method that turns a generative model into a scene intrinsic predictor while adding only a small number of parameters, and the resulting intrinsic maps are reported to surpass those of some leading supervised techniques. The paper argues that this is evidence that generative models internally represent intrinsic scene properties when they synthesize detailed, realistic images.
https://intrinsic-lora.github.io/
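To make "minimal parameter addition" concrete, here is a minimal NumPy sketch of the general low-rank adaptation (LoRA) idea that the method's name refers to. It is not the authors' implementation: the helper `lora_linear`, the layer sizes, and the rank are hypothetical values chosen only for illustration. A frozen weight matrix `W` is left untouched, and only two small matrices `A` and `B` would be trained.

```python
import numpy as np


def lora_linear(x, W, A, B, scale=1.0):
    """Apply a frozen weight W plus a trainable low-rank update B @ A.

    Hypothetical helper for illustration; not taken from the I-LoRA code.
    """
    return x @ (W + scale * (B @ A)).T


rng = np.random.default_rng(0)
d_out, d_in, rank = 512, 512, 4  # assumed sizes, not from the paper

W = rng.standard_normal((d_out, d_in))         # frozen pretrained weight
A = rng.standard_normal((rank, d_in)) * 0.01   # trainable, small random init
B = np.zeros((d_out, rank))                    # trainable, zero init

x = rng.standard_normal((2, d_in))
base_out = x @ W.T                     # output of the unmodified layer
adapted_out = lora_linear(x, W, A, B)  # output with the adapter attached

# With B initialised to zero, the adapter starts out as a no-op.
print(np.allclose(base_out, adapted_out))

# Count the extra trainable parameters relative to the frozen layer.
extra_params = A.size + B.size
fraction = extra_params / W.size
print(extra_params, fraction)
```

In this toy setting the adapter adds 4096 parameters to a layer of 262144, about 1.6% of that layer. The real overhead depends on the chosen rank and on which layers of the generator are adapted.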