Magic123 is a two-stage, coarse-to-fine pipeline for generating high-quality, textured 3D meshes from a single unposed image by leveraging both 2D and 3D diffusion priors. In the first (coarse) stage, a neural radiance field is optimized to recover rough geometry; in the second (fine) stage, a memory-efficient differentiable mesh representation produces a high-resolution mesh with detailed texture. Both stages are guided by reference-view supervision together with diffusion guidance on novel views, and a single tradeoff parameter between the 2D and 3D priors controls how much the result favors exploration (the 2D prior) versus geometric precision (the 3D prior). Textual inversion and monocular depth regularization are additionally employed to encourage consistent appearance across views and to avoid degenerate geometry. In extensive experiments, Magic123 outperforms prior image-to-3D methods on both synthetic benchmarks and real-world images.
https://guochengqian.github.io/project/magic123/
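To make the joint-prior idea concrete, here is a minimal PyTorch sketch of how a single tradeoff weight could balance the two priors against reference-view reconstruction. The function names (`sds_loss_2d`, `sds_loss_3d`, `joint_loss`), the stub loss bodies, and the default weight are illustrative assumptions, not the authors' implementation; in the actual method the 2D term is a score-distillation loss from a text-to-image diffusion model and the 3D term comes from a view-conditioned diffusion prior.

```python
import torch

def sds_loss_2d(render: torch.Tensor) -> torch.Tensor:
    """Stub for the 2D prior term. A real implementation would run
    score distillation against a frozen text-to-image diffusion model
    (noise the render, predict the noise, penalize the residual)."""
    return render.square().mean()  # placeholder energy, not real SDS

def sds_loss_3d(render: torch.Tensor) -> torch.Tensor:
    """Stub for the 3D prior term, e.g. score distillation from a
    view-conditioned diffusion model. Also a placeholder."""
    return render.abs().mean()

def joint_loss(novel_view: torch.Tensor,
               ref_render: torch.Tensor,
               ref_image: torch.Tensor,
               lambda_3d: float = 1.0) -> torch.Tensor:
    """Combine reference-view reconstruction with the two priors.
    lambda_3d is the single tradeoff knob: larger values push toward
    3D-consistent, precise geometry; smaller values let the 2D prior
    'imagine' more detail. (The default here is arbitrary.)"""
    reconstruction = (ref_render - ref_image).square().mean()
    prior = sds_loss_2d(novel_view) + lambda_3d * sds_loss_3d(novel_view)
    return reconstruction + prior

if __name__ == "__main__":
    # Dummy differentiable 'scene' standing in for the NeRF (coarse
    # stage) or mesh renderer (fine stage).
    ref_image = torch.rand(1, 3, 64, 64)
    scene = torch.nn.Parameter(torch.rand(1, 3, 64, 64))
    optimizer = torch.optim.Adam([scene], lr=1e-2)

    loss = joint_loss(novel_view=scene, ref_render=scene,
                      ref_image=ref_image, lambda_3d=1.0)
    loss.backward()
    optimizer.step()
    print(f"joint loss: {loss.item():.4f}")
```

The design point this sketch illustrates is that exploration versus precision is exposed as one scalar rather than two independently tuned prior weights, so sweeping a single value moves the output along that axis.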