Link to paper The full paper is available here.
You can also find the paper on PapersWithCode here.
Abstract Close-up facial images often have perspective distortion. Proposed method for correcting perspective distortion in a single close-up face. Method uses GAN inversion and joint optimization of camera parameters and face latent code. Method uses focal length reparametrization, optimization scheduling, and geometric regularization. Results show improved visual quality compared to previous approaches. Paper Content Introduction Millions of people take smartphone selfies every day Smartphones have high-quality cameras Selfies suffer from perspective distortion Perspective distortion makes faces look unnatural and asymmetric Existing methods aim to correct distortion using reconstruction-based and learning-based warping 3D GAN inversion proposed to correct distortion 3D GAN inversion estimates facial geometry and camera-to-face distance Optimization of parameters is ill-posed Three designs proposed to address problem Quantitative evaluation protocol established Related work Selfie photos taken from close distances often exhibit perspective distortions People are bothered by distorted facial features Existing smartphones attempt to persuade people to take selfies from a longer distance Existing perspective distortion methods have difficulty handling severe distortions 3D face reconstruction from a single image is challenging Existing methods are limited to reconstructing only the face Prior works focus on normalizing head pose Conditional generative models learn a face-specific GAN to generate a target face pose 2D GAN inversion methods optimize the latent code for a single image 3D GAN inversion approaches optimize the face latent code and part of the camera parameters Jointly estimating face shape, camera-to-face distance, and focal length is challenging Method Aim to manipulate camera-to-subject distance of single close-up face portrait Propose 3D GAN inversion to invert portrait to corresponding face latent code and camera parameters Adjust camera parameters according to user preference, especially camera-to-subject distance and focal length Develop workflow to warp and blend regions to compose full-frame image/video Preliminary StyleGAN maps random samples from a normal distribution to an intermediate latent vector 3D GAN uses additional camera parameters and a neural render to generate the final image Training and inversion of 3D GANs require aligning and cropping the face Perspective-aware 3d gan inversion 3D GAN with additional camera parameters can enable camera-controllable image generation Inversion process is complicated when using single-face image Problem is ill-posed, meaning multiple combinations of focal length, camera-to-subject distance, and face shape can match input image Existing 3D GAN inversions focus on far camera-to-subject distances Accurate estimation of both camera-to-subject distance and focal length is necessary for near-range camera-to-subject distances Focal length reparameterization, optimization scheduling, and landmark regularization proposed to ease ill-posedness and improve facial geometry and rendering results Start from close camera-to-subject distance to ease optimization Optimization of face and camera parameters is asynchronous Uncertainty-based landmark loss used to increase sensibility to camera-to-subject variation Stitching 3D GAN inversion method can manipulate camera distance and focal length to render virtual images System developed to stitch reprojected face with original full image Algorithm aligns and blends depth from 3D GAN and depth estimated for full image Entire image projected to far distance using same camera parameter as 3D GAN Generator fine-tuned to make border of synthesis close to warped full image Refined synthetic far image and warped full image blended to produce complete image Implementation details Learning rates set to 1x10-2, 5x10-3, and 3x10-4 EG3D pretrained on FFHQ used in experiments Camera parameters initialized using Deng et al....