GRAIL: UW Graphics and Imaging Laboratory (grail.cs.washington.edu)
Abstract We introduce a free-viewpoint rendering method -- HumanNeRF -- that works on a given monocular video of a human performing complex body motions, e.g. a video from YouTube. Our method enables pausing the video at any frame and rendering the subject from arbitrary new camera viewpoints or even a full 360-degree camera path for that particular frame and body pose. This task is particularly challenging, as it requires synthesizing photorealistic details of the body as seen from camera angles that may not exist in the input video.
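HumanNeRF builds on neural radiance field rendering, where each camera ray is rendered by alpha-compositing per-sample densities and colors. Below is a minimal numpy sketch of that core compositing step; the function name and sample layout are illustrative, not the paper's implementation.

```python
import numpy as np

def volume_render_ray(sigmas, colors, deltas):
    """Alpha-composite samples along one camera ray, NeRF-style.

    sigmas: (N,) volume densities at the samples
    colors: (N, 3) RGB values at the samples
    deltas: (N,) spacing between consecutive samples
    """
    alphas = 1.0 - np.exp(-sigmas * deltas)            # per-sample opacity
    # Transmittance: probability the ray reaches each sample unoccluded.
    trans = np.cumprod(np.concatenate(([1.0], 1.0 - alphas[:-1])))
    weights = trans * alphas
    return (weights[:, None] * colors).sum(axis=0)     # rendered RGB
```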
European Conference on Computer Vision (ECCV), 2020 We introduce a fully automatic pipeline for inferring depth, occlusion, and lighting/shadow information from image sequences of a scene. All of this information is extracted just by using people (and other objects such as cars) as scene probes to passively scan the scene. We also develop a tool for image compositing based on the inferred depth, occlusion, and lighting/shadow information.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020 Using a handheld smartphone camera, we capture two images of a scene, one with the subject and one without. We employ a deep network with an adversarial loss to recover alpha matte and foreground color. We composite the result onto a novel background. Abstract We propose a method for creating a matte – the per-pixel foreground color and alpha – of a person.
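Once the per-pixel foreground color and alpha are recovered, compositing the subject onto a novel background is the standard over operation, C = alpha * F + (1 - alpha) * B. A minimal numpy sketch (function name hypothetical):

```python
import numpy as np

def composite(fg, alpha, new_bg):
    """Over-compositing: C = alpha * F + (1 - alpha) * B.

    fg, new_bg: float arrays of shape (H, W, 3) in [0, 1]
    alpha:      float array of shape (H, W) in [0, 1]
    """
    a = alpha[..., None]                  # broadcast alpha over color channels
    return a * fg + (1.0 - a) * new_bg
```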
Chung-Yi Weng, Brian Curless, Ira Kemelmacher-Shlizerman (University of Washington; Facebook, Inc.) Given a single photo as input (far left), we create a 3D animatable version of the subject, which can now walk towards the viewer (middle). The 3D result can be experienced in augmented reality (right); in the result above the user has virtually hung the artwork with a HoloLens headset and can watch the subject walk out of the photo.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018 Abstract We present a system that transforms a monocular video of a soccer game into a moving 3D reconstruction, in which the players and field can be rendered interactively with a 3D viewer or through an Augmented Reality device. At the heart of our paper is an approach to estimate the depth map of each player, using a CNN that is trained on 3D player data extracted from soccer video games.
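Turning an estimated per-player depth map into renderable 3D content comes down to unprojecting pixels through the camera intrinsics. A hedged sketch of that standard lifting step (pinhole model; names are illustrative, not the paper's code):

```python
import numpy as np

def unproject_depth(depth, fx, fy, cx, cy):
    """Lift a depth map to a 3D point cloud with pinhole intrinsics.

    depth:          (H, W) metric depth per pixel
    fx, fy, cx, cy: focal lengths and principal point in pixels
    returns:        (H, W, 3) camera-space points
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))   # pixel coordinates
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1)
```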
Given audio of President Barack Obama, we synthesize a high quality video of him speaking with accurate lip sync, composited into a target video clip. Trained on many hours of his weekly address footage, a recurrent neural network learns the mapping from raw audio features to mouth shapes. Given the mouth shape at each time instant, we synthesize high quality mouth texture, and composite it with proper 3D pose matching into the target video.
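As a rough illustration of the audio-to-mouth mapping described above, the sketch below wires an LSTM from per-frame audio features to low-dimensional mouth-shape coefficients. All layer sizes and names are hypothetical assumptions for illustration, not the paper's configuration.

```python
import torch
import torch.nn as nn

class AudioToMouth(nn.Module):
    """Recurrent mapping from audio features to mouth-shape coefficients.

    n_audio: per-frame audio feature size (e.g. MFCC dimension)
    n_shape: size of the mouth-shape representation (e.g. PCA coefficients)
    """

    def __init__(self, n_audio=28, n_hidden=60, n_shape=18):
        super().__init__()
        self.rnn = nn.LSTM(n_audio, n_hidden, batch_first=True)
        self.out = nn.Linear(n_hidden, n_shape)

    def forward(self, audio_feats):        # (batch, time, n_audio)
        h, _ = self.rnn(audio_feats)       # (batch, time, n_hidden)
        return self.out(h)                 # (batch, time, n_shape)
```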
Ricardo Martin-Brualla, David Gallup, Steven M. Seitz (University of Washington; Google Inc.) Overview We introduce an approach for synthesizing time-lapse videos of popular landmarks from large community photo collections. The approach is completely automated and leverages the vast quantity of photos available online. First, we cluster 86 million photos into landmarks and popular viewpoints.
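Once photos are clustered, sorted by date, and warped to a common viewpoint, a robust temporal filter can turn the noisy photo stack into a smooth time-lapse. The sliding median below is one simple stand-in for that stabilization step, an assumption for illustration rather than necessarily the paper's filter:

```python
import numpy as np

def sliding_median_timelapse(frames, window=7):
    """Temporal median over a stack of aligned, date-sorted photos.

    frames: (T, H, W, 3) float array, already warped to one viewpoint
    window: odd temporal window size; suppresses transient occluders
    """
    T = len(frames)
    half = window // 2
    out = np.empty_like(frames)
    for t in range(T):
        lo, hi = max(0, t - half), min(T, t + half + 1)
        out[t] = np.median(frames[lo:hi], axis=0)
    return out
```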
We present an approach that takes a single video of a person's face and reconstructs a high detail 3D shape for each video frame. We target videos taken under uncontrolled and uncalibrated imaging conditions, such as YouTube videos of celebrities. At the heart of this work is a new dense 3D flow estimation method coupled with shape from shading. Unlike related works, we do not assume availability of a pre-scanned model of the subject's face.
Description We present an approach that takes a single photograph of a child as input and automatically produces a series of age-progressed outputs between 1 and 80 years of age, accounting for pose, expression, and illumination.
The emergence of multi-core computers represents a fundamental shift, with major implications for the design of computer vision algorithms. Most computers sold today have a multicore CPU with 2-16 cores and a GPU with anywhere from 4 to 128 cores. Exploiting this hardware parallelism will be key to the success and scalability of computer vision algorithms in the future. In this project, we consider how computer vision algorithms can be redesigned to take advantage of this parallelism.
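A minimal example of the data-parallel decomposition this hardware rewards: split an image into strips and process them on separate CPU cores. The per-tile kernel here is a stand-in gradient filter, and strip-boundary effects are ignored for brevity; this is a sketch of the pattern, not any specific algorithm from the project.

```python
import numpy as np
from multiprocessing import Pool

def tile_kernel(tile):
    """Stand-in per-tile vision kernel: gradient magnitude."""
    gy, gx = np.gradient(tile.astype(float))
    return np.hypot(gx, gy)

def parallel_filter(img, n_workers=8):
    """Split an image into horizontal strips and farm them out to cores."""
    strips = np.array_split(img, n_workers, axis=0)
    with Pool(n_workers) as pool:
        return np.vstack(pool.map(tile_kernel, strips))

# On spawn-based platforms, call parallel_filter from under
# `if __name__ == "__main__":` so worker processes import cleanly.
```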
PMVS is a multi-view stereo software package that takes a set of images and camera parameters, then reconstructs the 3D structure of an object or a scene visible in the images. Only rigid structure is reconstructed; in other words, the software automatically ignores non-rigid objects such as pedestrians in front of a building. The software outputs a set of oriented points instead of a polygonal (or mesh) model, where both a 3D coordinate and a surface normal are estimated at each oriented point.
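For concreteness, an oriented point in this kind of output couples a position with an estimated surface normal. The record below is a hypothetical illustration of that structure, not the PMVS file format:

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class OrientedPoint:
    """One element of a PMVS-style point set (illustrative)."""
    position: np.ndarray   # (3,) world-space coordinate
    normal: np.ndarray     # (3,) unit surface normal at the point
    color: np.ndarray      # (3,) averaged RGB in [0, 1]
```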
Abstract This paper proposes a fully automated 3D reconstruction and visualization system for architectural scenes (interiors and exteriors). The reconstruction of indoor environments from photographs is particularly challenging due to texture-poor planar surfaces such as uniformly-painted walls. Our system first uses structure-from-motion, multi-view stereo, and a stereo algorithm specifically designed for Manhattan-world scenes.
Entering the search term Rome on Flickr returns more than two million photographs. This collection represents an increasingly complete photographic record of the city, capturing every popular site, facade, interior, fountain, sculpture, painting, cafe, and so forth. It also offers us an unprecedented opportunity to richly capture, explore and study the three dimensional shape of the city. In this project we consider the problem of reconstructing entire cities from images harvested from the web.
Abstract Multi-view stereo (MVS) algorithms now produce reconstructions that rival laser range scanner accuracy. However, stereo algorithms require textured surfaces, and therefore work poorly for many architectural scenes (e.g., building interiors with textureless, painted walls). This paper presents a novel MVS approach to overcome these limitations for Manhattan World scenes, i.e., scenes that consist of piece-wise planar surfaces with dominant directions.
Abstract We present a framework for automatically enhancing videos of a static scene using a few photographs of the same scene. For example, our system can transfer photographic qualities such as high resolution, high dynamic range and better lighting from the photographs to the video. Additionally, the user can quickly modify the video by editing only a few still images of the scene. Finally, our framework allows a user to remove unwanted objects and camera shake from the video.
Abstract We describe an interactive, computer-assisted framework for combining parts of a set of photographs into a single composite picture, a process we call "digital photomontage." Our framework makes use of two techniques primarily: graph-cut optimization, to choose good seams within the constituent images so that they can be combined as seamlessly as possible; and gradient-domain fusion, a process based on Poisson equations, to further reduce any remaining visible artifacts in the composite.
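The gradient-domain fusion step amounts to solving a Poisson equation: inside the composited region, solve for pixels whose discrete Laplacian matches the source image's, with boundary values supplied by the destination. A small sketch using Jacobi iterations rather than a direct solver (grayscale images; the mask is assumed to stay away from the image border, since np.roll wraps around):

```python
import numpy as np

def poisson_blend(src, dst, mask, n_iters=2000):
    """Gradient-domain fusion via Jacobi iterations on the Poisson equation.

    src, dst: (H, W) float images of equal shape
    mask:     (H, W) boolean region to blend (source gradients kept inside)
    """
    # Discrete Laplacian of the source: the guidance field to match.
    lap = (np.roll(src, 1, 0) + np.roll(src, -1, 0) +
           np.roll(src, 1, 1) + np.roll(src, -1, 1) - 4.0 * src)
    out = dst.astype(float).copy()
    inside = mask.astype(bool)
    for _ in range(n_iters):
        # Sum of 4-neighbors under the current estimate.
        nb = (np.roll(out, 1, 0) + np.roll(out, -1, 0) +
              np.roll(out, 1, 1) + np.roll(out, -1, 1))
        # Jacobi update: enforce sum(neighbors) - 4*out = lap inside the mask.
        out[inside] = (nb[inside] - lap[inside]) / 4.0
    return out
```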
ACM Doctoral Dissertation Award
- Seth Cooper (winner, 2011)
- Noah Snavely (honorable mention, 2009)
- Aseem Agarwala (honorable mention, 2006)

Sloan Research Fellowships
- Noah Snavely (2012)
- Karen Liu (2010)
- Li Zhang (2010)
- Brian Curless (2000)

TR35 Awards
- Adriana Schulz (2020)
- Noah Snavely (2011)
- Adrien Treuille (2009)
- Karen Liu (2008)

ACM SIGGRAPH Awards
- Noah Snavely – Significant New Researcher Award
We are exploring a strategy for searching through an image database, in which the query is expressed either as a low-resolution image from a scanner or video camera, or as a rough sketch painted by the user. Our searching algorithm makes use of multiresolution (wavelet) decompositions of the query and database images. The method is both effective and fast.
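The flavor of the method: take a Haar wavelet decomposition of each image, keep only the signs and positions of the few largest-magnitude coefficients as a compact signature, and score a query by counting matching coefficients. A simplified sketch with uniform match weights (the published method weights matches by frequency band):

```python
import numpy as np

def haar_1d(v):
    """1D Haar transform; length must be a power of two."""
    v = v.astype(float).copy()
    n = len(v)
    while n > 1:
        half = n // 2
        avg = (v[0:n:2] + v[1:n:2]) / 2.0
        dif = (v[0:n:2] - v[1:n:2]) / 2.0
        v[:half], v[half:n] = avg, dif
        n = half
    return v

def haar_2d(img):
    """Standard 2D decomposition: fully transform rows, then columns."""
    h = img.astype(float)
    for _ in range(2):                       # rows, then (via .T) columns
        h = np.array([haar_1d(row) for row in h]).T
    return h

def signature(img, m=60):
    """Signs and positions of the m largest-magnitude coefficients."""
    flat = haar_2d(img).ravel()
    keep = np.argsort(-np.abs(flat))[:m]
    return {int(i): float(np.sign(flat[i])) for i in keep}

def match_score(query_sig, db_sig):
    """Count coefficients agreeing in position and sign (uniform weights)."""
    return sum(1 for i, s in query_sig.items() if db_sig.get(i) == s)
```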