This document summarizes recent developments in action recognition using deep learning techniques. It discusses early approaches using improved dense trajectories and two-stream convolutional neural networks. It then focuses on advances using 3D convolutional networks, enabled by large video datasets like Kinetics. State-of-the-art results are achieved using inflated 3D convolutional networks and temporal aggregation methods like temporal linear encoding. The document provides an overview of popular datasets and challenges and concludes with tips on training models at scale.
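The "inflated 3D convolutional networks" mentioned above (I3D) bootstrap 3D filters from pretrained 2D ones by tiling each 2D kernel along a new time axis and rescaling so activations are preserved on static video. A minimal sketch of that inflation step (function name is illustrative, not from the slides):

```python
import numpy as np

def inflate_2d_kernel(w2d, t):
    """Tile a 2D spatial kernel t times along a new time axis,
    dividing by t so a boring (static) video yields the same
    activations as the original 2D filter on a single frame."""
    return np.repeat(w2d[np.newaxis, ...], t, axis=0) / t

w2d = np.ones((3, 3))            # a 3x3 spatial kernel
w3d = inflate_2d_kernel(w2d, 3)  # a 3x3x3 spatio-temporal kernel
```

Because the kernel's total weight is unchanged, the inflated network starts from the 2D network's behavior and then fine-tunes on video.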
[DL Reading Group] NeRF-VAE: A Geometry Aware 3D Scene Generative Model (Deep Learning JP)
NeRF-VAE is a 3D scene generative model that combines Neural Radiance Fields (NeRF) and Generative Query Networks (GQN) with a variational autoencoder (VAE). It uses a NeRF decoder to generate novel views conditioned on a latent code. An encoder extracts latent codes from input views. During training, it maximizes the evidence lower bound to learn the latent space of scenes and allow for novel view synthesis. NeRF-VAE aims to generate photorealistic novel views of scenes by leveraging NeRF's view synthesis abilities within a generative model framework.
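The training objective described above is the standard evidence lower bound (ELBO): a reconstruction term for the decoded views minus a KL term pulling the latent posterior toward the prior. A toy sketch of that objective with a Gaussian posterior and unit-variance Gaussian likelihood (the actual NeRF-VAE decoder is a full NeRF renderer; `decode` here is a stand-in):

```python
import numpy as np

rng = np.random.default_rng(0)

def elbo(x, mu, log_var, decode):
    """One-sample Monte Carlo ELBO estimate for a VAE with
    Gaussian posterior q(z|x) = N(mu, diag(exp(log_var)))."""
    # reparameterized sample z ~ q(z|x)
    z = mu + np.exp(0.5 * log_var) * rng.standard_normal(mu.shape)
    x_hat = decode(z)
    # Gaussian reconstruction log-likelihood (unit variance, up to a constant)
    recon = -0.5 * np.sum((x - x_hat) ** 2)
    # KL(q(z|x) || N(0, I)) in closed form
    kl = 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var)
    return recon - kl
```

In NeRF-VAE, `decode` would render novel views from the latent code via volume rendering, so maximizing this bound trains both the encoder and the conditional NeRF decoder jointly.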
Study-group slides on "Detecting attended visual targets in video" (Yasunori Ozaki)
Summary slides on "Detecting attended visual targets in video", presented at the 3rd All-Japan Computer Vision Study Group (Part 2). The paper tackles the task of estimating which target a person in a video is paying attention to. Recommended for anyone interested in computer vision or cognitive science.
Updating 3D Live Performance Effects with Shaders: A Development Case Study from "Love Live! School Idol Festival ALL STARS" (SIFAS) (KLab Inc. / Tech)
This document discusses updates to 3D live performance rendering in Love Live! School Idol Festival All Stars (SIFAS) using shaders. It describes how vertex shaders were used to animate butterfly wings flapping and fans waving to reduce CPU load while maintaining production efficiency. Particle systems were combined with custom vertex streams and shader modifications to extend the single butterfly implementation to multiple butterflies. GPU instancing was also proposed as an alternative solution.
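At its core, the wing-flap effect described above is a per-vertex rotation about the wing hinge, driven by time, evaluated on the GPU instead of the CPU. A sketch of the underlying math in Python (the real implementation is a Unity vertex shader; the axis choice and parameter names here are illustrative assumptions):

```python
import numpy as np

def flap(vertex, t, freq=2.0, max_angle=np.radians(60.0)):
    """Rotate a wing vertex about the x-axis (the assumed hinge)
    by a sinusoidal, time-varying angle, mimicking what a vertex
    shader would compute per vertex per frame."""
    angle = max_angle * np.sin(2.0 * np.pi * freq * t)
    c, s = np.cos(angle), np.sin(angle)
    x, y, z = vertex
    return np.array([x, y * c - z * s, y * s + z * c])
```

Because the displacement depends only on the rest-pose vertex and the time uniform, no per-frame CPU skinning or animation update is needed, which is the CPU-load saving the slides describe.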