EVENT
Event News
Talks by Profs. Kukelova, Sattler, and Yang on computer vision topics
We are pleased to announce an upcoming seminar featuring talks by Profs. Kukelova, Sattler, and Yang on computer vision topics. Everyone interested is cordially invited to attend!
Talk 1:
A Brief Introduction to Camera Geometry Estimation Solvers
Abstract:
We will briefly introduce the most common camera geometry estimation problems, including relative and absolute pose problems for calibrated, uncalibrated, and partially calibrated cameras. Starting with a short historic overview, we will then discuss the current state-of-the-art for these problems. This includes highlighting the challenges faced when aiming for efficient and robust solutions for camera geometry estimation.
Speaker Bio:
Zuzana Kukelova (Czech Technical University in Prague)
https://cmp.felk.cvut.cz/~kukelova/
Talk 2:
3D Reconstruction with Gaussian Splatting
Abstract:
Accurate 3D reconstruction is a core computer vision problem with many applications, including autonomous robots such as self-driving cars, cultural heritage documentation, and content creation for the entertainment industry (movies, games, etc.). Traditionally, 3D reconstructions have been based on 3D meshes and point clouds. Recently, learning-based approaches, such as neural radiance fields (NeRFs) and most recently 3D Gaussian Splatting (3DGS), have become popular. These representations are learned from images with known intrinsics and extrinsics and generate (close-to) photorealistic representations of scenes and objects. Compared to NeRFs, which can be slow to train and slow to render, 3DGS offers both faster training and test times. This talk first briefly reviews the original 3DGS formulation before identifying shortcomings and explaining how to resolve them. In particular, we will discuss (i) how to handle artifacts in the reconstruction caused by a limited set of training viewpoints, (ii) how to extend the original formulation for handling images taken under different conditions (day, night, etc.), and (iii) how to extract accurate 3D meshes from 3DGS representations by defining a field on top of the 3D Gaussians used to represent the scene. In addition, we will briefly mention ongoing efforts to ensure that benchmark results are comparable and that comparisons are fair.
Speaker Bio:
Torsten Sattler (Czech Technical University in Prague)
https://tsattler.github.io/
Talk 3:
Video Understanding and Generation with Multimodal Foundation Models
Abstract:
Recent advances in vision and language models have significantly improved visual understanding and generation tasks. In this talk, I will present our latest research on designing effective tokenizers for transformers and our efforts to adapt frozen large language models for diverse vision tasks. These tasks include visual classification, video-text retrieval, visual captioning, visual question answering, visual grounding, video generation, stylization, outpainting, and video-to-audio conversion. If time permits, I will also discuss our recent findings in dynamic 3D vision.
Speaker Bio:
Ming-Hsuan Yang (University of California, Merced/Google DeepMind)
https://faculty.ucmerced.edu/mhyang/
Time/Date:
13:30 - 16:45, February 17 (Monday), 2025
Place:
Rooms 1902 & 1903, NII, and online
Online:
Zoom
Contact:
If you would like to join, please contact us by email.
Email: sugimoto[at]nii.ac.jp