Digital Content and Media Sciences Research Division
IKEHATA Satoshi, Assistant Professor
Research Fields: Pattern Media
Introduction of research by science writer
Aiming to implement high-precision 3D reconstruction techniques
Reconstructing 2D information in 3D
We humans perceive the objects that we see as three-dimensional (3D) objects by combining information about them, such as depth, in a complex way. I am researching 3D computer vision, which examines how to reproduce this function of the human eye using computers, in other words, how to take two-dimensional (2D) image information, such as a photograph, and reconstruct the 3D information.
Up until now, my research has centered on the photometric stereo method, which uses one camera to photograph an object under light coming from different angles and then performs 3D reconstruction based on the multiple shading patterns of a single object. This method can elaborately reconstruct fine surface details, such as the brush strokes of oil paintings, as well as the delicate shapes of small objects, such as tumors in endoscopic images.
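The idea behind classical photometric stereo can be sketched compactly: under a Lambertian reflectance assumption, each pixel's intensity is the dot product of the (known) light direction and the surface normal scaled by albedo, so stacking observations under several lights gives a linear system solvable per pixel. The function below is a minimal illustration of that textbook formulation, not the author's own method; the array shapes and names are assumptions for this sketch.

```python
import numpy as np

def photometric_stereo(images, lights):
    """Recover per-pixel surface normals and albedo (Lambertian model).

    images: (m, h, w) array, m grayscale photos under different lights
    lights: (m, 3) array of unit light-direction vectors
    """
    m, h, w = images.shape
    I = images.reshape(m, -1)                      # (m, h*w) intensities
    # Lambertian model: I = lights @ (albedo * normal); solve by least squares
    G, *_ = np.linalg.lstsq(lights, I, rcond=None)  # (3, h*w)
    albedo = np.linalg.norm(G, axis=0)             # per-pixel albedo
    normals = G / np.maximum(albedo, 1e-8)         # unit normals
    return normals.T.reshape(h, w, 3), albedo.reshape(h, w)
```

With at least three non-coplanar light directions the system is well determined; real pipelines additionally handle shadows and specular highlights, which this sketch ignores.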
Application in the real estate industry
However, there are still very few real-world applications of 3D computer vision. In my previous position at Washington University in St. Louis, I conducted 3D modeling of real estate property information through joint research with a real estate company, with the goal of putting the research results into practice. In the United States, decisions to purchase or rent real estate are often made based solely on information available on the Internet, so detailed 3D information about properties was in demand.
There had been earlier examples of 3D reconstruction of indoor spaces, but they were essentially floor plans converted into 3D and could not provide the meaningful information needed for living in a place, such as how rooms are divided and connected, or which rooms are bedrooms. Using proprietary 3D reconstruction methods based on panoramic depth images captured at multiple locations within the rooms, we succeeded in creating user-friendly property information that can be linked to CAD and attaches meaningful information to the 3D models.
Aiming towards integration with deep learning
Currently, I am researching the combination of 3D reconstruction with deep learning, improving reconstruction accuracy by feeding the system large amounts of data, such as shading patterns, and having it learn from them. However, outputting 3D information from inputted 2D information is difficult with current deep learning technology, especially when the number of inputs is large (e.g., one thousand). No research anywhere in the world has yet succeeded in reconstructing accurate 3D information under these conditions, so I am attempting to solve a difficult problem that could have a global impact.
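One common way to let a network accept an arbitrary number of input images is a set-style architecture: encode each observation with shared weights, then pool with an order-independent operation such as max. The toy sketch below (a hypothetical linear-ReLU encoder, not the author's network) shows why the pooled feature has a fixed size whether ten or a thousand observations arrive.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(obs, W):
    # Per-observation feature with shared weights (hypothetical ReLU encoder)
    return np.maximum(obs @ W, 0.0)

def aggregate(obs_set, W):
    # Max-pool over the observation axis: the result has a fixed
    # dimension regardless of how many observations are supplied,
    # and is invariant to their ordering
    return encode(obs_set, W).max(axis=0)

W = rng.normal(size=(3, 8))                 # encoder weights (3-D obs -> 8-D feature)
small = rng.normal(size=(10, 3))            # 10 observations
big = np.vstack([small] * 100)              # the same content, 1000 observations
assert np.allclose(aggregate(small, W), aggregate(big, W))
```

The pooling step is what decouples the output from the input count; the hard part, as noted above, is making such an aggregate rich enough to support accurate 3D reconstruction.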
Another problem with existing deep learning technology is that an algorithm trained for one object category can only be applied to that category. In other words, a 3D reconstruction algorithm built for a certain purpose cannot be reused for other objects. I am therefore also working on removing such constraints and developing more versatile techniques. Additionally, just as human vision reconstructs objects in 3D from multiple cues, such as the parallax of binocular vision, shading, and texture, I would like to enable computers to combine multiple cues to perform 3D reconstruction.
In the future, I aim to share these research results for the benefit of society by developing applications based on these 3D reconstruction techniques.