Principles of Informatics Research Division, Associate Professor
Efficiently finding useful information from huge amounts of data
Phenomena in the natural world produce analog (continuous) information but are measured as digital (discrete) information. My research involves taking this discrete information and finding characteristics represented by the data using computers. It also aims at efficiently extracting information that is useful to humankind from huge amounts of data.
Including statistical methods in pattern mining
Finding patterns in huge amounts of data is called pattern mining. One application is research on genes associated with diseases. In this research, associations found in genetic marker data from many people are represented by dots and lines, as shown in the figure. Patterns, that is, combinations of markers related to a disease, are then detected. If there are n markers, the number of combinations that must be analyzed is 2^n (2 to the power of n), so when there are many markers it is almost impossible to obtain results by examining every combination. Analyzing the data with a discrete structure approach, such as exploiting the set inclusion relationship between combinations, makes it possible to produce results with very little computational effort. If we also assess the results for statistical significance, we can identify more accurately which genetic patterns are associated with the disease. This is the research that I am currently working on. My research method combines discrete structures, information geometry, and statistics to construct a theory that enables correct decisions to be made more efficiently; I then test the theory on computers and feed the results back into the theory.
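To make the idea of pruning by set inclusion concrete, here is a minimal sketch in Python. It is not the method developed in this research: it is a generic level-wise (Apriori-style) search followed by a one-sided Fisher exact test, and the marker names, the toy records, the labels, and the functions `support`, `frequent_patterns`, and `fisher_one_sided` are all hypothetical. The point it illustrates is that if a combination of markers is rare, every larger combination containing it is at least as rare, so most of the 2^n combinations never need to be examined.

```python
from itertools import combinations
from math import comb

# Hypothetical toy data: each record is the set of markers one person carries.
records = [
    {"m1", "m2", "m3"}, {"m1", "m2"}, {"m1", "m2", "m4"}, {"m1", "m2"},
    {"m3"}, {"m4"}, {"m3", "m4"}, {"m1"},
]
# Hypothetical labels: 1 = person has the disease, 0 = does not.
labels = [1, 1, 1, 1, 0, 0, 0, 0]


def support(pattern, records):
    """Indices of the records that contain every marker in the pattern."""
    return [i for i, r in enumerate(records) if pattern <= r]


def frequent_patterns(records, min_support):
    """Level-wise search that prunes using the set inclusion relationship.

    A pattern with too few supporting records cannot have a frequent
    superset, so its supersets are never generated.
    """
    markers = sorted(set().union(*records))
    level = [frozenset([m]) for m in markers
             if len(support(frozenset([m]), records)) >= min_support]
    found = []
    while level:
        found.extend(level)
        current = set(level)
        candidates = set()
        for a, b in combinations(level, 2):
            c = a | b
            # Keep c only if it is one marker larger and all of its
            # one-smaller subsets were frequent at the previous level.
            if len(c) == len(a) + 1 and all(c - {m} in current for m in c):
                candidates.add(c)
        level = [c for c in candidates
                 if len(support(c, records)) >= min_support]
    return found


def fisher_one_sided(pattern, records, labels):
    """One-sided Fisher exact test: P(at least the observed number of
    cases among pattern carriers) under the hypergeometric null."""
    n, n_case = len(records), sum(labels)
    carriers = support(pattern, records)
    k = len(carriers)
    a = sum(labels[i] for i in carriers)
    tail = sum(comb(n_case, x) * comb(n - n_case, k - x)
               for x in range(a, min(k, n_case) + 1))
    return tail / comb(n, k)


if __name__ == "__main__":
    for p in frequent_patterns(records, min_support=3):
        print(sorted(p), fisher_one_sided(p, records, labels))
```

On this toy data only the pairs are ever tested beyond single markers, and no combination of three markers is generated at all. A real analysis would also have to correct the p-values for the number of patterns tested, which this sketch does not attempt.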
Including a discrete approach in deep learning for deeper analysis
Facial recognition, speech recognition, and many other useful services are emerging, and deep learning technology is contributing greatly to them. Deep learning finds correlations in large volumes of data and constructs general models so that certain judgments can be made when certain characteristics are combined. It could perhaps be called a sophisticated form of signal processing. I believe that adding inductive logic programming, which once played a central role in artificial intelligence research, and symbolic, discrete approaches such as pattern mining, which is at the center of data mining research, will lead to the development of robust machine learning. Recently, a program developed to play the board game Go using deep learning has been beating the top human Go players, but it is difficult to explain the model's strategy. Including another approach may enable deeper analysis and make it possible to explain the game records satisfactorily. I hope that my research will contribute to this.