Updated: 2025-05-27
Company-provided data
Yahoo! Dataset
NII provides the Yahoo! Dataset, offered by LY Corporation, to researchers.
- "Yahoo! Chiebukuro" Data (3rd edition)
LIFULL HOME'S Dataset
NII provides the LIFULL HOME'S Dataset, offered by LIFULL Co., Ltd., to researchers.
- Snapshot Data of Rentals
- Monthly Data of Rentals and Sales
- Posting Period Data of Rentals and Sales (added 2025-05-27)
NTCIR Test Collection
Test collections built by the NTCIR Project, which is organized by NII. The list of test collections provided by IDR is here. For other test collections, which are provided by the NTCIR secretariat, please refer to "Test Collections".
Speech Corpus
Speech corpora that the Speech Resources Consortium, established within NII, has received from various institutions and groups. For the time being, these are provided by the Speech Resources Consortium. Please refer to "Corpus List".
Researcher-provided data
Osaka University Multimodal Dialogue Corpus (Hazumi)
A multimodal human-agent dialogue corpus collected by Osaka University.
OSX Cooking Video Dataset (COM Kitchens)
A cross-modal dataset consisting of 177 cooking videos and associated recipe texts, created through joint research between Cookpad Inc. and OMRON SINIC X Corporation (OSX).
Discontinued dataset
Video Database (terminated)
Video databases for evaluating video processing, built by VDBWG, SIG-PRMU, IEICE. Distribution of the data was terminated in March 2018.