
Data Set List

This page lists the data sets that NII provides to informatics researchers. Some of the data sets are still under preparation.

update: 2017-11-28

Yahoo! Data Set

Data sets that NII received from Yahoo! Japan Corporation and provides to researchers.

  1. "Yahoo! Chiebukuro" Data (2nd edition)

Rakuten Data Set --- 2017-11-28 Update

Data sets that NII provides to researchers in cooperation with Rakuten, Inc.

  1. Rakuten Ichiba: merchandise data and review data
  2. Rakuten Travel: facilities data and review data
  3. Rakuten GORA: golf course data and review data
  4. Rakuten Recipe: recipe information and recipe images
  5. PriceMinister: user reviews, product reviews, interests -- 2017-04-03 NEW!!
  6. Annotated data
    • Tsukuba sentiment-tagged corpus (TSUKUBA corpus)
    • Product images dataset with category label
    • Images with character area
    • Floor plans from Rakuten Real Estate with wall labels -- 2017-11-28 NEW!!

NTCIR Test Collection

Test collections built by the NTCIR Project, which is organized by NII. The list of test collections provided by IDR is here. For other test collections, which are provided by the NTCIR secretariat, please refer to "Test Collections".

Speech Corpus

Speech corpora that the Speech Resources Consortium, established within NII, received from various institutions and groups. For the time being, these are provided by the Speech Resources Consortium.

Video Database (suspended)

Video databases for evaluating video processing, built by VDBWG, SIG-PRMU, IEICE. New applications are currently not being accepted. When distribution resumes, it will be announced on this site.