Title | Building a Terabyte-scale Web Data Collection "NW1000G-04" in the NTCIR-5 WEB Task |
Authors | Masao Takaku, Keizo Oyama, Akiko Aizawa, Haruko Ishikawa, Kengo Minamide, Shin Kato, Hayato Yamana, and Junya Hayashi |
Abstract | We built a terabyte-scale web data collection, NW1000G-04, which was used in the NTCIR-5 WEB task. This paper describes in detail how the collection was built and reports statistics on it. |
Language | English |
Published | Sep 7, 2006 |
Pages | 8p |
PDF File | 06-012E.pdf |