| Title | Building a Terabyte-scale Web Data Collection "NW1000G-04" in the NTCIR-5 WEB Task |
| Authors | Masao Takaku, Keizo Oyama, Akiko Aizawa, Haruko Ishikawa, Kengo Minamide, Shin Kato, Hayato Yamana, and Junya Hayashi |
| Abstract | We built a terabyte-scale web data collection, NW1000G-04, which was used in the NTCIR-5 WEB task. This paper describes in detail the process of building the collection and reports some statistics on it. |
| Language | English |
| Published | Sep 7, 2006 |
| Pages | 8p |
| PDF File | 06-012E.pdf |