| 1. Experimental Study on the Relationship between the Performance and Training Data Size of Text Categorization | |
| author: | Atsuhiro TAKASU(National Institute of Informatics), Kenro AIHARA(National Institute of Informatics) |
| abstract: | This paper discusses the relationship between the performance of text categorization and the size of the training data. In studies of text categorization, a pair of training and test data sets is used to evaluate the performance of text categorizers. We conduct experiments on the convergence of text categorizers with respect to training data size and discuss the validity of text categorization experiments. |
| 2. Information Extraction from HTML Pages and its Integration | |
| author: | Kumi ITAI(Graduate School of Information Science and Technology, University of Tokyo), Atsuhiro TAKASU(National Institute of Informatics), Jun ADACHI(National Institute of Informatics) |
| abstract: | We propose a method for transforming and integrating HTML tables into a common XML list structure. HTML tables tend to have diverse structures, and such integration helps users browse and compare related information from separate HTML pages simultaneously. This paper focuses on the tasks of information extraction from tables and data categorization. For this purpose, we applied three algorithms: (I) data classification by Support Vector Machine, (II) table structure estimation and data categorization by Hidden Markov Model, and (III) data classification by a combination of Support Vector Machine and Hidden Markov Model. Finally, we report the experimental results and remaining issues. |
| 3. Optimising Queries by Pushing contains() in XQuery for XML Views constructed with an aggregation | |
| author: | Hiroyuki KATO(National Institute of Informatics), Jun ADACHI(National Institute of Informatics) |
| abstract: | We propose an optimization method for XML views constructed with an aggregation peculiar to the text type. The key point of our method is that the optimization is performed within XQuery, and we focus on the selection condition 'contains()', which can be evaluated cheaply by pushing 'contains()' toward the leaves of query trees. We show that a prominent feature of the aggregation requires rewriting the whole query to keep the input queries and the optimized output queries equivalent. |
| 4. NTCIR-3 WEB: An Evaluation Workshop for Web Retrieval | |
| author: | Koji EGUCHI(National Institute of Informatics), Keizo OYAMA(National Institute of Informatics), Emi ISHIDA(National Institute of Informatics), Noriko KANDO(National Institute of Informatics), Kazuko KURIYAMA(Shirayuri College) |
| abstract: | The authors conducted the Web Retrieval Task ('NTCIR-3 WEB') from 2001 to 2002 at the Third NTCIR Workshop. In the NTCIR-3 WEB, they attempted to assess the retrieval effectiveness of Web search engine systems using a common data set, and to build re-usable test collections that are suitable for evaluating Web information retrieval systems. With these objectives, they evaluated searches using various types of user input, user models and document models. As the document data sets, they constructed 100-gigabyte and 10-gigabyte document collections gathered from the '.jp' domain. The user input was given as query term(s), a sentence, or document(s). They assumed two user models: one in which comprehensive coverage of relevant documents is required, and one in which precision of the top-ranked results is emphasized. They also assumed several document models, such as a document as an individual page, and a document as a set of pages connected by hyperlinks. This paper gives an overview of the test collections constructed in the NTCIR-3 WEB, the proposed evaluation methods, and the evaluation results. The evaluation results suggest that link-based techniques can perform more effectively when short queries are input. |
| 5. Development of Mobile Phone Culture in Japan and Its Implications to Library Services: Prospecting Information Services in Coming "Ubiquitous Society" | |
| author: | Masamitsu NEGISHI(National Institute of Informatics) |
| abstract: | The prevalence of internet-capable mobile phones in Japan since the first such service in 1999 is so remarkable that it is proudly noted in the government's Information and Communications White Paper, and a new adaptable culture or lifestyle is being created. In this environment, several libraries have launched internet access to their catalog databases via mobile phones. The paper surveys the general situation of mobile internet access, with an emphasis on the functionality of mobile phones and their applicability to every aspect of society. It then discusses the implications of mobile access for library services as a whole in a "Ubiquitous Society," for which the government has been enthusiastically formulating and promoting policies in recent years. |
| 6. For Building e-Confidence: A Proposal for a Trusted Third Party Model | |
| author: | Reiko GOTO(Institute of Socio-Information and Communication Studies, the University of Tokyo) |
| abstract: | In this paper, we discuss how to build confidence in electronic transactions (e-Confidence), focusing on "trusted third parties (TTPs)" as key agents. Construction of an institutional framework is one important focal point for expanding the market of electronic commerce. In the information society, to promote private dynamism and to stimulate innovation, it will be necessary to switch from a legal system of ex ante restrictions to a system centered on ex post controls, based on highly transparent standards and the private-initiative principle. So how should we reconstruct a set of institutions? Establishing a clear position for TTPs may be necessary to create a new system for the future. We define a TTP as a non-public sector in the intermediate domain between the for-profit, market domain and the public, government domain. By focusing on the functions of TTPs in the fields of personal data protection laws and alternative dispute resolution (ADR), we validate the potential of TTPs. If the for-profit sector, government, academia and private individuals cooperate to assign suitable functions and authorities by organizing TTPs, a system that nourishes e-Confidence will be generated, and the market for electronic commerce will expand. |
| 7. Engineering Education and Professional Development in Germany, France and United Kingdom-Examples for Establishing Continuing Professional Development of Engineers in Japan | |
| author: | Henri ANGELINO(National Institute of Informatics) |
| abstract: | Rapid technological development, the internationalization of enterprises and the globalization of the world economy require nations to "produce" the best graduates, especially in science, technology and economics. In particular, companies are more inclined to recruit the best engineers and/or scientists whatever their nationality. For different reasons, various Japanese organizations have proposed the establishment of an Integrated System to support the Professional Development of Engineers (PDE). This system will cover engineering education, training and practice, professional certification and Continuing Professional Development (CPD). One of the targets is to gain national and international recognition of the professional competences of Japanese engineers. Another target is to promote CPD to maintain engineers' professional qualifications and/or to give them the opportunity for career development by adding competences in different fields. In order to establish such modifications, it has been decided to study what other nations are doing in these fields and to propose "the Japanese way," which must integrate the cultural aspects of Japanese society. This paper analyzes the situation in three European countries with a strong influence on the world economy: France, Germany and the United Kingdom. It also analyzes the recent evolution linked to the Bologna Declaration. |