Invited Talks

  • An Overview of Temporal Information Extraction, Prof Kam-Fai Wong, The Chinese University of Hong Kong, Hong Kong
  • Advanced Technologies for Information Access, Dr Tetsuya Sakai, Toshiba Laboratory, Kawasaki, Japan
  • Translation Probabilities in Cross-Language Information Retrieval, Prof Jong-Hyeok Lee, Pohang University of Science and Technology, Korea
  • Text Summarization with Rhetorical Information, Prof Sung Hyon Myaeng, Information and Communications University, Korea
  • Automatic Identification of Oriental and Other Scripts in Image Documents, Prof Ching Y. Suen, Concordia University, Canada
  • Sentiment and Content Analysis of Chinese News Coverage, Prof Benjamin T'sou, City University of Hong Kong, Hong Kong

 
  • An Overview of Temporal Information Extraction, Prof Kam-Fai Wong, The Chinese University of Hong Kong, Hong Kong

    • Abstract
      Research on temporal information extraction originated in the 1990s as a subtask of named
      entity recognition. To date, the scope of this research has been extended, ranging from temporal
      expression extraction and annotation to temporal reasoning and understanding. It has become
      an independent and active topic that greatly benefits NLP tasks such as question answering,
      information extraction, and text summarization. In this article we aim to present briefly
      most aspects of this research, followed by an analysis of the challenges encountered together with
      the solutions currently employed. An investigation of future directions of this research is also
      provided.
    • Biography of Prof Wong
      K. F. Wong obtained his PhD from Edinburgh University, Scotland, in 1987. After his PhD, he conducted research at Heriot-Watt University (Scotland), UniSys (Scotland) and ECRC (Germany). At present he is a professor in the Department of Systems Engineering and Engineering Management, the Chinese University of Hong Kong (CUHK), and in parallel serves as the director of the Centre for Innovation and Technology (CINTEC), CUHK. His research interests focus on Chinese computing, parallel databases and information retrieval. He has published over 130 technical papers in these areas in various international journals, conferences and books. He is a member of the ACM, CLCS, IEEE-CS and IEE (UK). He is the founding Editor-in-Chief of ACM Transactions on Asian Language Information Processing (TALIP), co-Editor-in-Chief of the International Journal on Computer Processing of Oriental Languages, and a member of the editorial boards of the Journal on Distributed and Parallel Databases and the International Journal on Computational Linguistics and Chinese Language Processing. He is the panel chair of VLDB2002, PC co-chair of ICCPOL01, ICCPOL99 and IJCNLP05, and General Chair of AIRS04 and IPAL00, and is also a PC member of many international conferences, some recent ones being WISE02, ICWL02, COLING02, IRAL03, ICCPOL03 and SIGMOD04.


  • Advanced Technologies for Information Access, Dr Tetsuya Sakai, Toshiba Laboratory, Kawasaki, Japan
    • Abstract
      This paper briefly describes Toshiba Knowledge Media Laboratory's recent research efforts for effective information retrieval and access. Firstly, I will mention the main research topics that are being tackled by our information access group, including document retrieval, speech-input/multimedia question answering, and evaluation metrics. Secondly, I will focus on the problem of cross-language information retrieval and access, and describe a system called BRIDJE (Bi-directional Retriever/Information Distiller for Japanese and English), which achieved many gold-medal performances at the recent NTCIR (a.k.a. "Asian TREC") workshop. Finally, I will conclude the paper by mentioning some unsolved problems and suggesting possible directions for future Information Access research.
    • Biography of Dr Sakai
      Tetsuya Sakai received a Master's degree in Engineering from Waseda University in 1993 and joined Toshiba Corporate R&D Center in the same year. He received a Ph.D. from Waseda University in 2000 for his work on information retrieval and filtering systems. From 2000 to 2001, he was a visiting researcher at the University of Cambridge Computer Laboratory, under the supervision of Professor Karen Sparck Jones and Professor Steve Robertson. He is currently a Research Scientist at Toshiba Corporate R&D Center Knowledge Media Laboratory.


  • Translation Probabilities in Cross-Language Information Retrieval, Prof Jong-Hyeok Lee, Pohang University of Science and Technology, Korea
    • Abstract
      Translation ambiguity is one of the major problems in dictionary-based cross-language information retrieval. To attack the problem, indirect methods, which do not explicitly resolve translation ambiguity, rely on query-structuring techniques such as Pirkola's method and balanced translation. Direct methods try to assign translation probabilities to translations, normally by employing co-occurrence of translations in target documents as disambiguation clues. So far, translation probabilities in direct methods have mainly been used to select the top N translations, which are then treated as equally correct in query formulation. However, translation probabilities themselves may influence term importance and thereby affect retrieval effectiveness. In order to study the effect of translation probabilities on retrieval effectiveness in direct methods, this paper empirically investigates the following issues: factors affecting translation probabilities, translation probabilities vs. term weights, the accuracy of translation disambiguation vs. retrieval effectiveness, and top N translations vs. retrieval effectiveness.

    • Biography of Prof Lee
      Jong-Hyeok Lee received his B.S. degree in mathematics education from Seoul National University in 1980, and then his M.S. and Ph.D. degrees in Computer Science from KAIST (Korea Advanced Institute of Science and Technology), in 1982 and 1988, respectively. From Nov. 1989 through Jan. 1991, he worked as a visiting researcher at NEC C&C institute, Japan. He then joined POSTECH (Pohang University of Science and Technology), where he served as an assistant and associate professor until Mar. 2003 and has been a professor since then. For a year from Aug. 1998, he worked as a visiting scholar at CRL/NMSU, USA. His research interests include machine translation, information retrieval, and multi-lingual language processing.


  • Text Summarization with Rhetorical Information, Prof Sung Hyon Myaeng, Information and Communications University, Korea
    • Abstract
      We describe a text summarization method that employs a hierarchical clustering algorithm and rhetorical structure information. A summary consists of key sentences representing the core content of a document, which are selected based on the result of sentence clustering. Since individual sentences are often too short for similarity calculations in clustering, they are combined based on the rhetorical structure information in the document. Instead of relying on full parsing, we only use the rhetorical structure information immediately recognizable at the surface level, as a way to make this approach practical and usable across different languages.
    • Biography of Prof Myaeng
      Dr. Sung Hyon Myaeng is currently a professor at Information and Communications University (ICU), Korea. Prior to this appointment, he was a faculty member at Chungnam National University, Korea, and Syracuse University, USA. His research work has been in cross-language IR, summarization, topic detection & tracking, categorization, distributed IR and digital libraries. He was a program committee chair for ACM SIGIR 2002 and for AIRS 2004.


  • Automatic Identification of Oriental and Other Scripts in Image Documents, Prof Ching Y. Suen, Concordia University, Canada
    • Abstract
      Large quantities of paper documents are still produced or received by many organizations. Nowadays they are being handled electronically through imaging and digital means such that the resulting document images can be processed by OCR for information retrieval and data mining. Since OCR is language dependent, the language of the original document must be identified first by advanced technology. This paper describes two methods of identifying Oriental languages among four language groups, i.e. Oriental, Roman, Cyrillic, and Arabic. One method is based on features extracted from the shapes of words and letters, while the other one is based on global analysis of text pieces using Gabor filters. Experimental results on hundreds of documents indicate that both automatic classification approaches look quite promising. The use of linguistic analysis to enhance the results is also discussed.
    • Biography of Prof Suen
      Ching Y. Suen received an M.Sc.(Eng.) degree from the University of Hong Kong and a Ph.D. degree from the University of British Columbia, Canada. In 1972, he joined the Department of Computer Science of Concordia University where he became Professor in 1979 and served as Chairman from 1980 to 1984, and as Associate Dean for Research of the Faculty of Engineering and Computer Science from 1993 to 1997. He has guided/hosted 65 visiting scientists and professors, and supervised 60 doctoral and master's graduates. Currently he holds the distinguished Concordia Research Chair in Artificial Intelligence and Pattern Recognition, and is the Director of CENPARMI, the Centre for Pattern Recognition and Machine Intelligence.

      Prof. Suen is the author/editor of 11 books and more than 400 papers on subjects ranging from computer vision and handwriting recognition, to expert systems and computational linguistics. He is the founder of "The International Journal of Computer Processing of Oriental Languages" and served as its first Editor-in-Chief for 10 years. Presently he is an Associate Editor of several journals related to pattern recognition.
      A Fellow of the IEEE, IAPR, and the Academy of Sciences of the Royal Society of Canada, he has served several professional societies as President, Vice-President, or Governor. He is also the founder and chair of several conference series including ICDAR, IWFHR, and VI. He was the General Chair of numerous international conferences, including the International Conference on Computer Processing of Chinese and Oriental Languages in August 1988 held in Toronto, International Conference on Document Analysis and Recognition held in Montreal in August 1995, and the International Conference on Pattern Recognition held in Quebec City in August 2002.

      Dr. Suen has given 150 seminars at major computer industries and various government and academic institutions. He has been the principal investigator of 25 industrial/government research contracts, and is the recipient of prestigious awards, including the ITAC/NSERC Award from the Information Technology Association of Canada and the Natural Sciences and Engineering Research Council of Canada in 1992 and the Concordia "Research Fellow" award in 1998.


  • Sentiment and Content Analysis of Chinese News Coverage, Prof Benjamin T'sou, City University of Hong Kong, Hong Kong
    • Abstract
      This paper explores the salient differences between the spread of polar items in terms of paragraphs and in terms of sentences, and the problem of multiple foci within coherent textual segments. Our findings indicate that paragraph spread is comparable to sentence spread and that the problem of multiple foci is far-reaching and deserves considerable further attention. This is especially so for political figures in an election, when comparison between the challenger and the incumbent is much more common than coverage or analysis of only the incumbent.
    • Biography of Prof T'sou
      Professor Benjamin T'sou is chair professor of Linguistics and Asian Languages and director of the Language Information Sciences Research Centre at City University of Hong Kong.
