A Framework for Delivery of Thai Content through Mobile Devices
Chuleerat Jaruskulchai, Atichart Khanthong, and Wanlapa Tantiprasongchai
Intelligent Information Retrieval and Database, Department of Computer Science, Faculty of Science, Kasetsart University, Bangkok, Thailand.

Abstract

With the increasing number of mobile devices, providing text information to mobile clients poses challenges. Mobile devices have limited display and navigation capabilities. Furthermore, the inconvenience of a tiny keypad makes it more difficult to input keywords or other information. The problem is more challenging when working with Thai text. This paper introduces a framework for delivery of Thai content through mobile devices. It explores one particular aspect: the automated construction of a personalized focus, or user's attention. Documents are disseminated based on the personalized focus and routed to a mobile device. Instead of delivering every document, the documents are clustered and a topic is extracted for each cluster. Additionally, the content of each document is summarized. A basic naive Bayes algorithm is deployed to filter documents by the user's attention, and topic extraction is based on term frequency and inverse document frequency. Important sentences are extracted for summarization. Object-oriented technology is used to develop the demo system.

Keywords: Mobile device application, Document summarization, Processing of Thai text.

1. Introduction

The handheld computer market is expected to grow larger than the computer industry, and mobile clients have become a new target for business. As Personal Digital Assistants (PDAs) and mobile devices grow in popularity and capability, their usability will increase. Wireless technology provides an opportunity to access information at any time and in any place.
Thus, numerous information services are offered to mobile clients, such as travel guides, entertainment advice, news, flight schedules, and driving directions. However, these services are task-specific, and mobile clients know where to locate the information. For web browsing and searching applications, mobile devices have limited display, graphics capabilities, navigation capabilities, and processing speed. Furthermore, the inconvenience of a tiny keypad makes it more difficult to input keywords or other information. This poses a number of issues in designing user interfaces for PDAs. It is a particular challenge when working with Thai text because of the large Thai alphabet. Most Thai PDA applications on offer are installed and run on the PDA. This paper presents a framework to facilitate web navigation, searching, and browsing of Thai texts on small devices. Documents are clustered, and a topic is automatically created to describe each cluster's content. To address the limited display, text summarization techniques are proposed. To save the user time in locating information, a naïve Bayes classifier is employed to classify the user's preferences and notify the user by e-mail.

2. Related Works

Several aspects of mobile clients have attracted researchers, and they can be categorized into three groups. The first, basic aspect is effective browsing on client devices. The work in [Matt J. et al. 1999] reports that a smaller display size reduces user effectiveness by up to 50% on the original tasks. Effectiveness is measured by the amount of scrolling needed to view information. However, Dillon et al. reported that the comprehension rate on a PDA is the same as on a desktop display (cited in [Matt J. et al. 1999]). The second aspect is the dynamic transformation of web page formats for small devices [Watters C. and Zhang R., 2003]. Another approach in this aspect is called the Web Clipping Application.
This approach uses a proprietary language to request portions of web pages, which are then reformatted for display. The third aspect is the design of web searching for PDAs that is effective from an information retrieval standpoint. This aspect is more concerned with navigation capabilities and with how users interact with a PDA when searching for information. Many successful information retrieval methodologies are deployed, such as text summarization, clustering of documents using concept hierarchies, and term extraction [Chan D.L. et al. 2002]. Most current PDA applications are available for download from the Internet and run on the local device. Web browsing on Thai PDAs currently relies on web clipping technology.

190 Jaruskulchai, C.; Khanthong, A. and Tantiprasongchai, W.
In the processing of Thai text, there are several open issues, such as word boundary detection and sentence extraction. A report from the National Electronics and Computer Technology Center (NECTEC) [Sornlertlamvanich V. et al. 2000] states that a number of Thai text processing problems have not been fully resolved. Because the Thai writing system has no end-of-word marker, word segmentation is still an active research topic. The effectiveness of word segmentation is around 80-95% in precision and 80% in recall. However, many researchers have moved on to sentence extraction, and natural language processing is employed to improve word segmentation.

3. A Framework for Delivery of Thai Content

The framework for delivery of Thai content extends our previous research [Jaruskulchai J., Kiewsuwansuk S., and Kantasena J., 2001] to serve mobile clients. The main focus of that research was to investigate clustering algorithms and topic extraction for Thai text. It showed the potential for research on Thai text with little concern for word or sentence boundaries. It is a challenge to move our research forward to serve mobile clients. Our framework offers several utilities for managing personal information. Not only does it provide a navigation function for users to browse information, but the system also monitors new information and informs the user through e-mail. Additionally, two types of information are offered to mobile clients: full-text documents and summarized documents. There are many issues in designing and developing mobile client applications. Albers and Kim [Michael J. Albers and Loel Kim, 2000] laid out a theoretical framework for understanding the differences between handheld and full-sized web environments and their intended uses. They reported that two types of functions can be provided to mobile clients: simple look-up and information manipulation.
Simple look-up describes the skills and activities involved in locating and recognizing a desired chunk of information, such as checking a stock price, looking up a phone number, or reading an e-mail. Information manipulation is a more complex task. Generally, it means the user needs to interact with different pieces of information, such as when comparing airline fares. In our framework, we apply many mechanisms from the information retrieval area, such as simple look-up, information filtering according to the user's preferences, document clustering, and text summarization. The design consists of two main modules. The server module is responsible for collecting, indexing, filtering, and summarizing documents. The server also delivers information matching the user's preferences by e-mail. The client module allows users to manage (edit, review, delete) their profiles through a web browser or a PDA. Access to personal information is controlled by an e-mail address and a password. Figure 1 shows our framework. Details of information filtering and summarization are described in the following subsections.

Figure 1. A Framework for Delivery of Thai Content
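The filtering step of the server module, which Section 3.1 describes in detail, can be illustrated with a minimal Python sketch. It is not the authors' Java implementation. The class name `PreferenceFilter`, the two-class setup (interest versus background), the Laplace smoothing, and the use of keyword weights as pseudo-counts are all assumptions made for illustration, and documents are assumed to be already segmented into word tokens.

```python
import math
from collections import Counter


class PreferenceFilter:
    """Two-class multinomial naive Bayes: 'interest' versus 'background' (hypothetical)."""

    def __init__(self, keyword_weights, prior_interest=0.5, smoothing=1.0):
        # Assumption: the user's weighted keywords seed the 'interest' class
        # as pseudo-counts, so no rated training documents are needed.
        self.interest = Counter(keyword_weights)
        self.background = Counter()
        self.prior_interest = prior_interest
        self.smoothing = smoothing

    def observe_background(self, tokens):
        """Add the words of a general (non-preferred) document."""
        self.background.update(tokens)

    def observe_retrieved(self, tokens):
        """A retrieved document is presumed relevant, so it updates 'interest'."""
        self.interest.update(tokens)

    def _log_likelihood(self, tokens, counts, vocab_size):
        # Laplace-smoothed multinomial log-likelihood of the tokens.
        denom = sum(counts.values()) + self.smoothing * vocab_size
        return sum(math.log((counts[t] + self.smoothing) / denom) for t in tokens)

    def posterior_interest(self, tokens):
        """Posterior probability that a tokenized document matches the preferences."""
        vocab_size = len(set(self.interest) | set(self.background) | set(tokens))
        log_i = math.log(self.prior_interest) + self._log_likelihood(
            tokens, self.interest, vocab_size)
        log_b = math.log(1.0 - self.prior_interest) + self._log_likelihood(
            tokens, self.background, vocab_size)
        top = max(log_i, log_b)  # subtract the maximum for numerical stability
        p_i, p_b = math.exp(log_i - top), math.exp(log_b - top)
        return p_i / (p_i + p_b)
```

A new document would be stored in a user's preference file when `posterior_interest` exceeds some threshold such as 0.5, which is likewise an assumed value. Calling `observe_retrieved` on documents the user opens shifts later posteriors toward those topics.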
3.1 Automated Construction of Personalized Information

Personalized information and information filtering can be used interchangeably, and both are closely related to text classification. Thus, to classify the user's preferences, the probabilistic naïve Bayes classifier is employed. Naïve Bayes is a very efficient algorithm and has been employed in many research fields, such as text classification, classification of e-mail, and automated mail filtering. To estimate the naïve Bayes parameters, the user needs to provide a set of training documents, or preference documents. The user's interests or preferences are expressed as keywords of interest, with term weights assigned to the most important keywords. In theory, the keywords and weights are represented as a term weight vector, according to the vector space model. In most implementations of naïve Bayes, the initial information or training data set is obtained from the user. In a real-world application, asking the user to rate relevant documents at registration time may not be workable. Thus, the system precomputes the posterior probability for each user from the user's keywords and term weights, and these probabilities are updated when the user retrieves documents. The system presumes that retrieved documents are relevant and updates the probability parameters accordingly. When new documents arrive in the system, each document is classified, or filtered, by the naïve Bayes classifier and stored in the matching user's preference file, awaiting delivery to the user.

3.2 Document Clustering

To make better use of the mobile device, instead of delivering each document individually, documents are clustered according to their contents, and a topic is extracted to describe the content of each cluster. Complete linkage is employed in this framework to cluster documents with similar concepts before delivery to users.
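As a rough illustration of complete-linkage clustering, the following Python sketch merges the two most similar clusters repeatedly, where the similarity of two clusters is that of their least similar pair of documents. The cosine measure over raw term counts, the `min_similarity` stopping threshold, and the function names are assumptions for illustration and are not taken from the cited clustering paper. `topic_terms` labels a cluster with its most frequent words.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def complete_linkage(docs, min_similarity=0.3):
    """Agglomerative clustering of tokenized docs; returns lists of doc indices."""
    vectors = [Counter(d) for d in docs]
    clusters = [[i] for i in range(len(docs))]

    def link(c1, c2):
        # Complete linkage: a cluster pair is only as similar as its worst pair.
        return min(cosine(vectors[i], vectors[j]) for i in c1 for j in c2)

    while len(clusters) > 1:
        best, pair = -1.0, None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                s = link(clusters[x], clusters[y])
                if s > best:
                    best, pair = s, (x, y)
        if best < min_similarity:  # assumed stopping rule
            break
        x, y = pair
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


def topic_terms(docs, cluster, k=3):
    """Describe a cluster by its k most frequent words."""
    counts = Counter(t for i in cluster for t in docs[i])
    return [t for t, _ in counts.most_common(k)]
```

The pairwise search is quadratic in the number of clusters per merge, which is acceptable for a small batch of filtered documents but would need a smarter scheme for a large collection.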
To extract a topic from the full text, high-frequency words are extracted and grouped to describe the cluster. Full details of our document clustering technique can be found in [Jaruskulchai J., Kiewsuwansuk S., and Kantasena J., 2001].

3.3 Extracting Summary Sentences

The main focus of this research is extracting summary sentences to represent content compactly. Summarization is the process of abstracting key content from one or more information sources. A variety of methods have been investigated. Depending on the target reader or function of use, a summary can fall into three categories: indicative, informative, and critical [Hahn Udo and Mani Inderjeet, 2000]. An indicative summary provides compact content to alert the user not to miss the information. An informative summary provides essential information that can substitute for the original source. Lastly, a critical summary not only abstracts the information but also provides some opinion on the content. However, the most difficult task in summarizing Thai text is finding sentence boundaries, even if the results of the word boundary algorithm are acceptable. Current research on sentence boundaries can be found in [Charoenpornsawat P. and Sornlertlamvanich V. 2001]; the approaches deployed are probabilistic part-of-speech trigram, grammatical rule-based, and feature-based. The feature-based approach was evaluated on the ORCHID corpus [Information Research and Development Division, 2003], a part-of-speech tagged corpus. The Thai sentence has not been fully defined. Figure 2 excerpts some items from the ORCHID corpus, each of which is claimed to be a sentence. Thus, correct sentence boundaries are not our concern, and the summary is aimed at being an indicative summary.

พยายามวางแผน และประสานงานอย่างใกล้ชิด (Try to plan and to cooperate closely)
รัฐมนตรีว่าการกระทรวงวิทยาศาสตร์ (Minister of Science)

Figure 2. Excerpt of sentences from ORCHID corpus

Thus, phrases are the major component used in the process of Thai text summarization.
The simple algorithm presented by H.P. Luhn [Luhn H.P. 1958] is used to measure sentence importance and will be referred to as the within-sentence clustering technique. This method has been studied in [Buyukkokten O., Garcia-Molina H. and Paepcke A. 2000] with a different data set and an inverse document frequency weighting scheme. The summarization process starts by filtering out Thai stop words. Then, for each document, term frequencies (TF, total frequencies) are computed to identify the significant words in the document. Words whose frequency exceeds 10 percent of the document are eliminated. Rare words and specific words are not removed. Then, each sentence is divided into clusters according to the number of non-significant words between significant words. A cluster is a sequence of consecutive words that starts and ends with a significant word and in which no more than D_n non-significant words separate adjacent significant words. If a sentence has more than one cluster, the highest-scoring one is selected. The sentence is then ranked by the square of the number of significant words in the cluster divided by the total number of words in the cluster. Figure 3 shows details of the computation of sentence ranking.
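The ranking rule above can be sketched in a few lines of Python. This is an illustrative reading of the description, not the authors' code: the function names, the default `max_gap` value standing in for D_n, and the interpretation of the 10 percent cutoff as a share of all tokens in the document are assumptions. Input is assumed to be already segmented into words.

```python
from collections import Counter


def significant_words(tokens, stop_words, max_share=0.10):
    """Non-stop words whose count does not exceed max_share of the document."""
    counts = Counter(t for t in tokens if t not in stop_words)
    limit = max_share * len(tokens)
    return {t for t, c in counts.items() if c <= limit}


def luhn_score(words, significant, max_gap=4):
    """Best cluster score in one sentence: (significant words)^2 / cluster length."""
    positions = [i for i, w in enumerate(words) if w in significant]
    if not positions:
        return 0.0
    best = 0.0
    start = prev = positions[0]
    count = 1
    for pos in positions[1:]:
        if pos - prev - 1 > max_gap:
            # Too many non-significant words in between: close the cluster.
            best = max(best, count * count / (prev - start + 1))
            start, count = pos, 0
        count += 1
        prev = pos
    # Score the last open cluster.
    return max(best, count * count / (prev - start + 1))
```

For example, a cluster of seven words that contains four significant words scores 4 * 4 / 7, or about 2.3. Sentences would then be sorted by this score and the top few kept as the summary.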
Figure 3. Computation of clusters and sentence ranking (a sentence containing one bracketed within-sentence cluster, with its significant words marked *)

If D_n > 2, the sentence ranking is 2.3. The number of non-significant words (D_n) should be greater than 2. The system limits the number of sentences in the summary of each document. To reduce the loss of content, the system allows the user to view the original document. In the current report, the effectiveness of summarization is evaluated by user satisfaction. Thirty documents were randomly selected and evaluated by second-year undergraduate students. The evaluation criteria are satisfactory, fair, and unacceptable. The results show that more than 53% of the summaries are rated fair, around 11% satisfactory, and the rest unacceptable.

4. System Implementation

The framework is written in Java. It is built on a client-server model, where text-based and client modules are handled on the server side. Indexing and retrieval are developed using RMI, a distributed processing technology. The summary content delivered to the client is marked up with XML tags, so it is easy to reformat and display on any device. To display information on a PDA, the kXML parser is employed. A general browser uses the default IE parser. The mobile client needs at least 16 MB and Palm OS version 4 or higher. There is no standard for displaying Thai on PDAs. The current version needs the Thai routing from Thaihack installed to display Thai. For text summarization, the test data set is collected from the Thai newspaper Daily News. The data set covers daily events, such as economic, foreign affairs, political, and social news in Thailand. Figure 4 shows some displays of this framework.

6. Conclusion and Future Works

This framework is our experiment to explore new applications on PDA devices, and it extends our previous research to mobile clients.
This framework serves a particular search task. We present compact content by applying summarization and clustering techniques. Furthermore, information is filtered according to the users' preferences and mailed back to them. The system has not been evaluated using theoretical information retrieval measures, since the purpose is to present a possible model for delivering Thai text to mobile users. For future research on the summarization process, a number of approaches may be investigated and experimented with on Thai text. Approaches to text summarization have been grouped into four categories: a summary consisting of a list of terms or concept terms, a single passage extracted from the text, a sequence of sentences extracted from the text, and a summary generated using natural language understanding. The most important need for Thai text summarization is a standard test data set, since there are a number of parameters to explore. Additionally, to improve the summarization results, natural language processing techniques may be needed to extract proper nouns.

Figure 4. Original Document, Summarization Results and Clustering Results
Acknowledgments

This research was partially supported by the Office of the National Research Council of Thailand.

References

[1] Buyukkokten O., Garcia-Molina H. and Paepcke A., Seeing the Whole in Parts: Text Summarization for Web Browsing on Handheld Devices, Proceedings of the Tenth International World-Wide Web Conference, 2000
[2] Marsden G., Cherry R., and Haefele A., Small Screen Access to Digital Libraries, Computer Network 31 (1999)
[3] Michael J. Albers, Loel Kim, Implications of the wireless web for technical communicators: User web browsing characteristics using palm handhelds for information retrieval, Proceedings of IEEE professional communication society international professional communication conference and Proceedings of the 18th annual ACM international conference on Computer documentation: technology & teamwork, September 2000
[4] Watters C. and Zhang R., PDA Access to Internet Content: Focus on Forms, HICSS 36 Hawaii, Jan
[5] Matt J., Gary M., Norliza M., and Boone K., Improving Web Interaction on Small Displays, Computer Network 31 (1999)
[6] Sornlertlamvanich V., Potipiti T., Wutiwiwatchai C., and Mittrapiyanuruk P., The State of the Art in Thai Language Processing, Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics (ACL2000), Hong Kong, pp , October
[7] Hahn Udo and Mani Inderjeet, The Challenges of Automatic Summarization, IEEE Computer, Nov. 2000, (Vol 33, No. 11)
[8] Luhn H.P., The Automatic Creation of Literature Abstracts, Advances in Automatic Text Summarization, edited by Inderjeet Mani and Mark T. Maybury, The MIT Press, Cambridge, Massachusetts, London, England,
[9] Charoenpornsawat P. and Sornlertlamvanich V., Automatic Sentence Break Disambiguation for Thai, Proceedings of ICCPOL2001, Korea, pp , May
[10] Chan D.L., Luk R.W.P., Mark W.K., Leon H.V., Ho E.K.S. and Lu Q., Multiple Related Document Summary and Navigation using Concept Hierarchies for Mobile Clients, Proceedings of the 2002 ACM Symposium on Applied Computing (SAC), March 10-14, 2002, Madrid, Spain. ACM 2002
[11] Jaruskulchai J., Kiewsuwansuk S., and Kantasena J., Thai Text Document Clustering, The Fifth National Computer Science and Engineering Conference, 7-9 Nov 2001, Chiang Mai, Thailand,
[12] Information Research and Development Division, Orchid Corpus, National Electronics and Computer Technology Centers, /itech/download.html, Jan
A PRELIMINARY STUDY ON THE EXTRACTION OF SOCIO-TOPICAL WEB KEYWORDS
A PRELIMINARY STUDY ON THE EXTRACTION OF SOCIO-TOPICAL WEB KEYWORDS KULWADEE SOMBOONVIWAT Graduate School of Information Science and Technology, University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-0033,
More informationShrey Patel B.E. Computer Engineering, Gujarat Technological University, Ahmedabad, Gujarat, India
International Journal of Scientific Research in Computer Science, Engineering and Information Technology 2018 IJSRCSEIT Volume 3 Issue 3 ISSN : 2456-3307 Some Issues in Application of NLP to Intelligent
More informationA Web Page Segmentation Method by using Headlines to Web Contents as Separators and its Evaluations
IJCSNS International Journal of Computer Science and Network Security, VOL.13 No.1, January 2013 1 A Web Page Segmentation Method by using Headlines to Web Contents as Separators and its Evaluations Hiroyuki
More informationDomain-specific Concept-based Information Retrieval System
Domain-specific Concept-based Information Retrieval System L. Shen 1, Y. K. Lim 1, H. T. Loh 2 1 Design Technology Institute Ltd, National University of Singapore, Singapore 2 Department of Mechanical
More informationWeb of Science. LIBRARY SERVICES
Web of Science Web of Science is a comprehensive online database providing access to academic journals, conference proceedings and books in the sciences, social sciences, arts and humanities, from 1970
More informationA Frequent Max Substring Technique for. Thai Text Indexing. School of Information Technology. Todsanai Chumwatana
School of Information Technology A Frequent Max Substring Technique for Thai Text Indexing Todsanai Chumwatana This thesis is presented for the Degree of Doctor of Philosophy of Murdoch University May
More information(http://www.emeraldinsight.com)
Emerald (http://www.emeraldinsight.com) Emerald publishes the world's widest range of management journals which provides information, ideas and opportunity to gain insight into key management topics. Emerald
More informationInformation Retrieval
Multimedia Computing: Algorithms, Systems, and Applications: Information Retrieval and Search Engine By Dr. Yu Cao Department of Computer Science The University of Massachusetts Lowell Lowell, MA 01854,
More informationBlind Evaluation for Thai Search Engines
Blind Evaluation for Thai Search Engines Shisanu Tongchim, Prapass Srichaivattana, Virach Sornlertlamvanich, Hitoshi Isahara Thai Computational Linguistics Laboratory 112 Paholyothin Road, Klong 1, Klong
More informationA Survey Of Different Text Mining Techniques Varsha C. Pande 1 and Dr. A.S. Khandelwal 2
A Survey Of Different Text Mining Techniques Varsha C. Pande 1 and Dr. A.S. Khandelwal 2 1 Department of Electronics & Comp. Sc, RTMNU, Nagpur, India 2 Department of Computer Science, Hislop College, Nagpur,
More informationA BFS-BASED SIMILAR CONFERENCE RETRIEVAL FRAMEWORK
A BFS-BASED SIMILAR CONFERENCE RETRIEVAL FRAMEWORK Qing Guo 1, 2 1 Nanyang Technological University, Singapore 2 SAP Innovation Center Network,Singapore ABSTRACT Literature review is part of scientific
More informationNews Filtering and Summarization System Architecture for Recognition and Summarization of News Pages
Bonfring International Journal of Data Mining, Vol. 7, No. 2, May 2017 11 News Filtering and Summarization System Architecture for Recognition and Summarization of News Pages Bamber and Micah Jason Abstract---
More informationKarami, A., Zhou, B. (2015). Online Review Spam Detection by New Linguistic Features. In iconference 2015 Proceedings.
Online Review Spam Detection by New Linguistic Features Amir Karam, University of Maryland Baltimore County Bin Zhou, University of Maryland Baltimore County Karami, A., Zhou, B. (2015). Online Review
More informationAdding a Source Code Searching Capability to Yioop ADDING A SOURCE CODE SEARCHING CAPABILITY TO YIOOP CS297 REPORT
ADDING A SOURCE CODE SEARCHING CAPABILITY TO YIOOP CS297 REPORT Submitted to Dr. Chris Pollett By Snigdha Rao Parvatneni 1 1. INTRODUCTION The aim of the CS297 project is to explore and learn important
More informationKEYWORD EXTRACTION FROM DESKTOP USING TEXT MINING TECHNIQUES
KEYWORD EXTRACTION FROM DESKTOP USING TEXT MINING TECHNIQUES Dr. S.Vijayarani R.Janani S.Saranya Assistant Professor Ph.D.Research Scholar, P.G Student Department of CSE, Department of CSE, Department
More informationAdaptable and Adaptive Web Information Systems. Lecture 1: Introduction
Adaptable and Adaptive Web Information Systems School of Computer Science and Information Systems Birkbeck College University of London Lecture 1: Introduction George Magoulas gmagoulas@dcs.bbk.ac.uk October
More informationString Vector based KNN for Text Categorization
458 String Vector based KNN for Text Categorization Taeho Jo Department of Computer and Information Communication Engineering Hongik University Sejong, South Korea tjo018@hongik.ac.kr Abstract This research
More informationText Mining. Representation of Text Documents
Data Mining is typically concerned with the detection of patterns in numeric data, but very often important (e.g., critical to business) information is stored in the form of text. Unlike numeric data,
More informationYou ve Got A Workflow Management Extraction System
342 Journal of Reviews on Global Economics, 2017, 6, 342-349 You ve Got Email: A Workflow Management Extraction System Piyanuch Chaipornkaew 1, Takorn Prexawanprasut 1,* and Michael McAleer 2-6 1 College
More informationAsia Top Internet Countries June 30, 2012 China 538 India Japan Indonesia Korea, South Philippines Vietnam Pakistan Thailand Malaysia
EXPLORING TECHNOLOGY ADOPTION FACTORS OF WEB SEARCH ENGINES INFLUENCING TO USERS IN THAILAND Patthawadi Pawanprommaraj, Supaporn Kiattisin and Adisorn Leelasantitham Department of Technology of Information
More informationScitation A User Guide
Scitation A User Guide Manage your research faster and easier scitation.aip.org Scitation A Rich Resource for Accessing and Using Scholarly Publications Scitation is the online home to more than one million
More informationA Comparative Study Weighting Schemes for Double Scoring Technique
, October 19-21, 2011, San Francisco, USA A Comparative Study Weighting Schemes for Double Scoring Technique Tanakorn Wichaiwong Member, IAENG and Chuleerat Jaruskulchai Abstract In XML-IR systems, the
More informationSTUDYING OF CLASSIFYING CHINESE SMS MESSAGES
STUDYING OF CLASSIFYING CHINESE SMS MESSAGES BASED ON BAYESIAN CLASSIFICATION 1 LI FENG, 2 LI JIGANG 1,2 Computer Science Department, DongHua University, Shanghai, China E-mail: 1 Lifeng@dhu.edu.cn, 2
More informationLetterScroll: Text Entry Using a Wheel for Visually Impaired Users
LetterScroll: Text Entry Using a Wheel for Visually Impaired Users Hussain Tinwala Dept. of Computer Science and Engineering, York University 4700 Keele Street Toronto, ON, CANADA M3J 1P3 hussain@cse.yorku.ca
More informationEvaluation of Web Search Engines with Thai Queries
Evaluation of Web Search Engines with Thai Queries Virach Sornlertlamvanich, Shisanu Tongchim and Hitoshi Isahara Thai Computational Linguistics Laboratory 112 Paholyothin Road, Klong Luang, Pathumthani,
More informationA Prototype System to Browse Web News using Maps for NIE in Elementary Schools in Japan
A Prototype System to Browse Web News using Maps for NIE in Elementary Schools in Japan Yutaka Uchiyama *1 Akifumi Kuroda *2 Kazuaki Ando *3 *1, 2 Graduate School of Engineering, *3 Faculty of Engineering
More informationComment Extraction from Blog Posts and Its Applications to Opinion Mining
Comment Extraction from Blog Posts and Its Applications to Opinion Mining Huan-An Kao, Hsin-Hsi Chen Department of Computer Science and Information Engineering National Taiwan University, Taipei, Taiwan
More informationClassifying Twitter Data in Multiple Classes Based On Sentiment Class Labels
Classifying Twitter Data in Multiple Classes Based On Sentiment Class Labels Richa Jain 1, Namrata Sharma 2 1M.Tech Scholar, Department of CSE, Sushila Devi Bansal College of Engineering, Indore (M.P.),
More informationMURDOCH RESEARCH REPOSITORY
MURDOCH RESEARCH REPOSITORY http://researchrepository.murdoch.edu.au/ This is the author s final version of the work, as accepted for publication following peer review but without the publisher s layout
More informationFeature Selecting Model in Automatic Text Categorization of Chinese Financial Industrial News
Selecting Model in Automatic Text Categorization of Chinese Industrial 1) HUEY-MING LEE 1 ), PIN-JEN CHEN 1 ), TSUNG-YEN LEE 2) Department of Information Management, Chinese Culture University 55, Hwa-Kung
More informationCIRGDISCO at RepLab2012 Filtering Task: A Two-Pass Approach for Company Name Disambiguation in Tweets
CIRGDISCO at RepLab2012 Filtering Task: A Two-Pass Approach for Company Name Disambiguation in Tweets Arjumand Younus 1,2, Colm O Riordan 1, and Gabriella Pasi 2 1 Computational Intelligence Research Group,
More informationFall 2013 Harvard Library User Survey Summary December 18, 2013
Fall 2013 Harvard Library User Survey Summary December 18, 2013 The Discovery Platform Investigation group placed links to a User Survey on the four major Harvard Library web sites (HOLLIS, HOLLIS Classic,
More informationVisoLink: A User-Centric Social Relationship Mining
VisoLink: A User-Centric Social Relationship Mining Lisa Fan and Botang Li Department of Computer Science, University of Regina Regina, Saskatchewan S4S 0A2 Canada {fan, li269}@cs.uregina.ca Abstract.
More informationA Review on Identifying the Main Content From Web Pages
A Review on Identifying the Main Content From Web Pages Madhura R. Kaddu 1, Dr. R. B. Kulkarni 2 1, 2 Department of Computer Scienece and Engineering, Walchand Institute of Technology, Solapur University,
More informationMulti-Dimensional Text Classification
Multi-Dimensional Text Classification Thanaruk THEERAMUNKONG IT Program, SIIT, Thammasat University P.O. Box 22 Thammasat Rangsit Post Office, Pathumthani, Thailand, 12121 ping@siit.tu.ac.th Verayuth LERTNATTEE
More informationClustering Web Documents using Hierarchical Method for Efficient Cluster Formation
Clustering Web Documents using Hierarchical Method for Efficient Cluster Formation I.Ceema *1, M.Kavitha *2, G.Renukadevi *3, G.sripriya *4, S. RajeshKumar #5 * Assistant Professor, Bon Secourse College
More informationAn Empirical Study of Web Interface Design on Small Display Devices
An Empirical Study of Web Interface Design on Small Display Devices Mei Kang QIU Kang ZHANG Maolin HUANG Department of Computer Science Department of Computer Science Department of Computer Systems University
More informationReading group on Ontologies and NLP:
Reading group on Ontologies and NLP: Machine Learning27th infebruary Automated 2014 1 / 25 Te Reading group on Ontologies and NLP: Machine Learning in Automated Text Categorization, by Fabrizio Sebastianini.
More informationWEB SEARCH, FILTERING, AND TEXT MINING: TECHNOLOGY FOR A NEW ERA OF INFORMATION ACCESS
1 WEB SEARCH, FILTERING, AND TEXT MINING: TECHNOLOGY FOR A NEW ERA OF INFORMATION ACCESS BRUCE CROFT NSF Center for Intelligent Information Retrieval, Computer Science Department, University of Massachusetts,
More informationCS473: Course Review CS-473. Luo Si Department of Computer Science Purdue University
CS473: CS-473 Course Review Luo Si Department of Computer Science Purdue University Basic Concepts of IR: Outline Basic Concepts of Information Retrieval: Task definition of Ad-hoc IR Terminologies and
More informationRecruitment Agency Based on SOA and XML Web Services
Recruitment Agency Based on SOA and XML Web Services Nutthapat Kaewrattanapat and Jarumon Nookhong Department of Information Science, Suan Sunandha Rajabhat University, Bangkok, Thailand Email: {nutthapat.ke,
More informationWeb Product Ranking Using Opinion Mining
Web Product Ranking Using Opinion Mining Yin-Fu Huang and Heng Lin Department of Computer Science and Information Engineering National Yunlin University of Science and Technology Yunlin, Taiwan {huangyf,
More informationarxiv: v1 [cs.hc] 14 Nov 2017
A visual search engine for Bangladeshi laws arxiv:1711.05233v1 [cs.hc] 14 Nov 2017 Manash Kumar Mandal Department of EEE Khulna University of Engineering & Technology Khulna, Bangladesh manashmndl@gmail.com
More informationDesigning and Building an Automatic Information Retrieval System for Handling the Arabic Data
American Journal of Applied Sciences (): -, ISSN -99 Science Publications Designing and Building an Automatic Information Retrieval System for Handling the Arabic Data Ibrahiem M.M. El Emary and Ja'far
More informationAdvanced Smart Mobile Monitoring Solution for Managing Efficiently Gas Facilities of Korea
Advanced Smart Mobile Monitoring Solution for Managing Efficiently Gas Facilities of Korea Jeong Seok Oh 1, Hyo Jung Bang 1, Green Bang 2 and Il-ju Ko 2, 1 Institute of Gas Safety R&D, Korea Gas Safety
More informationChapter 27 Introduction to Information Retrieval and Web Search
Chapter 27 Introduction to Information Retrieval and Web Search Copyright 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 27 Outline Information Retrieval (IR) Concepts Retrieval
More informationInternational ejournals
Available online at www.internationalejournals.com International ejournals ISSN 0976 1411 International ejournal of Mathematics and Engineering 112 (2011) 1023-1029 ANALYZING THE REQUIREMENTS FOR TEXT
Archives in a Networked Information Society: The Problem of Sustainability in the Digital Information Environment
Archives in a Networked Information Society: The Problem of Sustainability in the Digital Information Environment Shigeo Sugimoto Research Center for Knowledge Communities Graduate School of Library, Information
Information Gathering Support Interface by the Overview Presentation of Web Search Results
Information Gathering Support Interface by the Overview Presentation of Web Search Results Takumi Kobayashi Kazuo Misue Buntarou Shizuki Jiro Tanaka Graduate School of Systems and Information Engineering
A hybrid method to categorize HTML documents
Data Mining VI 331 A hybrid method to categorize HTML documents M. Khordad, M. Shamsfard & F. Kazemeyni Electrical & Computer Engineering Department, Shahid Beheshti University, Iran Abstract In this paper
Learning and Development. UWE Staff Profiles (USP) User Guide
Learning and Development UWE Staff Profiles (USP) User Guide About this training manual This manual is yours to keep and is intended as a guide to be used during the training course and as a reference
Document Summarization on Handheld Device:
Document Summarization on Handheld Device: An Information Visualization Tool for Mobile Commerce Christopher C. Yang Dept. of Systems Eng. and Eng. Management The Chinese University of Hong Kong Hong Kong
Semantic Extensions to Syntactic Analysis of Queries Ben Handy, Rohini Rajaraman
Semantic Extensions to Syntactic Analysis of Queries Ben Handy, Rohini Rajaraman Abstract We intend to show that leveraging semantic features can improve precision and recall of query results in information
Scientific databases
SCID 305 : Generic Skills in Science Research Scientific databases Suang Udomvaraphunt Academic IT Stang Monkolsuk library and Information Division Faculty of Science Stang Mongkolsuk Library http://stang.sc.mahidol.ac.th
Automatic Text Summarization System Using Extraction Based Technique
Automatic Text Summarization System Using Extraction Based Technique 1 Priyanka Gonnade, 2 Disha Gupta 1,2 Assistant Professor 1 Department of Computer Science and Engineering, 2 Department of Computer
Domain Specific Search Engine for Students
Domain Specific Search Engine for Students Wai Yuen Tang The Department of Computer Science City University of Hong Kong, Hong Kong wytang@cs.cityu.edu.hk Lam
Next-Generation Standards Management with IHS Engineering Workbench
ENGINEERING & PRODUCT DESIGN Next-Generation Standards Management with IHS Engineering Workbench The addition of standards management capabilities in IHS Engineering Workbench provides IHS Standards Expert
News-Oriented Keyword Indexing with Maximum Entropy Principle.
News-Oriented Keyword Indexing with Maximum Entropy Principle. Li Sujian 1, Wang Houfeng 1, Yu Shiwen 1, Xin Chengsheng 2 1 Institute of Computational Linguistics, Peking University, 100871, Beijing, China {lisujian,
Extracting Summary from Documents Using K-Mean Clustering Algorithm
Extracting Summary from Documents Using K-Mean Clustering Algorithm Manjula.K.S 1, Sarvar Begum 2, D. Venkata Swetha Ramana 3 Student, CSE, RYMEC, Bellary, India 1 Student, CSE, RYMEC, Bellary, India 2
January-March, 2016 ISSN NO
USER INTERFACES FOR INFORMATION RETRIEVAL ON THE WWW: A PERSPECTIVE OF INDIAN WOMEN. Sunil Kumar Research Scholar Bhagwant University,Ajmer sunilvats1981@gmail.com Dr. S.B.L. Tripathy Abstract Information
Map-based Access to Multiple Educational On-Line Resources from Mobile Wireless Devices
Map-based Access to Multiple Educational On-Line Resources from Mobile Wireless Devices P. Brusilovsky 1 and R.Rizzo 2 1 School of Information Sciences, University of Pittsburgh, Pittsburgh PA 15260, USA
Keywords Data alignment, Data annotation, Web database, Search Result Record
Volume 5, Issue 8, August 2015 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Annotating Web
INFOED. identify and. target
CREATING A RESEARCH INTEREST PROFILE WITH INFOED Research interest profiles are used by the Office of Research and Sponsored Programs (ORSP) to find funding opportunities. By having a list of interests
Microsoft SharePoint Server 2013 for the Site Owner/Power User
Course 55035B: Microsoft SharePoint Server 2013 for the Site Owner/Power User Page 1 of 6 Microsoft SharePoint Server 2013 for the Site Owner/Power User Course 55035B: 2 days; Instructor-Led Introduction
Diploma Of Computing
Diploma Of Computing Course Outline Campus Intake CRICOS Course Duration Teaching Methods Assessment Course Structure Units Melbourne Burwood Campus / Jakarta Campus, Indonesia March, June, October 022638B
SIRS Issues Researcher
From the main screen of SIRS, click on the SIRS Issues Researcher link. 1 This tutorial will provide an overview of the following features available through SIRS Issues Researcher: 2. Search Tabs 3. Reference
FOUNDATIONS OF INFORMATION SYSTEMS MIS 2749 COURSE SYLLABUS Fall, Course Title and Description
FOUNDATIONS OF INFORMATION SYSTEMS MIS 2749 COURSE SYLLABUS Fall, 2013 Instructor s Name: Vicki Robertson E-mail: vrobrtsn@memphis.edu Course Title and Description Foundations of Information Systems. (3
VIDEO SEARCHING AND BROWSING USING VIEWFINDER
VIDEO SEARCHING AND BROWSING USING VIEWFINDER By Dan E. Albertson Dr. Javed Mostafa John Fieber Ph. D. Student Associate Professor Ph. D. Candidate Information Science Information Science Information Science
Information Retrieval. (M&S Ch 15)
Information Retrieval (M&S Ch 15) 1 Retrieval Models A retrieval model specifies the details of: Document representation Query representation Retrieval function Determines a notion of relevance. Notion
Effect of log-based Query Term Expansion on Retrieval Effectiveness in Patent Searching
Effect of log-based Query Term Expansion on Retrieval Effectiveness in Patent Searching Wolfgang Tannebaum, Parvaz Madabi and Andreas Rauber Institute of Software Technology and Interactive Systems, Vienna
Quoogle: A Query Expander for Google
Quoogle: A Query Expander for Google Michael Smit Faculty of Computer Science Dalhousie University 6050 University Avenue Halifax, NS B3H 1W5 smit@cs.dal.ca ABSTRACT The query is the fundamental way through
Assessment Plan. Academic Cycle
College of Business and Technology Division or Department: School of Business (Business Administration, BS) Prepared by: Marcia Hardy Date: June 21, 2017 Approved by: Margaret Kilcoyne Date: June 21, 2017
Lecture Video Indexing and Retrieval Using Topic Keywords
Lecture Video Indexing and Retrieval Using Topic Keywords B. J. Sandesh, Saurabha Jirgi, S. Vidya, Prakash Eljer, Gowri Srinivasa International Science Index, Computer and Information Engineering waset.org/publication/10007915
PELLISSIPPI STATE TECHNICAL COMMUNITY COLLEGE MASTER SYLLABUS CIW JAVASCRIPT FUNDAMENTALS CERTIFICATION WEB 2391
PELLISSIPPI STATE TECHNICAL COMMUNITY COLLEGE MASTER SYLLABUS CIW JAVASCRIPT FUNDAMENTALS CERTIFICATION WEB 2391 Class Hours: 1.0 Credit Hours: 1.0 Laboratory Hours: 0.0 Revised: Fall 06 Note: This course
CS54701: Information Retrieval
CS54701: Information Retrieval Basic Concepts 19 January 2016 Prof. Chris Clifton 1 Text Representation: Process of Indexing Remove Stopword, Stemming, Phrase Extraction etc Document Parser Extract useful
Keyword Extraction by KNN considering Similarity among Features
64 Int'l Conf. on Advances in Big Data Analytics ABDA'15 Keyword Extraction by KNN considering Similarity among Features Taeho Jo Department of Computer and Information Engineering, Inha University, Incheon,
I-Pats: An Intelligent Search System for Thai Patents
I-Pats: An Intelligent Search System for Thai Patents Marut Buranarach 1 Choochart Haruechaiyasak 2 Alisa Kongthon 3 1,2,3 Human Language Technology Laboratory, National Electronics and Computer Technology
Automatic Domain Partitioning for Multi-Domain Learning
Automatic Domain Partitioning for Multi-Domain Learning Di Wang diwang@cs.cmu.edu Chenyan Xiong cx@cs.cmu.edu William Yang Wang ww@cmu.edu Abstract Multi-Domain learning (MDL) assumes that the domain labels
User-Centered Guidelines for Design of Mobile Applications
The Fourth International Conference on Electronic Business (ICEB2004) / Beijing 853 User-Centered Guidelines for Design of Mobile Applications Xiaowen Fang, Susy Chan, Jacek Brzezinski, Shuang Xu, Jean
Overview of the INEX 2009 Link the Wiki Track
Overview of the INEX 2009 Link the Wiki Track Wei Che (Darren) Huang 1, Shlomo Geva 2 and Andrew Trotman 3 Faculty of Science and Technology, Queensland University of Technology, Brisbane, Australia 1,
A Linear Regression Model for Assessing the Ranking of Web Sites Based on Number of Visits
A Linear Regression Model for Assessing the Ranking of Web Sites Based on Number of Visits Dowming Yeh, Pei-Chen Sun, and Jia-Wen Lee National Kaoshiung Normal University Kaoshiung, Taiwan 802, Republic
Enhancing Cluster Quality by Using User Browsing Time
Enhancing Cluster Quality by Using User Browsing Time Rehab M. Duwairi* and Khaleifah Al.jada'** * Department of Computer Information Systems, Jordan University of Science and Technology, Irbid 22110,
Enhancing Web Page Skimmability
Enhancing Web Page Skimmability Chen-Hsiang Yu MIT CSAIL 32 Vassar St Cambridge, MA 02139 chyu@mit.edu Robert C. Miller MIT CSAIL 32 Vassar St Cambridge, MA 02139 rcm@mit.edu Abstract Information overload
Efficient Web Browsing on Handheld Devices Using Page and Form Summarization
Efficient Web Browsing on Handheld Devices Using Page and Form Summarization ORKUT BUYUKKOKTEN, OLIVER KALJUVEE, HECTOR GARCIA-MOLINA, ANDREAS PAEPCKE and TERRY WINOGRAD Stanford University We present
Wrapper: An Application for Evaluating Exploratory Searching Outside of the Lab
Wrapper: An Application for Evaluating Exploratory Searching Outside of the Lab Bernard J Jansen College of Information Sciences and Technology The Pennsylvania State University University Park PA 16802
A Study of Thai Succession Law Ontology on Supreme Court Sentences Retrieval
A Study of Thai Succession Law Ontology on Supreme Court Sentences Retrieval Tanapon Tantisripreecha and Nuanwan Soonthornphisaj Abstract This paper presents an improvement of our approach called SCRO_II
Efficient Web Browsing on Handheld Devices Using Page and Form Summarization
Efficient Web Browsing on Handheld Devices Using Page and Form Summarization Orkut Buyukkokten, Oliver Kaljuvee, Hector Garcia-Molina, Andreas Paepcke, Terry Winograd Digital Libraries Lab(InfoLab), Stanford
Bachelor of Arts Program in Information Science
Bachelor of Arts Program in Information Science Philosophy Creativity Service-minded Information Specialist Degree Bachelor of Arts (Information Science) B.A. (Information Science) Now in the process of
Developing Focused Crawlers for Genre Specific Search Engines
Developing Focused Crawlers for Genre Specific Search Engines Nikhil Priyatam Thesis Advisor: Prof. Vasudeva Varma IIIT Hyderabad July 7, 2014 Examples of Genre Specific Search Engines MedlinePlus Naukri.com
Improving Quality of Products in Hard Drive Manufacturing by Decision Tree Technique
Improving Quality of Products in Hard Drive Manufacturing by Decision Tree Technique Anotai Siltepavet 1, Sukree Sinthupinyo 2 and Prabhas Chongstitvatana 3 1 Computer Engineering, Chulalongkorn University,
Participant Training Guide
http://secnet.cch.com March, 2010 Table of Contents Introduction...2 Objectives...2 Accessing...3 Home Page...4 Filings...5 Viewing Search Results...7 Viewing Documents...8 Record Keeping...9 Today s Filings...10
Information Management (IM)
Information Management (IM) Information Management (IM) is primarily concerned with the capture, digitization, representation, organization, transformation, and presentation of information;
Enhancing Cluster Quality by Using User Browsing Time
Enhancing Cluster Quality by Using User Browsing Time Rehab Duwairi Dept. of Computer Information Systems Jordan Univ. of Sc. and Technology Irbid, Jordan rehab@just.edu.jo Khaleifah Al.jada' Dept. of
I. General regulations
Degree and examination regulations for the consecutive international master's program in Architecture Typology at Faculty VI of the Technische Universität Berlin, October 2, 206 On October 2, 206, the
Text Documents clustering using K Means Algorithm
Text Documents clustering using K Means Algorithm Mrs Sanjivani Tushar Deokar Assistant professor sanjivanideokar@gmail.com Abstract: With the advancement of technology and reduced storage costs, individuals
Investigating the Effects of User Age on Readability
Investigating the Effects of User Age on Readability Kyung Hoon Hyun, Ji-Hyun Lee, and Hwon Ihm Korea Advanced Institute of Science and Technology, Korea {hellohoon,jihyunl87,raccoon}@kaist.ac.kr Abstract.
Integration of Handwriting Recognition in Butterfly Net
Integration of Handwriting Recognition in Butterfly Net Sye-Min Christina Chan Department of Computer Science Stanford University Stanford, CA 94305 USA sychan@stanford.edu Abstract ButterflyNet allows
WEBMASTER OVERVIEW PURPOSE ELIGIBILITY TIME LIMITS
WEBMASTER OVERVIEW Participants are required to design, build and launch a World Wide Web site that features the school s career and technology education program, the TSA chapter, and the chapter s ability
CSI5387: Data Mining Project
CSI5387: Data Mining Project Terri Oda April 14, 2008 1 Introduction Web pages have become more like applications than documents. Not only do they provide dynamic content, they also allow users to play
MURDOCH RESEARCH REPOSITORY
MURDOCH RESEARCH REPOSITORY http://researchrepository.murdoch.edu.au/ This is the author s final version of the work, as accepted for publication following peer review but without the publisher s layout