Developer Recommendation for Crowdsourced Software Development Tasks

1 Developer Recommendation for Crowdsourced Software Development Tasks. CREST Centre, University College London. March 30, 2015, San Francisco, USA

2 OUTLINE
INTRODUCTION: Motivation, Problem
BACKGROUND: TopCoder, Recommendation
METHODOLOGY: Framework, Features
EVALUATION: RQs, Setting, Results

3 MOTIVATION Developer's Perspective: Information overload. Figure 1: Available tasks listed on TopCoder on Dec. 12, 2014.

4 MOTIVATION Platform's Perspective: The limitation of the pull-based model.

5 PROBLEM
1. Suitable developers for participation?
2. Reliable developers for delivering qualified assets?
Historical tasks (register history and win history, with their participants and winners) are used to recommend, for a new task whose participants and winner are unknown, developers for participation and developers for delivering qualified assets.
Figure 2: Developer recommendation utilising historical data.

6 TOPCODER
Established in 2001
Largest community for CSD
Over 770,000 developers

7 TOPCODER Figure 3: TopCoder case studies

8 TOPCODER PROCESS Figure 4: Crowdsourced software development process and its task phases (derived from TopCoder.com and Mao et al.

9 RECOMMENDER SYSTEM
Content-based filtering
Collaborative filtering
Hybrid

10 CROWDREX: FRAMEWORK
Goals: Recommend Developers for Delivering Qualified Assets; Recommend Developers for Participation.
Steps: 1. Data Filtering; 2. Feature Extraction; 3. Learner Training; 4. Learner Application.
Diagram elements: Win History, Register History, New Task, Data Filtering, Feature Extraction, Data Transform, Winners Learner (WL), Participants Learner (PL), WL Developer Distribution, PL Developer Distribution, Rank Developers for Delivering Qualified Assets, Rank Developers for Participation, Recommended Developers.
Figure 5: The framework of CrowdRex

11 CROWDREX: DATA FILTERING
Figure 6: The framework of CrowdRex

12 CROWDREX: FEATURE EXTRACTION
Figure 7: The framework of CrowdRex

13 FEATURES

20 FEATURES
Table 1: The content features for crowdsourced software tasks
Feature | Format | Description
Date | Numeric | Post date of the task.
PL | Text | Programming language used.
Title | Text | Title of the posted task.
Tech | Text | Techniques used.
Description | Text | Overview of the task description.
Duration | Numeric | Time allocated to the task.
Payment | Numeric | Amount in US dollars that the winner receives.

21 FEATURES
Table 2: The adopted feature distance measures
Feature | Distance Measure
Date | $(Date_i - Date_j) / Date_{MaxDiff}$
PL | $PL_i == PL_j \;?\; 1 : 0$
Title | $\frac{Tit_x \cdot Tit_y}{\lVert Tit_x \rVert \, \lVert Tit_y \rVert}$
Tech | $Match(Tech_i, Tech_j) / NumberOfTechs_{Max}$
Description | $\frac{Des_x \cdot Des_y}{\lVert Des_x \rVert \, \lVert Des_y \rVert}$
Duration | $(Duration_i - Duration_j) / Duration_{Max}$
Payment | $(Payment_i - Payment_j) / Payment_{Max}$
NLP: 1. Tokenization; 2. Stop words removal; 3. Term vector model (in TF-IDF); 4. Cosine similarity.
$$w_{i,j} = \left(1 + \log(tf_j)\right) \log \frac{T}{df_i} \qquad (1)$$
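The four NLP steps listed on this slide (tokenization, stop-word removal, TF-IDF term vectors, cosine similarity) can be sketched in plain Python. This is a minimal illustration, not the CrowdRex implementation: the stop-word list, the tokenizer regex, the function names and the sample task titles are all assumptions made for the example, and the weight follows the slide's formula in the form (1 + log tf) * log(T / df).

```python
import math
import re
from collections import Counter

# Illustrative stop-word list (an assumption; the slides do not give one).
STOP_WORDS = {"a", "an", "the", "for", "of", "to", "and", "in", "with"}

def tokenize(text):
    """Lowercase, split on non-alphanumerics, drop stop words."""
    words = re.findall(r"[a-z0-9+#]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]

def tfidf_vectors(docs):
    """Return one {term: weight} dict per document.

    weight = (1 + log(tf)) * log(T / df), where T is the number of
    documents and df the number of documents containing the term.
    """
    tokenized = [tokenize(d) for d in docs]
    total = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append(
            {t: (1 + math.log(c)) * math.log(total / df[t]) for t, c in tf.items()}
        )
    return vectors

def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(
        sum(w * w for w in v.values())
    )
    return dot / norm if norm else 0.0

# Hypothetical task titles, used only to exercise the functions.
titles = [
    "Java REST API assembly",
    "Java REST API bug fix",
    "iOS mobile UI prototype",
]
vecs = tfidf_vectors(titles)
sim_related = cosine(vecs[0], vecs[1])    # the two titles share three terms
sim_unrelated = cosine(vecs[0], vecs[2])  # the two titles share no terms
```

Titles that share terms score above zero, while titles with no terms in common score exactly zero, which is the behaviour a content-based recommender relies on when comparing a new task with historical ones.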

22 CROWDREX: LEARNER TRAINING
Figure 8: The framework of CrowdRex

23 CROWDREX: LEARNER APPLICATION
Figure 9: The framework of CrowdRex

24 RESEARCH QUESTIONS
RQ1. Baseline Comparison. Active baseline: Top N based on statistical performance / activeness
RQ2. Performance Assessment. Best learner; Accuracy and Diversity
RQ3. Insights
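A baseline of the "Top N by activeness" kind can be sketched as a simple frequency ranking. The slides do not define how activeness is measured, so counting past registrations (or wins) per developer is an assumption here, as are the function name and the sample handles.

```python
from collections import Counter

def active_baseline(history, n):
    """Return the n most active developers.

    history: iterable of developer handles, one entry per past
    registration (or win). Activeness is taken to be the raw count,
    which is an assumption rather than the slides' definition.
    """
    counts = Counter(history)
    # Sort by count descending, then by handle, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [dev for dev, _ in ranked[:n]]

# Hypothetical registration log.
registrations = ["alice", "bob", "alice", "carol", "bob", "alice", "dave"]
top2 = active_baseline(registrations, 2)
```

Because such a baseline returns the same developers for every task, it can score reasonably on accuracy while covering very few distinct developers.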

27 DATASET
2 datasets from TopCoder.com
Oct to Mar
3,094 historical tasks
Table 3: Statistics of the evaluation datasets
Columns: Dataset, # Tasks, # Reg., # Win., Duration
Development 1093/ / /
Assembly 1505/ / /

28 EXPERIMENTAL SETTING
Split each dataset into 10 folds
Recommend 5, 10, 20 developers
Train C4.5, NaiveBayes, KNN_1, KNN_5 learners
Figure 10: The relationship between the number of registrants and submissions (axes: number of registrants, number of submissions; series: Development, Assembly).
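Splitting a dataset into 10 folds can be sketched as follows. The slides do not say whether the folds were drawn randomly or chronologically, so the seeded random shuffle is an assumption, and `k_fold_indices` is a hypothetical helper name.

```python
import random

def k_fold_indices(n_items, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Indices are shuffled with a fixed seed (an assumption; a
    chronological split would be an equally plausible choice).
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin into k folds.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# 25 hypothetical tasks split into 10 folds.
splits = list(k_fold_indices(25, k=10))
```

Each task lands in exactly one test fold, so every historical task is used once for evaluation and nine times for training.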

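29 EVALUATION METRICS
Accuracy
$$Acc_i = \frac{1}{|T|} \sum_{t \in T} \frac{|correct \cap R_i(t)|}{|R_i(t)|} \qquad (2)$$
Diversity
$$Div_i = \frac{\left|\bigcup_{t \in T} R_i(t)\right|}{\left|\bigcup_{t \in T} Actual(t)\right|} \qquad (3)$$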
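The two metrics can be sketched in code under one plausible reading of the slide, which is an assumption because the formulas did not survive extraction cleanly. Accuracy is read as the per-task fraction of recommended developers that are correct, averaged over all tasks. Diversity is read as the number of distinct developers recommended across all tasks divided by the number of distinct actual developers. Function names and sample data are hypothetical.

```python
def accuracy(recommended, actual):
    """Mean over tasks of |correct recommendations| / |recommendations|.

    recommended, actual: dicts mapping task id -> collection of handles.
    """
    scores = []
    for task, recs in recommended.items():
        hits = set(recs) & set(actual.get(task, ()))
        scores.append(len(hits) / len(recs))
    return sum(scores) / len(scores)

def diversity(recommended, actual):
    """Distinct recommended developers over distinct actual developers."""
    rec_devs = set().union(*recommended.values())
    act_devs = set().union(*actual.values())
    return len(rec_devs) / len(act_devs)

# Hypothetical recommendations and ground truth for two tasks.
recommended = {"t1": ["alice", "bob"], "t2": ["alice", "carol"]}
actual = {"t1": {"alice"}, "t2": {"dave", "carol", "erin", "bob"}}

acc = accuracy(recommended, actual)   # (1/2 + 1/2) / 2
div = diversity(recommended, actual)  # 3 distinct recommended / 5 distinct actual
```

Under this reading, a recommender that always proposes the same few developers keeps diversity low however accurate it is.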

30 RESULTS
Table 4: Accuracy and diversity of developer recommendation for delivering qualified assets
DS | Rec. | C4.5 Acc | C4.5 Div | NaïveBayes Acc | NaïveBayes Div | KNN_1 Acc | KNN_1 Div | KNN_5 Acc | KNN_5 Div | Active Acc | Active Div
DEV | 5 | 50% | 40% | 44% | 21% | 34% | 27% | 32% | 35% | 37% | 5%
DEV | 10 | 60% | 45% | 54% | 26% | 44% | 30% | 49% | 38% | 46% | 11%
DEV | 20 | 71% | 52% | 67% | 18% | 56% | 40% | 61% | 42% | 57% | 22%
ASM | 5 | 37% | 72% | 33% | 33% | 38% | 47% | 43% | 73% | 15% | 6%
ASM | 10 | 51% | 74% | 44% | 18% | 49% | 49% | 58% | 76% | 26% | 12%
ASM | 20 | 63% | 77% | 54% | 48% | 57% | 53% | 67% | 78% | 37% | 23%
Table 5: Accuracy and diversity of developer recommendation for participation
DS | Rec. | C4.5 Acc | C4.5 Div | NaïveBayes Acc | NaïveBayes Div | KNN_1 Acc | KNN_1 Div | KNN_5 Acc | KNN_5 Div | Active Acc | Active Div
DEV | 5 | 30% | 3% | 4% | 15% | 3% | 8% | 3% | 7% | 24% | 1%
DEV | 10 | 23% | 6% | 2% | 17% | 5% | 14% | 6% | 13% | 18% | 1%
DEV | 20 | 21% | 11% | 1% | 18% | 6% | 20% | 6% | 19% | 12% | 2%
ASM | 5 | 69% | 1% | 12% | 10% | 10% | 35% | 10% | 34% | 65% | 1%
ASM | 10 | 48% | 2% | 6% | 11% | 16% | 47% | 17% | 46% | 44% | 2%
ASM | 20 | 30% | 4% | 3% | 12% | 17% | 53% | 18% | 53% | 25% | 4%

31 RESULTS
Answer to RQ1: Better than the baseline method in most cases. (Tables 6 and 7 on this slide repeat Tables 4 and 5.)

32 RESULTS
Answer to RQ2: Accuracy: C4.5 (best in 9 of 12 cases); Diversity: KNN_1 (best in 4 of 12 cases). (Tables 8 and 9 on this slide repeat Tables 4 and 5.)

33 RESULTS: COMPARISON
Panels: For Delivering Qualified Assets (Development, Assembly); For Participation (Development, Assembly). Series: Top5, Top10 and Top20 for Accuracy (Acc) and Diversity (Div).
Figure 11: Performance comparison of developer recommendation when recommending 5, 10 and 20 developers. The x-axis shows the machine learners (C4.5, NaiveBayes, KNN_1, KNN_5) and the baseline method Active. The y-axis shows the value for Accuracy (Acc) and Diversity (Div) measures. The scatter points are linked for the purpose of improving the readability.

34 RESULTS: COMPARISON
Answer to RQ3:
1. Selection: no-free-lunch
2. Trade-off: Accuracy-Diversity dilemma
3. Action: Active with low coverage
Figure 12: Performance comparison

37 SUMMARY
Motivation: A dauntingly large set of task options; inappropriate developer-task matching may harm the quality
Aim: Automatically match tasks and developers
Method: CrowdRex, content-based recommendation
Results: Evaluated 4 machine learners on 3,094 historical tasks; accuracy 50%-71% and diversity 40%-52%

Semantic Estimation for Texts in Software Engineering

Semantic Estimation for Texts in Software Engineering Semantic Estimation for Texts in Software Engineering 汇报人 : Reporter:Xiaochen Li Dalian University of Technology, China 大连理工大学 2016 年 11 月 29 日 Oscar Lab 2 Ph.D. candidate at OSCAR Lab, in Dalian University

More information

The HPE Living Progress Challenge

The HPE Living Progress Challenge December 15, 2015 The HPE Living Progress Challenge Overview Chris Wellise Director, Strategic Initiatives The power of digital inclusion The potential for technology to break down barriers is limitless,

More information

Adobe Target Analyst Adobe Certified Expert Exam Guide

Adobe Target Analyst Adobe Certified Expert Exam Guide Adobe Target Analyst Adobe Certified Expert Exam Guide Exam number: 9A0-399 Note: To become certified as an Adobe Target Analyst requires passing this exam and exam 9A0-398 Adobe Target Business Practitioner.

More information

CPD provider network. Provider Handbook

CPD provider network. Provider Handbook CPD provider network Provider Handbook Welcome to the Australian Institute of Architects Refuel CPD Provider Network. The following information has been written to guide you through the process of developing

More information

Adobe Target Analyst Adobe Certified Expert Exam Guide

Adobe Target Analyst Adobe Certified Expert Exam Guide Adobe Target Analyst Adobe Certified Expert Exam Guide Exam number: 9A0-399 Note: To become certified as an Adobe Target Analyst requires passing this exam and exam 9A0-398 Adobe Target Business Practitioner.

More information

Access Control and Physical Security Management. Contents are subject to change. For the latest updates visit

Access Control and Physical Security Management. Contents are subject to change. For the latest updates visit Access Control and Physical Security Management Page 1 of 6 Why Attend Today s security landscape requires individuals and businesses to take the threat to safety and security seriously. Safe and secure

More information

Wireless Location Accuracy. January 2019

Wireless Location Accuracy. January 2019 Wireless 9-1-1 Location Accuracy January 2019 An Overview of Wireless 9-1-1 Location Accuracy * Wireless Carrier Network 9-1-1 Network Public Safety Answering Point (PSAP) Police Fire EMS Key: Handset/Network

More information

Automatic Cluster Number Selection using a Split and Merge K-Means Approach

Automatic Cluster Number Selection using a Split and Merge K-Means Approach Automatic Cluster Number Selection using a Split and Merge K-Means Approach Markus Muhr and Michael Granitzer 31st August 2009 The Know-Center is partner of Austria's Competence Center Program COMET. Agenda

More information

Certificate in Security Management

Certificate in Security Management Certificate in Security Management Page 1 of 6 Why Attend This course will provide participants with an insight into the fundamentals of managing modern and effective security operations. It will address

More information

Matrix Co-factorization for Recommendation with Rich Side Information HetRec 2011 and Implicit 1 / Feedb 23

Matrix Co-factorization for Recommendation with Rich Side Information HetRec 2011 and Implicit 1 / Feedb 23 Matrix Co-factorization for Recommendation with Rich Side Information and Implicit Feedback Yi Fang and Luo Si Department of Computer Science Purdue University West Lafayette, IN 47906, USA fangy@cs.purdue.edu

More information

Characterization and Modeling of Deleted Questions on Stack Overflow

Characterization and Modeling of Deleted Questions on Stack Overflow Characterization and Modeling of Deleted Questions on Stack Overflow Denzil Correa, Ashish Sureka http://correa.in/ February 16, 2014 Denzil Correa, Ashish Sureka (http://correa.in/) ACM WWW-2014 February

More information

CS473: Course Review CS-473. Luo Si Department of Computer Science Purdue University

CS473: Course Review CS-473. Luo Si Department of Computer Science Purdue University CS473: CS-473 Course Review Luo Si Department of Computer Science Purdue University Basic Concepts of IR: Outline Basic Concepts of Information Retrieval: Task definition of Ad-hoc IR Terminologies and

More information

Master Project. Various Aspects of Recommender Systems. Prof. Dr. Georg Lausen Dr. Michael Färber Anas Alzoghbi Victor Anthony Arrascue Ayala

Master Project. Various Aspects of Recommender Systems. Prof. Dr. Georg Lausen Dr. Michael Färber Anas Alzoghbi Victor Anthony Arrascue Ayala Master Project Various Aspects of Recommender Systems May 2nd, 2017 Master project SS17 Albert-Ludwigs-Universität Freiburg Prof. Dr. Georg Lausen Dr. Michael Färber Anas Alzoghbi Victor Anthony Arrascue

More information

APNIC Update. RIPE 59 October 2009

APNIC Update. RIPE 59 October 2009 APNIC Update RIPE 59 October 2009 Overview APNIC Services Update APNIC 28 policy outcomes APNIC Members and Stakeholder Survey Next APNIC Meetings Resource Delegations (1 Oct 09) No of /8 delegated No

More information

Basic Tokenizing, Indexing, and Implementation of Vector-Space Retrieval

Basic Tokenizing, Indexing, and Implementation of Vector-Space Retrieval Basic Tokenizing, Indexing, and Implementation of Vector-Space Retrieval 1 Naïve Implementation Convert all documents in collection D to tf-idf weighted vectors, d j, for keyword vocabulary V. Convert

More information

Recommender Systems: Practical Aspects, Case Studies. Radek Pelánek

Recommender Systems: Practical Aspects, Case Studies. Radek Pelánek Recommender Systems: Practical Aspects, Case Studies Radek Pelánek 2017 This Lecture practical aspects : attacks, context, shared accounts,... case studies, illustrations of application illustration of

More information

Knowledge Discovery and Data Mining 1 (VO) ( )

Knowledge Discovery and Data Mining 1 (VO) ( ) Knowledge Discovery and Data Mining 1 (VO) (707.003) Data Matrices and Vector Space Model Denis Helic KTI, TU Graz Nov 6, 2014 Denis Helic (KTI, TU Graz) KDDM1 Nov 6, 2014 1 / 55 Big picture: KDDM Probability

More information

A Query Weighted-based Method for User Modeling

A Query Weighted-based Method for User Modeling A Query Weighted-based Method for User Modeling Hu Juan,Bai Yu, Cai Dongfeng Knowledge Engineering Research Center Shenyang Aerospace University Outlines Background Query Weighted-based User Modeling Experiments

More information

ENGINEERING INTEGRITY Asset Integrity & Corrosion Control Consultants

ENGINEERING INTEGRITY Asset Integrity & Corrosion Control Consultants Training Schedule Jan - June, 2018 OUR SERVICES Asset Integrity Engineering Welding Engineering Consultancy Inspection Services API / ASME Training Welding Engineering NDT Services & Training # 4A, First

More information

ENGINEERING INTEGRITY Asset Integrity & Corrosion Control Consultants

ENGINEERING INTEGRITY Asset Integrity & Corrosion Control Consultants Training Schedule 2018 OUR SERVICES Asset Integrity Engineering Welding Engineering Consultancy Inspection Services API / ASME Training Welding Engineering NDT Services & Training # 4A, First Cross Street,

More information

Intelligence Preparation of the Cyber Environment. Rob Dartnall Director Cyber Intelligence

Intelligence Preparation of the Cyber Environment. Rob Dartnall Director Cyber Intelligence Intelligence Preparation of the Cyber Rob Dartnall Director Cyber Intelligence Rob is a CREST Certified Threat Intelligence Manager (CCTIM) and Cyber Intelligence Director/CEO of Security Alliance - a

More information

MISO-PJM Cross-Border Planning. February 19, 2014 MISO-PJM JCM

MISO-PJM Cross-Border Planning. February 19, 2014 MISO-PJM JCM MISO-PJM Cross-Border Planning February 19, 2014 MISO-PJM JCM Planning Issues at the Seam 2 Background JOA Historical and Projected Congestion Study - 2012 to 2014 Future Scenario based 80+ major project

More information

Collaborative Filtering using a Spreading Activation Approach

Collaborative Filtering using a Spreading Activation Approach Collaborative Filtering using a Spreading Activation Approach Josephine Griffith *, Colm O Riordan *, Humphrey Sorensen ** * Department of Information Technology, NUI, Galway ** Computer Science Department,

More information

Houston Economy Update. Patrick Jankowski

Houston Economy Update. Patrick Jankowski Houston Economy Update Patrick Jankowski Worst is over 2 $ Per Barrel NYMEX WTI Spot Price 120 100 80 Avg. Last Week of Oct = $49/barrel 60 40 20 0 Jun '14 Dec '14 Jun '15 Dec '15 Jun '16 Dec '16 Source:

More information

SCAFFOLDER. Code: SCAFFOLDER

SCAFFOLDER. Code: SCAFFOLDER SCAFFOLDER Code: 641902 SCAFFOLDER 1 Scoping Agenda Welcome Attendance and introductions Expectations QCTO Mandate Purpose of Occupational Qualifications Organising Framework of Occupations Distinction

More information

SYLLABUS. Departmental Syllabus CIST Departmental Syllabus. Departmental Syllabus. Departmental Syllabus. Departmental Syllabus

SYLLABUS. Departmental Syllabus CIST Departmental Syllabus. Departmental Syllabus. Departmental Syllabus. Departmental Syllabus SYLLABUS DATE OF LAST REVIEW: 02/2013 CIP CODE: 11.0901 SEMESTER: COURSE TITLE: Operating System Security (Windows 2008 Server) COURSE NUMBER: CIST-0254 CREDIT HOURS: 4 INSTRUCTOR: OFFICE LOCATION: OFFICE

More information

Label Distribution Learning. Wei Han

Label Distribution Learning. Wei Han Label Distribution Learning Wei Han, Big Data Research Center, UESTC Email:wei.hb.han@gmail.com Outline 1. Why label distribution learning? 2. What is label distribution learning? 2.1. Problem Formulation

More information

INFO 4300 / CS4300 Information Retrieval. slides adapted from Hinrich Schütze s, linked from

INFO 4300 / CS4300 Information Retrieval. slides adapted from Hinrich Schütze s, linked from INFO 4300 / CS4300 Information Retrieval slides adapted from Hinrich Schütze s, linked from http://informationretrieval.org/ IR 6: Index Compression Paul Ginsparg Cornell University, Ithaca, NY 15 Sep

More information

ESERCITAZIONE PIATTAFORMA WEKA. Croce Danilo Web Mining & Retrieval 2015/2016

ESERCITAZIONE PIATTAFORMA WEKA. Croce Danilo Web Mining & Retrieval 2015/2016 ESERCITAZIONE PIATTAFORMA WEKA Croce Danilo Web Mining & Retrieval 2015/2016 Outline Weka: a brief recap ARFF Format Performance measures Confusion Matrix Precision, Recall, F1, Accuracy Question Classification

More information

Multi-Stage Rocchio Classification for Large-scale Multilabeled

Multi-Stage Rocchio Classification for Large-scale Multilabeled Multi-Stage Rocchio Classification for Large-scale Multilabeled Text data Dong-Hyun Lee Nangman Computing, 117D Garden five Tools, Munjeong-dong Songpa-gu, Seoul, Korea dhlee347@gmail.com Abstract. Large-scale

More information

Birkbeck (University of London)

Birkbeck (University of London) Birkbeck (University of London) MSc Examination for Internal Students Department of Computer Science and Information Systems Information Retrieval and Organisation (COIY64H7) Credit Value: 5 Date of Examination:

More information

Your Trusted Advisors in the Oil and Gas Industry API Q2 SPECIFICATION & TECHNICAL APPLICATION FOR LEAD AUDITOR. Version 1.0

Your Trusted Advisors in the Oil and Gas Industry API Q2 SPECIFICATION & TECHNICAL APPLICATION FOR LEAD AUDITOR. Version 1.0 Your Trusted Advisors in the Oil and Gas Industry API Q2 SPECIFICATION & TECHNICAL APPLICATION FOR LEAD AUDITOR Version 1.0 Program Overview This course provides participants with an in-depth understanding,

More information

RLAT Rapid Language Adaptation Toolkit

RLAT Rapid Language Adaptation Toolkit RLAT Rapid Language Adaptation Toolkit Tim Schlippe May 15, 2012 RLAT Rapid Language Adaptation Toolkit - 2 RLAT Rapid Language Adaptation Toolkit RLAT Rapid Language Adaptation Toolkit - 3 Outline Introduction

More information

JSR Review Process. May Patrick Curran, Mike Milinkovich, Heather Vancura, Bruno Souza

JSR Review Process. May Patrick Curran, Mike Milinkovich, Heather Vancura, Bruno Souza JSR Review Process May 14-15 2013 Patrick Curran, Mike Milinkovich, Heather Vancura, Bruno Souza Agenda Background Goals Information to be gathered Implementation notes Questions, discussion, next steps

More information

Domain-Specific Query Translation for Multilingual Information Access using Machine Translation Augmented With Dictionaries Mined from Wikipedia

Domain-Specific Query Translation for Multilingual Information Access using Machine Translation Augmented With Dictionaries Mined from Wikipedia Domain-Specific Query Translation for Multilingual Information Access using Machine Translation Augmented With Dictionaries Mined from Wikipedia Gareth J. F. Jones, Fabio Fantino, Eamonn Newman, Ying Zhang

More information

Hands on Datamining & Machine Learning with Weka

Hands on Datamining & Machine Learning with Weka Step1: Click the Experimenter button to launch the Weka Experimenter. The Weka Experimenter allows you to design your own experiments of running algorithms on datasets, run the experiments and analyze

More information

FUTUR ET RUPTURES RESTITUTION DAY. Presented by: AMAL ELLOUZE February 2, 2017

FUTUR ET RUPTURES RESTITUTION DAY. Presented by: AMAL ELLOUZE February 2, 2017 FUTUR ET RUPTURES RESTITUTION DAY Presented by: AMAL ELLOUZE February 2, 2017 Applications offloading in Mobile Cloud Computing environment OUTLINE 1. Motivation and Objectives 2. Mobile Applications Offloading

More information

Urban 3D Challenge & Future Directions

Urban 3D Challenge & Future Directions DISTRIBUTION STATEMENT A APPROVED FOR PUBLIC RELEASE; DISTRIBUTION IS UNLIMITED This work was supported by the United States Special Operations Command (USSOCOM). The views and conclusions contained herein

More information

ISO STANDARD IMPLEMENTATION AND TECHNOLOGY CONSOLIDATION

ISO STANDARD IMPLEMENTATION AND TECHNOLOGY CONSOLIDATION ISO STANDARD IMPLEMENTATION AND TECHNOLOGY CONSOLIDATION Cathy Bates Senior Consultant, Vantage Technology Consulting Group January 30, 2018 Campus Orientation Initiative and Project Orientation Project

More information

INFORMATION TECHNOLOGY ONE-YEAR PLAN

INFORMATION TECHNOLOGY ONE-YEAR PLAN INFORMATION TECHNOLOGY ONE-YEAR PLAN 2016-2017 Information and Communications Technology One-year Plan 2016-2017 The purpose of this document is to identify the activities being undertaken this year by

More information

Evaluation Metrics. (Classifiers) CS229 Section Anand Avati

Evaluation Metrics. (Classifiers) CS229 Section Anand Avati Evaluation Metrics (Classifiers) CS Section Anand Avati Topics Why? Binary classifiers Metrics Rank view Thresholding Confusion Matrix Point metrics: Accuracy, Precision, Recall / Sensitivity, Specificity,

More information

VIRTUAL REALITY BASED END-USER ASSESSMENT TOOL for Remote Product /System Testing & Support MECHANICAL & INDUSTRIAL ENGINEERING

VIRTUAL REALITY BASED END-USER ASSESSMENT TOOL for Remote Product /System Testing & Support MECHANICAL & INDUSTRIAL ENGINEERING VIRTUAL REALITY BASED END-USER ASSESSMENT TOOL for Remote Product /System Testing & Support PRESENTATION OUTLINE AIM OF RESEARCH MOTIVATIONS BACKGROUND & RELATED WORK ANALYSIS APPROACH MODEL ARCHITECTURE

More information

Searching the Deep Web

Searching the Deep Web Searching the Deep Web 1 What is Deep Web? Information accessed only through HTML form pages database queries results embedded in HTML pages Also can included other information on Web can t directly index

More information

Introduction to BSafe.Network

Introduction to BSafe.Network Introduction to BSafe.Network Shigeya Suzuki, Ph.D Associate Director, Technology Officer, Blockchain Laboratory, Keio Research Institute at SFC Project Associate Professor, Graduate School of Media and

More information

CHAPTER 4 STOCK PRICE PREDICTION USING MODIFIED K-NEAREST NEIGHBOR (MKNN) ALGORITHM

CHAPTER 4 STOCK PRICE PREDICTION USING MODIFIED K-NEAREST NEIGHBOR (MKNN) ALGORITHM CHAPTER 4 STOCK PRICE PREDICTION USING MODIFIED K-NEAREST NEIGHBOR (MKNN) ALGORITHM 4.1 Introduction Nowadays money investment in stock market gains major attention because of its dynamic nature. So the

More information

Text Categorization. Foundations of Statistic Natural Language Processing The MIT Press1999

Text Categorization. Foundations of Statistic Natural Language Processing The MIT Press1999 Text Categorization Foundations of Statistic Natural Language Processing The MIT Press1999 Outline Introduction Decision Trees Maximum Entropy Modeling (optional) Perceptrons K Nearest Neighbor Classification

More information

Towards Ensuring Collective Availability in Volatile Resource Pools via Forecasting

Towards Ensuring Collective Availability in Volatile Resource Pools via Forecasting Towards CloudComputing@home: Ensuring Collective Availability in Volatile Resource Pools via Forecasting Artur Andrzejak Berlin (ZIB) andrzejak[at]zib.de Zuse-Institute Derrick Kondo David P. Anderson

More information

Partnered with API Q2 TECHNICAL APPLICATION FOR LEAD AUDITOR. Version 3.0

Partnered with API Q2 TECHNICAL APPLICATION FOR LEAD AUDITOR. Version 3.0 API Q2 TECHNICAL APPLICATION FOR LEAD AUDITOR Version 3.0 Program Overview This course provides participants with an in-depth understanding, knowledge, and skills needed to carry out successful internal

More information

Digital Libraries: Language Technologies

Digital Libraries: Language Technologies Digital Libraries: Language Technologies RAFFAELLA BERNARDI UNIVERSITÀ DEGLI STUDI DI TRENTO P.ZZA VENEZIA, ROOM: 2.05, E-MAIL: BERNARDI@DISI.UNITN.IT Contents 1 Recall: Inverted Index..........................................

More information

ISO / IEC 27001:2005. A brief introduction. Dimitris Petropoulos Managing Director ENCODE Middle East September 2006

ISO / IEC 27001:2005. A brief introduction. Dimitris Petropoulos Managing Director ENCODE Middle East September 2006 ISO / IEC 27001:2005 A brief introduction Dimitris Petropoulos Managing Director ENCODE Middle East September 2006 Information Information is an asset which, like other important business assets, has value

More information

A Constrained Spreading Activation Approach to Collaborative Filtering

A Constrained Spreading Activation Approach to Collaborative Filtering A Constrained Spreading Activation Approach to Collaborative Filtering Josephine Griffith 1, Colm O Riordan 1, and Humphrey Sorensen 2 1 Dept. of Information Technology, National University of Ireland,

More information

A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection (Kohavi, 1995)

A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection (Kohavi, 1995) A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection (Kohavi, 1995) Department of Information, Operations and Management Sciences Stern School of Business, NYU padamopo@stern.nyu.edu

More information

Cost-Aware Triage Ranking Algorithms for Bug Reporting Systems

Cost-Aware Triage Ranking Algorithms for Bug Reporting Systems Cost-Aware Triage Ranking Algorithms for Bug Reporting Systems Jin-woo Park 1, Mu-Woong Lee 1, Jinhan Kim 1, Seung-won Hwang 1, Sunghun Kim 2 POSTECH, 대한민국 1 HKUST, Hong Kong 2 Outline 1. CosTriage: A

More information

A probabilistic model to resolve diversity-accuracy challenge of recommendation systems

A probabilistic model to resolve diversity-accuracy challenge of recommendation systems A probabilistic model to resolve diversity-accuracy challenge of recommendation systems AMIN JAVARI MAHDI JALILI 1 Received: 17 Mar 2013 / Revised: 19 May 2014 / Accepted: 30 Jun 2014 Recommendation systems

More information
