TREC Legal Track Learning Task Final Guidelines (revision 0.0)

Size: px
Start display at page:

Download "TREC Legal Track Learning Task Final Guidelines (revision 0.0)"

Transcription

1 TREC Legal Track Learning Task Final Guidelines (revision 0.0) Gordon V. Cormack Maura R. Grossman Abstract In the learning task, participants are given a seed set of documents from a larger collection that have previously been assessed by TREC as responsive or non-responsive to a legal discovery request. Using this information, participants must (a) rank the documents in the larger collection from most likely to least likely to be responsive; and (b) for each document, estimate the likelihood of responsiveness as a probability. The ranking will be evaluated by how well it places responsive documents before non-responsive ones, as assessed by TREC. The likelihood estimate will be evaluated by how well it estimates recall throughout the ranked list. 1 e-discovery Context The learning task models the use of automatic or semi-automatic methods to guide review strategy for the first or later passes of a multi-stage document review effort, organized as follows: 1. Preliminary search and asessment. The responding party analyzes the production request. Using ad hoc methods the team identifies a seed set of potentially responsive documents, and assesses each as responsive or not. 2. Learning by example. A learning method is used to rank the documents in the collection from most to least likely to be responsive to the production request, and to estimate this likelihood for 1

2 1 e-discovery Context 2 each document. The input to the learning method consists of the seed set, the assessments for the seed set, and the unranked collection; the output is a ranked list consisting of the document identifer and a probability of responsiveness for each document in the collection. The two components of learning by example ranking and estimation may be accomplished by the same method or by different methods. Either may be automated or manual. For example, ranking may be done using an information retrieval method or by human review using a five-point scale. Estimation may be done in the course of ranking or, for example, by sampling and reviewing documents at representative ranks. 3. Review process. A review process may be conducted, with strategy guided by the ranked list. One possible strategy is to review documents in order, thus discovering as many responsive documents as possible for a given amount of effort. Another possible strategy is triage: to review only mid-ranked documents, deeming, without further review, the top-ranked ones to be responsive, and the bottom-ranked ones to be non-responsive. Review strategy may be guided not only by the order of the ranked list, as outlined above, but also by the estimated effectiveness of various alternatives. Consider the strategy of reviewing the topranked documents. Where should a cut be made so that documents above the cut are reviewed and documents below are not? For triage, where should the two necessary cuts be made? Practically every review strategy decision boils down to the question, Of this particular set of documents, how many are responsive and how many are not? An informed choice of where to make the cut demands that we know how many documents both above and below the cut are responsive. There is no single correct answer the best choice depends on the relative risks, costs and benefits of missing responsive documents or of reviewing or producing non-responsive documents. In discovery efforts,

3 2 TREC Context 3 the cost of failing to produce responsive documents is typically much higher than the cost of producing non-responsive ones (unless they are privileged!). The specific tradeoff will be different in almost every matter. For this reason, we require not that the learning method make the cut, but that it provide the necessary information to make the best cut for the particular situation. To this end, we require an estimate of the probability that any particular document is responsive. If the probability estimates are accurate, it follows that the number of responsive documents in any set will be the sum of the probabilities associated with the documents in the set. Suppose the cost of missing a responsive document is estimated to be $95 and the cost of reviewing a non-responsive one is $5. It therefore follows that the best strategy is to review whose documents whose probability of relevance is above 5%, and none below. One simply has to sum the estimates to find the cut that makes this tradeoff. When opposing counsel demands to know how many documents were missed, or why they weren t identified, the estimate yields a defensible answer. 2 TREC Context This task is part of the TREC 2010 Legal Track. TREC The Text Retrieval Conference has the following goals: to encourage research in information retrieval based on large test collections; to increase communication among industry, academia, and government by creating an open forum for the exchange of research ideas; to speed the transfer of technology from research labs into commercial products by demonstrating substantial improvements in retrieval methodologies on real-world problems; and to increase the availability of appropriate evaluation techniques for use by industry and academia, including development of new evaluation techniques more applicable to current systems.

4 3 Task Details 4 The Legal Track, in its 5th year, pursues these goals within the context of electronic discovery. The Learning task is a successor to the batch task of TREC The principal differences are in focus, the requirement to rank every document in the corpus, and the requirement to provide a likelihood estimate for every document in the corpus. In addition, the task uses the same collection and some of the same production requests as its sister task, the TREC 2010 Legal Interactive Task, to facilitate comparison. 3 Task Details There will be eight production requests, each using the Enron document collection. For each request, the participating team will be given the text of the request, guidelines for its interpretation, a seed set of documents, and assessments for the seed documents. The results for all eight requests must be encoded in a text file according to the standard TREC format, where each line contains: requestid Q0 docid rank estp runid requestid is a number assigned by TREC identifying the production request. Q0 is a historical artifact of the TREC format. docid is a TREC-assigned document identifier. rank is the ranking of the document by estp, where 1 is the most likely relevant document for the request. estp is a probability estimate between 0.0 and 1.0. runid is a unique identifier for the submission, formed by joining a sequence of 3 or 4 characters identifying the team (composed by participant) a sequence of 3 or 4 charecters identifying the method (composed by participant) Capital A if the method is fully automated; capital M if the method is fully manual, meaning the seed documents are not used as input to any program; capital T if both automated and manual methods are used.

5 4 Manner of participation 5 The task will use the EDRM Enron v2 collection, which may be downloaded without restriction from the Web. 1 in either EDRM XML or PST formats. The document ids will be those from the EDRM XML format. Mapping files will be provided to convert information derivable from the PST-format documents to document ids. For the learning task, a document is considered to be either an message as a whole, or a particular attachment to an message. The EDRM XML version contains both native and text versions of each document, each with a unique identifier. The PST version contains the complete messages; participants will need to extract the attachments from these messages and treat them as separate documents. The Enron corpus contains many duplicate documents. For efficiency, submissions should refer to only the canonical document from each set of duplicates, as defined by TREC. 2 A mapping is provided from all document identifiers to canonical documents, so participants may eliminate duplicates prior to processing, or after the fact. The EDRM download is approximately 100GB in size. As a convenience, TREC is providing to participants a download for the text version of all canonical documents as a separate download (609 MB). Furthermore, TREC is providing the native version of all canonical attachments (7 GB). 4 Manner of participation We hope to attract participants that use various combinations of manual effort and technology. It is possible and desirable that a participating team employ two or more of the strategies, provided that the efforts are organized to conform to the information flow constraints. That is, a group might complete a fully automated review, and then undertake a technology-assisted review; or, a group might complete a manual review, and then undertake a technology-assisted review. Or a group might undertake manual and automated reviews concurrently, provided no input to the automated system, including tuning, was done by any individual having seen the production request or associated materials

6 5 Evaluation Details 6 Teams are free and encouraged to participate also in the TREC 2010 Legal Interactive Task, which will have three production requests and a privilege review. 5 Evaluation Details The effectiveness of ranking will be evaluated separately from the effectiveness of estimation. Several measures will be computed for each. The principal measures being considered to evaluate ranking are: Receiver operating characteristic (ROC) curves to display the tradeoff between missed responsive documents and produced non-responsive documents (false negatives and false positives) for all posible cuts. Area under the ROC curve (AUC) to serve as a summary measure of ranking effectiveness. AUC can be interpreted in terms of a very simple game (see below). Precision and recall at specific cuts, such as 10, 1000, documents. R-Precision. Average Precision. For estimation the principal measures being considered are: Root-mean-squared Average Recall Error (RMSRE). The RMS difference between estimated recall and actual recall, at all recall levels. Information Gain (IG). An information-theoretic measure that captures both the accuracy of the estimates and the effectiveness of the ranking. IG can be interpreted in terms of a very simple game (see below). F1. The actual F1@K measure (used in other legal tasks) achieved when K is the cut value that optimizes apparent F1@K. Apparent F1. The value of F1@K, if the estimates were accurate.

7 6 The Information Gain Game 7 6 The Information Gain Game We illustrate the notion of information gain with a game. It is not necessary to understand the theory behind it, as the rules and strategy are simple. You are given a sequence of documents to review, in some arbitrary order; For each document, you must guess a number estp between 0 and 1, which is your estimate of the likelihood that the document is responsive. (It would be inadvisable to guess 0 or 1 exactly, as the downside of an incorrect guess would be infinite.) If the document turns out to be responsive, your score is IG = 1 + log 2 estp If the document turns out to be non-responsive your score is IG = 1 + log 2 (1 estp ) A strategy to play well has the following elements: If you think the document is responsive, choose estp > 1 2 If you are pretty confident the document is responsive, choose estp 1 2 Don t be overconfident. Never choose estp = 1. If you think the document is non-responsive, choose estp < 1 2 If you are pretty confident the document is non-responsive, choose estp 1 2 Don t be overconfident. Never choose estp = 0.

8 7 The AUC Game 8 The best possible score, on average, is achieved if estp is in fact an accurate estimate of the probability. If you think the document is responsive and you think there s a 65% chance you re right, guess estp = If your guesses are sensiblee, you ll get a positive average IG score. If you get a negative IG score, you would have been better off flipping a coin! 7 The AUC Game The AUC game works like this: You are given a stack of documents, that you arrange with the most likely to be responsive on top, and so on down to the least likely on the bottom. For every responsive document and for every non-responsive document in the stack, if the responsive document is above the nonresponsive one, you score a point; otherwise you score 0. Your AUC score is AUC = points R N were points is the total number of points you scored, R is the number of responsive documents, and N is the number of nonresponsive documents.

Overview of the TREC 2005 Spam Track. Gordon V. Cormack Thomas R. Lynam. 18 November 2005

Overview of the TREC 2005 Spam Track. Gordon V. Cormack Thomas R. Lynam. 18 November 2005 Overview of the TREC 2005 Spam Track Gordon V. Cormack Thomas R. Lynam 18 November 2005 To answer questions! Why Standardized Evaluation? Is spam filtering a viable approach? What are the risks, costs,

More information

1.2 Adding Integers. Contents: Numbers on the Number Lines Adding Signed Numbers on the Number Line

1.2 Adding Integers. Contents: Numbers on the Number Lines Adding Signed Numbers on the Number Line 1.2 Adding Integers Contents: Numbers on the Number Lines Adding Signed Numbers on the Number Line Finding Sums Mentally The Commutative Property Finding Sums using And Patterns and Rules of Adding Signed

More information

A. Incorrect! This would be the negative of the range. B. Correct! The range is the maximum data value minus the minimum data value.

A. Incorrect! This would be the negative of the range. B. Correct! The range is the maximum data value minus the minimum data value. AP Statistics - Problem Drill 05: Measures of Variation No. 1 of 10 1. The range is calculated as. (A) The minimum data value minus the maximum data value. (B) The maximum data value minus the minimum

More information

Technical Brief: Domain Risk Score Proactively uncover threats using DNS and data science

Technical Brief: Domain Risk Score Proactively uncover threats using DNS and data science Technical Brief: Domain Risk Score Proactively uncover threats using DNS and data science 310 Million + Current Domain Names 11 Billion+ Historical Domain Profiles 5 Million+ New Domain Profiles Daily

More information

Bulgarian Math Olympiads with a Challenge Twist

Bulgarian Math Olympiads with a Challenge Twist Bulgarian Math Olympiads with a Challenge Twist by Zvezdelina Stankova Berkeley Math Circle Beginners Group September 0, 03 Tasks throughout this session. Harder versions of problems from last time appear

More information

EXAM PREPARATION GUIDE

EXAM PREPARATION GUIDE When Recognition Matters EXAM PREPARATION GUIDE PECB Certified ISO 14001 Lead Implementer www.pecb.com The objective of the PECB Certified ISO 14001 Lead Implementer examination is to ensure that the candidate

More information

How to Jump-Start Early Case Assessment. A Case Study based on TREC 2008

How to Jump-Start Early Case Assessment. A Case Study based on TREC 2008 How to Jump-Start Early Case Assessment A Case Study based on TREC 2008 For over a decade, the National Institute of Standards and Technology ( NIST ) has cosponsored the Text Retrieval Conference ( TREC

More information

Slice Intelligence!

Slice Intelligence! Intern @ Slice Intelligence! Wei1an(Wu( September(8,(2014( Outline!! Details about the job!! Skills required and learned!! My thoughts regarding the internship! About the company!! Slice, which we call

More information

Threat and Vulnerability Assessment Tool

Threat and Vulnerability Assessment Tool TABLE OF CONTENTS Threat & Vulnerability Assessment Process... 3 Purpose... 4 Components of a Threat & Vulnerability Assessment... 4 Administrative Safeguards... 4 Logical Safeguards... 4 Physical Safeguards...

More information

EXAM PREPARATION GUIDE

EXAM PREPARATION GUIDE When Recognition Matters EXAM PREPARATION GUIDE PECB Certified ISO 22000 Lead Implementer www.pecb.com The objective of the Certified ISO 22000 Lead Implementer examination is to ensure that the candidate

More information

EXAM PREPARATION GUIDE

EXAM PREPARATION GUIDE When Recognition Matters EXAM PREPARATION GUIDE PECB Certified ISO 31000 Risk Manager www.pecb.com The objective of the PECB Certified ISO 31000 Risk Manager examination is to ensure that the candidate

More information

Learning internal representations

Learning internal representations CHAPTER 4 Learning internal representations Introduction In the previous chapter, you trained a single-layered perceptron on the problems AND and OR using the delta rule. This architecture was incapable

More information

EXAM PREPARATION GUIDE

EXAM PREPARATION GUIDE When Recognition Matters EXAM PREPARATION GUIDE PECB Certified ISO/IEC 27005 Risk Manager www.pecb.com The objective of the PECB Certified ISO/IEC 27005 Risk Manager examination is to ensure that the candidate

More information

Rank Measures for Ordering

Rank Measures for Ordering Rank Measures for Ordering Jin Huang and Charles X. Ling Department of Computer Science The University of Western Ontario London, Ontario, Canada N6A 5B7 email: fjhuang33, clingg@csd.uwo.ca Abstract. Many

More information

Information Retrieval

Information Retrieval Information Retrieval ETH Zürich, Fall 2012 Thomas Hofmann LECTURE 6 EVALUATION 24.10.2012 Information Retrieval, ETHZ 2012 1 Today s Overview 1. User-Centric Evaluation 2. Evaluation via Relevance Assessment

More information

TRANSANA and Chapter 8 Retrieval

TRANSANA and Chapter 8 Retrieval TRANSANA and Chapter 8 Retrieval Chapter 8 in Using Software for Qualitative Research focuses on retrieval a crucial aspect of qualitatively coding data. Yet there are many aspects of this which lead to

More information

LIS 2680: Database Design and Applications

LIS 2680: Database Design and Applications School of Information Sciences - University of Pittsburgh LIS 2680: Database Design and Applications Summer 2012 Instructor: Zhen Yue School of Information Sciences, University of Pittsburgh E-mail: zhy18@pitt.edu

More information

CH 87 FUNCTIONS: TABLES AND MAPPINGS

CH 87 FUNCTIONS: TABLES AND MAPPINGS CH 87 FUNCTIONS: TABLES AND MAPPINGS INTRODUCTION T he notion of a function is much more abstract than most of the algebra concepts you ve seen so far, so we ll start with three specific non-math examples.

More information

ITG. Information Security Management System Manual

ITG. Information Security Management System Manual ITG Information Security Management System Manual This manual describes the ITG Information Security Management system and must be followed closely in order to ensure compliance with the ISO 27001:2005

More information

BPS Suite and the OCEG Capability Model. Mapping the OCEG Capability Model to the BPS Suite s product capability.

BPS Suite and the OCEG Capability Model. Mapping the OCEG Capability Model to the BPS Suite s product capability. BPS Suite and the OCEG Capability Model Mapping the OCEG Capability Model to the BPS Suite s product capability. BPS Contents Introduction... 2 GRC activities... 2 BPS and the Capability Model for GRC...

More information

ITG. Information Security Management System Manual

ITG. Information Security Management System Manual ITG Information Security Management System Manual This manual describes the ITG Information Security Management system and must be followed closely in order to ensure compliance with the ISO 27001:2005

More information

CATALOG 2017/2018 BINUS UNIVERSITY. Cyber Security. Introduction. Vision. Mission

CATALOG 2017/2018 BINUS UNIVERSITY. Cyber Security. Introduction. Vision. Mission Cyber Security Introduction Cyber attack is raising and threaten ubiquitous world on internet today. Industry and government need cyber security expert to counter and defend from this threaten. Cyber Security

More information

[Maria Jackson Hittle] Thanks, Michael. Since the RSR was implemented in 2009, HAB has been slowing shifting its focus from data submission to data

[Maria Jackson Hittle] Thanks, Michael. Since the RSR was implemented in 2009, HAB has been slowing shifting its focus from data submission to data [Michael Costa] Welcome to today s Webcast. Thank you so much for joining us today! My name is Michael Costa. I m a member of the DART Team, one of several groups engaged by HAB to provide training and

More information

Learning Objectives. Outline. Lung Cancer Workshop VIII 8/2/2012. Nicholas Petrick 1. Methodologies for Evaluation of Effects of CAD On Users

Learning Objectives. Outline. Lung Cancer Workshop VIII 8/2/2012. Nicholas Petrick 1. Methodologies for Evaluation of Effects of CAD On Users Methodologies for Evaluation of Effects of CAD On Users Nicholas Petrick Center for Devices and Radiological Health, U.S. Food and Drug Administration AAPM - Computer Aided Detection in Diagnostic Imaging

More information

Transactions Lab I. Tom Kelliher, CS 317

Transactions Lab I. Tom Kelliher, CS 317 Transactions Lab I Tom Kelliher, CS 317 The purpose of this lab is for you to gain some understanding of how transactions work, see for yourselves how the various SQL isolation levels correspond to the

More information

HOLA: Human-like Orthogonal Network Layout

HOLA: Human-like Orthogonal Network Layout HOLA: Human-like Orthogonal Network Layout S. Kieffer, T. Dwyer, K. Marriot, and M. Wybrow Emily Hindalong CPSC 547 Presentation Novermber 17, 2015 1 In a Nutshell... Let s analyze human-drawn networks

More information

Measuring Intrusion Detection Capability: An Information- Theoretic Approach

Measuring Intrusion Detection Capability: An Information- Theoretic Approach Measuring Intrusion Detection Capability: An Information- Theoretic Approach Guofei Gu, Prahlad Fogla, David Dagon, Wenke Lee Georgia Tech Boris Skoric Philips Research Lab Outline Motivation Problem Why

More information

ISO 31000:2009 & COSO ERM

ISO 31000:2009 & COSO ERM ISO 31000:2009 & COSO ERM Jakarta, 2018 Facilitators: Dr. Antonius Alijoyo MBA., ERMCP., CERG., CGAP., CCSA., CFSA., CGEIT., CFE - Ketua KomTek 03-10 BSN - Founder CRMS Indonesia - Ketua Umum IRMAPA COSO

More information

Wye Valley NHS Trust. Data protection audit report. Executive summary June 2017

Wye Valley NHS Trust. Data protection audit report. Executive summary June 2017 Wye Valley NHS Trust Data protection audit report Executive summary June 2017 1. Background The Information Commissioner is responsible for enforcing and promoting compliance with the Data Protection Act

More information

Foundation Level Syllabus Usability Tester Sample Exam

Foundation Level Syllabus Usability Tester Sample Exam Foundation Level Syllabus Usability Tester Sample Exam Version 2017 Provided by German Testing Board Copyright Notice This document may be copied in its entirety, or extracts made, if the source is acknowledged.

More information

Overview of the INEX 2009 Link the Wiki Track

Overview of the INEX 2009 Link the Wiki Track Overview of the INEX 2009 Link the Wiki Track Wei Che (Darren) Huang 1, Shlomo Geva 2 and Andrew Trotman 3 Faculty of Science and Technology, Queensland University of Technology, Brisbane, Australia 1,

More information

Introduction to the American Academy of Forensic Sciences Standards Board

Introduction to the American Academy of Forensic Sciences Standards Board Introduction to the American Academy of Forensic Sciences Standards Board Introduction Teresa Ambrosius, Secretariat Mary McKiel, Communication Liaison Technical Coordinator Position to be filled Background

More information

2. On classification and related tasks

2. On classification and related tasks 2. On classification and related tasks In this part of the course we take a concise bird s-eye view of different central tasks and concepts involved in machine learning and classification particularly.

More information

This chapter continues our overview of public-key cryptography systems (PKCSs), and begins with a description of one of the earliest and simplest

This chapter continues our overview of public-key cryptography systems (PKCSs), and begins with a description of one of the earliest and simplest 1 2 3 This chapter continues our overview of public-key cryptography systems (PKCSs), and begins with a description of one of the earliest and simplest PKCS, Diffie- Hellman key exchange. This first published

More information

Equating. Lecture #10 ICPSR Item Response Theory Workshop

Equating. Lecture #10 ICPSR Item Response Theory Workshop Equating Lecture #10 ICPSR Item Response Theory Workshop Lecture #10: 1of 81 Lecture Overview Test Score Equating Using IRT How do we get the results from separate calibrations onto the same scale, so

More information

Definition: A data structure is a way of organizing data in a computer so that it can be used efficiently.

Definition: A data structure is a way of organizing data in a computer so that it can be used efficiently. The Science of Computing I Lesson 4: Introduction to Data Structures Living with Cyber Pillar: Data Structures The need for data structures The algorithms we design to solve problems rarely do so without

More information

Using TAR 2.0 to Streamline Document Review for Japanese Patent Litigation

Using TAR 2.0 to Streamline Document Review for Japanese Patent Litigation 877.557.4273 catalystsecure.com CASE STUDY Using TAR 2.0 to Streamline Document Review for Japanese Patent Litigation Continuous Active Review Cuts Cost by Over 85% Catalyst s review platform and its ability

More information

Week 7 Picturing Network. Vahe and Bethany

Week 7 Picturing Network. Vahe and Bethany Week 7 Picturing Network Vahe and Bethany Freeman (2005) - Graphic Techniques for Exploring Social Network Data The two main goals of analyzing social network data are identification of cohesive groups

More information

CS261: A Second Course in Algorithms Lecture #14: Online Bipartite Matching

CS261: A Second Course in Algorithms Lecture #14: Online Bipartite Matching CS61: A Second Course in Algorithms Lecture #14: Online Bipartite Matching Tim Roughgarden February 18, 16 1 Online Bipartite Matching Our final lecture on online algorithms concerns the online bipartite

More information

ALGORITHMIC TRADING AND ORDER ROUTING SERVICES POLICY

ALGORITHMIC TRADING AND ORDER ROUTING SERVICES POLICY ALGORITHMIC TRADING AND ORDER ROUTING SERVICES POLICY Please respond to: Trading Operations THE LONDON METAL EXCHANGE 10 Finsbury Square, London EC2A 1AJ Tel +44 (0)20 7113 8888 Registered in England no

More information

Evaluation. David Kauchak cs160 Fall 2009 adapted from:

Evaluation. David Kauchak cs160 Fall 2009 adapted from: Evaluation David Kauchak cs160 Fall 2009 adapted from: http://www.stanford.edu/class/cs276/handouts/lecture8-evaluation.ppt Administrative How are things going? Slides Points Zipf s law IR Evaluation For

More information

Vocabulary: Looking For Pythagoras

Vocabulary: Looking For Pythagoras Vocabulary: Looking For Pythagoras Concept Finding areas of squares and other figures by subdividing or enclosing: These strategies for finding areas were developed in Covering and Surrounding. Students

More information

Lab 2 Test collections

Lab 2 Test collections Lab 2 Test collections Information Retrieval, 2017 Goal Introduction The objective of this lab is for you to get acquainted with working with an IR test collection and Lemur Indri retrieval system. Instructions

More information

Hi everyone. I hope everyone had a good Fourth of July. Today we're going to be covering graph search. Now, whenever we bring up graph algorithms, we

Hi everyone. I hope everyone had a good Fourth of July. Today we're going to be covering graph search. Now, whenever we bring up graph algorithms, we Hi everyone. I hope everyone had a good Fourth of July. Today we're going to be covering graph search. Now, whenever we bring up graph algorithms, we have to talk about the way in which we represent the

More information

Q: Can we download a copy of the ICM/FSA video presentation for reference?

Q: Can we download a copy of the ICM/FSA video presentation for reference? General Q: Can we download a copy of the ICM/FSA video presentation for reference? A: Yes. A PowerPoint document is posted under the General Information section of the ICM Dashboard. Q: Is it OK not to

More information

p x i 1 i n x, y, z = 2 x 3 y 5 z

p x i 1 i n x, y, z = 2 x 3 y 5 z 3 Pairing and encoding functions Our aim in this part of the course is to show that register machines can compute everything that can be computed, and to show that there are things that can t be computed.

More information

The Basics of Graphical Models

The Basics of Graphical Models The Basics of Graphical Models David M. Blei Columbia University September 30, 2016 1 Introduction (These notes follow Chapter 2 of An Introduction to Probabilistic Graphical Models by Michael Jordan.

More information

Informed Search CS457 David Kauchak Fall 2011

Informed Search CS457 David Kauchak Fall 2011 Admin Informed Search CS57 David Kauchak Fall 011 Some material used from : Sara Owsley Sood and others Q3 mean: 6. median: 7 Final projects proposals looked pretty good start working plan out exactly

More information

Enhancing the Cybersecurity of Federal Information and Assets through CSIP

Enhancing the Cybersecurity of Federal Information and Assets through CSIP TECH BRIEF How BeyondTrust Helps Government Agencies Address Privileged Access Management to Improve Security Contents Introduction... 2 Achieving CSIP Objectives... 2 Steps to improve protection... 3

More information

Variables and Data Representation

Variables and Data Representation You will recall that a computer program is a set of instructions that tell a computer how to transform a given set of input into a specific output. Any program, procedural, event driven or object oriented

More information

EDRMS Document Migration Guideline

EDRMS Document Migration Guideline Title EDRMS Document Migration Guideline Creation Date 23 December 2016 Version 3.0 Last Revised 28 March 2018 Approved by Records Manager and IT&S Business Partner Approval date 28 March 2018 TABLE OF

More information

Synology Security Whitepaper

Synology Security Whitepaper Synology Security Whitepaper 1 Table of Contents Introduction 3 Security Policy 4 DiskStation Manager Life Cycle Severity Ratings Standards Security Program 10 Product Security Incident Response Team Bounty

More information

EXAM PREPARATION GUIDE

EXAM PREPARATION GUIDE When Recognition Matters EXAM PREPARATION GUIDE PECB Certified ISO/IEC 20000 Lead Auditor www.pecb.com The objective of the Certified ISO/IEC 20000 Lead Auditor examination is to ensure that the candidate

More information

CSCI 5417 Information Retrieval Systems. Jim Martin!

CSCI 5417 Information Retrieval Systems. Jim Martin! CSCI 5417 Information Retrieval Systems Jim Martin! Lecture 7 9/13/2011 Today Review Efficient scoring schemes Approximate scoring Evaluating IR systems 1 Normal Cosine Scoring Speedups... Compute the

More information

Sandeep Kharidhi and WenSui Liu ChoicePoint Precision Marketing

Sandeep Kharidhi and WenSui Liu ChoicePoint Precision Marketing Generalized Additive Model and Applications in Direct Marketing Sandeep Kharidhi and WenSui Liu ChoicePoint Precision Marketing Abstract Logistic regression 1 has been widely used in direct marketing applications

More information

UX Research in the Product Lifecycle

UX Research in the Product Lifecycle UX Research in the Product Lifecycle I incorporate how users work into the product early, frequently and iteratively throughout the development lifecycle. This means selecting from a suite of methods and

More information

ResPubliQA 2010

ResPubliQA 2010 SZTAKI @ ResPubliQA 2010 David Mark Nemeskey Computer and Automation Research Institute, Hungarian Academy of Sciences, Budapest, Hungary (SZTAKI) Abstract. This paper summarizes the results of our first

More information

IBM Security AppScan Enterprise v9.0.1 Importing Issues from Third Party Scanners

IBM Security AppScan Enterprise v9.0.1 Importing Issues from Third Party Scanners IBM Security AppScan Enterprise v9.0.1 Importing Issues from Third Party Scanners Anton Barua antonba@ca.ibm.com October 14, 2014 Abstract: To manage the challenge of addressing application security at

More information

It was a dark and stormy night. Seriously. There was a rain storm in Wisconsin, and the line noise dialing into the Unix machines was bad enough to

It was a dark and stormy night. Seriously. There was a rain storm in Wisconsin, and the line noise dialing into the Unix machines was bad enough to 1 2 It was a dark and stormy night. Seriously. There was a rain storm in Wisconsin, and the line noise dialing into the Unix machines was bad enough to keep putting garbage characters into the command

More information

4. Write a sum-of-products representation of the following circuit. Y = (A + B + C) (A + B + C)

4. Write a sum-of-products representation of the following circuit. Y = (A + B + C) (A + B + C) COP 273, Winter 26 Exercises 2 - combinational logic Questions. How many boolean functions can be defined on n input variables? 2. Consider the function: Y = (A B) (A C) B (a) Draw a combinational logic

More information

Agile Accessibility. Presenters: Ensuring accessibility throughout the Agile development process

Agile Accessibility. Presenters: Ensuring accessibility throughout the Agile development process Agile Accessibility Ensuring accessibility throughout the Agile development process Presenters: Andrew Nielson, CSM, PMP, MPA Ann Marie Davis, CSM, PMP, M. Ed. Cammie Truesdell, M. Ed. Overview What is

More information

Certified Digital Forensics Examiner

Certified Digital Forensics Examiner Certified Digital Forensics Examiner ACCREDITATIONS EXAM INFORMATION The Certified Digital Forensics Examiner exam is taken online through Mile2 s Assessment and Certification System ( MACS ), which is

More information

Plan for today. CS276B Text Retrieval and Mining Winter Vector spaces and XML. Text-centric XML retrieval. Vector spaces and XML

Plan for today. CS276B Text Retrieval and Mining Winter Vector spaces and XML. Text-centric XML retrieval. Vector spaces and XML CS276B Text Retrieval and Mining Winter 2005 Plan for today Vector space approaches to XML retrieval Evaluating text-centric retrieval Lecture 15 Text-centric XML retrieval Documents marked up as XML E.g.,

More information

Building an Assurance Foundation for 21 st Century Information Systems and Networks

Building an Assurance Foundation for 21 st Century Information Systems and Networks Building an Assurance Foundation for 21 st Century Information Systems and Networks The Role of IT Security Standards, Metrics, and Assessment Programs Dr. Ron Ross National Information Assurance Partnership

More information

WHITE PAPER ediscovery & Netmail. SEARCH, PRODUCE, and EXPORT EXACTLY WHAT YOU RE LOOKING FOR.

WHITE PAPER ediscovery & Netmail. SEARCH, PRODUCE, and EXPORT EXACTLY WHAT YOU RE LOOKING FOR. WHITE PAPER ediscovery & Netmail SEARCH, PRODUCE, and EXPORT EXACTLY WHAT YOU RE LOOKING FOR. ediscovery and Netmail In the ediscovery world, the Electronic Discovery Reference Model (EDRM) has been the

More information

Question 1: Duplicate Counter (30%)

Question 1: Duplicate Counter (30%) Question 1: Duplicate Counter (30%) Write a program called Duplicates.java which will determine whether a string contains duplicate characters, and where they occur. Your program should read in a single

More information

Information Bulletin

Information Bulletin Application of Primary and Secondary Reference Documents Version 1.1 Approved for release July 2014 Table of Contents 1.0 Purpose statement... 3 2.0 Audience... 3 3.0 BCA requirements and referenced documents...

More information

LAB: FOR LOOPS IN C++

LAB: FOR LOOPS IN C++ LAB: FOR LOOPS IN C++ MODULE 2 JEFFREY A. STONE and TRICIA K. CLARK COPYRIGHT 2014 VERSION 4.0 PALMS MODULE 2 LAB: FOR LOOPS IN C++ 2 Introduction This lab will provide students with an introduction to

More information

Slide Set 5. for ENEL 353 Fall Steve Norman, PhD, PEng. Electrical & Computer Engineering Schulich School of Engineering University of Calgary

Slide Set 5. for ENEL 353 Fall Steve Norman, PhD, PEng. Electrical & Computer Engineering Schulich School of Engineering University of Calgary Slide Set 5 for ENEL 353 Fall 207 Steve Norman, PhD, PEng Electrical & Computer Engineering Schulich School of Engineering University of Calgary Fall Term, 207 SN s ENEL 353 Fall 207 Slide Set 5 slide

More information

Repetition Structures

Repetition Structures Repetition Structures Chapter 5 Fall 2016, CSUS Introduction to Repetition Structures Chapter 5.1 1 Introduction to Repetition Structures A repetition structure causes a statement or set of statements

More information

Condition-Controlled Loop. Condition-Controlled Loop. If Statement. Various Forms. Conditional-Controlled Loop. Loop Caution.

Condition-Controlled Loop. Condition-Controlled Loop. If Statement. Various Forms. Conditional-Controlled Loop. Loop Caution. Repetition Structures Introduction to Repetition Structures Chapter 5 Spring 2016, CSUS Chapter 5.1 Introduction to Repetition Structures The Problems with Duplicate Code A repetition structure causes

More information

Searching for Expertise

Searching for Expertise Searching for Expertise Toine Bogers Royal School of Library & Information Science University of Copenhagen IVA/CCC seminar April 24, 2013 Outline Introduction Expertise databases Expertise seeking tasks

More information

Non-Linearity of Scorecard Log-Odds

Non-Linearity of Scorecard Log-Odds Non-Linearity of Scorecard Log-Odds Ross McDonald, Keith Smith, Matthew Sturgess, Edward Huang Retail Decision Science, Lloyds Banking Group Edinburgh Credit Scoring Conference 6 th August 9 Lloyds Banking

More information

CS47300: Web Information Search and Management

CS47300: Web Information Search and Management CS47300: Web Information Search and Management Prof. Chris Clifton 27 August 2018 Material adapted from course created by Dr. Luo Si, now leading Alibaba research group 1 AD-hoc IR: Basic Process Information

More information

Sample Exam. Certified Tester Foundation Level

Sample Exam. Certified Tester Foundation Level Sample Exam Certified Tester Foundation Level Answer Table ASTQB Created - 2018 American Stware Testing Qualifications Board Copyright Notice This document may be copied in its entirety, or extracts made,

More information

Certified Digital Forensics Examiner

Certified Digital Forensics Examiner Certified Digital Forensics Examiner Course Title: Certified Digital Forensics Examiner Duration: 5 days Class Format Options: Instructor-led classroom Live Online Training Prerequisites: A minimum of

More information

Use of Synthetic Data in Testing Administrative Records Systems

Use of Synthetic Data in Testing Administrative Records Systems Use of Synthetic Data in Testing Administrative Records Systems K. Bradley Paxton and Thomas Hager ADI, LLC 200 Canal View Boulevard, Rochester, NY 14623 brad.paxton@adillc.net, tom.hager@adillc.net Executive

More information

XBRL US Domain Steering Committee Taxonomy Review

XBRL US Domain Steering Committee Taxonomy Review XBRL US Domain Steering Committee Taxonomy Review for XBRL US Work In Process Taxonomy 2016 As Approved by the DSC on September 8, 2016 Prepared by: Scott Theis, DSC Chairman Campbell Pryde Michelle Savage

More information

Information Technology Branch Organization of Cyber Security Technical Standard

Information Technology Branch Organization of Cyber Security Technical Standard Information Technology Branch Organization of Cyber Security Technical Standard Information Management, Administrative Directive A1461 Cyber Security Technical Standard # 1 November 20, 2014 Approved:

More information

Multimedia Information Retrieval

Multimedia Information Retrieval Multimedia Information Retrieval Prof Stefan Rüger Multimedia and Information Systems Knowledge Media Institute The Open University http://kmi.open.ac.uk/mmis Multimedia Information Retrieval 1. What are

More information

Viewpoint Review & Analytics

Viewpoint Review & Analytics The Viewpoint all-in-one e-discovery platform enables law firms, corporations and service providers to manage every phase of the e-discovery lifecycle with the power of a single product. The Viewpoint

More information

Certification Standing Committee (CSC) Charter. Appendix A Certification Standing Committee (CSC) Charter

Certification Standing Committee (CSC) Charter. Appendix A Certification Standing Committee (CSC) Charter Appendix A A1 Introduction A1.1 CSC Vision and Mission and Objectives Alignment with Boundaryless Information Flow: Our vision is a foundation of a scalable high integrity TOGAF certification programs

More information

CS6375: Machine Learning Gautam Kunapuli. Mid-Term Review

CS6375: Machine Learning Gautam Kunapuli. Mid-Term Review Gautam Kunapuli Machine Learning Data is identically and independently distributed Goal is to learn a function that maps to Data is generated using an unknown function Learn a hypothesis that minimizes

More information

OCM ACADEMIC SERVICES PROJECT INITIATION DOCUMENT. Project Title: Online Coursework Management

OCM ACADEMIC SERVICES PROJECT INITIATION DOCUMENT. Project Title: Online Coursework Management OCM-12-025 ACADEMIC SERVICES PROJECT INITIATION DOCUMENT Project Title: Online Coursework Management Change Record Date Author Version Change Reference March 2012 Sue Milward v1 Initial draft April 2012

More information

NYS Early Learning Trainer Credential. Portfolio Instructions

NYS Early Learning Trainer Credential. Portfolio Instructions TC Portfolio Guidelines 8/17/2011 Contents Trainer Definitions 3 General Instructions.....4 Credential Levels...4 Portfolio Structure...5 Portfolio Content Parts 1 and 2: Portfolio Entries.........5 Portfolio

More information

Chapter 4 EDGE Approval Protocol for Auditors Version 3.0 June 2017

Chapter 4 EDGE Approval Protocol for Auditors Version 3.0 June 2017 Chapter 4 EDGE Approval Protocol for Auditors Version 3.0 June 2017 Copyright 2017 International Finance Corporation. All rights reserved. The material in this publication is copyrighted by International

More information

EXAM PREPARATION GUIDE

EXAM PREPARATION GUIDE When Recognition Matters EXAM PREPARATION GUIDE PECB Certified ISO 22000 Lead Auditor www.pecb.com The objective of the Certified ISO 22000 Lead Auditor examination is to ensure that the candidate has

More information

Kuratowski Notes , Fall 2005, Prof. Peter Shor Revised Fall 2007

Kuratowski Notes , Fall 2005, Prof. Peter Shor Revised Fall 2007 Kuratowski Notes 8.30, Fall 005, Prof. Peter Shor Revised Fall 007 Unfortunately, the OCW notes on Kuratowski s theorem seem to have several things substantially wrong with the proof, and the notes from

More information

Graphical Models. David M. Blei Columbia University. September 17, 2014

Graphical Models. David M. Blei Columbia University. September 17, 2014 Graphical Models David M. Blei Columbia University September 17, 2014 These lecture notes follow the ideas in Chapter 2 of An Introduction to Probabilistic Graphical Models by Michael Jordan. In addition,

More information

Today s Topics. Percentile ranks and percentiles. Standardized scores. Using standardized scores to estimate percentiles

Today s Topics. Percentile ranks and percentiles. Standardized scores. Using standardized scores to estimate percentiles Today s Topics Percentile ranks and percentiles Standardized scores Using standardized scores to estimate percentiles Using µ and σ x to learn about percentiles Percentiles, standardized scores, and the

More information

Configuring IP Multicast Routing

Configuring IP Multicast Routing 34 CHAPTER This chapter describes how to configure IP multicast routing on the Cisco ME 3400 Ethernet Access switch. IP multicasting is a more efficient way to use network resources, especially for bandwidth-intensive

More information

Introduction to Counting, Some Basic Principles

Introduction to Counting, Some Basic Principles Introduction to Counting, Some Basic Principles These are the class notes for week. Before we begin, let me just say something about the structure of the class notes. I wrote up all of these class notes

More information

Instructions for Form DS-7787: Disclosure of Violations of the Arms Export Control Act

Instructions for Form DS-7787: Disclosure of Violations of the Arms Export Control Act Instructions for Form DS-7787: Disclosure of Violations of the Arms Export Control Act General Instructions: 1 The size of the text field will correspond to the type of information required, with more

More information

Linear combinations of simple classifiers for the PASCAL challenge

Linear combinations of simple classifiers for the PASCAL challenge Linear combinations of simple classifiers for the PASCAL challenge Nik A. Melchior and David Lee 16 721 Advanced Perception The Robotics Institute Carnegie Mellon University Email: melchior@cmu.edu, dlee1@andrew.cmu.edu

More information

HMH Leadership Form (Leader Guide)

HMH Leadership Form (Leader Guide) b a a b c Overview: Leadership Evaluation Form Getting Started: On the Home Page, you ll see a To-Do tile for any modules that have pending tasks assigned to you. Either (a) Click on the hyperlink to navigate

More information

WHITEPAPER THE EVOLUTION OF APPSEC: FROM WAFS TO AUTONOMOUS APPLICATION PROTECTION

WHITEPAPER THE EVOLUTION OF APPSEC: FROM WAFS TO AUTONOMOUS APPLICATION PROTECTION WHITEPAPER THE EVOLUTION OF APPSEC: FROM WAFS TO AUTONOMOUS APPLICATION PROTECTION 2 Web application firewalls (WAFs) entered the security market at the turn of the century as web apps became increasingly

More information

Sample Exam. Advanced Test Automation - Engineer

Sample Exam. Advanced Test Automation - Engineer Sample Exam Advanced Test Automation - Engineer Questions ASTQB Created - 2018 American Software Testing Qualifications Board Copyright Notice This document may be copied in its entirety, or extracts made,

More information

Data Center Consolidation and Migration Made Simpler with Visibility

Data Center Consolidation and Migration Made Simpler with Visibility Data Center Consolidation and Migration Made Simpler with Visibility Abstract The ExtraHop platform takes the guesswork out of data center consolidation and migration efforts by providing complete visibility

More information

The HIPAA Omnibus Rule

The HIPAA Omnibus Rule The HIPAA Omnibus Rule What You Should Know and Do as Enforcement Begins Rebecca Fayed, Associate General Counsel and Privacy Officer Eric Banks, Information Security Officer 3 Biographies Rebecca C. Fayed

More information

WECC Internal Controls Evaluation Process WECC Compliance Oversight Effective date: October 15, 2017

WECC Internal Controls Evaluation Process WECC Compliance Oversight Effective date: October 15, 2017 WECC Internal Controls Evaluation Process WECC Compliance Oversight Effective date: October 15, 2017 155 North 400 West, Suite 200 Salt Lake City, Utah 84103-1114 WECC Internal Controls Evaluation Process

More information