Usage Guide to Handling of Bayesian Class Data

CAMELOT Security 2005

1. Basics

The classification of textual data has become much more important in recent times. The reason is the strong increase in unwanted mass mail (SPAM). Regular text filters search the message text for known words and phrases. This is often no longer effective enough. Mass-mailer applications have meanwhile learned how text filters work. They change their phrases all the time to find loopholes and bypass the filter system. Words like VIAGRA are changed to V*I*A*G*R*A or V1AGRA. Simple text filters fail at this point, because the known text patterns no longer appear in their original form.

Text data classification determines probabilities for different cases in order to find the best-matching category. The calculation is based not only on the data itself but also on the co-occurrence of different data, e.g. the combination of different words or phrases in the message text. The classification process is based on Bayes' theorem for conditional probability. The method is quite easy to use, but it needs a well-structured set of test data. The existing test data ultimately determines the quality of the classification results.

The end result of a Bayesian classification is never 100% certain; it only states the probability that one case applies. The result of a text pattern search, by contrast, is always unambiguous, but only if the procedure is used with known text patterns. The benefit of a Bayesian classification system in identifying spam is that the analysis also returns usable results with unknown data sources.
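The weakness of a plain pattern search can be illustrated with a minimal Python sketch. The blocklist and the function name pattern_filter are hypothetical and are not part of CAMELOT; the sketch only shows that a literal match misses the obfuscated spellings mentioned above.

```python
# Hypothetical blocklist, for illustration only (not CAMELOT's pattern data).
KNOWN_PATTERNS = {"viagra"}

def pattern_filter(text: str) -> bool:
    """Return True if any known pattern occurs literally in the text."""
    lowered = text.lower()
    return any(pattern in lowered for pattern in KNOWN_PATTERNS)

messages = ["Buy VIAGRA now", "Buy V1AGRA now", "Buy V*I*A*G*R*A now"]
results = [pattern_filter(message) for message in messages]
print(results)  # only the first, unmodified spelling is detected
```

Only the unmodified spelling is caught; the two variants pass the filter unchanged.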

2. The Bayesian Theorem for Conditional Probability (Naive Bayes)

The probability calculation is based on the Naive Bayes method. Naive Bayes is a procedure that classifies data based on Bayes' theorem for conditional probability. The classification system is first trained with data from different classes; the attributes in the training data are used to calculate relative probabilities, which are assigned to the corresponding classes. The results are then stored in a special database that serves as the foundation for classifying all new data. Since the probability of one event is conditional on the probability of a previous one, the predefined classes can be used to classify all the new data that is received.

New data is classified based on the stored test data. The classifier calculates the probabilities for each single word or phrase. Each probability results from the frequency of the class data and the frequency of the words in the message text. The class with the highest probability determines the end result.

It is very important to note that a Naive Bayes system is unusable in its original, untrained state, since the classification process and its success rate are based on the available data. The system therefore needs to learn first what is considered spam and what is not spam. The pool of training data must contain enough recognizable attributes to allow the Bayesian classifier to work properly. Precisely this is the actual benefit of the method: existing data can be expanded by well-directed training and adapted to your personal needs.

The definition of Naive Bayes requires all given attributes to be independent of each other. This may not always be the case; however, the procedure has historically provided excellent results, with data records not being assigned to only a single class but identified as belonging to several classes with their respective probabilities. A Bayesian classification system therefore also returns usable results with unknown data sources.
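The training and classification steps described in this chapter can be sketched in a few lines of Python. This is a generic Naive Bayes illustration under stated assumptions, not CAMELOT's implementation: the function names, the whitespace tokenizer, the sample messages, and the Laplace smoothing are all assumptions made for the example.

```python
import math
from collections import Counter

def train(docs):
    """docs: list of (text, label). Returns per-class word counts and document counts."""
    word_counts = {}
    doc_counts = Counter()
    for text, label in docs:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return word_counts, doc_counts

def classify(text, word_counts, doc_counts):
    """Return {label: probability} for the text; the probabilities sum to 1."""
    vocabulary = set().union(*word_counts.values())
    total_docs = sum(doc_counts.values())
    log_scores = {}
    for label, counts in word_counts.items():
        total_words = sum(counts.values())
        score = math.log(doc_counts[label] / total_docs)  # class prior
        for word in text.lower().split():
            # Laplace smoothing (an assumption) keeps unseen words from
            # forcing the whole product to zero.
            score += math.log((counts[word] + 1) / (total_words + len(vocabulary)))
        log_scores[label] = score
    # Convert log scores to normalized probabilities in a numerically safe way.
    top = max(log_scores.values())
    weights = {label: math.exp(score - top) for label, score in log_scores.items()}
    norm = sum(weights.values())
    return {label: weight / norm for label, weight in weights.items()}

# Tiny hypothetical training pool with both classes represented equally.
training = [
    ("cheap pills buy now", "INVALID"),
    ("win money buy now", "INVALID"),
    ("meeting agenda for monday", "VALID"),
    ("project report attached", "VALID"),
]
word_counts, doc_counts = train(training)
result = classify("buy cheap pills", word_counts, doc_counts)
best = max(result, key=result.get)
print(best, result)
```

The class with the highest probability is the end result, while the remaining probability is still reported for the other class, which matches the idea that a record is rated against several classes rather than forced into one.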

3. The CAMELOT Bayesian Text Classifier

CAMELOT uses the Naive Bayes method for the classification of textual data. Together with the regular text pattern analyzer, SPAM can be reduced to a minimum. The text pattern analyzer first detects known text patterns and continues with the corresponding action for the respective message. All unidentified content is then classified by the Bayesian Text Classifier.

The required test data is stored in special data stores. The system differentiates between global and user-dependent data. These data stores can be expanded or reduced by several training methods. This is usually done with messages declared as SPAM or NOT SPAM.

To create an optimized basic dataset, the quality of the training data itself is very important. The administrator needs to review that data to find the right items to train. Only well-structured training data results in a functional data store. The required data store must be created and maintained by the administrator. User-dependent data can be managed by the users themselves, but the administrator should always keep an eye on it to guarantee a clean environment.

Badly structured Bayesian class data can cause unwanted results. If clean messages are wrongly trained as INVALID, all messages with similar contents will also be classified as SPAM. This can cause a loss of messages.

4. The CAMELOT Bayesian Data Manager

The CAMELOT Bayesian Data Manager is a management tool for Bayesian class data. The administrator can use this application to train and optimize Bayesian data stores. The Bayesian Data Manager returns important information about the consistency of the data and contains a classification feature that simulates classification processes in the same way as the security service works. The handling of Bayesian class data and the usage of the Bayesian Data Manager are explained in the following chapters.

Preliminaries

The configuration of Bayesian class data needs a few preparations first. First, an appropriate data store must be created. Second, a well-structured pool of training data is needed. This is usually a pool of messages with known content, declared as SPAM and NOT SPAM.

The usage of Bayesian classification requires the following basic rules:

The class data for valid and invalid contents should always be trained approximately equally. That means that the portion of one class should not be significantly bigger than that of the other class. A well-known mistake is to train only SPAM messages. Some vendors of anti-spam software deliver their products with SPAM data only. The result is that all messages are classified as SPAM. Useful classification results can only be achieved when the classification feature knows the difference between SPAM and NOT SPAM. This is only possible when the database contains both valid and invalid data.

The second important rule is that you train the data store yourself. Even valid data always differs widely. Messages of companies in the medical sector consist primarily of medical terms. Messages of a travel agency contain information about traveling. The class data of a travel agency would therefore never work well for a medical company. Software products with predefined class data fail more often than not for this reason.

Creating a Data Store with Bayesian Class Data

The CAMELOT setup wizard creates two Bayesian data stores automatically during the installation process. Both are named Bayesian Class Data, but their types differ: one is for global data and the other is used for user-dependent data. To begin with, you should use the global profile.

If you have no Bayesian Class Data profile in your environment, you can create one in a few minutes. Start the CAMELOT Configuration Utility and select the Data Stores tab. Click on Add to create a new profile. The data store wizard opens. On the second wizard page, you need to select the type of data to use with the store. Select Bayesian Classifier Definition data and click on Next. On the following wizard page, you need to select whether it should be a global or a user-dependent data store. Please select Global Data. The values for Name and Description on the next page are arbitrary.

On the following pages, you need to select a data source. It is strongly recommended to use a database at this point. Bayesian class data is very memory-intensive, so fast data access is very important. The required database tables are created automatically by the data store wizard.

Assembling the Training Data

For the first startup, a number of text files or messages is needed. These files must be well sorted with regard to their contents. Create two directories on the hard disk for this purpose, named CLEAN and SPAM. These directories will be used later to store the files with valid (CLEAN) and invalid (SPAM) data. Now copy the files with the appropriate contents to these directories.

The invalid data directory should contain only files with invalid contents. Use only files whose contents you consider to be SPAM. You can use regular text files or RFC822-compatible message files, as found in the CAMELOT directories. The Quarantine directory may contain useful data.

The directory for valid data should contain only files with clean data. These are messages from regular transactions. These files should contain nothing that looks like SPAM.

This initial selection is very important, because this data is the basis for later classifications. For your first tests, about 20 or 30 files in each directory should be enough. The test data should be kept balanced at all times. An excess of valid or invalid data can bias the classification process toward one side.
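Before training, the balance of the two directories can be checked with a short script. The following Python sketch is only an illustration: the helper names, the 20% tolerance, and the temporary directory used for the demonstration are assumptions and have nothing to do with the CAMELOT tools.

```python
import tempfile
from pathlib import Path

def count_files(directory: Path) -> int:
    """Count the regular files directly inside a directory."""
    return sum(1 for entry in directory.iterdir() if entry.is_file())

def is_balanced(clean: int, spam: int, tolerance: float = 0.2) -> bool:
    """True if the smaller class is within `tolerance` of the larger one.

    The tolerance value is an arbitrary assumption for this sketch.
    """
    if clean == 0 or spam == 0:
        return False  # a class without any data can never be balanced
    return min(clean, spam) / max(clean, spam) >= 1.0 - tolerance

# Demonstration in a throwaway directory; in practice, point the two paths
# at your own CLEAN and SPAM directories.
with tempfile.TemporaryDirectory() as root:
    clean_dir = Path(root) / "CLEAN"
    spam_dir = Path(root) / "SPAM"
    clean_dir.mkdir()
    spam_dir.mkdir()
    for i in range(25):
        (clean_dir / f"clean_{i}.txt").write_text("regular business message")
    for i in range(22):
        (spam_dir / f"spam_{i}.txt").write_text("unwanted advertising")
    clean_count = count_files(clean_dir)
    spam_count = count_files(spam_dir)

balanced = is_balanced(clean_count, spam_count)
print(clean_count, spam_count, balanced)
```

Counting files is only a rough indicator, since the files may differ greatly in length, but it catches the common mistake of collecting data for one class only.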

Starting the Bayesian Data Manager

Start the Bayesian Data Manager from the CAMELOT group of the Windows start menu. The main window opens. Click on Open to open a data store. The store list shows all existing data stores that are suitable for Bayesian class data. If you want to select a user-dependent data store, you also need to enter a user address to initialize the store.

The data store is opened in active mode by default. In this case, the Passive mode option is not set. In active mode, a database cursor is created but no data is loaded. The required data is loaded only when it is needed. This mode is very useful when your database is very big.

Enable Passive mode if you want to load the whole database into memory. In this case, all operations run much faster. In addition, your changes are not saved to the database immediately. The data is saved when you click on Save. This option is not recommended for big databases, because the load process can take a very long time.

For your first test, please open an empty data store. You can use the store created by the CAMELOT setup wizard or the one you created yourself before. In both cases, you should open a global data store. Passive mode is also recommended at this point. It allows you to experiment with all the features without destroying any data in the database.

After you have opened the data store, all list boxes in the left pane stay empty. The statistics on the right-hand side do not show any information either, because there is no data in the database.

Building the Data Store by Training with Existing Data

Select the Training button from the toolbar. The training feature appears in the right pane. This feature allows you to expand or reduce your data store with existing data. There are basically two classes, VALID and INVALID. All data trained in this way receives its properties for later classification.

Click on the Add File button and select all files located in your valid data directory. Then click on Train Valid to expand the data store with valid class data. All data trained in this way receives the property VALID. Now click on Clear List and select all the files from your invalid data directory using the Add File button. Train these files by clicking on the Train Invalid button. All selected files now receive the INVALID property.

The lists on the left-hand side now contain the trained data. There are two lists, one for invalid and one for valid data. The probability column shows the probability of each single word. This value depends on the frequency of the word itself as well as on the number of words in the data store. If the frequency of only one word changes, the probability of all words in the store changes.
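Why a change to one word affects all probabilities can be seen in a small Python sketch. It assumes the simplest possible model, in which the probability of a word is its count divided by the total count of all words in the class; the exact formula used by the Bayesian Data Manager is not documented here, so treat this as an illustration only.

```python
from collections import Counter

def word_probabilities(counts):
    """Relative frequency of each word: count / total number of words (assumed model)."""
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

# Hypothetical word counts of one class.
store = Counter({"offer": 2, "free": 1, "click": 1})
before = word_probabilities(store)

# Training more data that contains only the word "offer" ...
store["offer"] += 4
after = word_probabilities(store)

# ... lowers the probability of every other word as well.
print(before["free"], after["free"])
```

In this example the count of "free" never changes, yet its probability drops from 0.25 to 0.125 because the total number of words has grown.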

The training statistics in the lower pane show two graphs. The right one shows how much data was added to the database and how much was modified. Data is modified when the word already exists in the database. The word is not added a second time; instead, the probability of the existing record is increased. If the Untrain Valid or Untrain Invalid feature is used, the graph shows the number of records deleted from the database.

The left graph shows the part of the original data used for training. Because of a special stopwords list, not all data is used for training. The textual data is normalized first by removing all words like AND, OR, THE, etc. from the original text. Since such words occur very often, their basic probability would be much higher than the probability of other words. The classification would be dominated by these words and the result would be unusable. The stopwords list is also shown on the left-hand side. You can modify this list by adding or deleting words. A detailed stopwords list is very important for good classification results.

The No duplicates option in the right pane prevents duplicated training data. If one file is trained multiple times, its data is counted multiple times, so its basic probability would also be much higher. This would shift the tendency of the data store in the wrong direction. The option can only be used if your data store works with a CRC table. This table is used to register all processed text contents. If no CRC table is used, the system is not able to check for duplicates.
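The interplay of stopword normalization and duplicate detection can be sketched as follows. The stopword excerpt, the function names, and the use of zlib.crc32 over the raw text are assumptions for illustration; the layout of CAMELOT's CRC table is not described in this guide.

```python
import zlib

# Hypothetical excerpt of a stopwords list.
STOPWORDS = {"and", "or", "the"}

def normalize(text):
    """Lower-case the text and drop all stopwords."""
    return [word for word in text.lower().split() if word not in STOPWORDS]

# Stands in for the CRC table: checksums of all texts trained so far.
seen_crcs = set()

def train_once(text, store):
    """Train `text` into `store` unless the same text was trained before."""
    crc = zlib.crc32(text.encode("utf-8"))
    if crc in seen_crcs:
        return False  # duplicate: do not count the words a second time
    seen_crcs.add(crc)
    for word in normalize(text):
        store[word] = store.get(word, 0) + 1
    return True

store = {}
first = train_once("The offer and the prize", store)
second = train_once("The offer and the prize", store)  # same text again
print(first, second, store)
```

The second call is rejected, so the counts for "offer" and "prize" stay at 1 and the stopwords never reach the store at all.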

The Pool Statistics

The Statistics button on the toolbar opens the statistics information tab for the current data store in the right window pane. This area did not contain any data when the application was started for the first time. After successful training with valid and invalid data, the properties of the data store have changed. The table now contains the corresponding values for VALID and INVALID pool data. You can see the size of the data store and its consistency.

Two properties are very important here: the relationship between valid and invalid data, and the portion of single-word records in the database.

The relationship between valid and invalid data is defined by the probability of both classes. This is not the same as the number of words in the data store. The basic probability of the data store is the sum of the probabilities of all single words. It is calculated from the count of each single word and the number of all words in the data store. A large number of database records therefore says nothing about the tendency of the class. Classes with a low number of words can be rated much higher than classes with more words, because single words carry stronger weights.

The basic rating of the data store essentially equals the tendency of the store. This is shown in the left graph. If one of the classes is too heavy, the tendency of the store points in the direction of this class. If the valid class is rated much higher than the invalid class, the data store tends to the valid side. In this case, the hit rate for spam detection would be very low. If the invalid class is rated too high, spam detection works fine, but the number of false-positive classifications grows. In this case, valid messages are classified as SPAM. It is very important to keep your data store well balanced. Large tendencies to one side must be prevented.
Minor tendencies are irrelevant for good classification results, provided that the value for the minimum score in your policy profile is not too low.

The second graph shows the portion of single-word records in the data store. The value is of interest for big databases. An overloaded data store with Bayesian class data returns improper results. If the amount of test data is too big, the dataset is almost identical to the training data, but far away from reality. For this reason, the data store must be optimized from time to time. This process is called pruning. The term is also used in landscaping, where trees get their branches pruned. In that process, it is always the small branches that are pruned. The same thing happens when the data store is optimized. Single-word records appear only once in the database. These records do not have a high probability value, but the classification can change its direction when many single-word records exist. Therefore, these records must be pruned. The right graph shows how big the portion of single-word records is. The data store should be pruned when this value is too high.

Note: The pruning process is only reasonable for large datasets. Smaller datasets consist mostly of single-word records. This is not a problem. It can become a problem when the amount of data grows fast. In this case, the reference to real life gets lost. Such datasets can be optimized by pruning single-word records.
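The portion of single-word records shown in the right graph can be approximated with a one-line calculation. The sketch below assumes that a data store can be represented as a mapping from word to occurrence count; the function name and the sample data are hypothetical.

```python
def single_word_portion(counts):
    """Share of records whose word occurs exactly once in the store."""
    if not counts:
        return 0.0
    singles = sum(1 for count in counts.values() if count == 1)
    return singles / len(counts)

# Hypothetical store: two frequent words and two words seen only once.
store = {"offer": 5, "free": 3, "xqzt": 1, "lorem": 1}
portion = single_word_portion(store)
print(portion)
```

Here half of the records are single-word records. For a small test store this is expected; whether a given value is too high for a large store is a judgment the guide leaves to the administrator.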

Data Classification

The Classification feature is used to simulate the classification of textual data. This feature works in the same way as the CAMELOT Security service does. It thereby makes it possible to check the data store against specific text data. The text can be entered directly into the text box on the Text data classification tab or loaded from a file with the Import button. The File data classification tab is used for the classification of multiple files. In this case, the textual data is loaded directly from file. The classification feature is well suited to checking the previously trained data.

For your first test, please click on the Import button and select a file from your valid data directory. The text appears in the text box. The HTML Mode option is not needed in this case. You should use that option when the text contains HTML code. The import feature decodes the HTML parts automatically when the file is loaded.

Click on Classify to start the classification process. All single hits are now tagged in green and red, where green means valid and red means invalid hits. Please note that the distribution of the colors does not necessarily match the classification result. The frequency of the words is only one factor. The other is the basic probability of every single word.

The lower pane again shows two graphs. The left one is similar to the one in the training feature. It shows the part of the data that was processed. The classification process also normalizes the text data first, based on the stopwords list. You can test this process yourself with the Normalize button. The text box then shows the reduced version of the original text. Another classification would now return a 100% rate of processed data.

The right graph shows the classification result. This should be almost 100% Valid in your case, because you selected a file from your valid data directory. You can repeat the same process with a file from your invalid data directory. In this case, your result will be almost 100% Invalid. This test can be repeated with other files from your training directories. The result will always look similar.

More interesting is the test in which you classify independent files, i.e. files that do not originate from your training directories. The result will be somewhere between 100% Valid and 100% Invalid. This is a typical case of classifying unknown data, as it happens every day in real life.

The results of a Bayesian classification are given as a value between 0 and 100%. The decision whether a message is VALID or INVALID is based on the minimum score of the policy profile. This value should not be too close to the 50% mark.

The File data classification feature is basically identical to the operations above. However, this feature can process multiple files at once. In this case, the result is the total result of all files.
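How a minimum score turns a probability into a decision can be sketched as follows. The function name, the default of 0.8, and the convention that the score is compared against the INVALID probability are assumptions made for this example; the actual semantics of the policy profile setting are defined by the CAMELOT configuration.

```python
def decide(invalid_probability: float, minimum_score: float = 0.8) -> str:
    """Classify as INVALID only if the probability reaches the minimum score.

    Both the default value and the comparison direction are assumptions.
    """
    return "INVALID" if invalid_probability >= minimum_score else "VALID"

decisions = [decide(p) for p in (0.97, 0.80, 0.55, 0.10)]
print(decisions)
```

With a threshold well above 50%, a message rated 55% Invalid is still delivered as VALID, which is why a minimum score close to the 50% mark would react to even minor tendencies of the data store.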

Optimizing Bayesian Class Data (Pruning)

The optimization of Bayesian class data is very important, especially for large datasets. Using an overloaded data store can return very improper results. The store data is then almost identical to the training data and out of touch with reality. The imprecision is caused mostly by so-called single-word records. These words appear only once in the database. Their basic probability is rather low, but a high number of such records can change the result. Many of these words can be prevented by a suitable stopwords list, but sometimes a database optimization cannot be avoided.

The optimization process is called pruning. Landscape gardeners know the term from their yearly pruning, when they cut the thin branches from the trees. They always cut the thin branches so that the tree does not run to seed. The pruning process for a Bayesian data store works in the same way.

The table in the Pruning tab contains the occurrence of single-word records. The graph below it shows the distribution of the word occurrence counts. This curve always slopes downwards. That means that the words with the lowest count appear most often. In your test dataset, this must be the count of 1. With the help of the ruler on the right-hand side, you can scale the horizontal and vertical range of the graph, so you need to find the right settings for your dataset. The range from 1 to 5 on the axis is probably the most interesting, because your dataset is quite small. The Logarithmic scaling option is useful to make the jumps stand out.

To test this feature, please set the pruning level to 1 and click on Start Pruning. The single-word records are now removed from the database. After the process is finished, the graph changes. The lowest value is no longer 1; it is now 2. The curve is also flatter now. This indicates a better consistency of the data store.

Note: This process is only reasonable for large datasets. Your data store has just been optimized, but in fact important records were lost. Accordingly, pruning a small dataset is not a real optimization: the deleted data was a fundamental part of the store's substance.
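The effect of pruning at level 1 can be reproduced on a toy dataset. This Python sketch assumes that a data store is a mapping from word to occurrence count and that pruning at a given level removes all records whose count does not exceed that level; the function names and the sample data are hypothetical.

```python
from collections import Counter

def occurrence_histogram(store):
    """Map each occurrence count to the number of words that have this count."""
    return Counter(store.values())

def prune(store, level=1):
    """Keep only the records that occur more often than `level` (assumed rule)."""
    return {word: count for word, count in store.items() if count > level}

# Hypothetical small store: three of the six records are single-word records.
store = {"offer": 5, "free": 3, "click": 2, "xqzt": 1, "lorem": 1, "ipsum": 1}
hist_before = occurrence_histogram(store)

pruned = prune(store, level=1)
hist_after = occurrence_histogram(pruned)
lowest = min(pruned.values())
print(dict(hist_before), dict(hist_after), lowest)
```

Before pruning, the count of 1 is the most frequent one; afterwards the lowest remaining count is 2. The example also shows the cost named in the note: on a dataset this small, half of the records are gone.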


Site Owners: Cascade Basics. May 2017 Site Owners: Cascade Basics May 2017 Page 2 Logging In & Your Site Logging In Open a browser and enter the following URL (or click this link): http://mordac.itcs.northwestern.edu/ OR http://www.northwestern.edu/cms/

More information

Hans Karlsen. MDriven The book. Doing effective Business by taking control of Information. Hans Karlsen, Stockholm Sweden

Hans Karlsen. MDriven The book. Doing effective Business by taking control of Information. Hans Karlsen, Stockholm Sweden Hans Karlsen MDriven The book Doing effective Business by taking control of Information Hans Karlsen, Stockholm Sweden 2016-07-28 Part xx MDriven Turnkey 1 Information Security IT-security covers security

More information

Faculty Development Seminar Series Constructing Posters in PowerPoint 2003 Using a Template

Faculty Development Seminar Series Constructing Posters in PowerPoint 2003 Using a Template 2008-2009 Faculty Development Seminar Series Constructing Posters in PowerPoint 2003 Using a Template Office of Medical Education Research and Development Michigan State University College of Human Medicine

More information

Microsoft Word Introduction

Microsoft Word Introduction Academic Computing Services www.ku.edu/acs Abstract: This document introduces users to basic Microsoft Word 2000 tasks, such as creating a new document, formatting that document, using the toolbars, setting

More information

Microsoft Excel 2010 Handout

Microsoft Excel 2010 Handout Microsoft Excel 2010 Handout Excel is an electronic spreadsheet program you can use to enter and organize data, and perform a wide variety of number crunching tasks. Excel helps you organize and track

More information
