UNIT 4. Research Methods in Business

Preparing Data for Analysis:-
After data are obtained through questionnaires, interviews, observation, or secondary sources, they need to be edited. Blank responses, if any, have to be handled in some way, the data coded, and a categorization scheme set up. The data then have to be keyed in and analyzed with a software program. Each of these stages of data preparation is discussed below.

Editing:-
Editing detects errors and omissions, corrects them when possible, and certifies that minimum data quality standards are achieved. The editor's purpose is to guarantee that data are (1) accurate, (2) consistent with other information, (3) uniformly entered, (4) complete, and (5) arranged to simplify coding and tabulation.

Field Editing:-
During the stress of data collection, the researcher often uses ad hoc abbreviations and special symbols. Soon after the data collection, the investigator should review the reporting forms. It is difficult to complete what was abbreviated, written in shorthand, or noted illegibly if the entry is not caught that day. Such editing should therefore be done preferably the very same day the data are collected.

Office Editing:-
Office editing takes place in the office after all data collection has come to an end.

Coding:-
Coding involves assigning numbers or other symbols to answers so that the responses can be grouped into a limited number of classes or categories. It helps the researcher reduce several thousand replies to a few categories containing the critical information needed for analysis. In coding, categories are the partitions of a set, and categorization is the process of using rules to partition a body of data.

Coding Rules:-
The categories should be:
  Appropriate to the research problem and purpose
  Exhaustive
  Mutually exclusive
  Derived from one classification principle

Handling Blank Responses:-
Not all respondents answer every item in the questionnaire. Answers may have been left blank because the respondent did not understand the question, did not know the answer, was not willing to answer, or was simply indifferent to the need to respond to the entire questionnaire. If a substantial number of questions, say 25% of the items in the questionnaire, have been left unanswered, it may be a good idea to throw out the questionnaire and not include it in the data set for analysis.

Ways to Handle Blank Responses:-
1. For an interval-scaled item with a midpoint, assign the midpoint of the scale as the response to that item.
2. Allow the computer to ignore the blank responses.
3. Assign the mean value of the responses of all those who responded to that particular item.
4. Assign the mean of this particular respondent's responses to all the other questions measuring this variable.
5. Assign a random number within the range of that scale.

Data Entry:-
If questionnaire data are not collected on scanner answer sheets, which can be directly entered into the computer as a data file, the raw data will have to be manually keyed into the computer. Raw data can be entered through any software program.

Data Analysis and Interpretation:-

DISCRIMINANT ANALYSIS
This technique is used to classify individuals into one of two or more alternative groups on the basis of a set of measurements. It can also be used to identify which variables contribute to making the classification. It is used for analyzing data when the criterion (dependent) variable is categorical and the predictor (independent) variables are interval in nature. The evaluation criterion may be good or bad, like or dislike, successful or unsuccessful, and so on. It is also used to identify which predictor (independent) variables are more important than the others in discriminating between the groups.

When the dependent variable is non-metric (categorical), it is not possible to determine a quantitative relationship between the variables in the usual way. In such cases, since correlation and regression techniques cannot be applied, the researcher has to use other techniques such as discriminant analysis.

Discriminant analysis techniques are described by the number of categories possessed by the criterion variable. When the criterion variable has two categories, the technique is known as Two Group Discriminant Analysis. When three or more categories are involved, the technique is referred to as Multiple Discriminant Analysis.

Applications:
1. In terms of demographic characteristics, how do customers who exhibit store loyalty differ from those who do not?
2. Do heavy, medium, and light users of soft drinks differ in terms of their consumption of frozen foods?
3. What psychographic characteristics help differentiate between price-sensitive and non-price-sensitive buyers of groceries?
4. Do the various market segments differ in their media consumption habits?
5. In terms of lifestyles, what are the differences between heavy patrons of regional department store chains and patrons of national chains?
6. What are the distinguishing characteristics of consumers who respond to direct mail solicitations?

Similarities and Differences among ANOVA, Regression and Discriminant Analysis

                                     ANOVA        Regression   Discriminant
Similarities
  Number of dependent variables      One          One          One
  Number of independent variables    Multiple     Multiple     Multiple
Differences
  Nature of dependent variables      Metric       Metric       Categorical
  Nature of independent variables    Categorical  Metric       Metric
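As an illustration of Two Group Discriminant Analysis, the following is a minimal sketch of Fisher's linear discriminant for two groups measured on two interval-scaled predictors. It is not taken from these notes or from SPSS: the function names, the made-up customer data (age and store visits per month), and the assumption of equal prior probabilities for the two groups are all hypothetical choices made for the example.

```python
from statistics import mean


def fisher_two_group(group_a, group_b):
    """Return (weights, cutoff) of Fisher's linear discriminant for two
    groups, each a list of (x, y) measurements on two predictors."""
    def centroid(rows):
        return [mean(r[0] for r in rows), mean(r[1] for r in rows)]

    def scatter(rows, c):
        # Sums of squares and cross-products around the group centroid.
        sxx = sum((r[0] - c[0]) ** 2 for r in rows)
        syy = sum((r[1] - c[1]) ** 2 for r in rows)
        sxy = sum((r[0] - c[0]) * (r[1] - c[1]) for r in rows)
        return sxx, sxy, syy

    ca, cb = centroid(group_a), centroid(group_b)
    axx, axy, ayy = scatter(group_a, ca)
    bxx, bxy, byy = scatter(group_b, cb)
    dof = len(group_a) + len(group_b) - 2
    # Pooled within-group covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx, sxy, syy = (axx + bxx) / dof, (axy + bxy) / dof, (ayy + byy) / dof
    det = sxx * syy - sxy * sxy
    dx, dy = ca[0] - cb[0], ca[1] - cb[1]
    # weights = inverse(pooled covariance) * (centroid_a - centroid_b)
    weights = [(syy * dx - sxy * dy) / det, (sxx * dy - sxy * dx) / det]
    # Cutoff = discriminant score of the midpoint between the centroids
    # (assumes equal prior probabilities for the two groups).
    cutoff = (weights[0] * (ca[0] + cb[0]) + weights[1] * (ca[1] + cb[1])) / 2
    return weights, cutoff


def classify(row, weights, cutoff):
    """Assign a case to group "A" or "B" from its discriminant score."""
    score = weights[0] * row[0] + weights[1] * row[1]
    return "A" if score > cutoff else "B"


# Hypothetical data: (age, store visits per month).
loyal = [(45, 8), (50, 9), (38, 7), (55, 10)]      # group A
not_loyal = [(25, 2), (30, 3), (22, 1), (35, 4)]   # group B

weights, cutoff = fisher_two_group(loyal, not_loyal)
print(classify((48, 9), weights, cutoff))  # A
print(classify((27, 2), weights, cutoff))  # B
```

The dependent variable here is categorical (group A or B) and the predictors are metric, which matches the last column of the comparison table. The raw weights depend on the units of each predictor, so they should not be read directly as a ranking of importance without standardizing the predictors first.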

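Returning briefly to the section on handling blank responses, the 25% discard rule and the two mean-substitution options (3 and 4 in that list) can be sketched as follows. This is only an illustration under stated assumptions: blanks are represented by `None`, each questionnaire is a list of item scores, and the function names and sample answers are hypothetical.

```python
from statistics import mean


def drop_incomplete(questionnaires, max_blank_share=0.25):
    """Discard questionnaires whose share of blank items reaches the limit."""
    return [q for q in questionnaires
            if q.count(None) / len(q) < max_blank_share]


def fill_with_item_mean(questionnaires, item):
    """Option 3: replace blanks on one item with the mean of all respondents
    who answered that item."""
    item_mean = mean(q[item] for q in questionnaires if q[item] is not None)
    return [q[:item] + [item_mean] + q[item + 1:] if q[item] is None else list(q)
            for q in questionnaires]


def fill_with_own_mean(questionnaire, items):
    """Option 4: replace blanks with this respondent's own mean on the other
    items (given by index) that measure the same variable."""
    own_mean = mean(questionnaire[i] for i in items
                    if questionnaire[i] is not None)
    return [own_mean if (i in items and value is None) else value
            for i, value in enumerate(questionnaire)]


# Hypothetical answers to five 5-point items; None marks a blank response.
responses = [
    [4, 5, 3, 4, 2],
    [3, None, 4, 4, 3],     # 20% blank: kept
    [None, None, 2, 1, 5],  # 40% blank: discarded
    [5, 4, None, 5, 4],     # 20% blank: kept
]

kept = drop_incomplete(responses)
print(len(kept))                                 # 3
print(fill_with_item_mean(kept, 1)[1])           # [3, 4.5, 4, 4, 3]
print(fill_with_own_mean(kept[2], range(5)))     # [5, 4, 4.5, 5, 4]
```

Whether the cutoff is treated as "25% or more" or "more than 25%" is a judgment call; the sketch discards at 25% or more.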

Overview of SPSS
SPSS for Windows provides a powerful statistical analysis and data management system in a graphical environment, using descriptive menus and simple dialog boxes to do most of the work for you. Most tasks can be accomplished simply by pointing and clicking the mouse.

In addition to the simple point-and-click interface for statistical analysis, SPSS for Windows provides:
  Data Editor. A versatile spreadsheet-like system for defining, entering, editing, and displaying data.
  Viewer. The Viewer makes it easy to browse your results, selectively show and hide output, change the display order of results, and move presentation-quality tables and charts between SPSS and other applications.
  Multidimensional pivot tables. Your results come alive with multidimensional pivot tables. Explore your tables by rearranging rows, columns, and layers. Uncover important findings that can get lost in standard reports. Compare groups easily by splitting your table so that only one group is displayed at a time.
  High-resolution graphics. High-resolution, full-color pie charts, bar charts, histograms, scatterplots, 3-D graphics, and more are included as standard features in SPSS.
  Database access. Retrieve information from databases by using the Database Wizard instead of complicated SQL queries.
  Data transformations. Transformation features help get your data ready for analysis. You can easily subset data, combine categories, add, aggregate, merge, split, and transpose files, and more.
  Electronic distribution. Send e-mail reports to others with the click of a button, or export tables and charts in HTML format for Internet and intranet distribution.

What's New in SPSS 11.5?
New features in SPSS 11.5 include:

New data definition tools. Two new features make defining data faster and easier:
The Copy Data Properties wizard provides the ability to use an external SPSS data file as a template for defining file and variable properties in the working data file.
You can also use variables in the working data file as templates for other variables in the working data file. Copy Data Properties is available on the Data menu in the Data Editor window. See Copying Data Properties for more information.
Define Variable Properties (also available on the Data menu in the Data Editor window) scans your data and lists all unique data values for any selected variables, identifies unlabeled values, and provides an auto-label feature. This is particularly useful for categorical variables that use numeric codes to represent categories--for example, 0 = Male, 1 = Female. See Defining Variable Properties for more information.

Expanded support for SAS format data files. You can now save data files in SAS Version 6, SAS Version 7, and SAS Transport file format. See Saving Data: Data File Types for more information.

Expanded output export capabilities. You can now export entire Viewer documents or selected output objects in Word/RTF format and Excel format (charts are not included in Excel format). See Export Output for more information.

Multiple output languages. You can now produce pivot table output in different languages and switch languages during the same session. See General Options for more information.

TwoStep Cluster Analysis. This new clustering procedure offers the following features not available in the other SPSS clustering procedures:
  Automatic selection of the best number of clusters, in addition to measures for choosing between cluster models.
  Ability to create cluster models simultaneously based on categorical and continuous variables.
  Ability to save the cluster model to an external XML file, then read that file and update the cluster model using newer data.
  Ability to analyze large data files with a single clustering procedure.
See TwoStep Cluster Analysis for more information.

New Custom Tables option. If you have used the Tables option in the past, you will discover that almost everything is new in this release, including:
  A simple, drag-and-drop table builder interface that allows you to preview your table as you select variables and options.
  A single, unified table builder interface instead of multiple menu choices and dialog boxes for different types of tables.
  Subtotals for subsets of categories of a categorical variable.
  Custom control over category display order and ability to selectively show or hide categories.

Note: Custom Tables is not included in the SPSS Base system. It is only available if you have the Custom Tables option installed.

Windows
There are a number of different types of windows in SPSS:
  Data Editor. This window displays the contents of the data file. You can create new data files or modify existing ones with the Data Editor. The Data Editor window opens automatically when you start an SPSS session. You can have only one data file open at a time.
  Viewer. All statistical results, tables, and charts are displayed in the Viewer. You can edit the output and save it for later use. A Viewer window opens automatically the first time you run a procedure that generates output.
  Draft Viewer. You can display output as simple text (instead of interactive pivot tables) in the Draft Viewer.
  Pivot Table Editor. Output displayed in pivot tables can be modified in many ways with the Pivot Table Editor. You can edit text, swap data in rows and columns, add color, create multidimensional tables, and selectively hide and show results.
  Chart Editor. You can modify high-resolution charts and plots in chart windows. You can change the colors, select different type fonts or sizes, switch the horizontal and vertical axes, rotate 3-D scatterplots, and even change the chart type.
  Text Output Editor. Text output not displayed in pivot tables can be modified with the Text Output Editor. You can edit the output and change font characteristics (type, style, color, size).
  Syntax Editor. You can paste your dialog box choices into a syntax window, where your selections appear in the form of command syntax. You can then edit the command syntax to use special features of SPSS not available through dialog boxes. You can save these commands in a file for use in subsequent SPSS sessions.
  Script Editor. Scripting and OLE automation allow you to customize and automate many tasks in SPSS. Use the Script Editor to create and modify basic scripts.

Data Editor
The Data Editor provides a convenient, spreadsheet-like method for creating and editing data files. The Data Editor window opens automatically when you start a session.

The Data Editor provides two views of your data:
  Data view. Displays the actual data values or defined value labels.
  Variable view. Displays variable definition information, including defined variable and value labels, data type (for example, string, date, and numeric), measurement level (nominal, ordinal, or scale), and user-defined missing values.
In both views, you can add, change, and delete information contained in the data file.

Data View
Many of the features of the Data view are similar to those found in spreadsheet applications. There are, however, several important distinctions:
  Rows are cases. Each row represents a case or an observation. For example, each individual respondent to a questionnaire is a case.
  Columns are variables. Each column represents a variable or characteristic being measured. For example, each item on a questionnaire is a variable.
  Cells contain values. Each cell contains a single value of a variable for a case. The cell is the intersection of the case and the variable. Cells contain only data values. Unlike spreadsheet programs, cells in the Data Editor cannot contain formulas.
  The data file is rectangular. The dimensions of the data file are determined by the number of cases and variables. You can enter data in any cell. If you enter data in a cell outside the boundaries of the defined data file, the data rectangle is extended to include any rows and/or columns between that cell and the file boundaries. There are no "empty" cells within the boundaries of the data file. For numeric variables, blank cells are converted to the system-missing value. For string variables, a blank is considered a valid value.

Variable View
The Variable view contains descriptions of the attributes of each variable in the data file. In the Variable view:
  Rows are variables.
  Columns are variable attributes.
You can add or delete variables and modify attributes of variables, including:
  Variable name
  Data type
  Number of digits or characters
  Number of decimal places
  Descriptive variable and value labels
  User-defined missing values
  Column width
  Measurement level

In addition to defining variable properties in the Variable view, there are two other methods for defining variable properties:
  The Copy Data Properties wizard provides the ability to use an external SPSS data file as a template for defining file and variable properties in the working data file. You can also use variables in the working data file as templates for other variables in the working data file. Copy Data Properties is available on the Data menu in the Data Editor window. See Copying Data Properties for more information.
  Define Variable Properties (also available on the Data menu in the Data Editor window) scans your data and lists all unique data values for any selected variables, identifies unlabeled values, and provides an auto-label feature. This is particularly useful for categorical variables that use numeric codes to represent categories--for example, 0 = Male, 1 = Female. See Defining Variable Properties for more information.

Variable Names
The following rules apply to variable names:
  The name must begin with a letter. The remaining characters can be any letter, any digit, a period, or the symbols @, #, _, or $.
  Variable names cannot end with a period.
  Variable names that end with an underscore should be avoided (to avoid conflict with variables automatically created by some procedures).
  The length of the name cannot exceed eight characters.
  Blanks and special characters (for example, !, ?, ', and *) cannot be used.
  Each variable name must be unique; duplication is not allowed.
  Variable names are not case sensitive. The names NEWVAR, NewVar, and newvar are all considered identical.

Variable Measurement Level
You can specify the level of measurement as scale (numeric data on an interval or ratio scale), ordinal, or nominal. Nominal and ordinal data can be either string (alphanumeric) or numeric. Measurement specification is relevant only for:
  Chart procedures that identify variables as scale or categorical. Nominal and ordinal are both treated as categorical.
  SPSS-format data files used with AnswerTree.

You can select one of three measurement levels:
  Scale. Data values are numeric values on an interval or ratio scale--for example, age or income. Scale variables must be numeric.

  Ordinal. Data values represent categories with some intrinsic order (for example, low, medium, high; strongly agree, agree, disagree, strongly disagree). Ordinal variables can be either string (alphanumeric) or numeric values that represent distinct categories (for example, 1 = low, 2 = medium, 3 = high).
  Note: For ordinal string variables, the alphabetic order of string values is assumed to reflect the true order of the categories. For example, for a string variable with the values of low, medium, high, the order of the categories is interpreted as high, low, medium--which is not the correct order. In general, it is more reliable to use numeric codes to represent ordinal data.
  Nominal. Data values represent categories with no intrinsic order--for example, job category or company division. Nominal variables can be either string (alphanumeric) or numeric values that represent distinct categories--for example, 1 = Male, 2 = Female.

For SPSS-format data files created in earlier versions of SPSS products, the following rules apply:
  String (alphanumeric) variables are set to nominal.
  String and numeric variables with defined value labels are set to ordinal.
  Numeric variables without defined value labels but with fewer than a specified number of unique values are set to ordinal.
  Numeric variables without defined value labels and with more than a specified number of unique values are set to scale.
The default number of unique values is 24. To change the specified value, change the interactive chart options (from the Edit menu, choose Options and click the Interactive tab).

Variable Type
Variable Type specifies the data type for each variable. By default, all new variables are assumed to be numeric. You can use Variable Type to change the data type. The contents of the Variable Type dialog box depend on the data type selected. For some data types, there are text boxes for width and number of decimals; for others, you can simply select a format from a scrollable list of examples.
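Before turning to the individual data types, the default measurement-level rules for files from earlier SPSS versions (listed just before the Variable Type heading) can be restated as a small decision function. This is only a sketch of the rules as worded in these notes, not SPSS's own implementation. Two points are assumptions, since the notes do not settle them: value labels are checked first, so a labeled string variable comes out as ordinal, and a numeric variable with exactly the threshold number of unique values is treated as scale. The function and parameter names are hypothetical.

```python
def default_measurement_level(values, has_value_labels=False, threshold=24):
    """Guess the default measurement level for a variable from an older file,
    following the four rules listed in the notes."""
    if has_value_labels:
        # String and numeric variables with defined value labels -> ordinal.
        # (Assumption: this rule takes precedence over the string rule.)
        return "ordinal"
    if all(isinstance(v, str) for v in values):
        # Unlabeled string (alphanumeric) variables -> nominal.
        return "nominal"
    if len(set(values)) < threshold:
        # Few distinct numeric values -> ordinal.
        return "ordinal"
    # Many distinct numeric values -> scale.
    # (Assumption: exactly `threshold` unique values also counts as scale.)
    return "scale"


print(default_measurement_level(["north", "south", "east"]))        # nominal
print(default_measurement_level([1, 2, 1, 2], has_value_labels=True))  # ordinal
print(default_measurement_level([1, 2, 3, 2, 1]))                   # ordinal
print(default_measurement_level(list(range(100))))                  # scale
```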
The available data types are as follows:
  Numeric. A variable whose values are numbers. Values are displayed in standard numeric format. The Data Editor accepts numeric values in standard format or in scientific notation.
  Comma. A numeric variable whose values are displayed with commas delimiting every three places, and with the period as a decimal delimiter. The Data Editor accepts numeric values for comma variables with or without commas, or in scientific notation.

  Dot. A numeric variable whose values are displayed with periods delimiting every three places, and with the comma as a decimal delimiter. The Data Editor accepts numeric values for dot variables with or without dots, or in scientific notation.
  Scientific notation. A numeric variable whose values are displayed with an embedded E and a signed power-of-ten exponent. The Data Editor accepts numeric values for such variables with or without an exponent. The exponent can be preceded either by E or D with an optional sign, or by the sign alone--for example, 123, 1.23E2, 1.23D2, 1.23E+2, and even 1.23+2.
  Date. A numeric variable whose values are displayed in one of several calendar-date or clock-time formats. Select a format from the list. You can enter dates with slashes, hyphens, periods, commas, or blank spaces as delimiters. The century range for 2-digit year values is determined by your Options settings (from the Edit menu, choose Options and click the Data tab).
  Custom currency. A numeric variable whose values are displayed in one of the custom currency formats that you have defined in the Currency tab of the Options dialog box. Defined custom currency characters cannot be used in data entry but are displayed in the Data Editor.
  String. Also known as an alphanumeric variable. Values of a string variable are not numeric and hence are not used in calculations. They can contain any characters up to the defined length. Uppercase and lowercase letters are considered distinct.

Variable Labels
Although variable names can be only 8 characters long, variable labels can be up to 256 characters long, and these descriptive labels are displayed in output.

Value Labels
You can assign descriptive value labels for each value of a variable. This is particularly useful if your data file uses numeric codes to represent non-numeric categories (for example, codes of 1 and 2 for male and female). Value labels can be up to 60 characters long.
Value labels are not available for long string variables (string variables longer than 8 characters).

Entering Data
You can enter data directly in the Data Editor in the Data view. You can enter data in any order. You can enter data by case or by variable, for selected areas or for individual cells. The active cell is highlighted. The variable name and row number of the active cell are displayed in the top left corner of the Data Editor.
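When entering values for a scientific-notation variable, the exponent forms listed under Variable Type (123, 1.23E2, 1.23D2, 1.23E+2, and 1.23+2) all denote the same number. The sketch below shows one way such input could be interpreted. It is an illustration only, not SPSS's parser; the function name and the regular expression are hypothetical.

```python
import re

# Mantissa, then an optional exponent introduced by E or D (sign optional)
# or by a sign alone, as in "1.23+2".
_SCI = re.compile(
    r"^([+-]?(?:\d+(?:\.\d*)?|\.\d+))"    # mantissa
    r"(?:[EeDd]([+-]?\d+)|([+-]\d+))?$"   # exponent in either form
)


def parse_scientific(text):
    """Convert a value typed in one of the accepted exponent forms to a float."""
    match = _SCI.match(text.strip())
    if match is None:
        raise ValueError(f"not a valid numeric entry: {text!r}")
    mantissa = match.group(1)
    exponent = int(match.group(2) or match.group(3) or 0)
    # Let Python do the decimal-to-float conversion in one step.
    return float(f"{mantissa}e{exponent}")


for entry in ["123", "1.23E2", "1.23D2", "1.23E+2", "1.23+2"]:
    print(entry, "->", parse_scientific(entry))  # each prints 123.0
```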

When you select a cell and enter a data value, the value is displayed in the cell editor at the top of the Data Editor. Data values are not recorded until you press Enter or select another cell. To enter anything other than simple numeric data, you must define the variable type first. If you enter a value in an empty column, the Data Editor automatically creates a new variable and assigns a variable name.

Data Value Restrictions in the Data Editor
The defined variable type and width determine the type of value that can be entered in the cell in the Data view.
  If you type a character not allowed by the defined variable type, the Data Editor beeps and does not enter the character.
  For string variables, characters beyond the defined width are not allowed.
  For numeric variables, integer values that exceed the defined width can be entered, but the Data Editor displays either scientific notation or asterisks in the cell to indicate that the value is wider than the defined width. To display the value in the cell, change the defined width of the variable. (Note: Changing the column width does not affect the variable width.)

Basic Steps in Data Analysis
Analyzing data with SPSS is easy. All you have to do is:
  Get your data into SPSS. You can open a previously saved SPSS data file; read a spreadsheet, database, or text data file; or enter your data directly in the Data Editor.
  Select a procedure. Select a procedure from the menus to calculate statistics or to create a chart.
  Select the variables for the analysis. The variables in the data file are displayed in a dialog box for the procedure.
  Run the procedure and look at the results. Results are displayed in the Viewer.