RClimTool USER MANUAL


By Lizeth Llanos Herrera, Statistics student

This tool is designed to support the automated processing and analysis of climatic series under the CIAT-MADR agreement. It is not intended to compete with or supplant tools developed by other entities. Rather, we seek ongoing, collaborative feedback between methodologies.

RClimTool is designed to facilitate statistical analysis, quality control, filling of missing data, homogeneity analysis and calculation of indicators for daily weather series of maximum temperature, minimum temperature and precipitation.

INSTALLING AND RUNNING R

The tool was developed in the R language, so you have to install R first, specifically version 2.15.0, which can be downloaded from the following link: http://cran.r-project.org/bin/windows/base/old/2.15.0/

Once R is installed and opened, the following window will appear:

INSTALLING AND RUNNING RClimTool

To run the application interface, load the source code as shown in the following figure:

Once the code has loaded successfully, the following GUI will appear:

The previous figure shows the main window of the tool, which is divided into modules, each located in the left-hand panels of the interface. The content of these modules is described later in this manual.

WHAT DOES RClimTool DO?

RClimTool offers a range of analysis options, with the aim of bringing together in one application everything needed for a comprehensive study of climate data. To illustrate the functions of each module, the following sections walk through the analysis of daily weather series of maximum temperature, minimum temperature and precipitation from 10 meteorological stations.

1. Data reading

In the data reading module you will find buttons for reading and loading the data files for the variables of interest.

Important: Do not use accents or the letter "ñ" in the names of folders and files used with the tool, as this causes conflicts in the application.

The Change Directory button (1) lets you select the directory from which the files will be loaded. All outputs of the application are also saved to this location.

Figure 1: Data reading

Part (2) of Figure 1 contains the buttons that let you load the data for each variable. For example, clicking Maximum Temperature opens a popup window where you can locate the file containing the daily maximum temperatures of the different stations. Repeat this procedure for every other variable that needs to be analyzed.

Figure 2: Example file selection (popup window)

In this window, browse to the location of the file you want to load, select the file and then click OK, as shown in Figure 2. Remember to close the popup window each time a different variable is loaded.

Important: The input data format is specified in Appendix A.

2. Graphical and descriptive analysis

Once the data for all variables to be analyzed have been loaded, proceed to the descriptive analysis of each one. You can specify the analysis period, which is useful if you want to analyze only a section of the series, e.g. March 1990 to January 1991. If you want to analyze the full data set, leave these fields empty.

Figure 3: Example descriptive analysis (period of analysis fields)

After selecting the variable to be analyzed as shown in Figure 3, click the Descriptives button; the results appear in the R console (see Figure 4).

Figure 4: Descriptive analysis (R console)

For graphical analysis, you can generate different types of automatic graphs, which are produced for all variables. If you want to work with monthly climatological information (monthly average for temperature and monthly total for precipitation), select the Monthly Analysis Type option, then click any of the buttons (Plot Charts, Graphs Scatter plots or Boxplot); a message with the location of the generated graphs will appear (see Figure 5).

Figure 5: Automatic graphical analysis (option for monthly graphs)
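The period selection and the monthly aggregation described above (monthly average for temperature, monthly total for precipitation) can be illustrated outside the tool. The following Python sketch is not RClimTool's own R code; the record layout, the function names and the sample values are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical daily records: (year, month, day, value); None stands for NA.
records = [
    (1990, 3, 1, 30.0),
    (1990, 3, 2, 32.0),
    (1990, 4, 1, None),
    (1990, 4, 2, 28.0),
    (1991, 2, 1, 25.0),
]

def in_period(record, start, end):
    """True if the record's (year, month) lies between start and end, inclusive."""
    return start <= (record[0], record[1]) <= end

def monthly_aggregate(rows, how):
    """Aggregate daily values per (year, month).

    how="mean" suits temperature, how="sum" suits precipitation.
    Missing values (None) are skipped.
    """
    groups = defaultdict(list)
    for year, month, _day, value in rows:
        if value is not None:
            groups[(year, month)].append(value)
    func = mean if how == "mean" else sum
    return {key: func(values) for key, values in sorted(groups.items())}

# Analyze only March 1990 to January 1991, as in the example period above.
selected = [r for r in records if in_period(r, (1990, 3), (1991, 1))]
monthly_means = monthly_aggregate(selected, "mean")
```

Leaving the period fields empty in the tool corresponds here to skipping the `in_period` filter and aggregating all records.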

Another option is custom graphics. Clicking the Custom Graphics button in the module opens a window where the x and y arguments must be specified; the corresponding variables can be chosen from a dropdown list. Other attributes, such as title, axis labels and color, can be used to customize the graph (for more information on the graph attributes, click the Help button). Once the variables are selected and the attributes are set, click OK and a new window will display the graph (see Figure 6).

Figure 6: Custom graphics

3. Quality control

Quality control is an important step in the analysis of climate data. It applies criteria and/or filters to identify unreasonable and/or erroneous data.

Figure 7: Quality control

Figure 7 shows the Quality Control module. It contains editable fields that have to be filled in by the user, for example the number of standard deviations, a useful criterion for identifying outliers in a series (the default is 3). The range of the variable must be set according to the values the variable can plausibly take. Clicking the Validate button opens a window indicating the status of each station with respect to the range set for the variable. The criteria reported in the console are (see Figure 8):

% Atypical data: The percentage of data that are not within the range $[\bar{x} - k s,\ \bar{x} + k s]$, where $\bar{x}$ and $s$ are the sample mean and sample standard deviation of the variable being validated and $k$ is the number of standard deviations entered in the module. Note: This criterion is not suitable for the precipitation variable, which usually has an asymmetric distribution.

% Data out of range: The percentage of data that fall outside the limits defined for the range of the variable. The data identified by this criterion are replaced automatically by NA's.

% Data tmax < tmin: Calculated only for temperatures. The percentage of data for which the maximum temperature was lower than the minimum temperature on the same date. The data identified by this criterion are replaced automatically by NA's.

% Data variation 10 (TM_10): Calculated only for temperature variables. The percentage of days on which the temperature changed by 10 °C or more with respect to the adjacent value.

% Consecutive data: Identifies identical values repeated over more than five consecutive days in the analyzed time series; these are replaced by NA's.

Figure 8: Criteria for the quality control

For the atypical data and TM_10 filters, a separate Excel file is created for each station, listing the flagged data together with their dates. It is up to the user to replace the data identified by these filters with NA's. This has to be done manually in the files generated in the Missing Data folder, where you can find the files after you have completed the Quality Control of all variables (see Figure 9).
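To make the quality control criteria concrete, the following Python sketch computes each percentage for a single station. It is an illustration only, not RClimTool's R implementation: the function names are hypothetical, and the denominators (non-missing values, or non-missing pairs) are an assumption, since the manual does not state how the percentages are normalized.

```python
from statistics import mean, stdev

def pct(count, total):
    """Percentage helper; returns 0.0 for an empty denominator."""
    return 100.0 * count / total if total else 0.0

def pct_atypical(values, k=3):
    """% of non-missing values outside [mean - k*sd, mean + k*sd]."""
    data = [v for v in values if v is not None]
    m, s = mean(data), stdev(data)
    return pct(sum(1 for v in data if abs(v - m) > k * s), len(data))

def pct_out_of_range(values, low, high):
    """% of non-missing values outside the user-defined range [low, high]."""
    data = [v for v in values if v is not None]
    return pct(sum(1 for v in data if not low <= v <= high), len(data))

def pct_tmax_below_tmin(tmax, tmin):
    """% of dates (both values present) on which tmax < tmin."""
    pairs = [(a, b) for a, b in zip(tmax, tmin) if a is not None and b is not None]
    return pct(sum(1 for a, b in pairs if a < b), len(pairs))

def pct_jump(values, threshold=10.0):
    """% of adjacent pairs whose absolute change is >= threshold (TM_10)."""
    pairs = [(a, b) for a, b in zip(values, values[1:]) if a is not None and b is not None]
    return pct(sum(1 for a, b in pairs if abs(b - a) >= threshold), len(pairs))

def flag_constant_runs(values, min_len=6):
    """Indices of identical values repeated for more than five consecutive days."""
    flagged, start = [], 0
    for i in range(1, len(values) + 1):
        # A run ends at the end of the series or when the value changes.
        if i == len(values) or values[i] != values[start]:
            if values[start] is not None and i - start >= min_len:
                flagged.extend(range(start, i))
            start = i
    return flagged
```

For example, with a range of 10 to 40 °C, `pct_out_of_range([30, 31, 45, 29, 18], 10, 40)` flags one value out of five.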

If you want to replace the data identified in the Quality Control with NA's, do so in these files.

Figure 9: Identification and replacement of unreasonable data by NA's (folders with the files of unreasonable and/or erroneous data for each station)

Figure 10: Creating the preliminary report

By clicking the corresponding button you can generate a pre-report; a Word file containing the report is created automatically. This report includes a preliminary descriptive analysis and the criteria generated in the Quality Control module, supplemented with the graphics produced by the application. The pre-report is stored in the directory listed in the popup window, as shown in Figure 10.

4. Missing data

Missing data are filled using the R package RMAWGEN, which fills gaps based on the estimation of a VAR (vector autoregressive) model. Importantly, this methodology is useful when the percentage of NA data is low and when the information from the various stations is related and does not show much variability. For this module it is essential that the data for maximum temperature, minimum temperature and precipitation from the several stations cover the SAME PERIOD, because the variables interact with one another to complete the missing data.

Figure 11: Filling missing data

Figure 11 shows the fields that must be specified to fill the missing data. Click the complete data button to start. This process can take several minutes. Once it has finished, a window appears indicating that the process is complete. In the Missing Data folder, data files for each of the variables and graphs of the original series versus the generated series are created (see Figure 12).

Figure 12: Location of the missing data files (folders with graphical outputs; generated data files with no missing data)

5. Homogeneity analysis of series

In this module, several statistical tests are implemented to analyze the homogeneity of the series:

Normality tests: These tests check whether the data of the variable under study come from a normal distribution. If this assumption holds, parametric tests should be used; if it does not, non-parametric tests are required.

Seasonality (trend): Spearman's rank correlation* and the Mann-Kendall test are offered. For future estimates it is necessary that this assumption of Seasonality is met.

Stability in variance: The F-test* is applied to subsets of the data.

Stability in mean: Includes the T-test* and the Mann-Whitney U test as a non-parametric alternative to the T-test, using the median as a more robust statistic than the mean.

Note: Tests marked with * require the normality assumption to hold.

Figure 13 shows some of the results obtained for this module. In this example, the variable tmax and a significance level of 5% were used. The console displays a table for each test, which includes the p-value and the decision for each station according to the chosen significance level.

Figure 13: Homogeneity analysis
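As an illustration of how one of these tests yields a p-value and a decision at a chosen significance level, the following Python sketch implements a basic two-sided Mann-Kendall trend test. It is not the code used by RClimTool: it assumes the normal approximation without a correction for tied values, and the function name and return layout are hypothetical.

```python
import math

def mann_kendall(series, alpha=0.05):
    """Two-sided Mann-Kendall trend test (normal approximation, no tie correction).

    Returns (S, Z, p_value, trend_detected). Missing values (None) are dropped.
    """
    x = [v for v in series if v is not None]
    n = len(x)
    # S = number of increasing pairs minus number of decreasing pairs.
    s = sum((x[j] > x[i]) - (x[j] < x[i])
            for i in range(n - 1) for j in range(i + 1, n))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    # Continuity-corrected standardized statistic.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return s, z, p_value, p_value < alpha
```

With `alpha=0.05`, a strictly increasing series of ten values gives S = 45 and a p-value well below 0.05, so the hypothesis of no trend is rejected; a constant series gives S = 0 and a p-value of 1.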

This module provides the option to generate a report that summarizes all the statistical tests included in the homogeneity analysis. To do so, click the Generate Report button.

6. Indicator calculations

The following sub-modules are available for indicator calculations:

Annual indicators: The number of days per year that meet the specified condition (Higher than or Lower than) is calculated. The value of the criterion defining the condition is chosen by the user.

Monthly indicators: In this sub-module, monthly maximum, minimum or average values of the temperature/precipitation data are calculated.

To perform these calculations, first select the period and the variable to be analyzed. Next, select the indicator of interest by ticking its checkbox. Finally, Excel files with the calculated indicators are generated in the Indicators folder (see Figure 14).

Figure 14: Calculation of annual and monthly indicators

7. ENSO Condition (El Niño/Southern Oscillation)

RClimTool includes information on ENSO conditions from 1950 to 2013, available at monthly (1) or quarterly (2) intervals (see Figure 15). After selecting the period of interest, click the query of your interest and the results will appear in the R console (see Figure 16).
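As an illustration of the annual indicators of section 6, the following Python sketch counts, for each year, the days that meet a "Higher than" or "Lower than" condition. It is not RClimTool's R code; the record layout, the function name and the strict comparisons are assumptions made for the example.

```python
def annual_day_counts(records, threshold, condition="higher"):
    """Count per year the days whose value is higher (or lower) than threshold.

    records: iterable of (year, month, day, value); None stands for NA.
    condition: "higher" for value > threshold, "lower" for value < threshold.
    """
    counts = {}
    for year, _month, _day, value in records:
        counts.setdefault(year, 0)  # years with no qualifying day report 0
        if value is None:
            continue                # missing values never meet the condition
        if condition == "higher":
            meets = value > threshold
        else:
            meets = value < threshold
        counts[year] += int(meets)
    return counts

# Hypothetical daily maximum temperatures in degrees Celsius.
tmax = [
    (2000, 1, 1, 31.0),
    (2000, 1, 2, 29.0),
    (2000, 1, 3, None),
    (2001, 1, 1, 35.0),
    (2001, 1, 2, 36.0),
]
hot_days = annual_day_counts(tmax, 30.0, "higher")
```

Here `hot_days` holds one count per year of days above 30 °C; passing `"lower"` instead counts the days below the threshold.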

Figure 15: ENSO condition

Figure 16: Example ENSO condition query

KNOWN ISSUES

One problem identified in this version concerns the missing data module: in order to carry out the data filling, the date range of the variables has to contain data from 1 January of the initial year of analysis to 31 December of the final year.

REPORT PROBLEMS

Please report any problem to Lizeth Llanos (l.llanos@cgiar.org) and David Arango (d.arango@cgiar.org), including screenshots of error messages and the data used for the analysis. We also appreciate any suggestions that contribute to the improvement of the tool.

APPENDIX A: INPUT DATA FORMAT

Files have to be in CSV format (comma delimited). You must supply a separate file for each variable, containing the stations to be analyzed. These files must comply with the following:

1. Columns in the following sequence: day, month, year, followed by the names of the stations. NOTE: precipitation units = mm and temperature units = degrees Celsius.
2. Missing data have to be coded as NA. Data records must be in chronological order. Missing dates are not allowed.

Example input data format for RClimTool:

Figure 17: Precipitation variable input format (station names)

Figure 18: Maximum temperature variable input format

Figure 19: Minimum temperature variable input format
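The input rules of Appendix A can be checked before loading a file into the tool. The following Python sketch is an illustration, not part of RClimTool: the header labels `day`, `month`, `year`, the station names and the sample values are hypothetical, and the function names are made up for this example. It also includes a check for the full-year date range mentioned under KNOWN ISSUES.

```python
import csv
import io
from datetime import date, timedelta

# Hypothetical precipitation file (mm) with two stations; NA marks missing data.
SAMPLE = """day,month,year,Station1,Station2
1,1,1990,0.0,NA
2,1,1990,12.5,3.2
3,1,1990,NA,0.0
"""

def read_series(handle):
    """Read a CSV in the Appendix A layout.

    Returns (dates, {station: [values]}) with NA mapped to None, and raises
    ValueError if the column order is wrong or a date is missing or out of order.
    """
    reader = csv.reader(handle)
    header = [h.strip() for h in next(reader)]
    if [h.lower() for h in header[:3]] != ["day", "month", "year"]:
        raise ValueError("first three columns must be day, month, year")
    stations = header[3:]
    dates, data = [], {name: [] for name in stations}
    for row in reader:
        if not row:
            continue  # ignore blank lines
        current = date(int(row[2]), int(row[1]), int(row[0]))
        if dates and current != dates[-1] + timedelta(days=1):
            raise ValueError(f"missing or out-of-order date before {current}")
        dates.append(current)
        for name, cell in zip(stations, row[3:]):
            data[name].append(None if cell.strip() == "NA" else float(cell))
    return dates, data

def covers_full_years(dates):
    """True if the series starts on 1 January and ends on 31 December."""
    first, last = dates[0], dates[-1]
    return (first.month, first.day) == (1, 1) and (last.month, last.day) == (12, 31)

dates, data = read_series(io.StringIO(SAMPLE))
```

For a file on disk, the same function can be called with `open(path, newline="")` in place of the `io.StringIO` wrapper.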