Managing Ecological and Biodiversity Data Using Ecoinformatics: The Taiwan Experience. Chau Chin Lin, Taiwan Forestry Research Institute
People to Thank First for the Following Presentation: Dr. Hen-biau King (TFRI Director, 2003-2007); Dr. Bill Chang (US NSF)
Ecology: Information of Biocomplexity: Biotic; Abiotic; Temporal; Spatial
Biodiversity: Information of Life
  Taxonomic Names: Class: Insecta; Order: Lepidoptera; Family: Pyralidae; Genus: Ostrinia Hübner, 1825; Species: Ostrinia nubilalis (Hübner, 1796); Synonym: Pyralis nubilalis Hübner, 1796; Vernacular (EN): European Corn-borer; Vernacular (DE): Maiszünsler; Vernacular (ES): Piral del maíz; Vernacular (FR): Pyrale du maïs
  Sequence Data: Locus: AAL35331; Definition: acyl-coa Z/E11 desaturase 1; mvpyattadg hpekdecfed...
  Taxonomic Descriptions: Diagnosis: Wingspan 26-30mm; sexually dimorphic; male: forewings ochreous to dark brown; female: forewings pale yellow
  Biotic Interactions: Foodplant: Zea mays L. 1753 (Family: Gramineae)
  Spatial/Temporal Observations: Collection: DGH Lepidoptera; Record id: DGHEUR_003217; Country: France; Coordinates: 03.047 E 48.730 N; Date: 28 June 2003; Collector: Donald Hobern; Individuals: 3
  Richness:
  Digital Literature and Web Resources: Pheromones of Ostrinia http://www.nysaes.cornell.edu/fst/faculty/acree/pheronet/phlist/ostrinia.html
  Abiotic: Average Rainfall; Location: 48.82 N 2.29 E; Jan Feb Mar Apr...: 182.3 120.6 158.1 204.9...
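One way to picture the spatial/temporal observation on this slide as structured data is the minimal Python sketch below. The field values come from the slide; the class name `OccurrenceRecord`, its field names, and the helper `in_bounding_box` are hypothetical choices made for illustration and do not follow any particular biodiversity data standard.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class OccurrenceRecord:
    """One specimen observation (hypothetical field names)."""
    collection: str
    record_id: str
    scientific_name: str
    country: str
    longitude_e: float  # decimal degrees east
    latitude_n: float   # decimal degrees north
    observed_on: date
    collector: str
    individuals: int


# Values copied from the slide's example record.
record = OccurrenceRecord(
    collection="DGH Lepidoptera",
    record_id="DGHEUR_003217",
    scientific_name="Ostrinia nubilalis (Hübner, 1796)",
    country="France",
    longitude_e=3.047,
    latitude_n=48.730,
    observed_on=date(2003, 6, 28),
    collector="Donald Hobern",
    individuals=3,
)


def in_bounding_box(rec, west, south, east, north):
    """Return True if the record's coordinates fall inside the box."""
    return west <= rec.longitude_e <= east and south <= rec.latitude_n <= north
```

Once a record is typed like this, spatial and temporal filters (for example, all records inside a bounding box in June 2003) become simple comparisons rather than text searches.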
All Based on Data. Why Is Data Management Important in Ecological Research? http://siliconangle.com/blog/2012/
Data Informs: Impacts of Biodiversity Loss on Ocean Ecosystem Services (figure panels: Annual; Cumulative). Worm et al., Science 2006
Data Enhances Understanding of the Real World: Understanding this disease requires knowledge of epidemiology, genetics, and transmission modes, along with their ecological contexts. Integrating ecologically pertinent data into the chain of information from the gene to the biosphere will significantly enhance our understanding of the natural world. Whitfield J. 2003. Ape populations decimated by hunting and Ebola virus. Nature 422:551
However,
Data Collection Is Hard Work: Information; Data / Raw data / Dataset; Observations / Experiments; The real world
The Traditional Way of Research Doesn't Care About Data: Analysis and modeling; Raw data; Data collection; Problem; Planning
Data Entropy Occurs Without Management. Figure: information content versus time. Labels: Time of publication; Specific details; General details; Accident; Retirement or career change; Death (Michener et al. 1997)
What Data Have We Collected? (Slide from Dr. John Porter)
For Example: Forest Dynamics Plot Data
Forest Dynamics Plots in Taiwan: 16 Plots Around the Island
For Example: Biodiversity Data
For Example: Carbon Flux Towers
How Did We Do It? Used data; Collection; Original observations; Analysis and modeling; Selection and extraction; Secondary observations; Planning; Problem definition (research objectives); Planning
What Techniques Do We Need? A framework that enables scientists to generate new knowledge through innovative tools and approaches for the management, archiving, curation, discovery, retrieval, integration, analysis, and visualization of biodiversity and ecological data. It is called ecoinformatics.
Search and Adapt the Existing Tools
  EML: Ecological Metadata Language
  Morpho: metadata and data management software
  Metacat: distributed data system; registries: KNB, UCNRS, OBFS, NCEAS, PISCO, LTER
  EcoGrid and Tool Kit: integrating distinct data systems and networks
  Kepler: grid-enabled scientific workflows
Assembling Tools As An Information Management System
EML-Driven IMS: Sensor Network; EcoGrid; QA/QC; Information Management; Information Synthesis
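The QA/QC stage between the sensor network and information management can be pictured with the minimal sketch below, which flags readings outside a plausible range. The `Reading` type, the function name `flag_out_of_range`, the thresholds, and the sample values are all made up for illustration; they are not taken from the system described here.

```python
from typing import NamedTuple


class Reading(NamedTuple):
    """A single sensor value with its timestamp (hypothetical structure)."""
    timestamp: str
    value: float


def flag_out_of_range(readings, low, high):
    """Pair each reading with 'ok' or 'out_of_range' based on [low, high]."""
    return [
        (r, "ok" if low <= r.value <= high else "out_of_range")
        for r in readings
    ]


# Illustrative air-temperature values in degrees Celsius (not real data);
# -999.0 stands in for a typical missing-value sentinel from a logger.
readings = [
    Reading("2007-01-01T00:00", 18.2),
    Reading("2007-01-01T00:10", -999.0),
    Reading("2007-01-01T00:20", 18.4),
]

flagged = flag_out_of_range(readings, low=-10.0, high=45.0)
```

Flagging rather than deleting keeps the raw record intact, so later stages can decide how to treat suspect values.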
Dealing with Changes in Data Flow (Slide from US LTER)
Dealing with Changes in Data Collection: Interpret a number: 10× daily; Interpret a pattern: 1,000× daily
Dealing with Data Deluge
Making Good-Quality Data Available Online
Capacity Building and Training with Help from US LTER
International Collaborations 2006
Helping Each Other within EAP! U.S. LTER; Taiwan (TFRI); Malaysia (FRIM); Thailand (Kasetsart University)
Apart from software products, there has also been a series of publications in both Asian and Western journals, including TREE, BioScience, and Ecological Informatics.
Management, Archiving (Creating Metadata) Metadata?
Standard for Ecology/Biodiversity: EML
EML Modules
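To make the idea of an EML metadata record concrete, the following is a minimal sketch that builds a skeletal EML document with Python's standard library. It assumes the EML 2.2.0 namespace URI and includes only a title, creator, and contact; the package identifier, the `system` value, and the dataset title are hypothetical placeholders, and the output has not been validated against the EML schema.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI of EML 2.2.0; older releases use different URIs.
EML_NS = "https://eml.ecoinformatics.org/eml-2.2.0"
ET.register_namespace("eml", EML_NS)


def build_minimal_eml(package_id, title, surname, organization):
    """Build a skeletal EML tree with title, creator, and contact."""
    root = ET.Element(
        f"{{{EML_NS}}}eml",
        {"packageId": package_id, "system": "hypothetical-system"},
    )
    dataset = ET.SubElement(root, "dataset")
    ET.SubElement(dataset, "title").text = title

    creator = ET.SubElement(dataset, "creator")
    name = ET.SubElement(creator, "individualName")
    ET.SubElement(name, "surName").text = surname
    ET.SubElement(creator, "organizationName").text = organization

    contact = ET.SubElement(dataset, "contact")
    ET.SubElement(contact, "organizationName").text = organization
    return root


# Placeholder identifier and title, for illustration only.
doc = build_minimal_eml(
    package_id="example.1.1",
    title="Forest dynamics plot census (illustrative)",
    surname="Lin",
    organization="Taiwan Forestry Research Institute",
)
xml_text = ET.tostring(doc, encoding="unicode")
```

In practice a tool such as Morpho produces these documents through a form-based editor; the sketch only shows that the result is ordinary XML that any system can parse.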
Metadata/Data Depository System
Data Curation Network (diagram). Nodes: SEV; AND; OBFS; TFRI; CAP; ECNU; LNO; PISCO. Links: Harvester; Replication. Key: Metacat Catalog; Morpho clients; Web clients; Site metadata system; XML output filter
Forming a Decentralized National System (diagram): Internet; User-1; User-2; User-3; Authentication; Database Server; Forestry; Agriculture; National GIS; National Park; National Science Council
Joining the Data Observation Network for Earth (DataONE). DataONE is a data repository for sharing and preserving data. It gives researchers access to globally distributed, networked data from a single point of discovery. It is a collaboration among many partner organizations and is funded by the US NSF. [Through the knowledge and infrastructure integrates information] National Center for Ecological Analysis and Synthesis (NCEAS), U.S.A.; http://www.dataone.org/what-dataone.
Data Integration: Data integration refers to linking research & monitoring data to the modeling community & vice versa. Data integration also refers to archiving data from monitoring, research, & modeling efforts, as well as making the data easily available for others to access & use. http://www.clear.lsu.edu/data_integration/
Toward Automation of the Data Processing Workflow
  Components: Workflow archive; Data Site 1; Data Site 2; Data Depository; Metadata; Shared Data Registry; Compute grid; Service Broker (UDDI); Web Service (WSDL); Algorithm; Simulation Model
  Steps: Get data; Query the Data Grid to find data; Return URL; Query the service broker to find services; Return URL & call functions; Get component; Archive output data to the Depository; Archive the workflow
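The sequence on this slide (look up data, look up a service, call it, archive the output) can be sketched in a few lines. The dictionaries below are hypothetical in-memory stand-ins for the data registry, data sites, service broker, and depository; no UDDI, WSDL, or grid calls are made, and the `memory://` URL scheme is invented for the example. The rainfall figures are the four monthly values shown on the earlier biodiversity slide.

```python
from statistics import mean

# Hypothetical stand-ins for the components named on the slide.
DATA_REGISTRY = {"rainfall": "memory://site1/rainfall"}      # name -> URL
DATA_SITES = {"memory://site1/rainfall": [182.3, 120.6, 158.1, 204.9]}
SERVICE_BROKER = {"mean": mean}                              # name -> callable
DEPOSITORY = {}                                              # archived outputs


def run_workflow(dataset_name, service_name):
    """Find data, find a service, run it, and archive the result."""
    data_url = DATA_REGISTRY[dataset_name]    # query the registry, get a URL
    data = DATA_SITES[data_url]               # fetch the data behind the URL
    service = SERVICE_BROKER[service_name]    # query the broker for a service
    output = service(data)                    # call the function
    # Archive the output together with its provenance.
    DEPOSITORY[f"{dataset_name}/{service_name}"] = {
        "input_url": data_url,
        "service": service_name,
        "output": output,
    }
    return output


result = run_workflow("rainfall", "mean")
```

Storing the input URL and service name alongside the output is what makes the run repeatable later, which is the point of archiving the workflow as well as the data.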
Scientific Workflow Approach to Analysis: ASCII; C; Results: Tables, Maps, Graphs
Application: A Case
Ogawa, Japan; Luquillo, Puerto Rico; Lienhuachih, Taiwan; Pasoh, Malaysia
Metadata Upload; EML Document; Metadata Catalog; EcoGrid; Scientific Workflow; Morpho (EML + Raw data); Download Raw data; CTFS Data Model; Data Retrieval (SQL); Other Data Models (LDAP); WebServer (Apache + PHP)
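The "Data Retrieval (SQL)" box can be illustrated with a small, self-contained query. The sketch uses an in-memory SQLite database; the table `stem`, its columns, the plot names, and every row are hypothetical and do not reproduce the actual CTFS data model.

```python
import sqlite3

# In-memory database with a made-up census table (not the CTFS schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stem (plot TEXT, tag TEXT, species TEXT, dbh_cm REAL)"
)
conn.executemany(
    "INSERT INTO stem VALUES (?, ?, ?, ?)",
    [
        ("PlotA", "0001", "sp1", 12.5),
        ("PlotA", "0002", "sp2", 30.0),
        ("PlotB", "0001", "sp1", 8.0),
    ],
)


def stems_per_plot(connection, min_dbh_cm):
    """Count stems per plot with diameter at or above min_dbh_cm."""
    rows = connection.execute(
        "SELECT plot, COUNT(*) FROM stem "
        "WHERE dbh_cm >= ? GROUP BY plot ORDER BY plot",
        (min_dbh_cm,),
    ).fetchall()
    return dict(rows)


counts = stems_per_plot(conn, 10.0)
```

Passing the threshold as a bound parameter rather than formatting it into the SQL string keeps the query safe to expose through a web front end.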
Action Items for Individual Ecologists: Organize, document, and preserve data for posterity. Share data. Collaborate with networks of colleagues to bring together heterogeneous datasets to address larger-scale questions. Address data management issues with students and peers.
Data Sharing: 1. Data policy: What are fair policies for providing access to data? 2. Agreement specification: What controls, embargoes, usage constraints, or other limitations are needed to assure fairness of access and use? 3. Policy administration: What data publication models are appropriate?
Lesson Learned: Many hands truly do make "light work"! Kaohsiung, Taiwan, 2007
THANK YOU FOR YOUR ATTENTION!! chin@tfri.gov.tw