Tenth ECMWF Workshop on Meteorological Operational Systems 14-18 November 2005, Reading The Logical Data Store Bruce Wright, John Ward & Malcolm Field Crown copyright 2005 Page 1
Contents The presentation covers the following sections Background Logical Data Store (LDS) LDS Public Interface Work in already completed or in progress Implementation approach & parallel activities Questions and answers Crown copyright 2005 Page 2
Current situation and the way forward Organic un-governed growth has lead to: complex, incoherent, undocumented IT systems striking resemblance to spaghetti and meatballs The information silo results in: inconsistent, locally processed data proliferation of data & data access mechanisms poor access to enterprise information assets Crown copyright 2005 Page 3
Current data flows Crown copyright 2005 Page 4
Current situation and the way forward Organic un-governed growth has lead to: complex, incoherent, undocumented IT systems striking resemblance to spaghetti and meatballs The information silo results in: inconsistent, locally processed data proliferation of data & data access mechanisms poor access to enterprise information assets The business drivers for change are: cost efficiency improved agility improved consistency The LDS (Logical Data Store) is the means by which these issues will be addressed Crown copyright 2005 Page 5
Logical Data Store concept Vision A single, logical repository for all core (shared) enterprise meteorological datasets and products Aims Consistent meteorological data Uniquely identifiable Spatially-enabled (facilitating spatial manipulation & querying) Accessible through a set of common interfaces Managed in a standard way Crown copyright 2005 Page 6
Information architecture Data Products Observations Collection & Processing NWP Best Forecast Selection Supplementary Models Product Generation Dissemination Logical Data Store Crown copyright 2005 Page 7
Logical Data Store Server-side Server-side Server-side applications applications applications Management consoles Client applications Client Client applications applications API API API Web browsers Web Web browsers browsers Management consoles Clients Private data Private data Private APIs data APIs APIs Data Lifecycle Management Private data APIs Public interface Private data APIs Data Discovery Portal Infrastructure Management Security LDS Services Applicationspecific data model Shared Data Model Catalogue Records Infrastructure Infrastructure Infrastructure Infrastructure Crown copyright 2005 Page 8
Key to Logical Data Store concept The Public Interface Hides the complexity of: Databases & archives Formats & codes Interfaces to different data types De-couples the client application from the data store Provides: A single way to access all data in the LDS Using a standard request (metadata) Note: In the development of the LDS, we also intend to: Rationalise and consolidate data stores Take advantage of new data management technologies Crown copyright 2005 Page 9
LDS Public Interface Client applications Client Client applications applications API API API Provision of data and products This refers to a set of data and not a physical file (e.g. an NWP model run, all the UK climate observations) Public interface Private data APIs Security Shared Data Model Infrastructure Get resource Dataset Subset Return Process Format Return Standard form Support for range of clients: Web browser, Java, C, Fortran, Open GIS client Geospatial extent (e.g. lat-long box) Temporal extent (e.g. validity time) Parameter / attribute (e.g. Temp.) Domain restriction (e.g. T > 28C) Transform (e.g. units) Derive (e.g. T d from T, q, p) Re-project (e.g. lat-long to National Grid) Interpolate (re-grid) Filter (sub-sample) Change from Standard to: PP/Fieldsfile, NetCDF, GRIB, BUFR, CSV Crown copyright 2005 Page 10
Proposed solution for LDS Public Interface Web Services Use an HTTP Transport for messages (like web pages) Highly interoperable Clear and simple client-server interaction Use XML as a standard form for the request and response Self-describing data Implement metadata standard Can use a standard schema (e.g. GML) Possible Issues: Performance? (esp. for voluminous data) Will it work? (new technology risk) Crown copyright 2005 Page 11
Development already completed Consistent use of Oracle RDBMS to hold a range of data types Standard Java interfaces Web Services with XML for data exchange Draft Met Office Metadata Standard building on the ISO191xx, WMO standards and CF Convention Standard components for deriving best climatological observation values from our archive Crown copyright 2005 Page 12
Investigations underway Lightning location database operational demonstrator Store direct to Oracle Database (data and products) Provide Web Services interface (probably as Web Feature Service) Crown copyright 2005 Page 13
Lightning location Web Service demonstrator Oracle 10g Database XSLT / jython BPEL XPath XSQL & XMDB XML Data SQL Database Binding Component Build request Specify message traverse path Route message based on content Web Service Binding component HTTP Request Client System MetDB MetDB Binding Component XML Transform to expected response type XML XML Response BUFR Files MDB() Normalised Message router XSLT / jython Crown copyright 2005 Page 14
Investigations underway Lightning location database operational demonstrator Store direct to Oracle Database (data and products) Provide Web Services interface (probably as Web Feature Service) NWP use of Oracle RDBMS proof of concept Pull observations directly to supercomputer from database Store forecast output direct from supercomputer into database Provide data access using current (Fortran-based) APIs Crown copyright 2005 Page 15
Investigations underway Lightning location database operational demonstrator Store direct to Oracle Database (data and products) Provide Web Services interface (probably as Web Feature Service) NWP use of Oracle RDBMS proof of concept Pull observations directly to supercomputer from database Store forecast output direct from supercomputer into database Provide data access using current (Fortran-based) APIs Oracle 10g Database cluster functionality investigations 3-node Dell cluster using Oracle RAC (Real Application Cluster) Crown copyright 2005 Page 16
Proposed hardware architecture Corporate Ethernet LAN backbone (CDN) Cluster interconnect switches Gigabit links to other systems (e.g. data sources) VMWare ESX Application/Ingestion Server (Virtual Servers created as required) Cluster interconnect switches Oracle Database Server Nodes Storage Array Crown copyright 2005 Page 17
Approaches to be adopted X NOT Big Bang Piecemeal Very high risk Huge amount of effort Long time before getting any benefit Focus of specific data types, to: Prove the approach (technical solution) Demonstrate end-to-end capability Address existing problems Provide quick wins But, as part of a long term Roadmap Wrapping Interface to existing data stores (for the present): Where migration costs are high (e.g. Archive) To provide simple migration paths to use LDS FFREAD Proxy LDS Provide temporary proxy interfaces For those widely used To allow partial/gradual data migration To allow gradual application migration Crown copyright 2005 Page 18
Other parallel activities SIMDAT EU co-funded project to promote the use of GRID technology Developing catalogue for managing distributed data Collaboration with ECMWF, DWD, MeteoFrance & EUMETSAT to deliver a meteorological scenario for the Future Weather Information System DEWS Developing Environmental Web Services DTI co-funded collaborative project Using leading edge technology in real scenarios: health & marine Academic (BADC, ESSC) & commercial (Lost Wax, BMT, IBM) input Crown copyright 2005 Page 19
Questions & Answers Email: bruce.wright@metoffice.gov.uk Crown copyright 2005 Page 20