Spatial and multi-scale data assimilation in EO-LDAS. Technical Note for EO-LDAS project/nceo. P. Lewis, UCL NERC NCEO

Similar documents
Locally Weighted Least Squares Regression for Image Denoising, Reconstruction and Up-sampling

DENSE 3D POINT CLOUD GENERATION FROM UAV IMAGES FROM IMAGE MATCHING AND GLOBAL OPTIMAZATION

Introduction to digital image classification

Lab on MODIS Cloud spectral properties, Cloud Mask, NDVI and Fire Detection

Digital Image Processing. Prof. P. K. Biswas. Department of Electronic & Electrical Communication Engineering

VALERI 2003 : Concepcion site (Mixed Forest) GROUND DATA PROCESSING & PRODUCTION OF THE LEVEL 1 HIGH RESOLUTION MAPS

Range Imaging Through Triangulation. Range Imaging Through Triangulation. Range Imaging Through Triangulation. Range Imaging Through Triangulation

Motion Analysis. Motion analysis. Now we will talk about. Differential Motion Analysis. Motion analysis. Difference Pictures

Sentinel-1 Toolbox. TOPS Interferometry Tutorial Issued May 2014

(Refer Slide Time: 0:51)

Mid-Year Report. Discontinuous Galerkin Euler Equation Solver. Friday, December 14, Andrey Andreyev. Advisor: Dr.

Geostatistics 2D GMS 7.0 TUTORIALS. 1 Introduction. 1.1 Contents

Structured Light II. Thanks to Ronen Gvili, Szymon Rusinkiewicz and Maks Ovsjanikov

Crop Counting and Metrics Tutorial

Statistical image models

Geometric Accuracy Evaluation, DEM Generation and Validation for SPOT-5 Level 1B Stereo Scene

Geostatistics 3D GMS 7.0 TUTORIALS. 1 Introduction. 1.1 Contents

Methods to define confidence intervals for kriged values: Application on Precision Viticulture data.

Random projection for non-gaussian mixture models

MULTI-TEMPORAL SAR DATA FILTERING FOR LAND APPLICATIONS. I i is the estimate of the local mean backscattering

Year 10 General Mathematics Unit 2

Digital Image Processing

Interval velocity estimation through convex optimization

Petrel TIPS&TRICKS from SCM

LAB 2: DATA FILTERING AND NOISE REDUCTION

SurfNet: Generating 3D shape surfaces using deep residual networks-supplementary Material

Fundamentals of Stereo Vision Michael Bleyer LVA Stereo Vision

Chapter 10. Conclusion Discussion

Getting Started with montaj GridKnit

Smoothers. < interactive example > Partial Differential Equations Numerical Methods for PDEs Sparse Linear Systems

A Low Power, High Throughput, Fully Event-Based Stereo System: Supplementary Documentation

Lab 9. Julia Janicki. Introduction

3D Computer Vision. Structured Light II. Prof. Didier Stricker. Kaiserlautern University.

5D leakage: measuring what 5D interpolation misses Peter Cary*and Mike Perz, Arcis Seismic Solutions

Learn the various 3D interpolation methods available in GMS

P257 Transform-domain Sparsity Regularization in Reconstruction of Channelized Facies

Computer Experiments: Space Filling Design and Gaussian Process Modeling

Potential of Sentinel-2 for retrieval of biophysical and biochemical vegetation parameters

LOESS curve fitted to a population sampled from a sine wave with uniform noise added. The LOESS curve approximates the original sine wave.

Spatio-Temporal Stereo Disparity Integration

SYNTHETIC SCHLIEREN. Stuart B Dalziel, Graham O Hughes & Bruce R Sutherland. Keywords: schlieren, internal waves, image processing

The problem we have now is called variable selection or perhaps model selection. There are several objectives.

Overcompressing JPEG images with Evolution Algorithms

Exercise 3 Visualization of MODIS LAI/FPAR product at local scale at ORNL DAAC

Masking Lidar Cliff-Edge Artifacts

P. Bilsby (WesternGeco), D.F. Halliday* (Schlumberger Cambridge Research) & L.R. West (WesternGeco)

The Detection of Faces in Color Images: EE368 Project Report

This research aims to present a new way of visualizing multi-dimensional data using generalized scatterplots by sensitivity coefficients to highlight

D025 Geostatistical Stochastic Elastic Iinversion - An Efficient Method for Integrating Seismic and Well Data Constraints

Seeded region growing using multiple seed points

Minimizing Noise and Bias in 3D DIC. Correlated Solutions, Inc.

[Programming Assignment] (1)

Do It Yourself 8. Polarization Coherence Tomography (P.C.T) Training Course

Development of a Maxwell Equation Solver for Application to Two Fluid Plasma Models. C. Aberle, A. Hakim, and U. Shumlak

Introduction to Objective Analysis

Assignment 4 (Sol.) Introduction to Data Analytics Prof. Nandan Sudarsanam & Prof. B. Ravindran

LAB 2: DATA FILTERING AND NOISE REDUCTION

Lecture 7: Most Common Edge Detectors

Attribute Accuracy. Quantitative accuracy refers to the level of bias in estimating the values assigned such as estimated values of ph in a soil map.

GROUND DATA PROCESSING & PRODUCTION OF THE LEVEL 1 HIGH RESOLUTION MAPS

Response to API 1163 and Its Impact on Pipeline Integrity Management

MC-FUME: A new method for compositing individual reflective channels

3 Nonlinear Regression

Graphical Analysis of Data using Microsoft Excel [2016 Version]

Some Applications of Graph Bandwidth to Constraint Satisfaction Problems

Applications of the generalized norm solver

A toolbox of smooths. Simon Wood Mathematical Sciences, University of Bath, U.K.

Multidimensional Divide and Conquer 2 Spatial Joins

Chapter 3. Bootstrap. 3.1 Introduction. 3.2 The general idea

Biometrics Technology: Image Processing & Pattern Recognition (by Dr. Dickson Tong)

Planetary Rover Absolute Localization by Combining Visual Odometry with Orbital Image Measurements

A Probabilistic Approach to the Hough Transform

Edge Enhancement and Fine Feature Restoration of Segmented Objects using Pyramid Based Adaptive Filtering

Data Fusion. Merging data from multiple sources to optimize data or create value added data

Character Recognition

Near-Infrared Dataset. 101 out of 101 images calibrated (100%), all images enabled

The Comparative Study of Machine Learning Algorithms in Text Data Classification*

Learning and Inferring Depth from Monocular Images. Jiyan Pan April 1, 2009

Motivation. Aerosol Retrieval Over Urban Areas with High Resolution Hyperspectral Sensors

Chapter 7. Conclusions and Future Work

3 Nonlinear Regression

08 An Introduction to Dense Continuous Robotic Mapping

Max-Count Aggregation Estimation for Moving Points

Linear Operations Using Masks

Accuracy, Support, and Interoperability. Michael F. Goodchild University of California Santa Barbara

Analog input to digital output correlation using piecewise regression on a Multi Chip Module

Introduction. Surface and Interbed Multtple Elimination

CHRIS Proba Workshop 2005 II

Multicomponent land data pre-processing for FWI: a benchmark dataset

CHAOS Chaos Chaos Iterate

FMA901F: Machine Learning Lecture 3: Linear Models for Regression. Cristian Sminchisescu

CSCE 626 Experimental Evaluation.

Limited view X-ray CT for dimensional analysis

Subspace Clustering with Global Dimension Minimization And Application to Motion Segmentation

Retrieval of crop characteristics from high resolution airborne scanner data

3. Data Structures for Image Analysis L AK S H M O U. E D U

Diffusion Wavelets for Natural Image Analysis

Google Earth Engine. Introduction to satellite data analysis in cloud-based environment. Written by: Petr Lukeš

Artificial Intelligence for Robotics: A Brief Summary

G009 Multi-dimensional Coherency Driven Denoising of Irregular Data

Transcription:

Spatial and multi-scale data assimilation in EO-LDAS Technical Note for EO-LDAS project/nceo P. Lewis, UCL NERC NCEO Abstract Email: p.lewis@ucl.ac.uk 2 May 2012 In this technical note, spatial data assimilation in EO-LDAS is explained and explored. This includes the use of input datasets at multiple spatial resolutions. 1. Introduction Although only temporal data assimilation (DA) has been explored in depth in the prototype EO-LDAS tool, it was noted in Lewis et al. (2012a,b) that the tool should be capable of spatial, as well as temporal DA. The purpose of this technical note is to demonstrate the spatial capabilities of the tool and to show how multi-(spatial) resolution DA can be achieved. To illustrate these concepts, we develop from the regularisation of NDVI data example in the EO-LDAS tutorial (www1). In this example, we demonstrate the concept of using regularisation (by a zero-order process model) to smooth and interpolate a noisy data sequence (that we suppose to be NDVI). The data in the experiment are synthetic, i.e. generated from a known truth. The experiment is run from a python code (www2) that generates the synthetic dataset, adds a significant amount of noise (standard deviation 0.15). Correlated gaps (mimicking clouds) are introduced into the data. In the example given 33% of the observations are removed. The results are shown in the figure below. 1

Figure 1. Smoothing a noisy time series with temporal DA. The blue dots show the input data, with associated uncertainty (1.95 sigma, i.e. the 95% confidence interval (C.I)). The truth a half sine wave with a flat section is shown as the green line. The retrieved state is shown as the red line (mean) and grey bounds (95% C.I.). This is a suitable example to demonstrate the principles of temporal data assimilation with two constraints: (i) the noisy observations; and (ii) a simple (zero-order process model). Although some problems are encountered at sudden changes (the transition between the sine wave and the flat time), the form of the underlying function is well reconstructed from the noisy samples. In this case, the observation operator, i.e. the operator that translates between the space of the state we are trying to estimate here (NDVI) and the observations (NDVI) is an identity operator. This is chosen to make a fast experiment and to demonstrate principles. There is no attempt to optimise the smoothness (gamma) term in this and following experiments. Instead, we take a theoretical value from the truth and reduce it by a factor of 5 (so we should generally be under-smoothing). The EO-LDAS tool is accessed in this example through (i) a configuration file; (ii) a command-line to override some of the settings in a generic configuration file. The command line is set as: 2

and the (generic) configuration file given in www3. In the configuration file, we declare the location specification to be time and specify the parameters we wish to solve for (NDVI here). and set up the differential operator (in time): and the (Identity) observation operator: In further examples in the tutorial and in Lewis et al. (2012a,b) we go on to show how more complex observation operators (such as a radiative transfer model) can be used in the DA so that vegetation state variables (LAI etc.) can be estimated from remote reflectance observations. The principle of the underlying (zero- or first-order) process models is the same as in the simple NDVI example. The example is also interesting in its own right, as a demonstration of optimal filtering, i.e. smoothing with a target, multi-constraint cost function. It also shows how uncertainty can be calculated in such an optimal estimation framework. 3

2. Spatial data assimilation Spatial data assimilation proceeds in exactly the same way as temporal DA when we use these simple process models. In fact, instead of considering the x-axis in figure 1 as time, we could simply state that it is a transect in space and all of the same results and conclusions would apply. In the spatial sense, we might consider this to be quite similar to what is done in optimal spatial interpolation schemes such as kriging or regression kriging (where some low order model is fitted and the difference constraint operates on the residuals of that model) (www4). Python code for performing a spatial DA is given in www5. It is very much based on that for the temporal DA discussed above, but now we generate a synthetic dataset in two dimensions: Figure 2. Synthetic 2D (spatial) dataset Figure 2 shows the spatial dataset generated in this code. It contains some 2-D sine waves and flat areas. The test data lies between the values 0 and 1 and represents NDVI, which one could imagine as representing say 4 fields here, two with crops in and two bare soil. In such an imagining, we can see spatial variation in the NDVI (vegetation density) within the fields. Noise, of sigma 0.15 is added to these data, and again 33% of the samples removed. The input dataset then is shown in figure 3. 4

Figure 3. Sampled synthetic 2D (spatial) dataset We can again suppose the holes in the observations to be representative of clouds. The configuration file for this experiment is given in www6. In setting up a spatial DA in EO-LDAS, we declare the location to be row and col : We then define two differential operators, one in row and the other in column space: 5

The observation operator is as in the previous example. The result of running the DA is an estimate of the NDVI for all sample locations: Figure 4. Result of DA (mean) which is our posterior estimate of NDVI obtained from the samples given in figure 3. It is our optimal estimate of the original data (figure 2) and does a very reasonable job of this. The uncertainty in this estimate is given by: 6

Figure 5. Result of DA (sd) Recall that the uncertainty in the input data was 0.15. Where we have sampled data, this has been reduced (by the DA/regularisation) to around 0.06, and in the gaps (i.e. under the clouds) the uncertainty is around 0.075. The fidelity of this reconstruction is perhaps better illustrated by taking a transect through the dataset: Figure 6. Transect through dataset at row 13 7

The input data are shown as green dots, and the reconstruction given as the red line, with cyan (95%) C.I.s. The original data (i.e. what we are trying to reconstruct) is the blue line. As with the temporal example in figure 1, this does a remarkably good job. A scatterplot of the true (x axis) and retrieved (y-axis with 95% CI as green errorbars) is shown in figure 7: Figure 7. Scatterplot of retrieved (y-axis) against true (x-axis) NDVI over all spatial samples The scatterplot reveals a slight bias in the retrieved NDVI for high NDVI values, which is probably a result of the small number of high values in the input dataset and the type of smoothing used). It may just be a feature of the assumption of stationarity in the smoothness term. If you compare the high NDVI values in figures 2 and 4 you can see this same issue, although it is relatively minor in the grand scheme of things. Certainly the 95% C.I. covers the extent of the true data, so the C.I. is likely slightly over-estimated here. 8

3. Multi-resolution data assimilation We can proceed from this example to consider multi-spatial resolution DA within EO-LDAS. Although we do not have any sensor spatial transfer functions within the prototype, we can demonstrate and explore the principles within the existing tool. This can be done by simply mapping a coarser spatial resolution dataset to the grid of a higher resolution dataset. To account for the fact that the sample observations will then be used multiple times within the existing DA, we can simply inflate the apparent uncertainty of each sample that we load. Code to achieve this is given in www7. In this, we generate two datasets, one at full resolution, with an uncertainty of 0.15 and with 33% of the observations missing, and one at a linear scale of 1/4 th, i.e. where 16 pixels at high resolution represent one pixel at coarse resolution. The filter window size used to correlate the data gaps is 3 in this example (larger filter sizes will result in larger gaps). The uncertainty in the coarse resolution data is 0.0375, so less than that at high resolution (by a factor of 4) but then we re-inflate it to an apparent uncertainty of 0.15 when applying the same (coarse) resolution sample pixel over the high-resolution grid. 9

(a) Original synthetic dataset (b) Posterior mean from DA (c) High resolution input dataset (d) Low resolution input dataset (e) Posterior Uncertainty (s.d.) from DA (f) Scatterplot of truth (x) and posterior mean (y) Figure 8. Results of multi-scale analysis for 1/3 data missing 10

Figure 9. Transect through row 13 of results for 1/3 data missing As in other examples in EO-LDAS, we use separate observation operators for the different data streams, though this is largely for convenience in this case as the both data sets are associated with Identity operators in this case. These results demonstrate the ability of the code to achieve a multi-resolution DA (albeit with a simple Identity observation operator here). With 1/3 rd of the samples missing, the results are very good, although we note that the specifics of the gap algorithm used here mean that gaps tend to be created at the edge of the image first (this is to do with how a random noise field is filtered to create the gappiness). There is no apparent bias in the results (figure 8f), and effective use is made of both the high- and low-resolution datasets to provide a viable (and in this case accurate) posterior estimate (figure 8b). In a second example, www8, we consider the case where 2/3 of the data are missing, with a larger filter size (6) resulting in larger gaps. The results are clearly of somewhat lower quality, but this is reflected in the uncertainties. The uncertainty map (figure 10e) clearly demonstrates where the sampling in the input data (in both high and low resolution datasets) is poor (light blue). Unsurprisingly, where we have samples in both high and low spatial resolution datasets, the uncertainty is lowest. Given the amount of extrapolation in this exercise, the results are remarkably good. There is no apparent bias in the results (visually, from figure 10f). The transect in figure 11 shows that though the reconstruction is still perhaps a little noisy (it could 11

most likely tolerate a higher gamma) it provides a faithful reconstruction of the original data from noisy multi-resolution datasets with large gaps. (a) Original synthetic dataset (b) Posterior mean from DA (c) High resolution input dataset (d) Low resolution input dataset (e) Posterior Uncertainty (s.d.) from DA (f) Scatterplot of truth (x) and posterior mean (y) Figure 10. Results of multi-scale analysis for 2/3 data missing 12

Figure 11. Transect through row 13 of results for 2/3 data missing In a final example (www9), we remove 2/3 rd of the samples from the high-resolution image, but only 1/3 rd of the lower resolution data. This is an attempt to mimic the impact of higher frequency low spatial observations with occasional high-resolution data (though we do not directly consider the time dimension in this example). In this case, we have only sparse coverage at high resolution, but good coverage of most of the major features at low resolution. There is minimal blockiness in the DA result in figure 12b, but even this does not seem very apparent in the transect (figure 13). The better coverage provided by the low resolution data produces much less scatter when comparing to the original signal (compare figures 10f and 12f). The result compares very favourably with that in figure 8 which had twice as many high resolution samples. 13

(a) Original synthetic dataset (b) Posterior mean from DA (c) High resolution input dataset (c) Low resolution input dataset (d) Posterior Uncertainty (s.d.) from DA (e) Scatterplot of truth (x) and posterior mean (y) Figure 12. Results of multi-scale analysis for 2/3 data missing for the high resolution data and 1/3 for the low resolution 14

Figure 13. Transect through row 13 of results for 2/3 data missing in the high resolution and 1/3 missing for the low resolution. 15

4. Discussion and conclusions Understanding how to perform spatial data assimilation using regularisation concepts is important for EO DA for the land surface, as whilst biophysical/biogeochemical model may be invoked (as strong or weak constraints (Lewis et al., 2012b)) to help constrain the temporal development of some surface biophysical parameters, they do not generally exist spatially and we must resort to empirical approaches such as that used here. Although we only treat spatial DA in here, both spatial and temporal DA can be achieved within EO-LDAS at the same time. We have demonstrated in this technical note how, although mostly explored using temporal DA, the EO-LDAS prototype can also be used for spatial DA. The next steps in this direction are obvious: to apply the principles shown here with a more complex observation operator so that spatial constraints can be used to aid the mapping of canopy biophysical variables, but there has not been time to explore that in this study. The code should be capable of supporting this as it stands. In addition, it would seem likely that the DA would benefit from the fitting a low-frequency model as in kriging-regression. In this case, the purpose of the zero-order process model would be only to model the residuals. This same concept applies to both space and time modelling. Clearly, regularisation of the sort shown here will have some issues at the edges of sharp boundaries in space (e.g. fields) where we would expect some degree of oversmoothing, but the results shown here with quite sharp boundaries are in any case promising. There are several ways in which any such problems could be overcome, such as using some segmentation and masking so that the spatial DA only takes place within a defined boundary (e.g. within a field), or alternatively applying some form of edge-preserving regularisation (one of the simplest being the use of an L1 metric in the spatial constraint rather than the L2 metric used currently). The use of multi-resolution input data within the EO-LDAS prototype has been demonstrated: this is achieved (within the prototype) by prior mapping of all observational datasets into the space of the state vector and by re-weighting the observation uncertainty accordingly. There may be some neater ways of achieving this in future developments. A change in (linear) scale of 4 was demonstrated here between the high and low-resolution datasets. It was noticed that if this scale change were pushed too far, the cost function minimisation tended to have problems in locating the global minimum. The results show that the differing scale observations can effectively be merged in EO-LDAS producing an optimal combined result as the posterior estimate. The purpose of these experiments was to demonstrate capability rather than to do any great in-depth analysis, but some interesting observations can be made. In effect, what we are seeing here is the regulariser smoothing the low resolution data to the high resolution grid, but rather than just an arbitrary smoothing, this is modulated by the high frequency variations that we do observe. One last comment is that the size of the solution space in these problems is starting to get large. The largest spatial datasets we tried were of the order of 13000 samples, so any state vector will be of length 13000 and any associated matrices 13000x13000 16

samples. The coding in the EO-LDAS prototype is not developed for memory efficiency, and it is likely that this is close to the limit of the dimensionality that might reasonably be explored with this tool. If we wish to more fully exploit spatial DA in future releases, we will have to be much more memory efficient than in the current code, but this may mean a loss of some of the modularity. References Lewis, P., Gomez--Dans, J., Kaminski, T., Settle, J., Quaife, T., Gobron, N., Styles, J., Berger, M. (2012) An Earth Observation Land Data Assimilation System (EO- LDAS). Remote Sensing of Environment, Volume 120, 15 May 2012, Pages 219 235 Lewis, P., Gomez--Dans, J., Kaminski, T., Settle, J., Quaife, T., Gobron, N., Styles, J., Berger, M. (2012) Data assimilation of sentinel-2 observations: preliminary results and outlook. ESA SP, Sentinel-2 preparatory symposium. www1 http://www2.geog.ucl.ac.uk/~plewis/eoldas/example1.html www2 http://www2.geog.ucl.ac.uk/~plewis/eoldas/bin/solve_eoldas_identity.py www3 http://www2.geog.ucl.ac.uk/~plewis/eoldas/confs/identity.conf www4 http://spatial-analyst.net/wiki/index.php?title=regression-kriging www5 http://www2.geog.ucl.ac.uk/~plewis/eoldas/bin/solve_eoldas_spatial.py www6 http://www2.geog.ucl.ac.uk/~plewis/eoldas/confs/identityspatial.conf www7 http://www2.geog.ucl.ac.uk/~plewis/eoldas/bin/solve_eoldas_spatial2a.py www8 http://www2.geog.ucl.ac.uk/~plewis/eoldas/bin/solve_eoldas_spatial2b.py www9 http://www2.geog.ucl.ac.uk/~plewis/eoldas/bin/solve_eoldas_spatial2c.py 17