Detection and Visualisation of Radio Frequency Interference


A project for the course MAM4007W Mathematics of Computer Science
Supervised by: Michelle Kuttel, Sarah Blyth, and Anja Schroeder
Philippa Hillebrand (HLLPHI012)

Computer Science
University of Cape Town
South Africa
October 2014

Mark allocation (columns: Category, Min, Max, Chosen, Mark):
  Requirement Analysis and Design
  Theoretical Analysis
  Experiment Design and Execution
  System Development and Implementation
  Results, Findings and Conclusion
  Aim Formulation and Background Work
  Quality of Report Writing and Presentation
  Adherence to Project Proposal and Quality of Deliverables
  Overall General Project Evaluation
  Total

Abstract

Radio Frequency Interference (RFI) comprises all the unwanted signals in the radio spectrum detected by a radio telescope, which interfere with the often much fainter astronomical signals. A clear separation of RFI and astronomical signals through detection is necessary for scientific observations. The majority of RFI signals are produced on Earth, although the sun is also a source. Earth-based signals cannot always simply be tracked down and switched off, as they are often major communications channels for systems like television and mobile telephones. Therefore a major requirement in radio astronomy is to detect and characterize, and then mitigate, these signals. This can be done manually, but it is much more efficient to do so computationally.

Here we highlight and compare six detection/mitigation algorithms, aiming for their possible combination and implementation for the MeerKAT telescope. MeerKAT is located in a radio quiet area of the Karoo, the same site as the international Square Kilometre Array (SKA) project. The SKA will be the world's largest radio telescope, consisting of thousands of receivers, and MeerKAT is its precursor. Here we describe the design and implementation of two RFI detection methods based on methods chosen from the literature.

Acknowledgements

Thank you to supervisors Michelle Kuttel, Sarah Blyth and Anja Schroeder for taking the time to read every draft chapter and discuss the design and testing of the system. Thank you to the SKA for funding and supplying data for this project.

Contents

1 Introduction
  1.1 Problem Statement
  1.2 Research Question
  1.3 Approach
2 Background
  2.1 Radio Frequency Interference
  2.2 Characterization and detection of RFI
  2.3 Methods for RFI mitigation
  2.4 RFI detection algorithms
    2.4.1 Radio Astronomy Data
    2.4.2 Spectral Kurtosis
    2.4.3 SumThreshold
    2.4.4 AOFlagger
    2.4.5 Morphological Algorithm
    2.4.6 Spatial Filtering
  2.5 Characterization Methods
  2.6 Conclusions
3 Design
  3.1 Goals
  3.2 Approach
    3.2.1 SumThreshold Algorithm
    3.2.2 Final SumThreshold algorithm
    3.2.3 Surface fitting and dilation
    3.2.4 Variable window size
    3.2.5 System Architecture
    3.2.6 Software Development
  3.3 Input and Output
  3.4 Algorithm Analysis
    3.4.1 SumThreshold
    3.4.2 Variable Window
    3.4.3 Comparison
4 Implementation
  Languages and libraries
  SumThreshold Algorithm
  Prototype
  Optimisation

  Optimisation
  Surface and dilation algorithm (discontinued)
  Variable window algorithm
  Prototype
  Optimisation
  Conclusions
5 Validation
  Methods
  Determining success
  Tests
  Discussion
6 Results
  Case Study
  Case Study
  Case Study
  Profiling
  Discussion
7 Conclusions and Future Work
Appendices
A Validation Results
B SumThreshold
C Variable Window
D Supporting Code
  D.1 SaveDataAsImage
  D.2 transpose
  D.3 plotstuff
  D.4 makesmooth
  D.5 noise

List of Figures

2.1 A signal from the LOFAR test station. Top left: Signal with no interference. Top right: Signal with interference. Bottom: RFI removed by spatial filtering using different filter types (see §2.4.6).[4]
2.2 Map of frequency restricted regions in the Karoo[7]
Diagram showing the structure of the detection and visualisation system
a) Data in the general shape of real data, but with RFI removed, and noise added. b) The mask produced by the SumThreshold algorithm. c) The mask produced by the variable window algorithm
a) Data in the general shape of real data, with a single RFI spike, and noise. b) The mask produced by the SumThreshold algorithm. c) The mask produced by the variable window algorithm
a) Data with a baseline of zero, and noise. b) The mask produced by the SumThreshold algorithm. c) The mask produced by the variable window algorithm
a) Data with a baseline of zero, a family of spikes, and noise. b) The mask produced by the SumThreshold algorithm. c) The mask produced by the variable window algorithm
a) A zoomed view of the family of spikes. b) A zoomed view of the stripes displayed by the variable window mask
a) The data explored. b) The mask produced by the SumThreshold algorithm. c) The mask produced by the variable window algorithm
The SumThreshold mask searching for transient RFI
A complete mask, created by combining the SumThreshold (transposed and not) and the variable window masks
An ordinary data set with typical RFI in the frequency domain, and minimal RFI in the time domain, along with the masks produced by the algorithms designed
A data set with typical RFI in the frequency domain, and two lines of RFI in the time domain, along with the masks produced by the algorithms designed
An arbitrary data set which shows the necessity of the detection algorithms to see all the RFI within the data, along with the masks produced by the algorithms designed

A.1 a) Data with a baseline of zero, and one small section shifted up. b) The mask produced by the SumThreshold algorithm. c) The mask produced by the variable window algorithm. The small shift up is treated as a baseline wiggle by both algorithms
A.2 a) Data with a baseline of zero, very low noise, with a broadband signal. b) The mask produced by the SumThreshold algorithm. c) The mask produced by the variable window algorithm. The SumThreshold method is not sensitive to broadband RFI
A.3 a) Data with a baseline of zero, low noise, with a broadband signal. b) The mask produced by the SumThreshold algorithm. c) The mask produced by the variable window algorithm. The SumThreshold method is not sensitive to broadband RFI
A.4 a) Data with a baseline of zero, and one small section shifted up. b) The mask produced by the SumThreshold algorithm. c) The mask produced by the variable window algorithm. The small shift up is treated as a baseline wiggle by both algorithms
A.5 a) Data with a baseline of zero, and one narrow spike. b) The mask produced by the SumThreshold algorithm. c) The mask produced by the variable window algorithm. The spike is accurately flagged by both methods.

Chapter 1
Introduction

The MeerKAT project in Carnarvon in the Karoo is a radio telescope which forms the precursor to the Square Kilometre Array (SKA) South Africa project. This telescope picks up radio frequency signals from celestial bodies further away than any we have previously observed, and will consist of an array of telescope dishes larger than any combined before. Unfortunately, radio signals are not produced only by celestial bodies, but also by man-made objects, and are used extensively for communication. These man-made signals, which interfere extensively with the signals being observed from outer space, are known as Radio Frequency Interference (RFI) and can be observed in the data as amplitude spikes on a frequency spectrum. If these spikes are not noticed, the data is treated as trustworthy and astronomers may assume that the spikes are an interesting phenomenon, when actually they are just the neighbour starting his car.

For this reason, we apply signal processing techniques to the data, trying to find the signals which are statistically significantly different from the underlying noise. This underlying noise is the actual astronomical data and so it is particularly important that the noise is not marked as RFI. The output is some form of mask, which allows the astronomers to know which channels are corrupted, and which contain viable information.

The simplest form of RFI detection is known as thresholding. This means setting some value above which the data is flagged as RFI. In some circumstances this is done symmetrically, so if the data is lower than some negative value it is also flagged. There are more advanced forms of detection, which mostly build on the idea of thresholding.

1.1 Problem Statement

The aim of this project is to adapt and compare two methods of RFI detection which can then be used in characterisation of the RFI, and to determine the type of RFI which is being produced in the environment.
The effectiveness of the algorithms will be evaluated according to how fast they are able to run, how sensitive they are to changes in the data, how much known RFI they are able to detect and how many false positives there are in the output mask.

1.2 Research Question

RFI and astronomical signals (radio waves produced by a source) both come in many different forms, which makes detection of RFI difficult. Also, the amount of data recorded

by a radio telescope is very large, so any detection algorithm is required to be as efficient as possible. As such the following question will be investigated: Is it possible to adapt an existing detection algorithm to the supplied data, and add any form of characterization to that algorithm?

As seen in past work, Offringa et al.[15, 16, 17] have worked extensively on detecting RFI in array type telescopes. The data for this project, however, is collected, formatted, and stored differently. The challenge is therefore to apply existing methods to the new data. The characterization of a particular signal has not been researched in as great depth, and so designing an algorithm to appropriately characterize the signals may be beyond the time scale of this project.

1.3 Approach

The approach taken to solve this problem follows a simple route. We begin by looking into the solutions produced by others on similar problems, examine the specifics of the SKA project, and relate the solutions to the problem. We then choose two appropriate methods to implement for this project. These algorithms are described in detail in Chapter 2. The next step is to design the algorithms to work with the data collected on the MeerKAT site. This process is shown in Chapter 3. In Chapter 4 we document the process of building up the system, and developing the chosen algorithms. This includes the development of a new algorithm which makes use of previous ideas, but implements them differently. From there we move to the validation of the code in Chapter 5, and a discussion of the results of running the code on real data in Chapter 6.

Chapter 2
Background

The very first radio map of the skies was produced in 1942 by Reber, an amateur, who was intrigued by Jansky's observations of the Milky Way in 1932[2]. Since then radio telescopes have developed to the point where there are two main types: large single dish telescopes (such as Arecibo[2]) and arrays of smaller dishes (such as the Low-Frequency Array (LOFAR), which has recently become fully operational[4, 16]). These telescopes make two different types of observations: active, utilizing RADAR¹ technology, and passive, picking up radio waves emitted by astronomical sources.

As radio telescopes become larger and more sensitive, more data on astronomical objects can be collected, leading to a much better understanding of the universe[2]. To this end, the Square Kilometre Array (SKA) telescope has been commissioned, which will be the largest radio telescope in the world. The SKA project, first discussed in 1993, has grown into a global project, located in South Africa and Australia[21]. The MeerKAT project in the Karoo is the precursor to the South African part and will become a part of the SKA. MeerKAT will consist of 64 antennae, with the maximum distance between the dishes being 8 km. The first dish was raised on the 27th of March 2014[24].

2.1 Radio Frequency Interference

Radio frequency interference (RFI) is electromagnetic interference (EMI) from signals in the radio frequencies of the electromagnetic spectrum (Figure 2.1). As EMI can be caused by any type of electrical circuit, sources of RFI are abundant. What is considered RFI is subjective, and dependent on the type of observation being made[5]. Because RFI signals (transmitted by a source) are mostly much stronger than the astronomical signals observed, they can overload the sensitive receivers, causing errors in the calibration of the signal. RFI can also occur at the same frequency as an astronomical signal, causing ambiguities and ripples in the observed spectrum.
From the antenna, the radio signal is converted from analogue to digital, and then correlated with the signals from other antennae to create a complete picture of the observations. RFI can be created anywhere along this path.

RFI can be categorized into two broad groups: narrow-band RFI (intentional transmissions, such as television signals, or FM radio signals) and broadband RFI (unintentional transmissions, such as those emitted by electric circuits, and power lines)[19]. It may be possible to find and shield broadband sources more easily than narrow-band.

¹ Radio Detection And Ranging, used originally for detecting aircraft.

Strong RFI signals (a signal is transmitted by a source) can completely drown out weaker signals of astronomical importance in the same channel (a channel is a set of frequencies grouped together to make data storage easier). This can cause a significant loss of data, as it can be necessary to completely ignore all signals found on the channel. The L-Band (around 1420 MHz) is important because this is where spectral lines denoting neutral hydrogen in a celestial body can be observed. Unfortunately, there are many RFI signals in this channel[10], which make it difficult to differentiate valid signals and interference. The effects of the interference are shown clearly in Figure 2.1. The diversity of radio signals makes the detection of RFI challenging.

Figure 2.1: A signal from the LOFAR test station. Top left: Signal with no interference. Top right: Signal with interference. Bottom: RFI removed by spatial filtering using different filter types (see §2.4.6).[4]

2.2 Characterization and detection of RFI

Every RFI signal has unique characteristics which can be used to characterize the signal, such as strength, geographical location or position of the source, polarization, direction, orientation, periodicity over time, bandwidth, frequency distributions, modulation and encoding[18, 5]. Some characteristics, such as strength, are easy to identify for a single source, while others, such as polarization, are more difficult to determine. Characterizing a signal is useful, as it becomes much easier to locate the source and either shield it, have it switched off, or deal with the signal during the processing of the data collected.

Knowing the polarization of the signal is useful, because astronomical signals are very weakly polarized, if at all, whereas RFI is usually strongly polarized. Characterization also impacts on the detection algorithms, in that two signals can be compared if they have been characterized, and so it is possible to determine RFI signals through similarity with known RFI. It is also good to be aware of the radio atmosphere around the sensitive equipment, and to know when something changes, to make prediction of behaviour easier[18].

RFI detection and characterization algorithms aim to detect RFI, characterize and identify the signals for ease of management, flag the signals[19] and then mitigate the RFI in a manner that will lose the least possible astronomical data. This can be done by removing a point (frequency, time) which has been flagged[4].

2.3 Methods for RFI mitigation

One of the easiest ways to minimize RFI around a radio telescope is to declare the region to be radio quiet, which means that no transmitting or receiving radio devices are permitted within a certain distance of the telescope. This is difficult to enforce, as discovered at the Medicina telescope in Italy[3]: the growth of nearby cities cannot be curbed, and often the radio quiet region is encroached upon. For this reason, the MeerKAT and SKA projects are based in the Karoo, far from any large settlements. The Astronomy Geographic Advantage Act[7] enforces restrictions on frequencies by region (shown in Figure 2.2). These regions surround the core of the MeerKAT and SKA projects. Unfortunately, it is not possible to find a region with absolute radio quiet, independent of the regulations set in place. Satellites and aeroplanes still pass overhead and some signals are very long distance, such as television signals.
So, beyond radio quiet regions, the International Telecommunication Union (ITU) has released a table specifying frequency allocations for different types of communication. This table is then specialized by the communications authority of each country to be applicable. The Independent Communications Authority of South Africa (ICASA) has released the relevant table for South Africa[8]. This allocates a relatively small number of narrow frequency bands to radio astronomy, and, commonly, these bands are shared with other communications areas. It is illegal for a signal to be transmitted outside of the allocated frequency, so signals detected in these areas can be turned off by the authorities (ICASA). If radio astronomy wishes to make use of a wide bandwidth of frequencies, there will be a large amount of RFI present which is entirely legal[5].

If it is impossible to avoid RFI, detection and mitigation schemes need to be developed. The many different types of RFI lead to many different detection algorithms. Many of these are designed for specific instruments or projects, and so are not directly suitable for all astronomical data. These algorithms can be compared, combined, and modified to provide a situation-specific solution.

2.4 RFI detection algorithms

The simplest form of detection is thresholding, which tests the strength of a signal against a predefined threshold value and, if the signal is above that value, flags it as RFI. This can be done with any kind of data, but is often done after the Fast Fourier Transform (FFT) part of the correlation process. The algorithms we consider here all work post-correlation,

meaning they can work on saved data. A reference antenna (as is used at the MeerKAT site) is used to compare signals to aid in the detection.

Figure 2.2: Map of frequency restricted regions in the Karoo[7]

2.4.1 Radio Astronomy Data

Radio astronomy data is collected for many different purposes and in many different ways, with different emphases. The data could be collected on a satellite (the SMOS project[18]) or by Earth-based radio telescopes. These telescopes vary in design, data collected, and collection method. They can be single- or multi-dish (or beam) and they can observe actively or passively. Telescopes with either a multi-beam feed, or an array of dishes, have their data correlated and so calculate a covariance matrix, as used in the spatial filtering technique. The SKA project will be made up of a large array of passively observing dishes[21], so we may use these techniques. Currently on the MeerKAT site, there is a single antenna observing the environment for RFI. This antenna is used to detect and characterize, as well as visualize, RFI before the full telescope becomes operational, which will make RFI mitigation easier later. Therefore at this stage techniques which require multiple antennae will not be usable.

2.4.2 Spectral Kurtosis

Spectral Kurtosis (SK) is a statistical method used for RFI detection, which is usually applied to time-averaged, non-Gaussian data, but can be extended to other data types[1].

SK is a thresholding method, applied either during or after the FFT[11], and is applied equally well in frequency and time domains. The spectral kurtosis can be calculated using

\[ V_k^2 = \frac{\sigma_k^2}{\mu_k^2}, \tag{2.1} \]

where $\sigma_k^2$ is the variance and $\mu_k$ is the mean of the power spectral density (PSD). A sample with no RFI will have $V_k^2 = 1$. The mean and the variance are computed from $M$ spectral estimates $P_{ki}$, where $k$ is the channel number and $i = 1, \ldots, M$. These are used to calculate the instantaneous power spectral density (PSD) $S_1$ and the squared spectral power $S_2$,

\[ S_1 = \sum_{i=1}^{M} P_{ki} \tag{2.2} \]

\[ S_2 = \sum_{i=1}^{M} P_{ki}^2. \tag{2.3} \]

Then the mean and variance are given by

\[ \mu_k = \frac{S_1}{M} \tag{2.4} \]

and

\[ \sigma_k^2 = \frac{M S_2 - S_1^2}{M(M-1)}. \tag{2.5} \]

This gives

\[ V_k^2 = \frac{M}{M-1}\left(\frac{M S_2}{S_1^2} - 1\right). \tag{2.6} \]

The variance of $V_k^2$ is then calculated, and compared to the expected value of

\[ \mathrm{var}(V_k^2) = \begin{cases} 24/M, & k = 0, N \\ 4/M, & k = 1, \ldots, (N-1) \end{cases} \;[12], \tag{2.7} \]

where $N$ is the Nyquist rate associated with the sampling rate. The Nyquist rate is the minimum rate at which a signal can be sampled without introducing errors; it is twice the highest frequency in the signal[23]. If the variance is significantly different from a baseline level such as the median, the signal can be considered to be RFI.

A good implementation of the SK method requires a full understanding of all the statistical techniques involved. The complexity of the algorithm depends on how many windows of size $M$ are used, giving a worst case $O(N^2)$ complexity. The SK method is suitable for use on any type of data, but, as a purely statistical method, does not hold much interest from a computing perspective.

2.4.3 SumThreshold

The SumThreshold method is a form of combinatorial thresholding, which means that samples are not only checked individually for high values, but are also combined to check if two or more neighbouring samples are all above a slightly lower threshold value.
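The flagging function for frequency and time can then be given as

\[ \mathrm{flag}^{\nu}_{M} \;\text{if}\; \exists\, i \in \{0 \ldots M-1\} : \sum_{j \in \{0 \ldots M-1\}} R(\nu + (i-j)\Delta\nu,\, t) > \chi_M \tag{2.8} \]

\[ \mathrm{flag}^{t}_{M} \;\text{if}\; \exists\, i \in \{0 \ldots M-1\} : \sum_{j \in \{0 \ldots M-1\}} R(\nu,\, t + (i-j)\Delta t) > \chi_M, \tag{2.9} \]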

where $M$ is the number of samples in a combination, $\chi$ is the threshold value, and $R(\nu, t)$ is the value of the sample at time $t$ and frequency $\nu$. A sample can be flagged in either time or frequency. Once a sample has been flagged, its value is changed for future combinations to be the average threshold size ($\chi_M$). This lowers the frequency of false positives in the flagged data[14].

The difficulty in this approach lies in calculating appropriate $\chi$ values, although it may be possible to make use of Spectral Kurtosis to do this. Much as for SK, the complexity depends on both the number of samples and the iterations through combinations up to size $M$, giving a worst case $O(N^2)$ time. The method can be used on any type of data, making it suitable for this project, and the main interest would be in comparison with SK.

2.4.4 AOFlagger

The AOFlagger is an algorithm which was implemented at LOFAR in 2010[16]. As input, it takes information on a single polarization or set of Stokes I data (an integration technique used to join all the data into one spectrum). The amplitudes are calculated, and a thresholding technique is used to generate the first flags. The channels (frequency) or time steps (time) are then compared based on root-mean-square (rms) values, to flag the outliers. The data are then fitted to a 2D Gaussian surface, again to smooth out outliers. The process is then iterated, increasing the strictness of the threshold until the data converges on the surface. A dilation is then performed on the data, flagging further RFI around the edges of the channels or time steps, on the supposition that not all the RFI was found. At this point the flags can be compared with the original data[13].

The most difficult part of the AOFlagger is the dilation step: ensuring that the flags are not spread too far, which would flag channels or time steps unnecessarily.
The complexity is certainly above linear time, as the data are fitted to a 2D surface, which requires at least $O(N \log N)$. The algorithm is certainly suitable for the data produced, and is of interest when considered in conjunction with basic thresholding techniques, as well as more advanced techniques.

2.4.5 Morphological Algorithm

This algorithm was designed for the LOFAR telescope, and so is suitable for extension to the SKA, as the two telescopes are similar. It combines a number of techniques, such as thresholding and the use of reference antennae, which give good estimates of frequencies in which there is RFI. The algorithm utilizes the fact that most RFI signals are parallel either to the time or the frequency axis. It builds particularly on the AOFlagger algorithm (§2.4.4).

The key concepts used are morphology, and the idea of a scale invariant rank (SIR) operator. An SIR operator is a mathematical operator $\rho$ for which $\rho(\lambda x) = \lambda\rho(x)$, where $\lambda$ is a constant. The operator must be of the SIR type, because RFI signals are themselves scale invariant, meaning that they are not affected by scaling the data. The SIR operator is applied after a basic flagging method and is applied separately to time and frequency. The operator can be defined as

\[ \rho(X) = \bigcup \left\{ [Y_1, Y_2) : \#\left(X \cap [Y_1, Y_2)\right) \geq (1-\eta)(Y_2 - Y_1) \right\}, \tag{2.10} \]

where $[Y_1, Y_2)$ is a half open interval in either frequency or time and $\eta$ gives the aggressiveness of the operator (meaning that $\eta = 0$ will flag nothing, and $\eta = 1$ will flag everything). To recombine the time and frequency channels, either a union of the two can be taken,
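The trade-off in the dilation step can be made concrete with a small sketch. The function below is not the AOFlagger implementation; it is a minimal one-dimensional binary dilation, with a hypothetical `radius` parameter chosen for illustration, showing how each flag spreads to its neighbours and how a larger radius quickly flags clean channels as well.

```python
def dilate_flags(flags, radius=1):
    """Return a copy of a 0/1 flag list in which every flagged sample
    also flags its neighbours up to `radius` positions away.

    `radius` is an illustrative parameter, not a value from the source.
    """
    n = len(flags)
    dilated = [0] * n
    for i, flagged in enumerate(flags):
        if flagged:
            lo = max(0, i - radius)          # clip at the left edge
            hi = min(n, i + radius + 1)      # clip at the right edge
            for j in range(lo, hi):
                dilated[j] = 1
    return dilated


flags = [0, 0, 1, 0, 0, 0, 1, 1, 0, 0]
print(dilate_flags(flags, radius=1))  # [0, 1, 1, 1, 0, 1, 1, 1, 1, 0]
print(dilate_flags(flags, radius=2))  # [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
```

With `radius=2` the three original flags already cover the whole ten-channel row, which is the over-flagging the dilation step has to avoid.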

or the operator can be applied sequentially in each channel. The sequential combination is more aggressive than the union and the order of the sequence will influence what is flagged[17]. A full proof of the scale invariance of the operator $\rho$, as well as a full implementation of the algorithm in $O(N)$ time, was given by Offringa et al.[17]. It is predominantly of theoretical interest and uses interesting mathematical concepts.

2.4.6 Spatial Filtering

Spatial filtering aims to reduce the RFI levels in a sample to the point where they can be seen through to view the astronomical signals. Thus it is a mitigation method, although it can be used for combined detection and mitigation. The spatial filtering technique is based on the manipulation of the covariance matrix $C$ formed by correlation of the data from multiple channels (dishes or beams). The background astronomical signals and system noise are considered to be Gaussian noise[4]. The eigenvector and eigenvalue matrices are found, giving $C = U \Lambda U^H$, where $\Lambda$ is a diagonal matrix containing the eigenvalues in descending order, $U$ is the matrix of eigenvectors and $U^H$ is its Hermitian conjugate. The Hermitian conjugate is found by taking the transpose of the matrix and replacing each value with its complex conjugate. Either it is assumed that the RFI has the strongest signal in the system and the first value in $\Lambda$ is given a null value, or a filter is applied. The filter can be either a projection filter, which gives a projection of $C$ onto the noise subspace (giving $\tilde{C} = P_N C P_N$, where $P_N$ is the projection), or a subtraction filter, where the projection onto the interference subspace is subtracted from the system (giving $\tilde{C} = C - P_I C P_I$)[9].

2.5 Characterization Methods

RFI characterization methods draw heavily on the detection methods, as a signal cannot be characterized before it has been detected, and many of the principles in detection and characterization are the same. Some characteristics are easy to find. The SMOS project[18], which measures the brightness temperature (BT) of Earth, found the power of the RFI signal to be directly proportional to the BT. They also suggested that the direction of a pulsating source can be found by analyzing the pulses. The SMOS is satellite-based, so not directly applicable to the SKA, but many of the principles remain the same. Another group working with synthetic aperture radar[10] matched the frequency and time stability of a signal to a known signal, from a specific type of radar tower. They also correlated geographical position. Unfortunately, the majority of their characterization is done as part of the detection of the signals.

2.6 Conclusions

In Table 2.1, the six algorithms in Section 2.4 are compared based on a number of factors. In this table it can be seen that some algorithms are more suitable to the data collected at MeerKAT than others, and some are more complete (or higher level) than others. The morphological algorithm (§2.4.5) is an example of a high-level algorithm suitable for the data. This algorithm does have room for extension, however, as the sub-algorithm of SumThreshold (§2.4.3) could be replaced with another, and characterization methods
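As a small worked illustration of the projection filter, consider a toy two-element case. The setup and the symbols $\sigma^2$ (noise power) and $p$ (interferer power) are assumptions made here for illustration and are not taken from the cited work. Suppose two channels see uncorrelated noise of power $\sigma^2$ plus a single interferer of power $p$ that arrives identically on both:

\[ C = \sigma^2 I + p \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} \sigma^2 + p & p \\ p & \sigma^2 + p \end{pmatrix}. \]

The eigenvalues are $\lambda_1 = \sigma^2 + 2p$ with eigenvector $u_1 = \tfrac{1}{\sqrt{2}}(1, 1)^T$ and $\lambda_2 = \sigma^2$ with eigenvector $u_2 = \tfrac{1}{\sqrt{2}}(1, -1)^T$, so the interference occupies the direction $u_1$. The projection onto the remaining (noise) subspace is

\[ P_N = I - u_1 u_1^H = \frac{1}{2} \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}, \]

and because $P_N u_1 = 0$,

\[ P_N C P_N = \sigma^2 P_N + 2p\, P_N u_1 u_1^H P_N = \sigma^2 P_N. \]

The interferer power $p$ no longer appears in the filtered matrix. The result has rank one, however, so any astronomical signal component lying along $u_1$ is removed together with the interference.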

could be added to it. The spatial filtering algorithm (§2.4.6) is even higher level, going so far as to mitigate the RFI. This could quite easily be extended, by only applying the algorithm to samples already flagged, but it is not suitable for this data, as it requires an array of inputs.

The main methods of interest are the flagging methods and the characterization methods. It would be interesting to combine these methods to flag data not just as RFI, but as a specific type of RFI, which could then be visualized, so that the radio environment of the MeerKAT area can be more intuitively understood.

Table 2.1: Comparison of algorithms discussed in §2.4, with scores given from 1-3 for each section, where a higher score means a higher value in that section.

Algorithm | Features | Difficulty | Complexity | Suitability | Interest
Spectral Kurtosis
Morphological Algorithm
AOFlagger
SumThreshold
Spatial Filtering

Chapter 3
Design

3.1 Goals

In this project, we aim to determine an effective method for detecting and possibly characterising Radio Frequency Interference (RFI) in radio signals, particularly focussing on signals received from radio telescopes. As the data files are large ( values per file), the method should be efficient in terms of both time and space.

3.2 Approach

We select two algorithms to be implemented and compared in discussion with the two Astronomy supervisors. We choose algorithms based on Table 3.1, with a focus on high suitability and low difficulty.

Table 3.1: Comparison of algorithms discussed in Chapter 2, with scores given from 1-3 for each area, where a higher score means a higher value in that area.

Algorithm | Features | Difficulty | Complexity | Suitability | Interest
Spectral Kurtosis
Morphological Algorithm
AOFlagger
SumThreshold
Spatial Filtering

The two algorithms chosen are the SumThreshold method and the AOFlagger method. From these chosen algorithms the final methods are developed.

3.2.1 SumThreshold Algorithm

The SumThreshold method is a combinatorial thresholding method which, rather than simply checking if a value is above a specific threshold, includes the surrounding values in the computation. The flagging part can be given in equation form as

\[ \mathrm{flag}^{\nu}_{M} \;\text{if}\; \exists\, i \in \{0 \ldots M-1\} : \sum_{j \in \{0 \ldots M-1\}} R(\nu + (i-j)\Delta\nu,\, t) > \chi_M \]

\[ \mathrm{flag}^{t}_{M} \;\text{if}\; \exists\, i \in \{0 \ldots M-1\} : \sum_{j \in \{0 \ldots M-1\}} R(\nu,\, t + (i-j)\Delta t) > \chi_M. \]

This can be put into pseudo code as follows:

    Set M, sum, threshold, maxm
    For each window of size M_i (from M to maxm stepping 2, 4, 8,...)
        set count = number of unflagged values in the window
        set sum = sum of all these values
        if (sum > count * threshold) OR (sum < -count * threshold)
            set a flag on unflagged values
            set values to be an average
        move the window to the right
        set the threshold for the new window position

3.2.2 Final SumThreshold algorithm

After a few optimisations during the implementation phase (Chapter 4) a final algorithm is left which is slightly changed from the original. The pseudo code is as follows:

    Set M, sum, threshold, maxm
    For each window of size M_i (from M to maxm stepping 2, 4, 8,...)
        set sum = sum over j in window (value at j) - chi
        if (sum > 0)
            set a flag on unflagged values
            set values to be an average
        move the window to the right
        set the threshold for the new window position

3.2.3 Surface fitting and dilation

The AOFlagger method is an extension of a thresholding method which adds surface fitting and dilation to the algorithm. The initial algorithm attempted was:

    Row-wise repeat:
        Do (at least twice):
            - Replace flagged data with median value
            - Create spline interpolated surfaces
            - Compare values between interpolations, flagging those beyond a certain level.
        end do
    end repeat

However, after beginning the implementation of the system, this algorithm was discarded, and a brand new one developed, the variable window method.

3.2.4 Variable window size

The variable window algorithm was developed in discussion with supervisors Sarah Blyth and Anja Schroeder, and is an attempt to find an efficient way of checking all the data. This method makes use of a smoothed surface which underlies the data at every time period. This surface is used as a comparison, or base threshold value, and then a two-dimensional window is placed over the data. The size of this window depends on the rate of change of the standard deviation of the data.
So, when the standard deviation is changing quickly, we assume that there are larger spikes in the data, and so use a smaller window. If the

19 3.2. APPROACH 19 standard deviation changes slowly, we assume that there are fewer large spikes, and so use a larger window. The algorithm in pseudo code is: repeat process a number of times Set window size and position loop through entire surface find standard deviation (s.d) find change in s.d (average over three) flag window (look for points 5 * s.d out) vary window size shift window on end loop end repeat System Architecture The system is originally described by the following diagram, where the greyed out parts deal with visualisation be implemented by Gerard Nothnagel, and so are not discussed in this work: The modifications to the algorithms lead to a modification of the system architecture, and so the final system is described by the following diagram:

The RATTY data is data collected on the MeerKAT site, and is provided for use by Christopher Schollar. The smoothed surface included in the grey oval is the underlying surface to which the data is compared in the variable window algorithm. It is required as an input to the system.

Software Development

We follow an iterative approach to the development of the software, focussing first on the requirements, then producing a detailed design, then implementing the design before testing and validation. This cycle is then repeated until a satisfactory result is achieved. We follow this process because the original algorithms have already been documented, and so the initial design phase consists predominantly of adapting the algorithm to the situation. This means that the design should be finished before implementation begins, which is based on the waterfall process. The implementation is managed using version control through Git. This allows for more flexible implementation and experimentation. Sections are tested as they are developed, drawing from the concept of unit testing to ensure code integrity.

3.3 Input and Output

Input is in the format of HDF5 files (Chapter 2, Section 2.4.4) containing data collected on the MeerKAT site, which have a row for every time at which data was collected and a column for every frequency channel. The output is a new HDF5 file which contains a mask for the original file. In this mask, a value flagged with 0 has no RFI, and a value flagged with 1 has RFI of some type.
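The mask convention described in Section 3.3 can be illustrated with a minimal sketch. It is not the project's code: the function names, the dataset name "mask" and the choice of an 8-bit integer type are assumptions made for illustration only.

```python
import numpy as np


def empty_mask(data):
    """Return an all-zero mask with one entry per value in `data`.

    Convention from the text: 0 means no RFI, 1 means RFI of some type.
    """
    return np.zeros(data.shape, dtype=np.uint8)


def write_mask(path, mask, dataset_name="mask"):
    """Save the mask to a new HDF5 file.

    The dataset name is hypothetical. h5py is imported inside the function
    so that the rest of the sketch can be used without it installed.
    """
    import h5py

    with h5py.File(path, "w") as out_file:
        out_file.create_dataset(dataset_name, data=mask, compression="gzip")


# Rows are time samples, columns are frequency channels.
data = np.zeros((4, 6))
mask = empty_mask(data)
mask[2, 3] = 1  # flag a single value as RFI
```

Keeping the mask the same shape as the data means a flagged value can be looked up with the same (time, frequency) index as the original measurement.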

3.4 Algorithm Analysis

SumThreshold

    Input: array height m, width n
    1. load into memory
    2. Create matching mask
    3. loop m times
    4.     while run < r
    5.         while pos + l/2 <= n (step size = l/2)
    6.             set chi
    7.             flag window of length l
    8. save and close files

This gives a very basic description of the algorithm which can be used to find the complexity. Lines 1, 2 and 8 add a term of O(3nm) to the complexity. The loop beginning in line 3 adds a factor of m. The loop in line 4 adds a constant factor r, bringing the complexity up to O(rm + 3nm). The choosing of the threshold value can be viewed as a non-trivial constant time calculation which takes time c. Flagging the window depends on its length l, and takes O(2l) when the window must be flagged. The loop in line 5 contributes a factor of 2n/l. So overall the complexity of the algorithm is:

$$\begin{aligned}
\text{Complexity} &= O\left(r\, m\, \frac{2n}{l}\,(2l + c) + 3nm\right) \\
&= O\left(r\, m\, 2n\,(2 + c) + 3nm\right) \\
&= O\left((4r + 2rc + 3)\, mn\right) \\
&= O(k\, mn)
\end{aligned}$$

Here the second line treats c/l as a constant, again written c, and k is some fairly large constant. It is this factor k which must be optimised to improve the performance of the algorithm.

Variable Window

    Input: array height m, width n
    1. load two files into memory
    2. Create matching mask
    3. loop k times
    4.     while time position + 1/2 time dimension <= m (step 1/2 time dimension)
    5.         while frequency position + 1/2 frequency dimension <= n (step 1/2 frequency dimension)
    6.             calculate sigma
    7.             flag window
    8.             change window size if appropriate (time never changes)
    9. save and close files

As in 3.4.1, lines 1, 2 and 9 give a single term for the complexity, of O(4mn). Line 3 gives a factor of k. Line 4 gives a factor of 2m/c_1, where c_1 is the smallest value for the time dimension. Line 5 gives a factor of 2n/c_2, where c_2 is the smallest value for the frequency dimension. Lines 6-8 can be calculated in some constant time, say c_3. This gives the overall complexity as:

$$\begin{aligned}
\text{Complexity} &= O\left(4mn + k\, \frac{2m}{c_1}\, \frac{2n}{c_2}\, c_3\right) \\
&= O\left(\left(4 + \frac{4 c_3 k}{c_1 c_2}\right) mn\right) \\
&= O(K\, mn)
\end{aligned}$$

where K is a non-trivial constant factor. This factor K is what must be optimised for best performance.

Comparison

To properly compare the two algorithms analysed above, we look at their constant factors. To do this we assign values to various constants which can be found in the code listings in Appendix B and C. We have for the SumThreshold method:

    k = 4r + 2rc + 3 = c + 3 = c

And for the variable window method:

    K = 4 + 4c_3 k / (c_1 c_2) = c = 4 + 3c

We can assume that the values c and c_3 are comparable, as they are both constant factors which contain the focus of the code. Thus we can show the difference between k and K as:

    K - k = 4 + 3c ( c) = 27 + c( ) = c
    K = k c

Since c must be a positive value, it should come as no surprise that the variable window method runs significantly faster than the SumThreshold method.
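The factor of 2n/l used in the SumThreshold analysis can be checked by counting loop iterations directly. The sketch below mirrors only the loop structure of lines 3 to 5 of that listing. The function names are hypothetical, and it assumes that the window length l is even and fixed, and that the window starts at position 0.

```python
def count_window_positions(n, l):
    """Count iterations of `while pos + l/2 <= n` with step size l/2."""
    step = l // 2
    pos = 0
    count = 0
    while pos + step <= n:
        count += 1
        pos += step
    return count


def count_flag_calls(m, n, r, l):
    """Total number of window checks: m rows, r passes per row."""
    return m * r * count_window_positions(n, l)


positions = count_window_positions(n=100, l=10)
total_calls = count_flag_calls(m=3, n=100, r=2, l=10)
```

For n = 100 and l = 10 the inner loop visits 20 positions, which equals 2n/l, so the number of window checks grows in proportion to m·n for fixed r and l.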

Chapter 4 Implementation

Here we discuss the implementation of the two algorithms chosen for development. These are then tested and validated with simulated data (Chapter 5), followed by case studies of how they react to real data (Chapter 6). Some aspects of the system design changed during the implementation process. The original design can be seen in Figure 4.1. The first algorithm, SumThreshold, incorporated a separate script to transpose the data file before inputting it to the algorithm. The second algorithm underwent major changes over the course of the implementation, as it is a more complex system. The original design of fitting the data to a surface was modified into a system which uses a smoothed surface and the standard deviation of a window to search out the larger and smaller RFI in different ways, whilst allowing for noise. Thus the shaded oval was added to Figure 4.1. More details on the implementation of each algorithm follow below.

4.1 Languages and libraries

The algorithms are all implemented in the Python programming language. This language was chosen as the developers at the SKA already work predominantly in Python, and there are many very powerful scientific libraries written for Python[6], such as the h5py library[20], which allows a Python script to read a file in the HDF5 file format. This is necessary since the astronomical data is all stored in HDF5 files, which compress the data to a storable size. Another library used extensively is Numpy[22], which allows advanced manipulation of arrays of data and makes it simple to find statistical values for a section of data.

4.2 SumThreshold Algorithm

This is the first algorithm implemented. A full description of the original algorithm can be found in Chapter 2. This algorithm is a combinatorial thresholding method, which means that, rather than only checking if every data point is above some threshold value χ, a window is moved across the data. Then, for every pass, the sum of unflagged data points is compared to a lowered threshold value. This can be shown by the equations

$$\mathrm{flag}^{\nu}_{M} \ \text{if} \ \exists\, i \in \{0 \ldots M-1\} : \sum_{j \in \{0 \ldots M-1\}} R(\nu + (i - j)\Delta\nu,\ t) > \chi_M$$

$$\mathrm{flag}^{t}_{M} \ \text{if} \ \exists\, i \in \{0 \ldots M-1\} : \sum_{j \in \{0 \ldots M-1\}} R(\nu,\ t + (i - j)\Delta t) > \chi_M$$
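The frequency-direction rule above can be sketched for a single row of data. This is a minimal illustration rather than the implementation described in this chapter: it reads the condition as "a sample is flagged if it lies in any window of M consecutive samples whose sum exceeds the threshold", it ignores flags set by earlier passes, and the function and parameter names are hypothetical.

```python
import numpy as np


def sum_threshold_1d(row, window_len, chi):
    """Flag every sample lying in a window of `window_len` consecutive
    samples whose sum is greater than `chi`."""
    row = np.asarray(row, dtype=float)
    flags = np.zeros(row.size, dtype=bool)
    # Slide the window one sample at a time, so every window that
    # contains a given sample is checked.
    for start in range(row.size - window_len + 1):
        if row[start:start + window_len].sum() > chi:
            flags[start:start + window_len] = True
    return flags


row = [0.0, 0.0, 5.0, 5.0, 0.0, 0.0]
flags = sum_threshold_1d(row, window_len=2, chi=8.0)
```

In this example neither sample of value 5 exceeds the threshold of 8 on its own, but the window containing both does, so both are flagged. Applying the same function to a column instead of a row gives the time-direction rule.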

Figure 4.1: Diagram showing the structure of the detection and visualisation system.

Prototype 1

To begin, we created a crude implementation of the algorithm as described in Chapter 3. In this process, some issues were discovered, such as:

1. It can be tricky to decide on a suitable thresholding value (χ) above which all data points are flagged. We decided to use statistical relevance checks. So the χ value is set to be the median value increased by 5σ. This is then decreased with each pass to 3σ, which gives the lowered χ value for the combinatorial step. This was discovered to be necessary when performing validation tests with a smooth increasing surface: the surface was flagged as RFI when the slope was positive.

2. The algorithm proves to be unreliable on the edges of the data, an acknowledged issue in signal processing, as there is insufficient data around the specific points to get an accurate median value. This issue can only be solved by counting the fringe values as unreliable, and measuring a little wider than is required for measurement.

3. Part of the optimisation of this algorithm is determining the initial window size, as well as the rate of growth and the number of passes to be made. It is unreasonable to begin with a window of size one, which checks every point, as this will slow the algorithm to worse than real time. To achieve real time, a single row of data should be processed in a second or less. This problem is considered in the optimisations listed below.
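The threshold choice in item 1 can be sketched as follows. The text states only that χ starts at the median plus 5σ and is lowered to 3σ over the passes. The linear schedule between those two values, and the function and parameter names, are assumptions made for illustration.

```python
import numpy as np


def chi_for_pass(values, pass_index, n_passes, start_sigmas=5.0, end_sigmas=3.0):
    """Threshold for a given pass: median + k * sigma, with k moving
    from `start_sigmas` on the first pass to `end_sigmas` on the last.

    The linear interpolation between the two is an assumption.
    """
    values = np.asarray(values, dtype=float)
    if n_passes > 1:
        fraction = pass_index / (n_passes - 1)
    else:
        fraction = 0.0
    n_sigmas = start_sigmas + fraction * (end_sigmas - start_sigmas)
    return np.median(values) + n_sigmas * np.std(values)


values = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
thresholds = [chi_for_pass(values, p, n_passes=3) for p in range(3)]
```

Because the threshold is measured from the median rather than from zero, a smooth rising baseline is less likely to be flagged simply for having large values.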

The original implementation took a long time to run.

Optimisation 1

The χ calculation was modified to be independent of the window size, which reduced the χ calculation to constant time. This reduces the complexity of the algorithm, and increases its speed. The second optimisation changed the step size of the window. As stepping through every point multiple times is inefficient, this was changed to begin with a step size of 6, which then increases with every pass so that larger windows have a larger step size. This optimisation cut the running time to below 30 minutes on average for data collected over one hour, which is half of real time. This also allows a user to decide whether accuracy or time is more important. The step size and number of passes can be parametrised to allow users to set their own values: a user looking for high accuracy will set the step size very low and the number of passes higher.

Optimisation 2

Further testing after optimisation 1 brought some glaring errors to light. Optimisation 1 was tested before the correct version of χ was used. Changing the value of χ meant that the subroutine for the combinatorial flagging in the window needed to be reviewed. The original method was:

    flag window:
        for point in array:
            if point not flagged:
                add to sum
                increase counter
        if abs(sum) > abs(counter * chi):
            flag entire window

This does not work, as the majority of the data is negative, but there are some RFI spikes which are positive. To account for these negatives the algorithm was modified to:

    flag window:
        for point in array:
            if point not flagged:
                sum += (point - chi)
        if sum > 0:
            flag entire window

This gives the sum of the distances of the points from the threshold value. So points which are below the value will have a negative impact, and those which are above will have a positive impact on the sum.
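The modified window check can be written as a short NumPy function. This is a sketch of the pseudo code above, not the project's listing: the function name and the in-place update of the flag array are assumptions.

```python
import numpy as np


def flag_window(window, window_flags, chi):
    """Flag the whole window if the unflagged points lie, in total,
    above the threshold `chi`.

    `window_flags` is a boolean array (or a slice of one) that is
    updated in place. Returns True if the window was flagged.
    """
    unflagged = ~window_flags
    # Sum of signed distances from the threshold: points below chi
    # pull the sum down, points above chi push it up.
    excess = np.sum(window[unflagged] - chi)
    if excess > 0:
        window_flags[:] = True
        return True
    return False


# Mostly negative data with one positive spike, as described in the text.
spiky = np.array([-90.0, -95.0, 40.0, -92.0])
spiky_flags = np.zeros(4, dtype=bool)
spiky_hit = flag_window(spiky, spiky_flags, chi=-80.0)

quiet = np.array([-90.0, -95.0, -85.0, -92.0])
quiet_flags = np.zeros(4, dtype=bool)
quiet_hit = flag_window(quiet, quiet_flags, chi=-80.0)
```

With χ = -80 the first window sums to +83 relative to the threshold and is flagged, while the second sums to -42 and is left alone, even though every value in both windows is large in absolute terms.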
The main method was also modified to force the step size to be equal to the length of the window, ensuring that no points are ever missed.

4.3 Surface and dilation algorithm (discontinued)

This algorithm was originally going to be based on the AOFlagger model explained in Chapter , which fits the data to some surface and then expands all the flagged areas based on the assumption that RFI will occur in larger regions than are actually detected. Using spline interpolation it is possible to create a smoothed version of the data against which to perform checks. The pseudo code for this original algorithm is as follows:

    Row-wise repeat:
        Do (at least twice):
            - Replace flagged data with median value
            - Create spline interpolated surfaces
            - Compare values between interpolations, flagging those beyond a certain level.
        end do
    end repeat

This method, which interpolates every row of data, takes a very long time to run (around 2 days). This time is unacceptable, and the method does not give sufficient accuracy to warrant longer than real time processing. After discovering that it takes about 20s to perform a spline interpolation on a single row of data, the algorithm was rethought a little. This involved preprocessing the data to act as a smoothed surface, which in itself takes a long time, but that one file can be used to process many different data sets. This improves the speed greatly, so that a detection for an hour's worth of data takes only a few minutes. Unfortunately the algorithm has moved away from the original idea, and is no longer very different from a basic thresholding algorithm. At this point we discarded this algorithm and moved to the variable window method.

4.4 Variable window algorithm

Prototype 1

A system with a fixed window size which ran through the entire surface was initially built. Some noteworthy errors were made during the implementation. The first error was that the window moved diagonally through the data, only looking at a band from the top left to the bottom right. The second issue, which required some time to solve, was the need for a standard deviation calculation with a predefined mean value. There is a standard method for doing this in Python 3, but not in Python 2.
Porting the algorithm to Python 3 was considered, but the difference between Python 3 and Python 2 is sufficiently large that this became infeasible very quickly. So it was necessary to write a standard deviation method.

Optimisation

From the system with a fixed window, adding in the window variations was fairly simple. The system takes three steps to produce an accurate representation of the rate of change in the standard deviation, and uses an average over the last three steps to calculate this value. A look up is then used to determine the window size of the next step. The smallest window is , which is for a rate of change greater than 2. The middle size is , for a rate above 1. The largest size is for all smaller rates of change.
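The two pieces described above, a standard deviation about a predefined centre and a look up from rate of change to window size, can be sketched in plain Python. The function names are hypothetical, the use of the absolute change between steps is an assumption, and the three window sizes in the usage lines are placeholders only, since the actual sizes are not given here.

```python
import math


def std_about(values, centre):
    """Root-mean-square deviation of `values` from a predefined `centre`
    (for example, the value of the smoothed surface)."""
    return math.sqrt(sum((v - centre) ** 2 for v in values) / len(values))


def mean_rate_of_change(recent_stds):
    """Average absolute change between consecutive standard deviations.
    Four values give the three steps mentioned in the text."""
    steps = [abs(b - a) for a, b in zip(recent_stds, recent_stds[1:])]
    return sum(steps) / len(steps)


def choose_window(rate, small, medium, large):
    """Look up the next window size from the rate of change."""
    if rate > 2:
        return small
    if rate > 1:
        return medium
    return large


spread = std_about([2, 4, 4, 4, 5, 5, 7, 9], centre=5)
rate = mean_rate_of_change([1.0, 4.0, 2.0, 5.0])
# Placeholder sizes, for illustration only.
next_size = choose_window(rate, small=8, medium=16, large=32)
```

In Python 3.4 and later, `statistics.pstdev(data, mu)` accepts a predefined centre and gives the same result as `std_about`.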

The process is repeated three times, which gives reasonable accuracy, and only requires about 20 minutes of processing time on data collected in one hour.

4.5 Conclusions

The implementation necessitated adaptation of the original design, which allowed for better algorithms to be developed. These algorithms are based on the ideas used in the originals, but put the ideas together in a slightly different way which is more appropriate for the data being processed. Through this procedure, we ended up with two viable algorithms: the SumThreshold algorithm and the variable window algorithm. These algorithms were then thoroughly checked, as is discussed in the next chapter.

Chapter 5 Validation

Here we discuss validation of the two algorithms developed for RFI detection, the SumThreshold method and the variable window method. By running simulations we are able to determine accurately which RFI each algorithm is able to detect, and to what extent that RFI is detected. We are also able to determine the sensitivity of each algorithm and the accuracy of the flagging, which gives us a feel for when we can expect false positives from the algorithm. The tests discussed in this chapter contain the most important information found through the simulations. Further test results are provided in Appendix A.

5.1 Methods

To check that the output is correct, we create specific test data containing values similar to the real input data, with RFI signals in known positions. This is done by generating Gaussian noise with fake RFI signals added in known places. If the implementation correctly flags this data, it can be considered to be working correctly. We made use of spline fitting and medians to smooth data. This gives a realistic smooth surface which can be used for testing, as the data will be based on such a shape. The method of smoothing the data was as follows:

- Run a window across all data, finding the median.
- Create a data file containing these median values.
- Perform a Bivariate Spline on the data file, smoothing value: s=0.5
- Save the new spline surface as the smoothed data surface.

On top of this smoothed surface, white noise is added to emulate astronomical data, which is often treated as Gaussian noise[12]. We also create a different type of surface on which to test the methods. This is done with a perfectly flat surface, where the baseline of the values is set to zero.
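A flat test surface of this kind, with Gaussian noise and RFI seeded at known positions, might be generated along the following lines. This is a sketch under stated assumptions: the function name, the parameters, and the choice to seed whole frequency channels are illustrative and are not taken from the project's test scripts.

```python
import numpy as np


def make_flat_test_surface(n_times, n_channels, spike_channels, spike_height,
                           noise_sigma=1.0, seed=0):
    """Return (data, truth) for a zero-baseline surface with Gaussian noise.

    `truth` is a 0/1 mask marking where RFI was seeded, so that a
    detection mask can be compared against it.
    """
    rng = np.random.default_rng(seed)
    # Rows are time samples, columns are frequency channels.
    data = rng.normal(loc=0.0, scale=noise_sigma, size=(n_times, n_channels))
    truth = np.zeros((n_times, n_channels), dtype=np.uint8)
    for channel in spike_channels:
        data[:, channel] += spike_height  # narrow band RFI in this channel
        truth[:, channel] = 1
    return data, truth


data, truth = make_flat_test_surface(
    n_times=200, n_channels=50, spike_channels=[5, 20], spike_height=20.0
)
```

Because the seeded positions are returned alongside the data, false positives and missed detections can be counted directly by comparing a detection mask with `truth`.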
We then set specific values to be RFI spikes which should be picked up by the detection method.

Determining success

To determine success we first compare the results of running each algorithm over time and frequency separately, and then compare the results of the algorithms to each other. We also compare each algorithm to a kurtosis algorithm (supplied). We then decide if the differences in performance allow for the combination of the algorithms to create a better method, and include characterisation of the signals.

5.2 Tests

Figure 5.1: a) Data in the general shape of real data, but with RFI removed, and noise added. b) The mask produced by the SumThreshold Algorithm. c) The mask produced by the variable window algorithm.

Test one tested the algorithms on a non-uniform surface with no RFI. This was done with data in the same shape as the real data, but which was smoothed and then had noise added, as can be seen in Fig 5.1a. The expected outcome for this test was two perfectly empty masks. The SumThreshold method provided exactly that (Fig 5.1b), but the variable window method flagged areas of the data (Fig 5.1c). The bands marked 2 and 3 in Figure 5.1c correspond to points where the data steps steeply, suggesting that there is a weakness in the variable window method when the data steps. This leads to the inclusion of false positives in the mask in these areas. This means that the method should be validated either with another detection algorithm, or by observing the data. The bands marked 1 and 4 have less obvious causes, but the cause is similar. They are both on a steep upward slope of the data, and so the algorithm is very sensitive to this type of change.

Test two is designed to test the sensitivity of both algorithms to narrow band, isolated RFI. This is done by using the same surface as in test one, with a single frequency channel including RFI. This is shown in Figure 5.2a, at the point labelled RFI.

Figure 5.2: a) Data in the general shape of real data, with a single RFI spike, and noise. b) The mask produced by the SumThreshold Algorithm. c) The mask produced by the variable window algorithm.

Figure 5.2 shows the increased sensitivity of the variable window method, as the single spike is flagged in a very narrow band, whereas the SumThreshold method picks it up with a wider band. This is because the SumThreshold method is unable to pick up the spike until its window has expanded to a size larger than the width of the spike, and the entire spike is enclosed by the window. This shows that it would be possible to increase the accuracy of the SumThreshold method by decreasing the step size of the window position, although this would also increase the run time.

The third test makes use of the second surface. Data with a baseline of zero and Gaussian noise is tested. The expected outcome of this test is that neither algorithm will flag any data points. Figure 5.3 shows that this test produced the expected results. Both masks are completely empty. This means that the algorithms are checking only for signals which differ from the median value by a statistically significant amount. This is relevant as one of the earliest iterations of the development did not have this property, and would have found RFI in this surface.

Figure 5.3: a) Data with a baseline of zero, and noise. b) The mask produced by the SumThreshold Algorithm. c) The mask produced by the variable window algorithm.

The fourth test is designed to test the sensitivity of the algorithms to a group of spikes close together. Such a group is in danger of being treated by the algorithms as noise with a very high standard deviation. We expect that the SumThreshold method will flag the entire band in which the group is found, and the variable window method will flag the individual spikes. Figure 5.4b shows that the SumThreshold method did not flag any values. This means that the algorithm falls into the trap of treating a group of spikes as noise with a very high standard deviation. The variable window method, however, acts as expected, flagging something in the same place as the RFI. A zoomed in view (Fig 5.5) shows that the variable window in fact flagged exactly the RFI, and so created a distinctive striped pattern.

Figure 5.4: a) Data with a baseline of zero, a family of spikes, and noise. b) The mask produced by the SumThreshold Algorithm. c) The mask produced by the variable window algorithm.

Figure 5.5: a) A zoomed view of the family of spikes. b) A zoomed view of the stripes displayed by the variable window mask.

The final test is designed to test the sensitivity of the algorithms in the time dimension, as there is some RFI which is visible only in the time domain. To perform this test the surface resembling real data is used, and three rows are seeded with RFI, uniformly along the row. This produces the three horizontal lines marked RFI in Figure 5.6a.

Figure 5.6: a) The data explored. b) The mask produced by the SumThreshold algorithm. c) The mask produced by the variable window algorithm.

We expect that the SumThreshold algorithm will be unable to detect these lines because it processes the data set one row at a time. However, we also test the performance of the SumThreshold method on a transpose of the dataset, and expect to see that the RFI is detected, as it is narrow band in the time domain. We expect also that the variable window method will successfully flag the three lines. We see the results of this test in Figure 5.6. As expected, there is nothing flagged in Fig 5.6b, which is the SumThreshold mask. We can see also that the variable window method performed almost as expected. There are three lines marked RFI in Fig 5.6c; however, these lines have gaps in them, which seem to be related to the false positive band just to their left. Figure 5.7 shows the results of running the SumThreshold algorithm on the transposed data. We can see that it performed as expected, flagging the three lines accurately. We can see also in Figure 5.8 that the two masks that detected the horizontal lines detected them in the same place.

Figure 5.7: The SumThreshold mask searching for transient RFI.

Figure 5.8: A complete mask, created by combining the SumThreshold (transposed and not) and the variable window masks.

5.3 Discussion

We can see through this validation process that both the SumThreshold and the variable window methods have some shortcomings. The SumThreshold method is not as sensitive as it is expected to be, and does not deal appropriately with groups of RFI. The variable window method is perhaps too sensitive, as it is liable to show false positives when the data increases steeply. The overall sensitivity of the variable window method makes it a very good first method to use for a broad understanding of where there is RFI in the data, but a second method should be used to validate the broader bands of flagged data, as this is where the false positives appear. The two methods are both capable of finding RFI in both the time domain and the frequency domain, although the variable window method does so more efficiently. The SumThreshold method, while requiring that the algorithm is run on the transpose of the data, does find the time based RFI as accurately as it finds frequency based RFI.

These tests reveal some of the characteristics of the RFI detected by the two algorithms. The SumThreshold algorithm detects predominantly isolated, narrowband RFI. The variable window method is very sensitive, and is able to detect almost any RFI, but has some false positives which could be mistaken for broadband RFI. This can be used when combating the source of the RFI. Knowing the type of the RFI being detected is helpful in narrowing down the possible sources.
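A combined mask of the kind shown in Figure 5.8 could be built as sketched below. How the masks were actually merged is not stated in this chapter, so the element-wise OR used here is an assumption, as are the function and argument names. The mask produced from the transposed data is transposed back before merging so that all three share the same (time, frequency) layout.

```python
import numpy as np


def combine_masks(sum_threshold_mask, sum_threshold_transposed_mask,
                  variable_window_mask):
    """Merge three 0/1 masks: a point is flagged if any mask flags it."""
    combined = (
        sum_threshold_mask.astype(bool)
        | sum_threshold_transposed_mask.T.astype(bool)
        | variable_window_mask.astype(bool)
    )
    return combined.astype(np.uint8)


def flagged_by_both(mask_a, mask_b):
    """Points flagged by both masks, e.g. for characterising isolated spikes."""
    return (mask_a.astype(bool) & mask_b.astype(bool)).astype(np.uint8)


st = np.array([[1, 0, 0], [0, 0, 0]], dtype=np.uint8)
st_transposed = np.array([[0, 0], [0, 1], [0, 0]], dtype=np.uint8)
vw = np.array([[1, 0, 1], [0, 0, 0]], dtype=np.uint8)

combined = combine_masks(st, st_transposed, vw)
agreed = flagged_by_both(st, vw)
```

The OR keeps every detection, including the variable window method's false positives, whereas the AND keeps only points on which two methods agree.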

Chapter 6 Results

In this chapter we discuss some case studies for the two RFI detection algorithms developed, to determine the qualitative difference between the algorithms. Each case highlights some feature or difference in the algorithms. The first study is a standard case, with very few features in the time dimension and standard features in the frequency dimension. The next study adds RFI in the time dimension. The third study is a fairly arbitrary choice of data, to show interesting effects. All three case studies are taken from real data collected on the site of the MeerKAT telescope, and made available by the SKA offices. We then compare the performance of the two algorithms, and relate this back to the analysis in Chapter .

Case Study 1

In this instance we consider a fairly typical data set (Figure 6.1a, panel a). There are no major discrepancies in the time domain, and the RFI seen in the frequency domain is present in most of the other data sets as well. The first thing to note is that there are some lines on the data image which are clearly RFI. One set of such lines is marked on the figure. These lines show up as much darker than the rest of the image as they have a higher intensity. There are also some broad bands where the intensity increases. These are not broadband RFI, but rather baseline wiggles, also marked on the figure. These should not be flagged, as they correspond to trends in the baseline noise, rather than unusual occurrences.

The second mask (Figure 6.1a, panel c), produced by the variable window algorithm, contains more flagged points. The first bar of masked points in the variable window mask (labelled 1) is not shared by the SumThreshold mask, and is also not visible in the data. The variable window method does create false positives under certain circumstances (see Chapter 5), and it is possible that this data includes those. After that there follow some faint lines, marked 2. There are more of these lines on the variable window mask (Figure 6.1a, panel c), but there are a few on the SumThreshold mask (Figure 6.1a, panel b) as well. Those which are on both masks are tall, thin, isolated spikes. These are picked up very effectively by both algorithms, and can be used as a type of characterisation, as we know that if both methods flag the spike it must be a narrow and isolated type of RFI.

The SumThreshold algorithm cannot detect a family of spikes (marked 3), because the system sees them as simply noise with a very high standard deviation. However, the variable window method is able to pick them up as a specific type of RFI, leaving a distinctive pattern of stripes. This is one of the greatest failings of the SumThreshold method in its current form. A better version would be able to detect that the group of

Figure 6.1: An ordinary data set with typical RFI in the frequency domain, and minimal RFI in the time domain, along with the masks produced by the algorithms designed. (a) a) The data explored. b) The mask produced by the SumThreshold algorithm. c) The mask produced by the variable window algorithm. (b) The SumThreshold mask searching for transient RFI. (c) A complete mask, created by combining the SumThreshold (transposed and not) and the variable window masks.


Measures of Central Tendency. A measure of central tendency is a value used to represent the typical or average value in a data set.

Measures of Central Tendency. A measure of central tendency is a value used to represent the typical or average value in a data set. Measures of Central Tendency A measure of central tendency is a value used to represent the typical or average value in a data set. The Mean the sum of all data values divided by the number of values in

More information

Lecture Image Enhancement and Spatial Filtering

Lecture Image Enhancement and Spatial Filtering Lecture Image Enhancement and Spatial Filtering Harvey Rhody Chester F. Carlson Center for Imaging Science Rochester Institute of Technology rhody@cis.rit.edu September 29, 2005 Abstract Applications of

More information

Measures of Central Tendency

Measures of Central Tendency Page of 6 Measures of Central Tendency A measure of central tendency is a value used to represent the typical or average value in a data set. The Mean The sum of all data values divided by the number of

More information

Schedule for Rest of Semester

Schedule for Rest of Semester Schedule for Rest of Semester Date Lecture Topic 11/20 24 Texture 11/27 25 Review of Statistics & Linear Algebra, Eigenvectors 11/29 26 Eigenvector expansions, Pattern Recognition 12/4 27 Cameras & calibration

More information

Chapter 2 Basic Structure of High-Dimensional Spaces

Chapter 2 Basic Structure of High-Dimensional Spaces Chapter 2 Basic Structure of High-Dimensional Spaces Data is naturally represented geometrically by associating each record with a point in the space spanned by the attributes. This idea, although simple,

More information

Data Analysis and Solver Plugins for KSpread USER S MANUAL. Tomasz Maliszewski

Data Analysis and Solver Plugins for KSpread USER S MANUAL. Tomasz Maliszewski Data Analysis and Solver Plugins for KSpread USER S MANUAL Tomasz Maliszewski tmaliszewski@wp.pl Table of Content CHAPTER 1: INTRODUCTION... 3 1.1. ABOUT DATA ANALYSIS PLUGIN... 3 1.3. ABOUT SOLVER PLUGIN...

More information

ELEC Dr Reji Mathew Electrical Engineering UNSW

ELEC Dr Reji Mathew Electrical Engineering UNSW ELEC 4622 Dr Reji Mathew Electrical Engineering UNSW Review of Motion Modelling and Estimation Introduction to Motion Modelling & Estimation Forward Motion Backward Motion Block Motion Estimation Motion

More information

CREATING THE DISTRIBUTION ANALYSIS

CREATING THE DISTRIBUTION ANALYSIS Chapter 12 Examining Distributions Chapter Table of Contents CREATING THE DISTRIBUTION ANALYSIS...176 BoxPlot...178 Histogram...180 Moments and Quantiles Tables...... 183 ADDING DENSITY ESTIMATES...184

More information
