On the Use of the SVM Approach in Analyzing an Electronic Nose


Seventh International Conference on Hybrid Intelligent Systems

On the Use of the SVM Approach in Analyzing an Electronic Nose

Manlio Gaudioso, Walaa Khalaf, Calogero Pace
Dipartimento di Elettronica Informatica e Sistemistica, Università della Calabria, 87036 Rende (CS), Italia

Abstract

We present an Electronic Nose (ENose) aimed both at identifying the type of gas and at estimating its concentration. Our system contains 8 sensors, 5 of them being gas sensors (of the class TGS from FIGARO USA, INC., whose sensing element is a tin dioxide (SnO2) semiconductor), the remaining being a temperature sensor (LM35 from National Semiconductor Corporation), a humidity sensor (HIH-6 from Honeywell), and a pressure sensor (XFAM from Fujikura Ltd.). Our integrated hardware-software system uses machine learning principles to first identify a new gas sample, and the least-squares regression principle to then estimate its concentration. In particular, we adopt a training model based on the Support Vector Machine (SVM) approach to teach the system how to discriminate among different gases, and then we apply a second training model, based on least-squares regression, for each type of gas, to predict its concentration.

1. Introduction

The paper deals with the problems of gas detection and recognition, as well as with the estimation of gas concentrations. Detection and recognition can be seen as a two-class and a multi-class classification problem, respectively. The detection of volatile organic compounds (VOCs) has become an important task in many fields, because the fast evaporation rate and toxic nature of VOCs can be dangerous to human health at high concentration levels in the air of working environments. VOCs are also considered a main cause of allergic pathologies and of skin and lung diseases. In this paper, to identify the type of gas we use the support vector machine (SVM) approach, which was introduced by Vapnik [11] as a classification tool. The SVM method strongly relies on statistical learning theory. Classification is based on the idea of finding the best separating hyperplane (in terms of classification error and separation margin) of two point sets in the sample space (which in our case is the eight-dimensional Euclidean space). Our classification approach includes the possibility of adopting kernel transformations within the SVM context. We adopt a multi-sensor scheme, and useful information is gathered by combining the outputs of the different sensors. The use of just one sensor does not allow, in general, to identify the gas: the same sensor output may correspond to different concentrations of many different gases. On the other hand, by combining the information coming from several sensors of diverse types, we can both identify the gas and estimate its concentration. In this paper we present the description of the system, shown in Figure 1, giving the details of its construction, a brief theoretical overview of the mathematical models for classification and regression, and the results of our experiments on four different types of gases (Methanol, Ethanol, Acetone, and Benzene). The results have been particularly encouraging, both in terms of classification errors and of concentration prediction. In particular, we present in section 2 the scheme of the electronic nose (ENose), and in section 3 we give a brief overview of the SVM approach. Section 4 is devoted to the description of the experiments and results.
Finally, the conclusions are drawn in section 5.

2. Electronic nose

An electronic nose combines an array of gas sensors whose response constitutes an odor pattern [9]. A single sensor in the array should not be highly specific in its response, but should respond to a broad range of compounds, so that different patterns are expected to be related to different odors.

Figure 1. Block diagram of the system

To achieve a high recognition rate, several sensors with different selectivity patterns are used, and pattern recognition techniques must be coupled with the sensor array [8]. Our system consists of five gas sensors of different types, all from the TGS class of FIGARO USA, INC.; the sensing element is a tin dioxide (SnO2) semiconductor. In particular, three of them are of the TGS 8 type, one is of the TGS 8 type, and the last one is of the TGS 6 type. In addition, three auxiliary sensors are used: a temperature sensor (LM35 from National Semiconductor Corporation), a humidity sensor (HIH-6 from Honeywell), and a pressure sensor (XFAM from Fujikura Ltd.). Humidity and temperature changes have a strong effect on most sensors. The gas sensors and the auxiliary sensors are placed inside a box with effective volume of cc; inside the box we also put a fan to let the solvent drops evaporate easily. All sensors are connected to an analog-to-digital converter (A/D) and then, by means of an interface card, to a PC, where the training for both classification and concentration estimation is implemented.

3. Support vector machines (SVM)

Support vector machines (SVMs) are a set of related supervised learning methods used for classification and regression. They belong to the family of generalized linear classifiers, and they have the ability both to minimize the empirical classification error and to maximize the geometric margin, as shown in Figure 2; in fact, an SVM is also known as a maximum margin classifier [5]. In this section we summarize the main features of SVMs; detailed surveys can be found in [2][4][6][7].

Figure 2. How to find the optimal hyperplane

As shown in Figure 2, geometrically the optimal separating hyperplane for two point sets can be found by constructing the convex hulls of the two classes and bisecting the shortest line segment joining the hulls: the hyperplane orthogonal to this segment at its midpoint is the optimal one. The equation of this hyperplane is $f(\mathbf{x}) = \mathbf{w}^T \mathbf{x} + b = 0$ (boldface for vectors), and it can be shown that, for a proper normalization, the margin is equal to $2/\|\mathbf{w}\|$. Therefore maximizing the margin is equivalent to minimizing $\|\mathbf{w}\|$. A small numerical illustration of this relation is given below.
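The following sketch (not from the paper) illustrates the margin/norm relation numerically: it trains a linear SVM on two hypothetical, well-separated point clouds and recovers the geometric margin as $2/\|\mathbf{w}\|$. It assumes scikit-learn, whose `SVC` is itself built on LIBSVM; the data are synthetic stand-ins.

```python
# A minimal sketch, assuming scikit-learn and synthetic 2-D data:
# train a linear SVM on two separable clouds, then read the margin off ||w||.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=-2.0, scale=0.3, size=(20, 2)),
               rng.normal(loc=+2.0, scale=0.3, size=(20, 2))])
y = np.array([-1] * 20 + [+1] * 20)

clf = SVC(kernel="linear", C=1e3).fit(X, y)  # large C approximates a hard margin
w, b = clf.coef_[0], clf.intercept_[0]       # the hyperplane f(x) = w^T x + b

margin = 2.0 / np.linalg.norm(w)             # geometric margin between classes
print(f"margin = {margin:.3f}")
print("support vectors:", clf.support_vectors_)  # the points closest to f(x) = 0
```

The printed support vectors are exactly the points that realize the margin; moving any other point does not change the separating hyperplane.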

The advantage of this maximum margin criterion is robustness against noise, and the solution is unique for linearly separable problems. In many practical cases the data are not linearly separable; the hyperplane then has to maximize the margin and minimize the sum of classification errors at the same time. The error $\xi_i$ of a point $(\mathbf{x}_i, y_i)$, $y_i \in \{-1, +1\}$ (with respect to a target margin $\gamma$ and a hyperplane $f$) is defined as

$$\xi_i = \xi((\mathbf{x}_i, y_i), f(\mathbf{x}_i), \gamma) = \max(0,\; \gamma - y_i f(\mathbf{x}_i)).$$

$\xi_i$ is called the margin slack variable, and it measures how much a point fails to have margin $\gamma$. If $y_i$ and $f(\mathbf{x}_i)$ have different signs, the point $\mathbf{x}_i$ is misclassified and $\xi_i = \max(0,\; \gamma - y_i f(\mathbf{x}_i)) > \gamma > 0$. The error $\xi_i$ is also greater than zero if the point $\mathbf{x}_i$ is correctly classified but with margin smaller than $\gamma$. Finally, the farther $\mathbf{x}_i$ falls inside the wrong region, the bigger the error. The cost function to be minimized is

$$\frac{1}{2}\|\mathbf{w}\|^2 + C \sum_i \xi_i \qquad (1)$$

where $C$ is a positive constant which determines the trade-off between accuracy on the training set and margin width; this constant can therefore be regarded as a regularization parameter. When $C$ has a small value, the optimal separating hyperplane tends to maximize the distance from the closest points, while for large values of $C$ it tends to minimize the number of incorrectly classified points. If the original patterns are not linearly separable, they must be mapped, by means of appropriate kernel functions, to a higher-dimensional space called the feature space: a linear separation in the feature space corresponds to a non-linear separation in the original input space [4]. Kernels are a special class of functions that permit the inner products to be calculated directly in the feature space, without explicitly applying the mapping. The family of kernel functions adopted in machine learning ranges from simple linear and polynomial mappings to sigmoid and radial basis functions [7]; a radial basis function kernel maps the training data into an infinite-dimensional space. In this paper the linear kernel is used.

4. Experiments and results

In our experiments we used four different gases at different concentrations: methanol, ethanol, acetone, and benzene. The data set is made up of samples in $\mathbb{R}^8$, where each sample corresponds to the outputs of the eight sensors for a given pair (gas, concentration), as shown in Figure 3. In the first analysis we used an SVM with a linear kernel function, and we applied multi-class classification by using the LIBSVM package [3]. The optimal regularization parameter $C$ was found experimentally by minimizing the leave-one-out cross-validation error over the training set. In $k$-fold cross-validation the data is divided into $k$ subsets (here $k$ equals the data set size); the model is then trained $k$ times, each time leaving one sample out of the training set and using the omitted sample as a test sample to compute the error. The classification results for different values of $C$ are shown in Table 1.

Table 1. Multiple C values vs. classification rate (columns: C value; classification rate %)

We used 6 gas samples for methanol, 6 gas samples for ethanol, 6 gas samples for acetone, and 8 gas samples for benzene; each experiment was repeated twice with each concentration. By using the linear kernel we obtained 94.74% classification correctness, making 96 cross-validations. A minimal sketch of this model-selection loop is given below.
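The following sketch reproduces the leave-one-out selection of $C$ under stated assumptions: scikit-learn stands in for the LIBSVM tools used in the paper (its `SVC` wraps LIBSVM internally), and `X`, `y` are hypothetical placeholders for the $n \times 8$ sensor-reading matrix and the four gas labels.

```python
# A sketch, assuming scikit-learn; X and y are hypothetical stand-ins for the
# paper's 8-dimensional sensor samples and their gas labels.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import LeaveOneOut, cross_val_score

def loo_accuracy(X, y, C):
    """Leave-one-out accuracy of a linear-kernel SVM for a given C."""
    clf = SVC(kernel="linear", C=C)  # multi-class handled internally (one-vs-one)
    return cross_val_score(clf, X, y, cv=LeaveOneOut()).mean()

# Hypothetical data in place of the real gas samples.
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 8))
y = rng.integers(0, 4, size=40)

# Pick the regularization parameter C by minimizing the LOO error.
for C in (0.01, 0.1, 1.0, 10.0, 100.0):
    print(f"C = {C:<6} LOO accuracy = {loo_accuracy(X, y, C):.3f}")
```

With leave-one-out, each of the $n$ folds tests a single held-out sample, so the mean score is exactly the fraction of correctly classified samples reported in Table 1.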
After the classification step, the next task was to estimate the concentration of the classified gas. To this aim we used least-squares regression: we built an approximation of the response (sensor resistance versus concentration) for each sensor and each gas, and then used this approximation to find the concentration for each gas type. For a sintered SnO2 gas sensor, the concentration dependence of the response is nonlinear and can be described by a power law of the form [1]

$$S = A c^R \qquad (2)$$

where $S$ is the sensor resistance, $A$ a constant, $c$ the concentration of the gas, and $R$ a characteristic exponent. We applied this equation to all sensors for each gas. The values of $A$ and $R$ are calculated as follows:

$$r = \frac{n \sum_i (\ln c_i \ln S_i) - \left(\sum_i \ln c_i\right)\left(\sum_i \ln S_i\right)}{n \sum_i (\ln c_i)^2 - \left(\sum_i \ln c_i\right)^2} \qquad (3)$$

$$\eta = \frac{\sum_i \ln S_i - r \sum_i \ln c_i}{n}$$

where $R = r$ and $A = \exp(\eta)$.

Figure 3. First two principal components of the experimental data set

Figure 4. Acetone concentrations vs. sensor resistance for each sensor type

Figure 4 shows the original concentrations plotted against the corresponding sensor resistances, together with the estimated curve, for acetone. In this way we obtain five concentration estimates, one coming from each gas sensor. To find the optimal estimate of the concentration we again adopted the least-squares regression principle: our experiments provide five coefficients ($\alpha$'s) that express the concentration estimate as a combination of the results produced by the diverse sensors, with weights coming from the least-squares computation. They were obtained by solving the following minimization problem with respect to $\alpha$:

$$\min_{\alpha_1, \ldots, \alpha_M} \sum_{l=1}^{n} \left( c^{(l)} - \sum_{i=1}^{M} \alpha_i c_i^{(l)} \right)^2 \qquad (4)$$

where $n$ is the number of gas samples, $c^{(l)}$ is the true concentration, $M$ is the number of sensors (in our case $M = 5$), and $c_i^{(l)}$ are the concentrations previously calculated from the individual sensors. Table 2 shows the real concentrations together with the results of the proposed method; for comparison purposes, the table also reports the results obtained by simply averaging the outcomes provided by the five sensors. Finally, we considered (Table 3) the correlation coefficient (C.C.) as a measure of the estimation accuracy [10].

Table 2. Experimental results (columns: real concentrations; concentrations by our method; concentrations by the method of average)

Table 3. Correlation coefficient (C.C.) value for each gas (Acetone, Methanol, Ethanol, Benzene), for the new method and for the method of average

A small numerical sketch of this two-stage estimation procedure is given below.
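The sketch below mirrors equations (2)-(4) on hypothetical calibration data (synthetic $A$, $R$, and concentrations; not the authors' code or measurements): stage 1 fits one power law per sensor and inverts it into five per-sample concentration estimates, stage 2 finds the combining weights $\alpha$ by linear least squares.

```python
# A sketch of the two-stage estimation, assuming NumPy and synthetic data;
# variable names mirror equations (2)-(4).
import numpy as np

def fit_power_law(c, S):
    """Least-squares fit of S = A * c**R on log-log axes, i.e. equation (3)."""
    x, yv = np.log(c), np.log(S)
    n = len(c)
    r = (n * np.sum(x * yv) - x.sum() * yv.sum()) / (n * np.sum(x**2) - x.sum()**2)
    eta = (yv.sum() - r * x.sum()) / n
    return np.exp(eta), r                      # A, R

def invert_power_law(S, A, R):
    """Per-sensor concentration estimate: c = (S / A)**(1 / R)."""
    return (S / A) ** (1.0 / R)

# Hypothetical calibration data: M = 5 sensors, n = 10 samples of one gas.
rng = np.random.default_rng(2)
c_true = np.linspace(50.0, 500.0, 10)                      # concentrations
A_true, R_true = rng.uniform(2, 5, 5), rng.uniform(0.4, 0.9, 5)
S = A_true[:, None] * c_true[None, :] ** R_true[:, None]   # 5 x n resistances
S *= rng.normal(1.0, 0.02, S.shape)                        # measurement noise

# Stage 1: one (A, R) pair per sensor, then five estimates per sample.
params = [fit_power_law(c_true, S[i]) for i in range(5)]
C_est = np.array([invert_power_law(S[i], *params[i]) for i in range(5)]).T  # n x 5

# Stage 2: weights alpha solving equation (4) as a linear least-squares problem.
alpha, *_ = np.linalg.lstsq(C_est, c_true, rcond=None)
print("alpha =", np.round(alpha, 3))
print("combined estimates:", np.round(C_est @ alpha, 1))
```

Since (4) is linear in the $\alpha_i$ once the per-sensor estimates $c_i^{(l)}$ are fixed, a single `lstsq` call yields the optimal weights; the combined estimate `C_est @ alpha` plays the role of the "our method" column of Table 2, and a plain row mean of `C_est` would reproduce the "method of average" baseline.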

5. Conclusion

The results demonstrate that our system has the ability to identify the type of a gas and then estimate its concentration. The best classification correctness was 94.74%, and the values obtained in terms of concentration estimates appear quite satisfactory. Future work will be devoted to identifying binary mixtures of gases and then calculating the concentration of each component.

6. Acknowledgements

The authors are grateful to A. Astorino for some useful discussions about classification. We also wish to thank the Dipartimento di Elettronica Informatica e Sistemistica, Università della Calabria, for supporting this project. The work has been partially supported by the Italian Ministero dell'Università e della Ricerca Scientifica under the PRIN Project "Numerical Methods for Global Optimization and for some classes of Nonsmooth Optimization problems" (578.).

References

[1] P. Bartlett and J. Gardner. Odour sensors for an electronic nose. In Sensors and Sensory Systems for an Electronic Nose, NATO ASI Series E: Applied Science, vol. 212, 1992.
[2] C. Burges. A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery, 2(2):121-167, 1998.
[3] C. C. Chang and C. J. Lin. LIBSVM: a library for support vector machines. [Online]. Available: http://www.csie.ntu.edu.tw/~cjlin/libsvm, 2001.
[4] N. Cristianini and J. Shawe-Taylor. An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge University Press, 2000.
[5] C. Distante, N. Ancona, and P. Siciliano. Support vector machines for olfactory signals recognition. Sensors and Actuators B, 88:30-39, 2003.
[6] R. Gutierrez-Osuna. Pattern analysis for machine olfaction: a review. IEEE Sensors Journal, 2(3):189-202, June 2002.
[7] K. Müller, S. Mika, G. Rätsch, K. Tsuda, and B. Schölkopf. An introduction to kernel-based learning algorithms. IEEE Trans. on Neural Networks, 12(2):181-201, March 2001.
[8] M. Pardo and G. Sberveglieri. Classification of electronic nose data with support vector machines. Sensors and Actuators B, 107:730-737, 2005.
[9] T. C. Pearce, S. S. Schiffman, H. T. Nagle, and J. W. Gardner. Handbook of Machine Olfaction: Electronic Nose Technology. WILEY-VCH, 2003.
[10] M. Penza, G. Cassano, and F. Tortorella. Identification and quantification of individual volatile organic compounds in a binary mixture by SAW multisensor array and pattern recognition analysis. Measurement Science and Technology, 2002.
[11] V. N. Vapnik. Statistical Learning Theory. John Wiley and Sons, 1998.
[12] X. Wang, H. Zhang, and C. Zhang. Signals recognition of electronic nose based on support vector machines. In Proceedings of the Fourth International Conference on Machine Learning and Cybernetics, Guangzhou, pages 94-98, 18-21 August 2005.
