arXiv: v2 [cs.LG] 22 Jan 2019


Spatial Variational Auto-Encoding via Matrix-Variate Normal Distributions

Zhengyang Wang, Department of Computer Science and Engineering, Texas A&M University. zhengyang.wang@tamu.edu
Hao Yuan, School of Electrical Engineering and Computer Science, Washington State University. hao.yuan@wsu.edu
Shuiwang Ji, Department of Computer Science and Engineering, Texas A&M University. sji@tamu.edu

Abstract

The key idea of variational auto-encoders (VAEs) resembles that of traditional auto-encoder models in which spatial information is supposed to be explicitly encoded in the latent space. However, the latent variables in VAEs are vectors, which can be interpreted as multiple feature maps of size $1 \times 1$. Such representations can only convey spatial information implicitly when coupled with powerful decoders. In this work, we propose spatial VAEs that use feature maps of larger size as latent variables to explicitly capture spatial information. This is achieved by allowing the latent variables to be sampled from matrix-variate normal (MVN) distributions whose parameters are computed from the encoder network. To increase dependencies among locations on latent feature maps and reduce the number of parameters, we further propose spatial VAEs via low-rank MVN distributions. Experimental results show that the proposed spatial VAEs outperform original VAEs in capturing rich structural and spatial information.

Keywords: Deep learning, variational auto-encoders, matrix-variate normal distributions, generative models, unsupervised learning

1 Introduction

The mathematical and computational modeling of probability distributions in high-dimensional space and generating samples from them are highly useful yet very challenging. With the development of deep learning methods, deep generative models have been shown to be effective and scalable [12, 22, 5, 9, 19, 8, 21] in capturing probability distributions over high-dimensional data spaces and generating samples from them. Among them, variational auto-encoders (VAEs) [12, 22, 6, 11] are one of the most promising approaches. In machine learning, the auto-encoder architecture is applied to train scalable models by learning latent representations. For image modeling tasks, it is preferred to encode spatial information into the latent space explicitly. However, the latent variables in VAEs are vectors, which can be interpreted as $1 \times 1$ feature maps with no explicit spatial information. While such lack of explicit spatial information does not lead to major performance problems on simple tasks such as digit generation from the MNIST dataset [16], it greatly limits the model's abilities when images are more complicated [13, 17].

To overcome this limitation, we propose spatial VAEs that employ $d \times d$ ($d > 1$) feature maps as latent representations. Such latent feature maps are generated from matrix-variate normal (MVN) distributions whose parameters are computed from the encoder network. Specifically, MVN distributions are able to generate feature maps with appropriate dependencies among locations. To increase dependencies among locations on latent feature maps and reduce the number of parameters, we further propose spatial VAEs via low-rank MVN distributions. In this low-rank formulation, the mean matrix of the MVN distribution is computed as the outer product of two vectors computed from the encoder network. Experimental results on image modeling tasks demonstrate the capabilities of our spatial VAEs in complicated image generation tasks.

It is worth noting that the original VAEs can be considered as a special case of spatial VAEs via MVN distributions. That is, if we set the size of feature maps generated via MVN distributions to $1 \times 1$, spatial VAEs via MVN distributions reduce to the original VAEs. More importantly, when the size of feature maps is larger than $1 \times 1$, direct structural ties have been built into elements of the feature maps via MVN distributions.
Thus, our proposed spatial VAEs are intrinsically different from the original VAEs when the size of feature maps is larger than $1 \times 1$. Specifically, our proposed spatial VAEs cannot be obtained by enlarging the size of the latent representations in the original VAEs.

2 Background and Related Work

In this section, we introduce the architectures of auto-encoders and variational auto-encoders.

2.1 Auto-Encoder Architectures

Auto-encoder (AE) is a model architecture used in tasks like image segmentation [30, 23, 18], machine translation [2, 25] and denoising reconstruction [28, 29]. It consists of two parts: an encoder that encodes the input data into lower-dimensional latent representations and a decoder that generates outputs by decoding the representations. Depending on the task, the latent representations will focus on different properties of the input data. Nevertheless, these tasks usually require outputs to have a similar or exactly the same structure as inputs. Thus, structural information is expected to be preserved through the encoder-decoder process.

In computer vision tasks, structural information usually means spatial information of images. There are two main strategies to preserve spatial information in AE for image tasks. One is to apply very powerful decoders, like conditional pixel convolutional neural networks (PixelCNNs) [20, 27, 24, 9], that generate output images pixel-by-pixel. In this way, the decoders can recover spatial information in the form of dependencies among pixels. However, pixel-by-pixel generation is very slow, resulting in major speed problems in practice. The other method is to let the latent representations explicitly contain spatial information and apply decoders that can make use of such information. To apply this strategy for image tasks, usually the latent representations are feature maps of size between the size of a pixel ($1 \times 1$) and that of the input image, while the decoders are deconvolutional neural networks (DCNNs) [30]. Since most computer vision tasks only require high-level spatial information, like relative locations of objects, instead of detailed relationships among pixels, preserving only rough spatial information is enough, and this strategy has proved effective and efficient.

2.2 Variational Auto-Encoders
In unsupervised learning, generative models aim to model the underlying data distribution. Formally, for data space $\mathcal{X}$, let $p_{true}(x)$ denote the probability density function (PDF) of the true data distribution for $x \in \mathcal{X}$. Given a dataset $D = \{x^{(i)}\}_{i=1}^{N}$ of i.i.d. samples from $\mathcal{X}$, generative models try to approximate $p_{true}(x)$ using a model distribution $p_\theta(x)$, where $\theta$ represents model parameters. To train the model, maximum likelihood (ML) inference is performed on $\theta$; that is, parameters are updated to optimize

$$\log p_\theta(D) = \log p_\theta(x^{(1)}, \ldots, x^{(N)}) = \sum_{i=1}^{N} \log p_\theta(x^{(i)}).$$

The approximation quality of $p_\theta(x)$ relies on the generalization ability of the model. In machine learning, it highly depends on learning latent representations which can encode common features among data samples and disentangle abstract explanatory factors behind the data [3]. In data generation tasks, we apply

$$p_\theta(x) = \int p_\theta(x \mid z)\, p_\theta(z)\, dz$$

for modeling, where $p_\theta(z)$ is the PDF of the distribution of latent representations and $p_\theta(x \mid z)$ represents a complex mapping from the latent space to the data space. A major advantage of using latent representations is dimensionality reduction of data, since they are low-dimensional. The prior $p_\theta(z)$ can be simple and easy to model, while the mapping represented by $p_\theta(x \mid z)$ can be learned automatically through complicated deep learning models.

Recently, [12] point out that the above model has intractability problems and can only be trained by costly sampling-based methods. To tackle this, they propose variational auto-encoders (VAEs), which instead maximize a variational lower bound of the log-likelihood as

$$\log p_\theta(x) \geq \mathcal{L}_{VAE} = \mathbb{E}_{z \sim q_\phi(z \mid x)}[\log p_\theta(x \mid z)] - D_{KL}[q_\phi(z \mid x) \,\|\, p_\theta(z)], \tag{2.1}$$

where $q_\phi(z \mid x)$ is an approximation model to the intractable $p_\theta(z \mid x)$, parameterized by $\phi$, and $D_{KL}[\cdot \,\|\, \cdot]$ represents the Kullback-Leibler divergence.
In VAEs, $p_\theta(x \mid z) = \mathcal{N}(x; f_\theta(z), \sigma^2 I)$, $q_\phi(z \mid x) = \mathcal{N}(z; \mu_\phi(x), \Sigma_\phi(x))$, and $p_\theta(z) = \mathcal{N}(z; 0, I)$ are modeled as multivariate Gaussian distributions with diagonal covariance matrices. Here, $f_\theta(z)$, $\mu_\phi(x)$ and $\Sigma_\phi(x)$ are computed with deep neural networks like CNNs. Figure 1 shows the architecture of VAEs. The model parameters $\theta$ and $\phi$ can be trained using the reparameterization trick [22], where the sampling process $z \sim q_\phi(z \mid x) = \mathcal{N}(z; \mu_\phi(x), \Sigma_\phi(x))$ is decomposed into two steps as

$$\epsilon \sim \mathcal{N}(\epsilon; 0, I), \quad z = \mu_\phi(x) + \Sigma_\phi^{1/2}(x)\, \epsilon. \tag{2.2}$$

3 Spatial Variational Auto-Encoders

In this section, we analyze a problem of the original VAEs and propose spatial VAEs in Section 3.1 to overcome it. Afterwards, several ways to implement spatial VAEs are discussed. A naïve implementation is introduced and analyzed in Section 3.2, followed by a method that incorporates the use of matrix-variate normal (MVN) distributions in Section 3.3. Finally, we propose our final model, spatial VAEs via low-rank MVN distributions, by applying a low-rank formulation of MVN distributions in Section 3.4.
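As a concrete illustration of Section 2.2, the two-step sampling in Equation 2.2 and the KL term of Equation 2.1 can be sketched as follows for the diagonal-covariance case. This is a minimal NumPy sketch, not the authors' implementation: the function names and the example values of `mu` and `var` are hypothetical stand-ins for encoder outputs, and the closed-form KL expression is the standard one for a diagonal Gaussian against $\mathcal{N}(0, I)$.

```python
import numpy as np

def reparameterize(mu, var, rng):
    """Draw z = mu + sqrt(var) * eps with eps ~ N(0, I) (Equation 2.2, diagonal case)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.sqrt(var) * eps

def kl_to_standard_normal(mu, var):
    """Closed-form KL[N(mu, diag(var)) || N(0, I)] for a diagonal Gaussian."""
    return 0.5 * np.sum(var + mu**2 - 1.0 - np.log(var))

rng = np.random.default_rng(0)
mu = np.array([0.5, -1.0, 0.0])    # hypothetical encoder mean
var = np.array([1.0, 0.25, 4.0])   # hypothetical diagonal of Sigma
z = reparameterize(mu, var, rng)   # one latent sample
kl = kl_to_standard_normal(mu, var)
```

Because the noise `eps` is drawn independently of `mu` and `var`, gradients can flow through both encoder outputs, which is the point of the trick.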

3.1 Overview

Note that $q_\phi(z \mid x)$ and $p_\theta(x \mid z)$ in VAEs resemble the encoder and decoder, respectively, in AE for image reconstruction tasks, where $z$ represents the latent representations. However, in VAEs, $z$ is commonly a vector, which can be considered as multiple $1 \times 1$ feature maps. While $z$ may implicitly preserve some spatial information of the input image $x$, it raises the requirement for a more complex decoder. Given a fixed architecture, the hypothesis space of decoder models is limited. As a result, the optimal decoder may not lie in the hypothesis space [31]. This problem significantly hampers the performance of VAEs, especially when spatial information is important for images in $\mathcal{X}$.

Based on the above analysis, it is beneficial to either have a larger hypothesis space for decoders or let $z$ explicitly contain spatial information. Note that these two methods correspond to the two strategies introduced in Section 2.1. [9] follow the first strategy and propose PixelVAEs, whose decoders are conditional PixelCNNs [27] instead of simple DCNNs. As conditional PixelCNNs themselves are also generative models, PixelVAEs can be considered as conditional PixelCNNs with the conditions replaced by $z$. In spite of their impressive results, the performance of PixelVAEs and conditional PixelCNNs is similar, which indicates that conditional PixelCNNs are responsible for capturing most properties of images in $\mathcal{X}$. In this case, $z$ contributes little to the performance. In addition, applying conditional PixelCNNs leads to a very slow generation process in practice.

In this work, the second strategy is explored by constructing spatial latent representations $z$ in the form of feature maps of size larger than $1 \times 1$. Such feature maps can explicitly contain spatial information. We term VAEs with spatial latent representations as spatial VAEs. The main distinction between spatial VAEs and the original VAEs is the size of latent feature maps.
By having $d \times d$ ($d > 1$) feature maps instead of $1 \times 1$ ones, the total dimension of the latent representations $z$ significantly increases. However, spatial VAEs are essentially different from the original VAEs with a higher-dimensional latent vector $z$. Suppose the vector $z$ is extended by $d^2$ times in order to match the total dimension; the number of hidden nodes in each layer of the decoders will then explode correspondingly. This results in an explosion in the number of decoder parameters, which slows down the generation process. In spatial VAEs, by contrast, decoders become even simpler since $d \times d$ is closer to the required size of output images. On the other hand, when using decoders of similar capacities, spatial VAEs must have higher-dimensional latent representations than the original VAEs. It is demonstrated that this only slightly influences the training process by requiring more outputs from encoders, while the generation process, which only involves decoders, remains unaffected. Our experimental results show that with proper designs, spatial VAEs substantially outperform the original VAEs when applying similar decoders.

3.2 Naïve Spatial VAEs

To achieve spatial VAEs, a direct and naïve way is to simply reshape the original vector $z$ into $N$ feature maps of size $d \times d$. But this naïve way is problematic since the sampling process does not change. Note that in the original VAEs, the vector $z$ is sampled from $q_\phi(z \mid x) = \mathcal{N}(z; \mu_\phi(x), \Sigma_\phi(x))$. The covariance matrix $\Sigma_\phi(x)$ is diagonal, meaning that the variables are pairwise uncorrelated. In particular, for multivariate Gaussian distributions, uncorrelatedness implies independence. Therefore, the components of $z$ are independent random variables and the variances of their distributions correspond to entries on the diagonal of $\Sigma_\phi(x)$. Specifically, suppose $z$ is a $C$-dimensional vector; the $i$-th component is a random variable that follows the univariate normal distribution

$$z_i \sim \mathcal{N}(z_i; \mu_\phi(x)_i, \mathrm{diag}(\Sigma_\phi(x))_i), \quad i = 1, \ldots, C,$$

where $\mathrm{diag}(\cdot)$ represents the vector consisting of a matrix's diagonal entries.
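The naïve construction described in this section, sampling a vector with diagonal covariance and then reshaping it into $N$ feature maps of size $d \times d$, can be sketched as below. This is only an illustrative NumPy sketch; `sample_naive_spatial` and its arguments are hypothetical names, and `mu` and `var` stand in for the encoder outputs. It makes visible that the reshape adds no coupling: every entry is still drawn from its own univariate normal.

```python
import numpy as np

def sample_naive_spatial(mu, var, d, n_maps, rng):
    """Sample a (d*d*n_maps)-vector from N(mu, diag(var)), then reshape to (n_maps, d, d)."""
    assert mu.shape == var.shape == (d * d * n_maps,)
    eps = rng.standard_normal(mu.shape)   # independent noise per component
    z = mu + np.sqrt(var) * eps           # each z_i depends only on mu_i and var_i
    return z.reshape(n_maps, d, d)        # reshaping changes layout, not the distribution

rng = np.random.default_rng(0)
d, n_maps = 3, 4
C = d * d * n_maps
feature_maps = sample_naive_spatial(np.zeros(C), np.ones(C), d, n_maps, rng)
```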
After applying the reparameterization trick, we can rewrite Equation 2.2 as

$$\epsilon_i \sim \mathcal{N}(\epsilon_i; 0, 1), \quad z_i = \mu_\phi(x)_i + \mathrm{diag}(\Sigma_\phi(x))_i^{1/2}\, \epsilon_i, \quad i = 1, \ldots, C. \tag{3.3}$$

To sample $N$ feature maps of size $d \times d$ in naïve spatial VAEs, the above process is followed by a reshape operation while setting $C = d^2 N$. However, between two different components $z_i$ and $z_j$, the only relationship is that their respective distribution parameters $(\mu_\phi(x)_i, \mathrm{diag}(\Sigma_\phi(x))_i)$ and $(\mu_\phi(x)_j, \mathrm{diag}(\Sigma_\phi(x))_j)$ are both computed from $x$. Such dependencies are implicit and weak. After reshaping, there is no direct relationship among locations within each feature map, while spatial latent representations should contain spatial information such as dependencies among locations. To overcome this limitation, we propose spatial VAEs via matrix-variate normal distributions.

3.3 Spatial VAEs via Matrix-Variate Normal Distributions

Instead of obtaining $N$ feature maps of size $d \times d$ by first sampling a $d^2 N$-dimensional vector from multivariate normal distributions and then reshaping, we propose to directly sample $d \times d$ matrices as feature maps from matrix-variate normal (MVN) distributions [10], resulting in an improved model known as spatial VAEs via MVN distributions. Specifically, we modify $q_\phi(z \mid x)$ in the original VAEs and keep other parts the same. As explained below, MVN distributions can model dependencies between the rows and columns in a matrix. In this way, dependencies among locations within a feature map are established. We proceed by providing the definition of MVN distributions.

Figure 1: Illustration of the differences between the proposed spatial VAEs via low-rank MVN distributions and the original VAEs. At the top is the architecture of the original VAEs, where the latent $z$ is a vector sampled from a multivariate Gaussian distribution with a diagonal covariance matrix. Below is the proposed model, which is explained in detail in Section 3.4. Briefly, it modifies the sampling process by incorporating a low-rank formulation of the MVN distributions and produces latent representations that explicitly retain spatial information. (Panel labels: Input Image, Encoder, Outputs, Sampling, Decoder, Generated Image, Interpretation.)

Definition: A random matrix $A \in \mathbb{R}^{m \times n}$ is said to follow a matrix-variate normal distribution $\mathcal{N}_{m,n}(A; M, \Omega \otimes \Psi)$ with mean matrix $M \in \mathbb{R}^{m \times n}$ and covariance matrix $\Omega \otimes \Psi$, where $\Omega \in \mathbb{R}^{m \times m} \succ 0$ and $\Psi \in \mathbb{R}^{n \times n} \succ 0$, if $\mathrm{vec}(A^T)$ follows the multivariate normal distribution $\mathcal{N}(\mathrm{vec}(A^T); \mathrm{vec}(M^T), \Omega \otimes \Psi)$. Here, $\otimes$ denotes the Kronecker product and $\mathrm{vec}(\cdot)$ denotes transforming an $\mathbb{R}^{m \times n}$ matrix into an $mn$-dimensional vector by concatenating the columns.

In MVN distributions, $\Omega$ and $\Psi$ capture the relationships across rows and columns, respectively, of a matrix. By constructing the covariance matrix through the Kronecker product of these two matrices, dependencies among values in a matrix can be modeled. In spatial VAEs, a feature map $F$ can be considered as an $\mathbb{R}^{d \times d}$ matrix that follows an MVN distribution $\mathcal{N}_{d,d}(F; M, \Omega \otimes \Psi)$, where $\Omega \in \mathbb{R}^{d \times d}$ and $\Psi \in \mathbb{R}^{d \times d}$ are diagonal matrices. Although within $F$ the random variables corresponding to each location are still independent since $\Omega \otimes \Psi$ is diagonal, MVN distributions are able to add direct structural ties among locations through their variances.
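To make the definition above more tangible, the following NumPy sketch draws a matrix from an MVN distribution with general (not necessarily diagonal) $\Omega$ and $\Psi$, and checks numerically that the construction agrees with the Kronecker-product covariance on $\mathrm{vec}(A^T)$. The sampler $A = M + L Z R^T$ with Cholesky factors $\Omega = L L^T$ and $\Psi = R R^T$ is a standard construction assumed here, not something taken from the paper, and all names and example values are hypothetical.

```python
import numpy as np

def sample_mvn(M, Omega, Psi, rng):
    """Draw A = M + L Z R^T, where Omega = L L^T, Psi = R R^T and Z has i.i.d. N(0, 1) entries."""
    L = np.linalg.cholesky(Omega)   # row-covariance factor
    R = np.linalg.cholesky(Psi)     # column-covariance factor
    Z = rng.standard_normal(M.shape)
    return M + L @ Z @ R.T

def row_vec(A):
    """vec(A^T): concatenate the rows of A (NumPy's default row-major flattening)."""
    return A.ravel()

rng = np.random.default_rng(0)
m, n = 2, 3
Omega = np.array([[2.0, 0.5], [0.5, 1.0]])   # example row covariance
Psi = np.diag([1.0, 4.0, 9.0])               # example column covariance
L, R = np.linalg.cholesky(Omega), np.linalg.cholesky(Psi)
Z = rng.standard_normal((m, n))

# vec((L Z R^T)^T) equals kron(L, R) @ vec(Z^T), so its covariance is kron(Omega, Psi).
lhs = row_vec(L @ Z @ R.T)
rhs = np.kron(L, R) @ row_vec(Z)
cov = np.kron(L, R) @ np.kron(L, R).T
A = sample_mvn(np.zeros((m, n)), Omega, Psi, rng)
```

With diagonal $\Omega$ and $\Psi$, as used in spatial VAEs, the Cholesky factors reduce to element-wise square roots and the entries of `A` become independent.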
For example, for two locations $(i_1, j_1)$ and $(i_2, j_2)$ in $F$,

$$F_{(i_1,j_1)} \sim \mathcal{N}(F_{(i_1,j_1)}; M_{(i_1,j_1)}, \mathrm{diag}(\Omega \otimes \Psi)_{i_1 j_1}), \tag{3.4}$$

$$F_{(i_2,j_2)} \sim \mathcal{N}(F_{(i_2,j_2)}; M_{(i_2,j_2)}, \mathrm{diag}(\Omega \otimes \Psi)_{i_2 j_2}). \tag{3.5}$$

Here, $F_{(i_1,j_1)}$ and $F_{(i_2,j_2)}$ are independently sampled from two univariate Gaussian distributions. However, the variances $\mathrm{diag}(\Omega \otimes \Psi)_{i_1 j_1}$ and $\mathrm{diag}(\Omega \otimes \Psi)_{i_2 j_2}$ have built direct interactions through the Kronecker product. Based on this, we propose spatial VAEs via MVN distributions, which sample $N$ feature maps of size $d \times d$ from $N$ independent MVN distributions as

$$F_k \sim \mathcal{N}_{d,d}(F_k; M_{k\phi}(x), \Omega_{k\phi}(x) \otimes \Psi_{k\phi}(x)), \quad k = 1, \ldots, N, \tag{3.6}$$

where $M_{k\phi}(x)$, $\Omega_{k\phi}(x)$ and $\Psi_{k\phi}(x)$ are computed through the encoder. Here, compared to the original VAEs, $q_\phi(z \mid x)$ is replaced but $p_\theta(z)$ remains the same. Since MVN distributions are defined based on multivariate Gaussian distributions, the term $D_{KL}[q_\phi(z \mid x) \,\|\, p_\theta(z)]$ in Equation 2.1 can be calculated in a similar way.

To demonstrate the differences with naïve spatial VAEs, we reexamine the original VAEs. Note that naïve spatial VAEs have the same sampling process as the original VAEs. The original VAE samples a $C = d^2 N$-dimensional vector $z$ from $q_\phi(z \mid x) = \mathcal{N}(z; \mu_\phi(x), \Sigma_\phi(x))$, where $\mu_\phi(x)$ is a $C$-dimensional vector and $\Sigma_\phi(x)$ is an $\mathbb{R}^{C \times C}$ diagonal matrix. Because $\Sigma_\phi(x)$ is diagonal, it can be represented by the $C$-dimensional vector $\mathrm{diag}(\Sigma_\phi(x))$. To summarize, the encoder of the original VAEs outputs $2C = 2d^2 N$ values, which are interpreted as $\mu_\phi(x)$ and $\mathrm{diag}(\Sigma_\phi(x))$.

In spatial VAEs via MVN distributions, according to Equation 3.6, $M_{k\phi}(x)$ is an $\mathbb{R}^{d \times d}$ matrix while $\Omega_{k\phi}(x)$ and $\Psi_{k\phi}(x)$ are $\mathbb{R}^{d \times d}$ diagonal matrices that can be represented by $d$-dimensional vectors. In this case, the required number of outputs from the encoder is changed to $(d^2 + 2d)N$, corresponding to $[M_{1\phi}(x), \ldots, M_{N\phi}(x)]$, $[\mathrm{diag}(\Omega_{1\phi}(x)), \ldots, \mathrm{diag}(\Omega_{N\phi}(x))]$ and $[\mathrm{diag}(\Psi_{1\phi}(x)), \ldots, \mathrm{diag}(\Psi_{N\phi}(x))]$. As has been explained in Section 3.2, since $\Omega_{k\phi}(x) \otimes \Psi_{k\phi}(x)$ is diagonal, sampling the matrix $F_k$ is equivalent to sampling scalar numbers from independent univariate normal distributions. So the modified sampling process with the reparameterization trick is

$$\epsilon_{(i,j,k)} \sim \mathcal{N}(\epsilon_{(i,j,k)}; 0, 1), \quad z_{(i,j,k)} = M_{k\phi}(x)_{(i,j)} + \mathrm{diag}(\Omega_{k\phi}(x) \otimes \Psi_{k\phi}(x))_{ij}^{1/2}\, \epsilon_{(i,j,k)}, \quad i, j = 1, \ldots, d, \quad k = 1, \ldots, N, \tag{3.7}$$

where

$$\mathrm{diag}(\Omega_{k\phi}(x) \otimes \Psi_{k\phi}(x))_{ij} = [\mathrm{diag}(\Omega_{k\phi}(x))\, \mathrm{diag}^T(\Psi_{k\phi}(x))]_{(i,j)}.$$

Here, we take advantage of the fact that for diagonal matrices, the Kronecker product is equivalent to the outer product of vectors.
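The sampling step of Equation 3.7 can be sketched in a few lines of NumPy. This is an illustrative sketch under assumptions, not the authors' code: `M`, `omega` and `psi` are hypothetical arrays standing in for the encoder outputs (the mean matrices and the diagonals of $\Omega_k$ and $\Psi_k$), and the per-location variance is formed as an outer product rather than by building the full Kronecker product.

```python
import numpy as np

def sample_feature_maps(M, omega, psi, rng):
    """M: (N, d, d) mean matrices; omega, psi: (N, d) diagonals of Omega_k and Psi_k."""
    # var[k, i, j] = omega[k, i] * psi[k, j], the outer product of the two diagonals
    var = omega[:, :, None] * psi[:, None, :]
    eps = rng.standard_normal(M.shape)          # independent N(0, 1) noise per location
    return M + np.sqrt(var) * eps, var

rng = np.random.default_rng(0)
N, d = 2, 3
M = np.zeros((N, d, d))
omega = rng.uniform(0.5, 2.0, size=(N, d))
psi = rng.uniform(0.5, 2.0, size=(N, d))
F, var = sample_feature_maps(M, omega, psi, rng)

# Cross-check against the explicit Kronecker product for the first feature map.
kron_diag = np.diag(np.kron(np.diag(omega[0]), np.diag(psi[0])))
```

Flattening `var[0]` row by row reproduces `kron_diag`, so the explicit $d^2 \times d^2$ covariance matrix never needs to be materialized.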
To be specific, suppose $D_1$ and $D_2$ are two $\mathbb{R}^{d \times d}$ diagonal matrices; then $d_1 = \mathrm{diag}(D_1)$ and $d_2 = \mathrm{diag}(D_2)$ are two $d$-dimensional vectors that satisfy

$$\mathrm{diag}(D_1 \otimes D_2) = \mathrm{vec}(d_1 d_2^T), \tag{3.8}$$

where the entries of the outer product are read row by row, matching the $\mathrm{vec}(A^T)$ convention used in the definition of MVN distributions.

It is worth noting that, compared to naïve spatial VAEs, the required number of outputs from the encoder decreases from $2d^2 N$ to $(d^2 + 2d)N$. As a result, spatial VAEs via MVN distributions lead to a simpler model while adding structural ties among locations. Note that the original VAEs can be considered as a special case of the spatial VAEs via MVN distributions. That is, if we set $d = 1$, spatial VAEs via MVN distributions reduce to the original VAEs.

3.4 A Low-Rank Formulation

The use of MVN distributions makes locations directly related to each other within a feature map by adding restrictions on variances. However, in probability theory, variance only measures the expected distance from the mean. To have more direct relationships, it is preferred to have restricted means. In this section, we introduce a low-rank formulation of MVN distributions [1] for spatial VAEs.

The low-rank formulation of an MVN distribution $\mathcal{N}_{m,n}(M, \Omega \otimes \Psi)$ is denoted as $\mathcal{N}_{m,n}(\mu, \nu, \Omega \otimes \Psi)$, where the mean matrix $M$ is instead computed by the outer product $\mu \nu^T$. Here, $\mu$ and $\nu$ are $m$-dimensional and $n$-dimensional vectors, respectively. Similar to computing the covariance matrix through the Kronecker product of two separate matrices, it explicitly forces structural interactions among entries of the mean matrix.

Applying this low-rank formulation leads to our final model, spatial VAEs via low-rank MVN distributions, which is illustrated in Figure 1. By using two distinct $d$-dimensional vectors to construct $M_{k\phi}(x) \in \mathbb{R}^{d \times d}$, Equation 3.6 is modified as

$$F_k \sim \mathcal{N}_{d,d}(F_k; \mu_{k\phi}(x)\, \nu_{k\phi}^T(x), \Omega_{k\phi}(x) \otimes \Psi_{k\phi}(x)), \quad k = 1, \ldots, N, \tag{3.9}$$

where $\mu_{k\phi}(x)$ and $\nu_{k\phi}(x)$ are $d$-dimensional vectors.
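A sketch of sampling under the low-rank formulation of Equation 3.9 is given below. It assumes diagonal $\Omega_k$ and $\Psi_k$ as in Section 3.3; the function name and the arrays `mu`, `nu`, `omega` and `psi` are hypothetical placeholders for encoder outputs, each holding one $d$-dimensional vector per feature map.

```python
import numpy as np

def sample_low_rank(mu, nu, omega, psi, rng):
    """mu, nu, omega, psi: arrays of shape (N, d). Returns samples and means of shape (N, d, d)."""
    mean = mu[:, :, None] * nu[:, None, :]       # mean[k] = outer(mu[k], nu[k]), rank at most one
    var = omega[:, :, None] * psi[:, None, :]    # var[k, i, j] = omega[k, i] * psi[k, j]
    eps = rng.standard_normal(mean.shape)
    return mean + np.sqrt(var) * eps, mean

rng = np.random.default_rng(0)
N, d = 4, 3
mu = rng.standard_normal((N, d))
nu = rng.standard_normal((N, d))
omega = np.ones((N, d))
psi = np.ones((N, d))
F, mean = sample_low_rank(mu, nu, omega, psi, rng)
```

Every entry of `mean[k]` is a product of one entry of `mu[k]` and one entry of `nu[k]`, so changing a single encoder output shifts an entire row or column of the mean matrix; this is the structural coupling among locations that the low-rank formulation is meant to provide.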
For the encoder, the number of outputs is further reduced to $4dN$ from $(d^2 + 2d)N$, replacing $d^2 N$ outputs for $(M_{1\phi}(x), \ldots, M_{N\phi}(x))$ with $dN$ outputs for $(\mu_{1\phi}(x), \ldots, \mu_{N\phi}(x))$ and another $dN$ outputs for $(\nu_{1\phi}(x), \ldots, \nu_{N\phi}(x))$. In contrast to Equation 3.7, the two-step sampling process can be expressed as

$$\epsilon_{(i,j,k)} \sim \mathcal{N}(\epsilon_{(i,j,k)}; 0, 1), \quad z_{(i,j,k)} = (\mu_{k\phi}(x)\, \nu_{k\phi}^T(x))_{(i,j)} + \mathrm{diag}(\Omega_{k\phi}(x) \otimes \Psi_{k\phi}(x))_{ij}^{1/2}\, \epsilon_{(i,j,k)}, \quad i, j = 1, \ldots, d, \quad k = 1, \ldots, N, \tag{3.10}$$

where

$$\mathrm{diag}(\Omega_{k\phi}(x) \otimes \Psi_{k\phi}(x))_{ij} = [\mathrm{diag}(\Omega_{k\phi}(x))\, \mathrm{diag}^T(\Psi_{k\phi}(x))]_{(i,j)}.$$

As has been demonstrated in Section 3.1, spatial VAEs require more outputs from encoders than the original VAEs, which slows down the training process. Spatial VAEs via low-rank MVN distributions properly address the problem while achieving appropriate spatial latent representations. According to the experimental results, they outperform the original VAEs in several image generation tasks when similar decoders are used.

Figure 2: Sample face images generated by different VAEs when trained on the CelebA dataset. The first and second rows show training images and images generated by the original VAEs. The remaining three rows are the results of naïve spatial VAEs, spatial VAEs via MVN distributions and spatial VAEs via low-rank MVN distributions, respectively.

4 Experimental Studies

We use the original VAEs as the baseline models in our experiments, as most recent improvements on VAEs are derived from the vector latent representations and can be easily incorporated into our matrix-based models. To elucidate the performance differences of various spatial VAEs, we compare the results of three different spatial VAEs as introduced in Section 3; namely naïve spatial VAEs, spatial VAEs via MVN distributions and spatial VAEs via low-rank MVN distributions. We train the models on the CelebA, CIFAR-10 and MNIST datasets, and analyze sample images generated from the models to evaluate the performance.

For the same task, the encoders of all compared models are composed of the same convolutional neural networks (CNNs) and a fully-connected output layer [15, 14]. While the fully-connected layer may differ as required by different numbers of output units, it only slightly affects the training process. As discussed in Section 3.1, it is reasonable to compare spatial VAEs with the original VAEs in the case that their decoders have similar architectures and model capabilities. Therefore, following the original VAEs, deconvolutional neural networks (DCNNs) are used as decoders in spatial VAEs.
Meanwhile, the total number of trainable parameters in the decoders of all compared models is set to be as similar as possible while accommodating different input sizes.

4.1 CelebA

The CelebA dataset contains 202,599 color face images. The generative models are supposed to generate faces that are similar but not identical to those in the dataset. For this task, the CNNs in the encoders have 3 layers while the decoders are 5 or 6-layer DCNNs corresponding to spatial VAEs and the original VAEs, respectively. This difference is caused by the fact that spatial VAEs have $d \times d$ ($d > 1$) feature maps as latent representations, which require fewer up-sampling operations to obtain outputs. We set $d = 3$ and $N = 64$, and the dimension of $z$ in the original VAEs is 81 in order to have decoders with similar numbers of trainable parameters.

Figure 2 shows sample face images generated by the original VAEs and three different variants of spatial VAEs. It is clear that spatial VAEs can generate images with more details than the original VAEs.
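For the CelebA settings just stated ($d = 3$, $N = 64$, and an 81-dimensional $z$ for the original VAE), the encoder output counts implied by the formulas of Section 3 can be tallied with a small helper. The function below is a hypothetical illustration written for this purpose; its results agree with the counts quoted in Section 4.4.

```python
def encoder_outputs(d, n_maps, z_dim):
    """Number of encoder outputs for each model, following the counts given in Section 3."""
    return {
        "original VAE": 2 * z_dim,                         # mean and variance per latent unit
        "naive spatial VAE": 2 * d * d * n_maps,           # 2C with C = d^2 N
        "spatial VAE via MVN": (d * d + 2 * d) * n_maps,   # M_k plus diag(Omega_k) and diag(Psi_k)
        "spatial VAE via low-rank MVN": 4 * d * n_maps,    # mu_k, nu_k, diag(Omega_k), diag(Psi_k)
    }

counts = encoder_outputs(d=3, n_maps=64, z_dim=81)
for model, n_out in counts.items():
    print(f"{model}: {n_out}")
```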

Due to the lack of explicit spatial information, the original VAEs produce face images with few details, such as hair near the borders. While naïve spatial VAEs seem to address this problem, most faces have only incomplete hair, as naïve spatial VAEs cannot capture the relationships among different locations. Theoretically, spatial VAEs via MVN distributions are able to incorporate interactions among locations. However, the results are strange faces with some distortions. We believe the reason is that adding dependencies among locations through restrictions on distribution variances is not sufficiently effective. Spatial VAEs via low-rank MVN distributions, which have restricted means, tackle this well and generate faces with appealing visual appearances.

4.2 CIFAR-10

Figure 3: Sample images generated by different VAEs when trained on the CIFAR-10 dataset. From top to bottom, the five rows are training images and images generated by the original VAEs, naïve spatial VAEs, spatial VAEs via MVN distributions and spatial VAEs via low-rank MVN distributions, respectively.

The CIFAR-10 dataset consists of 60,000 color images of size $32 \times 32$ in 10 classes. VAEs usually perform poorly in generating photo-realistic images since there are significant differences among images in different classes, indicating that the underlying true distribution of the data is multi-modal. In this case, VAEs tend to output very blurry images [26, 8, 7]. However, comparison among different models can still demonstrate the differences in terms of generative capabilities. In this experiment, we set $d = 3$ and $N = 128$, and the dimension of $z$ in the original VAEs is 150. The encoders have 4 layers while the decoders have 4 or 5 layers. Some sample images are provided in Figure 3. The original VAEs only produce images composed of several colored areas, which is consistent with the results of a similar model reported in [22]. All three implementations of spatial VAEs generate images with more details.
However, naïve spatial VAEs still produce meaningless images as there is no relationship among different parts. The images generated by spatial VAEs via MVN distributions look like distorted objects, which have similar problems to the results on the CelebA dataset. Again, spatial VAEs via low-rank MVN distributions outperform the other models, producing blurry but object-like images.

4.3 MNIST

We perform quantitative analysis on the real-valued MNIST dataset by employing the Parzen window log-likelihood estimates [4]. This evaluation method is used for several generative models where the exact likelihood is not tractable [8, 19]. The results are reported in Table 1, where SVAE is short for spatial VAE. Despite the difference in visual quality of generated images, the spatial VAE via low-rank MVN distributions shares similar quantitative results with the original VAE.

Table 1: Parzen window log-likelihood estimates of test data on the MNIST dataset. We follow the same procedure as in [8].

Model                    Log-Likelihood
Original VAE             297
Naïve SVAE               275
SVAE via MVN             267
SVAE via low-rank MVN    296

Note that generative models for images are supposed to capture the underlying data distribution by maximizing log-likelihood and to generate images that are similar to real ones. However, it has been pointed out in [26] that these two objectives are not consistent, and generative models need to be evaluated directly with respect to the applications for which they were intended. A model that can generate samples with good visual appearances may have poor average log-likelihood on the test dataset and vice versa. Common examples of deep generative models are VAEs and generative adversarial networks (GANs) [8]. VAEs usually have higher average log-likelihood while GANs can generate more photo-realistic images. This is basically caused by the different training objectives of these two models [7]. Currently there is no commonly accepted standard for evaluating generative models.

4.4 Timing Comparison

To show the influence of different spatial VAEs on the training process, we compare the training time on the CelebA dataset. Theoretically, spatial VAEs slow down training due to the larger numbers of outputs from encoders. To keep the number of trainable parameters in decoders roughly equal, we set the dimension of $z$ in the original VAEs to be 81 while $d = 3$ and $N = 64$ for spatial VAEs. According to Section 3, the numbers of outputs from their encoders are 162, 1152, 960, and 768 for the original VAE, naïve spatial VAE, spatial VAE via MVN distributions and spatial VAE via low-rank MVN distributions, respectively. We train our models on an NVIDIA Tesla K40C GPU and report the average time for training one epoch in Table 2. Comparisons of the time for generating 10,000 images are also provided to show that the increase in the total dimension of latent representations does not affect the generation process. The results show consistent relationships between the training time and the number of outputs from encoders; that is, spatial VAEs cost more time than the original VAE, but spatial VAEs via low-rank MVN distributions can alleviate this problem. Moreover, spatial VAEs only slightly slow down the training process since they only affect one single layer in the models.

Table 2: Training and generation time of different models when trained on the CelebA dataset using an NVIDIA Tesla K40C GPU. The average time for training one epoch and the time for generating 10,000 images are reported and compared.

Model                    Training time    Generation time
Original VAE             ... s            ... s
Naïve SVAE               ... s            ... s
SVAE via MVN             ... s            ... s
SVAE via low-rank MVN    ... s            ... s

5 Conclusion
In this work, we propose spatial VAEs for image generation tasks, which improve VAEs by requiring the latent representations to explicitly contain spatial information of images. Specifically, in spatial VAEs, $d \times d$ ($d > 1$) feature maps are sampled to serve as spatial latent representations in contrast to a vector. This is achieved by sampling the latent feature maps from MVN distributions, which can model dependencies between the rows and columns in a matrix. We further propose to employ a low-rank formulation of MVN distributions to establish stronger dependencies. Qualitative results on different datasets show that spatial VAEs via low-rank MVN distributions substantially outperform the original VAEs.

Acknowledgements

This work was supported by the National Science Foundation grants IIS and DBI

References

[1] G. I. Allen and R. Tibshirani, Transposable regularized covariance models with an application to missing data imputation, The Annals of Applied Statistics, 4 (2010).
[2] D. Bahdanau, K. Cho, and Y. Bengio, Neural machine translation by jointly learning to align and translate, arXiv preprint, (2014).
[3] Y. Bengio, A. Courville, and P. Vincent, Representation learning: A review and new perspectives, IEEE Transactions on Pattern Analysis and Machine Intelligence, 35 (2013).
[4] O. Breuleux, Y. Bengio, and P. Vincent, Quickly generating representative samples from an RBM-derived process, Neural Computation, 23 (2011).
[5] Y. Burda, R. Grosse, and R. Salakhutdinov, Importance weighted autoencoders, arXiv preprint, (2015).
[6] C. Doersch, Tutorial on variational autoencoders, arXiv preprint, (2016).
[7] I. Goodfellow, NIPS 2016 tutorial: Generative adversarial networks, arXiv preprint, (2016).
[8] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, Generative adversarial nets, in Advances in Neural Information Processing Systems, 2014.
[9] I. Gulrajani, K. Kumar, F. Ahmed, A. A. Taiga, F. Visin, D. Vazquez, and A. Courville, PixelVAE: A latent variable model for natural images, arXiv preprint, (2016).
[10] A. K. Gupta and D. K. Nagar, Matrix variate distributions, vol. 104, CRC Press, 1999.

[11] D. P. Kingma, T. Salimans, R. Jozefowicz, X. Chen, I. Sutskever, and M. Welling, Improved variational inference with inverse autoregressive flow, in Advances in Neural Information Processing Systems, 2016.
[12] D. P. Kingma and M. Welling, Auto-encoding variational Bayes, arXiv preprint, (2013).
[13] A. Krizhevsky and G. Hinton, Learning multiple layers of features from tiny images, (2009).
[14] A. Krizhevsky, I. Sutskever, and G. E. Hinton, ImageNet classification with deep convolutional neural networks, in Advances in Neural Information Processing Systems, 2012.
[15] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, Gradient-based learning applied to document recognition, Proceedings of the IEEE, 86 (1998).
[16] Y. LeCun, C. Cortes, and C. J. Burges, The MNIST database of handwritten digits.
[17] Z. Liu, P. Luo, X. Wang, and X. Tang, Deep learning face attributes in the wild, in Proceedings of International Conference on Computer Vision (ICCV).
[18] J. Long, E. Shelhamer, and T. Darrell, Fully convolutional networks for semantic segmentation, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015.
[19] A. Makhzani, J. Shlens, N. Jaitly, I. Goodfellow, and B. Frey, Adversarial autoencoders, arXiv preprint, (2015).
[20] A. v. d. Oord, N. Kalchbrenner, and K. Kavukcuoglu, Pixel recurrent neural networks, arXiv preprint, (2016).
[21] A. Radford, L. Metz, and S. Chintala, Unsupervised representation learning with deep convolutional generative adversarial networks, arXiv preprint, (2015).
[22] D. J. Rezende, S. Mohamed, and D. Wierstra, Stochastic backpropagation and approximate inference in deep generative models, arXiv preprint, (2014).
[23] O. Ronneberger, P. Fischer, and T. Brox, U-net: Convolutional networks for biomedical image segmentation, in International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer, 2015.
[24] T. Salimans, A. Karpathy, X. Chen, and D. P. Kingma, PixelCNN++: Improving the PixelCNN with discretized logistic mixture likelihood and other modifications, arXiv preprint, (2017).
[25] I. Sutskever, O. Vinyals, and Q. V. Le, Sequence to sequence learning with neural networks, in Advances in Neural Information Processing Systems, 2014.
[26] L. Theis, A. v. d. Oord, and M. Bethge, A note on the evaluation of generative models, arXiv preprint, (2015).
[27] A. van den Oord, N. Kalchbrenner, L. Espeholt, O. Vinyals, A. Graves, et al., Conditional image generation with PixelCNN decoders, in Advances in Neural Information Processing Systems, 2016.
[28] P. Vincent, H. Larochelle, Y. Bengio, and P.-A. Manzagol, Extracting and composing robust features with denoising autoencoders, in Proceedings of the 25th International Conference on Machine Learning, ACM, 2008.
[29] P. Vincent, H. Larochelle, I. Lajoie, Y. Bengio, and P.-A. Manzagol, Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion, Journal of Machine Learning Research, 11 (2010).
[30] M. D. Zeiler, D. Krishnan, G. W. Taylor, and R. Fergus, Deconvolutional networks, in Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on, IEEE, 2010.
[31] S. Zhao, J. Song, and S. Ermon, Towards deeper understanding of variational autoencoding models, arXiv preprint, (2017).


More information

Human recognition based on ear shape images using PCA-Wavelets and different classification methods

Human recognition based on ear shape images using PCA-Wavelets and different classification methods Meical Devices an Diagnostic Engineering Review Article ISSN: 2399-6854 Human recognition base on ear shape images using PCA-Wavelets an ifferent classification methos Ali Mahmou Mayya1* an Mariam Mohamma

More information

A Plane Tracker for AEC-automation Applications

A Plane Tracker for AEC-automation Applications A Plane Tracker for AEC-automation Applications Chen Feng *, an Vineet R. Kamat Department of Civil an Environmental Engineering, University of Michigan, Ann Arbor, USA * Corresponing author (cforrest@umich.eu)

More information

Symmetric Variational Autoencoder and Connections to Adversarial Learning

Symmetric Variational Autoencoder and Connections to Adversarial Learning Symmetric Variational Autoencoder and Connections to Adversarial Learning Liqun Chen 1 Shuyang Dai 1 Yunchen Pu 1 Erjin Zhou 4 Chunyuan Li 1 Qinliang Su 2 Changyou Chen 3 Lawrence Carin 1 1 Duke University,

More information

1 Surprises in high dimensions

1 Surprises in high dimensions 1 Surprises in high imensions Our intuition about space is base on two an three imensions an can often be misleaing in high imensions. It is instructive to analyze the shape an properties of some basic

More information

arxiv: v4 [cs.lg] 27 Nov 2017 ABSTRACT

arxiv: v4 [cs.lg] 27 Nov 2017 ABSTRACT PIXEL DECONVOLUTIONAL NETWORKS Hongyang Gao Washington State University hongyang.gao@wsu.edu Hao Yuan Washington State University hao.yuan@wsu.edu Zhengyang Wang Washington State University zwang6@eecs.wsu.edu

More information

Implementation and Evaluation of NAS Parallel CG Benchmark on GPU Cluster with Proprietary Interconnect TCA

Implementation and Evaluation of NAS Parallel CG Benchmark on GPU Cluster with Proprietary Interconnect TCA Implementation an Evaluation of AS Parallel CG Benchmark on GPU Cluster with Proprietary Interconnect TCA Kazuya Matsumoto 1, orihisa Fujita 2, Toshihiro Hanawa 3, an Taisuke Boku 1,2 1 Center for Computational

More information

APPLYING GENETIC ALGORITHM IN QUERY IMPROVEMENT PROBLEM. Abdelmgeid A. Aly

APPLYING GENETIC ALGORITHM IN QUERY IMPROVEMENT PROBLEM. Abdelmgeid A. Aly International Journal "Information Technologies an Knowlege" Vol. / 2007 309 [Project MINERVAEUROPE] Project MINERVAEUROPE: Ministerial Network for Valorising Activities in igitalisation -

More information

On the Role of Multiply Sectioned Bayesian Networks to Cooperative Multiagent Systems

On the Role of Multiply Sectioned Bayesian Networks to Cooperative Multiagent Systems On the Role of Multiply Sectione Bayesian Networks to Cooperative Multiagent Systems Y. Xiang University of Guelph, Canaa, yxiang@cis.uoguelph.ca V. Lesser University of Massachusetts at Amherst, USA,

More information

arxiv: v4 [cs.gr] 20 Feb 2019

arxiv: v4 [cs.gr] 20 Feb 2019 GENERATING LIQUID SIMULATIONS WITH DEFORMATION-AWARE NEURAL NETWORKS Lukas Prantl, Boris Bonev & Nils Thuerey Department of Computer Science Technical University of Munich Boltzmannstr. 3, 85748 Garching,

More information