Sampling Table Configurations for the Hierarchical Poisson-Dirichlet Process
1 Sampling Table Configurations for the Hierarchical Poisson-Dirichlet Process
Changyou Chen, Lan Du, Wray Buntine
ANU College of Engineering and Computer Science, The Australian National University
National ICT Australia
September 7
2 Outline
1. The Poisson-Dirichlet Process
2. Table Indicator Representation of the HPDP
3. Experiments: Topic Modeling using the HPDP
4. Conclusion
3 The Poisson-Dirichlet Process
What can We Do with the Poisson-Dirichlet Process?
Applications of the Poisson-Dirichlet process:
- Topic modeling: finding meaningful topics discussed in large-scale document collections; beneficial to automatic document analysis and understanding.
- Computational linguistics: e.g., the n-gram model, adaptor grammars.
- Computer vision: using the PDP/HPDP for image annotation, image segmentation, scene learning, etc.
- Others: data compression, relational modeling, etc.
[Image with annotations: cows, grass, hill, sky, tree...]
4 The Poisson-Dirichlet Process
What is the Poisson-Dirichlet Process?
The Poisson-Dirichlet process takes as input a base distribution and yields as output a discrete distribution that is somewhat similar to the base distribution:
input $y = f(x)$, output $y = \sum_{i=1}^{K} p_i \, \delta_{x_i}$
5 The Poisson-Dirichlet Process
What is the Poisson-Dirichlet Process?
The Poisson-Dirichlet process (PDP) is a random probability measure, or a distribution over distributions. The basic form of the PDP is
$$\sum_{k=1}^{\infty} p_k \, \delta_{X_k}(\cdot)$$
where $p = (p_1, p_2, \ldots)$ is a probability vector, so $p_k \ge 0$ and $\sum_{k=1}^{\infty} p_k = 1$. Also, $\delta_{X_k}(\cdot)$ is a discrete measure concentrated at $X_k$. The values $X_k \in \mathcal{X}$ are drawn i.i.d. from the base measure $H(\cdot)$. The one-parameter version of the PDP is the Dirichlet process (DP), and the two-parameter version is the Pitman-Yor process.
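The basic form above can be sampled directly by stick-breaking. A minimal sketch, assuming the standard two-parameter (Pitman-Yor) stick-breaking construction with discount a and strength b, truncated at K atoms; the function and parameter names are illustrative, not from the talk:

```python
import random

def sample_pdp(a, b, base_sampler, K=1000, rng=None):
    """Truncated stick-breaking approximation of one PDP draw.

    a: discount (0 <= a < 1), b: strength/concentration (b > -a).
    base_sampler: draws one atom X_k from the base measure H.
    Returns (weight, atom) pairs approximating sum_k p_k delta_{X_k}.
    """
    rng = rng or random.Random(0)
    weights, remaining = [], 1.0
    for k in range(1, K + 1):
        # V_k ~ Beta(1 - a, b + k * a); p_k = V_k * prod_{i<k} (1 - V_i)
        v = rng.betavariate(1.0 - a, b + k * a)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    return [(w, base_sampler()) for w in weights]

draws = sample_pdp(a=0.5, b=1.0, base_sampler=lambda: random.gauss(0, 1))
total = sum(w for w, _ in draws)  # total mass of the truncation, near 1
```

The truncation level K controls how much of the unit mass the finite approximation captures; the leftover mass `1 - total` belongs to the infinitely many remaining atoms.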
6 The Poisson-Dirichlet Process
Why Poisson-Dirichlet Processes?
It allows us to extend finite mixture models to infinite mixture models, by putting a PDP prior on the mixture components:
$\gamma \sim P(\gamma \mid \lambda)$, $c \sim \gamma$, $x \sim P(x \mid \theta_c)$; equivalently $\theta \sim G$, $x \sim P(x \mid \theta)$, repeated for each of the $N$ data points.
$P(\gamma \mid \lambda)$ is an infinite discrete distribution with parameter $\lambda$, or a Poisson-Dirichlet distribution. $G$ is the distribution over mixture components $\theta$, or a Poisson-Dirichlet process.
7 The Poisson-Dirichlet Process
Why Poisson-Dirichlet Processes?
It easily extends to model hierarchical discrete distributions, e.g., hierarchical Dirichlet processes (HDP) or the hierarchical Poisson-Dirichlet process (HPDP).
[Figure: (a) the HDP graphical model; (b) the corresponding probability vector hierarchy]
8 The Poisson-Dirichlet Process
The Chinese Restaurant Process
[Figure: (a) the PDP, (b) the PDP marginalized, (c) the CRP]
The Chinese restaurant process (CRP) is the marginalized version of the PDP. Sampling the PDP can be done following the CRP's metaphor.
9 The Poisson-Dirichlet Process
The Chinese Restaurant Process
The CRP assumes a Chinese restaurant with an infinite number of tables, each with infinite capacity. Customers come in and take seats as follows:
- The first customer sits at the first unoccupied table with probability 1.
- Each subsequent (n + 1)-th customer either:
  - sits at an occupied table, sharing the dish with the customers already seated there, with probability proportional to the number of customers that have already sat at that table, or
  - sits at an unoccupied table with probability proportional to the sum of the strength parameter and the total table count.
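The seating rules above can be simulated in a few lines. A minimal sketch, assuming the standard two-parameter CRP probabilities (an occupied table with n_t customers is chosen with weight n_t - a, a new table with weight b + a*T, for discount a and strength b); the function name is illustrative:

```python
import random

def crp_seating(num_customers, a, b, rng=None):
    """Simulate the two-parameter Chinese restaurant process.

    Returns the list of per-table occupancy counts after seating
    num_customers customers one by one.
    """
    rng = rng or random.Random(42)
    tables = []  # tables[t] = number of customers seated at table t
    for n in range(num_customers):
        # weight n_t - a for each occupied table, b + a*T for a new one
        weights = [n_t - a for n_t in tables] + [b + a * len(tables)]
        r = rng.random() * sum(weights)
        for t, w in enumerate(weights):
            r -= w
            if r < 0:
                break
        if t == len(tables):
            tables.append(1)  # open a new table (always happens for n == 0)
        else:
            tables[t] += 1
    return tables

counts = crp_seating(100, a=0.5, b=1.0)
assert sum(counts) == 100  # every customer is seated exactly once
```

With a positive discount a, the new-table weight grows with the table count T, which is what gives the PDP its power-law behaviour compared to the DP (a = 0).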
10 Outline
1. The Poisson-Dirichlet Process
2. Table Indicator Representation of the HPDP
3. Experiments: Topic Modeling using the HPDP
4. Conclusion
11 HPDP in the Chinese Restaurant Process Metaphor
This shows a simple linear hierarchy of 3 Chinese restaurants. We'll bring customers in and watch the seating.
z: dish; u: table indicator
12-28 HPDP in the Chinese Restaurant Process Metaphor
[Animation frames: customers arrive one by one in the three-restaurant hierarchy; each frame shows the sampled dish z, the customer's table indicator u, and the per-restaurant table counts t. A customer who joins an existing table without opening a new one is marked u = NA.]
29 HPDP in the Chinese Restaurant Process Metaphor
We can also add customers at any restaurant.
[Animation frame: table counts t after customers are added at interior restaurants.]
30 HPDP Statistics used in Sampling
[Figure: the same seating written in two ways. Seating arrangements representation: for each restaurant, the per-table customer counts m (m = counts at each table) and the table identities. Table indicators representation: for each restaurant, the per-dish customer counts n_{jk} and table counts t_{jk} (t = number of tables), together with each customer's dish z and table indicator u.]
31 Table Indicator Representation of the HPDP
The above u is called the table indicator, defined as:
Definition (Table indicator u_l): the table indicator u_l for each customer l is an auxiliary latent variable which indicates up to which level in the restaurant hierarchy l has contributed a table count (i.e., activated a new table).
This variable is enough to represent the HPDP: the other statistics (#customers and #tables in each restaurant) can easily be reconstructed from the table indicators. A blocked Gibbs sampler can be derived using this representation.
32 Table Counts Representation of the HPDP
Theorem (Teh, 2006 [1]; Buntine and Hutter, 2010 [2]): posterior of the HPDP with the table count representation:
$$P(z_{1:J}, t_{1:J} \mid H) = \prod_j \frac{(b_j \mid a_j)_{T_j}}{(b_j)_{N_j}} \prod_k S^{n_{jk}}_{t_{jk},\, a_j}$$
- $n_{jk}$: #customers in the j-th restaurant eating dish k
- $t_{jk}$: #tables in the j-th restaurant serving dish k
- $N_j := \sum_k n_{jk}$, $T_j := \sum_k t_{jk}$
- $S^{N}_{M,a}$: the generalized Stirling number
- $(x \mid y)_N$: the Pochhammer symbol with increment y
- $H$: base distribution
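The generalized Stirling numbers $S^{N}_{M,a}$ in the posterior can be tabulated with the usual recursion $S^{n}_{m,a} = S^{n-1}_{m-1,a} + (n - 1 - m a)\,S^{n-1}_{m,a}$, $S^{0}_{0,a} = 1$. A sketch (plain floats for clarity; these numbers grow quickly, so practical samplers keep the table in log space):

```python
def generalized_stirling(N, a):
    """Table S[n][m] of generalized Stirling numbers S^n_{m,a}
    for 0 <= m <= n <= N, built bottom-up from the recursion
    S[n][m] = S[n-1][m-1] + (n - 1 - m*a) * S[n-1][m]."""
    S = [[0.0] * (N + 1) for _ in range(N + 1)]
    S[0][0] = 1.0
    for n in range(1, N + 1):
        for m in range(1, n + 1):
            S[n][m] = S[n - 1][m - 1] + (n - 1 - m * a) * S[n - 1][m]
    return S

S = generalized_stirling(5, 0.0)
# With a = 0 these reduce to the unsigned Stirling numbers of the
# first kind, e.g. S[4][2] == 11.
```

As a sanity check, $S^{n}_{1,a} = \prod_{i=1}^{n-1}(i - a)$, the number of ways (up to the discount) of seating n customers at a single table.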
33 Table Indicator Representation of the HPDP
Theorem: posterior of the HPDP in our representation:
$$P(z_{1:J}, u_{1:J} \mid H) = \prod_j \frac{(b_j \mid a_j)_{T_j}}{(b_j)_{N_j}} \prod_k \frac{t_{jk}!\,(n_{jk} - t_{jk})!}{n_{jk}!}\, S^{n_{jk}}_{t_{jk},\, a_j}$$
The symbols are the same as before, except that $t_{jk}$ can now be constructed from the table indicators:
$$t_{jk} = \sum_{j' \in T(j)} \sum_{l \in D(j')} \delta_{z_l = k}\, \delta_{u_l \le d(j)}$$
where $T(j)$ is the subtree rooted at restaurant j, $D(j')$ the customers of restaurant j', and $d(j)$ the level of j in the hierarchy.
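The reconstruction of $t_{jk}$ can be coded directly from this formula. A sketch for a linear three-restaurant hierarchy like the one in the earlier slides (the data layout and names are illustrative; a customer who opened no table, shown as u = NA on the slides, is encoded here as u = math.inf so it never passes the $u_l \le d(j)$ test):

```python
import math
from collections import defaultdict

# illustrative linear hierarchy: restaurant 0 (root, level 1) -> 1 -> 2
parent = {0: None, 1: 0, 2: 1}
depth = {0: 1, 1: 2, 2: 3}

def table_counts(customers):
    """t_{jk} = number of customers l in the subtree of restaurant j
    with dish z_l = k and u_l <= d(j), i.e. customers whose table
    contributions reach j's level or above.
    Each customer is a triple (restaurant, dish z, indicator u)."""
    t = defaultdict(int)
    for j_prime, z, u in customers:
        # walking up the ancestors of j' enumerates exactly the
        # restaurants j with j' in T(j)
        j = j_prime
        while j is not None:
            if u <= depth[j]:
                t[(j, z)] += 1
            j = parent[j]
    return dict(t)

# one leaf customer opened tables all the way up to the root (u = 1),
# one only at the leaf (u = 3), one opened no table at all (u = NA)
counts = table_counts([(2, "a", 1), (2, "a", 3), (2, "a", math.inf)])
# counts == {(2, "a"): 2, (1, "a"): 1, (0, "a"): 1}
```

This is why the representation needs no separately stored table counts: a single integer per customer recovers every $t_{jk}$ in the hierarchy.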
34 How does our Sampler Work?
Sampling a DP with dim = 5, data = 5 points:
[Figure, comparing sampling with seating arrangements, the table indicator sampler, and the collapsed table sampler over time (ms): (a) estimate of total table counts, (b) std. dev. of total table counts, (c) estimate of the concentration parameter b, (d) std. dev. of the concentration parameter b.]
35 Outline
1. The Poisson-Dirichlet Process
2. Table Indicator Representation of the HPDP
3. Experiments: Topic Modeling using the HPDP
4. Conclusion
36 Experiments: Topic Modeling using the HPDP
Datasets, Compared Algorithms and Evaluations
We use five datasets (from Blogs, Reuters, NIPS, UCI): Health, Person, Obama, NIPS, and Enron.
[Table: # words, # documents, and vocabulary size for each of the five datasets.]
Compared algorithms:
- Teh et al.'s sampling by direct assignment (SDA)
- Buntine and Hutter's collapsed table sampler (CTS) [2]
- Our proposed blocked Gibbs sampler (STC)
- STC initialized with SDA, denoted SDA+STC
We used the unbiased left-to-right algorithm [3] to calculate the testing perplexities for the topic models, which is a standard evaluation.
37 Experiments: Topic Modeling using the HPDP
Perplexities
[Table: test log(perplexities) on the five datasets (Health, Person, Obama, Enron, NIPS) for SDA, CTS, SDA+STC and STC, at two settings of I per dataset.]
38 Experiments: Topic Modeling using the HPDP
Convergence Speed
[Figure: testing perplexity versus training time (s) for SDA, STC, CTS and SDA+STC: (a), (b) Health, (c), (d) Person, (e), (f) NIPS, each at two settings of I.]
39 Outline
1. The Poisson-Dirichlet Process
2. Table Indicator Representation of the HPDP
3. Experiments: Topic Modeling using the HPDP
4. Conclusion
40 Conclusion
- Proposed a new representation for the HPDP based on the CRP metaphor.
- All useful statistics of the CRP can be reconstructed from the table indicators.
- No dynamic memory allocation is needed for table counts.
- A blocked Gibbs sampler can easily be derived; e.g., we do not have to sample the table counts separately.
- Experimental results on topic modeling indicate fast mixing of the proposed algorithm.
- All other PDP-related applications can be adapted to this representation.
41 References
[1] Teh, Y.W.: A Bayesian interpretation of interpolated Kneser-Ney. Technical Report TRA2/06, School of Computing, National University of Singapore (2006)
[2] Buntine, W., Hutter, M.: A Bayesian review of the Poisson-Dirichlet process. Technical Report arXiv:1007.0296, NICTA and ANU, Australia (2010)
[3] Buntine, W.: Estimating likelihoods for topic models. In: ACML 2009 (2009)
42 Thanks for your attention!!!
More informationThis chapter explains two techniques which are frequently used throughout
Chapter 2 Basic Techniques This chapter explains two techniques which are frequently used throughout this thesis. First, we will introduce the concept of particle filters. A particle filter is a recursive
More informationNote Set 4: Finite Mixture Models and the EM Algorithm
Note Set 4: Finite Mixture Models and the EM Algorithm Padhraic Smyth, Department of Computer Science University of California, Irvine Finite Mixture Models A finite mixture model with K components, for
More informationNew Results of Fully Bayesian
of Fully Bayesian UCI February 7, 2012 of Fully Bayesian Calibration Samples Principle Component Analysis Model Building Three source parameter sampling schemes Simulation Quasar data sets New data sets
More informationHe (Ethan) Zhao PhD Student Monash University
Topic Structure Learning for Interpretable Text Understanding He (Ethan) Zhao PhD Student Monash University 1 Topic modelling / # 0 $ "#! "# - {1,,. # } Prior on 1 and 2: LDA: Dirichlet on 1 and 2 HDPLDA:
More informationSemantic Segmentation. Zhongang Qi
Semantic Segmentation Zhongang Qi qiz@oregonstate.edu Semantic Segmentation "Two men riding on a bike in front of a building on the road. And there is a car." Idea: recognizing, understanding what's in
More informationRandomized Algorithms for Fast Bayesian Hierarchical Clustering
Randomized Algorithms for Fast Bayesian Hierarchical Clustering Katherine A. Heller and Zoubin Ghahramani Gatsby Computational Neuroscience Unit University College ondon, ondon, WC1N 3AR, UK {heller,zoubin}@gatsby.ucl.ac.uk
More informationEfficient Data Structures for Massive N-Gram Datasets
Efficient ata Structures for Massive N-Gram atasets Giulio Ermanno Pibiri University of Pisa and ISTI-CNR Pisa, Italy giulio.pibiri@di.unipi.it Rossano Venturini University of Pisa and ISTI-CNR Pisa, Italy
More informationAssignment 2. Unsupervised & Probabilistic Learning. Maneesh Sahani Due: Monday Nov 5, 2018
Assignment 2 Unsupervised & Probabilistic Learning Maneesh Sahani Due: Monday Nov 5, 2018 Note: Assignments are due at 11:00 AM (the start of lecture) on the date above. he usual College late assignments
More informationStatistical techniques for data analysis in Cosmology
Statistical techniques for data analysis in Cosmology arxiv:0712.3028; arxiv:0911.3105 Numerical recipes (the bible ) Licia Verde ICREA & ICC UB-IEEC http://icc.ub.edu/~liciaverde outline Lecture 1: Introduction
More informationPartitioning Algorithms for Improving Efficiency of Topic Modeling Parallelization
Partitioning Algorithms for Improving Efficiency of Topic Modeling Parallelization Hung Nghiep Tran University of Information Technology VNU-HCMC Vietnam Email: nghiepth@uit.edu.vn Atsuhiro Takasu National
More informationA GENERAL GIBBS SAMPLING ALGORITHM FOR ANALYZING LINEAR MODELS USING THE SAS SYSTEM
A GENERAL GIBBS SAMPLING ALGORITHM FOR ANALYZING LINEAR MODELS USING THE SAS SYSTEM Jayawant Mandrekar, Daniel J. Sargent, Paul J. Novotny, Jeff A. Sloan Mayo Clinic, Rochester, MN 55905 ABSTRACT A general
More informationCS5220: Assignment Topic Modeling via Matrix Computation in Julia
CS5220: Assignment Topic Modeling via Matrix Computation in Julia April 7, 2014 1 Introduction Topic modeling is widely used to summarize major themes in large document collections. Topic modeling methods
More informationBehavioral Data Mining. Lecture 18 Clustering
Behavioral Data Mining Lecture 18 Clustering Outline Why? Cluster quality K-means Spectral clustering Generative Models Rationale Given a set {X i } for i = 1,,n, a clustering is a partition of the X i
More informationConditional Random Fields as Recurrent Neural Networks
BIL722 - Deep Learning for Computer Vision Conditional Random Fields as Recurrent Neural Networks S. Zheng, S. Jayasumana, B. Romera-Paredes V. Vineet, Z. Su, D. Du, C. Huang, P.H.S. Torr Introduction
More informationSplitting Algorithms
Splitting Algorithms We have seen that slotted Aloha has maximal throughput 1/e Now we will look at more sophisticated collision resolution techniques which have higher achievable throughput These techniques
More informationMachine Learning. B. Unsupervised Learning B.1 Cluster Analysis. Lars Schmidt-Thieme, Nicolas Schilling
Machine Learning B. Unsupervised Learning B.1 Cluster Analysis Lars Schmidt-Thieme, Nicolas Schilling Information Systems and Machine Learning Lab (ISMLL) Institute for Computer Science University of Hildesheim,
More informationMore on Learning. Neural Nets Support Vectors Machines Unsupervised Learning (Clustering) K-Means Expectation-Maximization
More on Learning Neural Nets Support Vectors Machines Unsupervised Learning (Clustering) K-Means Expectation-Maximization Neural Net Learning Motivated by studies of the brain. A network of artificial
More informationMonte Carlo Methods and Statistical Computing: My Personal E
Monte Carlo Methods and Statistical Computing: My Personal Experience Department of Mathematics & Statistics Indian Institute of Technology Kanpur November 29, 2014 Outline Preface 1 Preface 2 3 4 5 6
More informationMachine Learning at the Limit
Machine Learning at the Limit John Canny*^ * Computer Science Division University of California, Berkeley ^ Yahoo Research Labs @GTC, March, 2015 My Other Job(s) Yahoo [Chen, Pavlov, Canny, KDD 2009]*
More informationThe Open University s repository of research publications and other research outputs. Search Personalization with Embeddings
Open Research Online The Open University s repository of research publications and other research outputs Search Personalization with Embeddings Conference Item How to cite: Vu, Thanh; Nguyen, Dat Quoc;
More informationData Association for Semantic World Modeling from Partial Views
Data Association for Semantic World Modeling from Partial Views Lawson L.S. Wong, Leslie Pack Kaelbling, Tomás Lozano-Pérez CSAIL, MIT, Cambridge, MA 02139 { lsw, lpk, tlp }@csail.mit.edu Abstract Autonomous
More informationImage-Based Rendering and Light Fields
CS194-13: Advanced Computer Graphics Lecture #9 Image-Based Rendering University of California Berkeley Image-Based Rendering and Light Fields Lecture #9: Wednesday, September 30th 2009 Lecturer: Ravi
More informationA Co-Clustering approach for Sum-Product Network Structure Learning
Università degli Studi di Bari Dipartimento di Informatica LACAM Machine Learning Group A Co-Clustering approach for Sum-Product Network Antonio Vergari Nicola Di Mauro Floriana Esposito December 8, 2014
More informationClustering: Classic Methods and Modern Views
Clustering: Classic Methods and Modern Views Marina Meilă University of Washington mmp@stat.washington.edu June 22, 2015 Lorentz Center Workshop on Clusters, Games and Axioms Outline Paradigms for clustering
More informationIntroduction to Machine Learning. Xiaojin Zhu
Introduction to Machine Learning Xiaojin Zhu jerryzhu@cs.wisc.edu Read Chapter 1 of this book: Xiaojin Zhu and Andrew B. Goldberg. Introduction to Semi- Supervised Learning. http://www.morganclaypool.com/doi/abs/10.2200/s00196ed1v01y200906aim006
More informationProbabilistic Graphical Models
Overview of Part One Probabilistic Graphical Models Part One: Graphs and Markov Properties Christopher M. Bishop Graphs and probabilities Directed graphs Markov properties Undirected graphs Examples Microsoft
More informationScalable Approximate Bayesian Inference for Outlier. Detection under Informative Sampling
Journal of Machine Learning Research 17 (2016) 1-49 Submitted 3/15; Revised 1/16; Published 12/16 Scalable Approximate Bayesian Inference for Outlier Detection under Informative Sampling Terrance D. Savitsky
More informationA Topography-Preserving Latent Variable Model with Learning Metrics
A Topography-Preserving Latent Variable Model with Learning Metrics Samuel Kaski and Janne Sinkkonen Helsinki University of Technology Neural Networks Research Centre P.O. Box 5400, FIN-02015 HUT, Finland
More informationBayesian model ensembling using meta-trained recurrent neural networks
Bayesian model ensembling using meta-trained recurrent neural networks Luca Ambrogioni l.ambrogioni@donders.ru.nl Umut Güçlü u.guclu@donders.ru.nl Yağmur Güçlütürk y.gucluturk@donders.ru.nl Julia Berezutskaya
More informationTSIN01 Information Networks Lecture 8
TSIN01 Information Networks Lecture 8 Danyo Danev Division of Communication Systems Department of Electrical Engineering Linköping University, Sweden September 24 th, 2018 Danyo Danev TSIN01 Information
More informationK-means and Hierarchical Clustering
K-means and Hierarchical Clustering Xiaohui Xie University of California, Irvine K-means and Hierarchical Clustering p.1/18 Clustering Given n data points X = {x 1, x 2,, x n }. Clustering is the partitioning
More informationOutlines. Perspective Hierarchical Dirichlet Process (phdp) User-Tagged Image Modeling
Perspective Hierarchical Dirichlet Process for User-Tagged Image Modeling Slide 1: Thanks for your attendance, in the following I would like to present our paper Perspective Hierarchical Dirichlet Process
More informationSearch Engines. Information Retrieval in Practice
Search Engines Information Retrieval in Practice All slides Addison Wesley, 2008 Classification and Clustering Classification and clustering are classical pattern recognition / machine learning problems
More information