DANCEREPRODUCER: AN AUTOMATIC MASHUP MUSIC VIDEO GENERATION SYSTEM BY REUSING DANCE VIDEO CLIPS ON THE WEB

Size: px

Start display at page:

Download "DANCEREPRODUCER: AN AUTOMATIC MASHUP MUSIC VIDEO GENERATION SYSTEM BY REUSING DANCE VIDEO CLIPS ON THE WEB"

Trevor Atkins
5 years ago
Views:

1 DANCEREPRODUCER: AN AUTOMATIC MASHUP MUSIC VIDEO GENERATION SYSTEM BY REUSING DANCE VIDEO CLIPS ON THE WEB Tomoyasu Nakano 1 Sora Murofushi 3 Masataka Goto 2 Shigeo Morishima 3 1 t.nakano[at]aist.go.jp 2 m.goto[at]aist.go.jp 3 shigeo[at]waseda.jp ABSTRACT DanceReProducer 1. INTRODUCTION MAD movies mashup videos 1st generation (primary or original) content Copyright: 2011 Tomoyasu Nakano et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 Unported License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Original content (1st generation) Mashup music videos (User-generated video clips) 2nd generation new new 3rd generation new Nth generation... Figure 1 2nd generation (secondary or derivative) content DanceReProducer

2 e.g. N 2. RELATED WORK 3. SYSTEM DESIGN 3.1 Criteria of natural/skillful relationships between an image sequence and music Local relationships Rhythm e.g. Automatic mashup music video generation system Web Segment N Output Input Estimated music structure Stretch and Concatenate Chorus A A B B B A A B B B C C C Figure 2 DanceReProducer Impression Context relationships Music structure e.g. Temporal continuity 3.2 Image sequence generation Automatic image sequence generation

2 Interface Figure 4 Interactive re-selection of a generated image

3 Figure 3 visual unit Rhythmic synchronization Impression synchronization Music structure Temporal continuity Interface Figure 4 Interactive re-selection of a generated image sequence e.g. Jumping to the beginning of sections 4. INTERNAL MECHANISM OF DANCEREPRODUCER

4 Database construction Web A Gather videos Resampling (30 fps, 44.1kHz) B Extract frame feature 30 fps 44.1kHz 1 bar Beat tracking 1 frame C Extract bar-level feature 1 bar Resampling (16 points) D DCT (3rd order with DC) 1 feature vector ( = + ) E Construct Database Video generation Input music F Extract bar-level feature DCT input tempo G Reconstruct H Train mapping model I Select visual unit Output Database Viterbi search under the Euclidean distance < 20% > 20% PCA weighting calculation clustering mapping linear regression A A A A B B Music structure Stretch and concatenate of the unit Figure Database construction frame-time frame features bar-level features Beat tracking e.g Frame feature extraction (Music) Frame feature extraction (Image sequence)

5 Bar-level feature extraction bar-level feature e.g. 4.2 Video generation 20% N N 95% Linear regression models for multiple clusters k Image sequence selection under the criteria for natural/skillful relationships d(n, k m ) n(1 n N) m k d(n, k m ) ch(n) =1 c l (n, k m ) = ch(k m )=1, p c d(n, k m ) c a (n, k m ) =min τ,μ c l (n, k m ) (μ = m κ = k 1) +c a (n 1,κ μ ) st(n) st(n 1). p t c l (n, k m ) +c a (n 1,κ μ ) ch(n) n st(n) p c p t N

6 d min d min =argmin c a (N,k m ). k,m 4.3 Model training weighted according to view counts ω V c w = max (α log 10 (V c ) β,0). α β , 000 ω =1 100, 000 ω =3 ω ω =2 5. IMPLEMENTATION OF DANCEREPRODUCER 5.1 Dataset MikuMikuDance (MMD) NicoNicoDouga 5.2 Trial usage and introspective comments 6. CONCLUSION DanceReProducer n

7 Acknowledgments IPSJ SIG Technical Reports 2007-MUS-069 IEEE Trans. on Speech and Audio Processing 7. REFERENCES Journal of Information Processing Society of Japan Proc. of the 2008 Computers in Music Modeling and Retrieval Conference Journal of New Music Research IPSJ Transactions on Computer Vision and Image Media Proc. of the 12th annual ACM international conference on Multimedia Proc. of the 32nd IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP2007) Proc. of the tenth ACM international conference on Multimedia Proc. of the 12th annual ACM international conference on Multimedia IEEE Trans. on Audio, Speech, and Language Processing IEEE Trans. on Circuits and Systems for Video Technology

Music Signal Spotting Retrieval by a Humming Query Using Start Frame Feature Dependent Continuous Dynamic Programming

Music Signal Spotting Retrieval by a Humming Query Using Start Frame Feature Dependent Continuous Dynamic Programming Takuichi Nishimura Real World Computing Partnership / National Institute of Advanced