Generative Autoregressive Networks for 3D Dancing Move Synthesis From Music
Ahn, Hyemin(Seoul National University)
United States | IEEE Robotics and Automation Letters
2020-03-02
Generators, Task_analysis, Skeleton, Music
Cited by 6
IEEE Robotics and Automation Letters
Date of Publication: 02 March 2020
Hyemin Ahn1, Jaehun Kim2, Kihyun Kim1, Songhwai Oh1
1 Department of Electrical and Computer Engineering and ASRI, Seoul National University
2 Delft University of Technology
This letter proposes a framework that generates a sequence of three-dimensional human dance poses for a given piece of music. The proposed framework consists of three components: a music feature encoder, a pose generator, and a music genre classifier. We focus on integrating these components to generate realistic 3D human dance moves from music, which can be applied to artificial agents and humanoid robots. The trained dance pose generator, which is a generative autoregressive model, is able to synthesize a dance sequence longer than 1,000 pose frames. Experimental results on dance sequences generated from various songs show that the proposed method generates human-like dance moves for a given piece of music. In addition, a generated 3D dance sequence is applied to a humanoid robot, showing that the proposed framework can make a robot dance just by listening to music.
In this letter, we have proposed a machine-learning-based framework for synthesizing 3D dance motion from input music. The proposed framework consists of three parts: a music feature encoder, a pose generator, and a music genre classifier. From the input music, the music feature encoder extracts a set of audio features. Based on these features, the music genre classifier determines the genre of the music, and the pose generator trained for that genre is used to generate the dance pose sequence for all frames. The proposed pose generator is a generative autoregressive model, which takes the current output pose as an input for generating the next pose frame.
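The data flow described above (encode the music, classify its genre, then roll out the genre-specific generator one frame at a time) can be sketched as follows. This is only an illustrative sketch, not the letter's implementation: the function names, the two-number audio feature, the energy-based genre rule, the 51-value pose layout, and the toy update rule are all assumptions standing in for the trained networks.

```python
from typing import Callable, Dict, List, Tuple

Pose = List[float]          # flattened 3D joint coordinates (hypothetical layout)
AudioFeature = List[float]  # one feature vector per pose frame
Generator = Callable[[Pose, AudioFeature], Pose]


def encode_music(audio_frames: List[List[float]]) -> List[AudioFeature]:
    """Stand-in music feature encoder: mean and peak magnitude of each frame."""
    return [[sum(f) / len(f), max(abs(x) for x in f)] for f in audio_frames]


def classify_genre(features: List[AudioFeature], genres: List[str]) -> str:
    """Stand-in genre classifier: buckets the average peak magnitude."""
    energy = sum(f[1] for f in features) / len(features)
    index = min(int(energy * len(genres)), len(genres) - 1)
    return genres[index]


def make_toy_generator(step: float) -> Generator:
    """Stand-in for a pose generator trained on a single genre."""
    def next_pose(prev_pose: Pose, feature: AudioFeature) -> Pose:
        # Toy rule: shift every coordinate by an amount tied to the audio.
        return [p + step * feature[0] for p in prev_pose]
    return next_pose


def synthesize_dance(
    audio_frames: List[List[float]],
    generators: Dict[str, Generator],
    initial_pose: Pose,
) -> Tuple[str, List[Pose]]:
    """Encode the music, pick the genre's generator, and roll it out."""
    features = encode_music(audio_frames)
    genre = classify_genre(features, sorted(generators))
    next_pose = generators[genre]

    poses = [initial_pose]
    for feature in features:
        # Autoregressive step: the previous output pose is fed back as input.
        poses.append(next_pose(poses[-1], feature))
    return genre, poses[1:]


if __name__ == "__main__":
    audio = [[0.1, 0.2, 0.3]] * 1200  # 1,200 dummy audio frames
    generators = {
        "genre_a": make_toy_generator(0.01),
        "genre_b": make_toy_generator(0.02),
    }
    genre, poses = synthesize_dance(audio, generators, [0.0] * 51)
    print(genre, len(poses))
```

Because each step depends only on the previous pose and the current audio feature, the rollout length is set by the length of the music rather than by a fixed output window, which is the property that allows sequences longer than 1,000 frames.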
The disadvantage of the proposed method is that the pose generator must be trained separately for each genre. When a single model was trained on all dance genres together, we observed that it generated unrealistic dance moves. To construct a model that can learn the patterns of various dance genres, it will be necessary to apply a multi-task learning technique, which we leave as future work.