
New dataset: MidiCaps - A Large-scale Dataset of Caption-annotated MIDI Files

I am thrilled to share that MidiCaps - A Large-scale Dataset of Caption-annotated MIDI Files - has been accepted at the ISMIR conference. MidiCaps is a large-scale dataset of 168,385 MIDI music files with descriptive text captions and a set of extracted musical features. The captions were produced through a captioning pipeline that combines MIR feature extraction with the LLM Claude 3, which captions the data from the extracted features via an in-context learning task. The framework used to generate the captions is available open source on GitHub.
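To illustrate the general idea of such a pipeline, here is a minimal sketch of the feature-extraction-then-prompt step, assuming pretty_midi for feature extraction. The feature set, the helper functions, and the prompt wording are illustrative assumptions, not the actual MidiCaps pipeline (which is documented in the open-source repository).

```python
# Illustrative sketch only: extract a few MIR-style features from a MIDI file
# and format them into a prompt that an LLM captioner could complete.
import pretty_midi

def extract_basic_features(midi_path: str) -> dict:
    """Pull a handful of simple musical features from a MIDI file."""
    midi = pretty_midi.PrettyMIDI(midi_path)
    return {
        "tempo_bpm": round(midi.estimate_tempo()),
        "duration_s": round(midi.get_end_time(), 1),
        "instruments": [pretty_midi.program_to_instrument_name(inst.program)
                        for inst in midi.instruments if not inst.is_drum],
        "time_signatures": [f"{ts.numerator}/{ts.denominator}"
                            for ts in midi.time_signature_changes],
    }

def build_caption_prompt(features: dict) -> str:
    """Format extracted features into an in-context prompt for an LLM captioner."""
    return ("Write a short, natural-language description of a piece of music "
            f"with these properties: {features}")

# Hypothetical usage (file path is a placeholder):
# prompt = build_caption_prompt(extract_basic_features("song.mid"))
# The prompt, together with a few in-context feature/caption example pairs,
# would then be sent to the LLM (Claude 3 in the paper) to produce a caption.
```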

Mustango: Toward Controllable Text-to-Music Generation

Excited to announce Mustango, a powerful multimodal model for generating music from textual prompts. Mustango leverages a latent diffusion model conditioned on textual prompts (encoded using Flan-T5) and various musical features; a minimal sketch of the text-conditioning step follows the list below. Try the demo! What makes it different from the rest?
-- greater controllability over the generated music.
-- trained on a large dataset generated using ChatGPT and musical manipulations.
-- superior performance over its predecessors, as rated by expert listeners.
-- open source!
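As a rough illustration of the text-conditioning step, the sketch below encodes a prompt with a Flan-T5 encoder via Hugging Face transformers. The checkpoint name and prompt are assumptions for illustration; this is not the full Mustango pipeline, which also conditions the diffusion model on extracted musical features.

```python
# Minimal sketch of text conditioning with a Flan-T5 encoder (illustrative only).
import torch
from transformers import AutoTokenizer, T5EncoderModel

checkpoint = "google/flan-t5-large"  # assumed checkpoint; any Flan-T5 encoder behaves the same way
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
encoder = T5EncoderModel.from_pretrained(checkpoint)

prompt = "A calm acoustic guitar melody in A minor at 70 bpm"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    # The encoder's hidden states are the text embeddings that a latent
    # diffusion model can cross-attend to while denoising the audio latents.
    text_embedding = encoder(**inputs).last_hidden_state

print(text_embedding.shape)  # (1, sequence_length, hidden_size)
```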

New roadmap paper on the role of music technology for health care and well-being

Two years ago, I attended the Lorentz workshop on Music, Health, and Computing at Leiden University. After a long and thorough process, a roadmap paper has now been published. All of the workshop attendees, experts in either music therapy or music technology, put their heads together to create this important roadmap for the future of this new interdisciplinary field.

New paper on Multimodal Deep Models for Predicting Affective Responses Evoked by Movies

Together with my PhD student Thao and Prof. Gemma Roig (MIT/Frankfurt University), I published a new paper, "Multimodal Deep Models for Predicting Affective Responses Evoked by Movies", in the Proceedings of the 2nd International Workshop on Computer Vision for Physiological Measurement, held as part of ICCV 2019 in Seoul, South Korea. A preprint is available here.

IEEE Conference on Games - talk on music game for cognitive and physical wellbeing for elderly

Today I gave a talk at the IEEE Conference on Games at Queen Mary University of London. The prototype game was developed as part of a UROP project led by Prof. Kat Agres (NUS), Prof. Simon Lui (Tencent), and myself (SUTD). Credit for the bulk of the development goes to Xuexuan Zhou!

The full game is described in our proceedings paper, and the slides are available here.

New Frontiers in Psychology paper on A Novel Interface for the Graphical Analysis of Music Practice Behaviours

The paper I wrote together with Janis Sokolovskis and Elaine Chew from QMUL, called "A Novel Interface for the Graphical Analysis of Music Practice Behaviours", was just published in Frontiers in Psychology - Human-Media Interaction. Read the full article here or download the pdf.

Grant from MIT-SUTD IDC on "An intelligent system for understanding and matching perceived emotion from video with music"

A few months ago, Prof. Gemma Roig (PI, SUTD), Prof. Dorien Herremans (co-PI, SUTD), Dr. Kat Agres (co-PI, A*STAR), and Dr. Eran Egozy (co-PI, MIT, creator of Guitar Hero) were awarded a new grant from the International Design Center (a joint research institute of MIT and SUTD) for 'An intelligent system for understanding and matching perceived emotion from video with music'. This is an exciting opportunity and the birth of our new Affective Computing Lab at SUTD, which links the computer vision lab and the AMAAI lab.

Workshop on Deep Learning and Music

International Workshop on Deep Learning for Music
In conjunction with the 2017 International Joint Conference on Neural Networks (IJCNN 2017)

14-19 May (1 day), Anchorage

There has been tremendous interest in deep learning across many fields of study. Recently, these techniques have also gained popularity in the field of music. Projects such as Magenta (the Google Brain team's music generation project) and Jukedeck, among others, testify to their potential.
