
Presenting MidiCaps and MIRFLEX at ISMIR in San Francisco

Exciting news from the AMAAI Lab at this year’s ISMIR conference in San Francisco! We were thrilled to showcase some of our research:

MidiCaps
Presented by Jan Melechovsky and Abhinaba Roy, MidiCaps is the first large-scale open MIDI dataset with text captions. This resource will enable the development of the very first text-to-MIDI models (stay tuned: our lab's model is coming soon!).

New dataset: MidiCaps - A Large-scale Dataset of Caption-annotated MIDI Files

I am thrilled to share that MidiCaps - A Large-scale Dataset of Caption-annotated MIDI Files has been accepted at the ISMIR conference. MidiCaps is a large-scale dataset of 168,385 MIDI music files, each paired with a descriptive text caption and a set of extracted musical features. The captions were produced by a captioning pipeline that combines MIR feature extraction with the Claude 3 LLM, which turns the extracted features into captions via in-context learning. The captioning framework is available as open source on GitHub.
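
For readers curious what such a pipeline can look like in practice, below is a minimal sketch of the feature-extraction-plus-LLM-captioning idea. It is not the released MidiCaps code: the chosen features, prompt wording, example caption, and Claude model identifier are illustrative assumptions, and running it requires your own Anthropic API key.

```python
# Minimal sketch of a "extract MIDI features, then caption them with an LLM"
# pipeline. NOT the released MidiCaps framework: features, prompt, and model
# name are illustrative assumptions only.
import pretty_midi
import anthropic


def extract_features(midi_path: str) -> dict:
    """Pull a few simple musical features from a MIDI file."""
    pm = pretty_midi.PrettyMIDI(midi_path)
    instruments = [
        pretty_midi.program_to_instrument_name(inst.program)
        for inst in pm.instruments if not inst.is_drum
    ]
    keys = [
        pretty_midi.key_number_to_key_name(k.key_number)
        for k in pm.key_signature_changes
    ]
    time_sigs = [f"{ts.numerator}/{ts.denominator}" for ts in pm.time_signature_changes]
    return {
        "tempo_bpm": round(pm.estimate_tempo(), 1),
        "duration_s": round(pm.get_end_time(), 1),
        "instruments": sorted(set(instruments)),
        "keys": keys or ["unknown"],
        "time_signatures": time_sigs or ["unknown"],
    }


# One hand-written (features -> caption) pair used as the in-context example.
EXAMPLE_FEATURES = {
    "tempo_bpm": 120.0, "duration_s": 95.0,
    "instruments": ["Acoustic Grand Piano", "String Ensemble 1"],
    "keys": ["C Major"], "time_signatures": ["4/4"],
}
EXAMPLE_CAPTION = (
    "A moderate-tempo piece in C major featuring piano and strings, "
    "with a steady 4/4 feel throughout."
)


def caption_midi(midi_path: str) -> str:
    """Ask Claude to turn the extracted features into a one-sentence caption."""
    features = extract_features(midi_path)
    prompt = (
        "You write one-sentence captions for MIDI files from extracted features.\n\n"
        f"Features: {EXAMPLE_FEATURES}\nCaption: {EXAMPLE_CAPTION}\n\n"
        f"Features: {features}\nCaption:"
    )
    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    response = client.messages.create(
        model="claude-3-opus-20240229",  # assumed model id; any Claude 3 model works
        max_tokens=200,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.content[0].text.strip()


if __name__ == "__main__":
    print(caption_midi("example.mid"))
```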

ISMIR in Suzhou, China - presenting hit prediction through social media listening behaviours

Last week, the National University of Singapore hosted the International Society for Music Information Retrieval (ISMIR) conference in lovely Suzhou, China. It featured many interesting presentations by established academics in the field, including Prof. Elaine Chew (who also talked about MorpheuS) and Roger Dannenberg, as well as industry leaders such as Jeffrey C. Smith (Smule) and E. Humphrey (Spotify).