Highlights/Upcoming events

PhD positions in the Music, Audio, and AI Lab

Posted by dorien on Wednesday, 6 March 2024

The Music, Audio, and AI Lab (AMAAI) at SUTD invites applications for a PhD position in the exciting and rapidly evolving field of music and audio artificial intelligence.

The AMAAI lab is engaged in cutting-edge research at the intersection of music, audio, and artificial intelligence. Our PhD students contribute to groundbreaking projects that explore areas such as:

Tags:

PhD

job

New survey on how NLP is used in Music Information Retrieval

Posted by dorien on Friday, 1 March 2024

I am excited to announce the latest paper by Viet-Toan Le, who was a visiting student at the AMAAI Lab, on 'Natural Language Processing Methods for Symbolic Music Generation and Information Retrieval: a Survey'. Viet-Toan did an amazing job at collating and nicely presenting over 225 papers in Music Information Retrieval that are inspired by NLP, as well as presenting the latest challenges and steps forward for the field!

Tags:

survey

MIR

nlp

Postdoc and RA position in text-to-music project

Posted by dorien on Friday, 9 February 2024

I am excited to announce that we have two positions open at the AMAAI Lab in Singapore: a postdoc and a research assistant position in generative AI models for music. Both will build on our recent work on the Mustango model (https://github.com/AMAAI-Lab/mustango) and Video2Music (https://github.com/AMAAI-Lab/Video2Music).

Tags:

job

Mustango: Toward Controllable Text-to-Music Generation.

Posted by dorien on Wednesday, 15 November 2023

Excited to announce Mustango, a powerful multimodal model for generating music from textual prompts. Mustango leverages a Latent Diffusion Model conditioned on textual prompts (encoded using Flan-T5) and various musical features. Try the demo! What makes it different from the rest?
-- greater controllability in the music generation.
-- trained on a large dataset generated using ChatGPT and musical manipulations.
-- superior performance over its predecessors, according to experts.
-- open source!

Video2Music: Suitable Music Generation from Videos using an Affective Multimodal Transformer model

Posted by dorien on Tuesday, 7 November 2023

We are happy to announce Video2Music, a novel AI-powered multimodal music generation framework. It uses video features as conditioning input to generate matching music with a Transformer architecture. Our system aims to give video creators a seamless and efficient way to generate tailor-made background music.

Live demo on Replicate.
View on GitHub.

Tags:

video2music

genAI

Upcoming talks

Posted by dorien on Thursday, 12 October 2023

We are happy to announce two talks on Tuesday 17 October at 2pm at SUTD I3 Lab 1.605.

Add to your calendar.

Title: Exploring NLP Methods in Symbolic MIR: Representations and Models

Twitter-based Bitcoin extreme movement predictions with PreBit

Posted by dorien on Friday, 23 June 2023

Our paper 'PreBit - A multimodal model with Twitter FinBERT embeddings for extreme price movement prediction of Bitcoin' has just been published in Expert Systems with Applications.

Read the article here.

Tags:

bitcoin

nlp

finBERT

Time-series momentum portfolios with deep multi-task learning

Posted by dorien on Monday, 12 June 2023

Congratulations to Joel Ong on publishing our paper on using multi-task deep learning for portfolio construction in Expert Systems with Applications. The paper presents a new way to leverage time series momentum in a deep learning setting. Read a Twitter thread explaining the basics here.

DiffRoll - Music Transcription with Diffusion

Posted by dorien on Monday, 31 October 2022

Great work by Cheuk Kin Wai on his latest paper, 'DiffRoll: Diffusion-based Generative Music Transcription with Unsupervised Pretraining Capability'.

Cheuk, K. W., Sawata, R., Uesaka, T., Murata, N., Takahashi, N., Takahashi, S., ... & Mitsufuji, Y. (2022). DiffRoll: Diffusion-based Generative Music Transcription with Unsupervised Pretraining Capability. arXiv preprint arXiv:2210.05148.

Demo & Source code available here.

Tags:

diffusion

transcription

New paper on the emoMV datasets published in Information Fusion

Posted by dorien on Wednesday, 19 October 2022

Congratulations to Thao on leading the publication of the EmoMV datasets for music-video matching based on emotion!

Pham Q-H, Herremans D., Roig G. 2022. EmoMV: Affective Music-Video Correspondence Learning Datasets for Classification and Retrieval. Information Fusion. DOI: 10.1016/j.inffus.2022.10.002

Paper highlights:

Tags:

emotion

dataset

Keynote at AIMC

Posted by dorien on Wednesday, 21 September 2022

I was honoured to give a keynote talk at the 3rd Conference on AI Music Creativity (AIMC) on controllable music generation with emotion. Watch the full keynote here:

New paper in Sensors on Single Image Video Prediction with Auto-Regressive GANs

Posted by dorien on Tuesday, 9 August 2022

Congrats to my former research assistant Jiahui Huang on his latest paper in Sensors, 'Single Image Video Prediction with Auto-Regressive GANs'. Now we can generate videos of faces with desired emotions!

Full paper available here.

Huang, Jiahui, Yew Ken Chia, Samson Yu, Kevin Yee, Dennis Küster, Eva G. Krumhuber, Dorien Herremans, and Gemma Roig. "Single Image Video Prediction with Auto-Regressive GANs." Sensors 22, no. 9 (2022): 3533.

Tags:

video

generation

Bitcoin extreme price prediction with finBERT & Twitter

Posted by dorien on Tuesday, 12 July 2022

Read the latest paper by my PhD student Yanzhao Zou and me, 'A Multimodal Model with Twitter FinBERT Embeddings for Extreme Price Movement Prediction of Bitcoin'.

Tags:

bitcoin

cryptocurrency

Job opening: game developer (Unity)

Posted by dorien on Thursday, 9 June 2022

SUTD Game Lab is looking for a passionate game developer to join our team. The Game Programmer will visualise how systems should work, and translate this into a functioning solution using the existing game engine to develop prototypes or a polished game for training purposes.

Responsibilities:

Our cough models featured in NRF magazine

Posted by dorien on Thursday, 2 June 2022

Read the full NRF magazine here. Our original paper is available here.

Seminar on music and AI at KTH

Posted by dorien on Thursday, 31 March 2022

It was an honour to take part today in a seminar at the KTH Royal Institute of Technology in Stockholm, held as part of the dialogues series.

dialogues1: probing the future of creative technology
Subject: “Interaction with generative music frameworks”

Guests: Dorien Herremans and Kıvanç Tatar (Video link to be posted)

Dorien Herremans: Controllable deep music generation with emotion

Tags:

seminar

talk

AI and you - podcast

Posted by dorien on Monday, 7 March 2022

Excited to be featured on the latest 'AI and You - What is AI? How will it affect your life, your work, and your world?' podcast by Peter Scott from Human Cusp.

We're focusing on AI in music: What's the state of the art in AI music composition, how can human composers use it to their advantage, and what is the AI Song Contest? How do musical AIs surprise their creators and how are they like your grandmother trying to explain death metal?

Panel discussion at Ars Electronica - watch video here

Posted by dorien on Monday, 7 March 2022

Last year I was honoured to be part of the panel discussion on 'Challenging the limits of AI for the next generation of co-creative tools - Frontiers of Music and Artificial Intelligence' at Ars Electronica, IRCAM (FR). Watch the video below.

ReconVAT presented in ACM Multimedia

Posted by dorien on Monday, 28 February 2022

Congrats to Kin Wai Cheuk on his paper published at the ACM Multimedia conference (A*), 'ReconVAT: A Semi-Supervised Automatic Music Transcription Framework for Low-Resource Real-World Data'. If you are interested in training music transcription models on low-resource data with semi-supervised learning, check out the full paper here, or access the preprint.

Watch Raven's talk here:

Tags:

transcription

Internship opportunity Sounders Music / SUTD

Posted by dorien on Tuesday, 18 January 2022

Sounders Music in the Netherlands (https://soundersmusic.com/) has an internship opportunity for an MSc or PhD student in data analytics for music. The internship will be co-supervised remotely by me (Prof. Dorien Herremans, SUTD) and the founder of Sounders Music (Willem Bloem).

Tags:

internship

Dorien Herremans

Assistant Professor, SUTD
AMAAI - Audio, Music, and AI
