genAI

text2midi at AAAI

I’m thrilled to introduce text2midi, an end-to-end trained AI model that bridges the gap between textual descriptions and MIDI file generation! Our paper has been accepted to the Proceedings of the AAAI Conference on Artificial Intelligence and will be presented in Philadelphia next month.
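To make the pipeline concrete, here is a minimal inference sketch. Everything in it is a hypothetical placeholder (the `text2midi` package, `Text2MidiModel`, `MidiTokenizer`, and their methods are illustrative, not the official API); see the project repository for the real loading and generation code.

```python
# Hypothetical sketch of text-to-MIDI inference. The imports and class
# names below are illustrative placeholders, not the project's real API.
import torch

from text2midi import Text2MidiModel, MidiTokenizer  # hypothetical package

caption = "A calm solo piano piece in C major at a slow tempo."

tokenizer = MidiTokenizer()  # maps MIDI events <-> token ids (hypothetical)
model = Text2MidiModel.from_pretrained("amaai-lab/text2midi")  # hypothetical loader
model.eval()

with torch.no_grad():
    # The caption is encoded by a pretrained text encoder, and the decoder
    # samples a sequence of MIDI event tokens autoregressively.
    token_ids = model.generate(caption, max_len=2000)

# Detokenize the event sequence back into a standard .mid file.
tokenizer.to_midi(token_ids).save("output.mid")
```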

Mustango: Toward Controllable Text-to-Music Generation

Excited to announce Mustango, a powerful multimodal model for generating music from textual prompts. Mustango leverages a latent diffusion model conditioned on the textual prompt (encoded using Flan-T5) and various extracted musical features. Try the demo! What makes it different from the rest? (A short inference sketch follows the list below.)
-- greater controllability over the generated music.
-- trained on a large dataset built using ChatGPT and musical manipulations for data augmentation.
-- superior performance over comparable systems, as judged by expert listeners.
-- open source!
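Here is the promised sketch, modeled on the usage shown in the project's README. The `mustango` package and the `declare-lab/mustango` checkpoint come from the project's release, but treat the exact call signatures as an assumption and check the README for the current API.

```python
# Sketch of Mustango inference, modeled on the project's README; exact
# signatures may differ, so treat this as illustrative.
import soundfile as sf
from mustango import Mustango

# Load the pretrained latent diffusion model (prompts are encoded with Flan-T5).
model = Mustango("declare-lab/mustango")

prompt = ("A slow blues shuffle in E major with a walking bassline, "
          "brushed drums, and an expressive guitar solo.")

# Denoise a latent conditioned on the prompt (and extracted musical
# features), then decode it to a waveform.
music = model.generate(prompt)
sf.write("mustango_sample.wav", music, samplerate=16000)
```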

Video2Music: Suitable Music Generation from Videos using an Affective Multimodal Transformer model

We are happy to announce Video2Music, a novel AI-powered multimodal music generation framework. It uniquely uses video features as conditioning input to a Transformer architecture that generates music matched to the video, giving video creators a seamless and efficient way to produce tailor-made background music.

Live demo on Replicate.
View on GitHub.
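If you would rather script against the hosted demo than click through it, Replicate's Python client can drive it. The model slug and input/output fields below are illustrative assumptions; check the Replicate page for the exact identifier, version, and schema.

```python
# Hedged sketch of calling the hosted demo with Replicate's Python client.
# The model slug and input name are assumptions; see the demo page for
# the real identifier and schema.
import replicate

output = replicate.run(
    "amaai-lab/video2music",  # hypothetical slug
    input={"video": open("my_clip.mp4", "rb")},
)
print(output)  # typically a URL or file reference for the generated music
```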