Publications

2024
Melechovsky J., Mehrish A., Sisman B., Herremans D. 2024. Accent Conversion in Text-To-Speech Using Multi-Level VAE and Adversarial Training. Proc. of IEEE Tencon, Singapore.
Melechovsky J., Mehrish A., Sisman B., Herremans D. 2024. Accented Text-to-Speech Synthesis with a Conditional Variational Autoencoder. Proc. of IEEE Tencon, Singapore.
Kang J., Herremans D. 2024. Are we there yet? A brief survey of Music Emotion Prediction Datasets, Models and Outstanding Challenges. arXiv:2406.08809.
Luo J., Yang X., Herremans D. 2024. BandControlNet: Parallel Transformers-based Steerable Popular Music Generation with Fine-Grained Spatiotemporal Features. arXiv:2407.10462.
Ong J., Herremans D. 2024. DeepUnifiedMom: Unified Time-series Momentum Portfolio Construction via Multi-Task Learning with Multi-Gate Mixture of Experts. arXiv:2406.08742.
Wang K., Herremans D. 2024. DisfluencySpeech -- Single-Speaker Conversational Speech Dataset with Paralanguage. Proc. of IEEE Tencon, Singapore.
Chow D., Herremans D. 2024. Gamification and skills tree. Trends and Foresight Report on Cyber-Physical Learning.
Melechovsky J., Roy A., Herremans D. 2024. MidiCaps: A large-scale MIDI dataset with text captions. arXiv:2406.02255.
Melechovsky J., Guo Z., Ghosal D., Majumder N., Herremans D., Poria S. 2024. Mustango: Toward Controllable Text-to-Music Generation. Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers). pages 8293–8316.
Le D-V-T, Bigo L., Keller M., Herremans D. 2024. Natural Language Processing Methods for Symbolic Music Generation and Information Retrieval: a Survey. arXiv:2402.17467.
Lam P., Zhang H., Chen N.F, Sisman B., Herremans D. 2024. SNIPER Training: Variable Sparsity Rate Training For Text-To-Speech. Proc. of IEEE Tencon, Singapore.
Kang J., Poria S., Herremans D. 2024. Video2Music: Suitable Music Generation from Videos using an Affective Multimodal Transformer model. Expert Systems with Applications.
