Publications
2022. MusIAC: An extensible generative framework for Music Infilling Application with multi-level Control. EvoMUSART. 2202.05528.pdf (893.23 KB)
2022. Predicting emotion from music videos: exploring the relative contribution of visual and auditory information to affective responses. arXiv preprint.
2022. Single Image Video Prediction with Auto-Regressive GANs. Sensors. 22:3533.
2022. Understanding Audio Features via Trainable Basis Functions. arXiv preprint. 2204.11437.pdf (7.36 MB)
2022. A white paper on cyberphysical learning. White paper, Singapore University of Technology and Design. LSL_WhitePaper_Cyber-physical-Campus-Higher-Education.pdf (6.98 MB)
2023. Constructing Time-Series Momentum Portfolios with Deep Multi-Task Learning. Expert Systems with Applications. 230(120587). 2306.13661.pdf (707.95 KB)
2023. DiffRoll: Diffusion-based Generative Music Transcription with Unsupervised Pretraining Capability. ICASSP. diffroll.pdf (2.2 MB)
2023. A Domain-Knowledge-Inspired Music Embedding Space and a Novel Attention Mechanism for Symbolic Music Modeling. Proceedings of the 37th AAAI Conference on Artificial Intelligence. 2212.00973.pdf (1.74 MB)
2023. Learning accent representation with multi-level VAE towards controllable speech synthesis. IEEE Spoken Language Technology (SLT) Workshop.
2023. MERP: A Music Dataset with Emotion Ratings and Raters’ Profile Information. Sensors - Intelligent Sensors. 23(1). sensors-23-00382 (2).pdf (1.21 MB)
2023. A Multimodal Model with Twitter FinBERT Embeddings for Extreme Price Movement Prediction of Bitcoin. Expert Systems with Applications. 2206.00648.pdf (3.26 MB)
2024. Accent Conversion in Text-To-Speech Using Multi-Level VAE and Adversarial Training. Proc. of IEEE TENCON, Singapore.
2024. Accented Text-to-Speech Synthesis with a Conditional Variational Autoencoder. Proc. of IEEE TENCON, Singapore.
2024. Are we there yet? A brief survey of Music Emotion Prediction Datasets, Models and Outstanding Challenges. arXiv:2406.08809. 2406.08809v1.pdf (156.19 KB)
2024. BandControlNet: Parallel Transformers-based Steerable Popular Music Generation with Fine-Grained Spatiotemporal Features. arXiv:2407.10462. 2407.10462v1.pdf (2.3 MB)
2024. DeepUnifiedMom: Unified Time-series Momentum Portfolio Construction via Multi-Task Learning with Multi-Gate Mixture of Experts. arXiv:2406.08742. 2406.08742v1.pdf (1.06 MB)
2024. DisfluencySpeech -- Single-Speaker Conversational Speech Dataset with Paralanguage. Proc. of IEEE TENCON, Singapore.
2024. Gamification and skills tree. Trends and Foresight Report on Cyber-Physical Learning.
2024. MidiCaps — A large-scale MIDI dataset with text captions. arXiv:2406.02255. 2406.02255v1.pdf (699.83 KB)
2024. Mustango: Toward Controllable Text-to-Music Generation. Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers). pages 8293–8316. 2311.08355 (1).pdf (11.38 MB)
2024. Natural Language Processing Methods for Symbolic Music Generation and Information Retrieval: a Survey. arXiv:2402.17467. 2402.17467.pdf (1.01 MB)
2024. SNIPER Training: Variable Sparsity Rate Training For Text-To-Speech. Proc. of IEEE TENCON, Singapore. 2211.07283.pdf (435.22 KB)
2024. Video2Music: Suitable Music Generation from Videos using an Affective Multimodal Transformer model. Expert Systems with Applications. 2311.00968.pdf (5.51 MB)