Publications
A white paper on cyberphysical learning. White paper, Singapore University of Technology and Design.
LSL_WhitePaper_Cyber-physical-Campus-Higher-Education.pdf (6.98 MB)
.
2022. 
Visualizing the evolution of alternative hit charts. The 18th International Society for Music Information Retrieval Conference (ISMIR) - Late Breaking Demo.
dh_visualiation_preprint.pdf (5.34 MB)
.
2017. 
Video2Music: Suitable Music Generation from Videos using an Affective Multimodal Transformer model. Expert Systems with Applications.
2311.00968.pdf (5.51 MB)
.
2024. 
A variational autoencoder for music generation controlled by tonal tension. Joint Conference on AI Music Creativity (CSMC + MuMe).
2010.06230.pdf (622.82 KB)
.
2020. 
A variable neighborhood search algorithm to generate piano fingerings for polyphonic sheet music. International Transactions in Operational Research, Special Issue on Variable Neighbourhood Search. 24(3):509–535.
ITOR_VNS_APF_preprint.pdf (840.28 KB)
.
2017. 
Unsupervised disentanglement of pitch and timbre for isolated musical instrument sounds. Proceedings of the International Society of Music Information Retrieval (ISMIR).
.
2020. 
Underwater Acoustic Communication Receiver Using Deep Belief Network. IEEE Transactions on Communications. :1-1.
2102.13397.pdf (12.87 MB)
.
2021. 
Understanding Audio Features via Trainable Basis Functions. Arxiv preprint.
2204.11437.pdf (7.36 MB)
.
2022. 
Uma abordagem baseada em programação linear inteira para a geração de solos de guitarra. XLVIII Simpósio Brasileiro de Pesquisa Operacional (SBPO).
sbpo_dh.pdf (346.61 KB)
.
2016. 
Towards robust audio spoofing detection: a detailed comparison of traditional and learned features. IEEE Access. 7:84229-84241.
ieee_access_herremans.pdf (14.31 MB)
.
2019. 
Towards emotion based music generation: A tonal tension model based on the spiral array. Proceedings of Cognitive Science (CogSci).
CogSci_tension (1).pdf (610.91 KB)
.
2019. 
Text2midi: Generating Symbolic Music from Captions. Proceedings of AAAI, Philadelphia.
2412.16526v2.pdf (569.51 KB)
.
2025. 
Tension ribbons: Quantifying and visualising tonal tension. Second International Conference on Technologies for Music Notation and Representation (TENOR). 2:8-18.
paper_tenor_dh_preprint_small.pdf (1.67 MB)
.
2016. 
Tabu Search voor de optimalisatie van muzikale fragmenten. Faculty of Applied Economics. MSc Business Engineer Management Information Systems
Thesis.pdf (1.25 MB)
.
2005. 
The Structure of Chord Progressions Influences Listeners’ Enjoyment and Absorptive States in EDM. 15th International Conference on Music Perception and Cognition.
Agres460_preprint_v2.pdf (387.15 KB)
.
2018. 
SNIPER Training: Variable Sparsity Rate Training For Text-To-Speech. Proc. of IEEE TENCON, Singapore.
2211.07283.pdf (435.22 KB)
.
2024. 
Single Image Video Prediction with Auto-Regressive GANs. Sensors. 22:3533.
.
2022. 
Singing Voice Separation Using a Deep Convolutional Neural Network Trained by Ideal Binary Mask and Cross Entropy. Neural Computing and Applications.
main.pdf (2.59 MB)
.
2018. 
Singing voice conversion with disentangled representations of singer and vocal technique using variational autoencoders. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP).
1912.02613.pdf (2.9 MB)
.
2020. 
Revisiting the Onsets and Frames Model with Additive Attention. Proceedings of the International Joint Conference on Neural Networks (IJCNN).
2104.06607.pdf (1.52 MB)
.
2021. 
Regression-based music emotion prediction using triplet neural networks. Proceedings of the International Joint Conference on Neural Networks (IJCNN).
2001.09988.pdf (777.31 KB)
.
2020. 
ReconVAT: A Semi-Supervised Automatic Music Transcription Framework for Low-Resource Real-World Data. ACM Multimedia.
.
2021. 
Real-Time Binaural Auralization. ISTD. PhD
NatalieAngus_PhD_Thesis_01Jul18.pdf (6.19 MB)
.
2018. 
PRESENT: Zero-Shot Text-to-Prosody Control. IEEE Signal Processing Letters.
2408.06827v1.pdf (367.55 KB)
.
2025. 
Predicting emotion from music videos: exploring the relative contribution of visual and auditory information to affective responses. Arxiv preprint.
.
2022. 
Perceptual evaluation of measures of spectral variance. Journal of the Acoustical Society of America. 143(6):3300–3311.
jasa_an_dh_preprint.pdf (2.46 MB)
.
2018. 
PerceptionGAN: Real-world image construction from provided text through perceptual understanding. 4th Int. Conf. on Imaging, Vision and Pattern Recognition (IVPR), and 9th Int. Conf. on Informatics, Electronics & Vision (ICIEV).
perceptionGAN-preprint.pdf (2.83 MB)
.
2020. 
O.R. and music generation. OR/MS Today. 45(1)
O.R. and music generation - INFORMS.pdf (825.66 KB)
.
2018. 
A novel music-based game with motion capture to support cognitive and motor function in the elderly. IEEE Conference on Games.
preprint.pdf (2.6 MB)
.
2019. 
A Novel Interface for the Graphical Analysis of Music Practice Behaviours. Frontiers in Psychology - Human-Media Interaction. 9
practice_browser.pdf (4.9 MB)
.
2018. 
nnAudio: An on-the-fly GPU Audio to Spectrogram Conversion Toolbox Using 1D Convolution Neural Networks. IEEE Access.
nnAudio.pdf (10.2 MB)
.
2020. 
nnAudio: A PyTorch Audio Processing Tool Using 1D Convolution neural networks. ISMIR - Late Breaking Demo.
nnAudio.pdf (399.08 KB)
.
2019. 
Natural Language Processing Methods for Symbolic Music Generation and Information Retrieval: a Survey. ACM Computing Surveys.
2402.17467.pdf (1.01 MB)
.
2025. 
Mustango: Toward Controllable Text-to-Music Generation. Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers). pages 8293–8316.
2311.08355 (1).pdf (11.38 MB)
.
2024. 
Musical stylometry: Characterisation of music. Multivariate Humanities.
.
2021. 
Music generation with structural constraints: an operations research approach. 30th Annual Conference of the Belgian Operational Research (OR) Society (ORBEL30). :37-39.
orbel30_dh.pdf (117.78 KB)
.
2016. 
Music FaderNets: Controllable Music Generation Based On High-Level Features via Low-Level Feature Modelling. ISMIR.
2007.15474.pdf (2.67 MB)
.
2020. 
Music, Computing, and Health: A roadmap for the current and future roles of music technology for healthcare and well-being. Music & Science.
Preprint for OSF_Agres, Schaefer, Volk, et al. (2021)_Music & Science_watermark.pdf (4.07 MB)
.
2021. 
Music and Motion-Detection: A Game Prototype for Rehabilitation and Strengthening in the Elderly. IEEE International Conference on Orange Technologies (ICOT).
agres_herr_music_rehab_preprint.pdf (1.77 MB)
.
2017. 
MusIAC: An extensible generative framework for Music Infilling Application with multi-level Control. EvoMUSART.
2202.05528.pdf (893.23 KB)
.
2022. 
A multi-modal platform for semantic music analysis: visualizing audio- and score-based tension. 11th International Conference on Semantic Computing IEEE ICSC 2017.
paper_preprint.pdf (1.63 MB)
.
2017. 
A Multimodal Model with Twitter Finbert Embeddings for Extreme Price Movement Prediction of Bitcoin. Expert Systems with Applications.
2206.00648.pdf (3.26 MB)
.
2023. 
Multimodal Deep Models for Predicting Affective Responses Evoked by Movies. The 2nd International Workshop on Computer Vision for Physiological Measurement as part of ICCV. Seoul, South Korea. 2019.
1909.06957.pdf (836.3 KB)
.
2019. 
MorpheuS: generating structured music with constrained patterns and tension. IEEE Transactions on Affective Computing. PP (In Press)(99)
herremans2017morpheusFullIEEE.pdf (5.71 MB)
.
2017. 
MorpheuS: constraining structure in automatic music generation. Dagstuhl seminar on Computational Music Structure Analysis.
abstract_dagstuhl_dh.pdf (88.49 KB)
.
2016. 
MorpheuS: Automatic music generation with recurrent pattern constraints and tension profiles. IEEE TENCON.
paper_morpheus_dh_ieee.pdf (550.61 KB)
.
2016. 
Modern Portfolio Construction with Advanced Deep Learning Models. SUTD. PhD
Joel_Ong_Thesis.pdf (3.44 MB)
.
2024. 
Modeling temporal tonal relations in polyphonic music through deep networks with a novel image-based representation. The Thirty-Second AAAI Conference on Artificial Intelligence.
preprint_lstm.pdf (741.28 KB)
.
2018. 
Modeling Musical Context with Word2vec. First International Workshop On Deep Learning and Music. 1:11-18.
herremans2017work2vec.pdf (745.8 KB)
.
2017. 
MIRFLEX: Music Information Retrieval Feature Library for Extraction. ISMIR, Late Breaking Demos.
2411.00469v1.pdf (89.86 KB)
.
2024. 
Minimally Simple Binaural Room Modelling Using a Single Feedback Delay Network. Journal of the Audio Engineering Society. 66(10):791-807.
angus_jaes_preprint.pdf (6.39 MB)
.
2018. 
MidiCaps — A large-scale MIDI dataset with text captions. ISMIR.
2406.02255v1.pdf (699.83 KB)
.
2024. 
Midi Miner – A Python library for tonal tension and track classification. ISMIR - Late Breaking Demo.
midi_miner.pdf (83.7 KB)
.
2019. 
MERP: A Music Dataset with Emotion Ratings and Raters’ Profile Information. Sensors - Intelligent Sensors. 23(1)
sensors-23-00382 (2).pdf (1.21 MB)
.
2023. 
Markov Based Quality Metrics For Generating Structured Music With Optimization Techniques. Digital Music Research Network (DMRN+9).
dmrn9_dh.pdf (133.29 KB)
.
2014. 
Machine Learning Research that Matters for Music Creation: A Case Study. Journal of New Music Research. 48(1):36-55.
concert_paper_preprint.pdf (1.6 MB)
.
2019. 
A Machine Learning Approach for MIDI to Guitar Tablature Conversion. Sound and Music Computing Conference (SMC).
25.pdf (528.42 KB)
.
2022. 
Looking into the minds of Bach, Haydn and Beethoven: Classification and generation of composer-specific music.
RPS-2014-001.pdf (575.42 KB)
.
2014. 
Learning Disentangled Representations of Timbre and Pitch for Musical Instrument Sounds Using Gaussian Mixture Variational Autoencoders. ISMIR.
jyun-ismir.pdf (5.62 MB)
.
2019. 
Learning accent representation with multi-level VAE towards controllable speech synthesis. IEEE Spoken Language Technology (SLT) Workshop.
.
2023. 
Latent space representation for multi-target speaker detection and identification with a sparse dataset using Triplet neural networks. IEEE Automatic Speech Recognition and Understanding Workshop (ASRU 2019).
1910.01463.pdf (934.76 KB)
.
2019. 
The impact of musical structure on enjoyment and absorptive listening states in trance music. Music and Consciousness 2 - Worlds, Practices, Modalities.
.
2019. 
The impact of Audio input representations on neural network based music transcription. Proceedings of the International Joint Conference on Neural Networks (IJCNN).
2001.09989.pdf (1.87 MB)
.
2020. 
A Hybrid Fuzzy Logic-Neural Network Approach For Multi-path Separation Of Underwater Acoustic Signals. 89th IEEE Vehicular Technology Conference.
fuzzy logic.pdf (1.66 MB)
.
2019. 
Hit Song Prediction Based on Early Adopter Data and Audio Features. The 18th International Society for Music Information Retrieval Conference (ISMIR) - Late Breaking Demo.
paper_preprint_hit.pdf (221.73 KB)
.
2017. 
Hierarchical Recurrent Neural Networks for Conditional Melody Generation with Long-term Structure. Proceedings of the International Joint Conference on Neural Networks (IJCNN).
2102.09794.pdf (1015.73 KB)
.
2021. 
HEAR 2021: Holistic Evaluation of Audio Representations. Proceedings of Machine Learning Research (PMLR): NeurIPS 2021 Competition Track.
2203.03022.pdf (406.58 KB)
.
2022. 
Harmonic Structure Predicts the Enjoyment of Uplifting Trance Music. Frontiers in Psychology, Cognitive Science. 7(1999)
agres16ut.pdf (1.15 MB)
.
2017. 
Generative Modelling for Controllable Audio Synthesis of Expressive Piano Performance. Workshop on Machine Learning for Music Discovery (ML4MD) as part of ICML.
2006.09833.pdf (2.81 MB)
.
2020. 
Generating structured music for bagana using quality metrics based on Markov models. Expert Systems With Applications. 42(21):7424–7435.
paper-bagana.pdf (1.73 MB)
.
2015. 
Generating music with an optimization algorithm using a Markov based objective function. ORBEL29, Belgian Conference on Operations Research.
orbel29abs.pdf (138.67 KB)
.
2015. 
Generating Lead Sheets with Affect: A Novel Conditional seq2seq Framework. Proceedings of the International Joint Conference on Neural Networks (IJCNN).
2104.13056.pdf (857.78 KB)
.
2021. 
Generating guitar solos by integer programming. Journal of the Operational Research Society. :971-985.
preprint_guitar_solo_generation_dh.pdf (772.59 KB)
.
2017. 
Generating Fingerings for Polyphonic Piano Music with a Tabu Search Algorithm. Mathematics and Computation in Music. 9110:149-160.
paper_mcm_preprint.pdf (405.73 KB)
.
2015. 
A Gaussian mixture classifier model to differentiate respiratory symptoms using phonated /ɑː/ sounds. The 18th Australasian International Conference on Speech Science and Technology (SST).
ahsounds.pdf (1018.01 KB)
.
2022. 
Gamification and skills tree. Trends and Foresight Report on Cyber-Physical Learning.
.
2024. 
FuX, an Android app that generates counterpoint. IEEE Symposium on Computational Intelligence for Creativity and Affective Computing (CICAC). :48-55.
wp_fux.pdf (486.27 KB)
.
2013. 
A Functional Taxonomy of Music Generation Systems. ACM Computing Surveys. 50(5):30.
music_generation_survey_dh_preprint.pdf (349.15 KB)
.
2017. 
From Context to Concept: Exploring Semantic Relationships in Music with Word2Vec. Neural Computing and Applications.
paper.pdf (1.64 MB)
.
2018. 
Forecasting Bitcoin Volatility Spikes from Whale Transactions and Cryptoquant Data Using Synthesizer Transformer Models. SSRN.
SSRN-id4247684.pdf (5.05 MB)
.
2022. 
First species counterpoint generation with VNS and vertical viewpoints. Annual Conference of the Belgian Operational Research Society (ORBEL28).
orbel28_dh.pdf (216.63 KB)
.
2014. 
First species counterpoint generation with VNS and vertical viewpoints. Digital Music Research Network (DMRN+8).
dnmr8_dh_dc.pdf (147.73 KB)
.
2013. 
Evaluating the Effectiveness of an Augmented Reality Game Promoting Environmental Action. Sustainability. 13(24):13912.
sustainability-13-13912.pdf (16.23 MB)
.
2021. 
EmoMV: Affective Music-Video Correspondence Learning Datasets for Classification and Retrieval. Information Fusion.
SSRN-id4189323.pdf (2.01 MB)
.
2022. 
The emergence of deep learning: new opportunities for music and audio technologies. Neural Computing and Applications.
main_preprint.pdf (102.16 KB)
.
2019. 
The Effect of Spectrogram Reconstructions on Automatic Music Transcription: An Alternative Approach to Improve Transcription Accuracy. Proceedings of the International Conference on Pattern Recognition (ICPR2020).
2010.09969.pdf (3.46 MB)
.
2021. 
The Effect of Repetitive Structure on Enjoyment in Uplifting Trance Music. 14th International Conference for Music Perception and Cognition (ICMPC). :280-282.
preprint_trance.pdf (139.27 KB)
.
2016. 
The effect of repetitive structure on enjoyment and altered states in uplifting trance music. 2nd International Conference on Music and Consciousness (MUSCON 2), Brighton.
AgresEtAl_muscon.pdf (12.47 KB)
.
2015. 
Downscaling using Deep Convolutional Autoencoders, a case study for South East Asia. Egusphere preprint.
egusphere-2022-234.pdf (8.99 MB)
.
2022. 