Publications
2023. DiffRoll: Diffusion-based Generative Music Transcription with Unsupervised Pretraining Capability. ICASSP. diffroll.pdf (2.2 MB)
2023. MERP: A Music Dataset with Emotion Ratings and Raters’ Profile Information. Sensors - Intelligent Sensors. 23(1). sensors-23-00382 (2).pdf (1.21 MB)
2022. Jointist: Joint Learning for Multi-instrument Transcription and Its Applications. 2206.10805.pdf (427.51 KB)
2022. Understanding Audio Features via Trainable Basis Functions. arXiv preprint. 2204.11437.pdf (7.36 MB)
2021. The Effect of Spectrogram Reconstructions on Automatic Music Transcription: An Alternative Approach to Improve Transcription Accuracy. Proceedings of the International Conference on Pattern Recognition (ICPR2020). 2010.09969.pdf (3.46 MB)
2021. ReconVAT: A Semi-Supervised Automatic Music Transcription Framework for Low-Resource Real-World Data. ACM Multimedia.
2021. Revisiting the Onsets and Frames Model with Additive Attention. Proceedings of the International Joint Conference on Neural Networks (IJCNN). 2104.06607.pdf (1.52 MB)
2020. The impact of Audio input representations on neural network based music transcription. Proceedings of the International Joint Conference on Neural Networks (IJCNN). 2001.09989.pdf (1.87 MB)
2020. nnAudio: An on-the-fly GPU Audio to Spectrogram Conversion Toolbox Using 1D Convolution Neural Networks. IEEE Access. nnAudio.pdf (10.2 MB)
2020. Regression-based music emotion prediction using triplet neural networks. Proceedings of the International Joint Conference on Neural Networks (IJCNN). 2001.09988.pdf (777.31 KB)
2020. Unsupervised disentanglement of pitch and timbre for isolated musical instrument sounds. Proceedings of the International Society of Music Information Retrieval (ISMIR).
2019. Latent space representation for multi-target speaker detection and identification with a sparse dataset using Triplet neural networks. IEEE Automatic Speech Recognition and Understanding Workshop (ASRU 2019). 1910.01463.pdf (934.76 KB)
2019. nnAudio: A PyTorch Audio Processing Tool Using 1D Convolution neural networks. ISMIR - Late Breaking Demo. nnAudio.pdf (399.08 KB)
2018. Blacklisted speaker identification using triplet neural networks. MCE2018 competition. SUTD_description.pdf (133.08 KB)