The Analysis of Multi-Track Music Generation With Deep Learning Models in Music Production Process

Bibliographic Details
Published in: IEEE Access
Main Authors: Jiang R.; Mou X.
Format: Article
Language: English
Published: Institute of Electrical and Electronics Engineers Inc., 2024
Online Access: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85200811467&doi=10.1109%2fACCESS.2024.3439989&partnerID=40&md5=be772e01b02fe910201de3423b318752
DOI: 10.1109/ACCESS.2024.3439989
Volume: 12
Description: This study explores the application of deep learning models in multi-track music generation to improve the efficiency and quality of music production. Because traditional methods are limited in extracting and representing audio features, a multi-track music generation model based on Bidirectional Encoder Representations from Transformers (BERT) and the Transformer network is proposed. The model first uses BERT to encode and represent the music data, capturing its semantic and emotional information. The encoded features are then fed into a Transformer network, which learns the temporal relationships and structural patterns among music sequences and generates new multi-track compositions. In evaluation, the proposed model achieves an accuracy of 95.98% in music generation prediction and improves precision by 4.77% compared with other algorithms, with particular strength in predicting the pitch of music tracks. The proposed model thus performs well in accuracy and pitch prediction, offering a useful experimental reference for research and practice in multi-track music generation. ©2024 The Authors.
ISSN: 2169-3536
Access: All Open Access; Gold Open Access
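The abstract describes a two-stage pipeline: an encoder (BERT in the paper) builds a representation of the musical context, and a sequence model (a Transformer in the paper) predicts the next token for each track autoregressively. As a rough illustration of that data flow only — not the authors' implementation — both stages can be replaced by trivial stand-ins; every function and token scheme below is hypothetical:

```python
import random

TRACKS = ["melody", "bass", "drums"]   # hypothetical track names
VOCAB_SIZE = 128                       # MIDI-style pitch tokens, 0..127

def encode_context(tracks):
    """Stand-in for the BERT encoder: reduce each track's recent
    tokens to a single context code (a real encoder would emit
    a learned vector capturing semantics and structure)."""
    return {name: sum(seq[-4:]) % VOCAB_SIZE for name, seq in tracks.items()}

def predict_next(context_code):
    """Stand-in for the Transformer decoder: sample a token near
    the context code (a real model would sample from a learned
    next-token distribution)."""
    return (context_code + random.randint(-2, 2)) % VOCAB_SIZE

def generate(steps=8, seed=0):
    random.seed(seed)
    # Start every track on middle C (token 60).
    tracks = {name: [60] for name in TRACKS}
    for _ in range(steps):
        context = encode_context(tracks)   # stage 1: encode the context
        for name in TRACKS:                # stage 2: decode one token per track
            tracks[name].append(predict_next(context[name]))
    return tracks

music = generate()
print({name: len(seq) for name, seq in music.items()})
# → {'melody': 9, 'bass': 9, 'drums': 9}
```

The point of the sketch is the loop structure: each generation step re-encodes the full multi-track context before extending every track, which is how an encoder-plus-decoder pipeline keeps the tracks coherent with one another.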