The Analysis of Multi-Track Music Generation With Deep Learning Models in Music Production Process
This study explores the application of deep learning models to multi-track music generation in order to improve the efficiency and quality of music production. Given the limited capability of traditional methods to extract and represent audio features, a multi-track music generation model...
Published in: | IEEE Access |
---|---|
Main Author: | Jiang R.; Mou X. |
Format: | Article |
Language: | English |
Published: | Institute of Electrical and Electronics Engineers Inc., 2024 |
Online Access: | https://www.scopus.com/inward/record.uri?eid=2-s2.0-85200811467&doi=10.1109%2fACCESS.2024.3439989&partnerID=40&md5=be772e01b02fe910201de3423b318752 |
id |
2-s2.0-85200811467 |
---|---|
author |
Jiang R.; Mou X. |
title |
The Analysis of Multi-Track Music Generation With Deep Learning Models in Music Production Process |
publishDate |
2024 |
container_title |
IEEE Access |
container_volume |
12 |
container_issue |
|
doi_str_mv |
10.1109/ACCESS.2024.3439989 |
url |
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85200811467&doi=10.1109%2fACCESS.2024.3439989&partnerID=40&md5=be772e01b02fe910201de3423b318752 |
description |
This study explores the application of deep learning models to multi-track music generation in order to improve the efficiency and quality of music production. Given the limited capability of traditional methods to extract and represent audio features, a multi-track music generation model based on a Bidirectional Encoder Representations from Transformers (BERT) Transformer network is proposed. The model first uses BERT to encode and represent the music data, capturing its semantic and emotional information. The encoded music features are then fed into the Transformer network, which learns the temporal relationships and structural patterns among music sequences and generates new multi-track compositions. An evaluation shows that, compared with other algorithms, the proposed model achieves an accuracy of 95.98% in music generation prediction and improves precision by 4.77%. In particular, the model shows significant advantages in predicting the pitch of music tracks. The proposed multi-track music generation model therefore performs well in accuracy and pitch prediction, offering a useful experimental reference for research and practice in multi-track music generation. ©2024 The Authors. |
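The abstract describes a two-stage pipeline: an encoder produces contextual features for a music-event sequence, and a Transformer-style generator then predicts the next event on each track. The following is a minimal, hypothetical NumPy sketch of that shape only — a stand-in embedding encoder plus one sampling step — not the authors' BERT/Transformer implementation; the vocabulary size, track count, mean-pooling, and all names and shapes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB = 128     # assumed event vocabulary, e.g. MIDI pitch range
N_TRACKS = 4    # assumed number of instrument tracks
D_MODEL = 32    # assumed feature width

# Stage 1 (stand-in for the BERT encoder): embed each input token and
# add positional information, yielding one feature vector per event.
def encode(tokens, emb, pos):
    return emb[tokens] + pos[: len(tokens)]

# Stage 2 (stand-in for the Transformer generator): pool the context,
# score next-event candidates, and sample one event per track.
def generate_step(features, w_out):
    context = features.mean(axis=0)                   # pooled context vector
    logits = np.tensordot(context, w_out, axes=1)     # (N_TRACKS, VOCAB)
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)        # softmax per track
    return np.array([rng.choice(VOCAB, p=p) for p in probs])

emb = rng.normal(size=(VOCAB, D_MODEL))
pos = rng.normal(size=(256, D_MODEL))
w_out = rng.normal(size=(D_MODEL, N_TRACKS, VOCAB))

seq = rng.integers(0, VOCAB, size=16)   # toy input event sequence
features = encode(seq, emb, pos)
next_events = generate_step(features, w_out)
print(next_events.shape)                # one sampled event per track
```

In the actual model, the mean-pooling and single linear output head would be replaced by attention layers and autoregressive decoding; this sketch only illustrates the encode-then-generate data flow the abstract outlines.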
publisher |
Institute of Electrical and Electronics Engineers Inc. |
issn |
2169-3536 |
language |
English |
format |
Article |
accesstype |
All Open Access; Gold Open Access |
record_format |
scopus |
collection |
Scopus |
_version_ |
1809678473557442560 |