The Analysis of Multi-Track Music Generation With Deep Learning Models in Music Production Process

This study aims to explore the application of deep learning models in multi-track music generation to enhance the efficiency and quality of music production. Considering the limited capability of traditional methods in extracting and representing audio features, a multi-track music generation model...

Bibliographic Details
Published in: IEEE ACCESS
Main Authors: Jiang, Rong; Mou, Xiaofei
Format: Article
Language: English
Published: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC 2024
Subjects: Computer Science; Engineering; Telecommunications
Online Access: https://www-webofscience-com.uitm.idm.oclc.org/wos/woscc/full-record/WOS:001291890100001
author Jiang, Rong; Mou, Xiaofei
author_facet Jiang, Rong; Mou, Xiaofei
author_sort Jiang
title The Analysis of Multi-Track Music Generation With Deep Learning Models in Music Production Process
container_title IEEE ACCESS
language English
format Article
description This study explores the application of deep learning models to multi-track music generation, with the aim of improving the efficiency and quality of music production. Because traditional methods have limited capability in extracting and representing audio features, a multi-track music generation model is proposed that combines Bidirectional Encoder Representations from Transformers (BERT) with a Transformer network. The model first uses BERT to encode and represent music data, capturing the semantic and emotional information it contains. The encoded music features are then fed into the Transformer network, which learns the temporal relationships and structural patterns among music sequences and generates new multi-track compositions. In an evaluation against other algorithms, the proposed model achieves an accuracy of 95.98% in music generation prediction, with a 4.77% improvement in precision. The model shows particular advantages in predicting the pitch of music tracks. The proposed model therefore performs well in accuracy and pitch prediction and offers a useful experimental reference for research and practice in multi-track music generation.
publisher IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
issn 2169-3536

publishDate 2024
container_volume 12
container_issue
doi_str_mv 10.1109/ACCESS.2024.3439989
topic Computer Science; Engineering; Telecommunications
topic_facet Computer Science; Engineering; Telecommunications
accesstype gold
id WOS:001291890100001
url https://www-webofscience-com.uitm.idm.oclc.org/wos/woscc/full-record/WOS:001291890100001
record_format wos
collection Web of Science (WoS)