Summary: | Trainable parameters and hyperparameters are critical to the development of a deep learning model. However, these components have typically been studied individually, making it difficult to investigate how their combination affects model performance. We examine the correlation between the number of trainable parameters in a deep learning model and its performance metrics under different hyperparameters. Specifically, we study the effects of using either the Adam or SGD optimizer at three different learning rates. We use six pre-trained models whose numbers of trainable parameters are set using two strategies: (1) freezing the convolutional base, so that only part of the weights are trainable, and (2) training the whole model, so that most of the weights are trainable. Our experimental results show a positive correlation between the number of trainable parameters and test accuracy regardless of the learning rate. However, for model generalization, a higher number of trainable parameters did not guarantee higher accuracy and F1 score. We show that the correlation between trainable parameters and model generalization becomes positive when using Adam with the smallest learning rate. © 2023 IEEE.
|
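The experimental design described above can be illustrated with a minimal sketch: two ways of fixing the number of trainable parameters (frozen convolutional base vs. fully trainable model), swept over two optimizers and three learning rates. This is an assumption-laden illustration, not the paper's exact setup; the backbone (ResNet50), input shape, class count, and learning-rate values are all hypothetical placeholders.

```python
# Sketch of the two trainable-parameter strategies and the optimizer / learning-rate sweep.
# Backbone, dataset, head size, and learning rates are illustrative assumptions.
import tensorflow as tf
from tensorflow import keras

def build_model(freeze_base: bool) -> keras.Model:
    """Strategy 1 (freeze_base=True): convolutional base frozen, only the head trains.
    Strategy 2 (freeze_base=False): the whole model is trainable."""
    base = keras.applications.ResNet50(include_top=False, weights="imagenet",
                                       input_shape=(224, 224, 3), pooling="avg")
    base.trainable = not freeze_base
    outputs = keras.layers.Dense(10, activation="softmax")(base.output)
    return keras.Model(base.input, outputs)

for freeze in (True, False):
    for lr in (1e-2, 1e-3, 1e-4):                      # three learning rates (assumed values)
        for opt_cls in (keras.optimizers.Adam, keras.optimizers.SGD):
            model = build_model(freeze_base=freeze)
            model.compile(optimizer=opt_cls(learning_rate=lr),
                          loss="sparse_categorical_crossentropy",
                          metrics=["accuracy"])
            # Count trainable parameters for this configuration.
            n_trainable = sum(tf.size(w).numpy() for w in model.trainable_weights)
            print(f"freeze_base={freeze} opt={opt_cls.__name__} lr={lr} "
                  f"trainable_params={n_trainable}")
            # model.fit(train_ds, validation_data=val_ds, epochs=...)  # then record test
            # accuracy / F1 and correlate them against n_trainable across configurations
```

The sketch only enumerates configurations and counts trainable weights; the fit/evaluate step is left commented out since the datasets and training schedule are not specified in the summary.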