Audio Effect Estimation with DNN-Based Prediction and Search Algorithm

Youichi Okita¹, Haruhiro Katayose²

¹ Graduate School of Science and Technology, Kwansei Gakuin University, Japan
² School of Engineering, Kwansei Gakuin University, Japan

This website provides supplementary materials for the above-titled paper, to appear in the Proceedings of ICASSP 2026. Here, we present several examples of estimation results obtained with the proposed methods.

Abstract

Audio effects play an essential role in sound design. This research addresses the task of audio effect estimation, which aims to estimate the configuration of applied effects from a wet signal. Existing approaches to this problem can be categorized into predictive approaches, which use models pre-trained in a data-driven manner, and search-based approaches, which are based on wet signal reconstruction. In this study, we propose a novel approach that integrates these approaches: first, DNNs predict the dry signal and effect configuration, and then a search is performed based on wet signal reconstruction using these predictions. By estimating the dry signal in the prediction stage, it becomes possible to complement or improve the predictions using reconstruction similarity as an objective function. The experimental evaluation showed that methods based on the proposed approach outperformed the method solely based on the predictive approach. Furthermore, the findings suggest that the task division of predicting the effect type combination followed by the search-based estimation of order and parameters was the most effective across various metrics.
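The search stage described in the abstract can be illustrated with a minimal Python sketch. It assumes the effect types have already been predicted, and it searches over order and parameters so that re-applying the chain to the estimated dry signal best matches the wet signal. The two toy effects (`clip`, `smooth`), the parameter grids, the exhaustive search, and the use of mean squared error as the reconstruction objective are all illustrative assumptions, not the effects, search algorithm, or similarity measure used in the paper.

```python
import itertools
import numpy as np

# Hypothetical toy effects (assumptions for illustration only).
def clip(x, threshold):
    """Hard clipping at +/- threshold."""
    return np.clip(x, -threshold, threshold)

def smooth(x, width):
    """Moving-average filter of the given width (width=1 is the identity)."""
    return np.convolve(x, np.ones(width) / width, mode="same")

EFFECTS = {"clip": clip, "smooth": smooth}
# Hypothetical parameter grids, one per effect type.
GRID = {"clip": [0.1, 0.3, 0.9], "smooth": [1, 3, 5]}

def apply_chain(dry, chain):
    """Apply a list of (effect_name, parameter) pairs in order."""
    y = dry
    for name, param in chain:
        y = EFFECTS[name](y, param)
    return y

def search_order_and_params(dry_est, wet, types):
    """Exhaustively try every order and grid parameter for the given types.

    Returns the chain whose reconstruction from dry_est is closest to wet
    (lowest mean squared error), together with that error.
    """
    best_chain, best_err = None, np.inf
    for order in itertools.permutations(types):
        for params in itertools.product(*(GRID[name] for name in order)):
            chain = list(zip(order, params))
            err = float(np.mean((apply_chain(dry_est, chain) - wet) ** 2))
            if err < best_err:
                best_chain, best_err = chain, err
    return best_chain, best_err
```

Clipping and smoothing do not commute, so in this toy setting the reconstruction error differs between orderings, which is what allows a search to recover the order as well as the parameters.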

Results

In each example, we first show the process of applying effects to the ground-truth dry signal. Then, for the baseline and each proposed method, we demonstrate the process of effect removal, the estimated effect configuration, and the reconstructed wet signal. To evaluate the performance of effect configuration estimation independently of the performance of effect removal, we performed the reconstruction using ground-truth dry signals.
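The SI-SDR values shown in the examples below can be read against the standard definition of the scale-invariant signal-to-distortion ratio, which is conventionally reported in dB (higher is better). The following is a minimal NumPy sketch of that standard definition. The function name `si_sdr`, the `eps` stabilizer, and the absence of mean-centering are assumptions of this sketch, and the paper's evaluation code may differ in such details.

```python
import numpy as np

def si_sdr(estimate, reference, eps=1e-12):
    """Scale-invariant SDR in dB between an estimate and a reference signal."""
    estimate = np.asarray(estimate, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    # Scale the reference so that it best matches the estimate (least squares).
    alpha = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = alpha * reference
    # Whatever remains of the estimate is treated as distortion.
    distortion = estimate - target
    ratio = (np.dot(target, target) + eps) / (np.dot(distortion, distortion) + eps)
    return 10.0 * np.log10(ratio)
```

Because the reference is rescaled before comparison, multiplying the estimate by a constant gain leaves the score essentially unchanged, so the metric reflects waveform similarity rather than overall level.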

Example 1

Ground-truth
Dry
Wet

Bypass-Config-Iter (Baseline)
One effect removed (SI-SDR: 21.40)
Reconstructed (SI-SDR: 36.98)

Dry-Type-Direct + Search
Entire chain removed (SI-SDR: 19.39)
Reconstructed (SI-SDR: 42.22)

Bypass-Type-Iter + Search
One effect removed (SI-SDR: 21.40)
Reconstructed (SI-SDR: 45.46)

Bypass-Config-Iter + Search
One effect removed (SI-SDR: 21.40)
Reconstructed (SI-SDR: 36.27)

Example 2

Ground-truth
Dry
Wet

Bypass-Config-Iter (Baseline)
One effect removed (SI-SDR: 27.43)
Reconstructed (SI-SDR: 22.81)

Dry-Type-Direct + Search
Entire chain removed (SI-SDR: 24.61)
Reconstructed (SI-SDR: 28.14)

Bypass-Type-Iter + Search
One effect removed (SI-SDR: 27.43)
Reconstructed (SI-SDR: 29.34)

Bypass-Config-Iter + Search
One effect removed (SI-SDR: 27.43)
Reconstructed (SI-SDR: 29.29)

Example 3

Ground-truth
Dry
One effect applied
Wet

Bypass-Config-Iter (Baseline)
Two effects removed (SI-SDR: 19.44)
One effect removed
Reconstructed (SI-SDR: 30.65)

Dry-Type-Direct + Search
Entire chain removed (SI-SDR: 16.46)
Reconstructed (SI-SDR: 35.20)

Bypass-Type-Iter + Search
Two effects removed (SI-SDR: 19.44)
One effect removed
Reconstructed (SI-SDR: 35.34)

Bypass-Config-Iter + Search
Two effects removed (SI-SDR: 19.44)
One effect removed
Reconstructed (SI-SDR: 35.33)

Example 4

Ground-truth
Dry
One effect applied
Two effects applied
Wet

Bypass-Config-Iter (Baseline)
Three effects removed (SI-SDR: 9.52)
Two effects removed
One effect removed
Reconstructed (SI-SDR: 9.30)

Dry-Type-Direct + Search
Entire chain removed (SI-SDR: 9.72)
Reconstructed (SI-SDR: 25.19)

Bypass-Type-Iter + Search
Three effects removed (SI-SDR: 9.52)
Two effects removed
One effect removed
Reconstructed (SI-SDR: 18.15)

Bypass-Config-Iter + Search
Three effects removed (SI-SDR: 9.52)
Two effects removed
One effect removed
Reconstructed (SI-SDR: 18.53)

Example 5

Ground-truth
Dry
One effect applied
Two effects applied
Wet

Bypass-Config-Iter (Baseline)
Three effects removed (SI-SDR: 4.74)
Two effects removed
One effect removed
Reconstructed (SI-SDR: 15.18)

Dry-Type-Direct + Search
Entire chain removed (SI-SDR: 9.94)
Reconstructed (SI-SDR: 16.84)

Bypass-Type-Iter + Search
Three effects removed (SI-SDR: 4.74)
Two effects removed
One effect removed
Reconstructed (SI-SDR: 14.73)

Bypass-Config-Iter + Search
Three effects removed (SI-SDR: 4.74)
Two effects removed
One effect removed
Reconstructed (SI-SDR: 13.49)

Citation

@inproceedings{okita2026audio,
author={Okita, Youichi and Katayose, Haruhiro},
title={Audio Effect Estimation with {DNN}-Based Prediction and Search Algorithm},
booktitle={Proceedings of the 2026 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
year={2026},
}