TIGER-DFT: Time-Domain Speech Separation Using Trainable Complex Encoder-Decoder Layers
This article presents TIGER-DFT, a time-domain speech separation model that uses a TIGER core as the separator together with encoder-decoder layers that mimic a windowed discrete Fourier transform. The proposed model achieves 7.72 dB SI-SDR and 9.65 dB SI-SDRi on the noisy Libri2Mix dataset, outperforming the original TIGER model by 0.4 dB on both metrics.
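To make the idea of an encoder that mimics a windowed DFT concrete, the following is a minimal NumPy sketch, not the model's actual implementation. It builds a bank of complex kernels equal to Hann-windowed DFT basis vectors and applies them as a strided 1-D convolution over the waveform. The function names (`windowed_dft_basis`, `encode`), the Hann window, and the window and hop sizes are assumptions chosen for illustration. In a trainable setting, such a matrix would plausibly serve as the initial value of the encoder weights rather than staying fixed.

```python
import numpy as np

def windowed_dft_basis(win_len, n_bins=None):
    """Return a complex (n_bins, win_len) matrix of Hann-windowed DFT kernels.

    Hypothetical helper: the window type and bin count are illustrative choices.
    """
    if n_bins is None:
        n_bins = win_len // 2 + 1           # non-redundant bins for a real input
    n = np.arange(win_len)                  # sample index inside one frame
    k = np.arange(n_bins)[:, None]          # frequency-bin index, as a column
    window = np.hanning(win_len)
    return window * np.exp(-2j * np.pi * k * n / win_len)

def encode(signal, basis, hop):
    """Apply the kernels as a strided 1-D convolution: frame, then project."""
    win_len = basis.shape[1]
    n_frames = 1 + (len(signal) - win_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + win_len] for i in range(n_frames)]
    )
    return frames @ basis.T                 # shape: (n_frames, n_bins)

# Illustrative settings (assumed, not taken from the paper).
rng = np.random.default_rng(0)
x = rng.standard_normal(1600)
basis = windowed_dft_basis(win_len=64)
spec = encode(x, basis, hop=16)
```

With these fixed kernels the output coincides with a short-time Fourier transform of the input; allowing the kernel values to be updated by gradient descent is what would let such a layer depart from the exact DFT during training.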