Abstract: End-to-end time-domain speech separation with masking strategy has shown its performance advantage, where a 1-D convolutional layer is used as the speech encoder to encode a sliding window ...