LEVERAGING DOMAIN FEATURES FOR DETECTING ADVERSARIAL ATTACKS AGAINST DEEP SPEECH RECOGNITION IN NOISE

Can you distinguish these two samples?

And if we let you know, a common voice assistant would register one of these samples as "Open all doors"?

This work strives to detect these adversarial attacks by employing the underutilised power of cepstral coefficients computed using cleverly designed filter-banks and especially inverse filter-banks as shown in our article.

https://github.com/aau-es-ml/adversarial_speech_filterbank_defence