Voice is playing an increasingly important role in how we interact with technology. Changing the television channel, operating a smartphone and making requests to the devices around us are some of the conveniences that speech now offers.
However, this type of communication also carries risks. The voice is a source of information about our identity. Every time we interact with virtual assistants like Siri or Alexa, or with automated telephone answering services, we expose our voice to technologies that not only recognize what we say but can also analyze who we are and extract personal traits such as our age, dialect and mood.
Researchers Luis Alfonso Hernández, Juan Manuel Perero and Fernando Espinoza, from the Signal Processing Applications Group (GAPS) of the Higher Technical School of Telecommunications Engineers at the Polytechnic University of Madrid (UPM) in Spain, in collaboration with the company Sigma AI, have developed a system that uses the latest advances in artificial intelligence to remove personal information from the voice signal.
Specifically, the proposed system uses deep neural networks (deep learning) to obtain a representation of speech that separates the linguistic content from the particular characteristics of each speaker (age, emotional state, dialect…). It then applies transformations to suppress the speaker traits that are to be protected and, finally, generates a voice that preserves the original linguistic content but excludes the sensitive characteristics of the speaker’s voice.
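The three stages described above (separate, transform, regenerate) can be illustrated with a toy sketch. This is not the UPM and Sigma AI system, which relies on trained deep neural networks. Every name here (`SpeechRepresentation`, `encode`, `anonymize`, `decode`) is hypothetical, the "features" are plain lists of numbers, and replacing the speaker vector with the average of a pool of other speakers is just one assumed transformation chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SpeechRepresentation:
    content: List[float]  # stand-in for linguistic content features
    speaker: List[float]  # stand-in for speaker traits (age, dialect, mood)


def encode(features: List[float], n_speaker_dims: int) -> SpeechRepresentation:
    """Hypothetical encoder: assume the last n_speaker_dims values carry speaker traits."""
    split = len(features) - n_speaker_dims
    return SpeechRepresentation(content=features[:split], speaker=features[split:])


def anonymize(rep: SpeechRepresentation, pool: List[List[float]]) -> SpeechRepresentation:
    """Assumed transformation: swap the speaker vector for the mean of a pool of other speakers."""
    mean_speaker = [sum(column) / len(pool) for column in zip(*pool)]
    return SpeechRepresentation(content=list(rep.content), speaker=mean_speaker)


def decode(rep: SpeechRepresentation) -> List[float]:
    """Hypothetical decoder: recombine content and speaker parts into one feature list."""
    return rep.content + rep.speaker


if __name__ == "__main__":
    original = [0.1, 0.2, 0.3, 9.0, 8.0]   # three content values, two speaker values
    pool = [[1.0, 2.0], [3.0, 4.0]]         # speaker vectors from other (made-up) speakers
    anonymized = decode(anonymize(encode(original, n_speaker_dims=2), pool))
    print(anonymized)
```

In this sketch the content part passes through unchanged while the speaker part no longer reflects the original speaker, which mirrors the goal stated in the article. A real system would learn the separation rather than assume it.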
A person’s voice reveals more information about them than we might think. (Image: Amazings/NCYT)
Anonymized voice, safe identity
“If the voice is properly anonymized, it could not be considered personal data and there would be no need to worry about having to securely store biometric data,” says Luis A. Hernández, one of the creators of the new system. In addition, “it must be taken into account that virtual assistants are trained on a large number of audio recordings from many people, and here the problem arises of storing voice recordings that have not been properly anonymized.”
Another important field of application is research, development and innovation in speech technologies, since these fields need large databases of voice recordings, which can also be protected in this way and thus comply with the requirements of the General Data Protection Regulation (GDPR).
The system could be installed either on the mobile device or, in the case of customer service centers, on the switchboard system itself. In the scenario of virtual assistants, “since they are more closed systems,” Hernández clarifies, “the manufacturer would have to integrate it into the assistant software, in order to eliminate the problem of preserving and processing biometric data.”
The system was presented at the international VoicePrivacy initiative, which brings together the main research groups in this field and aims to promote the development of new technologies that suppress sensitive speaker information while preserving the linguistic content of the spoken message.
The solution developed by the UPM and Sigma AI was among those that demonstrated the greatest capacity for voice anonymization while maintaining a high level of linguistic quality in the message. (Source: UPM)