OpenAI can recreate human voices, but it hasn’t released the technology yet


Speech synthesis has come a long way since the 1978 Speak & Spell toy, which once wowed people with its state-of-the-art ability to read words aloud in an electronic voice. Now, using deep-learning AI models, software can not only create realistic-sounding voices but also convincingly imitate existing voices from small samples of audio.

Along these lines, OpenAI this week announced Voice Engine, a text-to-speech AI model that creates synthetic voices from a 15-second clip of recorded audio. The company has posted audio samples of Voice Engine in action on its website.

Once a voice is cloned, a user can enter text into Voice Engine and get an AI-generated voice result. But OpenAI isn’t ready to release its technology widely yet. The company initially planned to launch a pilot program earlier this month in which developers could sign up for the Voice Engine API. But after thinking more about the ethical implications, the company decided to scale back its ambitions for now.

“In line with our stance on AI safety and our voluntary commitments, we have chosen to preview but not release this technology broadly at this time,” the company wrote. “We hope this preview of Voice Engine both highlights its potential and motivates the need to build up societal resilience against the challenges posed by increasingly convincing generative models.”

Generally speaking, voice cloning technology isn’t particularly new: several AI speech synthesis models have appeared since 2022, and the technology is active in the open source community, with software packages including OpenVoice and XTTSv2. But the idea that OpenAI is inching toward letting anyone use its particular brand of voice technology is noteworthy. In some ways, the company’s reluctance to fully release it may be the bigger story.

OpenAI says the benefits of its voice technology include providing reading assistance through natural-sounding voices, giving creators global reach by translating content while preserving native accents, offering personalized speech options for non-verbal individuals, and helping patients regain their own voice after speech-impairing conditions.

But it also means that anyone with 15 seconds of someone’s recorded voice can effectively clone it, which has obvious implications for potential abuse. Even if OpenAI never widely releases Voice Engine, the ability to clone voices has already caused trouble in society through phone scams in which someone imitates the voice of a loved one and through campaign robocalls featuring cloned voices of politicians like Joe Biden.

In addition, researchers and reporters have shown that voice cloning technology can be used to break into bank accounts that use voice verification (such as Chase’s Voice ID). That prompted U.S. Senator Sherrod Brown of Ohio, chairman of the U.S. Senate Committee on Banking, Housing, and Urban Affairs, to send a letter to the CEOs of several major banks in May 2023 asking about the security measures banks are taking to address AI-powered risks.

OpenAI recognizes that the technology could cause trouble if widely released, so it’s initially trying to address those issues with a set of rules. It has been testing the technology with a select group of partner companies since last year. For example, video synthesis company HeyGen has been using the model to translate a speaker’s voice into other languages while keeping the same vocal sound.


