OpenAI’s voice cloning AI model needs only a 15-second sample to work

OpenAI is providing limited access to Voice Engine, a text-to-speech generation platform it developed that can create a synthetic voice from a 15-second clip of someone’s speech. The AI-generated voice can then read text prompts on command. In a blog post, OpenAI said these small-scale deployments are helping to inform its approach, its safeguards, and its thinking about how Voice Engine could be used across various industries.

Companies with access include education technology company Age of Learning, visual storytelling platform HeyGen, front-line health software maker Dimagi, artificial intelligence communication app creator Livox and health system Lifespan.

In these examples released by OpenAI, you can hear how Age of Learning has used the technology to generate pre-written voice-over content, as well as to read out “instant, personalized responses” written by GPT-4 to students.

First, the English reference audio:

Here are three AI-generated audio clips based on the sample:

OpenAI said it began developing Voice Engine in late 2022, and the technology already powers the preset voices in its text-to-speech API and ChatGPT’s read-aloud feature. In an interview with TechCrunch, Jeff Harris, a member of the Voice Engine product team at OpenAI, said the model was trained on “a combination of licensed and publicly available data.” OpenAI told the publication that the model will only be available to about 10 developers.

AI text-to-audio generation is a growing area of generative AI. While most efforts focus on instrumental or natural sounds, fewer focus on speech generation, in part because of the issues cited by OpenAI. Some names in the space include companies like Podcastle and ElevenLabs, which offer AI voice cloning technology and tools that the Vergecast explored last year.

According to OpenAI, its partners agreed to abide by its usage policies, which prohibit using Voice Generation to impersonate an individual or organization without their consent. The policies also bar partners from building ways for individual users to create their own voices and require them to disclose to their audience that the voices are generated by artificial intelligence. OpenAI also added watermarking to trace the origin of audio clips, and it proactively monitors how the audio is used.

OpenAI has proposed a number of measures it believes could limit the risks of such tools, including phasing out voice-based authentication for access to bank accounts, adopting policies to protect the use of people’s voices in AI, increasing education about AI deepfakes, and developing systems to track AI-generated content.
