Google’s Lumiere brings AI video closer to reality than fantasy

Google’s new video generation AI model, Lumiere, uses a new diffusion model called Space-Time-U-Net (or STUNet) that works out where things are in a video (space) and how they simultaneously move and change (time). Ars Technica reports that this method lets Lumiere create a video in one process, rather than putting smaller still frames together.

Lumiere starts by creating a base frame from the prompt. It then uses the STUNet framework to begin approximating where objects within that frame will move, creating more frames that flow into each other and give the appearance of seamless motion. Lumiere also generates 80 frames, compared to 25 frames from Stable Video Diffusion.

Granted, I’m more of a text journalist than a video journalist, but the sizzle reel Google released, along with a preprint scientific paper, shows that AI video generation and editing tools have gone from the uncanny valley to nearly realistic in just a few years. It also establishes Google’s technology in a space already occupied by competitors such as Runway, Stable Video Diffusion, and Meta’s Emu. Runway, one of the first mass-market text-to-video platforms, released Runway Gen-2 in March last year and has started to offer more realistic-looking videos. Runway videos also struggle to depict movement.

Google kindly put clips and prompts on the Lumiere website, which allowed me to run the same prompts through Runway for comparison. The results are as follows:

Yes, some of the clips look a bit artificial, especially if you look closely at the skin textures or the more atmospheric scenes. But look at that turtle! It moves like a turtle in water! It looks like a real turtle! I sent the Lumiere introductory video to a friend who is a professional video editor. While she noted that “you could clearly tell it wasn’t completely real,” she found it impressive that, had I not told her it was AI, she would have thought it was CGI. (She also said: “It’s going to take away my job, isn’t it?”)

While other models stitch videos together from generated keyframes in which the movement has already happened (think of drawings in a flip book), STUNet lets Lumiere focus on the movement itself, based on where the generated content should be at a given time in the video.
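To make the flip-book comparison concrete, here is a toy sketch in Python. It is not Lumiere's or any other model's actual method: it tracks a single hypothetical object along one axis, and the motion function, frame count, keyframe spacing, and time step are all assumptions chosen for illustration. One function fills the gaps between sparse keyframes by linear interpolation, and the other evaluates every frame of the clip directly.

```python
import math


def true_position(t: float) -> float:
    """Hypothetical smooth motion of one object along one axis."""
    return math.sin(t)


def keyframes_then_interpolate(num_frames: int, step: int, dt: float) -> list[float]:
    """Take a keyframe every `step` frames (plus the last frame),
    then fill the gaps with straight-line interpolation."""
    last = num_frames - 1
    frames = []
    for i in range(num_frames):
        lo = (i // step) * step          # keyframe at or before frame i
        hi = min(lo + step, last)        # next keyframe, capped at the clip end
        if hi == lo:                     # frame i is the final keyframe
            frames.append(true_position(lo * dt))
            continue
        w = (i - lo) / (hi - lo)         # how far frame i sits between the two
        frames.append((1 - w) * true_position(lo * dt) + w * true_position(hi * dt))
    return frames


def whole_clip(num_frames: int, dt: float) -> list[float]:
    """Evaluate the motion at every frame, with no gap filling."""
    return [true_position(i * dt) for i in range(num_frames)]


def max_drift(num_frames: int, step: int, dt: float) -> float:
    """Largest gap between the interpolated clip and the whole clip."""
    sparse = keyframes_then_interpolate(num_frames, step, dt)
    dense = whole_clip(num_frames, dt)
    return max(abs(a - b) for a, b in zip(sparse, dense))


if __name__ == "__main__":
    # 80 frames, a keyframe every 16 frames, 0.1 time units per frame (all assumed).
    print(f"max drift from the real motion: {max_drift(80, 16, 0.1):.3f}")
```

In this toy setup the interpolated clip matches the real motion only at the keyframes and drifts in between, which is one way to picture why in-between motion can look off when a video is assembled from keyframes.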

Google hasn’t been a big player in text-to-video, but it has slowly released more advanced AI models and is leaning toward a more multimodal focus. Its Gemini large language model will eventually bring image generation capabilities to Bard. Lumiere isn’t yet available for testing, but it shows Google’s ability to develop an AI video platform that’s comparable to popular, if controversial, AI video generators like Runway and Pika. And just a reminder, this is where Google was with AI video two years ago.

Google Imagen clips in 2022
Image: Google

In addition to text-to-video generation, Lumiere will also allow image-to-video generation, stylized generation (which lets users create videos in a specific style), cinemagraphs that animate only part of a video, and inpainting, which masks out an area of the video to change its color or pattern.

However, Google’s Lumiere paper states, “There are risks of abuse using our technology to create false or harmful content, and we believe it is critical to develop and apply tools for detecting bias and malicious use cases to ensure safety and fairness.” The paper’s authors did not explain how to achieve this.
