Google unveils Veo, a high-definition AI video generator that may rival Sora

liquid reality —

Google’s video-synthesis model creates minute-long 1080p videos from written prompts.

Still images taken from videos generated by Google Veo.

Google / Benj Edwards

On Tuesday at Google I/O 2024, Google announced Veo, a new AI video-synthesis model that can create HD videos from text, image, or video prompts, similar to OpenAI’s Sora. It can generate 1080p videos lasting over a minute and edit videos from written instructions, but it has not yet been released for broad use.

Veo reportedly includes the ability to edit existing videos using text commands, maintain visual consistency across frames, and generate video sequences lasting up to and beyond 60 seconds from a single prompt or a series of prompts that form a narrative. The company says it can generate detailed scenes and apply cinematic effects such as time-lapses, aerial shots, and various visual styles.

Since the launch of DALL-E 2 in April 2022, we’ve seen a parade of new image synthesis and video synthesis models that aim to allow anyone who can type a written description to create a detailed image or video. While neither technology has been fully refined, both AI image and video generators have been steadily growing more capable.

In February, we covered a preview of OpenAI’s Sora video generator, which many at the time believed represented the best AI video synthesis the industry could offer. It impressed Tyler Perry enough that he put his film studio expansions on hold. However, so far, OpenAI has not provided general access to the tool—instead, it has limited its use to a select group of testers.

Now, Google’s Veo appears at first glance to be capable of video-generation feats similar to Sora. We have not tried it ourselves, so we can only go by the cherry-picked demonstration videos the company has provided on its website. That means anyone viewing them should take Google’s claims with a huge grain of salt, because the generation results may not be typical.

Veo’s example videos include a cowboy riding a horse, a fast-tracking shot down a suburban street, kebabs roasting on a grill, a time-lapse of a sunflower opening, and more. Conspicuously absent are any detailed depictions of humans, which have historically been tricky for AI image and video models to generate without obvious deformations.

Google says that Veo builds upon the company’s previous video-generation models, including Generative Query Network (GQN), DVD-GAN, Imagen-Video, Phenaki, WALT, VideoPoet, and Lumiere. To improve quality and efficiency, Google trained Veo on videos paired with more detailed captions, which it says allows the model to interpret prompts more accurately, and the model operates on compressed “latent” video representations.

Veo also seems notable in that it supports filmmaking commands: “When given both an input video and editing command, like adding kayaks to an aerial shot of a coastline, Veo can apply this command to the initial video and create a new, edited video,” the company says.

While the demos seem impressive at first glance (especially compared to Will Smith eating spaghetti), Google acknowledges AI video-generation is difficult. “Maintaining visual consistency can be a challenge for video generation models,” the company writes. “Characters, objects, or even entire scenes can flicker, jump, or morph unexpectedly between frames, disrupting the viewing experience.”

Google has tried to mitigate those drawbacks with “cutting-edge latent diffusion transformers,” which is basically meaningless marketing talk without specifics. But the company is confident enough in the model that it is working with actor Donald Glover and his studio, Gilga, to create an AI-generated demonstration film that will debut soon.

Initially, Veo will be accessible to select creators through VideoFX, a new experimental tool available on Google’s AI Test Kitchen website, labs.google. Creators can join a waitlist for VideoFX to potentially gain access to Veo’s features in the coming weeks. Google plans to integrate some of Veo’s capabilities into YouTube Shorts and other products in the future.

There’s no word yet about where Google got the training data for Veo (if we had to guess, YouTube was likely involved). But Google states that it is taking a “responsible” approach with Veo. According to the company, “Videos created by Veo are watermarked using SynthID, our cutting-edge tool for watermarking and identifying AI-generated content, and passed through safety filters and memorization checking processes that help mitigate privacy, copyright, and bias risks.”
