Google unveils Veo, a high-definition AI video generator that may rival Sora

liquid reality —

Google’s video-synthesis model creates minute-long 1080p videos from written prompts.

Still images taken from videos generated by Google Veo.

Google / Benj Edwards

On Tuesday at Google I/O 2024, Google announced Veo, a new AI video-synthesis model that can create HD videos from text, image, or video prompts, similar to OpenAI’s Sora. It can generate 1080p videos lasting over a minute and edit videos from written instructions, but it has not yet been released for broad use.

Veo reportedly includes the ability to edit existing videos using text commands, maintain visual consistency across frames, and generate video sequences lasting up to and beyond 60 seconds from a single prompt or a series of prompts that form a narrative. The company says it can generate detailed scenes and apply cinematic effects such as time-lapses, aerial shots, and various visual styles.

Since the launch of DALL-E 2 in April 2022, we’ve seen a parade of new image synthesis and video synthesis models that aim to allow anyone who can type a written description to create a detailed image or video. While neither technology has been fully refined, both AI image and video generators have been steadily growing more capable.

In February, we covered a preview of OpenAI’s Sora video generator, which many at the time believed represented the best AI video synthesis the industry could offer. It impressed Tyler Perry enough that he put his film studio expansions on hold. However, so far, OpenAI has not provided general access to the tool—instead, it has limited its use to a select group of testers.

Now, Google’s Veo appears at first glance to be capable of video-generation feats similar to Sora. We have not tried it ourselves, so we can only go by the cherry-picked demonstration videos the company has provided on its website. That means anyone viewing them should take Google’s claims with a huge grain of salt, because the generation results may not be typical.

Veo’s example videos include a cowboy riding a horse, a fast-tracking shot down a suburban street, kebabs roasting on a grill, a time-lapse of a sunflower opening, and more. Conspicuously absent are any detailed depictions of humans, which have historically been tricky for AI image and video models to generate without obvious deformations.

Google says that Veo builds upon the company’s previous video-generation models, including Generative Query Network (GQN), DVD-GAN, Imagen-Video, Phenaki, WALT, VideoPoet, and Lumiere. To enhance quality and efficiency, Veo uses compressed “latent” video representations, and Google included more detailed captions for the videos used to train the model, allowing the AI to interpret prompts more accurately.

Veo also seems notable in that it supports filmmaking commands: “When given both an input video and editing command, like adding kayaks to an aerial shot of a coastline, Veo can apply this command to the initial video and create a new, edited video,” the company says.

While the demos seem impressive at first glance (especially compared to Will Smith eating spaghetti), Google acknowledges AI video-generation is difficult. “Maintaining visual consistency can be a challenge for video generation models,” the company writes. “Characters, objects, or even entire scenes can flicker, jump, or morph unexpectedly between frames, disrupting the viewing experience.”

Google has tried to mitigate those drawbacks with “cutting-edge latent diffusion transformers,” which is basically meaningless marketing talk without specifics. But the company is confident enough in the model that it is working with actor Donald Glover and his studio, Gilga, to create an AI-generated demonstration film that will debut soon.

Initially, Veo will be accessible to select creators through VideoFX, a new experimental tool available on Google’s AI Test Kitchen website, labs.google. Creators can join a waitlist for VideoFX to potentially gain access to Veo’s features in the coming weeks. Google plans to integrate some of Veo’s capabilities into YouTube Shorts and other products in the future.

There’s no word yet about where Google got the training data for Veo (if we had to guess, YouTube was likely involved). But Google states that it is taking a “responsible” approach with Veo. According to the company, “Videos created by Veo are watermarked using SynthID, our cutting-edge tool for watermarking and identifying AI-generated content, and passed through safety filters and memorization checking processes that help mitigate privacy, copyright, and bias risks.”
