Situation
In 2024, AI-generated video has become a hot trend. OpenAI's Sora, unveiled in February, drew a lot of attention for extending generated video length to 60 seconds.
Previously, turning text into video required people and objects, scenes, filming, editing, and much more. The chain is even longer in film and television. Labor, hardware and software, and time all make the final cost of a blockbuster huge. AI-generated video opens up substantial room for cutting costs and improving efficiency.
Tech giants invested early in multimodal video generation research, for example Google's Gemini and VideoPoet and Microsoft's NUWA-XL and Mora. Before Sora, Runway's Gen-1/Gen-2, Pika 1.0, and Stability AI's Stable Video Diffusion were the more popular tools, and Meta had also released Make-A-Video. This year OpenAI introduced Sora, a text-to-video model, taking AI video generation to the next level.
Problem
A big problem with AI video today is that the output for a given prompt is highly uncertain and often cannot be used as-is. Video requires logic: the longer the video, the longer that logic has to stay continuous and reasonable, which makes long videos harder for AI to generate. Take a video of a child eating an apple. The apple should get smaller and smaller, but the AI-generated version may show an apple that is never finished.
AI video generation is difficult for several reasons. First, a video is a continuous sequence of frames, not a simple collection of pictures. Second, model complexity, computational difficulty, and cost all increase. Third, training requires a large amount of paired “text-video” data. Currently, AI-generated videos run about 5-15 seconds because diverse datasets are lacking and the data labeling workload is large. This is why Sora attracted widespread attention by increasing the video length to 1 minute.
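To make the jump from 5-15 seconds to 1 minute more concrete, here is a minimal Python sketch that counts how many frames a clip contains and how many frame-to-frame transitions must stay consistent. The frame rate of 24 frames per second and the helper names `frame_count` and `adjacent_transitions` are assumptions chosen for illustration; they do not describe how Sora or any other model works internally.

```python
# Assumption: 24 frames per second, a common film frame rate.
# Actual models may generate video at other frame rates.
ASSUMED_FPS = 24


def frame_count(duration_seconds: float, fps: int = ASSUMED_FPS) -> int:
    """Return the number of frames in a clip of the given length."""
    return round(duration_seconds * fps)


def adjacent_transitions(frames: int) -> int:
    """Return the number of frame-to-frame transitions in a clip."""
    return max(frames - 1, 0)


if __name__ == "__main__":
    # Clip lengths mentioned in the article: 5 s, 15 s, and 60 s.
    for seconds in (5, 15, 60):
        frames = frame_count(seconds)
        transitions = adjacent_transitions(frames)
        print(f"{seconds} s: {frames} frames, {transitions} transitions")
```

Under this assumed frame rate, a 60-second clip has four times as many frames as a 15-second clip, and every extra transition is one more place where something like the apple can stop shrinking.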
Applications
Waiting for AI technology to mature before using it is the prudent choice, but those who explore first develop first. AI-generated video is a good breakthrough opportunity for the IPTV, OTT, and DVB fields.
For example, the audience for children's animation does not demand the same visual quality as the audience for big-IP 3D animation. Xinhu, the creative director of Yinghua (Shanghai) Information Technology Co., Ltd., describes current AI video generation as a “semi-automatic rifle”, that is, a semi-automated mode of “AI generation + manual production”. At this stage, he said, producing children's content is in line with what AI can currently do.
AI video generation helped Yinghua cut 20% of its production process, reducing costs and increasing efficiency. The “human + AI” mode has reportedly enabled Yinghua to produce children's content in batches. AI can be used for scene construction, character material libraries, precision repair, subtitles, promotional materials, dubbing, and more. The cost is said to be 1/20 that of conventional animation, the production cycle 1/10, and the quality 90% of 2D animation. Moreover, the work being done now lays the foundation for the future; otherwise the technology stays up in the air and is never put into practice.
For now, the use of AI video generation must be kept under human control.
In the future, AI technology will not only reduce costs, but also personalize content, enhance the user experience on OTT platforms, and provide more diverse content to attract different types of viewers. There is no doubt that AI technology will transform the OTT industry and make it more promising. Streaming is the future.
If you're thinking about starting a streaming business, you're welcome to visit OTT Maker and get a free trial. You can also email market@dolit.cn if you have any questions about the OTT business. We are happy to assist you.