Curated by THEOUTPOST
On Wed, 5 Feb, 4:02 PM UTC
13 Sources
[1]
OmniHuman-1 creates full-body AI avatars from a single image
ByteDance, the parent company of TikTok, has recently launched OmniHuman-1, a sophisticated AI video generation framework that can create high-quality videos from a single image coupled with an audio clip. The model combines video, audio, and near-perfect lip-syncing, and is notable for producing not only photorealistic videos but also anthropomorphic cartoons, animated objects, and complex poses. Alongside it, ByteDance introduced another AI model called Goku, which achieves similar text-to-video quality with a compact architecture of 8 billion parameters and specifically targets the advertising market.

These developments position ByteDance among the top players in the AI field alongside Chinese tech giants like Alibaba and Tencent. Its advances significantly disrupt the landscape for AI-generated content relative to companies such as Kling AI, given ByteDance's extensive video media library, which is potentially the largest after Facebook's. The demo videos for OmniHuman-1 showcase impressive results from various input types, with a high level of detail and minimal glitches. Unlike traditional deepfake technologies that often focus solely on facial animation, OmniHuman-1 produces full-body animations that accurately mimic gestures and expressions. The model also adapts well to different image qualities, creating smooth motion regardless of the original input.

OmniHuman-1 leverages a diffusion-transformer model to generate motion by predicting movement patterns frame by frame, resulting in realistic transitions and body dynamics. Trained on an extensive dataset of 18,700 hours of human video footage, the model understands a wide array of motions and expressions. Notably, its "omni-conditions" training strategy, which integrates multiple input signals such as audio, text, and pose references, enhances the accuracy of movement predictions.

Despite the promising advances in AI video generation, the ethical implications are significant. The technology introduces risks such as deepfake misuse in generating misleading media, identity theft, and other malicious applications. Consequently, ByteDance has not yet released OmniHuman-1 for public use, likely due to these concerns. If it becomes publicly available, strong safeguards including digital watermarking and content authenticity tracking will likely be necessary to mitigate potential abuses.
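To make the "diffusion transformer conditioned on audio, text, and pose" description above more concrete, here is a minimal, hedged sketch of a transformer that denoises per-frame motion latents under multimodal conditioning. It illustrates the general technique the article names, not ByteDance's implementation; every module name, feature size, and the way conditions are injected are assumptions.

```python
# Minimal sketch of a diffusion-transformer that denoises a sequence of per-frame
# motion latents under multimodal conditioning (audio, text, pose). Illustrative
# only: layer sizes, names, and the conditioning scheme are assumptions, not the
# actual OmniHuman-1 architecture.
import torch
import torch.nn as nn


class ConditionedMotionDenoiser(nn.Module):
    def __init__(self, latent_dim=256, cond_dim=256, n_layers=4, n_heads=8):
        super().__init__()
        self.time_embed = nn.Sequential(nn.Linear(1, cond_dim), nn.SiLU(),
                                        nn.Linear(cond_dim, cond_dim))
        # One projection per condition stream; missing streams are simply skipped.
        self.proj = nn.ModuleDict({
            "audio": nn.Linear(128, cond_dim),   # e.g. per-frame audio features
            "text":  nn.Linear(512, cond_dim),   # e.g. pooled text embedding
            "pose":  nn.Linear(34,  cond_dim),   # e.g. 17 keypoints x (x, y)
        })
        layer = nn.TransformerEncoderLayer(d_model=latent_dim, nhead=n_heads,
                                           dim_feedforward=4 * latent_dim,
                                           batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.cond_to_latent = nn.Linear(cond_dim, latent_dim)
        self.out = nn.Linear(latent_dim, latent_dim)

    def forward(self, noisy_latents, t, conditions):
        # noisy_latents: (batch, frames, latent_dim) noisy per-frame motion latents
        # t:             (batch, 1) diffusion timestep
        # conditions:    dict mapping stream name -> (batch, frames, feat_dim)
        cond = self.time_embed(t).unsqueeze(1)              # (batch, 1, cond_dim)
        for name, feats in conditions.items():
            cond = cond + self.proj[name](feats).mean(dim=1, keepdim=True)
        x = noisy_latents + self.cond_to_latent(cond)       # inject conditioning
        x = self.backbone(x)                                # attend across frames
        return self.out(x)                                  # predicted noise


if __name__ == "__main__":
    model = ConditionedMotionDenoiser()
    latents = torch.randn(2, 16, 256)                       # 16 frames per clip
    t = torch.rand(2, 1)
    conds = {"audio": torch.randn(2, 16, 128), "pose": torch.randn(2, 16, 34)}
    print(model(latents, t, conds).shape)                   # torch.Size([2, 16, 256])
```

In an actual system, each denoising step would refine the motion latents before a video decoder renders frames; the sketch only shows how several condition streams could be folded into one shared backbone.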
[2]
ByteDance launches ultra-realistic human videos AI OmniHuman-1 amid TikTok controversy -- Features, capabilities & more
Amid the TikTok controversy in the US, ByteDance has quietly launched a new artificial intelligence model, OmniHuman-1. The Chinese technology giant behind TikTok has discreetly introduced the groundbreaking model, designed to generate ultra-realistic human videos from a single still image. The development, which places ByteDance at the forefront of AI-driven content creation, has sparked fresh concerns over the potential misuse of deepfake technology, especially in an era when digital disinformation is an increasing global threat.

According to a recently published research paper by ByteDance's AI division, the OmniHuman-1 model has been trained on over 18,700 hours of human video, allowing it to produce highly accurate, lifelike human movements and speech synchronization, as mentioned in a report by ABC. This leap in generative AI technology could have profound implications, raising questions about its ethical use and potential national security risks.

AI expert Henry Ajder cautioned that OmniHuman-1 represents a significant leap forward in deepfake technology. Unlike previous models, which required hundreds or even thousands of images to generate convincing videos, ByteDance's latest model can achieve astonishingly realistic results from just one image. "If this technology is widely available, it will make it easier than ever to create fake videos for deceptive purposes," Ajder told ABC News. He emphasized that the model's sophisticated rendering of facial expressions and body movements could allow for highly convincing impersonations, posing serious risks in areas like political disinformation, identity theft, and cyber fraud.

ByteDance has yet to disclose the exact sources of the training data used for OmniHuman-1. The company declined ABC News' request for comment, but a ByteDance representative assured Forbes that if the technology is deployed for public use, it will include strict safeguards against harmful content.

Among the demonstrations included in the research paper, OmniHuman-1 transformed a still portrait of Albert Einstein into a video in which the physicist appeared to deliver a lecture. Other examples showcased AI-generated TED Talk speakers and musicians, illustrating the model's potential for education, entertainment, and digital storytelling. One of the key advancements of OmniHuman-1 is its ability to generate high-fidelity video in any aspect ratio, eliminating common AI flaws such as unnatural lip movements and hand distortions. According to researchers, the realism of the outputs surpasses existing AI models, making it difficult for traditional AI-detection tools to identify synthetic content.

The timing of ByteDance's AI breakthrough is particularly significant as governments worldwide grapple with the rising use of AI-generated disinformation. Recent reports from the Brookings Institution highlighted that artificial intelligence played a role in influencing voter opinions during the 2024 U.S. elections, with Russian actors deploying AI-generated propaganda on issues like immigration, crime, and foreign policy. Other countries have also experienced the dangerous potential of AI-driven deception. In Bangladesh, a scandal erupted when an AI-generated deepfake depicted a politician in a compromising image. In Moldova, similar technology was used to falsely portray the country's pro-Western president supporting a Russian-backed political party.
Meanwhile, in the United States, an AI-generated voice clone of President Joe Biden was used to discourage voter participation in the New Hampshire primary, an incident that the state's attorney general condemned as a direct attack on electoral integrity.

While ByteDance has demonstrated its technological prowess, the United States is working to close the gap. President Donald Trump previously announced a $500 billion private-sector AI investment, involving companies like OpenAI, SoftBank, and Oracle, to accelerate American AI innovation. However, John Cohen, a former intelligence official at the Department of Homeland Security, warned that the U.S. has been slow to react to the evolving AI-driven threat landscape. "The United States is in a dynamic and dangerous threat environment, fueled by online content created by foreign intelligence agencies, extremist groups, and criminal organizations," Cohen stated. He added that tools like OmniHuman-1 could empower malicious actors to produce sophisticated deepfakes more efficiently and at a lower cost.

As the world moves into an AI-dominated future, the unveiling of OmniHuman-1 raises urgent ethical and regulatory questions. Whether ByteDance will integrate this technology into TikTok or other platforms remains to be seen, but its capabilities underscore the high-stakes battle over AI supremacy between China and the United States. ByteDance, the parent company of TikTok, was co-founded by Zhang Yiming and Liang Rubo, and government device bans on TikTok are typically driven by national security concerns regarding potential data access by the Chinese government.
[3]
ByteDance OmniHuman Creates Detailed AI Videos From a Single Image
ByteDance, the company behind TikTok, has introduced OmniHuman, an advanced artificial intelligence system capable of generating highly realistic full-body deepfake videos from a single image. This innovation marks a significant step forward in AI-driven video generation, allowing the creation of lifelike animations, synchronized audio, and intricate gestures. While the technology offers immense potential for creative and educational applications, it also raises critical ethical concerns, including risks related to misinformation, fraud, and misuse.

It sounds like something out of a sci-fi movie, but ByteDance has made it a reality with OmniHuman. From virtual teaching assistants to resurrecting historical figures, the possibilities are as exciting as they are unsettling. It's easy to see the creative potential: imagine artists, educators, and filmmakers using this tool to reimagine storytelling. Yet the same technology that can inspire also has the potential to deceive. What happens when these hyper-realistic deepfakes fall into the wrong hands? The risks of misinformation, fraud, and erosion of trust in digital media are real and pressing.

OmniHuman employs sophisticated AI methodologies to produce full-body animations that closely replicate natural movements, gestures, and speech synchronization. At the heart of its functionality lies "omni-conditions" training, a process that integrates data from diverse sources such as text, audio, and body pose modeling. This multi-faceted approach enables the system to deliver outputs that are both highly realistic and adaptable. The AI system is trained on an extensive dataset, reportedly exceeding 18,700 hours of video content. Although ByteDance has not disclosed the exact sources of this data, it is widely speculated that TikTok's vast content library plays a significant role. Beyond video generation, OmniHuman includes advanced editing tools that allow users to modify body proportions, adjust aspect ratios, and even alter gestures. These features make it a versatile solution for video editing and content creation, offering unparalleled precision and realism.

OmniHuman's capabilities unlock a wide range of possibilities across industries, from virtual teaching assistants and the re-creation of historical figures to new forms of storytelling for artists, educators, and filmmakers. These applications highlight OmniHuman's potential to redefine how digital content is produced, consumed, and experienced.

Despite this potential, OmniHuman introduces significant risks that demand careful consideration. The ability to create hyper-realistic deepfakes makes the technology susceptible to misuse in areas such as misinformation, fraud, and identity theft. The growing accessibility of such technologies underscores the urgency for robust detection tools and comprehensive regulatory frameworks. Without proactive measures, the societal impact of deepfake misuse could be profound, affecting governance, security, and personal privacy.

Efforts to regulate deepfake technology remain in their early stages and vary significantly across regions. In the United States, some states have enacted laws targeting AI impersonation and deepfake misuse, but the absence of comprehensive federal regulations creates enforcement challenges. Globally, the regulatory landscape is fragmented.
While some nations prioritize fostering innovation over imposing restrictions, others are beginning to address the ethical and legal implications of advanced AI systems. This inconsistency highlights the need for international collaboration to establish clear guidelines and safeguards that balance technological progress with ethical responsibility.

ByteDance is not the only player in the race to develop advanced AI video generation systems. Major technology companies such as Google, Meta, and Microsoft are also heavily investing in similar technologies, pushing the boundaries of what AI can achieve in video production. Although OmniHuman has not yet been publicly released, ByteDance plans to showcase the technology at an upcoming computer vision conference, signaling its commitment to advancing the field. The rapid pace of AI innovation suggests that comparable systems will soon emerge from other research labs and open source initiatives. This competitive environment emphasizes the importance of balancing technological advancements with ethical considerations to ensure responsible development and use.

The societal implications of OmniHuman and similar technologies are extensive. As deepfake systems become more sophisticated, they risk undermining trust in digital content, complicating governance, and allowing unethical practices in both business and personal contexts. The ethical debate surrounding AI often centers on balancing creative freedom with the need to prevent harm. Addressing these challenges will require open dialogue among technologists, policymakers, and the public. Establishing clear ethical guidelines and fostering transparency in AI development will be essential to navigating these complex issues responsibly. By encouraging collaboration among stakeholders, society can harness the fantastic potential of technologies like OmniHuman while mitigating their risks.
[4]
ByteDance's OmniHuman-1 AI model can generate scarily realistic videos - Phandroid
ByteDance, TikTok's parent company, has introduced a new AI model, OmniHuman-1. Companies coming out with new AI models isn't too surprising these days; however, OmniHuman-1 is a video generator, and unlike other AI-powered video tools, it does not require written prompts. Instead, it uses photos, whether selfies, full-body images, or even cartoon drawings, as input.

AI video and image generation is not a new concept, and companies like Meta already have similar AI tools of their own. What makes OmniHuman-1 particularly impressive, however, is how eerily realistic its AI-generated videos are. The examples of the OmniHuman-1 AI model in action show how good the model is: the accuracy of facial expressions, body movements, and lip-syncing makes it difficult to distinguish between AI-generated content and actual footage. The AI model also allows users to incorporate their own audio or video, and the animated character in the output can then move and speak in a lifelike manner based on that input.

However, this advancement raises significant concerns about deepfakes. As AI-generated videos become indistinguishable from real ones, the potential for misinformation, fraud, and manipulation grows. Deepfakes have already been used in malicious ways, such as spreading disinformation and enabling cybercrime, and the hyper-realistic capabilities of OmniHuman-1 could make things worse. Social media platforms therefore need to take on a bigger responsibility for identifying AI-generated content, with stronger detection algorithms and clear labels to ensure that users are not misled by these hyper-realistic deepfakes.
[5]
TikTok parent company just launched stunning AI video generator -- OmniHuman-1 is taking the world by storm
Chinese AI video hits its stride, and suddenly things ain't what they used to be. We've always known it was coming; what we couldn't predict is where it would come from and how fast. The best AI video generation technology has now moved from a trickle to a tidal wave of product releases and research, and one name in particular has exploded onto our screens in the past few weeks: ByteDance. The company has just released two stunningly good video AI models which rival the best in the world.

For those who don't know, ByteDance is the infamous owner of TikTok. And now the company has released OmniHuman-1, a new multimodal video generation framework which can take a single image and generate extremely sophisticated video with audio attached. The model is special because of its ability to combine video, audio and lip-syncing in a near perfect match. We're not talking about pretty good video here; we're talking extremely high quality output in every way. The project's GitHub demo page features a raft of beautifully crafted videos, all taken from a single image plus an audio file. The lip syncing is almost perfect, the image resolution is spectacular, and there are remarkably few glitches in the output that we can see. The platform is not limited to photorealistic video either; it can produce cartoons, artificial animated objects, animals and even some quite complicated and challenging poses.

In the past few days the company has also dropped Goku, which offers similar text-to-video quality, but with an interesting twist. First, the Goku model features only 8B parameters, which is incredibly small for this kind of quality. It's clear the company is specifically targeting the advertising market, based no doubt on its massive back catalog of TikTok videos and shopping experiences. These moves propel the Chinese company into the AI big league, alongside other Chinese AI giants Alibaba, Tencent and DeepSeek. Suddenly the landscape has changed completely, in ways no one could have imagined even a year ago. Other Chinese companies like Kling AI have already shown what's possible, but the ByteDance tech is different because it comes from a company which probably owns the largest video media library on earth after Facebook.

Meanwhile, Goku also moves AI generation further down the yellow brick road, targeting one of the biggest industries in the world: advertising. The demo videos on the project page show a range of clips which are quite clearly aimed at short- or long-form social media advertising applications. Women and men using body products and other cheesy demo clips predominate. These video tools are not just destined to sell us more products; it's obvious there's a much larger agenda at work here. After advertising, the next domino to fall is almost certainly going to be animated art in all its forms. Even if we don't see full-length animations using this technology in the short term, there's no question that it's already being deployed as part of the production process.

Before we get too excited we should remember that the computing demands of this kind of AI model are still colossal. There's a reason it took Sora so long to appear on the market. It's also important to note that both OmniHuman and Goku exist only in the lab, with no public-facing application for anybody to play with. Yet. However, anyone needing a glimpse into the massive disruption that Chinese AI is bringing to the video animation world should take a look at Kling AI.
The AI-generated video coming out of this publicly accessible commercial service is nothing less than staggering, all generated from a simple text prompt. And in case you think it's all just ten-second clips, take a look at this mock-up of a well-known television show.

Last summer mega industry publication The Hollywood Reporter ran a front-page story entitled 'Hollywood at a Crossroads: Everyone Is Using AI, But They Are Scared to Admit It'. The undertone of the article was fatalistic: basic movie workers' labor would inevitably be "displaced" first by AI, followed later by a creeping AI penetration which would consume everything in its path over time. This process has definitely already started. In the same way digital technology has almost completely unseated analog movie making, AI, with its massive cost efficiencies, will inevitably perform the same kind of industry disruption. The one certainty is we'll see these impacts much sooner than any of us expect. Polish director Besaleel sums up the mood: "I foresee that film and TV productions will eventually employ only leading and perhaps supporting actors, while the entire world of background and minor characters will be created digitally."

The video and movie world is changing, folks, and at warp speed 9.
[6]
TikTok owner ByteDance's new AI offering can generate life-like videos from single picture
The announcement of the AI model comes amid discussions about ByteDance divesting its American business to ensure TikTok's continued operation in the country. Artificial intelligence is the new frontier of innovation, and as tech companies race to unveil their own AI models, TikTok owner ByteDance has revealed OmniHuman-1, an AI tool that makes lifelike videos from a single image.

OmniHuman-1 outperforms existing methods by a significant margin and can create hyper-realistic human videos on the basis of weak signal inputs, according to a research paper on the tool. ByteDance trained the model on over 18,700 hours of human video data, according to Forbes, spanning multiple kinds of inputs such as audio, physical gestures and text. The project page for OmniHuman-1 features video samples of historical figures, animals and animated characters in realistic-looking videos. One black-and-white clip features Albert Einstein delivering a lecture on the importance of emotions in art.

The AI model announcement follows a number of discussions around ByteDance divesting its US business to allow TikTok to function in the United States. US President Donald Trump had previously delayed the ban on TikTok and said he hoped that a solution could be reached on the matter. The social media app was briefly banned in the US over national security concerns before coming back online. The US leader also said that he hopes to see a bidding war for the social media app among American companies, and according to Trump, the recently announced US sovereign wealth fund may be used to fund the sale of TikTok.

According to The Hindu, OmniHuman will compete with OpenAI's Sora and AI video generators like Pika and Luma Labs once it is released. While researchers have mentioned that the OmniHuman-1 samples feature sounds and images from AI generation or public sources, the capabilities of ByteDance, and of AI video generators in general, have sparked concerns. Experts believe that these tools could be used to generate realistic clips and lead to the dissemination of fake news, and false endorsements by public figures like politicians made with AI tools might be detrimental to social harmony.

1. Is this TikTok's first AI video generator? No, the video-sharing platform already has another AI video generator called Jimeng.
2. Has ByteDance explained how it trained its AI tool? The research paper did give some details, but the company has not made any statement.
[7]
TikTok owner ByteDance has a new AI video creator you have to see to believe
TikTok parent company, ByteDance, is showing off a new AI video creator that can produce vivid videos of people talking, singing, and moving around from a single photograph. The new OmniHuman model can bring an image to life with eerily accurate body movements, facial expressions, and gestures. OmniHuman's breakthrough involved training on more than 18,700 hours of video. The AI can now mimic how humans move, speak, and interact in videos. Notably, this AI can create fully moving characters rather than just animating a face or upper body. That means a single picture can be turned into a video of someone giving a speech, dancing, or even playing an instrument. The result is a very realistic video, whether the character is a human from a photograph or one from a more stylized painting. You can see examples below.

If and when ByteDance does make OmniHuman available, it's easy to imagine it blowing up on TikTok. The company already offers an AI video-maker named Jimeng on the platform, and something like OmniHuman could entice many more people to play with TikTok and its other features. Of course, ByteDance won't enter the space without competition. OpenAI's Sora has drawn accolades and is a big name in the AI video space, but there are plenty of others, such as Pika, Runway, Pollo, and Luma Labs' Dream Machine.

There's a lot of potential use for ByteDance's model, whether recreating actors of the past for more movies or teaching students history from the simulated mouths of historical figures. Even digital avatars for social media and gaming could become more lifelike, adapting in real-time based on user input. OmniHuman is still a research project for now, but the fact that ByteDance is already showcasing its capabilities suggests that practical applications aren't far behind. The AI character below could be the next face of a video trend on TikTok.
[8]
TikTok's Parent Teases Video AI Model Rivaling OpenAI's Sora, Turns Photos into Videos
With DeepSeek becoming the world's leading app in no time, ByteDance, the company behind TikTok, has now released a research paper on its new video generation AI model, OmniHuman-1. The OmniHuman-1 model can generate realistic human videos by employing a mixed data training strategy with multi-modality motion conditioning. In the research paper, the authors write, "We propose OmniHuman, an end-to-end multimodality-conditioned human video generation framework that generates human videos based on a single image and motion signals (e.g., audio, video, or both)." The researchers who worked on it include Gaojie Lin, Jianwen Jiang, Jiaqi Yang, Zerong Zheng, and Chao Liang.

The model relies on omni-conditions training, which lets tasks with stronger conditioning signals reuse training data gathered for weaker-conditioned tasks, so less data is wasted. In practice, this means a video can be generated from a single image. While that is exciting, it is scary at the same time, considering deepfake creations are already succeeding in extorting money from senior citizens. Anshuman Jha, an AI consultant at AON, took to LinkedIn to highlight the potential for abuse of such a model. "From entertainment to advertising, the applications are limitless. Imagine personalised ads where celebrities endorse products in real-time or deceased artists perform new songs. The potential for misuse is glaring," he said. At the same time, Jha also described the model as a "marvel".

At the moment, the model is not available to the public. However, the results shared through the official website suggest that the model works on any kind of image. A Reddit discussion on OmniHuman-1 agrees that it could be a game-changer among AI-based video generation models. There is a buzz about it on social media platforms, and everyone seems surprised at the accuracy of the results. Much as DeepSeek has dominated the conversation until now, OmniHuman-1 could be the next talk of the town in video generation AI models.
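To illustrate what reusing data across weaker- and stronger-conditioned tasks could look like in practice, here is a minimal, hypothetical sketch of mixed-condition sampling during training: each clip contributes whatever condition signals it happens to have, and conditions are randomly dropped so a single model covers text-, audio-, and pose-driven generation at once. The stream names and keep ratios are illustrative assumptions, not values from the paper.

```python
# Rough sketch of mixed conditioning during training: every clip carries whatever
# condition signals it has (text, audio, pose), and conditions are randomly kept
# or dropped per step so one model learns all the weaker- and stronger-conditioned
# tasks together. Stream names and ratios are assumptions, not the paper's values.
import random

CONDITION_KEEP_RATIO = {"text": 0.9, "audio": 0.5, "pose": 0.25}


def sample_training_conditions(clip, rng=random):
    """Return the subset of this clip's conditions used for one training step."""
    kept = {}
    for name, signal in clip["conditions"].items():
        if signal is None:                       # the clip simply lacks this signal
            continue
        if rng.random() < CONDITION_KEEP_RATIO.get(name, 1.0):
            kept[name] = signal                  # keep this stream for the step
    return kept                                  # may be empty -> unconditional step


if __name__ == "__main__":
    # A clip that has audio but no pose annotation still contributes training
    # signal to the audio-conditioned task instead of being discarded.
    clip = {"conditions": {"text": "a person giving a talk",
                           "audio": [0.1, 0.3, 0.2],
                           "pose": None}}
    for _ in range(3):
        print(sample_training_conditions(clip))
```

The design intuition is that clips with only weak annotations (say, audio but no pose) are no longer wasted: they still train the shared model on the tasks they can support.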
[9]
Chinese tech giant quietly unveils advanced AI model amid battle over TikTok
"This is probably the most impressive model I've seen," one AI expert said. In the rapidly expanding field of artificial intelligence, the Chinese tech giant behind TikTok this week quietly unveiled an advanced AI model for generating video that leapfrogs the company ahead of its U.S. competition and raises new concerns about the threat of deepfake videos. ByteDance's OmniHuman-1 model is able to create realistic videos of humans talking and moving naturally from a single still image, according to a paper published by researchers with the tech company. Experts who spoke to ABC News warned that the technology -- if made available for public use -- could lead to new abuses and magnify the longstanding national-security concerns about Beijing-based ByteDance. "If you only need one image, then all of the sudden, it's much easier to find a way to target someone," Henry Ajder, a world-leading expert on generative AI told ABC News. "Previously, you might have needed hundreds of images, if not thousands, to create compelling, really interesting videos to train them on." After training the model on over 18,700 hours of human videos, ByteDance researchers boasted that the technology is "unprecedented" in "accuracy and personalization," with users able to create "extremely realistic human videos" that significantly outperform existing methods. Based on a single still image, users can create content that lacks the telltale signs of artificial generation -- such as issues depicting hand movements or lip syncing -- and can potentially evade AI-detection tools, according to Ajder. "This is probably the most impressive model I've seen to combine all of these different multimodal activities," Ajder said. "The ability to generate custom voice audio to match the video is notable and then, of course, there's just the fidelity of the actual video outputs themselves. I mean, they're incredibly realistic. They're incredibly impressive." ByteDance declined ABC News' request for comment, and their research paper offered limited details about the source of the videos used to train the model. A ByteDance representative told Forbes that the tool, if publicly deployed, would include safeguards against harmful and misleading content. Last year, TikTok announced that the platform would automatically label AI-generated content and generally work to improve AI literacy. Among the videos released in the research paper, OmniHuman was used to transform a still image of Albert Einstein's portrait into a video of the theoretical physicist delivering a lecture. Other artificially generated videos depicted speakers delivering Ted Talks and musicians playing piano while singing. According to the research paper, the model can generate realistic video at any aspect ratio based on a single image and audio clip. While the release of the model marks a new advancement in the rapidly growing field of artificial intelligence, it also raises the stakes of the harms that can stem from it, including deepfakes used to influence elections or produce non-consensual pornography, experts said. According to John Cohen, an ABC News contributor and former head of intelligence at the Department of Homeland Security, the ability to create higher quality videos using AI could lead to "dramatic expansion" of the threats stemming from the content. 
"The United States is in the midst of a dynamic and dangerous threat environment that in large part is fueled by online content that is purposely placed there by foreign intelligence services, terrorist groups, criminal organizations and domestic violence groups for the purposes of inspiring and informing criminal and oftentimes violent activities," Cohen said, warning that technology like OmniHuman could allow bad actors to create deep fakes "more effectively, more efficiently and more cheaply." Ahead of the 2024 election, artificial intelligence was used by Russian individuals to sow discord among voters, including the dissemination of propaganda videos about immigration, crime, and the ongoing war in Ukraine, according to a recent report from the Brookings Institution, a nonpartisan research group. While state and local authorities were able to correct much of the disinformation in real time, the advancing technology has had sprawling implications abroad. In Bangladesh -- a Muslim majority country -- AI was used to create a scandalous fake image of a politician in a bikini, and in Moldova, similar technology was used to create a fake video of the country's pro-West president supporting a political party aligned with Russia. Before last year's New Hampshire primary, AI was used to create a phone call impersonating the voice of President Joe Biden encouraging recipients of the call to "save your vote" for the November general election, rather than participate in the critical early primary. The New Hampshire attorney general's office described the calls as "an unlawful attempt to disrupt the New Hampshire Presidential Primary Election and to suppress New Hampshire voters." While OmniHuman has not been released for public use, Ajder predicted that the tool could soon be rolled out across ByteDance's platforms, including TikTok. The prospect adds to the complex dilemma the United States faces, as companies like ByteDance are required to support and cooperate with operations by China's military and intelligence services, according to Cohen. ByteDance's technological success comes as the U.S. has invested record amounts of money to advance AI technology. President Donald Trump -- who named a so-called "AI czar" to his administration -- last month announced a $500 billion private sector AI investment between the companies OpenAI, Softbank and Oracle. "The challenge is that our federal government has for years been too slow to react to this threat environment," Cohen said. "Until we do that, we're going to be behind the eight ball in dealing with these emerging threats."
[10]
This AI System by TikTok's Owners Can Generate Realistic Videos of People
It is a research work and the model is not available in the public domain. ByteDance, the company behind TikTok, recently shared its research on a new artificial intelligence (AI) framework. Dubbed OmniHuman, it is a video-generation framework that can create realistic human videos with full-body movement and lip-syncing. The researchers stated that it requires a human image along with motion signals such as video or audio to generate output. Several demonstration videos generated using the AI model have also been shared, showcasing the realism of the final output. Notably, the company stated that the AI model is not available in the public domain.

The researchers shared several demonstrations and detailed the framework on its website. It is an end-to-end system that was built using a novel multimodality motion conditioning mixed training strategy, the post claimed. While the researchers did not share any benchmark metrics, they claimed that the AI model "significantly outperforms existing methods." OmniHuman can generate videos using an image of the person and a motion signal. Motion signals can be audio only, video only, or a combination of audio and video. The AI model can also generate realistic videos based on text prompts. These videos can be full-body, with the limbs, facial expressions, and lip movement synced with the audio or music playing in the background. OmniHuman can generate videos in different aspect ratios, allowing flexibility to users.

The use of motion signals is a novel technique, which the company is calling omni-conditions training. With this, the AI model is trained on different modalities, including text, image, audio, and video. Researchers said this allowed the model to learn mixed conditioning, which overcame the scarcity of high-quality data. Notably, the model was trained on 18,700 hours of human video data. The details about the training process have been documented in a paper published on the online pre-print server arXiv.

The company also shared several demonstrations of videos generated using the model, and the results appear to be highly realistic, with natural body movements, hand gestures, and lip movements. Such realism has also raised concerns about deepfakes. However, the company has specified that the AI model is currently not available to be downloaded, and there is no service people can use to access its capabilities.
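Since the article describes the inputs as one reference image plus a motion signal that is audio only, video only, or both, with a selectable aspect ratio, a small request object can make that interface concrete. OmniHuman has no public API, so every field and check below is a hypothetical illustration of how such inputs might be organized, not an actual ByteDance interface.

```python
# Hypothetical request object mirroring the inputs described above: one reference
# image plus a motion signal (audio only, video only, or both) and an aspect
# ratio. All names and defaults are illustrative assumptions; there is no public
# OmniHuman API.
from dataclasses import dataclass
from typing import Optional


@dataclass
class GenerationRequest:
    reference_image: str                  # path to the single input image
    audio: Optional[str] = None           # path to driving audio, if any
    driving_video: Optional[str] = None   # path to driving video, if any
    aspect_ratio: str = "9:16"            # e.g. "9:16" portrait, "16:9" landscape

    def motion_signal(self) -> str:
        if self.audio and self.driving_video:
            return "audio+video"
        if self.audio:
            return "audio"
        if self.driving_video:
            return "video"
        raise ValueError("at least one motion signal (audio or video) is required")


if __name__ == "__main__":
    req = GenerationRequest(reference_image="portrait.jpg", audio="lecture.wav")
    print(req.motion_signal())            # -> "audio"
```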
[11]
ByteDance unveils a deepfake model that may be the most realistic yet
WTF?! Machines' ability to generate fake videos of people has become alarmingly impressive. ByteDance, the Chinese tech giant behind TikTok, just showed off a new AI system called OmniHuman-1 that can create deepfake videos almost indistinguishable from reality to the average person. We may be well past the uncanny valley point right now.

OmniHuman-1's fake videos look startlingly lifelike, and the model's deepfake outputs are perhaps the most realistic to date. Just take a look at the demo of a TED Talk that never actually took place. The system only needs a single photo and an audio clip to generate these videos from scratch. You can also adjust elements such as aspect ratio and body framing. The AI can even modify existing video footage, editing things like body movements and gestures in creepily realistic ways.

Of course, the results aren't 100% perfect. Some poses do look a bit off, like an awkward example of a subject holding a wine glass. There's also an AI-rendered lecture from Einstein in which his hands twist in odd directions, although his face is rendered almost perfectly. Still, the quality overall is way ahead of previous deepfake techniques.

Under the hood, OmniHuman-1 was trained on 18,700 hours of video data using a novel "omni-conditions" approach that lets it learn from multiple input sources like text prompts, audio, and body poses simultaneously. The ByteDance researchers say that this wider training data helps the AI "significantly reduce data wastage" compared to older deepfake models.

The implications of this technology are concerning. Deepfakes have already been weaponized for misinformation campaigns, fraud, and all sorts of nefarious purposes over the past few years. There were numerous incidents during the 2024 election cycle of deepfake audio and video being spread to mislead voters, and financial scams conned people out of billions last year, too. One notable case involved a scammer using AI to pose as Brad Pitt, tricking a woman into sending $850,000 last month. Considering these incidents, hundreds of AI ethics experts pleaded for deepfake regulations last year. Several US states have already passed laws against malicious deepfakes, but there's still no overarching federal legislation. California, for one, was on the verge of enacting a law that would let judges force people to take down deepfakes and potentially face fines for posting them. However, that bill has stalled in the legislative process.
[12]
ByteDance's Deepfake Tool Creates Convincing Videos From One Photo
Watch the short talking-head video above. Granted, it is in French, and close inspection of it may raise suspicions, but, perhaps caught unaware, it could well fool people into believing it is a real video and not AI-generated. The clip is from OmniHuman-1, an AI video system created by ByteDance, the Chinese company behind TikTok, which can deepfake a person using just one photo and one piece of audio.

OmniHuman-1 is just a research paper, for now, but the demos ByteDance is showing off are mightily impressive and appear to be an improvement on other deepfake apps that suffer from uncanny valley syndrome. TechCrunch reports that OmniHuman-1 has been trained on 19,000 hours of video content from "undisclosed sources," which you can guarantee means any video ByteDance found on the internet or any other platform, copyrighted or not. The AI tool can also edit existing videos and can change the movements of a person's limbs. TechCrunch calls the results "astonishing." In the examples below, a woman giving a fake TED Talk achieves a good level of verisimilitude while an AI Albert Einstein delivers a lecture in front of a chalkboard.

"We propose an end-to-end multimodality-conditioned human video generation framework named OmniHuman, which can generate human videos based on a single human image and motion signals (e.g., audio only, video only, or a combination of audio and video)," the ByteDance researchers write. "In OmniHuman, we introduce a multimodality motion conditioning mixed training strategy, allowing the model to benefit from data scaling up of mixed conditioning. This overcomes the issue that previous end-to-end approaches faced due to the scarcity of high-quality data. OmniHuman significantly outperforms existing methods, generating extremely realistic human videos based on weak signal inputs, especially audio. It supports image inputs of any aspect ratio, whether they are portraits, half-body, or full-body images, delivering more lifelike and high-quality results across various scenarios."

Users of OmniHuman-1 will get better results if they use high-quality, high-resolution reference images. ByteDance even shared a series of videos showing deepfakes talking with their hands, a part of the body AI imagery notoriously struggles with.

The onset of deepfake technology has worrying implications in the real world: malicious actors try to use AI video to sway voters in elections by posting fake endorsements or besmirching an opposing politician's name. In February, a finance worker was scammed into paying HK$200 million ($25.6 million) to criminals after a virtual meeting with a deepfake impersonator.
[13]
Deepfake videos are getting shockingly good | TechCrunch
Researchers from TikTok owner ByteDance have demoed a new AI system, OmniHuman-1, that can generate perhaps the most realistic deepfake videos to date. Deepfaking AI is a commodity. There's no shortage of apps that can insert someone into a photo, or make a person appear to say something they didn't actually say. But most deepfakes, and video deepfakes in particular, fail to clear the uncanny valley. There's usually some tell or obvious sign that AI was involved somewhere. Not so with OmniHuman-1, at least from the cherry-picked samples the ByteDance team released.

According to the ByteDance researchers, OmniHuman-1 only needs a single reference image and audio, like speech or vocals, to generate a video. The output video's aspect ratio is adjustable, as is the subject's "body proportion," i.e. how much of their body is shown in the fake clip. OmniHuman-1 can also edit existing videos, even modifying the movements of a person's limbs. It's truly astonishing how convincing the results can be. Granted, OmniHuman-1 isn't perfect. The ByteDance team says that "low-quality" reference images won't yield the best videos, and the system seems to struggle with certain poses, such as one sample's weird gestures with a wine glass. Still, OmniHuman-1 is easily head and shoulders above previous deepfake techniques, and it may well be a sign of things to come. While ByteDance hasn't released the system, the AI community tends not to take long to reverse-engineer models like these.

The implications are worrisome. Last year, political deepfakes spread like wildfire around the globe. On election day in Taiwan, a Chinese Communist Party-affiliated group posted AI-generated, misleading audio of a politician throwing his support behind a pro-China candidate. In Moldova, deepfake videos depicted the country's president, Maia Sandu, resigning. And in South Africa, a deepfake of rapper Eminem supporting a South African opposition party circulated ahead of the country's election. Deepfakes are also increasingly being used to carry out financial crimes. Consumers are being duped by deepfakes of celebrities offering fraudulent investment opportunities, while corporations are being swindled out of millions by deepfake impersonators. According to Deloitte, AI-generated content contributed to more than $12 billion in fraud losses in 2023, and could reach $40 billion in the U.S. by 2027.

Last February, hundreds in the AI community signed an open letter calling for strict deepfake regulation. In the absence of a law criminalizing deepfakes at the federal level in the U.S., more than 10 states have enacted statutes against AI-aided impersonation. California's law, currently stalled, would be the first to empower judges to order the posters of deepfakes to take them down or potentially face monetary penalties. Unfortunately, deepfakes are hard to detect. While some social networks and search engines have taken steps to limit their spread, the volume of deepfake content online continues to grow at an alarmingly fast rate. In a May 2024 survey from ID verification firm Jumio, 60% of people said they encountered a deepfake in the past year. Seventy-two percent of respondents to the poll said they were worried about being fooled by deepfakes on a daily basis, while a majority supported legislation to address the proliferation of AI-generated fakes.
ByteDance, TikTok's parent company, launches OmniHuman-1, an advanced AI model capable of generating highly realistic full-body videos from a single image, raising both excitement and concerns in the tech world.
ByteDance, the parent company of TikTok, has recently introduced OmniHuman-1, a groundbreaking AI video generation framework that can create high-quality, full-body videos from a single image coupled with an audio clip [1]. This sophisticated model combines video, audio, and near-perfect lip-syncing capabilities, positioning ByteDance among the top players in the AI field alongside Chinese tech giants like Alibaba and Tencent [1][2].
OmniHuman-1 leverages a diffusion-transformer model to generate motion by predicting movement patterns frame-by-frame, resulting in realistic transitions and body dynamics [1]. The model has been trained on an extensive dataset of 18,700 hours of human video footage, enabling it to understand a wide array of motions and expressions [1][2]. Its "omni-conditions" training strategy integrates multiple input signals such as audio, text, and pose references, enhancing the accuracy of movement predictions [1].
The AI system is capable of producing not only photorealistic videos but also anthropomorphic cartoons, animated objects, and complex poses [1]. Unlike traditional deepfake technologies that often focus solely on facial animations, OmniHuman-1 encompasses full-body animations, accurately mimicking gestures and expressions [1][3]. The model adapts well to different image qualities, creating smooth motion regardless of the original input [1].
OmniHuman-1's capabilities unlock a wide range of possibilities across various industries, from education and entertainment to filmmaking, advertising, and lifelike digital avatars for social media and gaming.
While the technology offers immense potential for creative and educational applications, it also raises critical ethical concerns [3]. The ability to create hyper-realistic deepfakes introduces risks related to misinformation, fraud, and erosion of trust in digital media [2][3]. Experts warn that if widely available, this technology could make it easier than ever to create fake videos for deceptive purposes [2].
The rapid advancement of AI video generation technology underscores the need for robust detection tools and comprehensive regulatory frameworks [3]. Currently, efforts to regulate deepfake technology remain in their early stages and vary significantly across regions [3]. The absence of comprehensive federal regulations in the United States creates enforcement challenges, highlighting the need for international collaboration to establish clear guidelines and safeguards [3].
ByteDance's introduction of OmniHuman-1, along with another AI model called Goku, significantly disrupts the landscape for AI-generated content [1][5]. The company's extensive video media library, potentially the largest after Facebook, gives it a unique advantage in the field [1][5]. However, major technology companies such as Google, Meta, and Microsoft are also heavily investing in similar technologies, pushing the boundaries of what AI can achieve in video production [3].
As the world moves into an AI-dominated future, the unveiling of OmniHuman-1 raises urgent ethical and regulatory questions. Whether ByteDance will integrate this technology into TikTok or other platforms remains to be seen, but its capabilities underscore the high-stakes battle over AI supremacy between China and the United States [2].