Curated by THEOUTPOST
On Fri, 13 Dec, 12:04 AM UTC
11 Sources
[1]
ChatGPT Can Now Watch and Interact With You in Real Time - Decrypt
OpenAI took the wraps off ChatGPT's long-promised video capabilities Thursday, letting users point their phones at objects for real-time AI analysis -- a feature that had been gathering dust since its first demo in May. Previously, you could interact with ChatGPT through text, charts, voice, or still photos. The new mode, released late Thursday, lets ChatGPT watch you in real time and respond conversationally. In my tests, it solved math problems, suggested recipes, told stories, and even became my daughter's new best friend, chatting with her while she made pancakes, offering suggestions, and encouraging her learning through different games.

The release comes just a day after Google showed its own take on a camera-enabled AI assistant, powered by the newly minted Gemini 2.0. Meta has been playing in this sandbox too, with its own AI that can see and chat through phone cameras.

ChatGPT's new tricks aren't for everyone, though. Only Plus, Team, and Pro subscribers can access what OpenAI calls "Advanced Voice Mode with vision." The Plus subscription costs $20 a month; the Pro tier costs $200.

"We're excited to announce that we're bringing video to Advanced voice mode so you can bring live video and also live screen sharing into your conversations with ChatGPT," Kevin Weil, OpenAI's Chief Product Officer, said in a video Thursday. The stream was part of the "12 Days of OpenAI" campaign, which will deliver 12 announcements over as many consecutive days. So far, OpenAI has launched its o1 model for all users, unveiled the $200-per-month ChatGPT Pro plan, introduced reinforcement fine-tuning for customized models, released its generative video app Sora, updated its Canvas feature, and brought ChatGPT to Apple devices via the tech giant's Apple Intelligence feature.

The company gave a peek at what the feature can do during Thursday's livestream. Users activate video mode from the same interface as Advanced Voice and start interacting with the chatbot in real time. The chatbot showed strong visual understanding and delivered relevant feedback with low enough latency to make the conversation feel natural.

Getting here wasn't exactly smooth sailing. OpenAI first promised these features "within a few weeks" back in May, but the rollout was postponed following controversy over mimicking actress Scarlett Johansson's voice -- without her permission -- in Advanced Voice Mode. Since video mode relies on Advanced Voice Mode, that apparently slowed things down.

Rival Google is not sitting idle, either. Project Astra just landed in the hands of "trusted testers" on Android this week, promising a similar feature: an AI that speaks multiple languages, taps into Google's search and maps, and remembers conversations for up to 10 minutes. It is not yet widely available, with a broader rollout expected early next year. Google also has more ambitious plans for its AI models, giving them the ability to execute tasks in real time -- agentic behavior that goes beyond audiovisual interactions.

Meta is also fighting for a place in the next era of AI interactions. Its assistant, Meta AI, showed off similar capabilities this September, providing low-latency responses and real-time video understanding.
But Meta is betting on augmented reality to push its AI offering, with "discreet" smart glasses capable of powering those interactions through a small camera built into the frames. Meta calls the effort Project Orion.

Current ChatGPT Plus users can try the new video features by tapping the voice icon next to the chat bar, then hitting the video button. Screen sharing takes an extra tap through the three-dot menu. For Enterprise and Edu ChatGPT users eager to try the new video features, January is the magic month. As for EU subscribers? They'll just have to watch from the sidelines for now.
[2]
ChatGPT's new skill will change how you interact forever
OpenAI has launched real-time video capabilities in ChatGPT, enhancing its Advanced Voice Mode with visual understanding. The feature, which began rolling out on December 12, lets users point their camera at the world and get contextual assistance from the AI -- a notable advance in the chatbot's conversational abilities for ChatGPT Plus, Team, and Pro subscribers.

Advanced Voice Mode now includes real-time video analysis and screen-sharing functionality. Users can point their phones at objects for immediate responses or share their screens for detailed explanations of settings or problems. OpenAI demonstrated the feature during a livestream, showcasing its ability to hold casual conversations and provide insights based on visual input.

The rollout began on December 12 and will extend over the following week. European users, along with ChatGPT Enterprise and Edu subscribers, face a delay: they will receive access early next year. The expansion reflects OpenAI's broader push to build richer interaction into its suite of products.

In a parallel development, Apple recently released iOS 18.2, which incorporates several ChatGPT features across Siri, Writing Tools, and Visual Intelligence. The Siri integration lets the voice assistant recognize queries that fall outside its own capabilities and hand them off to ChatGPT; users are notified and must approve the handoff before it proceeds. On iPhone 16 devices, Visual Intelligence lets users point the camera at objects or scenes to retrieve information via ChatGPT or Google. The Writing Tools feature also gains a new "Compose" tool for creating content from scratch with ChatGPT's help. These features emphasize utility and user control, and they count against ChatGPT's normal usage limits.

OpenAI's "12 Days of OpenAI" campaign, which commenced on December 5, features daily livestream sessions revealing new features or products. CEO Sam Altman has described the campaign as a mix of significant updates and minor enhancements. Notable announcements so far include a new Santa voice for Advanced Voice Mode, activated via a snowflake icon, and Sora, OpenAI's video generation model, now available to ChatGPT Pro and Plus users for text-to-video and video-to-video creation. Canvas, previously a beta feature, was also released to all web users, offering a more integrated workspace for writing and coding projects.

Looking ahead, OpenAI plans to expand its offerings further: the full version of its o1 model is set to enhance reasoning capabilities, and the company aims to make Reinforcement Fine-Tuning more widely available, currently soliciting applications from research institutes and universities to fine-tune models for specific tasks.
[3]
ChatGPT adds eyes to its voice with new screen and video sharing feature
You can stream video from your camera or share your screen with the AI

ChatGPT gazed upon day six of the 12 Days of OpenAI with a fresh eye, courtesy of a new visual ability connected to its Advanced Voice Mode that lets you share your screen and stream live video to the AI chatbot. Instead of uploading photos and screenshots to ask ChatGPT questions, you can now show it what you're looking at directly and ask for advice. The feature is a bit like having a video chat with a friend, albeit one who isn't sharing their own image back. But you can hear the AI's voice, making the conversation hands-free.

If you want to try it out, tap the voice icon in the ChatGPT app and then the video icon to start streaming video from your camera. To share your screen, just tap the three-dot menu and select "Share Screen."

Imagine you're struggling to assemble a new IKEA bookshelf (because who hasn't?). Instead of staring at confusing instructions, you can point your camera at the half-built mess and ask ChatGPT, "What did I do wrong here?" The AI can check out the parts and offer step-by-step guidance.

If you need to tweak some settings on your phone or computer and aren't sure how, you can share your screen with ChatGPT and let the AI walk you through the menus and buttons to get things sorted. No more digging through endless tech forums or asking your friend who's good with computers.

The feature might even make ChatGPT a better sous-chef. If your recipe says "whisk until it thickens" and you're not sure you've whisked enough, just point the camera at your bowl and ask ChatGPT for confirmation -- or for a gentle suggestion that maybe you should order in. OpenAI's CPO Kevin Weil and his team showed off the new feature during its debut by pointing the camera at their pour-over coffee setup; the AI recognized the coffee-making gear and walked them through the steps like a virtual barista.

The feature encourages treating ChatGPT more like a person than a computer, even more than the voice interface alone does. Being able to 'see' makes the AI seem more present in the real world and less like a chatbot in a vacuum. OpenAI did acknowledge that sharing what your camera sees might make some users wary. The feature isn't on automatically: you have to activate it each time you use it, so there's no accidental video sharing.

The new feature is only available to ChatGPT Plus and Pro users as of right now. Enterprise and Education tier subscribers will get access next month, but OpenAI didn't say if or when the free tier will. That makes sense: the feature probably demands a lot of computing power, and OpenAI doesn't want ChatGPT to go down again after Wednesday's outage.
[4]
ChatGPT Can Finally See Through Your Camera for Real-Time Interaction
You can also share your screen with ChatGPT and ask questions about whatever is displayed on the screen.

About seven months ago, OpenAI introduced the GPT-4o model and demonstrated its multimodal capabilities, showing off cool demos where you could share your camera with ChatGPT and interact with it in real time using voice. Now, after a long wait, ChatGPT can finally see through your camera and offer insights, suggestions, and guidance in real time.

As part of the "12 Days of OpenAI" announcements, the company has released Live Video for ChatGPT, available through Advanced Voice Mode. You can now point your camera at a monument, food items, your surroundings, or anything else, and ask ChatGPT questions in real time. Not only that, you can also share your screen and voice chat with ChatGPT at the same time. Screen sharing is incredibly useful, especially for studying: share your screen with ChatGPT and seek its assistance while solving math problems, understanding diagrams and charts, or analyzing anything else on display.

OpenAI says live video and screen sharing are rolling out to the latest ChatGPT app on Android and iOS, starting today and over the course of the next week. It's a paid feature, so ChatGPT Plus, Pro, and Team users will have access to the new capabilities. To start live video, open Advanced Voice Mode and tap the camera icon. To share your screen, tap the three-dot menu in Advanced Voice Mode and choose "Share Screen."

By the way, Google is also developing Project Astra, which lets you share your camera and screen with Gemini 2.0. You can try an early preview of Project Astra by following our article.

Apart from that, OpenAI announced a special Santa voice mode, letting you talk to Santa in his signature jolly voice. Tap the snowflake icon in ChatGPT to start the conversation. It's rolling out today and will be available for the rest of December. The good news is that Santa mode is available to free users as well, with extended voice limits.
[5]
OpenAI is giving ChatGPT's voice feature the ability to 'see'
OpenAI is adding vision to ChatGPT's advanced voice feature. On the sixth day of its "12 Days of OpenAI" event, the artificial intelligence startup announced that ChatGPT can recognize objects shown through a smartphone camera or on a device's screen, and respond with its Advanced Voice Mode feature. OpenAI teased the capability when it debuted its GPT-4o model in May.

Most ChatGPT Plus and Pro users and all Team users will be able to access the video and screen-sharing features through the ChatGPT mobile app in the coming days, OpenAI said, while it expects Plus and Pro users in the European Union, Switzerland, Iceland, Norway, and Liechtenstein to get the features soon. The chatbot's Enterprise and Edu users will get video and screen sharing in January.

Advanced Voice is powered by OpenAI's natively multimodal 4o model, the startup said, meaning it can process audio input directly and respond in a natural, conversational way. (A rough code sketch of the kind of multimodal request such a model handles follows at the end of this article.)

OpenAI also added a Santa preset voice to Advanced Voice Mode, available to all mobile, web, and desktop app users. The voice, accessed through a snowflake icon in ChatGPT, will be available worldwide until early January, the startup said.

At the beginning of the livestream announcement, the OpenAI team apologized for an hours-long outage that hit ChatGPT and its newly launched video generator, Sora, the previous day. The company released the text-to-video generator to some paid users earlier this week, calling it "critical to our AGI [artificial general intelligence] road map." However, chief executive Sam Altman wrote on X the following day that the startup had "significantly underestimated demand for sora," and that it will "take awhile to get everyone access."

As part of its holiday launch event, OpenAI has also announced its o1 model out of preview, a $200-per-month ChatGPT subscription tier, and ChatGPT access through Apple's Siri voice assistant.
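ChatGPT's live video mode isn't exposed to developers in this exact form, but the underlying idea -- a natively multimodal model answering questions about what it sees -- can be sketched against OpenAI's public chat completions API. Below is a minimal illustration in Python; the model name, file name, and single-frame approach are assumptions for illustration (the real feature streams continuous video and audio rather than one still image).

```python
import base64
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def ask_about_frame(jpeg_path: str, question: str) -> str:
    """Send one captured camera frame plus a question to a multimodal model."""
    with open(jpeg_path, "rb") as f:
        frame_b64 = base64.b64encode(f.read()).decode("utf-8")

    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative choice of multimodal model
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{frame_b64}"}},
            ],
        }],
    )
    return response.choices[0].message.content


# Hypothetical usage: "frame.jpg" stands in for a frame grabbed from the camera.
print(ask_about_frame("frame.jpg", "What object am I holding?"))
```

A real-time experience like ChatGPT's would layer continuous capture, audio streaming, and low-latency responses on top of this basic request shape; the sketch shows only the single-image building block.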
[6]
You Can Now Ask ChatGPT Queries About Your Surroundings With This Feature
The feature will be rolled out to Enterprise and Edu users in early 2025

OpenAI rolled out the Advanced Voice Mode with Vision feature in ChatGPT on Thursday. The feature, which lets the artificial intelligence (AI) chatbot access the smartphone's camera to capture visual information about the user's surroundings, will be available to all ChatGPT Plus, Team, and Pro subscribers. It draws on the capabilities of GPT-4o and can give real-time voice responses about whatever is shown on camera. Vision in ChatGPT was first unveiled in May during the company's Spring Update event.

The new feature arrived on day six of OpenAI's 12-day feature release schedule. The AI firm has so far released the full version of the o1 model, the Sora video generation model, and a new Canvas tool. Now, with Advanced Voice Mode with Vision, users can let the AI see their surroundings and ask questions about them.

In a demonstration, OpenAI team members interacted with the chatbot with the camera on and introduced several people. Afterwards, the AI could answer a quiz about those people even when they were no longer on screen. This shows that the vision mode also comes with memory, although the company did not specify how long the memory lasts.

Users can show the AI their fridge and ask for recipes, show their wardrobe and ask for outfit recommendations, or point the camera at a landmark outside and ask questions about it. The feature is paired with the chatbot's low-latency, emotive Advanced Voice Mode, making it easy to interact in natural language.

Once the feature rolls out to them, users can open the ChatGPT mobile app and tap the Advanced Voice icon. The new interface shows a video option; tapping it gives the AI access to the user's camera feed. A Screenshare feature, accessed by tapping the three-dot menu, lets the AI see the user's device and any app or screen they navigate to, so the chatbot can also help with smartphone-related issues and queries.

Notably, OpenAI said that all Team subscribers will get access to the feature within the next week in the latest version of the ChatGPT mobile app. Most Plus and Pro users will also get it, although users in the European Union region, Switzerland, Iceland, Norway, and Liechtenstein will not for now. Enterprise and Edu users will get access to ChatGPT's Advanced Voice with Vision in early 2025.
[7]
ChatGPT gets screensharing and real-time video analysis, rivaling Gemini 2
OpenAI finally added long-awaited video and screen sharing to its advanced voice mode, allowing users to interact with the chatbot in new modalities. Both capabilities are now available in the iOS and Android mobile apps for ChatGPT Team, Plus, and Pro users, and will roll out to ChatGPT Enterprise and Edu subscribers in January. However, users in the EU, Switzerland, Iceland, Norway, and Liechtenstein can't access the new capabilities yet.

OpenAI first teased the feature in May, when the company unveiled GPT-4o and discussed ChatGPT learning to "watch" a game and explain what's happening. Advanced voice mode itself was rolled out to users in September.

Users start a video via new buttons on the advanced voice mode screen. OpenAI's video mode feels like a FaceTime call, because ChatGPT responds in real time to what users show in the video. It can see what is around the user, identify objects, and even remember people who introduce themselves. In an OpenAI demo during the company's "12 Days of Shipmas" event, ChatGPT used the video feature to help brew coffee: it saw the coffee paraphernalia, said when to put in a filter, and critiqued the result.

It is also very similar to Google's recently announced Project Astra, in which users open a video chat and Gemini 2.0 answers questions about what it sees, like identifying a sculpture found on a London street. In many ways, these features are more advanced versions of what AI devices like the Humane Pin and the Rabbit r1 were marketed to do: have an AI voice assistant respond to questions about what it's seeing in a video.

Sharing a screen

The new screen-sharing feature brings ChatGPT out of its own app. A three-dot menu allows users to navigate out of the ChatGPT app, open other apps on their phones, and ask ChatGPT questions about what it's seeing. In the demo, OpenAI researchers triggered screen share, then opened the messages app to ask ChatGPT for help responding to a photo sent via text message. (A rough code sketch of this capture-and-ask pattern appears at the end of this article.)

The screen-sharing feature bears similarities to recently released features from Microsoft and Google. Last week, Microsoft released a preview version of Copilot Vision, which lets Pro subscribers open a Copilot chat while browsing a webpage; Copilot Vision can look at photos on a store's website or even help play the map-guessing game GeoGuessr. Google's Project Astra can read browsers in the same way.

Both Google and OpenAI released screen-sharing AI chat features on phones to target consumers who may be using ChatGPT or Gemini on the go. But these types of features could also signal a way for enterprises to collaborate more with AI agents, as an agent can see what a person is looking at on screen. It can be a precursor to computer-use models, like Anthropic's Computer Use, where the AI model is not only looking at a screen but is actively opening tabs and programs for the user.

Ho ho ho, ask Santa a question

In a bid for levity, OpenAI also rolled out "Santa Mode" in advanced voice mode. The new preset voice sounds much like the jolly old man in a red suit.
Unlike the new features restricted to specific users, "Santa Mode" is available to anyone with access to advanced voice mode on the mobile app, the web version of ChatGPT, and the Windows and macOS apps until early January. Chats with Santa, though, will not be saved in chat history and will not affect ChatGPT's memory.
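The screen-sharing flow described above follows the same request pattern as camera video: capture what's on screen and attach it to a multimodal query. As a rough sketch (not ChatGPT's internal mechanism), one could grab a screenshot with Pillow and send it through OpenAI's public chat completions API. Pillow's ImageGrab, the gpt-4o model name, and the sample prompt are assumptions for illustration, and ImageGrab works out of the box only on Windows and macOS.

```python
import base64
import io

from openai import OpenAI
from PIL import ImageGrab  # pip install pillow; Windows/macOS screen capture

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Capture the current screen and encode it as a base64 JPEG.
screenshot = ImageGrab.grab().convert("RGB")
buffer = io.BytesIO()
screenshot.save(buffer, format="JPEG")
screen_b64 = base64.b64encode(buffer.getvalue()).decode("utf-8")

# Ask a multimodal model about the screen contents -- the same request
# shape used for camera frames, just with a screenshot as the image.
response = client.chat.completions.create(
    model="gpt-4o",  # illustrative model choice
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Walk me through the settings visible on this screen."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{screen_b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)
```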
[8]
ChatGPT Can Finally See
After months of testing, OpenAI rolled out "advanced voice" mode for ChatGPT back in September. The feature lets you have real-time conversations with ChatGPT: you can interrupt the bot mid-"speech" to ask another question, and it understands your tone of voice, which it uses both to inform its responses and to shape its own inflection. (It's very creepy when it laughs.)

One feature of advanced voice mode has been missing since launch, however. When OpenAI first announced the perk back in May, it showed off how ChatGPT would be able to access your camera and "see" the world around you. While chatting with the bot, you could point your camera at something, ask a question, and ChatGPT would answer as best it could. Seven months later, this capability is here, and it's frighteningly impressive.

To access it, you'll need a paid subscription to ChatGPT -- either Plus ($20 per month) or Pro ($200 per month). ChatGPT Team subscribers are also eligible. The feature may not be available on your end right away, even if you pay, since OpenAI is rolling it out over time.

Accessing the camera is pretty straightforward once it rolls out to your account. You launch advanced voice mode the same way you always do, using the waveform icon in the bottom-right of the chat. From here, you'll see a new camera icon, which, of course, launches the live camera feed. This doesn't interrupt the chat: you can be in the middle of a conversation with ChatGPT, open the camera, and continue gabbing away, only now with the camera feed as part of the conversation.

The first time I used this, I pointed the camera at a Nintendo Switch box I had nearby, with an iPhone cable and my Magic Trackpad resting on top of it, and asked, "What is this?" ChatGPT said: "It looks like a Nintendo Switch OLED box with some cables and a laptop on top. Are you planning on setting it up?" Two out of three correct -- it mistook my trackpad for a laptop, but hey, close enough.

Next up, I pointed it at my water bottle and asked it to identify what I was highlighting: "That looks like a black Hydro Flask bottle. It's great for keeping drinks cold or hot! Do you take it with you often?" I asked a follow-up: "Do you know what model of Hydro Flask this is?" ChatGPT: "I can't be certain of the exact model, but it looks like one of their wide-mouth bottles, probably around 32 ounces. It's definitely designed to keep your drinks at the right temperature for hours." That...is basically right. I'm not all that comfortable with ChatGPT guessing the size correctly, either.

I moved on to my keyboard, which ChatGPT accurately identified as an Apple Magic Keyboard. I asked which keys it could see, and it named a handful, but not all, of the keys I had in frame. So I asked how many keys it could see; it said "about 30," when there were 26. So, again, close.

It was able to identify the MagSafe port on my MacBook, as well as the two USB ports and the headphone jack to their right. It recognized the air vent in my ceiling, and the specific type of boots I had by my front door. All in all, it basically recognized everything I tested it on -- minus the trackpad.

But beyond recognition, what startled me the most was the speed of these responses. You ask ChatGPT to identify something, and it does, sometimes quicker than if you asked a real person to do it. Sometimes the bot will hold onto a word for a moment (e.g. "I thiiiiiiiiink that's a..."), which is probably a trick to buy time while ChatGPT processes the rest of what it wants to say.
I've also caught it less sure of itself on a first response: I pointed it at my Magic Mouse, and its first guess was "a computer mouse." But when I asked what brand it was, it didn't only specify Apple, but said it was an Apple Magic Mouse, known for its "sleek design" and "touch-sensitive surface." All things considered, though, these responses are often near-instantaneous, which speaks to how powerful OpenAI's models are these days.

I'm still largely an AI skeptic, but this was the first development in a while that impressed me -- and I'm torn about how I feel about that. On the one hand, I could see this tech being used for good. Imagine how helpful something like this could be for users who are blind or have impaired vision, especially in a convenient device like smart glasses. Someone could ask their AI assistant what direction they're facing, have it read the menu at a restaurant, or check whether it's safe to cross the street. Tech like this could change search for the better, and make it easy to learn new things about the world by pointing a smartphone camera at a subject.

On the flip side, my mind turns to the negative, especially since AI is still prone to hallucination. As more and more people use this technology, they will inevitably experience the mistakes AI can make, and if they're relying on the bot for tasks that involve their safety, hallucinations can be dangerous. I didn't experience any large errors, just the trackpad mixup, and Anderson Cooper found that the bot made a mistake on a geometry problem (again, not a huge issue). But it's a good reminder that as this tech improves rapidly, its inherent flaws raise the stakes for failure. Perhaps that's why every live camera session warns you not to use the feature for anything involving safety.
[9]
ChatGPT Advanced Voice with Vision just launched -- here's how to try it
ChatGPT's Advanced Voice with Vision launched during day six of OpenAI's "12 Days of OpenAI" live demonstration and briefing today. This powerful new feature allows users to interact with ChatGPT using spoken input, images, and even video, opening up new possibilities for creativity and problem-solving.

Currently, Advanced Voice with Vision is available exclusively to ChatGPT Plus and Pro subscribers, for a monthly fee. As a bonus, the team mentioned that the "Chat with Santa" option will be made available to everyone, even users who have gone past the free tier's chat limit. Accessing the Santa feature is simple: look for the snowflake next to the microphone icon. (I have ChatGPT Plus, but the Santa feature is not yet available to me.)

All of the new Advanced Voice with Vision features will be rolling out gradually worldwide, so don't worry if you can't access them yet. When they reach you, here's how to access and make the most of them.

To access Advanced Voice with Vision, you must be a ChatGPT Plus subscriber. The subscription costs $20 monthly and unlocks several premium features, including the latest GPT-4o model, faster response times, and Advanced Voice with Vision. If you're not a subscriber, head to the ChatGPT website, log into your account, and select the Plus subscription option. Once subscribed, you can use Advanced Voice with Vision immediately.

Once you've confirmed your subscription, log in to your ChatGPT account on the web or mobile app. Inside the chat interface, look for the following icons, typically located near the text input field:

Voice input: The microphone icon lets you speak to ChatGPT, turning your voice into input for more natural, hands-free interactions.

Image upload: The camera or image icon lets you upload pictures directly into the chat window. You can ask ChatGPT questions or give commands related to the uploaded image.

Snowflake: You might as well chat with Santa while you're at it. The demo today made it look like a lot of fun, so if you see the snowflake, go ahead and give it a whirl (said like Kevin McCallister in Home Alone).

With the feature enabled, you can interact with ChatGPT in a whole new way. Instead of typing your queries, try speaking them: whether you're asking for help drafting an email or brainstorming ideas for your next project, voice input makes the interaction feel more conversational and intuitive. The combination of voice and vision makes ChatGPT an even more powerful assistant for day-to-day tasks and creative endeavors.

Next, try uploading images and asking ChatGPT questions about them. For example, you could upload a photo of a plant and ask for care tips, or show a math problem from a textbook and request a step-by-step solution. The applications are nearly endless; today's demo showed the team introducing themselves and ChatGPT correctly answering quiz questions about them afterward. You could use the feature as a cooking assistant, snapping a photo of your pantry and asking for recipe suggestions based on your ingredients. Or you could have ChatGPT review a document: upload an image of a handwritten note or printed page and have the AI analyze or summarize it.
You can also boost productivity by giving ChatGPT combined verbal and visual inputs when designing a presentation, editing photos, or crafting stories. Advanced Voice with Vision combines two of ChatGPT's most dynamic features, making it easier than ever to interact with AI. Whether you're a student, a professional, or a casual user curious about new technology, this feature takes multitasking and problem-solving to the next level. If you're a ChatGPT Plus subscriber, it's ready for you to explore today: log in, activate it, and start experiencing the future of AI-powered assistance.
[10]
You Can Now Share Videos and Screens With the ChatGPT App
OpenAI has introduced new features for ChatGPT as part of its "12 Days of OpenAI" promotion. They include a holiday-themed Santa mode as well as an update to Advanced Voice that lets users share videos and their screens.

The first addition is the "Santa" mode, available worldwide until early January. In this mode, the ChatGPT voice interface changes to a snow globe design with a snowflake icon and uses a special Santa voice. Conversations in Santa mode are temporary and won't be saved in chat history. Also, the first time you chat with Santa, your Advanced Voice usage limit resets, so you can keep chatting even if you've already hit your cap.

ChatGPT's Advanced Voice Mode has also received an update that lets users share videos and screens with ChatGPT, so users can show what they're talking about instead of just describing it with words. The video and screen-sharing feature is currently rolling out to ChatGPT Plus and Pro users in most countries and to all Team users. OpenAI plans to make it available to Plus and Pro users in the EU, Switzerland, Iceland, Norway, and Liechtenstein soon; Enterprise and Edu users can expect access in January.

To use video and screen sharing, mobile app users can tap the voice icon, then the new video icon, or choose "Share Screen" from the menu.

The combination of video and screen sharing is one of the more significant ways ChatGPT is making things easier with its Advanced Voice Mode. Not everyone can describe things easily, especially to an AI. For instance, users could get step-by-step help making pour-over coffee, or craft replies to messages, simply by showing ChatGPT what they're looking at. It's one of those things that should open the software up to more users. The rollout will take time, as it's happening in phases, but the feature should be available to all paid users after the holidays.

Source: OpenAI, 9to5Mac
[11]
OpenAI Adds Video in Advanced Voice - 12 Days of OpenAI Day 6
On the sixth day of its 12 Days of OpenAI, the company's development team announced more updates to ChatGPT, adding video communication and screen-sharing capabilities to its Advanced Voice mode. These enhancements aim to make interactions more engaging and versatile. Whether you're collaborating on projects, resolving technical issues, or exploring festive features like the Santa voice mode, these updates change how users interact with conversational AI.

The addition of video introduces a visual dimension to ChatGPT, allowing more natural and immersive interactions. It is particularly useful for tasks requiring visual context, such as explaining intricate concepts or demonstrating processes. By letting the AI see and hear what the user shares, ChatGPT bridges the gap between traditional text-based AI and real-time, face-to-face communication. Screen sharing further enhances collaboration by letting the AI follow along with whatever is on the user's display.

These features are being rolled out across mobile apps, desktop applications, and web platforms, ensuring accessibility for a wide range of users. OpenAI has also announced plans to extend them to enterprise and educational users early next year, broadening the tool's reach and practical applications.

The Advanced Voice mode itself has undergone substantial upgrades, making it more powerful and versatile than ever. Now supporting over 50 languages, it is designed to cater to a global audience. These improvements make the voice mode adaptable to a variety of use cases: users can describe technical issues verbally for faster troubleshooting, practice language skills with real-time feedback, or engage in dynamic conversations for learning and collaboration. Combined with video and screen sharing, the enhanced voice capabilities give ChatGPT a seamless, enriched user experience.

To celebrate the holiday season, OpenAI has introduced a Santa voice mode, adding a playful, festive element to ChatGPT. Available globally on platforms that support voice mode, the seasonal addition also includes a one-time reset of advanced voice usage limits, so users can fully explore and enjoy the feature. While lighthearted in nature, the Santa voice mode demonstrates the platform's creative versatility and shows how AI can be adapted for entertainment and seasonal engagement.

The new features are designed to work consistently across mobile, desktop, and web platforms, regardless of device or location. OpenAI is rolling out these updates to Plus and Pro subscribers worldwide, with availability in Europe expected soon. This cross-platform approach reflects OpenAI's aim of a unified, accessible user experience, making it easier for users to integrate ChatGPT into their daily routines. Together, video communication, screen sharing, and enhanced voice capabilities unlock a wide range of practical applications for both personal and professional use.
These updates make ChatGPT a more powerful tool for real-time problem-solving, collaboration, and learning. They also enable richer interactive learning experiences: users can practice new skills, explore unfamiliar topics, or receive guidance in new ways, making ChatGPT a valuable resource for education and personal development. By integrating these tools, OpenAI has created a platform that adapts to the diverse needs of its users, whether they are seeking professional support or personal enrichment.

During the rollout of these new features, OpenAI encountered a brief service interruption that temporarily affected user access. The company acknowledged the issue, apologized for the inconvenience, and said it is taking steps to improve the platform's reliability as it continues to expand ChatGPT's capabilities.

The introduction of video communication, screen sharing, and enhanced voice features represents a significant milestone in the evolution of ChatGPT. By combining visual, auditory, and interactive elements, these updates make the platform more versatile and user-friendly than before. Whether you're using ChatGPT for professional tasks, educational purposes, or festive fun, its expanded capabilities offer a richer and more engaging experience. With a global rollout underway and plans to extend these features to enterprise and educational users, OpenAI is solidifying ChatGPT's position as a leader in multimodal conversational AI.
OpenAI introduces real-time video and screen sharing features to ChatGPT's Advanced Voice Mode, enabling users to interact with the AI through their camera and share their screens for immediate assistance.
OpenAI has significantly enhanced ChatGPT's capabilities by introducing real-time video and screen sharing features to its Advanced Voice Mode. This development, announced as part of the company's "12 Days of OpenAI" campaign, marks a major step forward in AI-human interaction [1].
The new video feature allows users to point their phone cameras at objects or scenes, enabling ChatGPT to provide real-time analysis and feedback. Similarly, the screen sharing capability lets users share their device screens for immediate assistance with settings or problem-solving [2].
Kevin Weil, OpenAI's Chief Product Officer, explained, "We're excited to announce that we're bringing video to Advanced voice mode so you can bring live video and also live screen sharing into your conversations with ChatGPT" [1].
The new features open up a wide range of practical applications:
DIY Assistance: Users can seek help with tasks like assembling furniture by showing ChatGPT their progress [3].
Cooking Support: The AI can provide real-time guidance for cooking techniques or recipe steps [3].
Tech Troubleshooting: Users can share their screens for step-by-step guidance through complex settings or software issues [2].
Educational Support: Students can share their screens while solving math problems or analyzing diagrams for immediate assistance [4].
The new features are currently available to ChatGPT Plus, Pro, and Team subscribers. The rollout began on December 12 and is expected to continue over the following week. However, European users and ChatGPT Enterprise and Edu subscribers will gain access in early 2025 [2][5].
OpenAI's move comes amidst growing competition in the AI assistant space:
Google recently introduced Project Astra, an AI assistant with similar video capabilities, currently in testing [1].
Meta's AI assistant, launched in September, offers comparable real-time video understanding [1].
Apple has integrated several ChatGPT features into iOS 18.2, enhancing Siri's capabilities [2].
As part of the "12 Days of OpenAI" campaign, the company has made several other announcements:
Introduction of Sora, a new video model for text-to-video and video-to-video content generation [2].
Launch of a special Santa voice mode for the holiday season [4].
Release of Canvas, previously a beta feature, to all web users [2].
These developments collectively represent a significant leap in AI-human interaction, potentially changing how users engage with AI assistants in their daily lives.