Curated by THEOUTPOST
On Thu, 17 Apr, 4:03 PM UTC
3 Sources
[1]
Microsoft Copilot can now 'see' what's on your screen in Edge
Wes Davis is a weekend editor who covers the latest in tech and entertainment. He has written news, reviews, and more as a tech journalist since 2020.
Copilot Vision, Microsoft's AI assistant feature that can interpret what's on your screen and help you use apps, is now available for free use within the Edge browser, Mustafa Suleyman, CEO of Microsoft AI, announced on Bluesky today. Vision is a "talk-based experience," as Microsoft calls it, meaning you use it by speaking into the air, then waiting for Copilot to respond.
Suleyman says if you opt into the feature, Copilot Vision can "literally see what you see on screen." Suleyman suggests having Copilot Vision guide you through a recipe while you cook or having it "decode" job descriptions "and jump right into customized interview prep or cover letter brainstorming." (Although it might not be the best idea to use AI for your resume.)
According to a Microsoft support page, "Copilot Vision may highlight portions of the screen to help you find relevant information," but it doesn't actually click links or do anything on your behalf.
Broader, system-wide Copilot Vision features are still limited to Copilot Pro subscribers. With a subscription, Vision expands beyond Edge, letting you ask it to help you use features in Photoshop or video editing software, or guide you through a game like Minecraft, as it did for The Verge's Tom Warren earlier this month.
To try out Copilot Vision, open this link to Microsoft's website in the Edge browser. That should give you a prompt to opt into the feature, and once you've given permission, you can open the Copilot sidebar while on a website, click the microphone icon, and your Vision session begins, signified by a chime and your browser changing its hue.
Or that's how it should go. In my case, it took a couple of tries before Edge asked if I wanted to opt in.
And once I could opt in and initiate a Vision session, the controls never appeared -- as of this writing, I simply have a message floating over the bottom of my browser that says "One moment..." But I'm using a fairly old, underpowered laptop, so your mileage may vary. According to Microsoft, the company logs Copilot's responses to you but doesn't collect your inputs, images, or page content while in a Copilot Vision session. When you're ready to stop sharing your screen with Copilot, you can either end the session or close the browser window.
[2]
Microsoft's Copilot Vision AI is now free to use, but only for these 9 sites
After months of teasers, previews, and select rollouts, Microsoft's Copilot Vision is now available to try for all Edge users in the U.S. The flashy new AI tool is designed to watch your screen as you browse so you can ask it various questions about what you're doing and get useful context-appropriate responses. The main catch, however, is that it currently only works with nine websites.
For the most part, these nine websites seem like pretty random choices, too. We have Amazon, which makes sense, but also Geoguessr? I'm pretty sure the point of that site is to try and guess where you are on the map without any help. Anyway, the full site list is as follows: Wikipedia, Tripadvisor, Williams Sonoma, Amazon, Target, Wayfair, Food & Wine, OpenTable, and Geoguessr.
CEO of Microsoft AI Mustafa Suleyman announced the release on Bluesky yesterday and shared a few of his favorite use cases. Usually, when you want to ask Copilot a question, you have to write out the paragraphs of context yourself, and aside from being slow and annoying, this can also be pretty difficult if you're trying to ask about something you don't know much about. With Copilot Vision, instead of trying to describe what you're looking at or what you're talking about, the AI model can see it right on your screen.
So, according to Suleyman's examples, you can search for "breathable sheets" on Amazon and ask Copilot if any of the results are made from appropriate fabrics. Copilot can point the right ones out to you or give you examples of breathable fabric to search for. On the Food & Wine recipe website, Copilot can help you go hands-free while you cook by answering your questions and reading out parts of the recipe to you.
This works because the whole experience is designed to work through voice -- you speak directly to the AI and the AI speaks back. According to one of the videos on the Copilot Vision page, however, it looks like you can type out questions too and receive written responses.
Microsoft is taking things very slowly and carefully with this feature, almost certainly because it wants to avoid triggering another backlash like it did with Recall. The limited number of compatible sites is connected to copyright issues, and the company makes sure to stress that the feature is "opt-in," doesn't record your screen, is only on when you turn it on, and deletes the data as soon as you end a session. If you're interested in testing it out, you can set things up and see a little tutorial through the Microsoft website.
[3]
Copilot in Edge Can Now See Your Screen and Answer Queries About It
Microsoft first unveiled Copilot Vision in December 2024
Earlier, it was only available to Copilot Pro subscribers
Copilot Vision supports voice-based interaction
Microsoft announced the rollout of Copilot Vision to all users in the Microsoft Edge browser last week. The computer vision-enabled artificial intelligence (AI) feature was first unveiled by the company in December 2024, and so far, it has only been available to Copilot Pro subscribers. The Redmond-based tech giant is now expanding the AI feature to all Edge browser users. With Copilot Vision, the AI chatbot can see the content of web pages and assist the user in navigating through various tasks.
In a post on X (formerly known as Twitter), the CEO of Microsoft AI, Mustafa Suleyman, announced the expansion of Copilot Vision to all users irrespective of whether they are a paid subscriber of Copilot or not. "It'll think out loud with you when you're browsing online. No more over-explaining, copy-pasting, or struggling to put something into words," Suleyman said.
Essentially, Copilot Vision is a real-time computer vision feature, similar to Gemini Live's screen sharing capability. The AI chatbot can see the web pages and all the content in the Edge browser, and process that information as well as the user's prompt in real time. The feature supports voice mode, and users can simply speak what they want instead of typing it.
Microsoft has made Copilot Vision an opt-in feature, as some users may have concerns about their privacy and data security. To opt in, users can open this link in the Edge browser and follow the instructions. Once activated, the users can tap the microphone icon in the Copilot sidebar to activate Vision mode. Copilot Vision will show a floating bar with a text field and a microphone button.
Suleyman said Copilot Vision can help users in a variety of ways. It can read multiple reviews and summarise them for the user.
It can describe visually confusing objects; for example, it can tell users whether a couch they see on a shopping website is "roll arm" or "midcentury modern". It can also help users shop by finding the right products based on natural language descriptions.
At the time of launch, Microsoft had said that even after opting in, users will have full control over when to turn the feature on and off. Copilot Vision cannot see the user's screen when it is deactivated. Additionally, the company claims that any conversation data, including user prompts, images, and content from web pages, will be deleted at the end of each session. However, Copilot's responses are logged and used to improve the company's safety systems.
Microsoft has made Copilot Vision, an AI feature that can interpret screen content and assist users, freely available in the Edge browser. This expansion brings advanced AI capabilities to a wider audience.
Microsoft has taken a significant step in democratizing AI technology by making Copilot Vision freely available to all users of its Edge browser. This move, announced by Mustafa Suleyman, CEO of Microsoft AI, on Bluesky, marks a notable expansion of the company's AI capabilities to a broader audience [1].
Copilot Vision is an AI assistant feature that can interpret what's on a user's screen and provide contextual help. It's designed as a "talk-based experience," allowing users to interact with the AI through voice commands. The system can "literally see what you see on screen," enabling it to guide users through various tasks, from cooking recipes to analyzing job descriptions [1].
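While the free version of Copilot Vision is now available in Edge, it currently has some limitations: it works on only nine websites, among them Wikipedia, Amazon, and Target [2]; it can highlight parts of the screen but does not click links or act on the user's behalf [1]; and broader, system-wide Vision features remain limited to Copilot Pro subscribers [1].
Microsoft has implemented several measures to address potential privacy concerns: the feature is opt-in and runs only when the user turns it on [2]; user prompts, images, and page content are deleted at the end of each session [3]; and only Copilot's responses are logged, which the company says it uses to improve its safety systems [3].
Copilot Vision offers various practical applications: guiding users hands-free through a recipe while they cook [1][2], summarising multiple product reviews [3], finding products that match a natural language description [3], and decoding job descriptions for interview preparation [1].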
This expansion of Copilot Vision represents Microsoft's ongoing efforts to integrate AI more deeply into everyday computing experiences, potentially reshaping how users interact with web content and perform online tasks.
Microsoft introduces Copilot Vision, an AI feature that can view and analyze users' browser content in Edge, raising both excitement and privacy concerns.
3 Sources
Microsoft introduces Copilot Vision, an AI-powered feature for Edge browser that offers real-time assistance and visual analysis, enhancing web browsing experience for users.
22 Sources
Microsoft introduces Copilot Vision for Windows 11, expanding AI capabilities to analyze app interfaces and content across the operating system, aiming to revolutionize user assistance and personalization.
10 Sources
Microsoft has announced significant enhancements to its Copilot AI assistant, including natural voice interactions, personalized news briefings, and improved integration across various platforms. These updates are set to roll out in October 2024, marking a new era in AI-assisted productivity and information access.
19 Sources
Microsoft is experimenting with a new Copilot-powered interface for the New Tab Page in Edge Canary, potentially replacing the current MSN feed with a more streamlined AI-assisted browsing experience.
3 Sources
© 2025 TheOutpost.AI All rights reserved