2 Sources
[1]
Talking to Windows' Copilot AI makes a computer feel incompetent
It's not hard to understand the AI future Microsoft is betting billions on -- a world where computers understand what you're saying and do things for you. It's right there in the ads for the latest Copilot PCs, where people cheerfully talk to their laptops and they talk back, answering questions in natural language and even doing things for them. The tagline is straightforward: "The computer you can talk to." "You should be able to talk to your PC, have it understand you, and then be able to have magic happen from that," Microsoft's Yusuf Mehdi told us in October. "The PC should be able to act on your behalf."

And that has nothing on Microsoft's ultimate ambitions for AI, which are to rethink computing entirely. In a recent Dwarkesh Podcast interview, Microsoft CEO Satya Nadella agreed when presented with the host's idea that "these models will be able to use a computer as well as a human," and went even further, laying out a vision where Microsoft rearchitects all of its software to be infrastructure for AI agents to use in entirely new ways. This is a bold vision, and an enormous bet.

The problem is, right now, talking to Copilot in Windows 11 is an exercise in pure frustration -- a stark reminder that the reality of AI is nowhere close to the hype. I spent a week with Copilot, asking it the same questions Microsoft has in its ads, and tried to get help with tasks I'd find useful. And time after time, Copilot got things wrong, made stuff up, and spoke to me like I was a child.

Copilot Vision scans what's on your screen and tries to assist you with voice prompts. Invoking Copilot requires you to share your screen like you're on a Teams call, by hitting OK. Every. Single. Time. After it gets your permission, it's excruciatingly slow to respond, and it addressed me by name every time I asked it anything. Like other AI assistants and LLMs, it's here to please, even when it's totally misguided. Let's start by testing what Microsoft's ad shows off.
Multiple versions of the ad are posted online, and it even airs on broadcast TV during NFL games. Surely it must be easy to replicate the specific tasks Microsoft wants millions of people to see, especially when this is the groundwork for how Microsoft is reorienting the whole of its business.

In the ad, Copilot Vision scans a YouTube video and correctly identifies a HyperX QuadCast 2S microphone when asked "What mic is she using in this video?" In my tests, the assistant first gave me basics about the benefits of dynamic microphones. Then, unprompted, it started talking to me like I was the person in the video ("I can see your setup right now, and I'm noticing that you have... a big setup there!"), then told me the mic in question was actually the first-gen HyperX QuadCast. To be fair, HyperX makes a lot of similar-looking mics, though at one point it said, "without seeing the exact lighting pattern or any specific features, it's hard to say definitively which model it is" despite it being bathed in RGB lighting in the image. On another two occasions, it identified the mic as a Shure SM7b. And when I asked, "Where can I get it nearby?" like in the ad, it once gave me a dead link to Amazon and then a correct link to the wrong mic at Best Buy.

The ads also show a person asking "What sort of thrust does this thing have on it?" while pointing at a PowerPoint presentation about the Saturn V rocket. Unlike the ad, Copilot Vision couldn't identify the rocket from the image (or from the words "Saturn V" visible on screen). When I informed Copilot it was a Saturn V, it told me that thrust is generally measured in newtons or kilonewtons, then gave me an estimated thrust of 7.5 million pounds. Telling Copilot to "run some simulations on burn time," as in the ad, led to it telling me it can't, and directing me toward MATLAB.

Finally, a person in the ads is looking at a picture of a watery cave and asks, "How do I go there?"
From context, it's supposed to be a frame from a video, but that video doesn't seem to exist. While the longer version of the ad above correctly identifies the image as Rio Secreto in Playa del Carmen, Mexico, the short version I saw first doesn't answer the question at all. Without the answer already in hand, I used reverse image search and found a match for the cave photo from a cruise line and a real estate site, both stating it's from a cave in Belize. But it's listed elsewhere as a cave on Grand Cayman. I made the image full-screen and asked Copilot how to get there.

Across the 20 or so tries, the results were inconsistent, to put it mildly. I renamed the file to mention Grand Cayman, and it told me how to book a flight to the Cayman Islands. Once I confirmed Copilot was just looking at the file name, I decided to try to trick it. I renamed the image "new-jersey-crystal-caves-limestone.jpg" and sure enough, the AI assistant was quick to tell me of the famous crystal cave of Ogdensburg, New Jersey. At no point did it correctly identify the location of the image.

(To be slightly fair to Copilot, if you don't already know where the image is from, it's not easy to figure out. After manually searching through Trip Advisor images, my editor found a match in a user review album that confirms Microsoft's ad was correct in pinpointing Rio Secreto. Since the video depicted in Microsoft's ad doesn't seem to exist, it's unclear what information Copilot was using to identify the cave.)

Beyond simply looking at things and trying to identify them, Microsoft also depicts Copilot actually doing things. Specifically, it's asked to "help me turn my portfolio into a bio," a prompt which in reality caused me an immense amount of psychic damage. In the ad, Copilot looks at an artist's portfolio of images (which look suspiciously AI-generated), their portrait, and a picture of their cat, and makes a one-sentence summary claiming they're inspired by their feline friend. Embarrassing.
I don't have a portfolio website for my (real) photographs, so I pointed it at my Instagram. It generated such dreck about me being a "visual storyteller" "capturing life's essence, one frame at a time" that I wanted to sink under the floorboards. I feel physically ill whenever I think about it. And it didn't even mention my cats, who are sorely missed every day. How dare you, Copilot.

Outside of trying to replicate the prompts from the ad, I struggled to find a use for Copilot Vision. I'm sure as hell not having it write for me, and it can't take simple actions in Windows -- not even to toggle settings like dark mode. Microsoft spokesperson Blake Manfre tells The Verge, "Copilot Actions on Windows, which can take actions on local files, is not yet available. This is an opt-in experimental feature that will be coming soon to Windows Insiders in Copilot Labs, starting with a narrow set of use cases while we optimize model performance and learn. This is separate from Copilot Vision."

In third-party apps, it can offer advice, like how to get a dreamy look in Adobe Lightroom Classic, but the tips are generic. And since it transmits everything by audio, it goes from lots of rote preamble to quickly rattling off settings at you, like the worst of the YouTube tutorials it's probably cribbing from. I asked it to help me analyze a benchmark table in Google Sheets. It got a couple of basic percentage calculations right, but constantly misread clear-as-day scores both in the spreadsheet and in the on-page review. So how can you trust it?

In gaming -- a thing Microsoft specifically advertises as a use for Copilot Vision -- it offered the most basic and vague information. For Hollow Knight: Silksong, it gave me only cursory instructions, sounding like a child presenting their book report based solely on the cover. (Actually, talking to Copilot is so much like this, it's uncanny.)
In Balatro, it couldn't accurately identify the cards in my hand, but it did give me irrelevant info on mechanics from other card games. I tried to meet Copilot where it's at, but it failed at everything I asked it to do. Like much of the generative AI tech out there, it's an incomplete solution in search of problems. There could be something useful here, especially for the accessibility community, if it can one day fully control Windows. But talking to Copilot today makes powerful computers seem incompetent. It's hard to see how we get to Microsoft's bold vision of the agentic AI future from what it's shipping to real consumers today.
[2]
Microsoft's own Windows ad shows Copilot giving wrong instructions
A recent video ad on Twitter showed Copilot helpfully telling a user to select a setting in the wrong menu.

Even as someone who viscerally hates "AI" getting stuffed into every aspect of every device and service I use, I can see places where it's helpful. For example, a conversational interface for my grandmother might mean she needs to call me less often for iPad tech support. Microsoft took the same angle for a recent Copilot ad... which bizarrely showed Copilot offering the wrong instructions.

Alright, let's set up the dominoes before Copilot knocks them down. In a promotional post on X on November 12th, Microsoft showed YouTuber UrAvgConsumer pretending to be my not-so-tech-savvy grandmother, who says "Hey Copilot, I want to make the text on my screen bigger" while looking at Windows 11 Settings. "Can you show me where to click to do that?" he asks, activating the new Copilot Vision feature. Copilot correctly highlights the Display portion of the menu. When the user prompts Copilot with "Can you show me what to click next," the system points him to the Scale setting. And when asked what percentage is needed, it says "Let's start by clicking 150 percent, which is the recommended size"... which is baffling because 150 percent is already selected in the video as the default for that particular laptop. UrAvgConsumer apparently ignores the stated instructions and manually clicks on 200 percent instead. "Boom, and we've instantly got bigger icons, bigger text, easier for grandma to see."

This is bewildering on many levels. One, it's pretty darn misleading, since the audio of Copilot's instructions doesn't match the user's on-screen actions. As Windows Central points out, Copilot told the user to essentially do nothing. The user -- perhaps being more tech-savvy than Copilot's limited system -- correctly changes the setting to make the Windows UI bigger and easier to see. It's probably something he's done on his own dozens of times before.
I'll play devil's advocate and point out that Copilot successfully guided the user to the relevant section of the Settings menu and the individual setting they needed. Even someone like my grandma could fiddle with that percentage option until she found something she liked.

But in the interest of even-handed treatment, UI scaling isn't quite the same thing as "making the text on my screen bigger." A more relevant setting -- especially for an older user -- would be the Accessibility section of the same menu, where "Text size" is the very first item, complete with a slider and preview window that'd be even easier for a novice to understand... and wouldn't re-scale the entire user interface.

This fact has been pointed out by Twitter users so often that it's been automatically highlighted in the "Readers added context" section of the page, along with a link to an official Microsoft support page that even my grandma could find by searching the web. This page is also the very first result on Bing if you search for "how to make text bigger in windows 11." (I used Bing on the assumption that a novice user would be searching in Edge with no changes applied... which would still get better, faster, and more relevant results than using the LLM-powered Copilot.)

Copilot failing in such a basic way isn't all that surprising. The very nature of large language models means that results for identical queries can be inconsistent and even flat-out wrong. But the fact that Microsoft would choose to highlight such a glaring failure of its own system, apparently in the presence of a very experienced technology influencer who applied a different change entirely, is incredibly strange. Why wouldn't Microsoft's promotional team just re-record that video until they got the desired outcome?
Assuming that UrAvgConsumer simply didn't have the footage needed -- possibly because this was a rapid-fire shoot for TikTok-style content -- why not get the auto-generated Copilot audio to at least mention the 200 percent scaling option? Why would you choose to showcase something so glaringly wrong, specifically in an example of how that headline Copilot feature could help people?

The most generous interpretation I can give of this situation is that it's a result of marketers who aren't that familiar with how Windows works for regular or advanced users. That would be embarrassing for anyone using a company's own products, especially one with billions of users like Windows, but marketing/PR and tech support are not the same job. Fine.

It's also possible that we're missing bits of back-and-forth conversation that were edited out to make the video shorter. Maybe Copilot did instruct UrAvgConsumer to click on 200 percent off-screen. Even so, it's crazy to think that this made it through various levels of Microsoft bureaucracy to be put before eyeballs on Twitter and presumably other social platforms.
Independent testing reveals significant gaps between Microsoft's Copilot Vision marketing claims and actual performance, with the AI assistant providing incorrect information and failing basic tasks shown in promotional materials.

Microsoft has positioned Copilot Vision as a revolutionary AI assistant that can understand what users are saying and help them accomplish tasks through natural language interaction. The company's promotional materials feature the tagline "The computer you can talk to" and showcase scenarios where users successfully get help with various computing tasks [1].

However, independent testing reveals a significant gap between these marketing promises and actual performance. When journalists attempted to replicate the exact scenarios shown in Microsoft's advertisements, Copilot Vision consistently failed to deliver accurate results or helpful assistance.
Copilot Vision's implementation creates several user experience problems that undermine its utility. The system requires users to grant screen-sharing permissions for every interaction, similar to joining a Teams call, which creates friction in the user experience. Additionally, the assistant responds slowly to queries and addresses users by name repeatedly, creating an awkward interaction pattern [1].

During testing of advertised scenarios, the AI assistant demonstrated concerning inconsistencies. When asked to identify a HyperX QuadCast 2S microphone shown in a YouTube video -- a task featured prominently in Microsoft's advertisements -- Copilot Vision provided multiple incorrect answers, including identifying it as a first-generation HyperX QuadCast and, on separate occasions, as a Shure SM7b microphone.
A particularly striking example of the disconnect between marketing and reality emerged in Microsoft's own promotional content. In a November 12th Twitter video featuring YouTuber UrAvgConsumer, Copilot Vision was asked to help make text bigger on screen. The AI assistant correctly guided the user to the Display settings but then instructed them to select 150 percent scaling -- which was already the selected option [2].

The user in the video ignored Copilot's instructions and manually selected 200 percent scaling instead, achieving the desired result despite the AI's guidance rather than because of it. This contradiction was so apparent that Twitter's community notes feature automatically highlighted the error, pointing out that the Accessibility section's "Text size" setting would have been more appropriate for the user's needs.
These performance issues occur against the backdrop of Microsoft's ambitious AI strategy. CEO Satya Nadella has outlined a vision where the company rearchitects all of its software to serve as infrastructure for AI agents, fundamentally changing how people interact with computers. The company has invested billions in this AI-first approach, making Copilot's current limitations particularly significant for Microsoft's broader strategic goals [1].

The testing results suggest that while the underlying concept of conversational AI assistance has merit, the current implementation falls short of the seamless experience portrayed in marketing materials. Issues range from basic accuracy problems to fundamental usability concerns that could frustrate rather than help users, particularly those who might benefit most from AI assistance, such as less tech-savvy individuals.
Summarized by Navi