16 Sources
[1]
Microsoft responds to claims all Word and Excel files are being used to train AI
TL;DR: Microsoft's Connected Experiences feature in Office apps like Word and Excel allegedly collects data for AI training, according to nixCraft. However, Microsoft denies using customer data for training AI models like Copilot. Users can disable this feature via Privacy Settings. Microsoft states the feature supports internet-required functions like co-authoring. Companies developing artificial intelligence tools require large swaths of data for AI training, and what better way to gather large quantities of data than by scraping it from people using popular applications or programs? @nixCraft, an author of Cyberciti.biz, has claimed Microsoft is participating in this type of scheme with Office and its Connected Experiences. According to nixCraft, Redmond's Connected Experiences feature automatically scrapes data from Word and Excel files, and that data is used to train Microsoft's AI tools, such as Copilot. According to reports, this feature is turned on automatically, which means user-generated Word documents and Excel files are included in Microsoft's AI training dataset unless the user manually disables the feature. However, following reports sourcing @nixCraft's claims, Microsoft has since responded, saying customer data within Microsoft 365 apps, which includes Word and Excel, isn't used to train the company's Large Language Models (LLMs), the underlying technology powering AI tools such as Copilot or ChatGPT. Microsoft also added, "This setting only enables features requiring internet access like co-authoring a document." For those wondering how to check if this feature is enabled, or would like to disable it, you can follow these steps: File > Options > Trust Center > Trust Center Settings > Privacy Options > Privacy Settings > Optional Connected Experiences and uncheck the appropriate box.
[2]
Microsoft says it's not using your Word, Excel data for AI training
Microsoft has denied claims that it uses Microsoft 365 apps (including Word, Excel, and PowerPoint) to collect data to train the company's artificial intelligence (AI) models. "Microsoft does not use customer data from Microsoft 365 consumer and commercial applications to train large language models. Additionally, the Connected Services setting has no connection to how Microsoft trains large language models," a Microsoft spokesperson told BleepingComputer on Monday. This comes after a Tumblr blog post claiming that Redmond used a "Connected Experiences" feature to scrape customers' Word and Excel data for AI training spread on social media. However, Microsoft told BleepingComputer that this optional setting is enabled by default and has been available since April 2019, refuting that it is used to train its AI. BleepingComputer also asked if it has always been enabled by default but did not receive a response before publishing. "The Connected Experiences feature enables features like co-authoring, real-time grammar suggestions, and web-based resources," BleepingComputer was told. "These features are on by default because they're features people naturally expect in a cloud-connected productivity tool. However, customers always have control and can adjust their Connected Experiences settings at any time." As Microsoft explains on its support website, the feature is used to: To toggle this feature on or off, Microsoft 365 users have to open their Office apps (like Word or Excel) and choose whether to enable or disable experiences that download online content or analyze their content under "Connected experiences" after going to the File > Account > Account Privacy > Manage Settings menu. 
"The Connected Experiences setting enables cloud-backed features designed to increase your productivity in the Microsoft 365 apps like suggesting relevant information and images from the web, real-time co-authoring and cloud storage, and tools like Editor in Word that provide spelling and grammar suggestions," Microsoft also told BleepingComputer. "Microsoft has been using AI in Microsoft 365 for years to enhance productivity and creativity through features like Designer in PowerPoint, which helps create visually compelling slides, and Editor in Word, which provides grammar and writing suggestions. These features do not rely on generative AI or LLMs but rather use simpler machine learning algorithms." Microsoft added that the setting has been available since April 2019, with enterprise admins having the option to choose if connected experiences are available to users in their organizations using multiple policy settings designed to manage privacy controls for Microsoft 365 Apps and Office on Mac, iOS, and Android devices.
[3]
Microsoft Denies Training AI Models on User Data From Word, Excel
People are not comfortable with their data being used to train AI. Microsoft on Wednesday denied claims that it uses customer data from its Microsoft 365 applications, including Word and Excel, to train artificial intelligence models. The response came after some users pointed out on social media that the company requires users to opt out of its "connected experiences" feature, which they argued was used to train AI models. "These claims are untrue. Microsoft does not use customer data from Microsoft 365 consumer and commercial applications to train foundational large language models," a Microsoft spokesperson said in an emailed statement to Reuters. The spokesperson added that the "connected experiences" enables features such as co-authoring and cloud storage, and has no connection to how the company trains its large language models. The conversations on social media indicate that people remain concerned about their data being used to train AI models without permission. © Thomson Reuters 2024
[4]
Microsoft: We Don't Use Your Word, Excel Data for AI Training
Microsoft says it doesn't use your Word and Excel data to train its AI models. As ReadWrite reports, the company's Connected Experiences, which analyzes a user's Office content to provide suggestions, recently came under fire for being enabled by default. Several users suspected Microsoft was using their data for AI training. "Microsoft Office, like many companies in recent months, has slyly turned on an 'opt-out' feature that scrapes your Word and Excel documents to train its internal AI systems," a user named nixCraft wrote on X. "This setting is turned on by default, and you have to manually uncheck a box in order to opt-out. If you are a writer who uses MS Word to write any proprietary content (blog posts, novels, or any work you intend to protect with copyright and/or sell), you're going to want to turn this feature off immediately," nixCraft added, before explaining how to turn it off. In a response posted on X a few days later, Microsoft said: "In the M365 apps, we do not use customer data to train LLMs. This setting only enables features requiring internet access like co-authoring a document." Earlier this year, Microsoft-owned LinkedIn came under similar scrutiny for toggling on a feature that allows the company to scrape user data for AI training. The UK's Information Commissioner's Office forced LinkedIn to stop doing that with UK user data. LinkedIn still scrapes US user data by default; disable it by visiting Settings > Data Privacy > Data for Generative AI Improvement.
[5]
Microsoft denies Office docs used to train AI models
Confusion over Connected Experiences settings in 365 apps sparks concerns. Microsoft's Connected Experiences option in its productivity suite has been causing consternation amid accusations that the default setting might allow Microsoft to train AI models using customers' Word and Excel documents and other data. The Windows giant vehemently denies the claims. A spokesperson told The Register: "In Microsoft 365 consumer and commercial applications, Microsoft does not use customer data to train large language models without your permission." We asked Microsoft what it meant by "permission" and if the permission was opt-in or opt-out, and the IT titan has yet to respond. Connected Experiences has long been a part of Microsoft Office. Want to do some translation? You're probably using Connected Experiences. Transcribe a recording? Again, Connected Experiences. Do some grammar checking in Word? Connected Experiences will be analyzing your content. The spokesperson said: "The Connected Services setting is an industry standard setting that enables features that require an internet connection. Connected experiences play a significant role in enhancing productivity by integrating your content with resources available on the web. These features allow applications to provide more intelligent and personalized services." In recent weeks, users have been looking more deeply at what Microsoft is doing with all this data, and some have worried that it is being used to train the mega-corp's internal AI systems, something Microsoft says it is not. The suggestion was circulated on social media platforms over the weekend. A look at a consumer Windows 11 machine running Microsoft 365 2410 showed that the Connected Experiences setting was checked as on by default. But did that mean the customer's content was being used to train an AI? It's unlikely but not outside the realms of possibility.
It is, however, extremely unlikely that content produced by Education and Enterprise users of Microsoft 365 would be collected in this way. After all, security policies are in place to control the Connected Experiences option if that's a concern. The difficulty folks face is that despite Microsoft's protestations, its privacy statement (as of November 2024) does permit it to do all manner of things with the data it collects. And how does it use that data? "As part of our efforts to improve and develop our products, we may use your data to develop and train our AI models." In August, Microsoft said it would be using consumer data from Copilot, Bing, and Microsoft Start to train Copilot's generative AI models. At the time, the biz said it would allow customers to opt out and would start displaying the opt-out control in October. It also said it wouldn't be conducting training on consumer data from the European Economic Area. Could the same apply to documents created by people in the company's flagship productivity suite? It's quite a leap from training a generative AI on what goes into Copilot to making use of Word and Excel documents under the guise of Connected Experiences. The two are very different services. So, on the one side, Microsoft is clear: It does not use customer data to train models. On the other hand, "we may use your data to develop and train our AI models." As for what it means by "data" in its privacy statement, the biz writes: "You provide some of this data directly, and we get some of it by collecting data about your interactions, use, and experiences with our products." The fact that concerns are being raised indicates some users are concerned about Microsoft's AI obsession. The Windows maker must therefore maintain clarity and transparency over what is and is not going to be absorbed into its models.
[6]
Microsoft denies training AI models on user data
(Reuters) - Microsoft on Wednesday denied claims that it uses customer data from its Microsoft 365 applications, including Word and Excel, to train artificial intelligence models. The response came after some users pointed out on social media that the company requires users to opt-out of its "connected experiences" feature, which they argued was used to train AI models. "These claims are untrue. Microsoft does not use customer data from Microsoft 365 consumer and commercial applications to train foundational large language models," a Microsoft spokesperson said in an emailed statement to Reuters. The spokesperson added that the "connected experiences" enables features such as co-authoring and cloud storage, and has no connection to how the company trains its large language models. The conversations on social media indicate that people remain concerned about their data being used to train AI models without permission. (Reporting by Zaheer Kachwala in Bengaluru; Editing by Alan Barona)
[7]
Microsoft Is Denying That Office 365 Trains Its AI
Microsoft says that "connected experiences" have nothing to do with training LLMs. Following concerns that erupted on social media and its own support forums over the past few weeks, Microsoft wants to set the record straight: the company does not use Microsoft 365 (formerly Microsoft Office) apps to train its AI models, Copilot or otherwise. The confusion stemmed from the apps' "connected experiences" toggles, which are on by default and power features like cloud fonts and downloadable document templates. Connected experiences were added to the suite well before the advent of generative AI, so it's a bit unclear what sparked the controversy, although it might be attributable to a recent support document from Microsoft that explains which connected experiences "analyze your content." In response to a popular post on X that claimed the feature "scrapes your Word and Excel documents to train its internal AI systems," the official Microsoft 365 account replied "In the M365 apps, we do not use customer data to train LLMs. This setting only enables features requiring internet access like co-authoring a document." While vigilance in protecting your data is always advisable, the company assures that analytical connected experiences in Microsoft 365 simply search the internet or consult Microsoft's servers for help with your documents, with most working like an advanced spell check. Microsoft says in a different support document that "common examples [of these experiences] include translating text in a document, checking spelling in an email, or suggesting design changes to a presentation." In other words, you need to give Microsoft permission to look at what you've typed if it's going to transcribe it or check its grammar. According to the company, the feature is unrelated to training AI.
In a statement to How-To Geek, a Microsoft spokesperson clarified further, calling connected experiences an "industry standard setting," saying: Microsoft does not use customer data from Microsoft 365 consumer and commercial applications to train large language models. Additionally, the Connected Services setting has no connection to how Microsoft trains large language models. Microsoft communications head Frank Shaw also gave his own two cents on the issue over on Bluesky, saying that the concerns are "not true." The controversy follows similar issues Adobe faced after updating its user terms, sparking concerns that the company would now scrape user imagery to feed its AI image generator. While Microsoft is now on the record stating that this particular worry is a misunderstanding, it's understandable why users are concerned. Previously, other tech companies including X and Meta have opted users into AI training by default, legitimizing concerns that other companies might make similar moves in the future. If you would still rather Microsoft's systems not be able to see what you've typed in Word or Excel, you'll need to uncheck a few toggles. If you have a document that is stored solely online, such as Word docs made with a work or school account, your controls over connected experiences will be more limited. Here, navigate to File > About > Privacy Settings > Optional connected experiences. If your organization allows it, you can then uncheck the box to turn off features such as Smart Lookup or Insert Online Picture.
[8]
Microsoft Says It Doesn't Use Your Office Docs to Train Its AI
Microsoft has officially refuted the claim saying that "we do not use customer data to train LLMs." The trend of default opt-in for using user data to train AI models is increasingly rising. Recently, it was revealed that Meta used Instagram photos to train its AI. Similarly, X started using user data to train its Grok AI by default. And now, an X user (@nixcraft) claimed that Microsoft is training its AI models on Word and Excel documents. It created quite a furor on the internet. The X post suggested that Microsoft quietly enabled "Connected Experiences" to use your private Office documents for AI training. It didn't seek explicit user permission and opted-in users without their knowledge. After going through the post, I tried to find out about "Connected Experiences" and what its terms and details say. First of all, the issue started with a Medium post, but it doesn't mention the exact terms where Microsoft states that Office documents are used for AI training. In Microsoft Word, you can find the privacy settings under Account > Manage Settings > Connected Experiences. Here, it says "Experiences that analyze your content." The wording might have spooked users into believing that Microsoft is using Word documents to train AI models. However, when you click on "Learn more", Microsoft's support page says: "Connected experiences that analyze your content are experiences that use your Office content to provide you with design recommendations, editing suggestions, data insights, and similar features. For example, PowerPoint Designer or Translator." Basically, when you use AI-powered features in Office apps, it analyzes your content. For example, Microsoft Editor uses your content to check for grammar. Analyze Data in Excel uses your data to find trends and patterns, and so on. Microsoft has listed all the features where it analyzes the content. Now, if you disable "Connected Experiences", some of these features also become unavailable in Office apps. 
To sum up, there is no mention of AI training for Office documents on the support page. Finally, Microsoft also responded to the X post and said, "In the M365 apps, we do not use customer data to train LLMs. This setting only enables features requiring internet access like co-authoring a document." So to sum up, Microsoft doesn't use your Word documents or Excel data to train its AI models. While some companies are indeed resorting to dubious tricks to collect your data, that is not the case here, at least for now.
[9]
Microsoft hits back at claims AI data scraping was sneakily turned on in Word, Excel
Publicly available information is considered open for training. Microsoft's use of so-called 'Connected Experiences' has come under scrutiny following claims it collected user-generated content to train its AI models. The latest claims stem from an X post by @nixCraft, who accuses Microsoft of turning on an opt-out feature that automatically scrapes Word and Excel documents for AI training. @nixCraft continues: "This setting is turned on by default, and you have to manually uncheck a box in order to opt out." Concerns were raised about the use of proprietary content belonging to writers and creators who wish to protect, copyright or sell their content. The X user even shared steps on how to disable Connected Experiences via File > Options > Trust Center > Trust Center Settings > Privacy Options > Optional Connected Experiences. Despite the claims, Microsoft 365 replied to the thread, stating: "In the M365 apps, we do not use customer data to train LLMs. This setting only enables features requiring internet access like co-authoring a document." In an earlier August 2024 blog post, Microsoft confirmed user data remains private and is not disclosed without permission. The company wrote: "Generative AI models do not store training data or return it to provide a response, and instead are designed to generate new content." Microsoft also promised to alert users "transparently" in the event of a change to how it handles consumer data for training GenAI models in Copilot. On the whole, the company has made substantial efforts to differentiate customer data from readily available online sources. Microsoft seemingly treats the latter completely separately, with Microsoft AI CEO Mustafa Suleyman calling public information "freeware" for AI training.
[10]
Is Microsoft Using Your Word Documents to Train AI?
People are increasingly worried about how AI affects them. Will our jobs become obsolete? Are we constantly being tracked online and in the real world? Most recently, people have raised concerns about whether our creative and professional work is being used to train large language models. Over the past couple of weeks, rumors have surfaced that Microsoft is scraping customers' Word documents and Excel spreadsheets to train its AI systems. These claims arose after Microsoft introduced a new feature in October called Connected Experiences, a tool buried within its programs' Options menus that is activated by default unless you opt out. It contains three checkboxes -- "Turn On Experiences That Analyze Your Content," "Turn On Experiences That Download Online Content," and "Turn On All Connected Experiences." One could be forgiven for being suspicious. After all, wording like "turn on experiences that analyze your content" is enough to set off alarm bells in the minds of anyone who is justifiably vigilant about how AI behaves. The vague wording on Microsoft's Connected Experiences explainer page also doesn't put these concerns to bed, nor does the company's Privacy Statement, which talks about building, training, and improving "the accuracy of our automated methods." Moreover, on its Copilot page, Microsoft says it "uses data from Bing, MSN, Copilot, and interactions with ads on Microsoft for AI training." But does it use your creative or professional work in the Microsoft 365 apps to do the same? We contacted Microsoft to find out. In response, a Microsoft spokesperson said, "Microsoft does not use customer data from Microsoft 365 consumer and commercial applications to train large language models. Additionally, the Connected Services setting has no connection to how Microsoft trains large language models." 
Adding more context, the spokesperson continued, "The Connected Services setting is an industry standard setting that enables features that require an internet connection. Connected Experiences play a significant role in enhancing productivity by integrating your content with resources available on the web." They also provided specific examples of how Connected Experiences can help you with your work: "Providing relevant suggestions: When you're writing a document in Word, it can suggest relevant information or images from the web. Enhancing collaboration: Features like real-time co-authoring and cloud storage integration make it easier to work with others. Improving productivity: Tools like Editor in Word can offer grammar and style suggestions by analyzing a vast amount of data." The company also posted on X to confirm that it does "not use customer data to train LLMs." Either way, if you still feel uneasy about the Connected Services feature, you can opt out by opening the File menu in Word or Excel, selecting "Options," navigating to "Trust Center," and selecting "Trust Center Settings." From there, open "Privacy Options," navigate to "Privacy Settings," and uncheck the three options listed in the Connected Experiences section.
[11]
Microsoft accused of using online Word and Excel content for AI training, unless you opt-out
TL;DR: Microsoft Office's "Connected Experiences" feature, enabled by default, has been criticized for potentially using user content to train AI without clear disclosure. Microsoft denies using customer data for AI training, stating the feature is for internet-required functions like document co-authoring. Critics highlight Microsoft's Services Agreement, which grants the company rights to user content. Opting out requires navigating specific settings. Microsoft Office, which includes popular productivity apps like Word and Excel, has been accused of scraping user-created content to train its internal AI systems. The feature, called 'Microsoft Connected Experiences,' is enabled by default, and to opt-out, you need to navigate various settings and options. Linux expert @nixcraft on X called 'Connected Experiences' "unethical" because the feature's name doesn't convey that it's all about AI training. Microsoft's knowledge base states that connected experiences are "designed to enable you to create, communicate, and collaborate more effectively" by analyzing your content. There's no mention of AI or data being used for Copilot features or to train Microsoft's AI models. Various media publications have picked up on these allegations, and Microsoft has responded to them using the official @Microsoft365 account on social media, denying the claims. "We do not use customer data to train LLMs," Microsoft said. According to the company, 'Microsoft Connected Experiences' isn't about generative AI tools or training. "This setting only enables features requiring internet access like co-authoring a document." However, as Tom's Hardware points out, Microsoft 365 and Microsoft Connected Experiences are covered by Microsoft's Services Agreement, which gives the company full intellectual property rights to all of your content. 
"To the extent necessary to provide the Services to you and others, to protect you and the Services, and to improve Microsoft products and services, you grant to Microsoft a worldwide and royalty-free intellectual property license to use Your Content, for example, to make copies of, retain, transmit, reformat, display, and distribute via communication tools Your Content on the Services," the agreement reads. With web-based apps and services, social media, search engines, and more, several major players in the tech and AI space have been accurately called out for scraping user data to train AI. The CEO of Microsoft AI, Mustafa Suleyman, recently said that any content on the web "is fair use" for AI training, reproduction, and more. Everything else, which includes data that people actively opt out of sharing or checking a 'do not scrape' box, is a 'gray area.' Not only that, but it's up to the courts to decide what data is off-limits. Back to Microsoft Connected Experiences, the company has stated that it is not using your Word documents to train AI models. Still, if you want to opt-out, you need to go to File > Options > Trust Center > Trust Center Settings > Privacy Options > Privacy Settings > Optional Connected Experiences > Uncheck the box: "Turn on optional connected experiences."
[12]
No, Microsoft isn't using your Office docs to train its AI
Microsoft says it isn't using customer data from its Microsoft 365 apps to train its AI models. The clarification addresses reports circulating online in the last few weeks claiming Microsoft required Word and Excel users to opt out of training the company's AI systems. The confusion arose from a privacy setting in Microsoft Office that toggles "optional connected experiences" -- a feature that helps users "search for online pictures" or "find information available online," according to Microsoft. This toggle is switched on by default, and doesn't mention AI training in the disclosure. Similarly, a Microsoft learning document posted on October 21st, 2024 seems to have contributed to the confusion by describing a long list of connected experiences in Office that "analyze your content" without explicitly excluding AI training.
[13]
Microsoft Office AI training report is a misunderstanding
A tweet claiming that Microsoft Office uses the content of customer documents to train its AI systems raised a lot of eyebrows, but the tech giant says that it's a simple misunderstanding. A Linux account with hundreds of thousands of followers tweeted the claim ... NixCraft tweeted: Heads up: Microsoft Office, like many companies in recent months, has slyly turned on an "opt-out" feature that scrapes your Word and Excel documents to train its internal AI systems. This setting is turned on by default, and you have to manually uncheck a box in order to opt out. If you are a writer who uses MS Word to write any proprietary content (blog posts, novels, or any work you intend to protect with copyright and/or sell), you're going to want to turn this feature off immediately. The tweet included a screengrab of instructions which referenced it as a "new AI privacy agreement." On a Windows computer, follow these steps to turn off "Connected Experiences": File > Options > Trust Center > Trust Center Settings > Privacy Options > Privacy Settings > Optional Connected Experiences > Uncheck box: "Turn on optional connected experiences" However, the official Microsoft 365 account responded, stating that the claim is false. In the M365 apps, we do not use customer data to train LLMs. This setting only enables features requiring internet access like co-authoring a document. Given some of the unethical behavior we've seen when it comes to training AI models, it's not surprising that these claims arise, and get attention when they do. In this case, Microsoft's unequivocal statement makes it clear it's based on a misunderstanding. Tech giants do in part bring this on themselves, however, through extremely lengthy and opaque privacy policies which can make it hard to understand what they are and aren't doing. Adobe ran into a similar problem earlier this year, not helped by an initially dismissive comment which failed to properly clarify the position. 
It was subsequently forced to make a much clearer statement. Tech companies would be doing a service to themselves as well as their users if they revamped their privacy policies to be as succinct and transparent as possible.
[14]
Microsoft is using Word and Excel user data for AI training by default, and opting out isn't easy
A hot potato: The controversial issue of companies training their AI models on user-generated content is once again under the spotlight. This time, the programs in question are the widely used Microsoft Office and Excel. The data gathering is enabled by default, and opting out is a laborious, multi-step process. Microsoft's connected experiences in Office analyze user content to provide the likes of design recommendations, editing suggestions, data insights, and similar features. On X, nixCraft pointed out that the Redmond firm has recently enabled a feature that scrapes users' Word and Excel documents to train its internal AI systems. It's turned on by default, too. As is so often the case when a company wants its customers to keep something enabled, opting out of the data collection is far from quick and simple. On Windows, it requires going to File > Options > Trust Center > Trust Center Settings > Privacy Options > Privacy Settings > Optional Connected Experiences and unchecking the box. Furthermore, once you untick the box, a prompt appears warning that disabling the option means some experiences won't be available. For those few who think this isn't a big deal, Tom's Hardware notes that a clause in Microsoft's Services Agreement grants the company a worldwide and royalty-free intellectual property license to use your content. "To the extent necessary to provide the Services to you and others, to protect you and the Services, and to improve Microsoft products and services, you grant to Microsoft a worldwide and royalty-free intellectual property license to use Your Content, for example, to make copies of, retain, transmit, reformat, display, and distribute via communication tools Your Content on the Services," according to the clause. Microsoft isn't the only one guilty of this sort of underhand behavior, of course.
Meta also uses public posts, comments, photos, and interactions with chatbots from Facebook, Instagram, Threads, and WhatsApp to train its AI models. Unlike in the EU and UK, those in the US do not have a straightforward way to opt out - setting your account to private helps, but it still isn't a guarantee this won't happen. In August, it was revealed that Nvidia, the company whose hardware powers the generative AI revolution, had been downloading 80 years of videos daily from YouTube, Netflix, and other platforms to train its AI models. Microsoft hasn't commented on the story. It's possible that the outcry could lead to a clarification of its terms of use, much like Adobe did after a popup suggested that the company could access and claim ownership of content made with its creative suite to train AI models.
[15]
Microsoft Word and Excel AI data scraping slyly switched to opt-in by default -- the opt-out toggle is not that easy to find
Opting out requires a lengthy excursion through Microsoft's Trust Center. It is not a secret that Microsoft's Office has Connected Experiences which analyze content created by users. However, according to @nixCraft, an author of Cyberciti.biz, Microsoft's Connected Experiences feature automatically gathers data from Word and Excel files to train the company's AI models. This feature is turned on by default, meaning user-generated content is included in AI training unless manually deactivated, and deactivating it is a very convoluted process. Microsoft has yet to comment on the information, so take it with a grain of salt. This default setting allows Microsoft to use documents such as articles, novels, or other works intended for copyright or commercial purposes without explicit consent. The implications are significant for creators and businesses relying on Microsoft Office for proprietary work, as their data could become part of the company's AI development. For this reason, anyone concerned about protecting their intellectual property or sensitive information should take action immediately. To do so, users must actively opt out by finding and disabling the feature in settings. The process requires unchecking the box 'Turn on optional connected experiences' that is enabled by default. On a Windows PC, the steps include going to File > Options > Trust Center > Trust Center Settings > Privacy Options > Privacy Settings > Optional Connected Experiences and unchecking the box. Requiring seven steps to disable a critical feature that is turned on automatically seems excessive. Microsoft's approach mirrors a broad trend in the tech industry, where other companies have introduced similar features to train their AI models. While all AI models are trained on something generated by humans, doing so without their consent is unethical, to put it mildly.
Microsoft has not publicly confirmed or denied that it uses content from Excel and Word documents generated by users of Microsoft Office to train its AI models. Nonetheless, there is a clause in Microsoft's Services Agreement that grants the company 'a worldwide and royalty-free intellectual property license to use Your Content.' "To the extent necessary to provide the Services to you and others, to protect you and the Services, and to improve Microsoft products and services, you grant to Microsoft a worldwide and royalty-free intellectual property license to use Your Content, for example, to make copies of, retain, transmit, reformat, display, and distribute via communication tools Your Content on the Services," the clause reads.
[16]
After Office Software Outage, Microsoft Faces Allegations of Data Scraping in Word and Excel
According to a report on tech news site Tom's Hardware, Microsoft's Office system -- the subscription-based online software suite that contains many tools aimed at digital workers, and used by over a billion people worldwide -- has a new "Connected Experiences" feature that, as part of its normal operation, "automatically gathers data from Word and Excel files to train the company's AI models." The data grab is apparently enabled by default, which means users' content may be used to train AI algorithms unless it's specifically deactivated. The deactivation itself is said to be "a very convoluted process." This default setting, Tom's Hardware contends, allows Microsoft to use all sorts of documents, be they "articles, novels, or other works intended for copyright or commercial purposes without explicit consent." This has gigantic implications for anyone who may use these well-known tools for generating proprietary content, be it written material or, perhaps, sensitive financial data. The issue is that when a piece of content is used to train one of the current crop of AIs -- like Microsoft's Copilot -- an imprint, of sorts, is left in the algorithm. When a different user asks for information at some point in the future, it's possible that your content, however sensitive it may be, or whatever proprietary information it contains, may "surface" in some form in the generative AI's response.
Microsoft responds to claims that it uses customer data from Microsoft 365 apps like Word and Excel to train AI models, sparking a debate about data privacy and AI development.
Microsoft has firmly denied allegations that it uses customer data from Microsoft 365 applications, including Word and Excel, to train its artificial intelligence (AI) models. The controversy erupted following claims on social media that the company's "Connected Experiences" feature was being used to collect data for AI training without explicit user consent [1].
The "Connected Experiences" feature, enabled by default in Microsoft 365 apps since April 2019, has been the focal point of this debate. Microsoft clarifies that this feature is designed to support internet-required functions such as co-authoring, real-time grammar suggestions, and access to web-based resources [2].
A Microsoft spokesperson stated unequivocally, "Microsoft does not use customer data from Microsoft 365 consumer and commercial applications to train large language models" [3]. The company emphasizes that the Connected Experiences setting has no connection to how Microsoft trains its large language models or AI tools like Copilot [2].
Microsoft asserts that users have control over their data and can adjust Connected Experiences settings at any time. The company provides instructions for disabling the feature through the Privacy Settings in Office applications [1]. For enterprise users, administrators have additional policy settings to manage privacy controls across their organizations [2].
While denying the use of customer data for training large language models, Microsoft acknowledges its long-standing use of AI in Microsoft 365. Features like Designer in PowerPoint and Editor in Word utilize simpler machine learning algorithms to enhance productivity and creativity, rather than relying on generative AI or large language models [2].
The controversy highlights ongoing public concerns about data privacy and AI training practices. Users remain wary of their data being used without explicit permission, especially in light of recent incidents involving other tech companies [4].
Despite Microsoft's denials, its privacy statement (as of November 2024) does permit the use of collected data to "develop and train our AI models" [5]. This apparent contradiction underscores the need for greater clarity and transparency in how tech companies handle user data in the age of AI development.
As AI technology continues to evolve, the incident serves as a reminder of the delicate balance between innovation and user privacy, prompting ongoing discussions about data usage policies and user consent in the tech industry.
Summarized by
Navi