Curated by THEOUTPOST
On Fri, 4 Apr, 12:02 AM UTC
2 Sources
[1]
Vana is letting users own a piece of the AI models trained on their data
In February 2024, Reddit struck a $60 million deal with Google to let the search giant use data on the platform to train its artificial intelligence models. Notably absent from the discussions were Reddit users, whose data were being sold.
The deal reflected the reality of the modern internet: Big tech companies own virtually all our online data and get to decide what to do with that data. Unsurprisingly, many platforms monetize their data, and the fastest-growing way to accomplish that today is to sell it to AI companies, who are themselves massive tech companies using the data to train ever more powerful models.
The decentralized platform Vana, which started as a class project at MIT, is on a mission to give power back to the users. The company has created a fully user-owned network that allows individuals to upload their data and govern how they are used. AI developers can pitch users on ideas for new models, and if the users agree to contribute their data for training, they get proportional ownership in the models.
The idea is to give everyone a stake in the AI systems that will increasingly shape our society while also unlocking new pools of data to advance the technology.
"This data is needed to create better AI systems," says Vana co-founder Anna Kazlauskas '19. "We've created a decentralized system to get better data -- which sits inside big tech companies today -- while still letting users retain ultimate ownership."
From economics to the blockchain
A lot of high school students have pictures of pop stars or athletes on their bedroom walls. Kazlauskas had a picture of former U.S. Treasury Secretary Janet Yellen.
Kazlauskas came to MIT sure she'd become an economist, but she ended up being one of five students to join the MIT Bitcoin club in 2015, and that experience led her into the world of blockchains and cryptocurrency.
From her dorm room in MacGregor House, she began mining the cryptocurrency Ethereum. She even occasionally scoured campus dumpsters in search of discarded computer chips.
"It got me interested in everything around computer science and networking," Kazlauskas says. "That involved, from a blockchain perspective, distributed systems and how they can shift economic power to individuals, as well as artificial intelligence and econometrics."
Kazlauskas met Art Abal, who was then attending Harvard University, in the former Media Lab class Emergent Ventures, and the pair decided to work on new ways to obtain data to train AI systems.
"Our question was: How could you have a large number of people contributing to these AI systems using more of a distributed network?" Kazlauskas recalls.
Kazlauskas and Abal were trying to address the status quo, where most models are trained by scraping public data on the internet. Big tech companies often also buy large datasets from other companies.
The founders' approach evolved over the years and was informed by Kazlauskas' experience working at the financial blockchain company Celo after graduation. But Kazlauskas credits her time at MIT with helping her think about these problems, and the instructor for Emergent Ventures, Ramesh Raskar, still helps Vana think about AI research questions today.
"It was great to have an open-ended opportunity to just build, hack, and explore," Kazlauskas says. "I think that ethos at MIT is really important. It's just about building things, seeing what works, and continuing to iterate."
Today Vana takes advantage of a little-known law that allows users of most big tech platforms to export their data directly. Users can upload that information into encrypted digital wallets in Vana and disburse it to train models as they see fit. AI engineers can suggest ideas for new open-source models, and people can pool their data to help train the model.
In the blockchain world, the data pools are called data DAOs, where DAO stands for decentralized autonomous organization. Data can also be used to create personalized AI models and agents. In Vana, data are used in a way that preserves user privacy because the system doesn't expose identifiable information. Once the model is created, users maintain ownership so that every time it's used, they're rewarded proportionally based on how much their data helped train it.
"From a developer's perspective, now you can build these hyper-personalized health applications that take into account exactly what you ate, how you slept, how you exercise," Kazlauskas says. "Those applications aren't possible today because of those walled gardens of the big tech companies."
Crowdsourced, user-owned AI
Last year, a machine-learning engineer proposed using Vana user data to train an AI model that could generate Reddit posts. More than 140,000 Vana users contributed their Reddit data, which contained posts, comments, messages, and more. Users decided on the terms under which the model could be used, and they maintained ownership of the model after it was created.
Vana has enabled similar initiatives with user-contributed data from the social media platform X, sleep data from sources like Oura rings, and more. There are also collaborations that combine data pools to create broader AI applications.
"Let's say users have Spotify data, Reddit data, and fashion data," Kazlauskas explains. "Usually, Spotify isn't going to collaborate with those types of companies, and there's actually regulation against that. But users can do it if they grant access, so these cross-platform datasets can be used to create really powerful models."
Vana has over 1 million users and over 20 live data DAOs. More than 300 additional data pools have been proposed by users on Vana's system, and Kazlauskas says many will go into production this year.
"I think there's a lot of promise in generalized AI models, personalized medicine, and new consumer applications, because it's tough to combine all that data or get access to it in the first place," Kazlauskas says.
The data pools are allowing groups of users to accomplish something even the most powerful tech companies struggle with today.
"Today, big tech companies have built these data moats, so the best datasets aren't available to anyone," Kazlauskas says. "It's a collective action problem, where my data on its own isn't that valuable, but a data pool with tens of thousands or millions of people is really valuable. Vana allows those pools to be built. It's a win-win: Users get to benefit from the rise of AI because they own the models. Then you don't end up in a scenario where you have a single company controlling an all-powerful AI model. You get better technology, but everyone benefits."
[2]
Decentralized platform is letting users own a piece of the AI models trained on their data
Vana, a decentralized platform born from an MIT class project, is reshaping the AI landscape by allowing users to own and govern AI models trained on their data, challenging the status quo of big tech data monopolies.
Vana, a decentralized platform that began as an MIT class project, is changing how personal data is used in AI development. It arrives at a time when big tech companies dominate the data landscape, as exemplified by the $60 million deal Reddit struck with Google in February 2024 for AI training data [1][2].
Vana, co-founded by MIT alumna Anna Kazlauskas, aims to shift the power dynamics of data ownership. The platform allows individuals to upload their data, govern its usage, and receive proportional ownership in AI models trained on their information [1][2].
"This data is needed to create better AI systems," Kazlauskas explains. "We've created a decentralized system to get better data -- which sits inside big tech companies today -- while still letting users retain ultimate ownership." [1][2]
Kazlauskas' path from economics to blockchain technology at MIT played a crucial role in Vana's development. Joining the MIT Bitcoin club in 2015 and mining the cryptocurrency Ethereum from her dorm room sparked her interest in distributed systems and their potential to shift economic power to individuals [1][2].
Vana takes advantage of a law that allows users to export their data from most big tech platforms. Users can then upload this information into encrypted digital wallets on Vana's platform. AI developers can propose ideas for new open-source models, and users can pool their data to train these models, forming what are known as data DAOs (decentralized autonomous organizations) [1][2].
According to the company, the platform preserves user privacy by not exposing identifiable information. Once a model is created, users maintain ownership and are rewarded proportionally based on how much their data contributed to its training. This approach opens up possibilities for hyper-personalized applications, particularly in health and lifestyle domains [1][2].
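To make the idea of proportional rewards concrete, here is a minimal sketch in Python. It is not Vana's actual mechanism: the sources do not say how a user's contribution is measured or how payouts are calculated, so the contribution scores, the user names, and the `split_reward` function are all hypothetical and serve only to illustrate a pro-rata split.

```python
from fractions import Fraction


def split_reward(contributions: dict[str, int], reward: Fraction) -> dict[str, Fraction]:
    """Divide `reward` among contributors in proportion to their scores.

    `contributions` maps a user to a hypothetical contribution score
    (for example, the number of records that user supplied).
    """
    total = sum(contributions.values())
    if total <= 0:
        raise ValueError("at least one contributor must have a positive score")
    # Each user's share is score / total, applied to the whole reward.
    return {user: reward * Fraction(score, total) for user, score in contributions.items()}


# Hypothetical data pool with three contributors and a reward of 50 units.
contributions = {"alice": 600, "bob": 300, "carol": 100}
payouts = split_reward(contributions, Fraction(50))

for user, amount in payouts.items():
    print(f"{user}: {float(amount):.2f}")
```

Exact fractions are used so that the individual payouts always add up to the full reward; a real system would also have to decide how contribution is scored in the first place, which this sketch leaves open.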
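Vana has already facilitated several initiatives. In one, more than 140,000 users contributed their Reddit data to train a model that generates Reddit posts; others have used data contributed from the social media platform X and sleep data from sources like Oura rings [1][2].
With over 1 million users and more than 20 live data DAOs, Vana is expanding. Users have proposed more than 300 additional data pools, and Kazlauskas says many will go into production this year [1][2].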
As Vana continues to grow, it represents a significant shift in the AI data landscape, potentially democratizing access to AI development and giving individuals a stake in the AI systems that increasingly shape our world.
Reference
[1]
Massachusetts Institute of Technology
| Vana is letting users own a piece of the AI models trained on their data
© 2025 TheOutpost.AI All rights reserved