Startups tap India's gig workers to collect physical AI training data, igniting privacy concerns


Silicon Valley startup Human Archive has raised $8.2 million to deploy camera-equipped headsets on Indian gig workers, collecting first-person video of everyday tasks to train robots. The initiative has sparked controversy as home services platforms like Pronto pilot in-home data recording, prompting India's IT ministry to take notice while legal experts warn existing privacy laws may not adequately address AI systems learning inside private spaces.

Human Archive Secures Funding to Scale Physical AI Data Collection

Silicon Valley-based startup Human Archive has raised $8.2 million from Wing Venture Capital, NVP Capital, Y Combinator, and angel investors from OpenAI, Nvidia, Google, and Meta to expand its operations collecting real-world data for physical AI systems [1]. Founded by four students from Berkeley and Stanford (Samay Mani, Rushil Agarwal, Shloke Patel, and Raj Patel), the company is betting that India's gig economy is an untapped source of the training data needed to teach robots everyday tasks. Unlike large language models, which learn from internet text, physical AI requires first-person footage of real people performing real tasks in actual environments, data that does not exist at scale online [5].

Source: TechCrunch


The startup partners with companies in the home services, hospitality, and restaurant sectors, equipping workers with camera-equipped caps to capture egocentric video: first-person footage of tasks like washing dishes, folding laundry, and cooking [1]. Human Archive currently has more than 1,000 active headsets deployed across multiple locations and over 50 different devices collecting various data points. To differentiate itself from competitors, the company is developing additional wearable devices, including tactile gloves, full-body motion capture suits, and wrist cameras that synchronize RGB-D imagery with depth information, motion, and tactile force data [1].

Training Robots Through India's Gig Workers

India's booming on-demand services market, exemplified by platforms like Zomato, Swiggy, and Urban Company, gives Human Archive access to workers who already perform the exact tasks robotics labs need to replicate [1]. The global physical AI market is projected to grow from $1.50 billion in 2026 to $15.24 billion by 2032, according to MarketsandMarkets [3]. Companies collecting this real-world data are emerging across India, including HumynAI Labs, Egodata, Neo Cambrian, XP Robotics, and Objectways, which gather footage of everything from household chores to manufacturing tasks [4].

Source: ET


Data collection industry sources indicate that firms are selling these datasets to AI labs in the US, where physical AI development is more advanced. While companies previously commanded rates of $10-15 per hour for such datasets, increased competition has driven prices down to around $3-4 per hour [3]. One AI lab founder noted that appetite for this kind of data is enormous because almost nothing exists today, even for tasks as simple as folding cloth, and AI models need massive amounts of human-generated data to learn effectively [3].

Pronto Controversy Triggers Government Scrutiny

The data collection push sparked controversy after home services startup Pronto reportedly sent camera-equipped workers into customers' homes to collect footage for training robotics systems. An internal memo from investor Glade Brook Capital revealed that Pronto is "piloting real-world training data with leading physical AI labs" and "developing a data business leveraging its workforce to capture real-world household data for robotics labs" [5]. The Ministry of Electronics and Information Technology (MeitY) has taken note of these developments, particularly Pronto's in-home data recording pilots, though detailed queries about potential audit requirements remained unanswered.

Source: Digit


Competitor Snabbit confirmed that it conducted a pilot with Human Archive in April, evaluating a preliminary proposal within a controlled training-center environment before deciding not to proceed further [3]. Urban Company CEO Abhiraj Singh Bhal publicly stated that his company would not engage in such arrangements, emphasizing that customer privacy is paramount [1]. Human Archive co-founder Raj Patel responded sharply on X, suggesting Urban Company would be forced to reconsider or risk losing relevance, while co-founder Rushil Agarwal claimed Pronto founder Anjali Sardana had called him "stupid" when he raised the partnership idea [1].

Data Privacy Concerns and Legal Gaps

Legal experts warn that India's privacy laws may not adequately regulate continuously learning AI systems operating inside homes. Hardeep Sachdeva, partner at AZB & Partners, noted that the concern extends beyond traditional surveillance to "the gradual creation of highly sophisticated behavioural ecosystems capable of mapping routines, habits, preferences, conversations and emotional patterns" [2]. The real legal complexity is that even if raw recordings are deleted, AI models may continue to retain behavioral patterns, spatial intelligence, predictive insights, and model improvements extracted from that data [2].

Supratim Chakraborty, partner at Khaitan & Co, explained that while India lacks standalone AI legislation, existing frameworks including the Digital Personal Data Protection Act, Information Technology Act, and consumer protection provisions could apply through consent and purpose limitation principles [2]. However, home-based AI systems may operate in a "persistent and context-aware" manner inside highly private spaces, potentially collecting information about family members, children, guests, or domestic workers present in households [2]. Anushkaa Arora, founder of ABA Law Office, called for "targeted regulations specifically addressing AI-assisted surveillance, retention of recordings and accountability mechanisms" [3].

India's Role as AI Data Services Hub

India is rapidly emerging as a data services hub for AI, with thousands of gig workers and data annotation jobs proliferating across the country [4]. Vineet Saraogi, co-founder of XP Robotics, called this the "new age back office for AI," in which data collected in India would train AI models globally. Some experts draw parallels to historical patterns in which India supplied raw materials while value-added processing happened elsewhere [4]. India ranked 38th in the 2025 Global Innovation Index, trailing the US, Europe, and several Asian countries, and multiple founders noted that India lacks the research muscle necessary to build frontier technologies [4]. As AI companies increasingly seek real-world behavioral data beyond internet content, homes and private spaces could become large-scale AI training grounds if regulations fail to keep pace with technology, experts warned [2].


© 2026 TheOutpost.AI All rights reserved