[1]
The laid-off lawyers and PhDs training AI to steal their careers
This article is a collaboration between The Verge and New York Magazine.

The LinkedIn post seemed like yet another scam job offer, but Katya was desperate enough to click. After college, she'd struggled to make a living as a freelance journalist, gone to grad school, then pivoted to what she hoped would be a more stable career in content marketing -- only to find AI had automated much of the work. This company was called Crossing Hurdles, and it promised copywriting jobs starting at $45 per hour. Katya clicked and was taken to a page for another company, called Mercor, where she was instructed to interview on-camera with an AI named Melvin. "It just seemed like the sketchiest thing in the world," Katya says. She closed the tab. But a few weeks later, still unemployed, she got a message inviting her to apply to Mercor. This time, she looked up the company. Mercor, it seemed, sold data to train AI, and she was being recruited to create that data. "My job is gone because of ChatGPT, and I was being invited to train the model to do the worst version of it imaginable," she says. The idea depressed her. But her financial situation was increasingly dire, and she had to find a new place to live in a hurry, so she turned on her webcam and said "hello" to Melvin.

It was a strange, if largely pleasant, experience. Manifesting on Katya's laptop as a disembodied male voice, Melvin seemed to have actually read her résumé and asked specific questions about it. A few weeks later, Katya, who like most workers in this story asked to use a pseudonym out of fear of retaliation, received an email from Mercor offering her a job. If she accepted, she should sign the contract, submit to a background check, and install monitoring software onto her computer. She signed immediately.

She was added to a Slack channel, where it was clear she was entering a project already underway. Hundreds of people were busy writing examples of prompts someone might ask a chatbot, writing the chatbot's ideal response to those prompts, then creating a detailed checklist of criteria that defined that ideal response. Each task took several hours to complete before the data was sent to workers stationed somewhere down the digital assembly line for further review. Katya wasn't told whose AI she was training -- managers referred to it only as "the client" -- or what purpose the project served. But she enjoyed the work. She was having fun playing with the models, and the pay was very good. "It was like having a real job," she says.

Two days after Katya started, the project was abruptly paused. A few days after that, a supervisor popped into the room to let everyone know it had been canceled. "I'm working assuming that I can plan around this. I'm saving up for first and last month's rent for an apartment," Katya says, "and then I'm back on my ass. No warning, no security, nothing." Several days later, she got an email from Mercor with another offer, this one for a job evaluating what seemed to be conversations between chatbots and real users -- many appeared to be from people in Malaysia and Vietnam practicing English -- according to various criteria, like how well the chatbot followed instructions and the appropriateness of its tone. Sign the contract, the email said, and you'll have a Zoom onboarding call in 45 minutes. It was 6:30PM on a Sunday night. Scarred from the abrupt disappearance of the previous gig, she accepted the offer and worked until she couldn't stay awake.
Machine-learning systems learn by finding patterns in enormous quantities of data, but first that data has to be sorted, labeled, and produced by people. ChatGPT got its startling fluency from thousands of humans hired by companies such as Scale AI and Surge AI to write examples of things a helpful chatbot assistant would say and to grade its best responses. A little over a year ago, concerns began to mount in the industry about a plateau in the technology's progress. Training models based on this type of grading yielded chatbots that were very good at sounding smart but still too unreliable to be useful. The exception was software engineering, where the ability of models to automatically check whether bits of code worked -- did the code compile, did it print HELLO WORLD -- allowed them to trial-and-error their way to genuine competence. The problem was that few other human activities offer such unambiguous feedback. There are no objective tests for whether financial analysis or advertising copy is "good." Undeterred, AI companies set out to make such tests, collectively paying billions of dollars to professionals of all types to write exacting and comprehensive criteria for a job well done. Mercor, the company Katya stumbled upon, was founded in 2023 by three then-19-year-olds from the Bay Area, Brendan Foody, Adarsh Hiremath, and Surya Midha, as a jobs platform that used AI interviews to match overseas engineers with tech companies. The company received so many inquiries from AI developers seeking professionals to produce training data that it decided to adapt. Last year, Mercor was valued at $10 billion, making its trio of founders the world's youngest self-made billionaires. OpenAI has been a client; so has Anthropic. Each of these data companies touts its stable of pedigreed experts. Mercor says around 30,000 professionals work on its platform each week, while Scale AI claims to have more than 700,000 "M.A.'s, Ph.D.'s, and college graduates." Surge AI advertises its Supreme Court litigators, McKinsey principals, and platinum recording artists. These companies are hiring people with experience in law, finance, and coding, all areas where AI is making rapid inroads. But they're also hiring people to produce data for practically any job you can imagine. Job listings seek chefs, management consultants, wildlife-conservation scientists, archivists, private investigators, police sergeants, reporters, teachers, and rental-counter clerks. One recent job ad called for experts in "North American early to mid-teen humor" who can, among other requirements, "explain humor using clear, logical language, including references to North American slang, trends, and social norms." It is, as one industry veteran put it, the largest harvesting of human expertise ever attempted. These companies have found rich recruiting ground among the growing ranks of the highly educated and underemployed. Aside from the 2008 financial crash and the pandemic, hiring is at its lowest point in decades. This past August, the early-career job-search platform Handshake found that job postings on the site had declined more than 16 percent compared with the year before and that listings were receiving 26 percent more applications. Meanwhile, Handshake launched an initiative last year connecting job seekers with roles producing AI training data. 
"As AI reshapes the future of work," the company wrote, announcing the program, "we have the responsibility to rethink, educate, and prepare our network to navigate careers and participate in the AI economy." There is an underlying tension between the predictions of generally intelligent systems that can replace much of human cognitive labor and the money AI labs are actually spending on data to automate one task at a time. It is the difference between a future of abrupt mass unemployment and something more subtle but potentially just as disruptive: a future in which a growing number of people find work teaching AI to do the work they once did. The first wave of these workers consists of software engineers, graphic designers, writers, and other professionals in fields where the new training techniques are proving effective. They find themselves in a surreal situation, competing for precarious gigs pantomiming the careers they'd hoped to have. achEach of the more than 30 workers I spoke with occupied a position along a vast and growing data-supply chain. There are people crafting checklists that define a good chatbot response, typically called "rubrics," and other people grading those rubrics. Others grade chatbot answers according to those rubrics, and still others take the rubrics and write out what's often described as a "golden output," or the ideal chatbot answer. Others are asked to explain every step they took to arrive at this golden output in the voice of a chatbot thinking to itself, producing what's called a "reasoning trace" for AI to follow later when it encounters a similar task out in the real world. Sometimes the labs want only rubrics for prompts their AI can't already do, which means companies like Mercor ask workers to produce "stumpers," or requests that will make the model fail. "It sounds easy, but it's really hard," says a worker who was trying to stump models by asking them to make inventory-management dashboards. Models fail in counterintuitive ways. They may be able to solve advanced-physics exam questions, but ask them for transit directions and they'll recommend transferring on nonconnecting train lines. Finding these weak spots takes time and creativity. One type of project gathers groups of lawyers, human-resources managers, teachers, consultants, or bankers for something Mercor calls world-building. "You and your team will role-play a real-life team within your profession," the training materials read. The teams are given dedicated emails, calendars, and chat apps and asked to create a hundred or more documents that would be associated with some corporate undertaking, like a fictional mining company analyzing whether to enter the data-center business. After several 16-hour days of fantasy document production, one worker recounts, the resulting slide decks, meeting notes, and financial forecasts are sent to another team, which uses them as grist in their attempts to stump a model operating in this simulated corporate environment. Then, having stumped the model, that team writes new, more nuanced rubrics, golden answers, and so on. Workers can only guess who the customer is or how many others are working on the project -- based on references to teams like Management Consulting World No. 133, there could be hundreds, maybe thousands. There are people hired to evaluate the ability of image models to follow their prompts and others who summarize video clips in extraordinary detail, presumably to train video models. 
Efforts to improve AI's ability to have spoken conversations have resulted in a surging demand for voice actors, who might find themselves recording "authentic, emotionally resonant" speeches, according to one listing. "I just tell people I'm an AI trainer, then it sounds more professional than what I'm doing," says an aspiring screenwriter who was instructed to record himself pretending to ask a chatbot for a fitness plan while pots and pans clanged in the kitchen. Another time, he was told to record himself dispensing financial advice over the phone to a parade of people he assumed were other workers. This audio might then be broken down and sent to someone like Ernest, who used to make a living as an online tutor until the company he worked for replaced him with a chatbot. When we spoke, he was listening to minutelong clips of random dialogue slowed to 0.1x speed and marking when someone started and stopped speaking down to the millisecond. Many of the clips included a person talking with a chatbot and interjecting "huh" or "I see," so he assumes he was improving AI's ability to have naturally flowing conversation, but he has no actual idea. As is standard practice in the field, the project was referred to by a codename and the client only ever as "the client."

The entire system is designed so that workers have minimal insight into the supply chain they are part of. If they find out who the customer is, they are contractually forbidden from telling anyone, even their own colleagues. Nor are they allowed to describe the details of their work beyond broad generalities like "providing expertise in XYZ domain to improve models for a top AI lab," according to one Mercor agreement. So afraid are workers of inadvertently violating their confidentiality agreements and getting fired that when they discuss their work in public forums, they mask their already codenamed projects with additional codenames, for example by referring to a project called "Raven" as "Poe."

Katya's second project with Mercor was far more stressful. There was less work to go around, and it came in fits and starts. Managers would drop a message in the Slack channel saying new tasks were incoming in half an hour, and, she says, "everyone in Slack would drop what they were doing and jump on them like piranhas," working as fast as they could while the bar showing how many tasks remained slid toward zero. Then they were back in Slack again, politely begging supervisors for more work and more hours, talking about their kids' birthdays or their need to pay rent, or telling anyone who might be listening that their availability was wide open in case there was more work to be done. Soon, Katya was dropping everything at the sound of a Slack ding too. "Sometimes I'm on the toilet or at dinner and I get the Slack notification. I'm like, 'Oh, sorry, I gotta work now.'" That project soon ended and then came another. It was nearly identical to the first, which she had enjoyed, but now, on top of writing rubrics, she had to stump the model and complete the more difficult task in the same amount of time. She was also getting paid $8 an hour less. This is common at Mercor. Nearly every worker I spoke with reported that demands increased, time requirements shrank, and pay decreased as projects continued. Those who couldn't meet the new demands got "offboarded" and replaced by new recruits.

Chris joined Mercor last year, after a difficult few months struggling to find film work.
Unlike many people who suspect they're casualties of automation, he knew for certain that this was the case. He'd had a recurring job drafting episodes for an unscripted television show -- doing preinterviews, sketching scenes, writing the reality TV equivalent of a screenplay. But in late 2024, he was told the show would be running on a "skeleton crew" and his work was no longer needed. He found out later the company was using ChatGPT to draft new episodes. So that October, when Chris received an offer to write an entire sci-fi screenplay for a major AI company, he said "yes," grim as the prospect was. Since then, he has gone from gig to gig. "This is my only source of income right now," he says. "I know people who are award-winning producers and directors, and they're not advertising that they're doing this work, but that's how they're putting food on the table." His first jobs with Mercor were, like Katya's, relatively pleasant and well paid, but soon came the 6PM fist-bump-emoji Slack exhortations to "come on team, let's push through this," followed by sudden halts and months of silence. "You were just constantly waiting for the crack of the starting gun at any hour of the day," Chris says. Then it was crunch time again and managers, increasingly panicked as deadlines neared, started threatening workers with offboarding if they didn't complete tasks quickly enough. The time he spent working was tracked to the second by software called Insightful, which monitored everything he did on his computer. Time that the software deemed "unproductive" could be deducted from his pay, and if a few minutes passed without him typing, the system pinged him to ask whether he had been working. Sometimes Chris saw people post in Slack that they'd gone over the target time on a particularly tricky task and that they hoped it would be okay; the next day, they would be gone. Increasingly worried he would be offboarded too, he started working off the clock, deactivating Insightful while reading instructions so he could move faster. If he went over the target time, he turned the clock off and kept working for free.

Companies say this software is necessary to accurately track hours and prevent workers from cheating, which, in this case, means using AI, something all data companies strictly forbid. The ground truth of verified human expertise is what they're selling, and when AI trains on AI-generated data, it gradually degrades, a phenomenon researchers call "model collapse." Employees of data companies say it is a constant battle to screen out AI slop. For workers, AI is a particular temptation as pressure increases. When the retail expert trying to stump models with analytics dashboards had her target time dropped from eight hours per task to five, then three and a half, she turned off Insightful and sought outside help. "To be honest, I went into Copilot and ChatGPT and put my prompt in there and said, 'How can I work this so you guys can't answer it?'" Then she went to another chatbot and asked if the prompt sounded AI generated and, if so, to make it sound more human.

"It's just so horrible, the mental effect of it," says Mimi, a screenwriter who has worked on multiple streaming shows and has been training AI for Mercor for several months. She found out about Mercor from a fellow screenwriter who dropped one of its job links in a Writers Guild of America Facebook group. Like a lot of people in this line of work, Mimi is conflicted.
"One documentary-maker who's won Emmys, he messaged me and he was like, 'I'm being handed a shovel and told to dig my own grave,' and that's exactly how everyone thinks about it," she says. Still, as a single mom, she needed the money. She was thankful for the work at first, then the project was paused, unpaused, and paused again. For five weeks, she was told a project would be starting imminently. When it finally did, requirements were added, while the expected time shortened, and she raced to keep up under the watchful eye of Insightful. She felt that someone put it well on Slack when they said it was like they were living in a fishbowl waiting for their human masters to drop in food, and only the ones who were fast enough to swim to the top could eat. "Last night, I got so fucking stressed because my kid came home and it was 7PM, and I get this message, 'The tasks are out!' and I'm just working, just trying to get as many hours in before I can go to bed," Mimi says, choking up. "I spend no time with my kid, and at one point, he can't find something for school and I just start screaming at him. This work is turning me into a fucking demon." She's especially disturbed by the surveillance: "The idea that somebody can measure your time and that all the little bits that go into being a human are taken away because they're not profitable, that you can't charge for going to the toilet because that's not time you're working, you can't charge for making a cup of coffee because that's not time you're working, you can't charge for having a stretch because your back hurts. This is why unions were formed, so people could have guaranteed hours and guaranteed lunch breaks and guaranteed holidays and sick pay. This is the gig economy to the very extreme." This is what concerns her more than the AI itself: that it's bringing to knowledge work the sort of precarious platform labor that has transformed taxi driving and food delivery. Meanwhile, she watches in horror the desperate gratitude of her colleagues as they rejoice at the 7PM announcement of incoming work. "How long are these tasks expected to last?" one worker asked in Slack. "I'm wondering too, I'd like to know whether I can sleep or not." With no answer forthcoming, they swapped tips on how to stave off sleep. henWhen Mercor began recruiting aggressively last year, it framed itself as a more worker-friendly version of the platforms that had come before it. Criticizing his rival Scale AI on a podcast, Foody, Mercor's CEO, said, "Having phenomenal people that you treat incredibly well is the most important thing in this market." Workers who joined during this time do report being treated well; the pay was better than elsewhere, and instead of being managed by opaque algorithms, as is common, there were actual human supervisors they could go to with questions. But people who have worked in management at data companies say they often start out this way, wooing workers off incumbent platforms with promises of better treatment, only for conditions to degrade as they compete to win eight-figure contracts doled out by the half-dozen AI companies who are interested in buying this data in bulk. At Mercor, there was the additional complication of management largely consisting of people in their 20s with minimal work experience who had been given hundreds of millions of investor dollars to pursue rapid growth. "I don't care if somebody's 21 and they're my manager," says Chris, the reality TV producer. "But they've never worked at this scale. 
When you try to find some kind of guidance in Slack, very maturely and clearly explaining what the situation is, you get a meme back with a corgi rolling its eyes and it says, 'Use your judgment.' But it's like, 'Use your judgment and fuck it up, and you get fired.' You went to Harvard, you graduated last year, and your guidance for a group of people, many of whom are experienced professionals, is a meme?" Lawyers, designers, producers, writers, scientists -- all complained of inexperienced managers giving contradictory instructions, demanding long hours or mandatory Zoom meetings for ostensibly flexible work, and threatening people with offboarding for moving too slowly, threats that were particularly galling for mid-career professionals who felt their 20-year-old bosses barely understood the fields they were trying to automate. "The founders pride themselves on '9-9-6,'" says a lawyer, referring to a term that originated in China to describe 72-hour workweeks associated with burnout and suicide but has been appropriated by Silicon Valley as aspirational. "You need to be accessible at all hours, and they're going to pump out messages at 6AM, and you better jump because the perception is you will be offboarded and another person will replace you." "It's not just that team leads are young, project managers are young, senior project managers are young. It's that the senior-senior project managers, the ones responsible for the project in its entirety, are young. I guess that comes from the top because they're young, right?" says Lindsay, a graphic designer and illustrator in her 50s who came to Mercor after 85 percent of her work evaporated over the past year, owing, she believes, to improvements in generative AI. Increasingly desperate for work, she scoured job boards; it seemed the only listings matching her expertise were offers to help build the technology she blamed for demolishing her career. "I swallowed my hatred and signed up," she says. After some initial work producing graphic-design data, she was invited to join a job for Meta grabbing videos from Instagram Reels and tagging whatever was in them. It was boring, and at $21 per hour, the pay was middling, but Lindsay needed the money. So, she discovered when she was brought into the project's Slack, did approximately 5,000 others. In early November, a Mercor representative announced that Lindsay's project would be ending owing to "scope changes," though workers had previously been told the project would run through the end of the year. Lindsay and thousands of others found themselves removed from the company's Slack. Soon, an email arrived in their inbox, inviting them to a new project called Nova paying $16 per hour. Thousands of workers poured into the new Slack only to discover it was the exact same job, now paying 24 percent less. All but two of the Slack channels had been deleted, including the watercooler, support, and help rooms. The ability to direct-message one another had also been cut off. There were no team leads to be found. With no one to ask for assistance, workers flooded the main rooms with pleas and indignation. "Nobody knows what's going on. Everybody's really confused," says Lindsay. "The messages are coming so fast in that channel. It's just absolute chaos. 'Help, please. What do I do? What am I supposed to do? Where do I go? Can I get started tasking? 
Am I supposed to redo all the assessments that I've done before?'" Someone emailed support asking for help, and for some reason that email was sent to every one of the thousand-some people on the project, who seized on it and began to reply-all with their bafflement and outrage. "It was absolute carnage," says Lindsay. "There's no other word for it." Workers began posting complaints on Mercor's subreddit, only to have their posts quickly deleted by the Mercor representatives who moderate it. In response, two unsanctioned Mercor subreddits were created, where workers could freely express such sentiments as "CHILDREN RUN THIS COMPANY, THEY WILL SOON HAVE THEIR DAY OF RECKONING." "It's just really sad," says Lindsay. "There are some people in there where it's genuinely the difference between them being able to feed their families and not feed their families." "I hate gen AI," she adds. "I think AI should be used for curing cancer. I think it should be used for space exploration, not in the creative industries. But I need to be able to pay my rent. And then when people like Mercor pull this stuff where they treat you like nothing more than a lab rat -- I've been working for a very long time. I have never, ever been treated as badly as this."

Intermittent work, extreme secrecy, and abrupt firings are the norm across the data industry. On Surge AI's work platform, called Data Annotation Tech, workers are not only regularly terminated without explanation; they are often not even told they've been fired. They just log in one day and find the dashboard empty of tasks. The phenomenon is so ubiquitous they call it simply "the dash of death." Last year, a Texan with a master's degree in divinity who was teaching voice models to respond to queries with appropriate levels of feeling -- different tones for a user telling them their dog died versus asking for a trip itinerary -- logged in to work one morning and found his dashboard empty. Scrolling to the bottom of the page for the support button, he discovered it no longer worked. That's when he knew he had been terminated. His mind raced through possible reasons: Had he worked too much? Had his quality slipped? He knew he would never find out. "I felt cut adrift," he says. Anxious about how he would pay his bills and care for his ailing dog, he grew depressed, then horrified. He thought about his teacher friends who couldn't get their students to write and all the people graduating with now-worthless computer-science degrees. "The technology makes us see everything as a utility, something to be used," he says, a category that he feels includes discarded data workers like himself. He resolved to become a chaplain, figuring that no matter what the AI future holds, people will need a fellow human to be there for them.

The on-again, off-again nature of the work is not just the result of company culture; it stems from the cadence of AI development itself. People across the industry described the pattern. A model builder, like OpenAI or Anthropic, discovers that its model is weak on chemistry, so it pays a data vendor like Mercor or Scale AI to find chemists to make data. The chemists do tasks until there is a sufficient quantity for a batch to go back to the lab, and the job is paused until the lab sees how the data affects the model. Maybe the lab moves forward, but this time, it's asking for a slightly different type of data.
When the job resumes, the vendor discovers the new instructions make the tasks take longer, which means the cost estimate the vendor gave the lab is now wrong, which means the vendor cuts pay or tries to get workers to move faster. The new batch of data is delivered, and the job is paused once more. Maybe the lab changes its data requirements again, discovers it has enough data, and ends the project or decides to go with another vendor entirely. Maybe now the lab wants only organic chemists and everyone without the relevant background gets taken off the project. Next, it's biology data that's in demand, or architectural sketches, or K-12 syllabus design. To compete, data companies arrange things so that they will always have workers on call while preserving their freedom to drop them at a moment's notice. "Every vendor is going to have some kind of setup whereby they don't really make promises to people," says a senior employee of a major data company. The companies rarely have much notice of these shifts themselves, sometimes because the AI developers aren't sure exactly what data they need in the first place, other times because they are shopping around for the best deal. "They want to keep us in the dark," the employee continues, "so we inevitably keep the contributors in the dark, then a purchase falls through and you have a thousand people you've trained and formed a relationship with just saying, like, 'What the fuck? Why isn't there work?' It's a horrible feeling from an operator's perspective, too, but obviously it's way worse for them." The workers at the bottom of this supply chain exist in a state of extreme precarity and maximum competitive frenzy -- especially because their strict confidentiality agreements make it impossible for them to establish any kind of seniority or relationship that might outlast a particular project. "The power is all on one side because they can't talk about it," says Matthew McMullen, a strategy and operations executive who has worked in the industry since the self-driving-car boom in the mid-2010s. "The labs benefit from you not being able to leverage your experience in the market, and this silence is like their pricing power. The silence is their ability to extract mass information from people without giving them the power to object or to unionize or to make companies themselves. As long as they can't prove what they've done, these raters can't demand what they're worth. The only way that people can demand things is by showing their ability to step up, to take on more work. The only power that they have is to keep going, to get back in line." Which is what they do. When a project for Mercor ends, managers often post a link to other projects on the platform and encourage people to apply. "But again, there are thousands of people applying, so you throw your application into a hole and hope to hear back at some undefined point," says Katya. While they wait, workers sign up for Handshake, Micro1, Alignerr, or another of the ever-growing number of data providers. These companies are always recruiting. Like Mercor, many use AI interviewers and automated evaluations, meaning they have no incentive to limit the number of interviews they do. Mercor offers referral bonuses of several hundred dollars, leading some to promote the company so aggressively that mentions of it have been banned from several subreddits. Katya has applied for dozens of jobs and gotten three, not an unusual ratio. Nor do companies bear any cost for overhiring. 
Because workers are ostensibly independent contractors, they are not owed paid time off, breaks, healthcare, overtime pay, or unemployment benefits. It's free to keep them hanging around, and a surplus of vetted workers ensures they will jump quickly to finish tasks before someone else does. It all combines to create an arrangement in which employers can turn labor on and off like a tap. (Reached for comment, Mercor spokesperson Heidi Hagberg said that "the nature of this is project based contract work, meaning it can extend, pause, or end at any time, especially as the client's scopes and needs evolve," and that many of the worker complaints "were centered around the misalignment of expectations of a full-time job versus project-based work.")

If you move fast and get lucky and have the right combination of expertise and stay on the right side of each platform's unique and mysterious recipe of productivity metrics, you can make decent money. I spoke to a playwright making $10,000 a month and a multitalented chemist who at various points found gigs demonstrating poker and singing for AI. But even then, there is an inescapable awareness of ephemerality because producing training data means working toward your own obsolescence. While the number of people doing data work may continue to rise, any particular gig will last only as long as it takes for the machines to successfully mimic it. It takes years for a human to develop expertise, and sooner or later, they're going to run out of skills to sell. A worker with a master's in linguistics had found steady rubric work for a year, but late in 2025, he noticed it was becoming more difficult to stump the models. Any obscure theory or Indigenous language he asked about, the model would find the correct papers. Instead of submitting three or four rubrics per week, he was lucky to get one. Everyone else on the project was following the same trajectory, so he wasn't surprised when it came to an end. Their know-how had been extracted. In the past, he'd always been able to find a new gig, but now when he looked around, he saw only requests for medical experts, human-resources managers, and teachers. He has now been without work for five months and isn't sure what to do next.

To the extent that policy responses to AI automation are discussed at all, they mostly concern what to do when AI renders large categories of workers obsolete. Maybe this will happen, but another possibility is that particular tasks will get automated and humans redistributed to other parts of the production process, some revising so-so AI output, others crafting rubrics to improve it. Much of this work will be inherently intermittent, which means it will be done by independent contractors, workers whom current regulations leave almost wholly unprotected. Daron Acemoglu, a professor of economics at MIT who studies automation, compares the situation to that of weavers, who before the industrial revolution were "like the labor aristocracy," self-employed artisans in control of their own time. Then came weaving machines, and in order to survive, they were forced to take new jobs in factories, where they worked longer hours for less money under the close supervision of management. The problem wasn't simply that technology took their jobs; it enabled a new organization of work that gave all power to the owners of capital, who made work a nightmare until labor organizing and regulation set limits.
Early labor skirmishes are already happening, mostly in California, which has some of the most aggressive rules around classifying platform workers. Three class-action lawsuits have been filed against Mercor in the past six months. (Similar suits were previously filed against Surge AI and Scale AI, which is settling.) The lawsuits all accuse the companies of misclassifying workers as independent contractors given the "extraordinary control" they exert over them. This is "an entirely new kind of work," one that the company trains people to do and that cannot be done except on the company's platform. Workers have so little visibility into what they're working on that one person, alleges a suit filed in December, accepted a Mercor project only to be tasked with recording himself reading sexually explicit scripts. Once he discovered this, the worker risked deactivation if he abandoned the project, forcing him to "choose between being paid and being humiliated." These companies are reminiscent of Uber and Lyft a decade ago, says Glenn Danas, a partner at the law firm Clarkson, which is suing Mercor and several other data platforms. Yet in some ways these workers are in a worse position, more replaceable despite their advanced degrees. Uber drivers have to be physically present in a city to work, and they can organize and push for regulation there. If the same were to happen with data workers, companies could just recruit from somewhere else where people will work for less. When Mercor cut pay for its Meta project to $16 per hour, it dropped below the minimum wage in California and other states, yet people there kept working because they needed the money. This was something at least one supervisor acknowledged, writing in Slack, "While we won't actively hire from any states where the minimum wage is above the project's rate, if you are already active on the project and would like to work at the $16/hr rate, we want to enable you to do so." Entire professions risk a similar race to the bottom, says Acemoglu, if companies are able to pit workers against one another, each selling their data before someone else can underbid them. "We may also need unionlike organizations that exercise some sort of collective ownership and prevent any kind of simple divide-and-rule strategies by large companies to drive down data prices," he says. "If there isn't the legal infrastructure for a data economy of this sort, many of the people who produce the data will be underpaid or, to use a more loaded term, exploited." Katya was among the thousands of people invited to join the $16-an-hour Project Nova and was appalled by the low pay. "I think that was Mercor's experiment in how close to the bottom they can scrape without jeopardizing the data that they're getting," she says. Her main project had been paused for weeks and might resume the next day or never. In the end, she decided the money wasn't worth it. She applied to work at a local coffee shop. It wasn't the career pivot she'd imagined when she went to grad school; she just hoped working as a barista would be more stable. "At least when you work at a coffee shop for minimum wage, you have some friends to talk to and a boss who pretends to care about you. You have some kind of security; you know what your hours are going to be week to week," she says.
[2]
Coders Coded Their Job Away. Why Are So Many of Them Happy About It?
Clive Thompson interviewed more than 70 software developers at Google, Amazon, Microsoft and small start-ups. He is the author of "Coders: The Making of a New Tribe and the Remaking of the World."

Lately, Manu Ebert has been trying to keep his A.I. from humiliating him. I recently visited Ebert, a machine-learning engineer and former neuroscientist, at the spare apartment where he and Conor Brennan-Burke run their start-up, Hyperspell. Ebert, a tall and short-bearded 39-year-old with the air of a European academic, sat before a mammoth curved monitor. Onscreen, Claude Code -- the A.I. tool from Anthropic -- was busy at work. One of its agents was writing a new feature and another was testing it; a third supervised everything, like a virtual taskmaster. After a few minutes, Claude flashed: "Implementation complete!"

Ebert grew up in the '90s, learning to code the old-fashioned way: He typed it out, line by painstaking line. After college, he held jobs as a software developer in Silicon Valley for companies like Airbnb before becoming a co-founder of four start-ups. Back then, developing software meant spending days hunched over his keyboard, pondering gnarly details, trying to avoid mistakes. All that ended last fall. A.I. had become so good at writing code that Ebert, initially cautious, began letting it do more and more. Now Claude Code does the bulk of it. The agents are so fast -- and generally so accurate -- that when a customer recently needed Hyperspell to write some new code, it took only half an hour. In the before times? "That alone would have taken me a day," he said. He and Brennan-Burke, who is 32, are still software developers, but like most of their peers now, they only rarely write code. Instead, they spend their days talking to the A.I., describing in plain English what they want from it and responding to the A.I.'s "plan" for what it will do. Then they turn the agents loose.

A.I. being A.I., things occasionally go haywire. Sometimes when Claude misbehaves and fails to test the code, Ebert scolds the agent: Claude, you really do have to run all the tests. To avoid repeating these sorts of errors, Ebert has added some stern warnings to his prompt file, the list of instructions -- a stern Ten Commandments -- that his agents must follow before they do anything. When you behold the prompt file of a coder using A.I., you are viewing a record of the developer's attempts to restrain the agents' generally competent, but unpredictably deviant, actions. I looked at Ebert's prompt file. It included a prompt telling the agents that any new code had to pass every single test before it got pushed into Hyperspell's real-world product. One such test for Python code, called a pytest, had its own specific prompt that caught my eye: "Pushing code that fails pytest is unacceptable and embarrassing." Embarrassing? Did that actually help, I wondered, telling the A.I. not to "embarrass" you? Ebert grinned sheepishly. He couldn't prove it, but prompts like that seem to have slightly improved Claude's performance. His experience is not unusual; many software developers these days berate their A.I. agents, plead with them, shout important commands in uppercase -- or repeat the same command multiple times, like a hypnotist -- and discover that the A.I. now seems to be slightly more obedient. Such melodramatic prose might seem kind of nuts, but as their name implies, large language models are language machines. "Embarrassing" probably imparted a sense of urgency.
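The rule that "embarrassing" prompt is trying to enforce is, in itself, entirely mechanical. Here is a minimal sketch of such a gate in Python -- my own illustration, not Hyperspell's actual tooling: run the pytest suite, and push only on a clean pass.

```python
# A minimal sketch of the gate Ebert's prompt file insists on: new code gets
# pushed only if the entire pytest suite passes. Illustrative only; this is
# not Hyperspell's actual setup.
import subprocess
import sys

def tests_pass() -> bool:
    """Run the project's test suite quietly; exit code 0 means every test passed."""
    return subprocess.run([sys.executable, "-m", "pytest", "-q"]).returncode == 0

if __name__ == "__main__":
    if tests_pass():
        subprocess.run(["git", "push"])
    else:
        # The "unacceptable and embarrassing" case the prompt file warns about.
        print("pytest failed; refusing to push.")
        sys.exit(1)
```

The theatrical language sits on top of a check that simple; whether the theatrics help is, as Ebert concedes, unprovable.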
"If you say, This is a national security imperative, you need to write this test, there is a sense of just raising the stakes," Ebert said. Brennan-Burke chimed in: "You remember seeing the research that showed the more rude you were to models, the better they performed?" They chuckled. Computer programming has been through many changes in its 80-year history. But this may be the strangest one yet: It is now becoming a conversation, a back-and-forth talk fest between software developers and their bots. This vertiginous shift threatens to stir up some huge economic consequences. For decades, coding was considered such wizardry that if you were halfway competent you could expect to enjoy lifetime employment. If you were exceptional at it (and lucky), you got rich. Silicon Valley panjandrums spent the 2010s lecturing American workers in dying industries that they needed to "learn to code." Now coding itself is being automated. To outsiders, what programmers are facing can seem richly deserved, and even funny: American white-collar workers have long fretted that Silicon Valley might one day use A.I. to automate their jobs, but look who got hit first! Indeed, coding is perhaps the first form of very expensive industrialized human labor that A.I. can actually replace. A.I.-generated videos look janky, artificial photos surreal; law briefs can be riddled with career-ending howlers. But A.I.-generated code? If it passes its tests and works, it's worth as much as what humans get paid $200,000 or more a year to compose. You might imagine this would unsettle and demoralize programmers. Some of them, certainly. But I spoke to scores of developers this past fall and winter, and most were weirdly jazzed about their new powers. "We're talking 10 to 20 -- to even 100 -- times as productive as I've ever been in my career," Steve Yegge, a veteran coder who built his own tool for running swarms of coding agents, told me. "It's like we've been walking our whole lives," he says, but now they have been given a ride, "and it's fast as [expletive]." Like many of his peers, though, Yegge can't quite figure out what it means for the future of his profession. For decades, being a software developer meant mastering coding languages, but now a language technology itself is upending the very nature of the job. The enthusiasm of software developers for generative A.I. stands in stark contrast to how other Americans feel about the impact of large language models. Polls show a majority are neutral or skeptical; creatives are often enraged. But if coders are more upbeat, it's because their encounters with A.I. are diametrically opposite to what's happening in many other occupations, says Anil Dash, a friend of mine who is a longtime programmer and tech executive. "The reason that tech generally -- and coders in particular -- see L.L.M.s differently than everyone else is that in the creative disciplines, L.L.M.s take away the most soulful human parts of the work and leave the drudgery to you," Dash says. "And in coding, L.L.M.s take away the drudgery and leave the human, soulful parts to you." Coding has been drudgery, historically. In the movies, programmers excitedly crank out code at typing speed. In reality, writing software has always been an agonizingly slow and frustrating affair. You write a few lines of code, a single "function" that does one little thing, and then discover that you made some niggling error, like leaving out a single colon. 
As a company's "codebase" -- every line of code in its software, accreting over the years -- gets larger and involves dozens or thousands of functions interacting with one another, you could spend hours, days or weeks pulling your hair out trying to find which little mistakes are bringing everything to a halt. Maybe a line of yours broke something your colleague is coding two cubicles over. For decades, computer engineers tried to automate this drudgery. In the industry, they call every step in this direction "adding a layer of abstraction": If you often find yourself doing something step by step in an onerous fashion, you automate it. For example, one early computer language was Assembly, and it was devilishly hard to write. Computers had very little memory, so coders had to be efficient in how they used it, putting each bit of data carefully in place and then keeping mental track of it. Even simple calculations required an incremental, meticulous approach. Say you wanted to write some code that would calculate 5 percent interest on $10,000 over 10 years. Back in the 1960s, that would have required perhaps nine lines of pretty obtuse Assembly: "VAL, FLDECML 10000.0" to set the starting amount at $10,000, "CLA VAL" to load the amount into the processor, "FAD ZERO" to tell the computer you're working with numbers that have decimal points; and so on.

By the '80s and '90s, as computers became more powerful, engineers were able to create languages that took care of all that memory management for you, and also turned common asks into simple commands. In Python, a coder can perform that exact same calculation very simply: "interest = 10000 * (1.05 ** 10)." That single line tells the computer to multiply 10,000 by the interest rate over 10 years and store the result in the variable labeled "interest." Programmers no longer need to think about where all the data is being stored in the computer's memory; Python does that for them. It is, in other words, a layer of abstraction on top of all that fiddly memory business. Writing in that language is delightfully easier.

During the 2000s and 2010s, programmers abstracted away more and more scut work. Virtually anytime they encountered an onerous task, they wrote some code to automate it and then -- very often -- made it open source, giving it away for others to use. Here's an example: As a hobbyist programmer, I sometimes want to automatically "scrape" the text from a website. I've never written code myself to do that; I just use Beautiful Soup, a freely available package of thousands of lines of Python code that manages all the complexity. I don't even need to understand how Beautiful Soup works. It just gives me simple, typically one-line Python commands that -- whoosh -- retrieve and analyze website text for me. A significant amount of software is produced in precisely this way: developers stitching together big piles of code that someone else wrote.
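Both layers of abstraction can be shown in a few runnable lines. The interest calculation is the one-liner quoted above; the scraping example assumes only that the bs4 package is installed, and the URL is a placeholder.

```python
# The Python one-liner from above, runnable as written: the balance on
# $10,000 after ten years of 5 percent compound interest.
interest = 10000 * (1.05 ** 10)
print(round(interest, 2))  # 16288.95

# The Beautiful Soup workflow described above: a few commands that lean on
# thousands of lines of someone else's parsing code. The URL is a placeholder.
import urllib.request
from bs4 import BeautifulSoup

html = urllib.request.urlopen("https://example.com").read()
soup = BeautifulSoup(html, "html.parser")
print(soup.get_text(strip=True))  # the page's text, with all markup stripped
```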
With A.I., though, programmers ascend to an even higher level of abstraction. They describe, in regular language, what the program should do, and the agents translate that idea -- that human intent -- into code. Writing software no longer means mentally juggling the nuances of a language like Python, say, or JavaScript or Rust. Coding no longer involves messing up an algorithm and then trying to figure out where your error lies. That part, too, has been abstracted away. So what exactly is left? Or as Boris Cherny, the head of Claude Code, put it when we met at Anthropic's headquarters in January: "What is computation -- what is coding?" Then he added, "You can get pretty philosophical pretty fast." His answer echoed what I've heard from pretty much every developer I've spoken to: A coder is now more like an architect than a construction worker. Developers using A.I. focus on the overall shape of the software, how its features and facets work together. Because the agents can produce functioning code so quickly, their human overseers can experiment, trying things out to see what works and discarding what doesn't. Several programmers told me they felt a bit like Steve Jobs, who famously had his staffers churn out prototypes so he could handle lots of them and settle on what felt right. The work of a developer is now more judging than creating.

Cherny himself has been through all the layers of abstraction: As a teenager in California, he taught himself a little Assembly so he could write a program that solved math homework automatically on his calculator. Today he simply pulls out his phone and dictates to Claude what he wants the A.I. agent to do; in a sort of Ouroboric loop, 100 percent of Cherny's contributions to the Claude codebase are now written entirely by Claude. While we talked, his phone was sitting on the table in front of us, and at the end of an hour he showed me the screen: 10 Claude agents had been tweaking the codebase. "I haven't written a single line by hand, and I'm like the most prolific coder on the team," he said. "It's an alien intelligence that we're learning to work with."

For most of the coders I met, learning to work with A.I. means learning to talk to A.I. This struck me as an unexpected paradox of this new age, because traditionally coding was a haven for introverts who preferred to talk as little as possible to others at work. But now their entire job involves constantly chatting with this alien life form. If describing and talking are now much of the work of a software developer, the talk nonetheless remains pretty complex and highly technical. An amateur can't do it. You can't just tell an agent, Build me the code for a successful start-up. The agents work best when they're being asked to perform one step at a time; ask for too much and they can lose the plot. Aayush Naik, whose start-up in San Francisco uses Claude Code, says it's a delusion to imagine that your A.I. agent will generate a whole project at once, in a "Big Bang" moment. Yes, you can get it to write 5,000 lines of code -- but then, he says, "you test it and nothing works." This, all the software developers say, is where their training and expertise are still needed: knowing how a big codebase ought to be structured, how to design the system so it's reliable and how to figure out if the agent is sloppy. Given A.I.'s penchant for hallucination, it might seem reckless to let agents push code out into the real world.
But software developers point out that coding has a unique quality: They can tether their A.I.s to reality, because they can demand the agents test the code to see if it runs correctly. "I feel like programmers have it easy," says Simon Willison, a tech entrepreneur and an influential blogger about how to code using A.I. "If you're a lawyer, you're screwed, right?" There's no way to automatically check a legal brief written by A.I. for hallucinations -- other than face total humiliation in court. When I visited Dima Yanovsky at his small San Francisco apartment, he, too, was busily chatting with Claude. He's a quick-to-smile 25-year-old programmer at Prox, a company that uses A.I. to help e-commerce companies. He founded it last year with his childhood friend Gregory Makodzeba. Both of them grew up in Ukraine, where their families were in the shipping business. As he dictated commands to Claude, a number of agents were busy at work on his laptop, which was perched on his small desk. At one point, one of them started hallucinating, insisting that a table of data existed that clearly didn't exist. "What?" Yanovsky said, peering at his screen with a frown. He mashed out a disdainful reprimand on his keyboard: "who told you there is gonna be this table? i havent created this table." Claude replied, in a daft and chipper tone: "You're right! I shouldn't assume tables exist." It began to redo the work. Even with this occasional backtracking, Claude codes so much faster than Yanovsky that he struggles to put a number on how much faster he can now get his work done. "Like, 20X?" he offered. What once took weeks now takes hours. Every Silicon Valley founder he knows is experiencing the same thing. If you want to build a company in a hurry, nobody does it by hand anymore. The fact that A.I. can boost coder productivity so drastically has been one of the more remarkable talking points in the field. I've noticed this myself: Just last week, I needed a web tool to clean up some messy transcripts, and I used A.I. to build it in about 10 minutes. On my own, it would have taken an hour, possibly longer. But software start-ups -- or individuals like me who are vibe-coding their own small apps -- are a special case. They involve what's known in the industry as "greenfield" coding, where there are no pre-existing lines of code to deal with. An entirely new codebase is being created from scratch. A vast majority of software developers aren't working in greenfield contexts. They're "brownfield," employed by mature companies, where the code was written years (or decades) earlier and already reaches millions or billions of lines. Rapidly adding new functions is usually a terrible idea -- they might accidentally conflict with another part of the code and break something that millions of customers rely on. At most mature software firms, coders historically spent a minority of their time -- sometimes barely more than an hour per day -- actually writing code. The rest was planning, hashing out priorities and meeting to discuss progress. This is the curse of success, and why big, established software firms can be slower to deliver upgrades than younger companies. Before a coder's new work is released, colleagues and higher-ups typically put it through a "code review," looking carefully at its lines and the results of any testing. If you want to put a number on how much more productive A.I. is making the programmers at mature tech firms like Google, it's 10 percent, Sundar Pichai, Google's chief executive, has said. 
That's the bump that Google has seen in "engineering velocity" -- how much faster its more than 100,000 software developers are able to work. And that 10 percent is the average inside the company, Ryan Salva, a senior director of product at the company, told me. Some work, like writing a simple test, is now tens of times faster. Major changes are slower. At the start-ups whose founders I spoke to, closer to 100 percent of their code is being written by A.I., but at Google it is not quite 50 percent. I visited Salva in Sunnyvale, Calif., to shoulder-surf as he showed me how L.L.M.s have been woven into Google's work flow. For a firm with billions of lines of code, he noted, A.I.'s value isn't necessarily in writing new code so much as in figuring out what's going on with the existing lines. Developers will use it to analyze and explain what "sprawling" portions of code are doing, so they can determine how to help improve or alter it. "A.I. is much better at wading into an unfamiliar part of the codebase, making sense of what's happening," he told me. It also helps developers work in languages they might not be very familiar with. As a result, developers on Salva's team form smaller groups: A year ago, these might have needed 30 people, each with their own specialty. Now a group needs only three to six people, which enables them to move more nimbly, so "we're able to clear through a lot more of our backlog," Salva said. Salva opened up his code editor -- essentially a word processor for writing code -- to show me what it's like to work alongside Gemini, Google's L.L.M. For the first few years of the A.I. boom, he said, it was still "very much what I would describe as 'human in the loop.'" The A.I. assisted but didn't work independently. While he typed away, Gemini analyzed a piece of code for him, explaining whether it had been fully tested or not. When it suggested a few new lines, it was up to him to accept them. But Google's metabolism is gradually speeding up, and Gemini is writing much more code on its own. Salva showed me an example. He had been hankering to solve a problem that Google coders had been complaining about: Sometimes they would log into Gemini's "command line interface" (or C.L.I.) from different accounts, and it was not easy to see which account they were using. He typed out a request for Gemini: "When working inside of Gemini C.L.I., it would be nice to have a command that lets users see their logged-in identity." The A.I. processed the request for a few minutes, then told Salva how it intended to fulfill it. Salva gave his approval, and Gemini worked away in the background. When he checked back in 10 minutes, the code had been written and Gemini was testing it for errors. Then Salva realized the A.I. had become a bit overeager. "Oh, Jesus," he said. "It ran 8,000 tests," far more than was strictly necessary. About 15 minutes later, though, the tests were finished, and Salva tried the new function. Lo and behold, the code worked, correctly displaying his logged-in account. "Not bad," he said. Making a demo like this was only the first baby step; before it could be incorporated into Google's codebase, it would have to go through several rounds of code review, rewriting and testing. "As an engineer, I care less that the models are really good at producing the right result the first time," he said. "I care much more that there are validation steps in place so that it eventually gets the perfect or the right answer." 
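For a sense of the task's size: a feature like the one Salva requested is a few dozen lines at most. What follows is a hypothetical Python sketch of a "show my logged-in identity" command, my own illustration rather than the code Gemini actually produced; the credentials path is invented.

```python
# A hypothetical sketch of the feature Salva asked for: a command that tells
# users which account they are logged in as. Illustrative Python only; not
# the Gemini C.L.I.'s actual implementation. The credentials path is invented.
import argparse
import json
from pathlib import Path

CREDENTIALS = Path.home() / ".config" / "demo-cli" / "auth.json"  # assumed location

def whoami() -> str:
    """Report the logged-in account, or say so if there isn't one."""
    if not CREDENTIALS.exists():
        return "Not logged in."
    account = json.loads(CREDENTIALS.read_text()).get("email", "unknown")
    return f"Logged in as {account}"

if __name__ == "__main__":
    parser = argparse.ArgumentParser(prog="demo-cli")
    parser.add_argument("command", choices=["whoami"], help="subcommand to run")
    args = parser.parse_args()
    if args.command == "whoami":
        print(whoami())
```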
A 10 percent increase in Google's "velocity" may seem underwhelming, Salva noted, given the hoopla around A.I. "We have collectively -- both in the software industry as well as in the media -- oh, my God, created a hype cycle," he had told me when we first talked, last summer in New York. But the reality was impressive enough for him. "We should be delighted when there's 10 percent efficiency gains for the entire company. That's freaking bonkers!" At old and huge brownfield companies, where the effort is focused on keeping the existing systems up and running, many programmers work like digital plumbers, fixing leaks that erupt at all hours. I saw that firsthand when I met in Seattle with David Yanacek, a senior principal engineer for AWS Agentic A.I. "AWS" stands for "Amazon Web Services," the server cloud that is the digital backbone for millions of firms. If a server crashes, you might not be able to watch Netflix, hail an Uber or play Fortnite. An old-school pager sat beneath Yanacek's monitor. For years, Amazon used it to wake him during middle-of-the-night incidents; these days, he gets a smartphone alert. Whatever the devices involved, someone is expected to fix things as soon as possible. "Server ops is annoying," said Yanacek, a trim man of 42 with a gray beard and jittery intensity. "I actually love it! But it's also annoying, and it's nonstop." His team has long built automations to speed up the pace of diagnosing problems. But L.L.M.s have offered powerful new ones, he said, because the A.I.'s fluency in both human language and programming means it can interpret error reports from crashed systems and look at their code. It can sometimes have a fix ready even before a bleary-eyed employee is fully awake. Yanacek looked at his screen and noticed that, 11 minutes earlier, a demo application had issued an error alert -- and Amazon's A.I. had already pinpointed what went wrong and written a short report. The agent had discovered that a code change had apparently added a new time-stamp field, but some other part of the codebase wasn't expecting that new field to be there. The result was an "unexpected field" error. Yanacek peered at the A.I.'s suggested fix, pondered for a moment, then hit "enter" to approve it. The A.I. took about eight minutes to figure things out, he told me. "By the time I'd opened my laptop, it's ready." One customer recently told him that Amazon's A.I. agent fixed a problem in only 15 minutes; when a similar problem occurred months before, it had taken a full team of engineers eight hours to debug. In other Amazon sectors, the brownfield engineers work on revising segments of old code (sometimes decades old) to make them more efficient, or perhaps to redo them entirely in a more modern language. It's work that is crucial but finicky and delicate, like performing a heart transplant. These digital renovations have sped up, too. McLaren Stanley, a senior principal engineer at Amazon, recently modernized a piece of code he had personally written years earlier. The original version had taken a month to create; this time, with the help of Amazon's in-house A.I., he finished the job in a morning. His team has similarly reworked other big chunks of code. One of A.I.'s key advantages, Stanley told me, is that it makes it easier to try out new ideas. "Things I've always wanted to do now only take a six-minute conversation and a 'Go do that,'" he says. 
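The class of failure Yanacek's agent diagnosed is easy to reproduce in miniature. The sketch below is illustrative only, with an invented schema and field names rather than anything from Amazon's systems: a strict consumer rejects a message the moment an upstream change starts attaching a time-stamp field it was never told to expect.

```python
# Illustrative sketch of an "unexpected field" failure; schema and names invented.
KNOWN_FIELDS = {"order_id", "status"}  # the schema this consumer was built against

def consume(message: dict) -> None:
    """Process one message, rejecting any field outside the known schema."""
    unexpected = set(message) - KNOWN_FIELDS
    if unexpected:
        raise ValueError(f"unexpected field: {sorted(unexpected)}")
    print(f"order {message['order_id']} is {message['status']}")

consume({"order_id": 42, "status": "shipped"})  # works fine

# An upstream code change quietly adds a time stamp to every message...
try:
    consume({"order_id": 43, "status": "shipped", "timestamp": "2026-02-03T04:05:06Z"})
except ValueError as err:
    print(err)  # unexpected field: ['timestamp']
```

Because the error report names the offending field and an A.I. agent can read both sides of the codebase, correlating the failure with the recent change is largely mechanical, which is roughly the diagnosis Yanacek approved.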
I've written about developers for decades, and they have always rhapsodized about the thrill of bringing a machine to life through arcane commands. Sure, the work could be cosmically exasperating, requiring hours or even weeks to chase down a single bug. But the grind sharpened the joy. When things finally started working, the burst of satisfaction was intoxicating. So I was surprised by how many software developers told me they were happy to no longer write code by hand. Most said they still feel the jolt of success, even with A.I. writing the lines. "I love programming. I love getting in the zone. I love thinking big thoughts. It's the creative act," says Kent Beck, a longtime guru of the software industry who has been coding since 1972. Ten years ago, he mostly stopped writing software; he was frustrated with the latest languages and software tools. But L.L.M.s got him going again, and he's now cranking out more projects than ever: a personalized note-taking app, new types of databases. Even the fact that A.I.'s output can be unpredictable -- if you ask it to write a piece of code, it might do so in a slightly different way each time -- "is addictive, in a slot-machine way." A few programmers did say that they lamented the demise of hand-crafting their work. "I believe that it can be fun and fulfilling and engaging, and having the computer do it for you strips you of that," one Apple engineer told me. (He asked to remain unnamed so he wouldn't get in trouble for criticizing Apple's embrace of A.I.) He went on: "I didn't do it to make a lot of money and to excel in the career ladder. I did it because it's my passion. I don't want to outsource that passion." He also worries that A.I. is atomizing the work force. In the past, if developers were stuck on an intractable bug, they asked colleagues for advice; today they just ask the agents. But only a few people at Apple openly share his dimmer views, he said. The coders who still actively avoid A.I. may be in the minority, but their opposition is intense. Some dislike how much energy it takes to train and deploy the models, and others object to how they were trained by tech firms pillaging copyrighted works. There is suspicion that the sheer speed of A.I.'s output means firms will wind up with mountains of flabbily written code that won't perform well. The tech bosses might use agents as a cudgel: Don't get uppity at work -- we could replace you with a bot. And critics think it is a terrible idea for developers to become reliant on A.I. produced by a small coterie of tech giants. Thomas Ptacek, a Chicago-based developer and a co-founder of the tech firm Fly.io, has seen the lacerating fights between the developers who love A.I. and those few who hate it, and "it's a civil war," he told me. He's in the middle. He thinks the refuseniks are deluding themselves when they claim that A.I. doesn't work well and that it can't work well. "It's like being gaslit," he says. The holdouts are in the minority, and "you can watch the five stages of grief playing out." He's not a Pollyanna, though. "L.L.M.s are going to win on coding, but I don't know what that's going to mean for us," he adds. "People may be right about how bad that is for the profession, right?" It certainly could mean terrible job prospects. New computer-science graduates are particularly concerned. 
Companies used to hire junior developers to do the menial labor for their senior colleagues, but who is going to hire a neophyte when a senior engineer can be even more productive with an army of deathless code-writing ghosts? Silicon Valley has already been through a huge wave of layoffs. During the 2010s, tech firms were hiring aggressively, competing for new grads and adding an average of 74,000 new employees a year, according to the Bureau of Labor Statistics. Job postings soared in the early years of the pandemic. Then firms abruptly reversed course, and postings for new jobs collapsed. More than 700,000 tech workers have been laid off in the last four years, according to Roger Lee at Layoffs.fyi (this number includes all jobs in tech). Most tech observers say A.I. probably wasn't the cause of those layoffs because, at the time, it wasn't yet good enough to replace coders. Other factors, they figure, were more significant: Interest rates rose, so tech firms lost their easy growth money. Companies that overhired shed that excess capacity. Some also suspect that when Elon Musk bought Twitter and said he laid off 80 percent of his work force, tech executives at other firms took note and decided that maybe they didn't need so many engineers either. But there's evidence that A.I. is now eroding entry-level coding jobs. Last year, Erik Brynjolfsson, an economist who directs the Stanford Digital Economy Lab, and his colleagues analyzed workers by age group and by how easily their jobs could be done by A.I. He found that computer programmers had one of the most "A.I.-exposed" jobs -- and junior developers were hit the hardest. The number of jobs for those between the ages of 22 and 25 (when one is most likely to be entering the field) had declined by 16 percent since 2022, while older programmers saw no significant decrease. Virtually all of the tech executives I've spoken to, from those at coastal giants to those at small regional firms, have sworn to me that A.I. would not stop them from hiring appealing new talent. It's true that A.I. makes their existing developers more productive, but they always need more done. "In my many years at Google, we have always been constrained by having way, way, way more ideas of things we would like to do than there was time and energy and hours in the day to go do them," Jen Fitzpatrick, the company's senior vice president for Google Core Systems & Experiences, told me. "I have never met a team at Google who says, 'You know, I'm out of good ideas.' The answer is always, 'The list of things I would like to do is nine miles longer than what we can pull off.'" Several developers suggested, in fact, that the number of software jobs might actually grow. An untold number of small firms around the country would love to have their own custom-made software, but were never big enough to hire, say, the five-person programming team necessary to produce it. But if you could hire a single A.I.-assisted coder to do that same work, or even a part-time one? This is, as Brynjolfsson notes, a version of the "Jevons paradox": When something gets cheaper to do, we don't just pocket the savings -- we do more of it. Though it could also be that these software jobs won't pay as well as in the past, because, of course, the jobs aren't as hard as they used to be. Acquiring the skills isn't as challenging. This question of skills can lead in some unsettling directions, though, when you chase it down. Many midcareer coders told me they felt confident using A.I.
because they had spent decades developing a strong sense of what good, efficient code looks like. That allows them to explain to the agents precisely what they want and lets them spot quickly when the agents have cranked out something inefficient or sloppy. But what happens to the next generation? Will they still develop that intuitive sense for code? If your job is now less about writing than assessing, how will newbies learn to assess? Some new developers told me they can feel their skills weakening. Pia Torain is a software engineer for Point Health A.I., and she was only two years into her job when, in the summer of 2024, the company told her to start using GitHub's Copilot code-writing tool. "I realized that it was just four months that I was prompting hundreds, 500 prompts a day, that I started to lose my ability to code," she says. She stopped using the tool for a while; these days, she'll have A.I. write for her, but she carefully reads the output, making sure she's absorbing how the code works. "If you don't use it," Torain told me, "you're going to lose it." Point Health co-founder Rachel Gollub is less worried. She has been a software developer for almost 40 years, and for decades coders have worried that the craft is imminently doomed. When languages like Python and JavaScript emerged, they abstracted away the need to think about memory management, so developers stopped needing those skills. The old-school coders caterwauled: It's not real coding unless you're managing your own memory! "People were all like, 'You're losing all your ability to code,'" Gollub told me. But plenty of big, reliable companies -- Dropbox, say -- relied heavily on newer languages like Python, and they have worked fine. Memory management is crucial in only a subset of coding tasks today, such as with devices that don't have much computing power. The vast majority of the software industry has moved on. Gollub expects the same transition will happen as A.I. tools become the norm. Writing code is now so highly abstracted that nearly anyone could crack open an L.L.M. and describe an app. Maybe not a complex one. But if they needed some simple software for personal use? An A.I. could likely craft it. This is what Maxime Cuisy recently did. He's a production manager for a print shop in Paris that produces photo books for high-end clients including Dior and Louis Vuitton. Educationally, he's your classic liberal-arts grad, having completed a master's thesis on the French graphic novel. He knows nothing of coding, and didn't even pay much attention to A.I. until a couple of years ago, when he says ChatGPT "basically helped me and my wife to save our cat." They had gotten two new kittens, and both became so sick that one suddenly died. The vet told them the remaining cat had terminal cancer. Cuisy thought that was improbable, so he explained the cat's symptoms to ChatGPT, which suggested it was an infection. This inspired him to do more research and led him to a diagnosis of feline infectious peritonitis. A day later, the cat was on the mend. At work, Cuisy soon had a different problem. The company had bought new printers only to run into problems with their existing software: To get the photos to display correctly, they now had to painstakingly adjust the margins. The company isn't big enough to have a developer team that could make custom software to automate this for them. Cuisy decided to try vibe-coding the solution himself, using Codex, OpenAI's code-writing tool.
"I basically told it, 'I need to have an app that does this, and this is the form factor that the printer can receive,'" he says. He spent a few hours carefully detailing the way files would need to be adjusted, and by the end of the day ChatGPT had produced an app that works on Mac and Windows operating systems. Employees use it to process up to 2,000 images in a single shot. His boss is happy. Cuisy has no idea how the code actually works. It's written in Python, which might as well be ancient Greek. This is the cultural side effect of coding becoming conversational: The realms of programmers and everyday people, separated for decades by an ocean of arcane know-how, are drifting closer together. If code-writing A.I. continues to improve, there will likely be far more people in Cuisy's situation -- the Jevons paradox in action. "Maybe they don't label themselves as software engineers, but they're creating code," Brynjolfsson says. "A lot of people have ideas." The world becomes flooded with far more software than ever before -- written by individuals, for individuals. How things will shake out for professional coders themselves isn't yet clear. But their mix of exhilaration and anxiety may be a preview for workers in other fields. Anywhere a job involves language and information, this new combination of skills -- part rhetoric, part systems thinking, part skepticism about a bot's output -- may become the fabric of white-collar work. Skills that seemed the most technical and forbidding can turn out to be the ones most easily automated. Social and imaginative ones come to the fore. We will produce fewer first drafts and do more judging, while perhaps feeling uneasy about how well we can still judge. Abstraction may be coming for us all. Read by James Patrick Cronin Narration produced by Tanya Pérez
[3]
The Lawyers and Scientists Training AI to Steal Their Careers
This article is a collaboration between New York Magazine and The Verge.
Machine-learning systems learn by finding patterns in enormous quantities of data, but first that data has to be sorted, labeled, and produced by people. ChatGPT got its startling fluency from thousands of humans hired by companies such as Scale AI and Surge AI to write examples of things a helpful chatbot assistant would say and to grade its best responses. A little over a year ago, concerns began to mount in the industry about a plateau in the technology's progress. Training models based on this type of grading yielded chatbots that were very good at sounding smart but still too unreliable to be useful. The exception was software engineering, where the ability of models to automatically check whether bits of code worked -- did the code compile, did it print HELLO WORLD -- allowed them to trial-and-error their way to genuine competence. The problem was that few other human activities offer such unambiguous feedback. There are no objective tests for whether financial analysis or advertising copy is "good." Undeterred, AI companies set out to make such tests, collectively paying billions of dollars to professionals of all types to write exacting and comprehensive criteria for a job well done. Mercor, the company Katya stumbled upon, was founded in 2023 by three then-19-year-olds from the Bay Area, Brendan Foody, Adarsh Hiremath, and Surya Midha, as a jobs platform that used AI interviews to match overseas engineers with tech companies. The company received so many inquiries from AI developers seeking professionals to produce training data that it decided to adapt. Last year, Mercor was valued at $10 billion, making its trio of founders the world's youngest self-made billionaires. OpenAI has been a client; so has Anthropic. Each of these data companies touts its stable of pedigreed experts. Mercor says around 30,000 professionals work on its platform each week, while Scale AI claims to have more than 700,000 "M.A.'s, Ph.D.'s, and college graduates." Surge AI advertises its Supreme Court litigators, McKinsey principals, and platinum recording artists. These companies are hiring people with experience in law, finance, and coding, all areas where AI is making rapid inroads. But they're also hiring people to produce data for practically any job you can imagine. Job listings seek chefs, management consultants, wildlife-conservation scientists, archivists, private investigators, police sergeants, reporters, teachers, and rental-counter clerks. One recent job ad called for experts in "North American early to mid-teen humor" who can, among other requirements, "explain humor using clear, logical language, including references to North American slang, trends, and social norms." It is, as one industry veteran put it, the largest harvesting of human expertise ever attempted. These companies have found rich recruiting ground among the growing ranks of the highly educated and underemployed. Aside from the 2008 financial crash and the pandemic, hiring is at its lowest point in decades. This past August, the early-career job-search platform Handshake found that job postings on the site had declined more than 16 percent compared with the year before and that listings were receiving 26 percent more applications. Meanwhile, Handshake launched an initiative last year connecting job seekers with roles producing AI training data. 
"As AI reshapes the future of work," the company wrote, announcing the program, "we have the responsibility to rethink, educate, and prepare our network to navigate careers and participate in the AI economy." There is an underlying tension between the predictions of generally intelligent systems that can replace much of human cognitive labor and the money AI labs are actually spending on data to automate one task at a time. It is the difference between a future of abrupt mass unemployment and something more subtle but potentially just as disruptive: a future in which a growing number of people find work teaching AI to do the work they once did. The first wave of these workers consists of software engineers, graphic designers, writers, and other professionals in fields where the new training techniques are proving effective. They find themselves in a surreal situation, competing for precarious gigs pantomiming the careers they'd hoped to have. Each of the more than 30 workers I spoke with occupied a position along a vast and growing data-supply chain. There are people crafting checklists that define a good chatbot response, typically called "rubrics," and other people grading those rubrics. Others grade chatbot answers according to those rubrics, and still others take the rubrics and write out what's often described as a "golden output," or the ideal chatbot answer. Others are asked to explain every step they took to arrive at this golden output in the voice of a chatbot thinking to itself, producing what's called a "reasoning trace" for AI to follow later when it encounters a similar task out in the real world. Sometimes the labs want only rubrics for prompts their AI can't already do, which means companies like Mercor ask workers to produce "stumpers," or requests that will make the model fail. "It sounds easy, but it's really hard," says a worker who was trying to stump models by asking them to make inventory-management dashboards. Models fail in counterintuitive ways. They may be able to solve advanced-physics exam questions, but ask them for transit directions and they'll recommend transferring on nonconnecting train lines. Finding these weak spots takes time and creativity. One type of project gathers groups of lawyers, human-resources managers, teachers, consultants, or bankers for something Mercor calls world-building. "You and your team will role-play a real-life team within your profession," the training materials read. The teams are given dedicated emails, calendars, and chat apps and asked to create a hundred or more documents that would be associated with some corporate undertaking, like a fictional mining company analyzing whether to enter the data-center business. After several 16-hour days of fantasy document production, one worker recounts, the resulting slide decks, meeting notes, and financial forecasts are sent to another team, which uses them as grist in their attempts to stump a model operating in this simulated corporate environment. Then, having stumped the model, that team writes new, more nuanced rubrics, golden answers, and so on. Workers can only guess who the customer is or how many others are working on the project -- based on references to teams like Management Consulting World No. 133, there could be hundreds, maybe thousands. There are people hired to evaluate the ability of image models to follow their prompts and others who summarize video clips in extraordinary detail, presumably to train video models. 
Efforts to improve AI's ability to have spoken conversations have resulted in a surging demand for voice actors, who might find themselves recording "authentic, emotionally resonant" speeches, according to one listing. "I just tell people I'm an AI trainer, then it sounds more professional than what I'm doing," says an aspiring screenwriter who was instructed to record himself pretending to ask a chatbot for a fitness plan while pots and pans clanged in the kitchen. Another time, he was told to record himself dispensing financial advice over the phone to a parade of people he assumed were other workers. This audio might then be broken down and sent to someone like Ernest, who used to make a living as an online tutor until the company he worked for replaced him with a chatbot. When we spoke, he was listening to minutelong clips of random dialogue slowed to 0.1x speed and marking when someone started and stopped speaking down to the millisecond. Many of the clips included a person talking with a chatbot and interjecting "huh" or "I see," so he assumes he was improving AI's ability to have naturally flowing conversation, but he has no actual idea. As is standard practice in the field, the project was referred to by a code name and the client only ever as "the client." The entire system is designed so that workers have minimal insight into the supply chain they are part of. If they find out who the customer is, they are contractually forbidden from telling anyone, even their own colleagues. Nor are they allowed to describe the details of their work beyond broad generalities like "providing expertise in XYZ domain to improve models for a top AI lab," according to one Mercor agreement. So afraid are workers of inadvertently violating their confidentiality agreements and getting fired that when they discuss their work in public forums, they mask their already code-named projects with additional code names, for example by referring to a project called "Raven" as "Poe." Katya's second project with Mercor was far more stressful. There was less work to go around, and it came in fits and starts. Managers would drop a message in the Slack channel saying new tasks were incoming in half an hour, and, she says, "everyone in Slack would drop what they were doing and jump on them like piranhas," working as fast as they could while the bar showing how many tasks remained slid toward zero. Then they were back in Slack again, politely begging supervisors for more work and more hours, talking about their kids' birthdays or their need to pay rent, or telling anyone who might be listening that their availability was wide open in case there was more work to be done. Soon, Katya was dropping everything at the sound of a Slack ding too. "Sometimes I'm on the toilet or at dinner and I get the Slack notification. I'm like, 'Oh, sorry, I gotta work now.'" That project soon ended and then came another. It was nearly identical to the first, which she had enjoyed, but now, on top of writing rubrics, she had to stump the model and complete the more-difficult task in the same amount of time. She was also getting paid $8 an hour less. This is common at Mercor. Nearly every worker I spoke with reported that demands increased, time requirements shrank, and pay decreased as projects continued. Those who couldn't meet the new demands got "offboarded" and replaced by new recruits. Chris joined Mercor last year, after a difficult few months struggling to find film work. 
Unlike many people who suspect they're casualties of automation, he knew for certain that this was the case. He'd had a recurring job drafting episodes for an unscripted television show -- doing preinterviews, sketching scenes, writing the reality-TV equivalent of a screenplay. But in late 2024, he was told the show would be running on a "skeleton crew" and his work was no longer needed. He found out later the company was using ChatGPT to draft new episodes. So that October, when Chris received an offer to write an entire sci-fi screenplay for a major AI company, he said "yes," grim as the prospect was. Since then, he has gone from gig to gig. "This is my only source of income right now," he says. "I know people who are award-winning producers and directors, and they're not advertising that they're doing this work, but that's how they're putting food on the table." His first jobs with Mercor were, like Katya's, relatively pleasant and well paid, but soon came the 6 p.m. fist-bump-emoji Slack exhortations to "come on team, let's push through this," followed by sudden halts and months of silence. "You were just constantly waiting for the crack of the starting gun at any hour of the day," Chris says. Then it was crunch time again and managers, increasingly panicked as deadlines neared, started threatening workers with offboarding if they didn't complete tasks quickly enough. The time he spent working was tracked to the second by software called Insightful, which monitored everything he did on his computer. Time that the software deemed "unproductive" could be deducted from his pay, and if a few minutes passed without him typing, the system pinged him to ask whether he had been working. Sometimes Chris saw people post in Slack that they'd gone over the target time on a particularly tricky task and that they hoped it would be okay; the next day, they would be gone. Increasingly worried he would be offboarded too, he started working off the clock, deactivating Insightful while reading instructions so he could move faster. If he went over the target time, he turned the clock off and kept working for free. Companies say this software is necessary to accurately track hours and prevent workers from cheating, which, in this case, means using AI, something all data companies strictly forbid. The ground truth of verified human expertise is what they're selling, and when AI trains on AI-generated data, it gradually degrades, a phenomenon researchers call "model collapse." Employees of data companies say it is a constant battle to screen out AI slop. For workers, AI is a particular temptation as pressure increases. When the retail expert trying to stump models with analytics dashboards had her target time dropped from eight hours per task to five, then three and a half, she turned off Insightful and sought outside help. "To be honest, I went into Copilot and ChatGPT and put my prompt in there and said, 'How can I work this so you guys can't answer it?'" Then she went to another chatbot and asked if the prompt sounded AI generated and, if so, to make it sound more human. "It's just so horrible, the mental effect of it," says Mimi, a screenwriter who has worked on multiple streaming shows and has been training AI for Mercor for several months. She found out about Mercor from a fellow screenwriter who dropped one of its job links in a Writers Guild of America Facebook group. Like a lot of people in this line of work, Mimi is conflicted.
"One documentary-maker who's won Emmys, he messaged me and he was like, 'I'm being handed a shovel and told to dig my own grave,' and that's exactly how everyone thinks about it," she says. Still, as a single mom, she needed the money. She was thankful for the work at first, then the project was paused, unpaused, and paused again. For five weeks, she was told a project would be starting imminently. When it finally did, requirements were added, while the expected time shortened, and she raced to keep up under the watchful eye of Insightful. She felt that someone put it well on Slack when they said it was like they were living in a fishbowl waiting for their human masters to drop in food, and only the ones who were fast enough to swim to the top could eat. "Last night, I got so fucking stressed because my kid came home and it was 7 p.m., and I get this message, 'The tasks are out!,' and I'm just working, just trying to get as many hours in before I can go to bed," Mimi says, choking up. "I spend no time with my kid, and at one point, he can't find something for school and I just start screaming at him. This work is turning me into a fucking demon." She's especially disturbed by the surveillance: "The idea that somebody can measure your time and that all the little bits that go into being a human are taken away because they're not profitable, that you can't charge for going to the toilet because that's not time you're working, you can't charge for making a cup of coffee because that's not time you're working, you can't charge for having a stretch because your back hurts. This is why unions were formed, so people could have guaranteed hours and guaranteed lunch breaks and guaranteed holidays and sick pay. This is the gig economy to the very extreme." This is what concerns her more than the AI itself: that it's bringing to knowledge work the sort of precarious platform labor that has transformed taxi driving and food delivery. Meanwhile, she watches in horror the desperate gratitude of her colleagues as they rejoice at the 7 p.m. announcement of incoming work. "How long are these tasks expected to last?" one worker asked in Slack. "I'm wondering too, I'd like to know whether I can sleep or not ." With no answer forthcoming, they swapped tips on how to stave off sleep. When Mercor began recruiting aggressively last year, it framed itself as a more worker-friendly version of the platforms that had come before it. Criticizing his rival Scale AI on a podcast, Foody, Mercor's CEO, said, "Having phenomenal people that you treat incredibly well is the most important thing in this market." Workers who joined during this time do report being treated well; the pay was better than elsewhere, and instead of being managed by opaque algorithms, as is common, there were actual human supervisors they could go to with questions. But people who have worked in management at data companies say they often start out this way, wooing workers off incumbent platforms with promises of better treatment, only for conditions to degrade as they compete to win eight-figure contracts doled out by the half-dozen AI companies who are interested in buying this data in bulk. At Mercor, there was the additional complication of management largely consisting of people in their 20s with minimal work experience who had been given hundreds of millions of investor dollars to pursue rapid growth. "I don't care if somebody's 21 and they're my manager," says Chris, the reality-TV producer. "But they've never worked at this scale. 
When you try to find some kind of guidance in Slack, very maturely and clearly explaining what the situation is, you get a meme back with a corgi rolling its eyes and it says, 'Use your judgment.' But it's like, 'Use your judgment and fuck it up, and you get fired.' You went to Harvard, you graduated last year, and your guidance for a group of people, many of whom are experienced professionals, is a meme?" Lawyers, designers, producers, writers, scientists -- all complained of inexperienced managers giving contradictory instructions, demanding long hours or mandatory Zoom meetings for ostensibly flexible work, and threatening people with offboarding for moving too slowly, threats that were particularly galling for mid-career professionals who felt their 20-year-old bosses barely understood the fields they were trying to automate. "The founders pride themselves on '9-9-6,'" says a lawyer, referring to a term that originated in China to describe 72-hour workweeks associated with burnout and suicide but has been appropriated by Silicon Valley as aspirational. "You need to be accessible at all hours, and they're going to pump out messages at 6 a.m., and you better jump because the perception is you will be offboarded and another person will replace you." "It's not just that team leads are young, project managers are young, senior project managers are young. It's that the senior-senior project managers, the ones responsible for the project in its entirety, are young. I guess that comes from the top because they're young, right?" says Lindsay, a graphic designer and illustrator in her 50s who came to Mercor after 85 percent of her work evaporated over the past year, owing, she believes, to improvements in generative AI. Increasingly desperate for work, she scoured job boards; it seemed the only listings matching her expertise were offers to help build the technology she blamed for demolishing her career. "I swallowed my hatred and signed up," she says. After some initial work producing graphic-design data, she was invited to join a job for Meta grabbing videos from Instagram Reels and tagging whatever was in them. It was boring, and at $21 per hour, the pay was middling, but Lindsay needed the money. So, she discovered when she was brought into the project's Slack, did approximately 5,000 others. In early November, a Mercor representative announced that Lindsay's project would be ending owing to "scope changes," though workers had previously been told the project would run through the end of the year. Lindsay and thousands of others found themselves removed from the company's Slack. Soon, an email arrived in their inbox, inviting them to a new project called Nova paying $16 per hour. Thousands of workers poured into the new Slack only to discover it was the exact same job, now paying 24 percent less. All but two of the Slack channels had been deleted, including the watercooler, support, and help rooms. The ability to direct-message one another had also been cut off. There were no team leads to be found. With no one to ask for assistance, workers flooded the main rooms with pleas and indignation. "Nobody knows what's going on. Everybody's really confused," says Lindsay. "The messages are coming so fast in that channel. It's just absolute chaos. 'Help, please. What do I do? What am I supposed to do? Where do I go? Can I get started tasking? 
Am I supposed to redo all the assessments that I've done before?'" Someone emailed support asking for help, and for some reason that email was sent to every one of the thousand-some people on the project, who seized on it and began to reply-all with their bafflement and outrage. "It was absolute carnage," says Lindsay. "There's no other word for it." Workers began posting complaints on Mercor's sub-Reddit, only to have their posts quickly deleted by the Mercor representatives who moderate it. In response, two unsanctioned Mercor sub-Reddits were created, where workers could freely express such sentiments as "CHILDREN RUN THIS COMPANY, THEY WILL SOON HAVE THEIR DAY OF RECKONING." "It's just really sad," says Lindsay. "There are some people in there where it's genuinely the difference between them being able to feed their families and not feed their families." "I hate gen AI," she adds. "I think AI should be used for curing cancer. I think it should be used for space exploration, not in the creative industries. But I need to be able to pay my rent. And then when people like Mercor pull this stuff where they treat you like nothing more than a lab rat -- I've been working for a very long time. I have never, ever been treated as badly as this." Intermittent work, extreme secrecy, and abrupt firings are the norm across the data industry. On Surge AI's work platform, called Data Annotation Tech, workers are not only regularly terminated without explanation; they are often not even told they've been fired. They just log in one day and find the dashboard empty of tasks. The phenomenon is so ubiquitous they call it simply "the dash of death." Last year, a Texan with a master's degree in divinity who was teaching voice models to respond to queries with appropriate levels of feeling -- different tones for a user telling them their dog died versus asking for a trip itinerary -- logged in to work one morning and found his dashboard empty. Scrolling to the bottom of the page for the support button, he discovered it no longer worked. That's when he knew he had been terminated. His mind raced through possible reasons: Had he worked too much? Had his quality slipped? He knew he would never find out. "I felt cut adrift," he says. Anxious about how he would pay his bills and care for his ailing dog, he grew depressed, then horrified. He thought about his teacher friends who couldn't get their students to write and all the people graduating with now-worthless computer-science degrees. "The technology makes us see everything as a utility, something to be used," he says, a category that he feels includes discarded data workers like himself. He resolved to become a chaplain, figuring that no matter what the AI future holds, people will need a fellow human to be there for them. The on-again, off-again nature of the work is not just the result of company culture; it stems from the cadence of AI development itself. People across the industry described the pattern. A model builder, like OpenAI or Anthropic, discovers that its model is weak on chemistry, so it pays a data vendor like Mercor or Scale AI to find chemists to make data. The chemists do tasks until there is a sufficient quantity for a batch to go back to the lab, and the job is paused until the lab sees how the data affects the model. Maybe the lab moves forward, but this time, it's asking for a slightly different type of data. 
When the job resumes, the vendor discovers the new instructions make the tasks take longer, which means the cost estimate the vendor gave the lab is now wrong, which means the vendor cuts pay or tries to get workers to move faster. The new batch of data is delivered, and the job is paused once more. Maybe the lab changes its data requirements again, discovers it has enough data, and ends the project or decides to go with another vendor entirely. Maybe now the lab wants only organic chemists and everyone without the relevant background gets taken off the project. Next, it's biology data that's in demand, or architectural sketches, or K-12 syllabus design. To compete, data companies arrange things so that they will always have workers on call while preserving their freedom to drop them at a moment's notice. "Every vendor is going to have some kind of setup whereby they don't really make promises to people," says a senior employee of a major data company. The companies rarely have much notice of these shifts themselves, sometimes because the AI developers aren't sure exactly what data they need in the first place, other times because they are shopping around for the best deal. "They want to keep us in the dark," the employee continues, "so we inevitably keep the contributors in the dark, then a purchase falls through and you have a thousand people you've trained and formed a relationship with just saying, like, 'What the fuck? Why isn't there work?' It's a horrible feeling from an operator's perspective, too, but obviously it's way worse for them." The workers at the bottom of this supply chain exist in a state of extreme precarity and maximum competitive frenzy -- especially because their strict confidentiality agreements make it impossible for them to establish any kind of seniority or relationship that might outlast a particular project. "The power is all on one side because they can't talk about it," says Matthew McMullen, a strategy-and-operations executive who has worked in the industry since the self-driving-car boom in the mid-2010s. "The labs benefit from you not being able to leverage your experience in the market, and this silence is like their pricing power. The silence is their ability to extract mass information from people without giving them the power to object or to unionize or to make companies themselves. As long as they can't prove what they've done, these raters can't demand what they're worth. The only way that people can demand things is by showing their ability to step up, to take on more work. The only power that they have is to keep going, to get back in line." Which is what they do. When a project for Mercor ends, managers often post a link to other projects on the platform and encourage people to apply. "But again, there are thousands of people applying, so you throw your application into a hole and hope to hear back at some undefined point," says Katya. While they wait, workers sign up for Handshake, Micro1, Alignerr, or another of the ever-growing number of data providers. These companies are always recruiting. Like Mercor, many use AI interviewers and automated evaluations, meaning they have no incentive to limit the number of interviews they do. Mercor offers referral bonuses of several hundred dollars, leading some to promote the company so aggressively that mentions of it have been banned from several sub-Reddits. Katya has applied for dozens of jobs and gotten three, not an unusual ratio. Nor do companies bear any cost for overhiring.
Because workers are ostensibly independent contractors, they are not owed paid time off, breaks, health care, overtime pay, or unemployment benefits. It's free to keep them hanging around, and a surplus of vetted workers ensures they will jump quickly to finish tasks before someone else does. It all combines to create an arrangement in which employers can turn labor on and off like a tap. (Reached for comment, Mercor spokesperson Heidi Hagberg said that "the nature of this is project based contract work, meaning it can extend, pause, or end at any time, especially as the client's scopes and needs evolve," and that many of the worker complaints "were centered around the misalignment of expectations of a full-time job versus project-based work.") If you move fast and get lucky and have the right combination of expertise and stay on the right side of each platform's unique and mysterious recipe of productivity metrics, you can make decent money. I spoke to a playwright making $10,000 a month and a multitalented chemist who at various points found gigs demonstrating poker and singing for AI. But even then, there is an inescapable awareness of ephemerality because producing training data means working toward your own obsolescence. While the number of people doing data work may continue to rise, any particular gig will last only as long as it takes for the machines to successfully mimic it. It takes years for a human to develop expertise, and sooner or later, they're going to run out of skills to sell. A worker with a master's in linguistics had found steady rubric work for a year, but late in 2025, he noticed it was becoming more difficult to stump the models. Any obscure theory or Indigenous language he asked about, the model would find the correct papers. Instead of submitting three or four rubrics per week, he was lucky to get one. Everyone else on the project was following the same trajectory, so he wasn't surprised when it came to an end. Their know-how had been extracted. In the past, he'd always been able to find a new gig, but now when he looked around, he saw only requests for medical experts, human-resources managers, and teachers. He has now been without work for five months and isn't sure what to do next. To the extent that policy responses to AI automation are discussed at all, they mostly concern what to do when AI renders large categories of workers obsolete. Maybe this will happen, but another possibility is that particular tasks will get automated and humans redistributed to other parts of the production process, some revising so-so AI output, others crafting rubrics to improve it. Much of this work will be inherently intermittent, which means it will be done by independent contractors, workers whom current regulations leave almost wholly unprotected. Daron Acemoglu, a professor of economics at MIT who studies automation, compares the situation to that of weavers, who before the industrial revolution were "like the labor aristocracy," self-employed artisans in control of their own time. Then came weaving machines, and in order to survive, they were forced to take new jobs in factories, where they worked longer hours for less money under the close supervision of management. The problem wasn't simply that technology took their jobs; it enabled a new organization of work that gave all power to the owners of capital, who made work a nightmare until labor organizing and regulation set limits.
Early labor skirmishes are already happening, mostly in California, which has some of the most aggressive rules around classifying platform workers. Three class-action lawsuits have been filed against Mercor in the past six months. (Similar suits were previously filed against Surge AI and Scale AI, which is settling.) The lawsuits all accuse the companies of misclassifying workers as independent contractors given the "extraordinary control" they exert over them. This is "an entirely new kind of work," one that the company trains people to do and that cannot be done except on the company's platform. Workers have so little visibility into what they're working on that one person, alleges a suit filed in December, accepted a Mercor project only to be tasked with recording himself reading sexually explicit scripts. Once he discovered this, the worker risked deactivation if he abandoned the project, forcing him to "choose between being paid and being humiliated." These companies are reminiscent of Uber and Lyft a decade ago, says Glenn Danas, a partner at the law firm Clarkson, which is suing Mercor and several other data platforms. Yet in some ways these workers are in a worse position, more replaceable despite their advanced degrees. Uber drivers have to be physically present in a city to work, and they can organize and push for regulation there. If the same were to happen with data workers, companies could just recruit from somewhere else where people will work for less. When Mercor cut pay for its Meta project to $16 per hour, it dropped below the minimum wage in California and other states, yet people there kept working because they needed the money. This was something at least one supervisor acknowledged, writing in Slack, "While we won't actively hire from any states where the minimum wage is above the project's rate, if you are already active on the project and would like to work at the $16/hr rate, we want to enable you to do so." Entire professions risk a similar race to the bottom, says Acemoglu, if companies are able to pit workers against one another, each selling their data before someone else can underbid them. "We may also need unionlike organizations that exercise some sort of collective ownership and prevent any kind of simple divide-and-rule strategies by large companies to drive down data prices," he says. "If there isn't the legal infrastructure for a data economy of this sort, many of the people who produce the data will be underpaid or, to use a more loaded term, exploited." Katya was among the thousands of people invited to join the $16-an-hour Project Nova and was appalled by the low pay. "I think that was Mercor's experiment in how close to the bottom they can scrape without jeopardizing the data that they're getting," she says. Her main project had been paused for weeks and might resume the next day or never. In the end, she decided the money wasn't worth it. She applied to work at a local coffee shop. It wasn't the career pivot she'd imagined when she went to grad school; she just hoped working as a barista would be more stable. "At least when you work at a coffee shop for minimum wage, you have some friends to talk to and a boss who pretends to care about you. You have some kind of security; you know what your hours are going to be week to week," she says. But then she heard her phone ding. One of her projects was back on.
White-collar professionals displaced by AI are now working precarious gig jobs training the same systems that automated their careers. Companies like Mercor recruit experts to create prompts and evaluate chatbot responses, with listings starting at $45 per hour, while software developers watch AI agents write code in minutes that once took days. The irony is stark: the more effective their training, the fewer jobs remain.
A former journalist named Katya found herself in an uncomfortable paradox. After AI automated much of her content marketing work, she clicked on what seemed like a scam job offer from a company called Crossing Hurdles, promising copywriting jobs starting at $45 per hour [1]. The posting led her to Mercor, where she interviewed with an AI named Melvin and was recruited to create training data for AI systems. "My job is gone because of ChatGPT, and I was being invited to train the model to do the worst version of it imaginable," she said [1]. Despite the bitter irony, financial desperation drove her to accept.
Katya's experience reflects a broader transformation reshaping professional labor. Laid-off lawyers and PhDs are increasingly finding work in the precarious gig economy, training AI models for companies like Scale AI, Surge AI, and Mercor [3]. These workers spend hours writing examples of prompts someone might ask a chatbot, crafting ideal responses, and creating detailed checklists of criteria that define quality work. Each task takes several hours before data moves down a digital assembly line for further review [1].

Machine-learning systems learn by finding patterns in enormous quantities of data, but first that data must be sorted, labeled, and produced by people. ChatGPT achieved its fluency from thousands of humans hired to write examples of helpful chatbot responses and grade outputs [3]. However, a little over a year ago, concerns mounted about a plateau in technology progress. Training models through simple grading yielded chatbots good at sounding smart but too unreliable for practical use.
Software engineering proved the exception. The ability of models to automatically check whether code worked—did it compile, did it print HELLO WORLD—allowed them to trial-and-error their way to genuine competence [1]. Few other human activities offer such unambiguous feedback. There are no objective tests for whether financial analysis or advertising copy is "good." Undeterred, AI companies set out to create such tests, collectively paying billions of dollars to professionals to write exacting criteria for jobs well done [3].
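A minimal sketch of what such an automatic check can look like, assuming nothing about any lab's real harness (the function and file names here are invented): the candidate program either runs and prints the expected text or it does not, yielding the unambiguous pass/fail signal the paragraph above describes.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

def passes_check(candidate_code: str, expected: str, timeout_s: float = 5.0) -> bool:
    """Run model-written Python in a subprocess; pass only on exact output."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "candidate.py"
        script.write_text(candidate_code)
        try:
            result = subprocess.run(
                [sys.executable, str(script)],
                capture_output=True,
                text=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            return False  # a hung program fails, too
        return result.returncode == 0 and result.stdout.strip() == expected

# A model can trial-and-error candidates until one passes.
print(passes_check('print("HELLO WORLD")', "HELLO WORLD"))  # True
print(passes_check('print("HELLO WORLD"', "HELLO WORLD"))   # False: SyntaxError
```

There is no equivalent one-line check for a legal brief or an ad campaign, which is why the rubric-writing industry described in this piece exists.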
Manu Ebert, a 39-year-old machine-learning engineer, has been trying to keep his AI from humiliating him. At his start-up Hyperspell, Claude Code from Anthropic now does the bulk of coding work [2]. AI agents write features, test them, and supervise everything like virtual taskmasters. When a customer recently needed new code, it took only half an hour—work that would have taken Ebert a full day previously [2].
Software developers now rarely write code themselves. Instead, they spend days talking to AI, describing in plain English what they want and responding to the AI's plan. Ebert maintains a prompt file—a stern set of instructions his agents must follow. One prompt warns that "pushing code that fails pytest is unacceptable and embarrassing" [2]. Many developers berate their AI agents, plead with them, shout commands in uppercase, and discover the AI becomes slightly more obedient. Computer programming is becoming a conversation, a back-and-forth between software developers and their bots [2].
Mercor was founded in 2023 by three then-19-year-olds from the Bay Area—Brendan Foody, Adarsh Hiremath, and Surya Midha—as a jobs platform using AI interviews to match overseas engineers with tech companies [1]. The company pivoted to selling training data as demand surged.

For workers like Katya, the reality proved unstable. Two days after starting, her project was abruptly paused, then canceled entirely. "I'm working assuming that I can plan around this. I'm saving up for first and last month's rent for an apartment, and then I'm back on my ass. No warning, no security, nothing," she said [1]. Days later, Mercor offered another job evaluating chatbot conversations with users from Malaysia and Vietnam practicing English. The email arrived at 6:30 PM on Sunday night, demanding she sign immediately for a Zoom onboarding in 45 minutes [1].
The situation reveals profound questions about human cognitive labor and job security. For decades, coding was considered such valuable expertise that competent practitioners could expect lifetime employment. Silicon Valley spent the 2010s telling American workers in dying industries to "learn to code" [2]. Now coding itself faces automation. AI job displacement has struck white-collar professionals first, creating a bitter irony: those who built and trained these systems are among the first casualties.

Coding may be the first form of expensive industrialized human labor that AI can actually replace effectively. While AI-generated videos look janky and artificial photos appear surreal, AI-generated code that passes tests and works holds genuine value [2]. The question facing professionals across industries is whether their expertise will follow the same trajectory—first automated, then requiring them to train their own replacements for temporary gig work creating prompts for AI systems they once mastered. As large language models advance, the line between training AI and being replaced by it grows increasingly blurred, leaving workers navigating an uncertain future where today's employment might be tomorrow's obsolete skill.