[1]
How do you know when AI is powerful enough to be dangerous? Regulators try to do the math
How do you know if an artificial intelligence system is so powerful that it poses a security danger and shouldn't be unleashed without careful oversight?

For regulators trying to put guardrails on AI, it's mostly about the arithmetic. Specifically, an AI model trained using 10 to the 26th floating-point operations must now be reported to the U.S. government and could soon trigger even stricter requirements in California.

Say what? Well, if you're counting the zeroes, that's 100,000,000,000,000,000,000,000,000, or 100 septillion, calculations over the course of training, using a measure known as flops, short for floating-point operations.

What it signals to some lawmakers and AI safety advocates is a level of computing power that might enable rapidly advancing AI technology to create or proliferate weapons of mass destruction, or conduct catastrophic cyberattacks.

Those who've crafted such regulations acknowledge they are an imperfect starting point to distinguish today's highest-performing generative AI systems -- largely made by California-based companies like Anthropic, Google, Meta Platforms and ChatGPT-maker OpenAI -- from the next generation that could be even more powerful.

Critics have pounced on the thresholds as arbitrary -- an attempt by governments to regulate math. "Ten to the 26th flops," said venture capitalist Ben Horowitz on a podcast this summer. "Well, what if that's the size of the model you need to, like, cure cancer?"

An executive order signed by President Joe Biden last year relies on that threshold. So does California's newly passed AI safety legislation -- which Gov. Gavin Newsom has until Sept. 30 to sign into law or veto. California adds a second metric to the equation: regulated AI models must also cost at least $100 million to build.

Following in Biden's footsteps, the European Union's sweeping AI Act also measures training compute in floating-point operations, or flops, but sets the bar 10 times lower, at 10 to the 25th power. That covers some AI systems already in operation. China's government has also looked at measuring computing power to determine which AI systems need safeguards.

No publicly available models meet the higher California threshold, though it's likely that some companies have already started to build them. If so, they're supposed to be sharing certain details and safety precautions with the U.S. government. Biden employed a Korean War-era law to compel tech companies to alert the U.S. Commerce Department if they're building such AI models.

AI researchers are still debating how best to evaluate the capabilities of the latest generative AI technology and how it compares to human intelligence. There are tests that judge AI on solving puzzles, logical reasoning or how swiftly and accurately it predicts what text will answer a person's chatbot query. Those measurements help assess an AI tool's usefulness for a given task, but there's no easy way of knowing which one is so widely capable that it poses a danger to humanity.

"This computation, this flop number, by general consensus is sort of the best thing we have along those lines," said physicist Anthony Aguirre, executive director of the Future of Life Institute, which has advocated for the passage of California's Senate Bill 1047 and other AI safety rules around the world.

Floating point arithmetic might sound fancy, "but it's really just numbers that are being added or multiplied together," making it one of the simplest ways to assess an AI model's capability and risk, Aguirre said.

"Most of what these things are doing is just multiplying big tables of numbers together," he said. "You can just think of typing in a couple of numbers into your calculator and adding or multiplying them. And that's what it's doing -- ten trillion times or a hundred trillion times."
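Aguirre's description can be made concrete with a toy calculation. The sketch below is illustrative only; the 4,096-wide layer, the single-token input and the use of NumPy are assumptions for the example, not anything specified in the regulations.

```python
# Toy illustration of Aguirre's point: a neural-network layer is mostly one big
# matrix multiplication, and "flops" just counts its additions and multiplications.
import numpy as np

def matmul_flops(m: int, k: int, n: int) -> int:
    """An (m x k) @ (k x n) product needs m*n*k multiplications and
    m*n*(k-1) additions, roughly 2*m*n*k floating-point operations."""
    return 2 * m * n * k

# One hypothetical 4,096-wide layer applied to a single token.
activations = np.random.rand(1, 4096)
weights = np.random.rand(4096, 4096)
output = activations @ weights              # the actual arithmetic
print(f"{matmul_flops(1, 4096, 4096):,}")   # 33,554,432 flops for this one step
```

Repeat that kind of step across every layer, every token and every training pass, and the totals climb toward the septillions the regulations are counting.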
"Most of what these things are doing is just multiplying big tables of numbers together," he said. "You can just think of typing in a couple of numbers into your calculator and adding or multiplying them. And that's what it's doing -- ten trillion times or a hundred trillion times." For some tech leaders, however, it's too simple and hard-coded a metric. There's "no clear scientific support" for using such metrics as a proxy for risk, argued computer scientist Sara Hooker, who leads AI company Cohere's nonprofit research division, in a July paper. "Compute thresholds as currently implemented are shortsighted and likely to fail to mitigate risk," she wrote. Venture capitalist Horowitz and his business partner Marc Andreessen, founders of the influential Silicon Valley investment firm Andreessen Horowitz, have attacked the Biden administration as well as California lawmakers for AI regulations they argue could snuff out an emerging AI startup industry. For Horowitz, putting limits on "how much math you're allowed to do" reflects a mistaken belief there will only be a handful of big companies making the most capable models and you can put "flaming hoops in front of them and they'll jump through them and it's fine." In response to the criticism, the sponsor of California's legislation sent a letter to Andreessen Horowitz this summer defending the bill, including its regulatory thresholds. Regulating at over 10 to the 26th flops is "a clear way to exclude from safety testing requirements many models that we know, based on current evidence, lack the ability to cause critical harm," wrote state Sen. Scott Wiener of San Francisco. Existing publicly released models "have been tested for highly hazardous capabilities and would not be covered by the bill," Wiener said. Both Wiener and the Biden executive order treat the metric as a temporary one that could be adjusted later. Yacine Jernite, who leads policy research at the AI company Hugging Face, said the flops metric emerged in "good faith" ahead of last year's Biden order but is already starting to grow obsolete. AI developers are doing more with smaller models requiring less computing power, while the potential harms of more widely used AI products won't trigger California's proposed scrutiny. "Some models are going to have a drastically larger impact on society, and those should be held to a higher standard, whereas some others are more exploratory and it might not make sense to have the same kind of process to certify them," Jernite said. Aguirre said it makes sense for regulators to be nimble, but he characterizes some opposition to the flops threshold as an attempt to avoid any regulation of AI systems as they grow more capable. "This is all happening very fast," Aguirre said. "I think there's a legitimate criticism that these thresholds are not capturing exactly what we want them to capture. But I think it's a poor argument to go from that to, 'Well, we just shouldn't do anything and just cross our fingers and hope for the best.'"
U.S. and California policymakers are leaning on a single number, the amount of computing power used to train an AI model, to decide when a system is powerful enough to warrant safety oversight. The compute-based metric is easy to measure but hotly contested.
An executive order signed by President Joe Biden last year requires developers to report models trained using more than 10 to the 26th floating-point operations (flops) to the U.S. government. California's Senate Bill 1047, passed by lawmakers and awaiting Gov. Gavin Newsom's signature or veto by Sept. 30, relies on the same compute threshold and adds a second test: covered models must also cost at least $100 million to build [1].
The metric counts the total floating-point operations used to train a model, about 100 septillion calculations at the 10 to the 26th mark, on the theory that training compute is the most practical proxy available for how capable, and therefore how risky, a frontier model might be [1].
The coverage differs by jurisdiction. The European Union's AI Act sets its bar 10 times lower, at 10 to the 25th flops, which already captures some systems in operation, and China has explored similar compute-based criteria. Companies building models above the U.S. threshold are expected to share details and safety precautions with the Commerce Department, while models covered by the California bill would also face safety testing requirements [1].
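To show how coarse the cutoff is in practice, here is a minimal sketch of the arithmetic. The threshold values come from the article; the rule-of-thumb estimate of roughly 6 floating-point operations per parameter per training token, the helper names and the example model size are assumptions for illustration, not anything written into the executive order or SB 1047.

```python
# Hypothetical sketch of the compute-threshold arithmetic described above.
# Thresholds are as reported in the article; the 6 * parameters * tokens
# estimate of training compute is a common rule of thumb, not legal text.

US_REPORTING_THRESHOLD = 1e26        # Biden executive order: report to Commerce Dept.
EU_SYSTEMIC_RISK_THRESHOLD = 1e25    # EU AI Act: bar set 10x lower
CA_COST_THRESHOLD_USD = 100_000_000  # SB 1047 adds a training-cost test

def estimated_training_flops(parameters: float, tokens: float) -> float:
    """Rough training compute: ~6 floating-point operations per parameter
    per training token (forward plus backward pass)."""
    return 6 * parameters * tokens

def regimes_triggered(flops: float, cost_usd: float) -> list[str]:
    """Which of the rules described in the article a training run would trip."""
    hits = []
    if flops > EU_SYSTEMIC_RISK_THRESHOLD:
        hits.append("EU AI Act (10^25 flops)")
    if flops > US_REPORTING_THRESHOLD:
        hits.append("U.S. reporting requirement (10^26 flops)")
    if flops > US_REPORTING_THRESHOLD and cost_usd >= CA_COST_THRESHOLD_USD:
        hits.append("California SB 1047 (10^26 flops and $100M cost)")
    return hits

# Illustrative run: a 1-trillion-parameter model trained on 20 trillion tokens.
flops = estimated_training_flops(1e12, 2e13)       # 1.2e26 flops
print(f"{flops:.1e}", regimes_triggered(flops, cost_usd=150_000_000))
# 1.2e+26 ['EU AI Act (10^25 flops)', 'U.S. reporting requirement (10^26 flops)',
#          'California SB 1047 (10^26 flops and $100M cost)']
```

A real determination would follow the legal definitions rather than this rule of thumb; the sketch only shows why the cutoff is so blunt: a training run lands on one side or the other of a single number.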
The compute-threshold approach faces several lines of criticism:
Arbitrariness: Critics call the cutoff an attempt by governments to regulate math. Venture capitalist Ben Horowitz asks what happens if a model that size turns out to be the one needed to cure cancer [1].
Accuracy: Computer scientist Sara Hooker, who leads Cohere's nonprofit research division, argues there is "no clear scientific support" for using compute as a proxy for risk, calling current compute thresholds "shortsighted and likely to fail to mitigate risk" [1].
Industry pushback: Venture capitalists Marc Andreessen and Ben Horowitz have attacked both the Biden administration and California lawmakers, arguing the rules could snuff out an emerging AI startup industry [1].
Supporters treat the number as provisional: both state Sen. Scott Wiener, the bill's sponsor, and the Biden executive order allow the threshold to be adjusted later, and Wiener notes that no existing publicly released model would be covered [1].
Researchers remain divided. Physicist Anthony Aguirre of the Future of Life Institute calls the flop count "sort of the best thing we have" for flagging potentially dangerous systems, while Hugging Face's Yacine Jernite warns the metric is already growing obsolete as developers do more with smaller models. However the California bill fares, compute thresholds are likely to shape how regulators worldwide decide which AI systems deserve the closest scrutiny [1].