2 Sources
[1]
Constellation Network and Common Crawl Provide Secure Validation of AI Training Data
SAN FRANCISCO, Dec. 19, 2024 (GLOBE NEWSWIRE) -- Constellation Network, a Web3 ecosystem validated by the US Department of Defense, today announced the launch of a customized blockchain developed in partnership with the Common Crawl Foundation, to create the industry's first cryptographically secure, immutable archive of internet data for AI training and development. The collaboration introduces a new approach to validating and securely accessing 17 years of internet crawl data -- spanning nearly 9 petabytes which 80% of Large Language Models (LLMs) use to train AI -- through an immutable, cryptographically secured blockchain network built on Constellation. This innovative application-specific network, or Metagraph, addresses pressing concerns in AI development while exploring vast new use cases for blockchain technology in emerging industries: data provenance, privacy, and ethical sourcing. Furthermore, the network will utilize Constellation's DAG utility asset to secure the archived internet crawls. This represents a significant advancement in utilizing cryptocurrency as a mechanism for businesses to notarize data, shifting the focus from consumer costs or gas fees typical of many other layer-one networks to an operational expense.
Key Technological Innovations
Comprehensive Data Archiving: A fully immutable copy of internet history, providing unprecedented transparency and traceability for AI training datasets
End-to-End Encryption: Cryptographic security that ensures data integrity throughout the AI development lifecycle
Ethical AI Framework: A robust solution for addressing concerns around data collection, storage, and usage in large language models
"This integration is a critical step forward in securing the future of AI development," said Alex Brandes, CTO of Constellation Network. "By ensuring cryptographic integrity and immutability of training data, we are addressing one of the most pressing challenges in the field today: trustworthiness and provenance of datasets. We believe our platform will grow to become a cornerstone in the field of responsible AI development, setting new standards for data integrity and trust."
Industry Applications
The blockchain-enabled data archive is already attracting attention from advanced AI research initiatives. TraceAI, a project developed through the National Science Foundation (NSF) and SBIR program, is in testing stages in the development of their own application-specific network, built on Constellation, to add immutability, auditability, and proof of authorship to its training models and to develop advanced watermarking technologies. TraceAI will also leverage Common Crawl's Constellation-built solution to further extend their work in blockchain encrypted AI to include tracking the source origin of data. Kevin Jackson, Vice President of Space Domain Communications & Commercialization for Forward Edge-AI, emphasizes the significance of this breakthrough: "This represents the natural evolution of AI and machine learning model development -- transforming data management from a technical challenge to a trusted business tool that drives global standardization and verification."
Looking Forward
Over the coming months, Constellation Network and Common Crawl Foundation will work together to expand on solution sets for AI developers and further integrate the distribution of the cryptographically validated access to the crawl as part of the standard release process. "For users of the Crawl who are concerned about the provenance of the data, especially those using it for AI models, Constellation and their hypergraph blockchain provides an elegant solution," said Rich Skrenta, Executive Director of the Common Crawl. "We are looking forward to adding the ability to securely validate the crawl as part of our standard distribution by partnering with Constellation." Evidence of this integration can be found on Constellation's transaction viewer, called the "DAG explorer," and developers can get started using verified historical crawls for AI applications. Please follow along for further solutions to be developed by Constellation, Forward Edge-AI, and Common Crawl.
About Constellation Network
Constellation is a leading blockchain network advancing innovation through on-chain data security, partnering with critical global stakeholders, including the U.S. Department of Defense, to deliver transformative, next-generation technologies.
About Common Crawl Foundation
The Common Crawl Foundation is a 501(c)(3) non-profit organization dedicated to providing a copy of the internet to the public, free of charge. Their web archive consists of petabytes of data collected over years of web crawling, serving as a critical resource for researchers, businesses, and developers worldwide.
About Forward Edge-AI
Forward Edge-AI is at the forefront of a revolution in responsible and inclusive Artificial Intelligence (AI) for the betterment of humanity. Since its foundation in 2019, our goal is to become the dominant player in Artificial Intelligence and lead the revolution in augmenting edge technology with human intelligence.
Contact
Website: https://constellationnetwork.io/
Twitter: https://x.com/conste11ation
GitHub: https://github.com/Constellation-Labs/tessellation
DAG Explorer: https://mainnet.dagexplorer.io/
[2]
Constellation Network and Common Crawl provide secure validation of AI training data
Constellation Network partners with Common Crawl Foundation to create a blockchain-based, cryptographically secure archive of internet data for AI training, addressing data provenance and ethical concerns in AI development.
Constellation Network, a Web3 ecosystem validated by the US Department of Defense, has announced a partnership with the Common Crawl Foundation to create what the companies describe as the industry's first cryptographically secure, immutable archive of internet data for AI training and development [1][2]. The collaboration aims to address critical concerns in AI development, including data provenance, privacy, and ethical sourcing.
The partnership introduces a method for validating and securely accessing 17 years of internet crawl data, spanning nearly 9 petabytes, which according to the announcement is used by 80% of Large Language Models (LLMs) for training [1]. The data will be secured through an immutable, cryptographically protected, application-specific blockchain network, or Metagraph, built on Constellation [2].
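The announcement does not describe how a developer would actually check a crawl file against the network, so the following is only a minimal sketch of the general idea behind notarized data: hash a local copy and compare the result with a digest that was recorded elsewhere. The function names, the use of SHA-256, and the idea that a hex digest is what gets published are all assumptions made for illustration; none of them come from Constellation or Common Crawl.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Chunked reads keep memory use flat even for very large archive files.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_notarized_digest(path: Path, notarized_hex: str) -> bool:
    """Compare a local file's digest with a digest recorded elsewhere.

    `notarized_hex` is a hypothetical stand-in for whatever value a ledger
    entry would expose; how that value is fetched is outside this sketch.
    """
    return hmac.compare_digest(sha256_of_file(path), notarized_hex.lower())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "segment.bin"  # stand-in for a downloaded crawl file
        sample.write_bytes(b"example crawl bytes")
        recorded = sha256_of_file(sample)  # pretend this was notarized earlier
        print(matches_notarized_digest(sample, recorded))
        sample.write_bytes(b"tampered bytes")
        print(matches_notarized_digest(sample, recorded))
```

A changed byte anywhere in the file produces a different digest, which is what makes an immutable record of the original digest useful as a provenance check.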
The blockchain-enabled data archive is already gaining attention from advanced AI research initiatives. TraceAI, a project developed through the National Science Foundation (NSF) and SBIR program, is testing its own application-specific network built on Constellation [1]. This network aims to add immutability, auditability, and proof of authorship to its training models and to develop advanced watermarking technologies.
Kevin Jackson, VP of Space Domain Communications & Commercialization for Forward Edge-AI, emphasized the significance of the work: "This represents the natural evolution of AI and machine learning model development -- transforming data management from a technical challenge to a trusted business tool that drives global standardization and verification" [1][2].
Constellation Network and Common Crawl Foundation plan to expand solution sets for AI developers and further integrate the distribution of cryptographically validated access to the crawl as part of the standard release process [1]. Rich Skrenta, Executive Director of Common Crawl, stated, "For users of the Crawl who are concerned about the provenance of the data, especially those using it for AI models, Constellation and their hypergraph blockchain provides an elegant solution" [2].
The companies present the approach as an advance in using cryptocurrency as a mechanism for businesses to notarize data. It shifts the cost from the consumer-paid gas fees typical of many other layer-one networks to an operational expense borne by the business [1]. Alex Brandes, CTO of Constellation Network, said he believes the platform will become a cornerstone of responsible AI development, setting new standards for data integrity and trust [2].
As the AI industry continues to grapple with issues of data reliability and ethical sourcing, this blockchain-secured archive offers a promising solution that could reshape the landscape of AI training and development.