5 Sources
[1]
You can now download the source code that sparked the AI boom
On Thursday, Google and the Computer History Museum (CHM) jointly released the source code for AlexNet, the convolutional neural network (CNN) that many credit with transforming the AI field in 2012 by proving that "deep learning" could achieve things conventional AI techniques could not. Deep learning, which uses multi-layered neural networks that can learn from data without explicit programming, represented a significant departure from traditional AI approaches that relied on hand-crafted rules and features. The Python code, now available on CHM's GitHub page as open source software, offers AI enthusiasts and researchers a glimpse into a key moment of computing history. AlexNet served as a watershed moment in AI because it could accurately identify objects in photographs with unprecedented accuracy -- correctly classifying images into one of 1,000 categories like "strawberry," "school bus," or "golden retriever" with significantly fewer errors than previous systems. Like viewing original ENIAC circuitry or plans for Babbage's Difference Engine, examining the AlexNet code may provide future historians insight into how a relatively simple implementation sparked a technology that has reshaped our world. While deep learning has enabled advances in health care, scientific research, and accessibility tools, it has also facilitated concerning developments like deepfakes, automated surveillance, and the potential for widespread job displacement. But in 2012, those negative consequences still felt like far-off sci-fi dreams to many. Instead, experts were simply amazed that a computer could finally recognize images with near-human accuracy. Teaching computers to see As the CHM explains in its detailed blog post, AlexNet originated from the work of University of Toronto graduate students Alex Krizhevsky and Ilya Sutskever, along with their advisor Geoffrey Hinton. The project proved that deep learning could outperform traditional computer vision methods. The neural network won the 2012 ImageNet competition by recognizing objects in photos far better than any previous method. Computer vision veteran Yann LeCun, who attended the presentation in Florence, Italy, immediately recognized its importance for the field, reportedly standing up after the presentation and calling AlexNet "an unequivocal turning point in the history of computer vision." As Ars detailed in November, AlexNet marked the convergence of three critical technologies that would define modern AI. According to CHM, the museum began efforts to acquire the historically significant code in 2020, when Hansen Hsu (CHM's curator) reached out to Krizhevsky about releasing the source code due to its historical importance. Since Google had acquired the team's company DNNresearch in 2013, it owned the intellectual property rights. The museum worked with Google for five years to negotiate the release and carefully identify which specific version represented the original 2012 implementation -- an important distinction, as many recreations labeled "AlexNet" exist online but aren't the authentic code used in the breakthrough. How AlexNet worked While AlexNet's impact on AI is now legendary, understanding the technical innovation behind it helps explain why it represented such a pivotal moment. The breakthrough wasn't any single revolutionary technique, but rather the elegant combination of existing technologies that had previously developed separately. 
The project combined three previously separate components: deep neural networks, massive image datasets, and graphics processing units (GPUs).

Deep neural networks formed the core architecture of AlexNet, with multiple layers that could learn increasingly complex visual features. The network was named after Krizhevsky, who implemented the system and performed the extensive training process. Unlike traditional AI systems that required programmers to manually specify what features to look for in images, these deep networks could automatically discover patterns at different levels of abstraction -- from simple edges and textures in early layers to complex object parts in deeper layers. While AlexNet used a CNN architecture specialized for processing grid-like data such as images, today's AI systems like ChatGPT and Claude rely primarily on Transformer models, a 2017 Google Research invention that excels at processing sequential data and capturing long-range dependencies in text and other media through a mechanism called "attention."

For training data, AlexNet used ImageNet, a database started by Stanford University professor Fei-Fei Li in 2006. Li collected millions of Internet images and organized them using a taxonomy from WordNet, a lexical database of English words and their relationships. Workers on Amazon's Mechanical Turk platform helped label the images.

The project needed serious computational power to process this data. Krizhevsky ran the training process on two Nvidia graphics cards installed in a computer in his bedroom at his parents' house. Neural networks perform many matrix calculations in parallel, tasks that graphics chips handle well. Nvidia, led by Jensen Huang, had made its graphics chips programmable for non-graphics tasks through its CUDA software, released in 2007.

The impact of AlexNet extends beyond computer vision. Deep learning neural networks now power voice synthesis, game-playing systems, language models, and image generators. They're also responsible for potential society-fracturing effects such as filling social networks with AI-generated slop, empowering abusive bullies, and potentially altering the historical record.

Where are they now?

In the 13 years since their breakthrough, the creators of AlexNet have taken their expertise in different directions, each contributing to the field in unique ways. After AlexNet's success, Krizhevsky, Sutskever, and Hinton formed a company called DNNresearch Inc., which Google acquired in 2013. Each team member has followed a different path since then.

Sutskever co-founded OpenAI in 2015, which released ChatGPT in 2022, and more recently launched Safe Superintelligence (SSI), a startup that has secured $1 billion in funding. Krizhevsky left Google in 2017 to work on new deep learning techniques at Dessa. Hinton has gained acclaim and notoriety for warning about the potential dangers of future AI systems, resigning from Google in 2023 so he could speak freely about the topic. Last year, Hinton stunned the scientific community when he received the 2024 Nobel Prize in Physics alongside John J. Hopfield for their foundational work in machine learning dating back to the early 1980s.

Regarding who gets the most credit for AlexNet, Hinton described the project roles with characteristic humor to the Computer History Museum: "Ilya thought we should do it, Alex made it work, and I got the Nobel Prize."
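For readers curious what the "attention" mechanism mentioned above looks like in practice, here is a minimal sketch of scaled dot-product attention in Python. The sizes and names are illustrative only, not drawn from any of the systems described above:

```python
import numpy as np

# Minimal sketch of scaled dot-product attention, the core operation of the
# Transformer models mentioned above. Each position in a sequence builds its
# output as a weighted mix of every other position's values. Sizes are
# illustrative.

def attention(Q, K, V):
    scores = Q @ K.T / np.sqrt(K.shape[-1])            # pairwise similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax over positions
    return weights @ V                                 # weighted mix of values

rng = np.random.default_rng(0)
seq_len, d = 6, 8                                      # six tokens, 8-dim features
Q, K, V = (rng.normal(size=(seq_len, d)) for _ in range(3))
print(attention(Q, K, V).shape)                        # (6, 8)
```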
[2]
How Did AlexNet Transform AI? Explore the Groundbreaking Source Code
In partnership with Google, the Computer History Museum has released the source code to AlexNet, the neural network that in 2012 kickstarted today's prevailing approach to AI. The source code is available as open source on CHM's GitHub page.

AlexNet is an artificial neural network created to recognize the contents of photographic images. It was developed in 2012 by then University of Toronto graduate students Alex Krizhevsky and Ilya Sutskever and their faculty advisor, Geoffrey Hinton. Hinton is regarded as one of the fathers of deep learning, the type of artificial intelligence that uses neural networks and is the foundation of today's mainstream AI.

Simple three-layer neural networks with only one layer of adaptive weights were first built in the late 1950s -- most notably by Cornell researcher Frank Rosenblatt -- but they were found to have limitations. [This explainer gives more details on how neural networks work.] In particular, researchers needed networks with more than one layer of adaptive weights, but there wasn't a good way to train them. By the early 1970s, neural networks had been largely rejected by AI researchers.

In the 1980s, neural network research was revived outside the AI community by cognitive scientists at the University of California San Diego, under the new name of "connectionism." After finishing his Ph.D. at the University of Edinburgh in 1978, Hinton had become a postdoctoral fellow at UCSD, where he collaborated with David Rumelhart and Ronald Williams. The three rediscovered the backpropagation algorithm for training neural networks, and in 1986 they published two papers showing that it enabled neural networks to learn multiple layers of features for language and vision tasks. Backpropagation, which is foundational to deep learning today, uses the difference between the current output and the desired output of the network to adjust the weights in each layer, from the output layer backward to the input layer.

In 1987, Hinton joined the University of Toronto. Away from the centers of traditional AI, Hinton's work and that of his graduate students made Toronto a center of deep learning research over the coming decades. One postdoctoral student of Hinton's was Yann LeCun, now chief scientist at Meta. While working in Toronto, LeCun showed that when backpropagation was used in "convolutional" neural networks, they became very good at recognizing handwritten numbers.

Despite these advances, neural networks could not consistently outperform other types of machine learning algorithms. They needed two developments from outside of AI to pave the way. The first was the emergence of vastly larger amounts of data for training, made available through the Web. The second was enough computational power to perform this training, in the form of 3D graphics chips, known as GPUs. By 2012, the time was ripe for AlexNet.

The data needed to train AlexNet was found in ImageNet, a project started and led by Stanford professor Fei-Fei Li. Beginning in 2006, and against conventional wisdom, Li envisioned a dataset of images covering every noun in the English language. She and her graduate students began collecting images found on the Internet and classifying them using a taxonomy provided by WordNet, a database of words and their relationships to each other. Given the enormity of their task, Li and her collaborators ultimately crowdsourced the task of labeling images to gig workers, using Amazon's Mechanical Turk platform.
Completed in 2009, ImageNet was larger than any previous image dataset by several orders of magnitude. Li hoped its availability would spur new breakthroughs, and she started a competition in 2010 to encourage research teams to improve their image recognition algorithms. But over the next two years, the best systems only made marginal improvements.

The second condition necessary for the success of neural networks was economical access to vast amounts of computation. Neural network training involves a lot of repeated matrix multiplications, preferably done in parallel, something that GPUs are designed to do. NVIDIA, cofounded by CEO Jensen Huang, had led the way in the 2000s in making GPUs more generalizable and programmable for applications beyond 3D graphics, especially with the CUDA programming system released in 2007.

Both ImageNet and CUDA were, like neural networks themselves, fairly niche developments that were waiting for the right circumstances to shine. In 2012, AlexNet brought together these elements -- deep neural networks, big datasets, and GPUs -- for the first time, with pathbreaking results. Each of these needed the other.

By the late 2000s, Hinton's grad students at the University of Toronto were beginning to use GPUs to train neural networks for both image and speech recognition. Their first successes came in speech recognition, but success in image recognition would point to deep learning as a possible general-purpose solution to AI. One student, Ilya Sutskever, believed that the performance of neural networks would scale with the amount of data available, and the arrival of ImageNet provided the opportunity.

In 2011, Sutskever convinced fellow grad student Alex Krizhevsky, who had a keen ability to wring maximum performance out of GPUs, to train a convolutional neural network for ImageNet, with Hinton serving as principal investigator. Krizhevsky had already written CUDA code for a convolutional neural network using NVIDIA GPUs, called cuda-convnet, trained on the much smaller CIFAR-10 image dataset. He extended cuda-convnet with support for multiple GPUs and other features and retrained it on ImageNet. The training was done on a computer with two NVIDIA cards in Krizhevsky's bedroom at his parents' house. Over the course of the next year, he constantly tweaked the network's parameters and retrained it until it achieved performance superior to its competitors. The network would ultimately be named AlexNet, after Krizhevsky.

Geoff Hinton summed up the AlexNet project this way: "Ilya thought we should do it, Alex made it work, and I got the Nobel prize."

Krizhevsky, Sutskever, and Hinton wrote a paper on AlexNet that was published in the fall of 2012 and presented by Krizhevsky at a computer vision conference in Florence, Italy, in October. Veteran computer vision researchers weren't convinced, but LeCun, who was at the meeting, pronounced it a turning point for AI. He was right. Before AlexNet, almost none of the leading computer vision papers used neural nets. After it, almost all of them would.

AlexNet was just the beginning. In the next decade, neural networks would advance to synthesize believable human voices, beat champion Go players, and generate artwork, culminating with the release of ChatGPT in November 2022 by OpenAI, a company cofounded by Sutskever.

In 2020, I reached out to Krizhevsky to ask about the possibility of allowing CHM to release the AlexNet source code, due to its historical significance.
He connected me to Hinton, who was working at Google at the time. Google owned AlexNet, having acquired DNNresearch, the company owned by Hinton, Sutskever, and Krizhevsky. Hinton got the ball rolling by connecting CHM to the right team at Google. CHM worked with the Google team for five years to negotiate the release. The team also helped us identify the specific version of the AlexNet source code to release -- there have been many versions of AlexNet over the years. There are other repositories of code called AlexNet on GitHub, but many of these are re-creations based on the famous paper, not the original code. CHM is proud to present the source code to the 2012 version of AlexNet, which transformed the field of artificial intelligence. You can access the source code on CHM's GitHub page. This post originally appeared on the blog of the Computer History Museum. Special thanks to Geoffrey Hinton for providing his quote and reviewing the text, to Cade Metz and Alex Krizhevsky for additional clarifications, and to David Bieber and the rest of the team at Google for their work in securing the source code release.
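To make the backpropagation procedure the article describes more concrete, here is a minimal two-layer sketch in Python. The architecture, sizes, and learning rate are illustrative, not taken from AlexNet:

```python
import numpy as np

# Minimal sketch of backpropagation on a two-layer network, illustrating the
# idea described above: the difference between current and desired output is
# pushed backward through the layers, via the chain rule, to adjust the
# weights. All names and sizes here are illustrative.

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 1))         # input vector
y = rng.normal(size=(2, 1))         # desired output

W1 = rng.normal(size=(3, 4)) * 0.1  # first layer of adaptive weights
W2 = rng.normal(size=(2, 3)) * 0.1  # second layer of adaptive weights

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for step in range(1000):
    # Forward pass: compute the network's current output.
    z1 = W1 @ x
    a1 = sigmoid(z1)
    out = W2 @ a1                     # linear output layer

    # Error between current output and desired output.
    delta2 = out - y                  # gradient at the output layer

    # Backward pass: propagate the error to the hidden layer.
    delta1 = (W2.T @ delta2) * a1 * (1 - a1)

    # Gradient-descent weight updates, from the output layer backward.
    W2 -= 0.1 * delta2 @ a1.T
    W1 -= 0.1 * delta1 @ x.T
```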
[3]
AlexNet, the AI model that started it all, released in source code form - for all to download
There are many stories of how artificial intelligence came to take over the world, but one of the most important developments is the emergence in 2012 of AlexNet, a neural network that, for the first time, demonstrated a huge jump in a computer's ability to recognize images.

Thursday, the Computer History Museum (CHM), in collaboration with Google, released for the first time the AlexNet source code written by University of Toronto graduate student Alex Krizhevsky, placing it on GitHub for all to peruse and download.

"CHM is proud to present the source code to the 2012 version of Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton's AlexNet, which transformed the field of artificial intelligence," write the Museum organizers in the readme file on GitHub.

Krizhevsky's creation would lead to a flood of innovation in the ensuing years, and tons of capital, based on proof that with sufficient data and computing, neural networks could achieve breakthroughs previously viewed as mainly theoretical.

The code, which weighs in at a scant 200KB in the source folder, combines Nvidia CUDA code, Python script, and a little bit of C++ to describe how to make a convolutional neural network parse and categorize image files.

The Museum's software historian, Hansen Hsu, spent five years negotiating with Google, which owns the rights to the source, to release the code, as he describes in his essay about the legacy of AI and how AlexNet came to be.

Krizhevsky was a graduate student under Nobel Prize-winning AI scientist Geoffrey Hinton at the time. A second grad student, Ilya Sutskever, who later co-founded OpenAI, urged Krizhevsky to pursue the project. As Hsu quotes Hinton, "Ilya thought we should do it, Alex made it work, and I got the Nobel Prize." Google owns the AlexNet intellectual property because it acquired Hinton, Krizhevsky, and Sutskever's startup company, DNNresearch.

Until AlexNet, Hinton and others had toiled for years to prove that "deep learning" collections of artificial neurons could learn patterns in data. As Hsu notes, AI had become a backwater because it failed to demonstrate meaningful results. The convolutional neural network (CNN) had shown promising starts in performing tasks such as recognizing hand-written digits, but it had not transformed any industries until then.

Hinton and other true believers kept working, refining the design of neural networks, including CNNs, and figuring out in small experiments on Nvidia GPU chips how increasing the number of layers of artificial neurons could theoretically lead to better results.

According to Hsu, Sutskever had the insight that the theoretical work could be scaled up to a much larger neural network given enough horsepower and training data. As Sutskever told Nvidia co-founder and CEO Jensen Huang during a fireside chat in 2023, he knew that making neural networks big would work, even if it went against conventional wisdom.

"People weren't looking at large neural networks" in 2012, Sutskever told Huang. "People were just training on neural networks with 50, 100 neurons," rather than the millions and billions that later became standard. Sutskever knew they were wrong.
"It wasn't just an intuition; it was, I would argue, an irrefutable argument, which went like this: If your neural network is deep and large, then it could be configured to solve a hard task."

The trio found the training data they needed in ImageNet, which was a new creation by Stanford University professor Fei-Fei Li at the time. Li had herself bucked conventional wisdom in enlisting Amazon Mechanical Turk workers to hand-label 14 million images of every kind of object, a dataset much larger than any computer vision dataset at the time.

"It seemed like this unbelievably difficult dataset, but it was clear that if we were to train a large convolutional neural network on this dataset, it must succeed if we just can have the compute," Sutskever told Huang in 2023.

The fast computing they needed turned out to be a dual-GPU desktop computer that Krizhevsky worked on in his bedroom at his parents' house.

When the work was presented at the ImageNet annual competition in September of 2012, AlexNet scored almost 11 points better than the closest competitor, with a 15.3% error rate. They described the work in a formal paper.

Yann LeCun, chief AI scientist at Meta Platforms, who had earlier studied under Hinton and had pioneered CNN engineering in the 1990s, proclaimed AlexNet at the time to be a turning point. "He was right," writes Hsu. "Before AlexNet, almost none of the leading computer vision papers used neural nets. After it, almost all of them would."

What the trio had done was to make good on all the theoretical work on making "deep" neural networks out of many more layers of neurons, to prove that they could really learn patterns.

"AlexNet was just the beginning," writes Hsu. "In the next decade, neural networks would advance to synthesize believable human voices, beat champion Go players, model human language, and generate artwork, culminating with the release of ChatGPT in 2022 by OpenAI, a company co-founded by Sutskever."

Sutskever would later prove once again that making neural networks bigger could lead to surprising breakthroughs. The arrival of ChatGPT in the fall of 2022, another shot heard around the world, was the result of all the GPT-1, -2, and -3 models before it. Those models were all a result of Sutskever's faith in scaling neural networks to unprecedented size.

"I had a very strong belief that bigger is better and that one of the goals that we had at OpenAI is to figure out how to use the scale correctly," he told Huang in 2023.

Huang credited the trio during his keynote speech at the Consumer Electronics Show in January. "In 2012, Alex Krizhevsky, Ilya Sutskever, and Geoff Hinton discovered CUDA," said Huang, "used it to process AlexNet, and the rest is history."

The release of AlexNet in source code form has interesting timing. It arrives just as the AI field and the entire world economy are enthralled with another open-source model, DeepSeek AI's R1.
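A note on that 15.3% figure: it is the competition's top-5 error rate, under which a prediction counts as correct if the true label appears among the model's five highest-scoring classes. Here is a minimal sketch of the metric in Python, with illustrative names and toy data rather than the original evaluation code:

```python
import numpy as np

# Sketch of the top-5 error metric used in the ImageNet competition: a
# prediction is correct if the true label is among the five highest-scoring
# classes. All names and data here are illustrative.

def top5_error(scores, labels):
    """scores: (n_images, n_classes) array; labels: (n_images,) true classes."""
    top5 = np.argsort(scores, axis=1)[:, -5:]          # five best classes per image
    hits = np.any(top5 == labels[:, None], axis=1)     # true label among them?
    return 1.0 - hits.mean()

# Toy usage: 1,000 fake images scored across 1,000 classes (as in ImageNet).
rng = np.random.default_rng(0)
scores = rng.normal(size=(1000, 1000))
labels = rng.integers(0, 1000, size=1000)
print(f"top-5 error: {top5_error(scores, labels):.1%}")  # ~99.5% for random guessing
```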
[4]
Before ChatGPT, there was AlexNet: the AI code that started it all is now open source
The big picture: You can't go five minutes these days without hearing about AI this and AI that. But have you ever wondered how we got here? The credit largely goes to a groundbreaking neural network from 2012 called AlexNet. While it didn't cause an immediate sensation, it ultimately became the foundation for the deep learning revolution we're experiencing today. Now, after years of negotiations, the original source code has finally been released to the public, thanks to a collaborative effort between the Computer History Museum and Google.

The source code, originally written by University of Toronto graduate student Alex Krizhevsky, has now been uploaded to GitHub. AlexNet was a neural network that marked a major breakthrough in a computer's ability to recognize and classify images.

By 2012, the theory behind neural networks - including the pivotal backpropagation algorithm - had been around for decades. However, two key components were missing: the massive datasets needed to train these networks and the raw computing power required to process them. Initiatives like Stanford's ImageNet project and Nvidia's CUDA GPU programming finally provided those crucial elements. These advancements enabled Krizhevsky, working alongside fellow graduate student Ilya Sutskever under AI pioneer Geoffrey Hinton, to train AlexNet and unlock the full potential of deep learning. For the first time, deep neural networks, big datasets, and GPU computing came together with groundbreaking results. Each was essential to the other.

[Photo: the home computer with GPUs used to create AlexNet.]

Eventually, the AlexNet paper was presented at a 2012 computer vision conference. At the time, most researchers shrugged it off, but Yann LeCun - now recognized as a leading AI pioneer - immediately saw its significance, calling it a turning point for the field. His prediction proved accurate. After AlexNet's unveiling, neural networks quickly became the foundation of nearly all cutting-edge computer vision research.

AlexNet's breakthrough lay in demonstrating that training a relatively simple neural network could achieve unprecedented performance on highly complex tasks like image recognition. This marked the birth of the deep learning paradigm, in which machines master skills by ingesting and modeling vast datasets.

From that moment, progress accelerated rapidly. Neural networks evolved at an unprecedented pace, leading to milestones such as defeating human champions at Go, synthesizing realistic speech and music, and even generating original art and creative writing. However, the real turning point for generative AI came with the 2022 release of OpenAI's ChatGPT, which arguably represented the pinnacle of deep learning's evolution.

Understandably, open-sourcing such a historically significant piece of code was no simple task. The Computer History Museum had to navigate a five-year negotiation process with Krizhevsky, Hinton (who was at Google at the time), and Google's legal team before securing approval to publish the original source files.
[5]
The 2012 source code for AlexNet, the precursor to modern AI, is now on GitHub thanks to Google and the Computer History Museum
One of the first modern neural networks makes its way to GitHub.

AI is one of the biggest and most all-consuming zeitgeists I've ever seen in technology. I can't even search the internet without being served several ads about potential AI products, including the one that's still begging for permissions to run my devices. AI may be everywhere we look in 2025, but the kind of neural networks now associated with it are a bit older. This kind of AI was actually being dabbled with as far back as the 1950s, though it wasn't until 2012 that it kicked off the current generation of machine learning with AlexNet: an image recognition bot whose code has just been released as open source by Google and the Computer History Museum.

We've seen many different ideas of AI over the years, but generally the term is used in reference to computers or machines with self-learning capabilities. While the concept has been talked about by science-fiction writers since the 1800s, it's far from being fully realised. Today most of what we call AI refers to language models and machine learning, as opposed to unique individual thought or reasoning by a machine. This kind of deep learning technique essentially means feeding computers large sets of data to train them on specific tasks.

The idea of deep learning also isn't new. In the 1950s, researchers like Frank Rosenblatt at Cornell had already created a simplified machine learning neural network using foundational ideas similar to what we have today. Unfortunately the technology hadn't quite caught up to the idea, and it was largely rejected.

It wasn't until the 1980s that machine learning really came up once again. In 1986, Geoffrey Hinton, David Rumelhart, and Ronald J. Williams published a paper on backpropagation, an algorithm that adjusts the weights of a neural network based on the error, or cost, of its output. They weren't the first to raise the idea, but rather the first to popularise it. Backpropagation as an idea for machine learning was raised by several researchers, including Frank Rosenblatt, as early as the '60s, but couldn't really be implemented. Many also credit it as a machine learning implementation of the chain rule, for which the earliest written attribution is to Gottfried Wilhelm Leibniz in 1676.

Despite promising results, the technology wasn't quite up to the speed required to make this kind of deep learning viable. To bring AI up to the level we see today, researchers needed far more data to train networks on, and much greater computational power to process it.

In 2006, professor Fei-Fei Li at Stanford University began building ImageNet. Li envisioned a database that held an image for every English noun, so she and her students began collecting and categorising photographs. They used WordNet, an established collection of words and their relationships, to label the images. The task was so huge it was eventually outsourced to freelancers, until it was completed in 2009 as by far the largest dataset of its kind.

It was around the same time Nvidia was working on the CUDA programming system for its GPUs. This is the company which just went hard on AI at 2025's GTC, and is even using the tech to help people learn sign language. With CUDA, these powerful compute chips could be far more easily programmed to tackle things other than just visual graphics. This allowed researchers to start implementing neural networks in areas like speech recognition, and actually see success.
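To give a sense of what CUDA unlocked, here is a minimal sketch of the kind of parallel matrix arithmetic neural-network training leans on, written with the modern PyTorch library rather than the hand-written CUDA kernels Krizhevsky used; the sizes are illustrative:

```python
import torch

# Sketch of the workload GPUs accelerate: large matrix multiplications, the
# core arithmetic of neural-network training. PyTorch dispatches the same
# operation to CUDA kernels on an NVIDIA GPU when one is available. Sizes
# here are illustrative only.

a = torch.randn(4096, 4096)
b = torch.randn(4096, 4096)

c_cpu = a @ b                      # runs serially-ish on the CPU

if torch.cuda.is_available():      # same multiplication, in parallel on the GPU
    c_gpu = (a.cuda() @ b.cuda()).cpu()
    print("max difference:", (c_cpu - c_gpu).abs().max().item())
```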
In 2011, two of Geoffrey Hinton's students, Ilya Sutskever (who went on to co-found OpenAI) and Alex Krizhevsky, began work on what would become AlexNet. Sutskever saw the potential from their previous work and convinced his peer Krizhevsky to use his knack for squeezing performance out of GPUs to train this neural network, while Hinton acted as principal investigator. Over the next year Krizhevsky trained, tweaked, and retrained the system on a single computer using two Nvidia GPUs with his own CUDA code. In 2012 the three released a paper, which Krizhevsky presented at a computer vision conference in Florence. Hinton summarised the experience to CHM as "Ilya thought we should do it, Alex made it work, and I got the Nobel Prize."

It didn't make much noise at the time, but AlexNet completely changed the direction of modern AI. Before AlexNet, neural networks weren't commonplace in these developments. Now, they're the framework for most anything touting the name AI, from robot dogs with nervous systems to miracle-working headsets. As computers get more powerful, we're only set to see even more of it.

Given how huge AlexNet has been for AI, CHM releasing the source code is not only a wonderful nod, but also quite prudent in making sure this information is freely available to all. To ensure it was done fairly, correctly -- and above all legally -- CHM reached out to AlexNet's namesake, Alex Krizhevsky, who put them in touch with Hinton, who was working at Google after it acquired the team's startup. Hinton, now considered one of the fathers of machine learning, was able to connect CHM to the right team at Google, which began a five-year negotiation process before release.

This may mean the code, available to all on GitHub, is a somewhat sanitised version of AlexNet, but it's also the correct one. There are several repositories with similar or even the same name around, but they're likely to be homages or interpretations. This upload is described as the "AlexNet source code as it was in 2012," so it should serve as an interesting marker along the pathway to AI, and whatever form it learns to take in the future.
Google and the Computer History Museum have released the source code for AlexNet, the neural network that revolutionized AI in 2012 by proving the effectiveness of deep learning in image recognition tasks.
In 2012, a groundbreaking artificial intelligence model called AlexNet emerged, transforming the field of computer vision and kickstarting the modern AI boom. Developed by University of Toronto graduate students Alex Krizhevsky and Ilya Sutskever, along with their advisor Geoffrey Hinton, AlexNet demonstrated unprecedented accuracy in image recognition tasks [1][2].
AlexNet's success was the result of three key components coming together:
Deep Neural Networks: Building on decades of theoretical work, AlexNet utilized a multi-layered convolutional neural network (CNN) architecture (see the sketch after this list) [1][3].
Big Data: The model was trained on ImageNet, a massive dataset of labeled images created by Stanford professor Fei-Fei Li [2][3].
GPU Computing: AlexNet leveraged the parallel processing power of NVIDIA graphics cards, using CUDA technology to accelerate training [1][3].
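As a rough illustration of how these pieces fit together, here is a minimal sketch of an AlexNet-style network in modern PyTorch. The layer shapes follow the 2012 paper's broad outline (five convolutional layers, three fully connected), but this is an illustration, not the released CUDA/C++/Python source:

```python
import torch
from torch import nn

# A minimal sketch of an AlexNet-style convolutional network. Early layers
# learn simple features (edges, textures); deeper layers learn object parts;
# the final layer emits one score per ImageNet class.

alexnet_like = nn.Sequential(
    nn.Conv2d(3, 96, kernel_size=11, stride=4), nn.ReLU(),
    nn.MaxPool2d(3, stride=2),
    nn.Conv2d(96, 256, kernel_size=5, padding=2), nn.ReLU(),
    nn.MaxPool2d(3, stride=2),
    nn.Conv2d(256, 384, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 384, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 256, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(3, stride=2),
    nn.Flatten(),
    nn.Linear(256 * 5 * 5, 4096), nn.ReLU(), nn.Dropout(),
    nn.Linear(4096, 4096), nn.ReLU(), nn.Dropout(),
    nn.Linear(4096, 1000),          # one output per ImageNet class
)

# One 224x224 RGB image in, 1,000 class scores out.
scores = alexnet_like(torch.randn(1, 3, 224, 224))
print(scores.shape)                 # torch.Size([1, 1000])
```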
AlexNet's breakthrough moment came at the 2012 ImageNet competition, where it achieved a 15.3% error rate in image classification, nearly 11 percentage points better than the closest competitor [3]. This performance gap stunned the AI community, with computer vision expert Yann LeCun declaring it "an unequivocal turning point in the history of computer vision" [1].
The success of AlexNet sparked a revolution in AI research and applications:
Widespread Adoption: Neural networks quickly became the dominant approach in computer vision and other AI domains [3].
Industry Transformation: Deep learning techniques spread to various fields, including speech recognition, natural language processing, and autonomous vehicles [1].
Tech Giants Take Notice: Google acquired the team's startup, DNNresearch, in 2013 [2][3].
In March 2025, Google and the Computer History Museum jointly released AlexNet's original source code, making it available on GitHub [1][2]. This release came after five years of negotiations, spearheaded by CHM curator Hansen Hsu [3].
The AlexNet source code, a mere 200KB in size, combines Nvidia CUDA code, Python scripts, and a small amount of C++ to define and train the convolutional neural network [3].
AlexNet's success paved the way for rapid advancements in AI, culminating in modern language models like ChatGPT. Ilya Sutskever, who co-founded OpenAI, continued to push the boundaries of neural network scaling, directly influencing the development of GPT models [3].
As AI continues to evolve and impact various aspects of society, the release of AlexNet's source code serves as a valuable historical artifact, offering insights into the foundations of the ongoing AI revolution [1][2].