Curated by THEOUTPOST
On Tue, 25 Mar, 4:02 PM UTC
2 Sources
[1]
Software Engineer Runs Generative AI on 20-Year-Old PowerBook G4
In a blog post this week, software engineer Andrew Rossignol (my brother!) detailed how he managed to run generative AI on an old PowerBook G4. While hardware requirements for large language models (LLMs) are typically high, this particular PowerBook G4 model from 2005 is equipped with a mere 1.5GHz PowerPC G4 processor and 1GB of RAM. Despite this 20-year-old hardware, my brother was able to run inference with Meta's Llama 2 model on the laptop. The experiment involved porting the open-source llama2.c project and then accelerating performance with the PowerPC vector extension AltiVec.
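To give a sense of what an AltiVec speed-up looks like in practice, here is a minimal sketch of a vectorized dot product, the kind of inner loop that dominates llama2.c's matrix multiplications. This is not Rossignol's code; the function name and compile flags are assumptions, and the inputs are assumed to be 16-byte aligned as AltiVec loads require. Build with `-maltivec` on GCC (or `-faltivec` on Apple's older GCC).

/* Illustrative AltiVec dot product: processes 4 floats per iteration
 * using the G4's 128-bit vector unit. Not taken from Rossignol's fork. */
#include <altivec.h>
#include <stddef.h>

float dot_product_altivec(const float *a, const float *b, size_t n) {
    /* Assumes a and b are 16-byte aligned; vec_ld ignores the low address bits. */
    vector float acc = (vector float){0.0f, 0.0f, 0.0f, 0.0f};
    size_t i = 0;

    /* Fused multiply-add over 4 floats at a time. */
    for (; i + 4 <= n; i += 4) {
        vector float va = vec_ld(0, (float *)(a + i));
        vector float vb = vec_ld(0, (float *)(b + i));
        acc = vec_madd(va, vb, acc);
    }

    /* Horizontal sum of the four partial accumulators. */
    float partial[4] __attribute__((aligned(16)));
    vec_st(acc, 0, partial);
    float sum = partial[0] + partial[1] + partial[2] + partial[3];

    /* Scalar tail for lengths that are not a multiple of 4. */
    for (; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}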
[2]
A PowerBook G4 can run an LLM -- but not fast enough to be practical
A PowerBook G4 running a TinyStories 110M Llama2 LLM inference -- Image credit: Andrew Rossignol/TheResistorNetwork

A software developer has proven it is possible to run a modern LLM on old hardware like a 2005 PowerBook G4, albeit nowhere near the speeds expected by consumers.

Most artificial intelligence projects, such as the constant push for Apple Intelligence, lean on having a powerful enough device to handle queries locally. This has meant that newer computers and processors, such as the latest A-series chips in the iPhone 16 generation, tend to be used for AI applications, simply because they have enough performance for it to work.

In a blog post published on Monday by The Resistor Network, Andrew Rossignol -- the brother of Joe Rossignol at MacRumors -- writes about his work getting a modern large language model (LLM) to run on older hardware. What was available to him was a 2005 PowerBook G4, equipped with a 1.5GHz processor, a gigabyte of memory, and architecture limitations such as a 32-bit address space.

After checking out the llama2.c project, which implements Llama2 LLM inference in a single vanilla C file with no accelerators, Rossignol forked the core of the project to make some improvements. Those improvements included wrappers for system functions, organizing the code into a library with a public API, and eventually porting the project to run on a PowerPC Mac. This last step ran into issues with the G4's "big-endian" processor, since the model checkpoint and tokenizer files expect "little-endian" byte ordering.

The llama2.c project recommends the TinyStories models, which were chosen to maximize the chance of usable output without specialized hardware acceleration such as a modern GPU. Testing was mostly done with the 15 million-parameter (15M) variant of the model before switching to the 110M version, as anything larger would be too big for the address space. More parameters generally make a model more capable, so the aim is to use as many as possible for an accurate response without sacrificing the speed of that response. Given the severe constraints of the project, it was a case of choosing models small enough to be usable.

To gauge the PowerBook G4's performance, the project was benchmarked against a single Intel Xeon Silver 4216 core clocked at 3.2GHz. The Xeon core completed a test query in 26.5 seconds, at 6.91 tokens per second. Running the same code on the PowerBook G4 worked, but at a much slower rate of around four minutes per query, roughly nine times slower than the single Xeon core. Further optimizations, including the use of vector extensions like AltiVec, shaved another half a minute off the inference operation, leaving the PowerBook G4 only about eight times slower.

The selected models proved capable of producing "whimsical children's stories," which helped lighten the mood during debugging.

It seems unlikely that much more performance can be squeezed out of the test hardware, due to limitations like its 32-bit architecture and a maximum of 4GB of addressable memory. Quantization could help, but there is too little address space for it to be usable.

Admitting that the project probably stops at this point for the moment, Rossignol offers that the project "has been a great way to get my toes wet with LLMs and how they operate."
He also adds that "it is fairly impressive that a computer which is 15 years junior [to the Xeon] can do this at all." This demonstration of older hardware running a modern LLM gives hope that older machines could be brought out of retirement and still be used with AI, keeping in mind that cutting-edge software will run with limitations and at considerably slower speeds than on modern hardware. Short of the discovery of extreme optimizations that minimize the processing requirements, those working on LLMs and AI in general will still have to keep buying modern hardware for the task.
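The byte-ordering issue described above comes down to the fact that llama2.c checkpoints and tokenizer files store 32-bit values little-endian, while the PowerPC G4 is big-endian. Below is a minimal sketch, not taken from Rossignol's port, of how such values can be byte-swapped as they are read; the helper names are assumptions, and a real port would guard the swap behind an endianness check so the same code still works on little-endian hosts.

/* Illustrative big-endian load path for little-endian float32 checkpoint data. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Reverse the byte order of a 32-bit word. */
static uint32_t bswap32(uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

/* Read `count` little-endian floats from a checkpoint file into `dst`,
 * converting them to the big-endian host's byte order. Returns 0 on success. */
static int read_le_floats(FILE *f, float *dst, size_t count) {
    if (fread(dst, sizeof(float), count, f) != count)
        return -1;
    for (size_t i = 0; i < count; i++) {
        uint32_t bits;
        memcpy(&bits, &dst[i], sizeof(bits));  /* reinterpret float as raw bits */
        bits = bswap32(bits);                  /* swap LE -> BE */
        memcpy(&dst[i], &bits, sizeof(bits));
    }
    return 0;
}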
A software engineer successfully ran a modern large language model (LLM) on a 2005 PowerBook G4, demonstrating the potential for AI to operate on older hardware, albeit with significant performance limitations.
In a remarkable demonstration of artificial intelligence's adaptability, software engineer Andrew Rossignol has successfully run a generative AI model on a 20-year-old PowerBook G4. This experiment, detailed in a recent blog post, pushes the boundaries of what's possible with older hardware in the age of AI 1.
Rossignol's project involved running Meta's Llama 2 LLM on a 2005 PowerBook G4, equipped with a 1.5GHz PowerPC G4 processor and 1GB of RAM. This hardware, considered antiquated by today's standards, presents significant challenges for running modern AI models 1.
The experiment utilized the open-source llama2.c project, which implements Llama2 LLM inference using a single vanilla C file. Rossignol made several improvements to the project, including:
- adding wrappers for system functions,
- organizing the code into a library with a public API (a hypothetical sketch of such an interface follows below), and
- porting the project to run on a PowerPC Mac 2.
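For illustration only, here is a hypothetical sketch of what a small public API wrapping the llama2.c inference loop might look like. None of these names come from Rossignol's fork; they simply show the idea of hiding the model state behind an opaque handle.

/* Hypothetical library interface; names and signatures are assumptions. */
#ifndef LLM_INFER_H
#define LLM_INFER_H

#include <stddef.h>

typedef struct llm_ctx llm_ctx;   /* opaque handle to a loaded model */

/* Load a model checkpoint and tokenizer; returns NULL on failure. */
llm_ctx *llm_open(const char *checkpoint_path, const char *tokenizer_path);

/* Generate up to max_tokens of text from a prompt, writing into out. */
int llm_generate(llm_ctx *ctx, const char *prompt,
                 char *out, size_t out_len, int max_tokens);

/* Release all resources associated with the model. */
void llm_close(llm_ctx *ctx);

#endif /* LLM_INFER_H */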
One of the main hurdles was the PowerBook G4's "big-endian" processor architecture, which conflicted with the "little-endian" expectations of the model checkpoint and tokenizers. Rossignol had to address these byte ordering issues to make the project functional 2.
To assess the PowerBook G4's performance, Rossignol compared it to a single Intel Xeon Silver 4216 core clocked at 3.2GHz: the Xeon core completed a test query in 26.5 seconds, at 6.91 tokens per second, while the PowerBook G4 took around four minutes, roughly nine times slower 2.
Further optimization, including the use of vector extensions like AltiVec, shaved about another half minute off the inference time, leaving the PowerBook G4 roughly eight times slower than the Xeon core 2.
Due to hardware constraints, Rossignol used the TinyStories model, focusing on the 15 million-parameter (15M) and 110 million-parameter (110M) variants. The 32-bit address space of the PowerBook G4 limited the use of larger models 2.
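To make the address-space constraint concrete, here is a back-of-the-envelope calculation (mine, not Rossignol's): fp32 weights take 4 bytes per parameter, and a 32-bit process can map at most 4GB in total, shared between weights, activations, and the program itself. The 1B figure below is a hypothetical comparison point, not a model he tested.

/* Rough memory-footprint check for fp32 weights under a 32-bit address space. */
#include <stdio.h>

int main(void) {
    const double params[] = {15e6, 110e6, 1e9};   /* 15M, 110M, hypothetical 1B */
    for (int i = 0; i < 3; i++) {
        double mb = params[i] * 4.0 / (1024.0 * 1024.0);
        printf("%5.0fM parameters -> about %5.0f MB of fp32 weights\n",
               params[i] / 1e6, mb);
    }
    /* 15M  ->  ~57 MB and 110M -> ~420 MB both fit comfortably.
     * 1B   -> ~3.8 GB of weights alone, leaving essentially nothing of the
     * 4 GB address space for activations or the rest of the program. */
    return 0;
}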
While the experiment proves that older hardware can run modern LLMs, the performance is far from practical for everyday use. However, this demonstration opens up possibilities for repurposing older devices for AI applications, albeit with limitations 2.
Rossignol acknowledges that further significant improvements are unlikely due to hardware limitations. Nevertheless, he views the project as a valuable learning experience in understanding LLMs and their operations 2.
As AI continues to evolve, this experiment highlights the potential for broader hardware compatibility. However, it also underscores the need for modern, powerful hardware to run cutting-edge AI applications efficiently.
Reference
Innovative developers have successfully adapted Meta's Llama 2 AI model to run on outdated hardware, including a Windows 98 Pentium II PC and an Xbox 360 console, showcasing the potential for AI accessibility on diverse platforms.
2 Sources
Apple's latest Mac Studio, featuring M4 Max and M3 Ultra chips, offers unprecedented power and AI capabilities, challenging even dedicated Windows users to reconsider their preferences.
10 Sources
An exploration of the growing trend of running powerful AI models like DeepSeek R1 locally on personal computers, highlighting the benefits, challenges, and implications for privacy and accessibility.
7 Sources
Apple and NVIDIA have joined forces to integrate the ReDrafter technique into NVIDIA's TensorRT-LLM framework, significantly improving the speed and efficiency of large language models.
3 Sources
Meta has released Llama 3, an open-source AI model that can run on smartphones. This new version includes vision capabilities and is freely accessible, marking a significant step in AI democratization.
3 Sources
The Outpost is a comprehensive collection of curated artificial intelligence software tools that cater to the needs of small business owners, bloggers, artists, musicians, entrepreneurs, marketers, writers, and researchers.
© 2025 TheOutpost.AI All rights reserved