Run Local LLMs on Hardware from $50 to $50,000 - We Test and Compare!

Dave tests Llama 3.1 and Llama 3.2 using Ollama on a Raspberry Pi, a Herk Orion Mini PC, a Threadripper 3970X, an M2 Mac Pro, and a Threadripper 7995WX with an Nvidia 6000 Ada GPU. See the results live! Check out my book on the autism spectrum! https://amzn.to/4elzfQv

I’m doing something that’s never been done before: we’re going to run a ChatGPT-style large language model locally on a wide range of hardware, from a $50 Raspberry Pi 4 all the way up to a $50,000 Dell AI workstation with dual Nvidia 6000 Ada cards.

If you saw my last video, you know I caught some heat for using top-tier hardware and running everything in WSL on Windows. This time, I’m doing things differently. We’re starting small and budget-friendly—no Linux shenanigans—and testing out local inference on everything from a Pi to a high-performance mini PC, a gaming rig, an M2 Mac Pro, and, of course, that beast of a Dell workstation.

Along the way, I’ll show you how to install Ollama on Windows, and we’ll compare how well each machine can handle models like Llama 3.1 and even the monstrous 405-billion-parameter model. Which system will shine? Which one will falter? Can a Raspberry Pi even handle a large language model at all? And what happens when we push the $50,000 workstation to its limit?
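If you want to follow along at home, the basic workflow is short. Here’s a minimal sketch, assuming the standard Windows installer from https://ollama.com and Ollama’s published model tags (llama3.1, llama3.2, and the enormous llama3.1:405b, which is only practical on the big iron):

ollama pull llama3.1
ollama run llama3.1 "Why is the sky blue?"
ollama run llama3.1 --verbose

The --verbose flag prints timing stats, including tokens per second, which is how machines like these can be compared apples to apples.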

If you’ve ever wondered what it takes to run a large language model locally, or just want to see how different hardware stacks up, this episode is for you! Be sure to stick around to the end for some surprising results.

💻 Hardware tested in this episode:

Raspberry Pi 4 (8GB RAM)
Herk Orion Mini PC (Ryzen 9 7940HS)
Desktop Gaming PC (Threadripper 3970X & Nvidia 4080)
Apple M2 Mac Pro
Dell Threadripper Workstation (96 cores & Nvidia 6000 Ada)

Check out Dave’s Attic for behind-the-scenes Q&A on episodes like this one!
http://youtube.com/@UCtb6a_CnmGbSns9G8W2Ny0w

Follow me on Facebook for daily updates!
http://fb.com/davepl
Twitter: @davepl1968