Running Llama LLM locally without a GPU

A few mornings (months!) ago, I read that Meta had made LLaMA available to the public and that you could host the LLMs on your own machine. I had always associated AI with GPUs, and GPUs are expensive. So the extent of my understanding of AI was that it was expensive, while ChatGPT is free.

Then, out of sheer curiosity, I decided to find out whether I could run a local LLM on my i5, 16 GB Windows machine. Voila: it turns out I could, in less than half an hour. Trust me, I was neck-deep in work when I decided to take this route.

So there you go: a step-by-step (no coding required!) way to run the LLaMA 3.2:1b model on a basic Windows machine. I'm not sure whether anything bigger than this model will work on such a configuration, but LLaMA 3.2:1b does.

Step 1: Installing Docker Desktop 

If you don’t have Docker Desktop, install it. It’s just a few “Next” clicks and one “Install” button. You can download it from here—just choose your desktop configuration, download, and install.

Now, Docker Desktop needs the Windows Subsystem for Linux (WSL) installed. Do not worry! Just run the command below in Command Prompt (press Windows+R, type cmd, and hit Enter). You can do this step before or after you install Docker Desktop; you might want to restart the system afterwards as well.

wsl.exe --update 

Now, if you want to know what Docker is, try ChatGPT — it can give you detailed explanations.

Also, you’ll need Docker Desktop only to run Open WebUI, which gives you an interface similar to ChatGPT for interacting with your LLM.

The latest versions of Docker come with a lot of AI goodies. More on that later!


Step 2: Installing Ollama 

Now, Ollama is a tool that lets you run large language models on your local machine.
You can download and install it from here. Just choose your OS and follow the steps — the installation takes just a few clicks.

Step 3: Installing Llama 3.2:1b 

Once Ollama is installed, open up Command Prompt again and run this command:

ollama run llama3.2:1b

This step may take some time, depending on your internet speed; it took me about 30 minutes to download and install. Once everything is done (and you'll know when it is), run the following command in your Command Prompt:

ollama list

Once you execute this command, you will see a table listing the model's name, ID, size, and when it was last modified.

This means your Llama 3.2:1b is installed on your local machine.

Try it out by typing this command:

ollama run llama3.2:1b

You should see an interactive prompt where you can start typing your questions.

Now your local LLaMA is up and running — and you can ask it all the questions you want. This model is lightweight and fast, but don’t expect miracles from it. It can struggle with complex prompts and larger token sizes (more on that later). Still, it’s a great way to get a feel for how local LLMs work!
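By the way, you don't have to stay in the terminal. Ollama also exposes a local REST API (on port 11434 by default), so you can script your prompts. Here's a minimal Python sketch, assuming a default Ollama install with llama3.2:1b pulled as above; the build_request helper just assembles the JSON payload for the /api/generate endpoint.

```python
import json
import urllib.request

# Default Ollama endpoint on a local install
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(model: str, prompt: str) -> dict:
    """Assemble the JSON payload for Ollama's /api/generate endpoint."""
    # stream=False asks for one complete JSON response
    # instead of a stream of token chunks.
    return {"model": model, "prompt": prompt, "stream": False}

def ask(prompt: str, model: str = "llama3.2:1b") -> str:
    """Send a prompt to the local Ollama server and return its reply."""
    payload = json.dumps(build_request(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    # Requires Ollama to be running locally
    print(ask("In one sentence, what is a llama?"))
```

Nothing fancy, just the standard library; the same payload works from curl or any other language too.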

Step 4: Installing Open WebUI

This step is important if you want a web interface—similar to what ChatGPT offers—on your local machine. With Open WebUI, you can also create your own workspaces, attach proprietary knowledge, and get responses based on that data.

More on this in the future!

Since you have Docker installed and LLaMA running on your system, open the Docker terminal to run the next command. In the bottom right corner of the Docker Desktop app, you’ll see a button labeled ">_ Terminal". Click on it — this will open the Docker CLI.

Then, just paste the entire command there and hit Enter.

docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main

In short: -d runs the container in the background, -p 3000:8080 maps port 3000 on your machine to port 8080 inside the container, --add-host lets the container reach the Ollama server running on your host, -v keeps your chats and settings in a Docker volume so they survive restarts, and --restart always brings Open WebUI back up whenever Docker starts.

Wait a few seconds, and you will see the open-webui container appear in Docker Desktop under the "Containers" tab.
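If you'd rather verify things from a script than by clicking around, here's a small sketch using only Python's standard library. It checks that both services are listening, assuming the default ports from the steps above: 11434 for Ollama and 3000 for Open WebUI.

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    # Default ports: Ollama's API and Open WebUI's mapped host port
    print("Ollama reachable:", port_open("localhost", 11434))
    print("Open WebUI reachable:", port_open("localhost", 3000))
```

If either check prints False, the corresponding service isn't running (or you mapped a different port in the docker run command).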


You’ll see “Port(s)” listed as 3000:8080 — click on that, and it will take you to the home page of Open WebUI.

Sign up as an Admin using any email and password (it doesn’t matter which, as long as you use the same credentials to log in later).

Once you're logged in, you’ll see LLaMA 3.2:1b displayed on the top left corner of the UI.

And that’s it — you’re now a proud user of your own LLM running right on your desktop!

If you have any questions, feel free to reach out to me! Happy prompting!! 


