On May 3rd, 2024, the world experienced a collective moment of panic as ChatGPT and Gemini, two of the most widely used AI language models, went down for over two hours. This outage served as a stark reminder of how deeply we have become reliant on these generative AI (GenAI) tools in our daily lives and work.
As we scrambled to find alternative solutions, many of us were left feeling frustrated and helpless. It was during this time that I remembered the recent release of Llama 3, an open-source large language model developed by Meta. Unlike the proprietary models of ChatGPT and Gemini, Llama 3 can be run locally on your computer, providing a level of independence and control that is increasingly valuable in our rapidly evolving technological landscape.
Prerequisites
You'll need a few things installed on your computer:
- Node.js
- npm
- Llama 3 (installed via Ollama, covered below)
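To confirm that Node.js and npm are set up, you can check their versions from a terminal (any recent LTS release of Node.js should be fine for the steps below):
node -v
npm -v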
Meta releasing their LLMs as open source is a net benefit for the tech community at large, and their permissive license allows most small and medium-sized businesses to use the models with little to no restriction (within the bounds of the law, of course). Their latest release, Llama 3, has been highly anticipated.
Llama 3 comes in two sizes: 8 billion and 70 billion parameters. Models like these are trained on a massive amount of text data and can be used for a variety of tasks, including generating text, translating languages, writing different kinds of creative content, and answering your questions in an informative way. Meta touts Llama 3 as one of the best open models available, though it is still under active development, and has published benchmarks comparing the 8B model against Mistral and Gemma.
This raises the question: how can a regular individual run these models locally on their own computer?
Getting Started with Ollama
That’s where Ollama comes in! Ollama is a free and open-source application that allows you to run various large language models, including Llama 3, on your own computer, even with limited resources. Ollama takes advantage of the performance gains of llama.cpp, an open-source library designed to allow you to run LLMs locally with relatively low hardware requirements. It also includes a sort of package manager, allowing you to download and use LLMs quickly and effectively with just a single command.
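For example, beyond run, a few standard Ollama commands cover the usual package-manager workflow:
# Download a model without starting a chat session
ollama pull llama3
# List the models installed locally (with their sizes)
ollama list
# Remove a model you no longer need
ollama rm llama3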
The first step is installing Ollama, which supports all three major operating systems (macOS, Linux, and Windows).
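On Linux, Ollama provides a documented one-line installer; macOS and Windows users can download the installer from ollama.com instead:
curl -fsSL https://ollama.com/install.sh | sh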
Once this is installed, open up your terminal. On all platforms, the command is the same.
ollama run llama3
Wait a few minutes while it downloads and loads the model, and then start chatting! It should bring you to a chat prompt similar to this one.
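Incidentally, Ollama also exposes a local HTTP API on port 11434 (the same API the web UI below relies on). You can try it directly with curl via the documented generate endpoint:
curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?",
  "stream": false
}'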
You can chat all day within this terminal chat, but what if you want something more ChatGPT-like?
ChatBot WebUI
First, we have to clone the chatbot-ollama repo to your local machine. You can use this command to do so:
git clone https://github.com/ivanfioravanti/chatbot-ollama.git
Now cd into that repo and install all the packages with this command:
npm ci
Now we have to create a .env file, where we specify the address where Ollama is running with the llama3 model.
# Chatbot Ollama
DEFAULT_MODEL="llama3:latest"
NEXT_PUBLIC_DEFAULT_SYSTEM_PROMPT=""
OLLAMA_HOST="http://127.0.0.1:11434"
Paste this into the .env file in the root folder.
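Before starting the UI, it's worth confirming that Ollama is actually reachable at the host you configured; this standard Ollama endpoint lists the locally installed models:
curl http://127.0.0.1:11434/api/tags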
Now that everything is set, let's start the chatbot application by running this command:
npm run dev
It will take a few seconds to start, and then you can visit localhost:3000 in your browser. Make sure no other application is running on that port.
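If port 3000 is already taken, chatbot-ollama is a Next.js app, so you can pass an alternate port to the dev server (assuming the repo uses the standard next dev script):
npm run dev -- -p 3001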
Nice! As you can see, you now have your own AI chatbot, just like ChatGPT.
For GPU Users
I'm running this on my Mac laptop, but if you have a PC with a powerful Nvidia or AMD GPU, I highly recommend downloading the 70-billion-parameter Llama 3 model, which gives noticeably better answers.
You have to run this command:
ollama run llama3:70b
The model file is about 40 GB, so it will take a while to download, but it performs far better than the roughly 4 GB 8B model.
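While the 70B model is answering, you can check that Ollama is actually using your GPU; on an Nvidia card, nvidia-smi shows GPU memory usage and utilization:
nvidia-smi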
Conclusion
The ability to run powerful AI models locally has become increasingly valuable. With Ollama and Llama 3, you can ensure that you have access to the tools you need, whenever you need them, without being at the mercy of external service providers.