
LLM

Development Guide

Step 1: Download the Model from Hugging Face

Make sure you have Git LFS installed before cloning the model.

git lfs install
cd ./LLM/Models
# Here we are downloading the Meta-Llama-3-8B-Instruct model
git clone https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct

You will be asked for a username and password. Use your Hugging Face username as the username and a Hugging Face API token as the password.
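If the clone finishes suspiciously fast, the large weight files may still be Git LFS pointers rather than real downloads. A quick sanity check (a sketch, assuming the clone above landed in ./LLM/Models):

```shell
cd ./LLM/Models/Meta-Llama-3-8B-Instruct

# List the files tracked by Git LFS; each entry shows whether it was downloaded
git lfs ls-files

# Fetch any objects that are still pointers (a no-op if everything is present)
git lfs pull

# The weight shards should be several gigabytes, not a few hundred bytes
du -sh ./*
```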

Step 2: Install Docker

Install Docker and Docker Compose

sudo apt-get update
sudo curl -sSL https://get.docker.com/ | sh  

Install Rootless Docker

sudo apt-get install -y uidmap
dockerd-rootless-setuptool.sh install

See if the installation works

docker --version
docker ps 

# You should see no containers running, but you should not see any errors. 
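With a rootless install, the Docker client talks to a per-user daemon over a user socket. If docker ps reports a connection error, point the client at the rootless socket (a sketch, assuming the default runtime directory):

```shell
# Point the Docker client at the rootless daemon's socket;
# id -u is your numeric user id (e.g. 1000)
export DOCKER_HOST=unix:///run/user/$(id -u)/docker.sock
echo "$DOCKER_HOST"
```

Add the export line to your shell profile to make it permanent.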

Step 3: Install NVIDIA drivers on the machine to use the GPU
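On Ubuntu, one way to do this is to install the driver with ubuntu-drivers and then add the NVIDIA Container Toolkit so Docker can pass the GPU through. Package names and repository URLs change over time, so treat this as a sketch and check NVIDIA's install docs for your distribution:

```shell
# Install the recommended NVIDIA driver, then reboot
sudo ubuntu-drivers autoinstall

# After rebooting, the driver should report the GPU
nvidia-smi

# Add NVIDIA's apt repository and install the Container Toolkit
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey \
  | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list \
  | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' \
  | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit

# Let Docker know about the NVIDIA runtime, then restart the daemon
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```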

Step 4: Run a test workload to verify that Docker can access the GPU

docker run --rm -it --gpus=all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark

If the benchmark runs successfully, the machine is configured to use the GPU with Docker.

Build

  • Download models from Meta.
  • Once the models are downloaded, place them in the LLM/Models folder. Make sure you also place tokenizer.model and tokenizer_checklist.chk in the same folder.
  • Edit the Dockerfile to set the MODEL_NAME variable to the name of the model folder.
  • Docker build
npm run build-ai
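As an example, if you downloaded Meta-Llama-3-8B-Instruct in Step 1, the relevant line in the Dockerfile might look like the following (the exact variable mechanism depends on how this repository's Dockerfile consumes it, so treat this as a sketch):

```dockerfile
# Hypothetical value: folder name under Models/ containing the downloaded weights
ENV MODEL_NAME=Meta-Llama-3-8B-Instruct
```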

Run

npm run start-ai    

After you start it, run nvidia-smi to confirm that the GPU is being used. You should see the Python process running on the GPU.
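To keep an eye on GPU utilization while the service runs (assuming nvidia-smi is on the PATH):

```shell
# Refresh the full GPU report every second; look for the python process
watch -n 1 nvidia-smi

# Or query just utilization and memory once, in CSV form
nvidia-smi --query-gpu=utilization.gpu,memory.used --format=csv
```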