I have a Dell XPS with 64GB of RAM and an NVIDIA GPU, and I use the NVIDIA Container Toolkit with Docker.
I want to try an AI code generation stack built from commercial open-source tools: Cursor, CodeGPT, and Ollama hosting DeepSeek R1.
Using this CodeGPT founder's X post as inspiration.
Download AppImage
cursor-0.44.11-build-250103fqxdt5u9z-x86_64.AppImage
Install Cursor
From the Cursor website
Work around an issue in Electron, upstream of Cursor, by relaxing AppArmor's unprivileged user namespace restriction:
sudo sysctl -w kernel.apparmor_restrict_unprivileged_userns=0
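Note that sysctl -w does not survive a reboot. A minimal sketch for persisting the setting via an /etc/sysctl.d drop-in (the file name here is arbitrary):

# Persist the AppArmor user namespace setting across reboots
echo "kernel.apparmor_restrict_unprivileged_userns = 0" | sudo tee /etc/sysctl.d/60-apparmor-userns.conf
sudo sysctl --system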
Install prerequisites
sudo apt install libfuse-dev
Make AppImage executable
chmod +x cursor-0.44.11-build-250103fqxdt5u9z-x86_64.AppImage
Run Cursor AppImage
./cursor-0.44.11-build-250103fqxdt5u9z-x86_64.AppImage
Sign up or sign in using Google, GitHub, or email.
Open Project
Install CodeGPT in Cursor
From the official CodeGPT installation docs.
Search for CodeGPT: Chat & AI Agents in the Extensions Marketplace search bar.
Cursor is essentially a fork of VS Code built on Electron, so VS Code extensions like CodeGPT install the same way.
Install Ollama
With Docker Compose I spin up a GPU-enabled Ollama container, using the following compose.yml:
networks:
  internet: {}
  data: {}

services:
  # throwaway service that just runs nvidia-smi to prove GPU passthrough works
  cuda:
    image: nvidia/cuda:12.3.1-base-ubuntu20.04
    command: nvidia-smi
    deploy:
      resources:
        reservations:
          devices:
            - capabilities: ["utility"]
              count: all

  store_ollama:
    volumes:
      # persist models and manifests on the host
      - ./ollama/ollama:/root/.ollama
    container_name: store-ollama
    pull_policy: always
    tty: true
    restart: unless-stopped
    image: ollama/ollama:latest
    ports:
      - 11434:11434
    environment:
      # keep loaded models in memory for 24 hours
      - OLLAMA_KEEP_ALIVE=24h
    networks:
      - data
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
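The deploy device reservations only work when the NVIDIA Container Toolkit is installed and registered as a Docker runtime. A quick check, plus the registration step if it is missing (this assumes the toolkit package itself is already installed):

# Confirm Docker knows about the NVIDIA runtime
docker info | grep -i nvidia

# If nothing shows up, register the runtime and restart Docker
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker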
Validate CUDA
We want to make sure CUDA and the NVIDIA GPU are visible inside the container.
01:54:08 niccolox@devekko devekko.store ±|master ✗|→ docker compose up cuda
WARN[0000] Found orphan containers ([ollama]) for this project. If you removed or renamed this service in your compose file, you can run this command with the --remove-orphans flag to clean it up.
[+] Running 1/1
✔ Container devekkostore-cuda-1 Created 0.0s
Attaching to cuda-1
cuda-1 | Fri Jan 24 21:54:25 2025
cuda-1 | +-----------------------------------------------------------------------------------------+
cuda-1 | | NVIDIA-SMI 550.120 Driver Version: 550.120 CUDA Version: N/A |
cuda-1 | |-----------------------------------------+------------------------+----------------------+
cuda-1 | | GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
cuda-1 | | Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
cuda-1 | | | | MIG M. |
cuda-1 | |=========================================+========================+======================|
cuda-1 | | 0 NVIDIA GeForce RTX 3050 ... Off | 00000000:01:00.0 Off | N/A |
cuda-1 | | N/A 48C P0 11W / 40W | 8MiB / 4096MiB | 0% Default |
cuda-1 | | | | N/A |
cuda-1 | +-----------------------------------------+------------------------+----------------------+
cuda-1 |
cuda-1 | +-----------------------------------------------------------------------------------------+
cuda-1 | | Processes: |
cuda-1 | | GPU GI CI PID Type Process name GPU Memory |
cuda-1 | | ID ID Usage |
cuda-1 | |=========================================================================================|
cuda-1 | +-----------------------------------------------------------------------------------------+
cuda-1 exited with code 0
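The same check can be done without Compose; a one-off sketch with the plain Docker CLI (the --gpus flag also relies on the NVIDIA Container Toolkit):

docker run --rm --gpus all nvidia/cuda:12.3.1-base-ubuntu20.04 nvidia-smi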
Ollama GPU
docker compose up store_ollama
We run without detached mode so we can watch the container's logs.
01:54:45 niccolox@devekko devekko.store ±|master ✗|→ docker compose up store_ollama
[+] Running 1/1
✔ store_ollama Pulled 1.0s
WARN[0001] Found orphan containers ([ollama]) for this project. If you removed or renamed this service in your compose file, you can run this command with the --remove-orphans flag to clean it up.
Attaching to store-ollama
store-ollama | 2025/01/24 21:56:29 routes.go:1187: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://0.0.0.0:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:24h0m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/root/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://*] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
store-ollama | time=2025-01-24T21:56:29.246Z level=INFO source=images.go:432 msg="total blobs: 0"
store-ollama | time=2025-01-24T21:56:29.246Z level=INFO source=images.go:439 msg="total unused blobs removed: 0"
store-ollama | [GIN-debug] [WARNING] Creating an Engine instance with the Logger and Recovery middleware already attached.
store-ollama |
store-ollama | [GIN-debug] [WARNING] Running in "debug" mode. Switch to "release" mode in production.
store-ollama | - using env: export GIN_MODE=release
store-ollama | - using code: gin.SetMode(gin.ReleaseMode)
store-ollama |
store-ollama | [GIN-debug] POST /api/pull --> github.com/ollama/ollama/server.(*Server).PullHandler-fm (5 handlers)
store-ollama | [GIN-debug] POST /api/generate --> github.com/ollama/ollama/server.(*Server).GenerateHandler-fm (5 handlers)
store-ollama | [GIN-debug] POST /api/chat --> github.com/ollama/ollama/server.(*Server).ChatHandler-fm (5 handlers)
store-ollama | [GIN-debug] POST /api/embed --> github.com/ollama/ollama/server.(*Server).EmbedHandler-fm (5 handlers)
store-ollama | [GIN-debug] POST /api/embeddings --> github.com/ollama/ollama/server.(*Server).EmbeddingsHandler-fm (5 handlers)
store-ollama | [GIN-debug] POST /api/create --> github.com/ollama/ollama/server.(*Server).CreateHandler-fm (5 handlers)
store-ollama | [GIN-debug] POST /api/push --> github.com/ollama/ollama/server.(*Server).PushHandler-fm (5 handlers)
store-ollama | [GIN-debug] POST /api/copy --> github.com/ollama/ollama/server.(*Server).CopyHandler-fm (5 handlers)
store-ollama | [GIN-debug] DELETE /api/delete --> github.com/ollama/ollama/server.(*Server).DeleteHandler-fm (5 handlers)
store-ollama | [GIN-debug] POST /api/show --> github.com/ollama/ollama/server.(*Server).ShowHandler-fm (5 handlers)
store-ollama | [GIN-debug] POST /api/blobs/:digest --> github.com/ollama/ollama/server.(*Server).CreateBlobHandler-fm (5 handlers)
store-ollama | [GIN-debug] HEAD /api/blobs/:digest --> github.com/ollama/ollama/server.(*Server).HeadBlobHandler-fm (5 handlers)
store-ollama | [GIN-debug] GET /api/ps --> github.com/ollama/ollama/server.(*Server).PsHandler-fm (5 handlers)
store-ollama | [GIN-debug] POST /v1/chat/completions --> github.com/ollama/ollama/server.(*Server).ChatHandler-fm (6 handlers)
store-ollama | [GIN-debug] POST /v1/completions --> github.com/ollama/ollama/server.(*Server).GenerateHandler-fm (6 handlers)
store-ollama | [GIN-debug] POST /v1/embeddings --> github.com/ollama/ollama/server.(*Server).EmbedHandler-fm (6 handlers)
store-ollama | [GIN-debug] GET /v1/models --> github.com/ollama/ollama/server.(*Server).ListHandler-fm (6 handlers)
store-ollama | [GIN-debug] GET /v1/models/:model --> github.com/ollama/ollama/server.(*Server).ShowHandler-fm (6 handlers)
store-ollama | [GIN-debug] GET / --> github.com/ollama/ollama/server.(*Server).GenerateRoutes.func1 (5 handlers)
store-ollama | [GIN-debug] GET /api/tags --> github.com/ollama/ollama/server.(*Server).ListHandler-fm (5 handlers)
store-ollama | [GIN-debug] GET /api/version --> github.com/ollama/ollama/server.(*Server).GenerateRoutes.func2 (5 handlers)
store-ollama | [GIN-debug] HEAD / --> github.com/ollama/ollama/server.(*Server).GenerateRoutes.func1 (5 handlers)
store-ollama | [GIN-debug] HEAD /api/tags --> github.com/ollama/ollama/server.(*Server).ListHandler-fm (5 handlers)
store-ollama | [GIN-debug] HEAD /api/version --> github.com/ollama/ollama/server.(*Server).GenerateRoutes.func2 (5 handlers)
store-ollama | time=2025-01-24T21:56:29.248Z level=INFO source=routes.go:1238 msg="Listening on [::]:11434 (version 0.5.7-0-ga420a45-dirty)"
store-ollama | time=2025-01-24T21:56:29.249Z level=INFO source=routes.go:1267 msg="Dynamic LLM libraries" runners="[cpu cpu_avx cpu_avx2 cuda_v11_avx cuda_v12_avx]"
store-ollama | time=2025-01-24T21:56:29.250Z level=INFO source=gpu.go:226 msg="looking for compatible GPUs"
store-ollama | time=2025-01-24T21:56:29.408Z level=INFO source=types.go:131 msg="inference compute" id=GPU-1cbfe056-fcc4-4ab2-c563-6fc44e92565a library=cuda variant=v12 compute=8.6 driver=12.4 name="NVIDIA GeForce RTX 3050 Ti Laptop GPU" total="3.8 GiB" available="3.7 GiB"
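With the server listening on port 11434, the API can be checked from the host; the /api/version and /api/tags routes are visible in the GIN route table above:

curl http://localhost:11434/api/version
curl http://localhost:11434/api/tags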
Detached mode would be
docker compose up store_ollama -d
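To follow the container logs while running detached:

docker compose logs -f store_ollama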
Configure CodeGPT with DeepSeek R1
In Cursor's top-left pane, open the pull-down icon, pin CodeGPT, and choose Freemium.
Sign up, authorize, and log in.
Pull the DeepSeek R1 model into Ollama
Choose LLM Local Models
Pull the model down via Cursor and the CodeGPT local LLM UI.
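If you prefer the terminal, the same pull can be done against the container directly. A sketch, assuming the deepseek-r1:8b tag (pick a distilled size that fits this GPU's 4 GB of VRAM):

# Pull inside the running container (the model tag is an assumption)
docker exec -it store-ollama ollama pull deepseek-r1:8b

# Or via the REST API; the /api/pull route is listed in the GIN route table above
curl http://localhost:11434/api/pull -d '{"model": "deepseek-r1:8b"}'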
I can now see DeepSeek R1 as my model for Cursor
In the Ollama container logs I can see the successful download and the subsequent API calls:
store-ollama | time=2025-01-24T22:17:33.271Z level=INFO source=gpu.go:226 msg="looking for compatible GPUs"
store-ollama | time=2025-01-24T22:17:33.429Z level=INFO source=types.go:131 msg="inference compute" id=GPU-1cbfe056-fcc4-4ab2-c563-6fc44e92565a library=cuda variant=v12 compute=8.6 driver=12.4 name="NVIDIA GeForce RTX 3050 Ti Laptop GPU" total="3.8 GiB" available="3.7 GiB"
store-ollama | [GIN] 2025/01/24 - 22:18:52 | 200 | 6.029998ms | 172.28.0.1 | GET "/api/tags"
store-ollama | time=2025-01-24T22:18:53.213Z level=INFO source=download.go:175 msg="downloading 9801e7fce27d in 405 1 GB part(s)"
store-ollama | [GIN] 2025/01/24 - 22:22:32 | 200 | 463.954µs | 172.28.0.1 | GET "/api/tags"
store-ollama | [GIN] 2025/01/24 - 22:22:37 | 200 | 184.288µs | 172.28.0.1 | GET "/api/tags"
store-ollama | [GIN] 2025/01/24 - 22:30:00 | 200 | 166.766µs | 172.28.0.1 | GET "/api/tags"
store-ollama | [GIN] 2025/01/24 - 22:30:02 | 200 | 193.451µs | 172.28.0.1 | GET "/api/tags"
store-ollama | time=2025-01-24T23:17:28.831Z level=INFO source=download.go:175 msg="downloading 369ca498f347 in 1 387 B part(s)"
store-ollama | time=2025-01-24T23:17:30.130Z level=INFO source=download.go:175 msg="downloading 6e4c38e1172f in 1 1.1 KB part(s)"
store-ollama | time=2025-01-24T23:17:31.397Z level=INFO source=download.go:175 msg="downloading f4d24e9138dd in 1 148 B part(s)"
store-ollama | time=2025-01-24T23:17:32.868Z level=INFO source=download.go:175 msg="downloading fdf3d6cb73c7 in 1 497 B part(s)"
store-ollama | [GIN] 2025/01/24 - 23:32:13 | 200 | 1h13m21s | 172.28.0.1 | POST "/api/pull"
store-ollama | [GIN] 2025/01/24 - 23:32:13 | 200 | 1h9m36s | 172.28.0.1 | POST "/api/pull"
store-ollama | [GIN] 2025/01/24 - 23:32:13 | 200 | 1h2m10s | 172.28.0.1 | POST "/api/pull"
store-ollama | [GIN] 2025/01/24 - 23:37:49 | 200 | 2.051761ms | 172.28.0.1 | GET "/api/tags"
store-ollama | [GIN] 2025/01/24 - 23:37:54 | 200 | 618.334µs | 172.28.0.1 | GET "/api/tags"
store-ollama | [GIN] 2025/01/24 - 23:37:54 | 200 | 548.637µs | 172.28.0.1 | GET "/api/tags"
store-ollama | [GIN] 2025/01/24 - 23:37:57 | 200 | 465.119µs | 172.28.0.1 | GET "/api/tags"
store-ollama | [GIN] 2025/01/24 - 23:37:57 | 200 | 589.321µs | 172.28.0.1 | GET "/api/tags"
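As a final smoke test, the model can be exercised through the OpenAI-compatible endpoint the route table exposes; the model name must match whatever /api/tags reports (deepseek-r1:8b is an assumption here):

# List the models Ollama now serves
curl http://localhost:11434/api/tags

# One-shot chat completion against the local model
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "deepseek-r1:8b", "messages": [{"role": "user", "content": "Say hello"}]}'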