Running the raw 23GB Flux Kontext Dev open-weight model with a LoRA on a Lambda.ai cloud GPU.

I keep the models on my own hard drive and upload them directly to the Lambda instance via Filebrowser, through an SSH port forward on port 8080. I also expose the ComfyUI dashboard through the same tunnel on port 8188.

Host LambdaComfy
    # Lambda Labs instance IP (ssh_config doesn't allow trailing comments)
    Hostname 192.9.251.153
    User ubuntu
    LocalForward 8080 127.0.0.1:8080
    LocalForward 8188 127.0.0.1:8188
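
With that block in ~/.ssh/config, opening the tunnel is a single command, and both services become reachable from the local machine:

$ ssh LambdaComfy
# then browse to http://localhost:8080 (Filebrowser) and http://localhost:8188 (ComfyUI)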
 
I'm using an NVIDIA A10 with 24GB of VRAM. It's a bit tight for the full 23GB model, but it works; an image takes about 1 to 2 minutes to generate.
 
Depending on your available VRAM:

32GB VRAM --> Full-speed, full-precision model (~24GB)
20GB VRAM --> FP8 quantized (~12GB)
≤ 12GB VRAM --> Int8/GGUF small version (~5GB)
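
Before pulling a 23GB file, it's worth confirming which tier you fall into. A quick check with nvidia-smi (these query flags are standard):

$ nvidia-smi --query-gpu=name,memory.total --format=csv,noheader
# on the A10 this reports just under 24GB of total memory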
 
Install on Lambda.ai
 
$ curl -LsSf https://astral.sh/uv/install.sh | sh
$ git clone https://github.com/comfyanonymous/ComfyUI.git && cd ComfyUI
$ uv venv && source .venv/bin/activate
$ uv pip install torch torchvision torchaudio
$ uv pip install -r requirements.txt
$ python main.py  # run the ComfyUI dashboard on 127.0.0.1:8188
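
Once main.py is up, a quick sanity check from the local machine (assuming the SSH tunnel above is open) confirms the dashboard answers through the forward; it should print an HTTP 200 status line:

$ curl -sI http://localhost:8188 | head -n 1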
 
$ curl -fsSL https://raw.githubusercontent.com/filebrowser/get/master/get.sh | bash
$ filebrowser -r ComfyUI/  # serve the ComfyUI folder on 127.0.0.1:8080 to upload models
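
The uploaded files need to land in ComfyUI's standard model folders. The layout below is the usual one for Flux models; the filenames match the Hugging Face release but are illustrative, so adjust to whatever you actually downloaded:

ComfyUI/models/diffusion_models/flux1-kontext-dev.safetensors  # the ~23GB model (models/unet on older ComfyUI versions)
ComfyUI/models/loras/my-lora.safetensors                       # your LoRA (placeholder name)
ComfyUI/models/clip/clip_l.safetensors                         # text encoders
ComfyUI/models/clip/t5xxl_fp16.safetensors
ComfyUI/models/vae/ae.safetensors                              # Flux VAE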