Why two RTX 3090s instead of one RTX 4090?

Two 3090s give 48 GB of combined VRAM versus 24 GB on a 4090, often for about the same money when bought used. For running 70B models at good quality, total VRAM matters more than per-card speed.

Do I need NVLink for a dual-GPU LLM build?

No. Modern inference frameworks split a model across GPUs over PCIe just fine. NVLink helps some training workloads but isn't required for running LLMs.

What PSU do I need for two 3090s?

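Plan for a 1000–1200 W unit rated 80+ Gold or better. Each 3090 is rated at 350 W at stock, so the pair alone can draw about 700 W under load before the CPU and the rest of the system are counted. Undervolting both cards reduces power and heat with minimal performance loss.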

Dual-GPU Build: 48 GB VRAM for 70B Models

By LocalLLMGear Editorial · Editorial Team · Updated 2026-06-28

We test hardware hands-on and may use AI tools in research — every guide is human-reviewed. Editorial policy.

We may earn a commission from links in this article, at no extra cost to you. Disclosure.

When a single 24 GB card isn’t enough, the most cost-effective way to reach 48 GB of VRAM is two used RTX 3090s. That unlocks 70B models at genuinely good quality — the kind of jump that a single consumer card can’t make. Here’s how to build it without the classic power and compatibility mistakes.

The 30-second answer: Two used RTX 3090s = 48 GB VRAM, often for the price of one new 4090. Pair them with a 1000W+ PSU, a board with two spaced PCIe x16 slots, and good airflow. No NVLink needed for inference.

Why 48 GB unlocks 70B

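A 70B model quantized to 4-bit needs roughly 40–48 GB: the weights alone are about 35 GB (70 billion parameters at half a byte each), and the KV cache and runtime buffers take the rest. One 24 GB card forces much heavier quantization (quality drops) or offloading layers to system RAM (speed drops); 48 GB lets you keep a 4-bit model entirely on the GPUs. Two 3090s are the budget path there.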
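The 40–48 GB figure can be sanity-checked with a few lines of arithmetic. This is a rough sketch, not a measurement: the flat `overhead_gb` allowance for KV cache and buffers is an assumption, and real usage varies with context length and the quantization format.

```python
def estimate_vram_gb(params_billion: float, bits_per_weight: float,
                     overhead_gb: float = 5.0) -> float:
    """Rough VRAM estimate in GB: quantized weights plus a flat allowance.

    overhead_gb is an assumed allowance for KV cache and runtime buffers;
    it grows with context length, so treat it as a placeholder.
    """
    # billions of parameters * bits per weight / 8 bits per byte = GB of weights
    weights_gb = params_billion * bits_per_weight / 8
    return weights_gb + overhead_gb


TOTAL_VRAM_GB = 48  # two 24 GB cards

for bits in (4.0, 4.5, 5.0):
    need = estimate_vram_gb(70, bits)
    verdict = "fits" if need <= TOTAL_VRAM_GB else "does not fit"
    print(f"70B at {bits} bits/weight: ~{need:.1f} GB ({verdict} in {TOTAL_VRAM_GB} GB)")
```

Under these assumptions, a 70B model at 4 bits per weight lands at about 40 GB, and the requirement climbs toward the 48 GB ceiling as the effective bits per weight or the context length goes up.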

Parts list

Dual-GPU 48 GB rig — key parts (approx. 2026 prices)

Part	Price (approx.)	Best for	Link
2× RTX 3090 24 GB (used) ★ Our pick	~$1,600	48 GB combined VRAM	Check price →
PSU: 1000–1200W 80+ Gold	~$200	Headroom for two cards	Check price →
Motherboard: 2× PCIe x16, spaced	~$250	Airflow between cards	Check price →
CPU + 64–128 GB RAM + NVMe + case	~$700	No bottleneck, big airflow case

Ad · "Check price" links are affiliate links. We may earn a commission at no extra cost to you.

Power and cooling — where builds fail

Two 3090s generate real heat. Use a large, well-ventilated case, space the cards (a spaced motherboard or a riser), and undervolt both GPUs — you lose a few percent of speed for a big drop in power and temperature. Don’t cheap out on the PSU.
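To see why the PSU recommendation sits at 1000 W and up, a simple power budget helps. The sketch below assumes 350 W per card at stock (the RTX 3090's rated board power); the CPU and "everything else" figures, and the 280 W reduced-power case, are illustrative assumptions rather than measurements. It models a lower per-card draw as a stand-in for the effect of undervolting.

```python
GPU_STOCK_WATTS = 350  # RTX 3090 rated board power at stock


def system_power_watts(gpu_watts: int, num_gpus: int = 2,
                       cpu_watts: int = 150, other_watts: int = 75) -> int:
    """Sustained system draw in watts.

    cpu_watts and other_watts (drives, fans, board) are assumed values.
    Short transient spikes above this figure are not modelled.
    """
    return gpu_watts * num_gpus + cpu_watts + other_watts


def psu_load(total_watts: int, psu_watts: int) -> float:
    """Fraction of the PSU's rated capacity in use."""
    return total_watts / psu_watts


stock = system_power_watts(GPU_STOCK_WATTS)
reduced = system_power_watts(280)  # hypothetical per-card draw after tuning

for label, watts in (("stock", stock), ("reduced", reduced)):
    for psu in (1000, 1200):
        print(f"{label}: {watts} W on a {psu} W PSU = {psu_load(watts, psu):.0%} load")
```

With these assumed numbers, a stock build leaves little margin on a 1000 W unit, while either a 1200 W unit or a reduced per-card draw restores comfortable headroom.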

Software: splitting a model across two GPUs

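Frameworks like llama.cpp and Ollama spread a model's layers across both GPUs over PCIe with little or no configuration; vLLM splits the model with tensor parallelism once you set its tensor-parallel size to 2. Our Tutorials & Setup section covers the configuration.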
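The idea behind a layer split is easy to picture: each GPU holds a contiguous block of the model's layers, sized in proportion to its VRAM. The function below is an illustration of that idea only, not the algorithm any of these frameworks actually uses. `split_layers` is a hypothetical helper, and 80 layers is assumed as the depth of a Llama-style 70B model.

```python
def split_layers(num_layers: int, vram_gb: list[float]) -> list[range]:
    """Assign contiguous layer ranges to GPUs in proportion to their VRAM."""
    total = sum(vram_gb)
    ranges = []
    start = 0
    cumulative = 0.0
    for i, gb in enumerate(vram_gb):
        cumulative += gb
        if i == len(vram_gb) - 1:
            end = num_layers  # last GPU takes whatever remains
        else:
            end = round(num_layers * cumulative / total)
        ranges.append(range(start, end))
        start = end
    return ranges


parts = split_layers(80, [24, 24])  # two 24 GB cards, assumed 80-layer model
for gpu, layers in enumerate(parts):
    print(f"GPU {gpu}: layers {layers.start}-{layers.stop - 1} ({len(layers)} layers)")
```

With two identical cards the split is an even 40/40. Only the activations at the boundary between the two blocks have to cross PCIe for each token, which is why NVLink is not needed for this kind of inference.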

Not ready to build?

Rent a 48 GB (A6000) or larger instance, for example on RunPod, to test a 70B model before committing to the build.

Starting smaller? See Build a local LLM rig under $2,000.