LLM Training Compute Cost Calculator
Project the total cost of fine-tuning large language models by combining GPU runtime, electricity, and contingency overheads. Enter your model size, dataset volume, measured throughput, and pricing assumptions to obtain a defensible training budget in USD.
Educational information, not financial advice.
Examples
- 7B parameters, 30B tokens, 90 TFLOP/s sustained per GPU, 32 GPUs at $1.80/hr, 85% utilisation, 0.3 kW TDP, PUE 1.2, $0.09/kWh, 10% overhead ⇒ $9,184.82 (worked through below)
- 65B parameters, 200B tokens, 180 TFLOP/s sustained per GPU, 128 GPUs at $3.50/hr, 88% utilisation, 0.5 kW TDP, PUE 1.35, $0.12/kWh, 20% overhead ⇒ $584,244.95
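As a rough order-of-magnitude check, the sketch below re-derives the compute line item of the first example using the common 6 × parameters × tokens FLOPs approximation; the calculator's published total also includes electricity and the overhead percentage, and its exact handling of utilisation and PUE may differ slightly, so treat this as an illustration rather than a reproduction.

```python
# Order-of-magnitude check of the first example's compute line item,
# assuming the common 6 * N * D training-FLOPs approximation.
total_flops = 6 * 7e9 * 30e9                     # ≈ 1.26e21 FLOPs
gpu_hours = total_flops / (90e12 * 0.85) / 3600  # ≈ 4,575 GPU-hours
compute_cost = gpu_hours * 1.80                  # ≈ $8,235 before power and overhead
```

Adding electricity and the 10% contingency on top of this figure lands close to the published $9,184.82; a sketch of the full pipeline follows the Additional Information section.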
FAQ
How should I estimate effective throughput per GPU?
Use profiler measurements from trial runs or vendor benchmarks that reflect your framework, precision, and parallelism strategy rather than theoretical peak TFLOP/s.
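If all you have is a timed trial run, one hedged way to back out sustained per-GPU throughput is to divide the estimated FLOPs of a training step (again using the 6 × parameters × tokens-per-step approximation) by the measured step time and GPU count. The figures below are illustrative placeholders, not benchmarks.

```python
# Back out sustained per-GPU throughput from a profiled training step.
# Uses the 6 * N * D approximation for step FLOPs; inputs are placeholders.
def sustained_tflops_per_gpu(params: float, tokens_per_step: float,
                             step_time_s: float, num_gpus: int) -> float:
    step_flops = 6 * params * tokens_per_step
    return step_flops / (step_time_s * num_gpus) / 1e12

# Example: a 7B model, 2M tokens per global batch, 25 s per step on 32 GPUs.
print(sustained_tflops_per_gpu(7e9, 2e6, 25.0, 32))  # ≈ 105 TFLOP/s per GPU
```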
Can I model parameter-efficient fine-tuning techniques?
Yes. Reduce the Trainable Parameters field to the subset updated by adapters or LoRA weights so the FLOPs requirement reflects the smaller optimisation footprint.
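To size the Trainable Parameters field for a LoRA-style run, you can count only the adapter weights: adapting a d × k weight matrix at rank r adds r × (d + k) trainable parameters. The layer shapes and counts below are illustrative placeholders, not a specific model's architecture.

```python
# Count LoRA adapter parameters for the Trainable Parameters field.
# Each adapted d x k weight matrix contributes r * (d + k) parameters.
def lora_trainable_params(adapted_layers, rank):
    return sum(rank * (d + k) for d, k in adapted_layers)

# Hypothetical: 32 transformer blocks, adapting two 4096 x 4096 projections
# each at rank 16, i.e. roughly 8.4M trainable parameters versus ~7B for a
# full fine-tune.
layers = [(4096, 4096)] * 32 * 2
print(lora_trainable_params(layers, rank=16))  # 8,388,608
```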
What overheads belong in the contingency percentage?
Include orchestration software, checkpoint storage, experiment tracking, quality assurance, and on-call labour—items that scale with training but are not billed as GPU hours.
Additional Information
- The model converts parameter and token counts into total training FLOPs, then divides by sustained throughput to estimate GPU-hours before applying your hourly rate.
- Energy consumption accounts for utilisation, per-GPU thermal design power, total runtime, and facility PUE so electricity spend scales with operational reality.
- Adjust the overhead percentage to capture orchestration tooling, checkpoint storage, evaluation clusters, or staffing costs not represented in direct GPU pricing (see the sketch after this list).
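Below is a minimal end-to-end sketch of the pipeline described above, assuming the common 6 × parameters × tokens FLOPs approximation and a simple interpretation of how utilisation enters the energy term; field names are placeholders and the live calculator's exact formula may differ slightly.

```python
# Minimal sketch: FLOPs -> GPU-hours -> GPU cost, plus electricity
# (utilisation, TDP, PUE) and a contingency percentage. The 6 * N * D
# approximation and the way utilisation enters each term are assumptions,
# not the calculator's exact implementation.
def estimate_training_cost(params, tokens, tflops_per_gpu, utilisation,
                           gpu_hourly_rate, tdp_kw, pue, usd_per_kwh,
                           overhead_pct):
    total_flops = 6 * params * tokens
    gpu_hours = total_flops / (tflops_per_gpu * 1e12 * utilisation) / 3600
    compute_cost = gpu_hours * gpu_hourly_rate
    energy_kwh = gpu_hours * tdp_kw * utilisation * pue  # assumed power model
    energy_cost = energy_kwh * usd_per_kwh
    return (compute_cost + energy_cost) * (1 + overhead_pct)

# The two example rows above land within about 1% of the published figures:
print(estimate_training_cost(7e9, 30e9, 90, 0.85, 1.80, 0.3, 1.2, 0.09, 0.10))
print(estimate_training_cost(65e9, 200e9, 180, 0.88, 3.50, 0.5, 1.35, 0.12, 0.20))
```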