How to Calculate LLM Training Run Carbon Intensity

Training state-of-the-art language models now consumes megawatt-weeks of electricity and emits tonnes of CO₂e. Investors, enterprise buyers, and regulators therefore expect precise accounting of the carbon intensity for each run. This guide translates core telemetry—GPU-hours, power draw, power usage effectiveness, renewable procurement, and offsets—into a reproducible emission figure with per-token context.

The workflow complements runtime planning from the GPU training time and cost walkthrough and budget approvals handled in the LLM training budget calculator. Together they allow engineering, finance, and sustainability teams to speak the same language when evaluating training strategies or compliance disclosures.

What carbon intensity measures

Carbon intensity captures the greenhouse-gas footprint associated with producing a unit of output—in this case, a complete training run or a million processed tokens. It includes direct electricity consumption by GPUs and indirect overhead such as cooling and power conversion, adjusted for renewable procurement and carbon offsets. Reporting intensity rather than absolute emissions enables benchmarking across models, data centers, and mitigation strategies.

The methodology presented here aligns with the Greenhouse Gas Protocol’s market-based accounting approach. It allows you to integrate energy attribute certificates, power purchase agreements, and verified offsets while keeping location-based emissions visible for transparency.

Required variables and units

Gather the following inputs before you start modelling. Energy is measured in kilowatt-hours (kWh); emissions are expressed in kilograms of CO₂ equivalent (kg CO₂e).

  • HGPU – Total GPU-hours consumed. Multiply GPU count by runtime hours across the full training schedule.
  • Pavg – Average electrical power per GPU (kW). Derived from telemetry or facility metering at the achieved utilisation.
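  • EF – Grid emission factor (kg CO₂e/kWh). Choose location-based or market-based according to reporting policy. If the market-based factor already reflects contractual renewable instruments, set R% to 0 so the same purchase is not counted twice.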
  • PUE – Power usage effectiveness (dimensionless). Ratio of facility power to IT power during the run.
  • R% – Percentage of load served with renewable energy (0–100%). Represents bundled PPAs or on-site generation matched to the run.
  • O – Carbon offsets applied (kg CO₂e). Include only retired credits dedicated to the run.
  • Ttok – Optional tokens processed (billions). Enables per-token intensity reporting.

Maintain audit trails for each input: GPU-hours from orchestration logs, power data from rack-level meters, emission factors from utility disclosures, and renewable procurement contracts. Consistency with enterprise sustainability reporting is essential.
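As a minimal sketch of how these inputs might be captured and sanity-checked before any arithmetic, the Python below defines a hypothetical `RunInputs` record. The class name, field names, and the specific range checks (for example, rejecting a PUE below 1.0) are illustrative assumptions, not part of any standard or of the calculator itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RunInputs:
    """Hypothetical container for one training run, in the units listed above."""
    gpu_hours: float                          # HGPU, GPU-hours
    avg_power_kw: float                       # Pavg, kW per GPU
    emission_factor: float                    # EF, kg CO2e per kWh
    pue: float                                # PUE, dimensionless
    renewable_pct: float = 0.0                # R%, 0-100
    offsets_kg: float = 0.0                   # O, kg CO2e
    tokens_billions: Optional[float] = None   # Ttok, billions of tokens

    def __post_init__(self) -> None:
        # Assumed plausibility checks; tighten or relax to match your policy.
        if self.gpu_hours <= 0:
            raise ValueError("gpu_hours must be positive")
        if self.avg_power_kw <= 0:
            raise ValueError("avg_power_kw must be positive")
        if self.emission_factor < 0:
            raise ValueError("emission_factor cannot be negative")
        if self.pue < 1.0:
            raise ValueError("pue is facility/IT power and cannot be below 1.0")
        if not 0.0 <= self.renewable_pct <= 100.0:
            raise ValueError("renewable_pct must be between 0 and 100")
        if self.offsets_kg < 0:
            raise ValueError("offsets_kg cannot be negative")
        if self.tokens_billions is not None and self.tokens_billions <= 0:
            raise ValueError("tokens_billions must be positive when provided")
```

Rejecting implausible values at this stage keeps unit mistakes, such as watts entered as kilowatts or a fractional renewable share entered as a percentage, from propagating silently into the reported figure.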

Core formulas

The intensity calculation follows the energy balance of the data center and then applies emission factors. All steps use deterministic arithmetic so stakeholders can reproduce results in spreadsheets or emissions software.

IT energy EIT = HGPU × Pavg

Total facility energy Efac = EIT × PUE

Adjusted emission factor EFnet = EF × (1 − R% ÷ 100)

Gross emissions G = Efac × EFnet

Net emissions N = max(0, G − O)

Intensity per GPU-hour IGPU = N ÷ HGPU

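Intensity per million tokens Itok = N ÷ (Ttok × 1000), where Ttok is in billions and the factor of 1000 converts it to millions of tokens; compute this only when tokens are provided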

The net emission factor effectively discounts zero-carbon electricity that is demonstrably matched to the run. Offsets are applied after electricity-based emissions to ensure they do not mask inefficient operations. Always report gross and net emissions together so auditors can evaluate mitigation quality.
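The formulas above can be sketched as a single Python function. The function name, parameter names, and dictionary keys are illustrative choices; the arithmetic follows the steps in this section one for one.

```python
def carbon_intensity(gpu_hours, avg_power_kw, emission_factor, pue,
                     renewable_pct=0.0, offsets_kg=0.0, tokens_billions=None):
    """Return gross/net emissions (kg CO2e) and intensity metrics for one run."""
    it_energy_kwh = gpu_hours * avg_power_kw                  # EIT = HGPU x Pavg
    facility_energy_kwh = it_energy_kwh * pue                 # Efac = EIT x PUE
    net_factor = emission_factor * (1 - renewable_pct / 100)  # EFnet
    gross_kg = facility_energy_kwh * net_factor               # G
    net_kg = max(0.0, gross_kg - offsets_kg)                  # N, floored at zero

    result = {
        "it_energy_kwh": it_energy_kwh,
        "facility_energy_kwh": facility_energy_kwh,
        "gross_kg": gross_kg,
        "net_kg": net_kg,
        "kg_per_gpu_hour": net_kg / gpu_hours,                # IGPU
    }
    if tokens_billions is not None:
        # Billions x 1000 = millions of tokens, so this is kg per million tokens.
        result["kg_per_million_tokens"] = net_kg / (tokens_billions * 1000)
    return result
```

As an illustration with made-up inputs: 100,000 GPU-hours at 0.5 kW, a PUE of 1.2, a factor of 0.4 kg CO₂e/kWh, 25% renewable matching, 1,000 kg of offsets, and 500 billion tokens give 60,000 kWh of facility energy, about 18,000 kg gross, about 17,000 kg net, roughly 0.17 kg per GPU-hour, and roughly 0.034 kg per million tokens.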

Step-by-step calculation workflow

1. Reconstruct GPU-hours

Pull job telemetry from your orchestration platform or scheduler. Sum GPU count × runtime for every job phase, including warm-up, evaluation, and checkpointing. Decide explicitly how to treat failed jobs: they still drew power, so either include them in the total or report them separately as wasted energy.
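A minimal sketch of this roll-up, assuming each scheduler record has already been reduced to a GPU count, a runtime in hours, and a failure flag. The `JobPhase` record and its fields are hypothetical; real orchestration logs will need their own parsing.

```python
from dataclasses import dataclass


@dataclass
class JobPhase:
    """Hypothetical record for one phase of a training job."""
    name: str
    gpu_count: int
    runtime_hours: float
    failed: bool = False


def total_gpu_hours(phases):
    """Return (useful, wasted) GPU-hours so failed work stays visible."""
    useful = sum(p.gpu_count * p.runtime_hours for p in phases if not p.failed)
    wasted = sum(p.gpu_count * p.runtime_hours for p in phases if p.failed)
    return useful, wasted


phases = [
    JobPhase("warm-up", 64, 2.0),
    JobPhase("training", 512, 100.0),
    JobPhase("evaluation", 64, 5.0),
    JobPhase("training (crashed)", 512, 3.0, failed=True),
]
useful, wasted = total_gpu_hours(phases)
```

Keeping the two totals separate lets you choose later whether HGPU is the useful figure alone or the sum of both, without re-querying the scheduler.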

2. Measure average power

Combine rack-level power monitoring with GPU utilisation logs. If telemetry fluctuates, compute a time-weighted average (total energy divided by elapsed time) rather than a simple mean of samples. Confirm the power reading covers auxiliary devices (network switches, storage) if those loads sit behind the same power distribution units.
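One way to obtain a properly weighted average from irregularly spaced readings is to integrate power over time and divide by the elapsed time. The sketch below uses trapezoidal integration; the function name and the `(time_hours, power_kw)` sample format are assumptions made for illustration.

```python
def average_power_kw(samples):
    """Time-weighted mean power from (time_hours, power_kw) samples.

    Samples must be sorted by time. Energy is estimated by trapezoidal
    integration between consecutive readings.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    energy_kwh = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        if t1 <= t0:
            raise ValueError("timestamps must be strictly increasing")
        energy_kwh += 0.5 * (p0 + p1) * (t1 - t0)
    elapsed_hours = samples[-1][0] - samples[0][0]
    return energy_kwh / elapsed_hours
```

For readings of 0.3 kW at 0 h and 1 h, then 0.7 kW at 1.5 h and 2 h, this returns about 0.45 kW, whereas the plain mean of the four readings is 0.5 kW. The gap comes from the denser sampling in the high-power interval.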

3. Establish facility efficiency

Obtain PUE from the data center operator for the exact interval of the training run. Some facilities provide real-time PUE dashboards; otherwise, request a metered report. For colocation sites, cross-check that ancillary services (lighting, office HVAC) are excluded so the PUE reflects IT load overhead only.
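If the operator supplies cumulative meter readings rather than a ready-made PUE, the interval value can be derived as facility energy divided by IT energy over the same window. This is a sketch under that assumption; the function and parameter names are hypothetical.

```python
def interval_pue(facility_kwh_start, facility_kwh_end, it_kwh_start, it_kwh_end):
    """PUE for a run interval from cumulative facility and IT meter readings."""
    facility_kwh = facility_kwh_end - facility_kwh_start
    it_kwh = it_kwh_end - it_kwh_start
    if it_kwh <= 0:
        raise ValueError("IT energy for the interval must be positive")
    pue = facility_kwh / it_kwh
    if pue < 1.0:
        # Facility energy includes IT energy, so a ratio below 1 signals a
        # metering or alignment problem rather than a real result.
        raise ValueError("computed PUE below 1.0; check meter alignment")
    return pue
```

For example, a facility meter advancing by 130,000 kWh while the IT meter advances by 100,000 kWh gives a PUE of about 1.3 for that interval.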

4. Apply grid and renewable data

Select the emission factor that matches your disclosure standard. Market-based accounting uses supplier-specific or contractual factors; location-based uses the regional average. Document renewable energy purchases clearly and ensure they are time-matched to the run if following emerging 24/7 carbon-free energy standards.
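To illustrate why time-matching matters, the sketch below compares two ways of deriving R% from hourly data. It assumes one plausible reading of hourly matching, in which renewable supply in a given hour can only cover load in that same hour; the function names and list-based inputs are illustrative, and your disclosure standard may define matching differently.

```python
def hourly_matched_pct(load_kwh, renewable_kwh):
    """Share of load covered by renewables delivered in the same hour."""
    if len(load_kwh) != len(renewable_kwh):
        raise ValueError("series must have the same number of hours")
    total_load = sum(load_kwh)
    if total_load <= 0:
        raise ValueError("total load must be positive")
    matched = sum(min(load, ren) for load, ren in zip(load_kwh, renewable_kwh))
    return 100.0 * matched / total_load


def volume_matched_pct(load_kwh, renewable_kwh):
    """Share of load covered when only total volumes are compared."""
    total_load = sum(load_kwh)
    if total_load <= 0:
        raise ValueError("total load must be positive")
    return min(100.0, 100.0 * sum(renewable_kwh) / total_load)
```

With a flat load of 10 kWh over four hours and renewable deliveries of 0, 5, 10, and 20 kWh, volume matching yields 87.5% while hourly matching yields 62.5%, because the surplus in the last hour cannot cover the shortfall in the first two.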

5. Calculate emissions and intensity

Multiply total facility energy by the adjusted emission factor to obtain gross emissions. Subtract any high-quality offsets retired for the project. Divide by GPU-hours and, optionally, tokens processed to derive intensity metrics. Present results with two-decimal precision for kg CO₂e/GPU-hour and three decimals for kg per million tokens to avoid overstating accuracy.
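The precision guidance above could be applied with a small formatting helper such as the following. The function name and the wording of the output lines are illustrative assumptions; only the two- and three-decimal rounding comes from the text.

```python
def format_report(gross_kg, net_kg, gpu_hours, tokens_billions=None):
    """Render gross and net emissions with the suggested decimal precision."""
    lines = [
        f"Gross emissions: {gross_kg:,.0f} kg CO2e",
        f"Net emissions: {net_kg:,.0f} kg CO2e",
        f"Intensity: {net_kg / gpu_hours:.2f} kg CO2e/GPU-hour",
    ]
    if tokens_billions is not None:
        per_million = net_kg / (tokens_billions * 1000)
        lines.append(f"Intensity: {per_million:.3f} kg CO2e per million tokens")
    return "\n".join(lines)
```

Printing gross and net on adjacent lines keeps the effect of offsets visible to the reader rather than folding it into a single figure.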

Validation and controls

Reconcile total facility energy against utility invoices or substation meters. Variances often indicate unmetered loads or incorrect PUE. Compare emission factors with those used in corporate sustainability reports to guarantee alignment. For per-token metrics, verify token counts against data pipeline logs and account for discarded or filtered records.
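The energy reconciliation can be reduced to a variance check like the one sketched here. The 5% default tolerance is an arbitrary placeholder, not a recommended threshold; set it according to your own metering accuracy and audit policy.

```python
def reconcile_energy(modelled_kwh, metered_kwh, tolerance_pct=5.0):
    """Compare modelled facility energy with an invoice or substation reading.

    Returns (variance_pct, within_tolerance). A positive variance means the
    model is above the metered value.
    """
    if metered_kwh <= 0:
        raise ValueError("metered_kwh must be positive")
    variance_pct = 100.0 * (modelled_kwh - metered_kwh) / metered_kwh
    return variance_pct, abs(variance_pct) <= tolerance_pct
```

A modelled 60,000 kWh against a metered 62,500 kWh gives a variance of -4%, which passes at the placeholder tolerance; against a metered 50,000 kWh the variance is +20%, which would prompt a review of the PUE or of unmetered loads.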

Establish a review cadence that includes engineering, sustainability, and finance stakeholders. Version-control the calculation template, capture assumptions in change logs, and attach supporting evidence—meter data, contract IDs, offset certificates—so auditors can trace every figure.

Limits and extensions

This methodology covers Scope 2 emissions from electricity. Upstream Scope 3 impacts of hardware manufacturing, dataset preparation, or cooling refrigerants fall outside the calculation but may be material for lifecycle assessments. Likewise, the approach treats offsets as immediate reductions; if your policy only recognises removals after verification, adjust the timing accordingly.

For multi-tenant clusters running concurrent jobs, allocate energy based on GPU-hours and, when available, actual power metering per tenant. Large operators increasingly deploy carbon-aware schedulers that modulate workloads based on marginal emission factors; integrate those signals to refine the EF input dynamically.
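A minimal sketch of the allocation rule described above: split facility energy in proportion to each tenant's metered energy when it exists, and fall back to GPU-hours otherwise. The function name and dictionary inputs are hypothetical, and the sketch assumes metering is available either for all tenants or for none.

```python
def allocate_energy(total_facility_kwh, gpu_hours_by_tenant, metered_kwh_by_tenant=None):
    """Split facility energy across tenants.

    Uses per-tenant metered energy as weights when supplied, otherwise
    GPU-hours. Returns a dict of tenant -> allocated kWh.
    """
    weights = metered_kwh_by_tenant if metered_kwh_by_tenant else gpu_hours_by_tenant
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("allocation weights must sum to a positive value")
    return {
        tenant: total_facility_kwh * weight / total_weight
        for tenant, weight in weights.items()
    }
```

For 1,000 kWh shared by two tenants with 300 and 100 GPU-hours, the split is 750 kWh and 250 kWh. Supplying metered energy instead shifts the split toward whichever tenant ran its GPUs at higher draw.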

Embed: LLM training carbon intensity calculator

Provide GPU-hours, average power, facility PUE, emission factors, and mitigation actions to generate gross and net emissions along with intensity metrics per GPU-hour and per million tokens.

LLM Training Run Carbon Intensity Calculator

Combine GPU runtime, electrical efficiency, grid intensity, and mitigation actions to quantify the carbon footprint of a language-model training run.

  • GPU-hours – Sum of accelerator count × runtime for the entire training job.
  • Average power per GPU (kW) – Sustained electrical draw per accelerator at the utilisation achieved.
  • Grid emission factor (kg CO₂e/kWh) – Location-based emission intensity of delivered electricity.
  • PUE – Power usage effectiveness for the data center during the run.
  • Renewable share (%) – Defaults to 0%. Represents bundled renewables or credits that zero out matching load.
  • Carbon offsets (kg CO₂e) – Enter verified offsets retired for this run; defaults to 0.
  • Tokens processed (billions) – Used to report emissions per million tokens; leave blank if unknown.

Indicative estimator; complement with lifecycle assessments and contract-specific emission factors for formal ESG reporting.