Too much work per run

Matrix explosion: when parallelism increases cost

By Keith Mazanec, Founder, CostOps · Updated January 29, 2026

A developer adds Windows to the CI matrix alongside Linux and macOS. Another adds Node 22 to the version list. Now the matrix is 4 versions × 3 operating systems × 2 databases = 24 jobs per push. Each job spins up its own runner, installs dependencies from scratch, and gets billed independently, with each job rounded up to the nearest minute. The matrix was 6 jobs a month ago. Nobody noticed it quadrupled.

Symptoms

How to tell if your matrix is costing more than it's worth

Matrix growth is gradual. Each new dimension seems harmless in isolation. Look for these signs in your Actions tab:

Job count multiplies faster than coverage value. Adding one entry to a matrix dimension doesn't add one job. Instead, it multiplies across all other dimensions. A 4 × 3 matrix becomes 5 × 3 = 15 jobs, not 13. Every workflow run fans out into dozens of jobs, most of which pass identically.
Billable minutes far exceed wall-clock time. A workflow that takes 8 minutes to complete shows 120+ billable minutes. That's because each of the 15+ matrix jobs runs independently, each billed at a minimum of 1 minute, and each repeating checkout, install, and setup steps.
Cross-platform jobs that never fail differently. You test on Windows, Linux, and macOS, but failures are always in the application logic, never OS-specific. The Windows and macOS jobs pass for months without catching a single platform-specific bug, yet they bill at roughly 1.7x and 10x the Linux per-minute rate, respectively ($0.010 and $0.062 versus $0.006).
Non-LTS versions in the matrix. The matrix includes Node 17, 19, and 21, all odd-numbered versions that are already end-of-life. Or it tests Python 3.8 for an application that requires 3.11+. These jobs consume minutes without providing actionable signal.
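To make the multiplication concrete, here is a minimal sketch that expands a matrix the way a cross-product does. The `expand_matrix` helper and the added "24" entry are illustrative assumptions, not part of any GitHub tooling.

```python
from itertools import product


def expand_matrix(dimensions):
    """Return one dict per job, mirroring how a cross-product matrix fans out."""
    names = list(dimensions)
    return [dict(zip(names, combo)) for combo in product(*dimensions.values())]


before = {
    "node": ["18", "20", "22", "23"],
    "os": ["ubuntu-latest", "windows-latest", "macos-latest"],
}
# Hypothetical change: one extra Node version appended to a single dimension.
after = {**before, "node": before["node"] + ["24"]}

print(len(expand_matrix(before)))  # 12
print(len(expand_matrix(after)))   # 15, not 13
```

One new entry adds as many jobs as the product of every other dimension, which is why the growth is easy to miss in code review.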

Metrics

The combinatorial math behind matrix cost

GitHub Actions bills each matrix job independently, rounded up to the nearest minute. A 24-job matrix where each job runs for 8 minutes costs 24 × 8 = 192 billable minutes per workflow run, even though wall-clock time is just 8 minutes. Compare that to running 4 representative jobs:

Scenario	Matrix jobs	Minutes/job	Billable min/run	Runs/day	Monthly cost
Full cross-product matrix	24	8	192	20	$506/mo
Reduced matrix on PRs	4	8	32	20	$84/mo

At $0.006/min (Linux 2-core) · 22 working days

Save $422/mo · $5,064/year · per workflow

That's all Linux. If the matrix includes macOS jobs at $0.062/min, 8 macOS jobs at the same 8 minutes per job and 20 runs per day cost about $1,746/mo on their own. Dropping macOS from PR builds and running it only on main can save thousands per year from one workflow change.
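The Linux figures above can be reproduced with a few lines of arithmetic. This is a rough sketch, assuming every job runs exactly 8 minutes and that billing rounds each job up to a whole minute; the function names are made up for illustration.

```python
import math


def billable_minutes(job_seconds):
    """Sum per-job durations, rounding each job up to a whole minute."""
    return sum(math.ceil(seconds / 60) for seconds in job_seconds)


def monthly_cost(jobs, minutes_per_job, runs_per_day, rate_per_min, working_days=22):
    """Monthly cost in dollars for one workflow with identical jobs."""
    per_run = billable_minutes([minutes_per_job * 60] * jobs)
    return round(per_run * runs_per_day * working_days * rate_per_min, 2)


full = monthly_cost(24, 8, 20, 0.006)
reduced = monthly_cost(4, 8, 20, 0.006)
print(full, reduced, round(full - reduced, 2))  # 506.88 84.48 422.4

# Rounding matters for short jobs: 61 s and 10 s bill as 2 + 1 minutes.
print(billable_minutes([61, 10]))  # 3
```

Swapping in a different `rate_per_min` gives the equivalent figure for Windows or macOS runners.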

Fix 1

Use a smaller matrix on pull requests

Most PR builds don't need the full compatibility matrix. A developer changing application logic doesn't need to validate against 4 Node versions and 3 operating systems on every push. Run a representative subset on PRs, and save the full matrix for main or release branches where the broader coverage actually matters.

The cleanest approach uses a prep job that outputs different JSON matrices based on the event type. The build job consumes that output via fromJSON.

.github/workflows/ci.yml

name: CI

on:
  push:
    branches: [main]
  pull_request:

jobs:
  matrix-prep:
    runs-on: ubuntu-latest
    outputs:
      matrix: ${{ steps.set-matrix.outputs.matrix }}
    steps:
      - id: set-matrix
        run: |
          if [ "${{ github.event_name }}" = "push" ]; then
            # Full matrix on main: 4 versions × 3 OS = 12 jobs
            echo 'matrix={"node-version":["18","20","22","23"],"os":["ubuntu-latest","windows-latest","macos-latest"]}' >> "$GITHUB_OUTPUT"
          else
            # PR matrix: 2 versions × 1 OS = 2 jobs
            echo 'matrix={"node-version":["20","22"],"os":["ubuntu-latest"]}' >> "$GITHUB_OUTPUT"
          fi

  test:
    needs: matrix-prep
    strategy:
      matrix: ${{ fromJSON(needs.matrix-prep.outputs.matrix) }}
      fail-fast: true
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node-version }}
      - run: npm ci
      - run: npm test

This drops the PR matrix from 12 jobs to 2, which is an 83% reduction in billable minutes per PR run. The full 12-job matrix still runs on every merge to main, so you don't lose coverage. You just stop paying for it on every intermediate push.

One caveat: the prep job itself consumes a billed minute (rounded up from a few seconds of shell execution). For matrices with fewer than 3 jobs, the overhead of the prep job may not be worth it. For matrices above 6 jobs, the savings are substantial.
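To put the caveat in numbers, the sketch below compares the full matrix against the reduced one with the prep job's billed minute included. It assumes 8-minute jobs, as in the cost tables, and a prep job that bills exactly 1 minute; `pr_run_minutes` is a hypothetical helper.

```python
def pr_run_minutes(matrix_jobs, minutes_per_job, prep_minutes=0):
    """Billable minutes for one run: matrix jobs plus an optional prep job."""
    return matrix_jobs * minutes_per_job + prep_minutes


full = pr_run_minutes(12, 8)                    # 96 minutes, no prep job
reduced = pr_run_minutes(2, 8, prep_minutes=1)  # 16 + 1 = 17 minutes

net_reduction = (full - reduced) / full
print(f"{net_reduction:.1%}")  # 82.3%
```

The prep job shaves less than a percentage point off the saving here; the smaller the original matrix, the larger that share becomes.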

Fix 2

Replace cross-products with exact combinations

The default matrix behavior computes a Cartesian product: every value in every dimension combined with every value in every other dimension. But you rarely need all those combinations. You probably don't need to test Node 18 on macOS and Node 20 on macOS and Node 22 on macOS. You need Node 18 on Linux, Node 22 on Linux, and Node 22 on macOS.

GitHub supports an include-only matrix that skips the cross-product entirely. You list exactly the combinations you want, and only those jobs run.

Cross-product: 12 jobs

strategy:
  matrix:
    node: [18, 20, 22, 23]
    os:
      - ubuntu-latest
      - windows-latest
      - macos-latest

# 4 × 3 = 12 jobs
# Most combinations add no signal

Include-only: 5 jobs

strategy:
  matrix:
    include:
      - node: 18
        os: ubuntu-latest
      - node: 20
        os: ubuntu-latest
      - node: 22
        os: ubuntu-latest
      - node: 22
        os: windows-latest
      - node: 22
        os: macos-latest

Same coverage where it matters: every LTS Node version (18, 20, 22) is validated on Linux, and the newest LTS is validated cross-platform. Only Node 23, a non-LTS release, drops out. That takes the matrix from 12 jobs to 5, a 58% reduction. The include-only approach also makes the matrix self-documenting: anyone reading the workflow sees exactly what runs, instead of having to mentally compute a cross-product.
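If the include list gets long, it can be generated rather than hand-written. The following is one possible sketch: `include_only_matrix` is a hypothetical helper that puts every version on a primary OS and only the newest version on the others, then prints JSON in the shape that fromJSON expects for a matrix.

```python
import json


def include_only_matrix(versions, primary_os, extra_oses):
    """All versions on the primary OS, newest version on each extra OS."""
    newest = versions[-1]  # assumes versions are listed in ascending order
    include = [{"node": version, "os": primary_os} for version in versions]
    include += [{"node": newest, "os": os_name} for os_name in extra_oses]
    return {"include": include}


matrix = include_only_matrix(
    [18, 20, 22], "ubuntu-latest", ["windows-latest", "macos-latest"]
)
print(len(matrix["include"]))  # 5
print(json.dumps(matrix))
```

The job count grows additively (versions plus extra operating systems) instead of multiplicatively.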

Fix 3

Remove matrix dimensions that don't catch bugs

Every dimension in the matrix should earn its place by catching bugs that other dimensions miss. If the Windows jobs haven't caught a single platform-specific failure in 6 months, they're not providing signal, just generating invoices. Audit each dimension against your failure history.

Dimension	Keep if	Drop if
OS variants	You ship native binaries or use OS-specific APIs	App only deploys to Linux containers
Runtime versions	You publish a library consumed on multiple versions	You control the runtime in production (one version)
Database engines	You support Postgres and MySQL in production	You only deploy against one database
Non-LTS versions	You need to verify upcoming breaking changes	The version is EOL and not used by consumers

A common pattern in Node.js projects: the matrix tests versions 16, 18, 20, 22. But Node 16 reached end-of-life in September 2023, and Node 18 maintenance ended in April 2025. If your package.json specifies "engines": { "node": ">=20" }, testing 16 and 18 is pure waste. Removing two entries from a 4 × 3 matrix cuts it from 12 to 6 jobs, effectively halving your CI bill for that workflow.
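One way to audit a dimension against failure history is to count how often a non-baseline value failed while the baseline job with otherwise identical settings passed in the same run. The sketch below assumes you have already exported job results into a list of dicts; the record shape (`run_id`, `node`, `os`, `conclusion`) and the sample data are hypothetical.

```python
from collections import defaultdict


def unique_failures(job_results, dimension, baseline):
    """Count failures per non-baseline value where the baseline job passed."""
    groups = defaultdict(dict)
    for job in job_results:
        # Group jobs that differ only in the audited dimension.
        key = tuple(sorted(
            (name, value) for name, value in job.items()
            if name not in (dimension, "conclusion")
        ))
        groups[key][job[dimension]] = job["conclusion"]

    counts = {}
    for outcomes in groups.values():
        for value, conclusion in outcomes.items():
            if value == baseline:
                continue
            counts.setdefault(value, 0)
            if conclusion == "failure" and outcomes.get(baseline) == "success":
                counts[value] += 1
    return counts


history = [  # made-up sample data
    {"run_id": 1, "node": "22", "os": "ubuntu-latest", "conclusion": "success"},
    {"run_id": 1, "node": "22", "os": "windows-latest", "conclusion": "success"},
    {"run_id": 2, "node": "22", "os": "ubuntu-latest", "conclusion": "failure"},
    {"run_id": 2, "node": "22", "os": "windows-latest", "conclusion": "failure"},
    {"run_id": 3, "node": "22", "os": "ubuntu-latest", "conclusion": "success"},
    {"run_id": 3, "node": "22", "os": "macos-latest", "conclusion": "failure"},
]
print(unique_failures(history, "os", "ubuntu-latest"))
# {'windows-latest': 0, 'macos-latest': 1}
```

A value that stays at zero over several months is a candidate for removal, or for running only on main.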

Fix 4

Use fail-fast to stop wasting minutes on broken builds

When one matrix job fails, do you need the other 23 to keep running? By default, GitHub Actions sets fail-fast: true, which cancels all remaining matrix jobs when any job fails. This is the correct default for cost optimization, but it can be accidentally disabled.

Verify your workflows haven't set fail-fast: false. Some teams disable it to "see all failures at once," but on a 24-job matrix with an 8-minute runtime, a failure in minute 2 means the other 23 jobs run for 6 unnecessary minutes each: 138 wasted minutes per failed run.

fail-fast disabled

strategy:
  fail-fast: false
  matrix:
    node: [18, 20, 22, 23]
    os: [ubuntu-latest, windows-latest]

# Job 1 fails at minute 2
# Jobs 2-8 keep running to completion
# Billed: 2 + 7 × 8 = 58 minutes

fail-fast enabled (default)

strategy:
  fail-fast: true
  matrix:
    node: [18, 20, 22, 23]
    os: [ubuntu-latest, windows-latest]

# Job 1 fails at minute 2
# Jobs 2-8 cancelled within seconds
# Billed: 2-3 minutes per job, roughly 16-24 minutes total

If you need all failures visible for debugging, consider a hybrid approach: use fail-fast: true on PRs (where speed and cost matter) and fail-fast: false only on scheduled nightly runs where you specifically want the full failure report. For more on stopping canceled runs from wasting minutes, see our dedicated guide.
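The difference between the two settings can be estimated with a small model. This sketch assumes all matrix jobs start at the same time, exactly one job fails, cancellation takes about 15 seconds to reach the other jobs, and each job is rounded up to a whole minute. The cancellation delay and the example numbers are assumptions for illustration.

```python
import math


def billed_minutes(jobs, job_minutes, fail_at, fail_fast, cancel_delay=0.25):
    """Billable minutes for one run in which a single job fails at `fail_at`."""
    failed_job = math.ceil(fail_at)
    if fail_fast:
        # Remaining jobs stop shortly after the failure is detected.
        other_job = math.ceil(min(job_minutes, fail_at + cancel_delay))
    else:
        # Remaining jobs run to completion.
        other_job = math.ceil(job_minutes)
    return failed_job + (jobs - 1) * other_job


# Example: 12 jobs, 10 minutes each, one job fails at minute 3.
print(billed_minutes(12, 10, 3, fail_fast=False))  # 3 + 11 * 10 = 113
print(billed_minutes(12, 10, 3, fail_fast=True))   # 3 + 11 * 4 = 47
```

The earlier a failure tends to occur relative to total job time, the more fail-fast saves.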

Reference

How matrix size affects monthly cost

Use this table to estimate the cost impact of your matrix configuration. Assumes 8 min/job, 20 runs/day, 22 working days/month.

Matrix size	Min/run	Linux/mo	Mixed OS/mo
2 jobs	16	$42	-
6 jobs	48	$127	$549
12 jobs	96	$253	$1,098
24 jobs	192	$506	$2,196
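The Mixed OS column can be approximated by splitting the jobs equally across three per-minute rates. This is a sketch under the table's stated assumptions (8 min/job, 20 runs/day, 22 working days); the `RATES` values are the per-minute prices used in this guide, and the function name is hypothetical.

```python
RATES = {"linux": 0.006, "windows": 0.010, "macos": 0.062}  # dollars per minute


def mixed_os_monthly(jobs, minutes_per_job=8, runs_per_day=20, working_days=22):
    """Monthly cost per OS when jobs are split equally across all rates."""
    jobs_per_os = jobs / len(RATES)
    minutes_per_os = jobs_per_os * minutes_per_job * runs_per_day * working_days
    return {name: round(minutes_per_os * rate, 2) for name, rate in RATES.items()}


costs = mixed_os_monthly(6)
total = round(sum(costs.values()), 2)
print(total)                             # 549.12
print(round(costs["macos"] / total, 3))  # 0.795
```

Because every row splits the jobs the same way, the macOS share stays at roughly 80% for any matrix size.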

"Mixed OS" assumes equal splits across Linux ($0.006/min), Windows ($0.010/min), and macOS ($0.062/min), which is the typical result of adding an OS dimension to a matrix. The macOS jobs alone account for roughly 80% of the mixed-OS cost. If most of your jobs only need Linux, you may also be overpaying for runner types you don't need. GitHub imposes a limit of 256 jobs per matrix per workflow run.

Related guides

Too Many Small Jobs

Per-minute rounding turns dozens of tiny jobs into outsized bills. Consolidate to cut overhead.

Canceled Runs Wasting Minutes

Concurrency groups and fail-fast settings to stop paying for runs that are already obsolete.

Over-Parallelized Test Suites

When splitting tests across too many runners costs more in overhead than it saves in wall-clock time.

Reduce CI Setup and Install Overhead

Each matrix job repeats checkout and install. Reduce the per-job overhead that compounds at scale.