Too much work per run
Matrix explosion: when parallelism increases cost
By Keith Mazanec, Founder, CostOps · Updated January 29, 2026
A developer adds Windows to the CI matrix alongside Linux and macOS. Another adds Node 22 to the version list. Now the matrix is 4 versions × 3 operating systems × 2 databases = 24 jobs per push. Each job spins up its own runner, installs dependencies from scratch, and gets billed independently, with each job rounded up to the nearest minute. The matrix was 6 jobs a month ago. Nobody noticed it quadrupled.
Symptoms
How to tell if your matrix is costing more than it's worth
Matrix growth is gradual. Each new dimension seems harmless in isolation. Look for these signs in your Actions tab:
- Job count multiplies faster than coverage value. Adding one entry to a matrix dimension doesn't add one job. Instead, it multiplies across all other dimensions. A 4 × 3 matrix becomes 5 × 3 = 15 jobs, not 13. Every workflow run fans out into dozens of jobs, most of which pass identically.
- Billable minutes far exceed wall-clock time. A workflow that takes 8 minutes to complete shows 120+ billable minutes. That's because each of the 15+ matrix jobs runs independently, each billed at a minimum of 1 minute, and each repeating checkout, install, and setup steps.
- Cross-platform jobs that never fail differently. You test on Windows, Linux, and macOS, but failures are always in the application logic, never OS-specific. The Windows and macOS jobs pass for months without catching a single platform-specific bug, yet they consume 2x and 10x the per-minute rate of Linux.
- Non-LTS versions in the matrix. The matrix includes Node 17, 19, and 21, all odd-numbered versions that are already end-of-life. Or it tests Python 3.8 for an application that requires 3.11+. These jobs consume minutes without providing actionable signal.
Metrics
The combinatorial math behind matrix cost
GitHub Actions bills each matrix job independently, rounded up to the nearest minute. A 24-job matrix where each job runs for 8 minutes costs 24 × 8 = 192 billable minutes per workflow run, even though wall-clock time is just 8 minutes. Compare that to running 4 representative jobs:
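| Scenario | Jobs | Min/run | Linux/mo |
|---|---|---|---|
| Full cross-product matrix | 24 | 192 | $506 |
| Reduced matrix on PRs | 4 | 32 | $84 |

At $0.006/min (Linux 2-core), 20 runs/day, 22 working days. Savings: $422/mo, or $5,064/year, per workflow.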
That's all Linux. If the matrix includes macOS jobs at $0.062/min, a single macOS matrix dimension with 8 entries costs about $1,746/mo on its own at the same 8 min/job and 20 runs/day. Dropping macOS from PR builds and running it only on main can save thousands per year from one workflow change.
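The Linux figures can be checked by hand. The sketch below assumes 20 workflow runs per day, the same volume the reference table in this guide uses; substitute your own run count and job duration.

```latex
\begin{aligned}
\text{monthly cost} &= \text{jobs} \times \text{min/job} \times \text{runs/day} \times \text{days/month} \times \text{rate} \\
\text{full matrix} &= 24 \times 8 \times 20 \times 22 \times \$0.006 = \$506.88 \\
\text{reduced matrix} &= 4 \times 8 \times 20 \times 22 \times \$0.006 = \$84.48 \\
\text{difference} &= \$506.88 - \$84.48 = \$422.40 \text{ per month}
\end{aligned}
```

Because the job count is a plain multiplier, every entry removed from the matrix cuts the bill in direct proportion.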
Fix 1
Use a smaller matrix on pull requests
Most PR builds don't need the full compatibility matrix. A developer changing application logic doesn't need to validate against 4 Node versions and 3 operating systems on every push. Run a representative subset on PRs, and save the full matrix for main or release branches where the broader coverage actually matters.
The cleanest approach uses a prep job that outputs different JSON matrices based on the event type. The build job consumes that output via fromJSON.
    name: CI
    on:
      push:
        branches: [main]
      pull_request:

    jobs:
      matrix-prep:
        runs-on: ubuntu-latest
        outputs:
          matrix: ${{ steps.set-matrix.outputs.matrix }}
        steps:
          - id: set-matrix
            run: |
              if [ "${{ github.event_name }}" == "push" ]; then
                # Full matrix on main: 4 versions × 3 OS = 12 jobs
                echo 'matrix={"node-version":["18","20","22","23"],"os":["ubuntu-latest","windows-latest","macos-latest"]}' >> $GITHUB_OUTPUT
              else
                # PR matrix: 2 versions × 1 OS = 2 jobs
                echo 'matrix={"node-version":["20","22"],"os":["ubuntu-latest"]}' >> $GITHUB_OUTPUT
              fi

      test:
        needs: matrix-prep
        strategy:
          matrix: ${{ fromJSON(needs.matrix-prep.outputs.matrix) }}
          fail-fast: true
        runs-on: ${{ matrix.os }}
        steps:
          - uses: actions/checkout@v4
          - uses: actions/setup-node@v4
            with:
              node-version: ${{ matrix.node-version }}
          - run: npm ci
          - run: npm test
This drops the PR matrix from 12 jobs to 2, which is an 83% reduction in billable minutes per PR run. The full 12-job matrix still runs on every merge to main, so you don't lose coverage. You just stop paying for it on every intermediate push.
One caveat: the prep job itself consumes a billed minute (rounded up from a few seconds of shell execution). For matrices with fewer than 3 jobs, the overhead of the prep job may not be worth it. For matrices above 6 jobs, the savings are substantial.
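If the prep job's billed minute is a concern, one possible alternative is to choose each matrix dimension inline with an expression, so no extra job runs. This is a sketch rather than a drop-in replacement: it assumes the `github` context is available inside `strategy.matrix` and relies on the `condition && a || b` idiom, which works here because both JSON strings are non-empty. Test it on a branch before relying on it.

```yaml
jobs:
  test:
    strategy:
      matrix:
        # Each fromJSON call picks a list based on the triggering event:
        # the full list on push, the reduced list on pull_request.
        node-version: ${{ fromJSON(github.event_name == 'push' && '["18","20","22","23"]' || '["20","22"]') }}
        os: ${{ fromJSON(github.event_name == 'push' && '["ubuntu-latest","windows-latest","macos-latest"]' || '["ubuntu-latest"]') }}
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node-version }}
      - run: npm ci
      - run: npm test
```

The trade-off is readability: the prep job keeps the matrices in one obvious place, while the inline form saves a job at the cost of denser expressions.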
Fix 2
Replace cross-products with exact combinations
The default matrix behavior computes a Cartesian product: every value in every dimension combined with every value in every other dimension. But you rarely need all those combinations. You probably don't need to test Node 18 on macOS and Node 20 on macOS and Node 22 on macOS. You need Node 18 on Linux, Node 22 on Linux, and Node 22 on macOS.
GitHub supports an include-only matrix that skips the cross-product entirely. You list exactly the combinations you want, and only those jobs run.
    # Before: full cross-product
    strategy:
      matrix:
        node: [18, 20, 22, 23]
        os:
          - ubuntu-latest
          - windows-latest
          - macos-latest
    # 4 × 3 = 12 jobs
    # Most combinations add no signal

    # After: include-only, 5 explicit jobs
    strategy:
      matrix:
        include:
          - node: 18
            os: ubuntu-latest
          - node: 20
            os: ubuntu-latest
          - node: 22
            os: ubuntu-latest
          - node: 22
            os: windows-latest
          - node: 22
            os: macos-latest
Same coverage where it matters: Node 18, 20, and 22 validated on Linux, and the latest of those validated cross-platform. Node 23 drops out entirely. That takes the matrix from 12 jobs to 5, a 58% reduction. The include-only approach also makes the matrix self-documenting: anyone reading the workflow sees exactly what runs, instead of having to mentally compute a cross-product.
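A middle ground is to keep a small base matrix and append the extra combinations with `include`. The sketch below should produce the same 5 jobs, assuming GitHub's documented behavior that an `include` entry which would overwrite an existing matrix value is added as a new combination instead of being merged into an existing one.

```yaml
strategy:
  matrix:
    node: [18, 20, 22]
    os: [ubuntu-latest]   # base matrix: 3 × 1 = 3 Linux jobs
    include:
      # Both entries would overwrite os, so each becomes its own job.
      - node: 22
        os: windows-latest
      - node: 22
        os: macos-latest
```

This form is handy when the Linux version list changes often, since adding a Node version is a one-token edit.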
Fix 3
Remove matrix dimensions that don't catch bugs
Every dimension in the matrix should earn its place by catching bugs that other dimensions miss. If the Windows jobs haven't caught a single platform-specific failure in 6 months, they're not providing signal, just generating invoices. Audit each dimension against your failure history.
| Dimension | Keep if | Drop if |
|---|---|---|
| OS variants | You ship native binaries or use OS-specific APIs | App only deploys to Linux containers |
| Runtime versions | You publish a library consumed on multiple versions | You control the runtime in production (one version) |
| Database engines | You support Postgres and MySQL in production | You only deploy against one database |
| Non-LTS versions | You need to verify upcoming breaking changes | The version is EOL and not used by consumers |
A common pattern in Node.js projects: the matrix tests versions 16, 18, 20, 22. But Node 16 reached end-of-life in September 2023, and Node 18 maintenance ended in April 2025. If your package.json specifies "engines": { "node": ">=20" }, testing 16 and 18 is pure waste. Removing two entries from a 4 × 3 matrix cuts it from 12 to 6 jobs, effectively halving your CI bill for that workflow.
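As an illustration, a matrix trimmed to match that `engines` field might look like the fragment below. The OS list is carried over from the 4 × 3 example and is an assumption about what the project still needs.

```yaml
strategy:
  matrix:
    # Only versions allowed by "engines": { "node": ">=20" }
    node: [20, 22]
    os: [ubuntu-latest, windows-latest, macos-latest]
# 2 × 3 = 6 jobs instead of 12
```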
Fix 4
Use fail-fast to stop wasting minutes on broken builds
When one matrix job fails, do you need the other 23 to keep running? By default, GitHub Actions sets fail-fast: true, which cancels all remaining matrix jobs when any job fails. This is the correct default for cost optimization, but it can be accidentally disabled.
Verify your workflows haven't set fail-fast: false. Some teams disable it to "see all failures at once," but on a 24-job matrix with an 8-minute runtime, a failure in minute 2 means the other 23 jobs run for 6 unnecessary minutes each: 138 wasted minutes per failed run.
    # Before: fail-fast disabled
    strategy:
      fail-fast: false
      matrix:
        node: [18, 20, 22, 23]
        os: [ubuntu-latest, windows-latest]
    # Job 1 fails at minute 2
    # Jobs 2-8 keep running to completion
    # Billed: 2 + 7 × 8 = 58 minutes

    # After: fail-fast enabled
    strategy:
      fail-fast: true
      matrix:
        node: [18, 20, 22, 23]
        os: [ubuntu-latest, windows-latest]
    # Job 1 fails at minute 2
    # Jobs 2-8 cancelled within seconds
    # Billed: about 2-3 minutes per job, so roughly 16-24 minutes
    # if all 8 jobs had already started
If you need all failures visible for debugging, consider a hybrid approach: use fail-fast: true on PRs (where speed and cost matter) and fail-fast: false only on scheduled nightly runs where you specifically want the full failure report. For more on stopping canceled runs from wasting minutes, see our dedicated guide.
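One way to sketch the hybrid approach in a single workflow is to set fail-fast from an expression on the event name. The cron time below is a placeholder, and the example assumes `fail-fast` accepts an expression using the `github` context; confirm that against the workflow syntax reference for your setup.

```yaml
on:
  pull_request:
  schedule:
    - cron: "0 3 * * *"   # hypothetical nightly time, adjust as needed

jobs:
  test:
    strategy:
      # true on pull requests, false only for scheduled runs
      fail-fast: ${{ github.event_name != 'schedule' }}
      matrix:
        node: [18, 20, 22, 23]
        os: [ubuntu-latest, windows-latest]
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node }}
      - run: npm ci
      - run: npm test
```

Keeping both triggers in one file avoids maintaining a second nightly workflow whose matrix can drift from the PR one.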
Reference
How matrix size affects monthly cost
Use this table to estimate the cost impact of your matrix configuration. Assumes 8 min/job, 20 runs/day, 22 working days/month.
| Matrix size | Min/run | Linux/mo | Mixed OS/mo |
|---|---|---|---|
| 2 jobs | 16 | $42 | - |
| 6 jobs | 48 | $127 | $549 |
| 12 jobs | 96 | $253 | $1,098 |
| 24 jobs | 192 | $506 | $2,196 |
"Mixed OS" assumes equal splits across Linux ($0.006/min), Windows ($0.010/min), and macOS ($0.062/min), which is the typical result of adding an OS dimension to a matrix. The macOS jobs alone account for roughly 80% of the mixed-OS cost. If most of your jobs only need Linux, you may also be overpaying for runner types you don't need. GitHub imposes a limit of 256 jobs per matrix per workflow run.
Related guides
Too Many Small Jobs
Per-minute rounding turns dozens of tiny jobs into outsized bills. Consolidate to cut overhead.
Canceled Runs Wasting Minutes
Concurrency groups and fail-fast settings to stop paying for runs that are already obsolete.
Over-Parallelized Test Suites
When splitting tests across too many runners costs more in overhead than it saves in wall-clock time.
Reduce CI Setup and Install Overhead
Each matrix job repeats checkout and install. Reduce the per-job overhead that compounds at scale.