Guides / Speed up CI pipelines

Too much work per run

Speed up slow CI pipelines and optimize expensive jobs

By Keith Mazanec, Founder, CostOps · Updated February 17, 2026

A developer pushes to a PR branch. The CI pipeline takes 35 minutes to complete. They context-switch to another task. When they come back, a test has failed, so they fix it, push again, and wait another 35 minutes. Two feedback cycles burned an hour of wall time and 70 minutes of billable compute. Multiply across a team of 10 pushing 30 runs per day, and that's over 1,000 billable minutes a day, roughly $140/mo on Linux runners for this one workflow before counting the lost developer hours. Long-running pipelines are one of the most visible CI cost problems, and the fix is rarely "add more parallelism." It starts with finding where the money actually goes, then targeting the specific jobs and build steps that drive the bill.

Symptoms

How to tell if slow pipelines are costing you money

Open your repository's Actions tab and look at the run durations. If you see these patterns, you have a pipeline speed problem:

  • High p90 pipeline duration. Your median run might be 20 minutes, but the p90 is 35+ minutes. That tail represents 10% of your runs consuming disproportionate compute. When developers experience the 35-minute runs, they lose trust in CI speed and start re-triggering or context-switching.

  • A few jobs dominate the bill. GitHub Actions job cost follows a power law. The top 5 jobs by billable minutes typically account for 60–80% of total spend. If one integration test job runs 3–5x longer than others in the same workflow, that single job is where most of your money goes. You optimized lint and saved 30 seconds. The real waste is the 14-minute job you haven't touched.

  • The same build runs in multiple jobs. Your test job runs npm run build, your lint job runs npm run build, and your deploy job runs npm run build. Each job compiles identical source into identical output, independently. Three 8-minute builds means 24 minutes of build time for a single push, and only one of those builds was necessary.

  • No correlation between change size and run duration. A one-line config change takes the same 30 minutes as a 50-file refactor. This means your pipeline has no incremental behavior and does full builds, full test suites, and full installs on every run regardless of what changed. This is a sign you could benefit from path filtering to skip CI for non-code changes.
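For that last symptom, path filtering is a one-block fix in the workflow trigger. A minimal sketch, assuming docs and Markdown are the only non-code paths in your repo:

```yaml
# Sketch: skip CI entirely for changes that can't affect the build.
# The ignored paths below are examples; adjust to your repo layout.
on:
  push:
    branches: [main]
    paths-ignore:
      - '**.md'
      - 'docs/**'
  pull_request:
    paths-ignore:
      - '**.md'
      - 'docs/**'
```

One caution: if a workflow skipped by paths-ignore contains required status checks, GitHub leaves those checks pending and the PR can't merge. Apply path filters only to non-required workflows, or pair them with a no-op workflow of the same job name that succeeds on the ignored paths.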

Metrics

Quantify the cost of slow pipelines

The cost formula is billable minutes per run × runs per day × workdays per month × per-minute rate. Here's a typical scenario for a team running 30 CI runs per day on Linux ($0.006/min), where the pipeline averages 22 billable minutes per run:

Before optimization

  Runs/day: 30
  Billable min/run: 22
  Monthly minutes: 14,520
  Monthly cost: $87/mo

(22 min × 30 runs/day × 22 workdays at $0.006/min)

After optimization (40% fewer minutes)

  Runs/day: 30
  Billable min/run: 13
  Monthly minutes: 8,580
  Monthly cost: $51/mo

(13 min × 30 runs/day × 22 workdays at $0.006/min)

Savings: $36/mo · $432/year · per workflow

That's one workflow on Linux. On macOS runners at $0.062/min, which is 10x the rate, the same 22→13 minute reduction saves $370/mo per workflow. And this doesn't account for the productivity cost: a 2025 case study found that cutting pipeline time from 20 minutes to 9 minutes reduced runner hours by 35% and dropped flaky failures from 15% to 4%.
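The arithmetic above is worth scripting so you can re-run it as your numbers change. A minimal sketch using the article's assumptions ($0.006/min Linux 2-core rate, 22 workdays per month) as defaults:

```shell
# Monthly cost = billable min/run × runs/day × workdays × per-minute rate
# Defaults: $0.006/min (Linux 2-core) and 22 workdays, as in the table above.
ci_cost() {
  awk -v m="$1" -v r="$2" -v rate="${3:-0.006}" -v days="${4:-22}" \
    'BEGIN { printf "%.2f\n", m * r * days * rate }'
}

ci_cost 22 30          # before optimization  → 87.12
ci_cost 13 30          # after optimization   → 51.48
ci_cost 22 30 0.062    # same pipeline on macOS runners → 900.24
```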


Step 1

Find your hotspots: identify the top-cost jobs

Before optimizing anything, measure where the money actually goes. Most pipelines have 2–3 jobs that account for 60–80% of total billable minutes. Fixing the wrong job wastes effort. A 30% improvement on a 15-minute job saves more than a 90% improvement on a 1-minute job.

GitHub's billing page shows total minutes but not per-job breakdowns. Use the workflow jobs API to pull job-level timing and rank by total cost. This script does it over the last 30 days:

cost-by-job.sh
#!/bin/bash
# Rank jobs by total cost over the last 30 days
# Requires: gh CLI, jq
# Note: this sums raw job durations; GitHub bills each job rounded
# up to the nearest minute, so real costs run slightly higher.

OWNER="your-org"
REPO="your-repo"
# GNU date (Linux); on macOS use: date -v-30d +%Y-%m-%dT00:00:00Z
SINCE=$(date -d '30 days ago' +%Y-%m-%dT00:00:00Z)

# Get completed run IDs from the last 30 days
# (-f fields become URL-encoded query parameters with --method GET)
gh api --paginate --method GET \
  "/repos/$OWNER/$REPO/actions/runs" \
  -f "created=>=$SINCE" -f status=completed \
  --jq '.workflow_runs[].id' \
  | while read -r run_id; do
    # Get job name and duration for each run
    gh api "/repos/$OWNER/$REPO/actions/runs/$run_id/jobs?per_page=100" \
      --jq '.jobs[] |
        select(.conclusion != null) |
        [.name,
         ((.completed_at | fromdateiso8601) -
          (.started_at | fromdateiso8601))] | @tsv'
  done | awk -F'\t' '{
    seconds[$1] += $2; count[$1]++
  } END {
    for (j in seconds) {
      mins = seconds[j] / 60
      # Adjust rate for your runner type
      cost = mins * 0.006
      printf "%s\t%.0f mins\t%d runs\t$%.2f\n",
        j, mins, count[j], cost
    }
  # Sort on minutes (column 2): cost is proportional at a flat rate,
  # and the leading "$" in column 4 defeats a numeric sort
  }' | sort -t$'\t' -k2 -rn | head -10

A typical output shows a steep drop-off after the top 3–5 jobs:

Job name           Total mins   Runs    Cost     Share
test-integration   5,600        400     $33.60   37%
build              3,200        400     $19.20   21%
test-e2e           2,400        400     $14.40   16%
test-unit          1,200        400     $7.20    8%
lint               800          400     $4.80    5%
all other jobs     2,000        1,200   $12.00   13%

Or skip the scripting. CostOps tracks per-job billable minutes and cost automatically, so you can see your most expensive jobs ranked in a single view without querying the API run by run.

The top 3 jobs (test-integration, build, test-e2e) account for 74% of total cost. A targeted 30% reduction in these 3 jobs alone saves more than a 10% reduction spread across all jobs combined. Once you know your hotspots, apply the fixes below in order.
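If you want to compute the Share column yourself, a two-pass awk over name/minutes pairs works. The sample input below reuses the table's top three jobs, so the shares are relative to those three only:

```shell
# Compute each job's share of total billable minutes.
# Input: "job<TAB>minutes" lines, e.g. from cost-by-job.sh above.
printf 'test-integration\t5600\nbuild\t3200\ntest-e2e\t2400\n' \
  | awk -F'\t' '
      { mins[$1] = $2; total += $2 }
      END {
        for (j in mins)
          printf "%s\t%.0f%%\n", j, 100 * mins[j] / total
      }' \
  | sort -t$'\t' -k2 -rn
# → test-integration 50%, build 29%, test-e2e 21% (of these three jobs)
```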

Step 2

Add targeted caching to the expensive jobs

The most common reason a pipeline is slow is that it redoes work every run: reinstalling dependencies, recompiling code, rebuilding Docker images. GitHub's actions/cache can store and restore these outputs between runs, but only if the cache key is configured correctly. A cache keyed on github.sha misses every run. Key on your lockfile hash instead.

.github/workflows/ci.yml
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: npm          # Built-in dep caching
      - run: npm ci

      # Cache build output so unchanged source skips the build
      - uses: actions/cache@v4
        id: build-cache
        with:
          path: dist/
          key: build-${{ runner.os }}-${{ hashFiles('src/**', 'package-lock.json') }}
          restore-keys: |
            build-${{ runner.os }}-

      - if: steps.build-cache.outputs.cache-hit != 'true'
        run: npm run build

      - run: npm test

The restore-keys fallback is critical for incremental caching. When the exact key misses (because source changed), the prefix match restores the closest recent cache. Build tools with incremental compilation, such as Next.js, Webpack 5, Gradle, and Turborepo, can then rebuild only changed modules instead of starting from scratch. A full build that takes 8 minutes can drop to under 2 minutes with a warm incremental cache.

One caveat: cache transfer time matters. If your cached directory exceeds 5 GB, the restore step alone can take over a minute. At that point, the cache may cost more than it saves. Split large caches by purpose (dependencies vs. build output) and measure whether the cache step itself has become a bottleneck. GitHub provides 10 GB of cache storage per repository; if you're seeing low hit rates, check cache size and prune stale branch caches.
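Pruning on PR close can itself be automated. A sketch using the gh CLI's built-in cache subcommands (verify the flags against your gh version; `actions: write` permission is required to delete caches):

```yaml
# Sketch: delete a branch's caches when its PR closes,
# freeing the 10 GB quota for default-branch caches.
name: cleanup-pr-caches
on:
  pull_request:
    types: [closed]

permissions:
  actions: write   # needed for gh cache delete

jobs:
  cleanup:
    runs-on: ubuntu-latest
    steps:
      - name: Delete caches scoped to the closed PR
        env:
          GH_TOKEN: ${{ github.token }}
          GH_REPO: ${{ github.repository }}
          BRANCH: refs/pull/${{ github.event.pull_request.number }}/merge
        run: |
          gh cache list --ref "$BRANCH" --limit 100 --json id --jq '.[].id' |
            while read -r id; do
              gh cache delete "$id"
            done
```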

Step 3

Speed up build steps

Build steps (compile, bundle, transpile) are often the single largest line item in a CI workflow. When multiple jobs each run the same build from scratch, the cost multiplies. An 8-minute build running in 3 jobs costs 24 minutes per run. The fix has two parts: build once, and enable framework-level caching so even that single build runs faster.

Build once and share output via artifacts

The most impactful fix is structural: build once in a dedicated job, then share the output with downstream jobs using actions/upload-artifact and actions/download-artifact. Test, lint, and deploy jobs download the pre-built output instead of rebuilding it. This turns 3 builds into 1, saving 16 minutes per run immediately.

.github/workflows/ci.yml
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: npm
      - run: npm ci

      # Restore cached build output if source hasn't changed
      - uses: actions/cache@v4
        id: build-cache
        with:
          path: dist/
          key: build-${{ runner.os }}-${{ hashFiles('src/**', 'package-lock.json') }}
          restore-keys: |
            build-${{ runner.os }}-

      # Only build if cache missed
      - if: steps.build-cache.outputs.cache-hit != 'true'
        run: npm run build

      # Share output with downstream jobs
      - uses: actions/upload-artifact@v4
        with:
          name: build-output
          path: dist/
          retention-days: 1

  test:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: npm
      - run: npm ci
      - uses: actions/download-artifact@v4
        with:
          name: build-output
          path: dist/
      - run: npm test

Set retention-days: 1 on ephemeral build artifacts. You only need them for the duration of the workflow run. The default retention is 90 days for public repos, and keeping build artifacts that long wastes storage quota. The deploy job can also use download-artifact to deploy the exact artifact that was tested, never a fresh build.

Enable framework-level build caching

Most modern build tools support incremental compilation caches. When the GitHub Actions cache restores the tool's internal cache directory, the build tool can skip recompiling unchanged modules. This reduces even the non-cached builds from full recompilation to incremental updates.

Build tool Cache path Typical savings
Next.js .next/cache 60–90%
Webpack 5 node_modules/.cache/webpack 50–80%
Gradle ~/.gradle/caches 60–80%
Turborepo node_modules/.cache/turbo 70–95%
Docker (BuildKit) GHA cache backend 50–80%

For Next.js, cache .next/cache with a restore-keys fallback. On partial match, Next.js does an incremental rebuild, recompiling only changed modules. A full 5-minute build can drop to under 1 minute. For Gradle, use gradle/actions/setup-gradle@v4 with cache-read-only: true on non-main branches to prevent PR builds from polluting the cache.
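As a concrete example, a Next.js cache step might look like the sketch below. The cache path is the one Next.js documents; the exact key scheme is an illustration, not the only valid one:

```yaml
      # Sketch: persist the Next.js incremental cache between runs.
      # Exact key hits on identical source; the prefix fallback
      # enables incremental rebuilds when source changes.
      - uses: actions/cache@v4
        with:
          path: .next/cache
          key: nextjs-${{ runner.os }}-${{ hashFiles('package-lock.json') }}-${{ hashFiles('**/*.ts', '**/*.tsx', '**/*.js', '**/*.jsx') }}
          restore-keys: |
            nextjs-${{ runner.os }}-${{ hashFiles('package-lock.json') }}-
```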

Step 4

Split long test suites (but watch billable minutes)

If your test job is the bottleneck, say 20 minutes of a 25-minute pipeline, splitting it across parallel runners can cut wall-clock time significantly. But parallelism has a cost trap: GitHub bills each job independently, rounded up to the nearest minute. Splitting a 20-minute suite across 5 runners doesn't cost 20 minutes. It costs 5 × (4 min + setup overhead), and setup overhead (checkout, install, cache restore) can easily add 2–3 minutes per shard.

The key insight: parallelism reduces wall-clock time but can increase total billable minutes. Only add parallelism when it reduces total minutes, not just wall time. For a deeper look at the cost traps of excessive splitting, see over-parallelized test suites. Here's how to split tests with a matrix while controlling cost:

.github/workflows/ci.yml
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        shard: [1, 2, 3]
      fail-fast: false
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: npm
      - run: npm ci

      # Split by test file groups, not random assignment
      - run: npx jest --shard=${{ matrix.shard }}/3

Shards         Wall time   Billable min   Cost change
1 (no split)   22 min      22 min         baseline
2 shards       12 min      24 min         +9%
3 shards       9 min       27 min         +23%
5 shards       6 min       30 min         +36%

If you need parallelism for developer experience but want to control cost, limit shards on PRs (e.g., 2–3) and run the full split on main where correctness matters more than speed.
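One way to read the table: its numbers are consistent with roughly 20 minutes of test work plus about 2 minutes of setup per shard, with each job billed whole. A sketch you can feed your own numbers (the 2-minute overhead default is the assumption here):

```shell
# Billable minutes when splitting T minutes of test work across S shards,
# with H minutes of setup (checkout, install, cache restore) per shard:
#   billable = S × (ceil(T/S) + H);  wall time = ceil(T/S) + H
shard_cost() {
  local T=$1 S=$2 H=${3:-2}
  echo $(( S * ( (T + S - 1) / S + H ) ))   # (T+S-1)/S is integer ceil
}

shard_cost 20 1   # → 22  (the table's no-split baseline)
shard_cost 20 3   # → 27  (+23% billable; wall time drops to 9 min)
shard_cost 20 5   # → 30  (+36% billable for a 6-minute wall time)
```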

Step 5

Reorder jobs so cheap checks run first

A common pattern in slow pipelines is running expensive jobs (build, E2E tests, security scans) before cheap ones (lint, typecheck, unit tests). If the linter would catch an error in 30 seconds, but it doesn't run until after a 15-minute build, you've wasted 15 minutes on a run that was always going to fail. Our guide on running expensive jobs before cheap ones covers this pattern in detail.

Use the needs keyword to create a dependency graph where fast-fail jobs gate expensive work:

All jobs run in parallel
jobs:
  build:    # 8 min
  test:     # 12 min
  lint:     # 1 min
  e2e:      # 15 min
# Lint fails at 1 min, but build
# and e2e already burned 8+15 min
Cheap checks gate expensive jobs
jobs:
  lint:     # 1 min, runs first
  build:
    needs: lint  # 8 min
  test:
    needs: build # 12 min
  e2e:
    needs: build # 15 min
# Lint fails at 1 min → nothing
# else runs. Saved 35 minutes.

The tradeoff: chaining jobs increases wall-clock time on success (lint + build + test runs sequentially instead of in parallel). But it dramatically cuts billable minutes on failure, which is where most of the waste occurs. If your lint/typecheck failure rate is even 5%, the savings on failed runs outweigh the added wall time on passing runs.
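That claim is easy to sanity-check. With gating, a failed run bills only the gate job; without it, every run bills everything. A sketch of expected billable minutes per run, using the 1-minute lint and 35 minutes of downstream jobs from the example above:

```shell
# Expected billable minutes per run with fast-fail gating.
#   f    = probability the cheap gate (lint) fails
#   gate = gate job minutes; rest = minutes of jobs the gate protects
# Ungated pipelines bill gate + rest on every run, pass or fail.
expected_billable() {
  awk -v f="$1" -v gate="$2" -v rest="$3" \
    'BEGIN { printf "%.2f\n", f * gate + (1 - f) * (gate + rest) }'
}

expected_billable 0.05 1 35   # gated, 5% lint failure rate → 34.25
expected_billable 0    1 35   # ungated equivalent          → 36.00
```

At a 5% lint failure rate the gate saves 1.75 billable minutes per run on average, and the gap widens as the failure rate rises.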

Step 6

Move heavy jobs off PR triggers

Not every test needs to run on every PR. If your test-e2e job takes 6 minutes and has a 2% failure rate on PRs, 98% of those runs pass without catching anything. Running it 400 times a month costs 2,400 minutes × $0.006 = $14.40. Move the full E2E suite to main or a merge queue, and run a focused smoke test on PRs instead.

.github/workflows/ci.yml
jobs:
  # Runs on every PR (fast, cheap)
  test-unit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: npm
      - run: npm ci
      - run: npm test

  # Full E2E only on main and merge queue
  test-e2e:
    if: github.event_name != 'pull_request'
    runs-on: ubuntu-latest
    timeout-minutes: 15
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: npm
      - run: npm ci
      - run: npm run test:e2e

One caveat: if test-e2e is a required status check, skipping it on PRs will block merging. Either remove it from required checks and rely on the merge queue run, or replace it with a lightweight smoke test that satisfies the check. See our E2E tests running too often guide for patterns.
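If you adopt a merge queue, the heavy suite can target it directly with the merge_group event instead of the event_name check above. A sketch:

```yaml
# Sketch: run the full E2E suite in the merge queue and on main,
# never on individual PRs.
on:
  push:
    branches: [main]
  merge_group:

jobs:
  test-e2e:
    runs-on: ubuntu-latest
    timeout-minutes: 15
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm run test:e2e
```

With this trigger, test-e2e can remain a required check for the merge queue itself while PRs run only the cheap suite.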


Reference

Optimization strategies compared: cost vs. speed

Always prefer strategies that reduce both wall time and billable minutes (caching, artifact reuse, build-once patterns). Use parallelism selectively, only after you've exhausted the cheaper optimizations and only when the wall-time improvement justifies the cost increase.

Strategy                         Wall time     Billable min   $/run (Linux)
Caching (reduces per-job time)   ↓             ↓              ↓
Build once + artifact reuse      ↓             ↓              ↓
Fast-fail ordering               ↑ (on pass)   ↓ (on fail)    ↓ (net)
Move heavy jobs off PRs          ↓ (on PRs)    ↓              ↓
More parallel shards             ↓             ↑              ↑
Larger runner (e.g., 8-core)     ↓             ↓              varies

Reference

When larger runners make sense

For CPU-bound jobs (compilation, bundling, test execution), a larger runner can reduce wall-clock time and total cost, provided the speedup is proportional to the rate increase. But I/O-bound jobs like integration tests and database-heavy suites rarely benefit from more CPU cores.

Runner          Rate          Baseline cost   Break-even time
Linux 2-core    $0.006/min    $0.12           20 min
Linux 4-core    $0.012/min    $0.12           10 min
Linux 8-core    $0.022/min    $0.12           5.5 min
Linux 16-core   $0.042/min    $0.12           2.9 min

Baseline cost is the 20-minute build on the 2-core runner; break-even time is baseline cost ÷ runner rate.

Team / Enterprise Larger runners (4-core and above) require an organization on a GitHub Team or Enterprise Cloud plan, and they are never covered by included free minutes: you pay the larger-runner per-minute rate from the first minute.

The "break-even time" column shows how fast the build must complete on the larger runner to cost the same as 20 minutes on a 2-core. If an 8-core runner finishes the build in 6 minutes, the per-run cost is $0.132, which is 10% more but 3.3x faster. If it finishes in 5.5 minutes, you break even on cost while getting 3.6x the speed, and anything faster than that is a net saving. Always benchmark your actual build before committing to a larger runner.
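The break-even column is just the baseline cost divided by the larger runner's rate, which makes it easy to check against your own job and rates:

```shell
# Break-even minutes on a larger runner: the job must finish within
# (baseline minutes × baseline rate) / larger-runner rate.
break_even() {
  awk -v base_min="$1" -v base_rate="$2" -v big_rate="$3" \
    'BEGIN { printf "%.1f\n", base_min * base_rate / big_rate }'
}

break_even 20 0.006 0.012   # 4-core  → 10.0
break_even 20 0.006 0.022   # 8-core  → 5.5
break_even 20 0.006 0.042   # 16-core → 2.9
```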

FAQ

Common questions about speeding up CI pipelines

How do I find which GitHub Actions jobs cost the most?

Use the GitHub Actions workflow jobs API to pull job-level timing for your most expensive workflow. Each job exposes started_at and completed_at timestamps. Multiply each job's billable minutes by the runner rate ($0.006/min for Linux 2-core, $0.062/min for macOS). Rank by total cost over 30 days. The top 5 jobs typically account for 60-80% of total job-level spend. CostOps tracks this automatically.

Does splitting tests across parallel runners save money?

Not necessarily. GitHub bills each job independently, rounded up to the nearest minute. Splitting a 20-minute test suite across 5 runners costs 5 times the per-shard time plus setup overhead. Parallelism reduces wall-clock time but can increase total billable minutes by 20-40%. Only add shards when the wall-time improvement justifies the cost increase.

Should I build once and share artifacts, or build in every job?

Build once. If your workflow has multiple jobs that each run the same build step, you're paying for redundant builds. Use actions/upload-artifact in a dedicated build job and actions/download-artifact in downstream jobs. Set retention-days: 1 on PR artifacts to minimize storage costs. The artifact transfer adds about 10-30 seconds of overhead, which is negligible compared to the 5-10 minutes saved per redundant build.

How much cache storage does GitHub provide per repository?

GitHub provides 10 GB of cache storage per repository across all branches. When the total exceeds 10 GB, the oldest entries are evicted using a least-recently-used (LRU) policy. Stale branch caches from merged or deleted branches count against this limit. Pruning them on PR close helps maintain high cache hit rates on your default branch.

Do larger GitHub Actions runners save money?

Only if the speedup ratio exceeds the cost ratio. A 4-core Linux runner costs $0.012/min (2x the 2-core rate). If a job runs 2x faster on 4-core, the total cost is the same. If the speedup is less than 2x, you pay more. I/O-bound jobs like integration tests and database-heavy test suites rarely benefit from additional CPU cores.

Should I optimize all CI jobs equally?

No. Because job cost is concentrated (top 3-5 jobs drive 60-80% of spend), optimizing your most expensive jobs yields 5-10x the savings of spreading effort across all jobs. A 30% reduction in jobs that account for 72% of spend saves far more than the same effort spread across 20 smaller jobs.


See which jobs are slowing down your pipeline

CostOps breaks down per-job duration and cost across every workflow run, so you can find the bottlenecks before you refactor the YAML.

Free for 1 repo. No credit card. No code access.

Built by engineers who've managed CI spend at scale.