Runner & infrastructure mismatch

When queue time exceeds job runtime

By Keith Mazanec, Founder, CostOps · Updated February 17, 2026

A developer pushes a commit. The lint check takes 20 seconds to run, but it waits 3 minutes in the queue for a runner. The entire pipeline is gated on this job, so every downstream step inherits the delay. Queue time is not billed, but it destroys feedback loops and drives developers to re-push, creating more queued jobs and compounding the problem. What starts as a runner bottleneck becomes a vicious cycle of wasted minutes and developer frustration.

Symptoms

How to tell if queue time is your bottleneck

Open your repository's Actions Performance Metrics or check individual job timings. Look for these patterns:

Queue time exceeds run time on 25%+ of jobs. Click into any workflow run and compare the "Waiting for a runner" time to the actual execution time. If the queue phase is longer than the execution phase on a quarter or more of your jobs, you have a runner availability bottleneck.
Short jobs with long waits. Jobs that take under 2 minutes to execute but consistently wait 2+ minutes for a runner. Typical offenders: lint, format, typecheck, and smoke tests. These are the jobs that should give developers the fastest feedback, but queue delays invert their purpose.
macOS or Windows jobs queued for minutes (or hours). GitHub caps concurrent macOS jobs at 5 on Free, Pro, and Team plans. If you run a matrix of 3 Xcode versions with build + test jobs, that is 6 jobs competing for 5 slots. One always queues, and during busy periods the backlog grows.
Developers re-pushing to "unstick" CI. When queue times are long, developers often push empty commits or rebase to trigger a fresh run, hoping it will get picked up faster. This creates more queued jobs and makes the problem worse. Look for multiple runs on the same branch started within a 5-minute window.
High runs-per-PR despite few commits. A pull request with 3 commits triggers 6–8 workflow runs. The extra runs come from developers re-pushing or clicking “Re-run all jobs” because the original run was stuck in queue. This is a leading indicator that queue time is driving unnecessary cost.
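The first symptom can be checked mechanically once you have per-job timings. The sketch below assumes you have already exported one line per job containing two whitespace-separated numbers, queue seconds and run seconds. That input format and the function name queue_bound_share are illustrative assumptions, not something GitHub emits directly.

```shell
#!/usr/bin/env bash
# Sketch: report the percentage of jobs whose queue time exceeded run time.
# Input (stdin): one job per line, "<queue_seconds> <run_seconds>".
# The input format and function name are assumptions for illustration.
queue_bound_share() {
  awk '
    NF >= 2 { total++; if ($1 > $2) slow++ }
    END {
      if (total == 0) { print "0"; exit }
      printf "%d\n", (slow * 100) / total   # integer percent, truncated
    }
  '
}

# Example: 4 jobs, 2 of them waited longer than they ran -> prints 50
printf '%s\n' "180 20" "10 60" "200 90" "5 300" | queue_bound_share
```

A result of 25 or higher matches the threshold described in the first symptom.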

Metrics

The hidden cost of queue time

Queue time is not billed by GitHub. But it has real costs that do not appear on any invoice. Consider a team with 8 developers, each waiting on CI feedback throughout the day:

Before optimization

Jobs/day 120

Avg queue time 3 min

Avg run time 1 min

Daily queue waste 360 min

Dev wait cost $450/day

At $75/hr fully loaded (6 hours × $75)

After optimization (queue < 30s)

Jobs/day 90

Avg queue time 25 sec

Avg run time 1 min

Daily queue waste 38 min

Dev wait cost $47/day

Save $403/day · $8,060/mo (at 20 working days) · in developer time

The runner bill stays roughly the same (queue time is not billed), but developer throughput improves dramatically. And because shorter queues reduce re-push behavior, you also end up with fewer total jobs, which further reduces the $0.006/min bill on Linux runners. The compounding effect is real: fewer stale jobs means shorter queues, which means fewer re-pushes, which means fewer stale jobs.
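The before and after figures can be reproduced with a small calculation. This is a rough sketch that assumes, as the example does, that every queued minute is a minute a developer spends waiting at the $75/hr fully loaded rate. The function name wait_cost is hypothetical.

```shell
# wait_cost JOBS_PER_DAY AVG_QUEUE_SECONDS HOURLY_RATE
# Prints daily queue minutes and the developer cost of that wait.
wait_cost() {
  awk -v jobs="$1" -v secs="$2" -v rate="$3" 'BEGIN {
    minutes = jobs * secs / 60
    printf "%.1f min/day, $%.0f/day\n", minutes, minutes / 60 * rate
  }'
}

wait_cost 120 180 75   # before: 360.0 min/day, $450/day
wait_cost 90 25 75     # after:  37.5 min/day, $47/day
```

Swap in your own job counts and hourly rate to see whether queue time is worth attacking before anything else.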

The re-push cycle

Queue time causes more runs

Developer wait cost is only half the story. Queue time itself is not billed, but every re-push triggered by queue frustration generates a full workflow run that is. Here is what happens when an 8-person team averages 2 unnecessary re-pushes per developer per day on a 15-minute Linux workflow:

With queue-driven re-pushes

Normal runs/day 18

Extra re-push runs/day 16

Minutes/run 15

Monthly minutes 11,220

Monthly cost $67/mo

34 runs/day × 15 min × 22 days × $0.006/min

After fixing queue time

Normal runs/day 18

Extra re-push runs/day 0

Minutes/run 15

Monthly minutes 5,940

Monthly cost $36/mo

Save $32/mo · $380/year · per workflow

That is one workflow on Linux at $0.006/min. On macOS at $0.062/min, where queue time is worst due to the 5-job concurrency cap, the same re-push overhead (16 runs/day × 15 min × 22 days) costs $327/mo instead of $32/mo. And macOS queues are the ones most likely to trigger re-pushes in the first place. Breaking the re-push cycle is the single highest-leverage fix for teams hitting runner concurrency limits.
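The billing arithmetic behind these figures is simple enough to script. The sketch below uses the rates and the 22 working days quoted above. The function name monthly_cost is a hypothetical helper.

```shell
# monthly_cost RUNS_PER_DAY MINUTES_PER_RUN WORKDAYS RATE_PER_MINUTE
# Prints the monthly bill in dollars for one workflow.
monthly_cost() {
  awk -v runs="$1" -v mins="$2" -v days="$3" -v rate="$4" 'BEGIN {
    printf "%.2f\n", runs * mins * days * rate
  }'
}

monthly_cost 34 15 22 0.006   # all runs, Linux:        67.32
monthly_cost 18 15 22 0.006   # normal runs only:       35.64
monthly_cost 16 15 22 0.006   # re-push overhead:       31.68
monthly_cost 16 15 22 0.062   # same overhead on macOS: 327.36
```

The third and fourth lines isolate the 16 extra runs per day, which is the part of the bill that disappears once queue-driven re-pushes stop.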

Fix 1

Use concurrency groups to free runner slots

The most common cause of queue pressure is stale jobs occupying runner slots. When a developer pushes 3 commits quickly, 3 workflow runs queue up. The first two are already obsolete. Without concurrency groups, all three compete for runners, and downstream jobs wait for all of them to clear.

Adding a concurrency group with cancel-in-progress: true automatically cancels the older runs when a newer one arrives. This frees runner slots immediately and breaks the re-push cycle: fewer concurrent runs means shorter queues, which means fewer frustrated re-pushes.

.github/workflows/ci.yml

name: CI

on:
  push:
    branches: [main]
  pull_request:

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
  cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm run lint

The group key ${{ github.workflow }}-${{ github.head_ref || github.run_id }} scopes cancellation to the same workflow on the same PR branch. Pushes to different branches do not cancel each other. On push events github.head_ref is empty, so the key falls back to the unique github.run_id and each push to main lands in its own group. Include github.workflow in the group name to avoid cancelling jobs from different workflows. The conditional cancel-in-progress expression cancels stale PR runs but never cancels production builds on main.

One caveat: GitHub's concurrency queue depth is fixed at 1. In a concurrency group, there can be at most one running job and one pending job. Any previously pending job is replaced when a new one enters the group, even without cancel-in-progress: true. This is usually the right behavior for PR checks, but be aware of it for deployment pipelines where you want FIFO ordering.

Fix 2

Consolidate short jobs to reduce queue contention

Every job in a workflow requests its own runner. If you have separate jobs for lint, format, typecheck, and unit tests, that is 4 runners requested simultaneously. Each job pays the queue-time overhead independently. And because GitHub rounds each job up to the nearest minute for billing, four 15-second jobs consume 4 billed minutes, not 1.

Combining them into a single job means one queue wait, one runner provisioned, and one billed minute instead of four.

4 jobs, 4 queue waits, 4 billed minutes

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - run: npm run lint
  format:
    runs-on: ubuntu-latest
    steps:
      - run: npm run format:check
  typecheck:
    runs-on: ubuntu-latest
    steps:
      - run: npm run typecheck
  unit:
    runs-on: ubuntu-latest
    steps:
      - run: npm test

1 job, 1 queue wait, 1 billed minute

jobs:
  checks:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: npm
      - run: npm ci
      - run: npm run lint
      - run: npm run format:check
      - run: npm run typecheck
      - run: npm test

The tradeoff is that individual step failures are less granular in the GitHub UI. You lose the separate green/red status per check. If that matters for your branch protection rules, keep the jobs separate but consider the next fix to isolate them from heavier workloads.
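The per-job rounding described above can be checked with a few lines of shell. This is a minimal sketch of the round-up rule only: it takes job durations in seconds and ignores the setup time (checkout, dependency install) that each job also pays. The function name billed_minutes is hypothetical.

```shell
# billed_minutes DURATION_SECONDS...
# Rounds each job's duration up to a whole minute and sums the result.
billed_minutes() {
  local total=0 secs
  for secs in "$@"; do
    total=$(( total + (secs + 59) / 60 ))
  done
  echo "$total"
}

billed_minutes 15 15 15 15   # four separate 15-second jobs -> 4
billed_minutes 60            # one combined 60-second job   -> 1
```

Feed it the real durations from a workflow run to see how many billed minutes are pure rounding.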

Fix 3

Split fast checks into a separate workflow

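If you cannot consolidate jobs (because branch protection requires separate status checks), split your fast checks into their own workflow file. Splitting does not raise your concurrency limit, because runners are assigned job by job from the same account-wide pool. What it does is decouple the fast checks from the heavy workflow's dependency chain, concurrency group, and re-runs, so in practice a lightweight workflow with 1 or 2 jobs tends to get its runners and report back sooner than a heavy workflow requesting 10+ runners at once.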

This also gives developers fast feedback. If lint or typecheck fails, they know within 2–3 minutes and do not have to wait in queue for the full pipeline. This eliminates the most common trigger for re-pushes: waiting for CI with no signal.

.github/workflows/checks.yml (fast feedback)

name: Checks

on:
  pull_request:

concurrency:
  group: checks-${{ github.head_ref }}
  cancel-in-progress: true

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm run lint

  typecheck:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm run typecheck

.github/workflows/test.yml (heavy suite)

name: Tests

on:
  pull_request:

concurrency:
  group: tests-${{ github.head_ref }}
  cancel-in-progress: true

jobs:
  integration:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        shard: [1, 2, 3, 4]
    steps:
      - uses: actions/checkout@v4
      - run: npm run test:shard -- --shard=${{ matrix.shard }}/4

The fast Checks workflow requests 2 runners. The heavy Tests workflow requests 4. They run independently, so the lint result arrives while integration tests are still provisioning runners.

Fix 4

Reduce macOS and Windows queue pressure

macOS runners have the tightest concurrency caps and the longest queue times. GitHub allows only 5 concurrent macOS jobs on Free, Pro, and Team plans (50 on Enterprise Cloud). Windows is capped at the standard concurrent job limit but provisions more slowly than Linux.

The fix is to run the minimum necessary work on these expensive, constrained runners. Move lint, format, and typecheck to Linux (where they produce the same results), and reserve macOS/Windows for platform-specific tests. On PRs, run a single representative platform test. Run the full matrix only on main and release branches.

.github/workflows/ci.yml

jobs:
  # Fast checks on Linux (20 concurrent slots on Free plan)
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm run lint && npm run typecheck

  # Platform tests only where needed (5 concurrent macOS slots)
  test-macos:
    runs-on: macos-latest
    needs: [lint]
    steps:
      - uses: actions/checkout@v4
      - run: xcodebuild test -scheme MyApp

  test-windows:
    runs-on: windows-latest
    needs: [lint]
    steps:
      - uses: actions/checkout@v4
      - run: dotnet test

By gating platform-specific tests behind needs: [lint], macOS/Windows runners are only requested after fast checks pass. If lint fails, the expensive runners are never queued. This reduces both queue pressure and cost: macOS runners bill at $0.062/min (roughly 10x the $0.006/min Linux rate), so every avoided macOS job saves real money.

For even more savings, see our guide on reducing macOS/Windows CI spend for detailed patterns on matrix reduction and cross-compilation strategies.

Fix 5

Increase runner capacity for bottleneck platforms

If you have added concurrency groups, consolidated jobs, and reduced matrix fanout but queue time is still high, the bottleneck is raw capacity. There are two paths forward:

Option A: Self-hosted runners. Available on all plans, including Free. Self-hosted runners have no GitHub-imposed concurrency limit, so capacity is bounded only by your infrastructure. For macOS, this means running on a Mac mini fleet or a cloud Mac service like AWS EC2 Mac instances. Use the Actions Runner Controller (ARC) for Kubernetes-based autoscaling of Linux and Windows runners. See our guide on always-on self-hosted runners for cost tradeoffs.

Self-hosted runner label in workflow

jobs:
  test-macos:
    runs-on: [self-hosted, macOS, ARM64]
    steps:
      - uses: actions/checkout@v4
      - run: swift test

Option B: Larger runners. GitHub's larger runners have a separate concurrency pool of up to 1,000 jobs on Team and Enterprise Cloud plans. Moving CPU-bound jobs to larger runners means they finish faster (freeing slots sooner) and do not compete with standard runner jobs for the same pool. If you suspect your current runners are also underpowered for the workload, upgrading can solve both problems at once.

Team / Enterprise Cloud only: larger runners require GitHub Team or Enterprise Cloud. They are not available on Free or Pro plans.

Reference

Measure queue time with the GitHub API

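The Actions UI shows queue time only run by run or as an aggregate in Performance Metrics. To filter and rank individual runs you need the API. The created_at timestamp on a workflow run is when it was queued; run_started_at is when the first job started executing. The difference approximates queue time. Note that run_started_at resets on a re-run while created_at does not, so exclude re-run attempts or the numbers will be inflated.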

Query queue time with gh CLI

# List recent runs with queue time in seconds
gh api "repos/{owner}/{repo}/actions/runs?per_page=50" \
  --jq '.workflow_runs[] |
    {
      id: .id,
      name: .name,
      created: .created_at,
      started: .run_started_at,
      queue_seconds:
        ((.run_started_at | fromdateiso8601) -
         (.created_at | fromdateiso8601))
    } |
    select(.queue_seconds > 60)'

This filters for runs that waited more than 60 seconds in queue. Sort by queue_seconds descending to find the worst offenders. If specific workflows consistently show high queue times, those are the ones whose matrix size, concurrency groups, or runner type need attention.

For job-level queue time (more granular), query the workflow jobs endpoint and compare each job's created_at to its started_at. This reveals which specific jobs (e.g., macOS matrix legs) are queuing longest.
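A job-level version might look like the sketch below. It assumes an authenticated gh CLI run from inside a repository checkout (so gh can fill in {owner} and {repo}) and assumes the job timestamps parse with fromdateiso8601. The function names job_queue_times and slowest_queues are hypothetical helpers, and RUN_ID is a placeholder for a real workflow run id.

```shell
# job_queue_times RUN_ID
# Prints "<queue_seconds><TAB><job name>" for every started job in one run.
job_queue_times() {
  gh api "repos/{owner}/{repo}/actions/runs/$1/jobs?per_page=100" \
    --jq '.jobs[]
      | select(.created_at != null and .started_at != null)
      | [((.started_at | fromdateiso8601) - (.created_at | fromdateiso8601)), .name]
      | @tsv'
}

# slowest_queues THRESHOLD_SECONDS
# Reads the TSV above on stdin, keeps rows over the threshold, and sorts
# them with the longest wait first.
slowest_queues() {
  awk -F '\t' -v limit="$1" '$1 + 0 > limit + 0' \
    | sort -t "$(printf '\t')" -k1,1nr
}

# Usage:
#   job_queue_times "$RUN_ID" | slowest_queues 60
```

Keeping the filter separate from the API call means the same sorting step can be reused on timings exported from any other source.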

Reference

GitHub Actions concurrency limits by plan

Queue time is directly influenced by how many jobs can run simultaneously. If your repository hits the concurrent job limit, new jobs queue until a slot opens. These limits are per organization (or per user for personal accounts). All repositories in the org share the same pool. A busy monorepo can easily consume the full quota, causing queue delays across every other repo in the organization.

Plan	Standard runners	macOS runners	Larger runners
Free	20	5	N/A
Pro	40	5	N/A
Team	60	5	1,000
Enterprise Cloud	500	50	1,000

The macOS limit is shared between standard and larger runners. GitHub Support can increase job concurrency limits via a support ticket, but only for Enterprise Cloud customers. For all other plans, self-hosted runners are the primary escape hatch.
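To see how a matrix interacts with these caps, a back-of-the-envelope check is enough. The sketch below assumes all jobs are requested at once and that nothing else in the organization is using the pool, which is optimistic for a shared account. The function name queued_jobs is hypothetical.

```shell
# queued_jobs REQUESTED_JOBS CONCURRENCY_CAP
# Prints how many jobs must wait when everything is requested at once.
queued_jobs() {
  local over=$(( $1 - $2 ))
  if [ "$over" -lt 0 ]; then over=0; fi
  echo "$over"
}

queued_jobs 6 5     # 3 Xcode versions x (build + test) on macOS -> 1
queued_jobs 12 20   # 12 Linux jobs on the Free plan             -> 0
```

Any non-zero result means at least one job waits for a full job duration before it can start, even on an otherwise idle account.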

Related guides

Too Many Small Jobs, Too Much Overhead

Dozens of tiny jobs repeat setup work and multiply queue waits.

Reduce macOS/Windows CI Spend

Move non-platform-specific checks to Linux to free scarce macOS slots.

Canceled Runs Are Wasting Minutes

Concurrency groups and debounced triggers to stop paying for stale runs.

Always-On Self-Hosted Runners

Evaluate when self-hosted runners make financial sense over GitHub-hosted capacity.

Underpowered Runners

When a bigger runner actually costs less per job because it finishes faster.

Overpowered Runners

The opposite problem: paying for CPU and RAM your jobs never use.