Reruns & flakiness
Integration tests rebuilding the world every time
By Keith Mazanec, Founder, CostOps · Updated January 31, 2026
A developer opens a PR. CI spins up a PostgreSQL service container, runs 400 migrations, seeds fixture data, and then executes 200 integration tests. The next push does it all again. Every run recreates the entire environment from scratch, and you pay for every minute of that setup. On a team merging 10 PRs a day, the database lifecycle alone can consume more CI minutes than the tests themselves.
Symptoms
How to tell if integration test setup is costing you money
Open your workflow run logs and look at the time spent before any test assertion runs. If the setup phase dominates, you have this problem.
- Long environment startup in logs. Your CI logs show 3–10 minutes of database creation, migration, and seeding before any test output appears. The "Setup Database" or "Prepare Test Environment" step dominates the job timeline.
- Low assertion-to-runtime ratio. Your test suite runs 200 tests in 15 minutes, but the tests themselves only account for 5 minutes of execution. The other 10 minutes are environment setup, service container startup, and fixture loading. That overhead scales with CI runs, not test count.
- Identical setup repeated across shards. If you shard tests across parallel jobs, each shard independently creates a database, runs migrations, and loads seeds. With 4 shards, you're paying for that setup 4 times per run, even though each shard's setup is identical. This compounds the cost of over-parallelized test suites.
- Service container pull overhead. The "Initialize containers" step pulls Docker images for PostgreSQL, Redis, or Elasticsearch on every run. Without image caching, this adds 30–90 seconds of pure download time before your job even starts.
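To put a number on the setup share, you can feed the JSON from `gh run view <run-id> --json jobs` into a short script. This is a sketch under assumptions: the `setup_fraction` helper and the "Run Tests" step name are illustrative, not part of any tool; the `steps` field with `startedAt`/`completedAt` timestamps is what the `gh` CLI returns for that query.

```ruby
require "json"
require "time"

# Given one job hash from `gh run view <run-id> --json jobs`, return the
# fraction of step time that elapsed before the test step started.
# "Run Tests" is an assumed step name - match it to your workflow.
def setup_fraction(job, test_step: "Run Tests")
  duration = ->(s) { Time.parse(s["completedAt"]) - Time.parse(s["startedAt"]) }
  total = job["steps"].sum(&duration)
  setup = job["steps"].take_while { |s| s["name"] != test_step }.sum(&duration)
  setup / total
end

# Hand-built sample mirroring the symptom above: 8 min setup, 4 min tests.
job = {
  "steps" => [
    { "name" => "Setup Database", "startedAt" => "2026-01-01T00:00:00Z",
      "completedAt" => "2026-01-01T00:08:00Z" },
    { "name" => "Run Tests", "startedAt" => "2026-01-01T00:08:00Z",
      "completedAt" => "2026-01-01T00:12:00Z" },
  ],
}
puts setup_fraction(job)  # two thirds of the job is setup
```

If that fraction is above roughly 0.5, the fixes below will recover more minutes than any test-level optimization.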
Metrics
Quantify the integration test setup tax
The cost compounds because setup overhead is per-run, not per-test. Every push to every PR pays the full environment creation cost. Here's a typical scenario for a team running integration tests on Linux runners:
Before optimization, at $0.006/min (Linux 2-core), 40% of every run is setup. After cutting that setup by 75%, the same workflow saves $24/mo ($288/year), per workflow.
That's one workflow on Linux. With 4 parallel shards, each repeating the same 8-minute setup, the waste quadruples to $96/mo in pure setup overhead. On macOS runners at $0.062/min, the same scenario costs $820/mo before optimization, with $246/mo recoverable from setup reduction alone.
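The arithmetic behind those figures, as a sketch; the run volume (~22 runs/day) and the 20-minute job length are assumptions chosen to match the numbers above, not measurements:

```ruby
RATE_LINUX = 0.006            # $/min, 2-core Linux runner
RATE_MACOS = 0.062            # $/min, macOS runner

runs_per_month = 22 * 30      # ~22 runs/day over a 30-day month (assumed)
job_min   = 20.0              # total job length (assumed)
setup_min = job_min * 0.40    # 40% is setup -> 8 min per run
saved_min = setup_min * 0.75  # setup cut 75% -> 6 min saved per run

linux_savings = runs_per_month * saved_min * RATE_LINUX
macos_total   = runs_per_month * job_min * RATE_MACOS

puts format("Linux savings: $%.2f/mo", linux_savings)  # close to the $24/mo above
puts format("macOS total:   $%.2f/mo", macos_total)    # close to the $820/mo above
```

The per-run savings are small; the volume is what makes them add up, which is why setup overhead deserves attention before test-level tuning.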
Fix 1
Use schema load instead of running migrations
The most common CI setup mistake is running every migration from scratch on each run. A project with 400 migrations replays years of schema history just to reach the current state. The fix is straightforward: run db:schema:load (or db:structure:load for SQL-format schemas) instead of db:migrate. This applies the final schema in one step instead of replaying every historical migration.
```yaml
# Before: replays every migration
- name: Setup Database
  run: bundle exec rails db:create db:migrate  # 400 migrations → 3-5 minutes
```
```yaml
# After: applies the current schema in one step
- name: Setup Database
  run: bundle exec rails db:create db:schema:load  # Single SQL file → 5-15 seconds
```
For Rails apps, ensure db/schema.rb (or db/structure.sql) is committed and up to date. For other frameworks, the equivalent is loading a schema dump rather than replaying migration history. Django applies migrations with migrate by default, but you can collapse old history with squashmigrations or restore an SQL dump of the schema before migrating; many Go projects use migration tools like goose, or declarative tools like atlas that apply the target schema directly.
One caveat: if your migrations contain data migrations or seed logic, those won't run via schema load. Move data seeds to a separate db:seed step or a fixture file.
Fix 2
Wrap tests in transactions instead of recreating the database
Some test suites recreate the database for every test or test file by dropping, creating, and migrating from scratch each time. This can add ~10 seconds per test of pure overhead. The alternative is to wrap each test in a database transaction and roll it back when the test finishes. The database stays intact; only the test's changes are undone.
Most test frameworks support this natively. rspec-rails enables it with use_transactional_fixtures (the Rails test framework calls the same setting use_transactional_tests). Prisma-based Jest suites can approximate it with interactive transactions via $transaction. Java's Spring rolls tests back automatically when the test class is annotated @Transactional. The key is ensuring your test framework is actually using this strategy rather than truncation or recreation.
```ruby
RSpec.configure do |config|
  # Each test runs inside a transaction that rolls back on completion.
  # No database recreation, no truncation, no leftover state.
  config.use_transactional_fixtures = true
end
```
If you use database_cleaner, check which strategy it's set to. The :transaction strategy is fastest. :truncation clears every table between tests, which costs roughly the same regardless of how little data the test wrote; :deletion issues DELETE against the accessed tables, which is usually faster than truncation when tests touch only a few rows but degrades badly on large tables. Switch to :transaction wherever possible:
```ruby
RSpec.configure do |config|
  config.before(:suite) do
    DatabaseCleaner.clean_with(:truncation)  # Once at start
  end

  config.before(:each) do
    DatabaseCleaner.strategy = :transaction  # Fast per-test cleanup
  end

  # Only use truncation for tests that need it (e.g., multi-connection)
  config.before(:each, type: :feature) do
    DatabaseCleaner.strategy = :truncation
  end

  config.before(:each) do
    DatabaseCleaner.start
  end

  config.after(:each) do
    DatabaseCleaner.clean
  end
end
```
One caveat: transaction rollback doesn't work for tests that span multiple database connections (e.g., system tests with a separate browser thread hitting the app server). For those tests, use truncation selectively, but keep the vast majority on transaction rollback.
Fix 3
Tune the database for CI (not for production)
GitHub Actions service containers run with production-default PostgreSQL settings: fsync = on, synchronous_commit = on, full_page_writes = on. These settings protect against data loss during crashes, but that is irrelevant in CI where the database is ephemeral. Disabling durability features can speed up database operations by 5–15x.
One wrinkle: GitHub Actions service containers don't support a `command` key (the supported keys are image, credentials, env, ports, volumes, and options), so you can't pass server flags through the `services:` block. Instead, start the container yourself in a run step, where everything after the image name is passed to the container's entrypoint:

```yaml
steps:
  - name: Start PostgreSQL (CI-tuned)
    run: |
      docker run -d --name postgres \
        -e POSTGRES_USER=test -e POSTGRES_PASSWORD=test -e POSTGRES_DB=app_test \
        -p 5432:5432 \
        --health-cmd pg_isready --health-interval 10s \
        --health-timeout 5s --health-retries 5 \
        postgres:16 \
        postgres -c fsync=off -c synchronous_commit=off \
          -c full_page_writes=off -c shared_buffers=256MB \
          -c work_mem=64MB -c maintenance_work_mem=128MB
  - name: Wait for PostgreSQL
    run: until pg_isready -h localhost -p 5432 -U test; do sleep 1; done
```

The arguments after postgres:16 replace the image's default command, so the -c flags reach the postgres server process (pg_isready for the wait step ships with the PostgreSQL client tools preinstalled on GitHub's Ubuntu runners). These flags disable write-ahead log durability (fsync, synchronous_commit, full_page_writes) and increase memory buffers for faster query execution. Teams applying this have reported test throughput improving from 78 tps to 1,215 tps, roughly a 15x improvement on database-heavy suites.
One caveat: never use these settings in production or staging environments. They sacrifice data durability for speed. In CI, where the database is destroyed after every run, that tradeoff is free.
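A cheap belt-and-suspenders guard can enforce that caveat at boot. This is a sketch under assumptions: the PG_UNSAFE_TUNING variable is hypothetical (set it alongside the fsync=off flags in CI), while CI=true is genuinely exported by GitHub Actions on every runner.

```ruby
# Hypothetical guard: only allow durability-off Postgres flags under CI.
# PG_UNSAFE_TUNING is an assumed variable for illustration; GitHub
# Actions really does set CI=true on its runners.
def unsafe_pg_tuning_allowed?(env = ENV)
  env["CI"] == "true"
end

if ENV["PG_UNSAFE_TUNING"] == "1" && !unsafe_pg_tuning_allowed?
  abort "Refusing to start: fsync=off requested outside CI"
end
```

Dropping this into an initializer makes an accidental copy-paste of the CI database config into staging fail loudly instead of silently running without durability.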
Fix 4
Replace global seed files with per-test factories
Many test suites load a large seed file at startup, often hundreds or thousands of rows across dozens of tables, to create a "realistic" baseline. But most tests only need a handful of records. Loading the full seed adds minutes of insert time that doesn't improve test coverage. The fix is to use factories (such as FactoryBot, often paired with Faker for realistic values) or fixtures that create only what each test needs.
```yaml
# Before: global seed adds 60-120 seconds per run
- name: Prepare Database
  run: |
    bundle exec rails db:schema:load
    bundle exec rails db:seed  # Seeds insert 5,000+ rows across 40 tables
```
```yaml
# After: no seeds - each test creates its own data via FactoryBot/fixtures
- name: Prepare Database
  run: bundle exec rails db:schema:load
# In tests:
#   let(:user) { create(:user) }
#   let(:repo) { create(:repository, user: user) }
```
If some tests genuinely need a complex data graph, create a shared context or let_it_be block (via the test-prof gem) that loads the data once per file rather than per test. This gives you the realistic setup without the per-test insertion cost.
```ruby
describe Billing do
  # Created once for the entire describe block, not per test.
  # test-prof wraps the group in a before_all transaction, so the
  # record persists across examples and per-test changes still roll back.
  let_it_be(:account) { create(:account, :with_full_history) }

  it "calculates monthly cost" do
    expect(account.monthly_cost).to eq(99.00)
  end

  it "includes free tier discount" do
    expect(account.discount).to be_positive
  end
end
```
Reference
Complete optimized integration test workflow
Here's a full GitHub Actions workflow combining all four fixes: schema load instead of migrations, a CI-tuned PostgreSQL container, transaction-based test cleanup, and no global seeds. It also uses a concurrency group to cancel superseded runs. Because service containers don't accept a `command` key, Postgres is started with `docker run` in a step so the tuning flags can be passed. Copy this and adjust the Ruby/Rails specifics for your stack.

```yaml
name: Integration Tests

on:
  pull_request:
  push:
    branches: [main]

# Cancel superseded runs on PRs, never on main
concurrency:
  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
  cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}

jobs:
  test:
    runs-on: ubuntu-latest
    services:
      redis:
        image: redis:7
        ports:
          - 6379:6379
    steps:
      - uses: actions/checkout@v4

      # Fix 3: CI-tuned Postgres, started manually so flags can be passed
      - name: Start PostgreSQL
        run: |
          docker run -d --name postgres \
            -e POSTGRES_USER=test -e POSTGRES_PASSWORD=test -e POSTGRES_DB=app_test \
            -p 5432:5432 \
            postgres:16 \
            postgres -c fsync=off -c synchronous_commit=off \
              -c full_page_writes=off -c shared_buffers=256MB -c work_mem=64MB

      - name: Wait for PostgreSQL
        run: until pg_isready -h localhost -p 5432 -U test; do sleep 1; done

      - uses: ruby/setup-ruby@v1
        with:
          bundler-cache: true  # Cache gems

      - name: Setup Database   # Fix 1: schema load, no migrations
        run: bundle exec rails db:schema:load
        env:
          DATABASE_URL: postgres://test:test@localhost:5432/app_test

      - name: Run Tests        # Fix 2 + 4: transactions, no seeds
        run: bundle exec rspec --tag integration
        env:
          DATABASE_URL: postgres://test:test@localhost:5432/app_test
          REDIS_URL: redis://localhost:6379
```
Reference
Setup overhead comparison by strategy
The right strategy depends on your test suite's needs. Here's how the common approaches compare for a project with 400 tables and 200 integration tests:
| Strategy | Per-test cost | Suite overhead |
|---|---|---|
| Drop + create + migrate per test | ~10s | ~33 min |
| Truncate all tables | ~200ms | ~40s |
| Delete from accessed tables | ~30ms | ~6s |
| Transaction rollback | <1ms | <1s |
The difference between the worst and best strategy is 33 minutes vs under 1 second for 200 tests. Even moving from truncation to transaction rollback saves 40 seconds per run, and at 30 runs/day on Linux that is $3.60/mo. Moving from per-test recreation to transaction rollback saves $178/mo at the same volume.
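The suite-overhead column is simply per-test cost times the 200-test suite; a quick sketch of the same arithmetic, using the per-test figures from the table:

```ruby
TESTS = 200

# Per-test cleanup cost in seconds, from the comparison table above.
per_test_seconds = {
  "drop + create + migrate" => 10.0,
  "truncate all tables"     => 0.2,
  "delete from tables"      => 0.03,
  "transaction rollback"    => 0.001,
}

per_test_seconds.each do |strategy, secs|
  total = secs * TESTS
  puts format("%-26s %8.1fs (%.1f min)", strategy, total, total / 60)
end
# drop+create+migrate comes to 2000s (~33 min); rollback stays under 1s
```

Running the numbers like this for your own suite size and runner rate is the quickest way to decide whether the migration from truncation to transactions is worth the effort.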
Related guides
Over-Parallelized Test Suites
When sharding tests into too many jobs multiplies your setup overhead.
Reduce CI Setup and Install Overhead
Cut checkout, install, and build time across all your CI jobs.
Dependency Cache Not Working
Fix broken cache keys and restore-keys to stop re-downloading dependencies every run.
E2E Tests Running Too Often
Run expensive end-to-end tests only when they add value, not on every push.