CI/CD Pipelines Done Right

Continuous Integration and Continuous Deployment changed how teams ship software. They also became the default to such a degree that most teams have a CI pipeline they inherited from someone else, and most pipelines we audit are slow, flaky, or unsafe in subtle ways. This article is the architectural overview of pipelines that genuinely speed up a team rather than slowing them down.

What CI/CD actually buys you

Three concrete things, each measurable:

  • Automatic verification of every change. Every commit runs the test suite. Bad commits do not merge. Without CI, regressions are caught on staging or in production; with CI, they are caught within minutes, before the change merges.
  • Reproducible builds. Builds happen in a known, clean environment, not on someone's laptop. The artifact produced is the artifact deployed.
  • Fast deploys. A merge to main produces a deployable artifact in minutes, not hours. The friction to release goes down.

The standard pipeline shape

flowchart LR
    Push[Git push] --> Lint[Lint + format check<br/>~30 sec]
    Lint --> Test[Unit tests<br/>~2 min]
    Test --> Build[Build artifact<br/>image, binary, package<br/>~3 min]
    Build --> Integ[Integration tests<br/>~5 min]
    Integ --> Stage[Deploy to staging]
    Stage --> Smoke[Smoke tests<br/>~1 min]
    Smoke --> Prod[Deploy to production<br/>manual or automatic]
    Prod --> Verify[Production smoke tests]
    style Stage fill:#fde68a,stroke:#b45309,color:#451a03
    style Prod fill:#dbeafe,stroke:#1e40af,color:#0c1e3b

A standard pipeline. Each stage gates the next; failures stop the pipeline. Most teams should aim for a 10-minute total from push to staging.

The platforms

  • GitHub Actions — integrated with GitHub, generous free tier (2000 minutes/month). YAML workflows. Most popular for open-source and small teams.
  • GitLab CI — built into GitLab. YAML pipelines. Stronger for self-hosted GitLab.
  • CircleCI — powerful caching and parallelism. Better than Actions for very large pipelines.
  • Buildkite — agents run on your infrastructure; Buildkite orchestrates. Used by larger engineering organisations.
  • Jenkins — the legacy standard. Still widely deployed; usually painful to operate.
  • Cloud-native: AWS CodePipeline, Google Cloud Build, Azure DevOps. Tied to the respective cloud.

For a new project today, GitHub Actions is almost certainly the right starting point. Migrate to a heavier platform only if you have specific reasons.

A simple GitHub Actions workflow

# .github/workflows/ci.yml
name: CI
on:
  push: { branches: [main] }
  pull_request:

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v4
    - uses: actions/setup-node@v4
      with: { node-version: 20, cache: npm }
    - run: npm ci
    - run: npm run lint
    - run: npm test -- --coverage
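    - uses: codecov/codecov-action@v4
      with:
        # v4 uploads generally need a token; the secret name here is an assumption
        token: ${{ secrets.CODECOV_TOKEN }}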

  build:
    needs: test
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    permissions:
      contents: read
      packages: write  # lets GITHUB_TOKEN push the image to ghcr.io
    steps:
    - uses: actions/checkout@v4
    - uses: docker/login-action@v3
      with:
        registry: ghcr.io
        username: ${{ github.actor }}
        password: ${{ secrets.GITHUB_TOKEN }}
    - uses: docker/build-push-action@v5
      with:
        push: true
        tags: ghcr.io/${{ github.repository }}:${{ github.sha }}

  deploy:
    needs: build
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    environment: production
    steps:
    - run: |
        # Trigger deploy: replace with your real deploy mechanism
        # --fail makes the step fail when the hook returns an HTTP error
        curl --fail -sS -X POST "$DEPLOY_HOOK" -H "Authorization: $DEPLOY_TOKEN"
      env:
        DEPLOY_HOOK: ${{ secrets.DEPLOY_HOOK }}
        DEPLOY_TOKEN: ${{ secrets.DEPLOY_TOKEN }}

This pipeline runs lint and tests on every pull request and on every push to main, builds and pushes a Docker image on main, and triggers a deploy. About 50 lines for a complete CI/CD setup.

Secrets management

Leaked secrets (database credentials, API keys, deploy tokens) are among the most common causes of CI security incidents. Two rules:

  1. Never put secrets in the repo, ever. Even private repos. They get pasted into Slack, copied into other systems, leaked.
  2. Use the platform's secret store: GitHub Actions Secrets, GitLab CI Variables, Vault, AWS Secrets Manager. Inject as environment variables at runtime.

For higher security, use OIDC: the CI runner authenticates to your cloud provider with a short-lived token instead of a long-lived secret. AWS, GCP, and Azure all support this with GitHub Actions natively.
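As a rough illustration of the OIDC approach with AWS, the workflow below requests a short-lived token and exchanges it for temporary credentials. The role ARN, region, and final command are placeholders, and the IAM role is assumed to already trust GitHub's OIDC provider for this repository.

```yaml
# Sketch only: the role ARN and region are hypothetical.
name: deploy-with-oidc
on:
  push:
    branches: [main]

permissions:
  id-token: write   # lets the job request an OIDC token from GitHub
  contents: read    # still needed for actions/checkout

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v4
    - uses: aws-actions/configure-aws-credentials@v4
      with:
        role-to-assume: arn:aws:iam::123456789012:role/ci-deploy  # hypothetical role
        aws-region: us-east-1                                      # assumed region
    # Later steps see short-lived AWS credentials in their environment.
    - run: aws sts get-caller-identity
```

No long-lived AWS key is stored in the repository's secrets; the credentials expire shortly after the job ends.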

Deployment strategies

Rolling deployment

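Replace old containers with new ones gradually. The most common pattern; default in Kubernetes Deployments and ECS services. Simple, with no extra infrastructure. Old instances keep serving while new ones start, so there is no downtime as long as health checks gate traffic and both versions can run side by side; the rollout takes about as long as the new pods need to start and pass those checks.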
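A minimal sketch of a Kubernetes Deployment tuned for a rolling update that keeps full capacity throughout. The service name, image, port, and health endpoint are assumptions for illustration.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                 # hypothetical service name
spec:
  replicas: 4
  selector:
    matchLabels:
      app: web
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0     # never drop below the desired replica count
      maxSurge: 1           # add at most one extra pod at a time
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: ghcr.io/example/web:3f2a9c1   # placeholder image, tagged by git SHA
        ports:
        - containerPort: 8080
        readinessProbe:     # a new pod receives traffic only after this passes
          httpGet:
            path: /healthz  # assumed health endpoint
            port: 8080
          periodSeconds: 5
```

With these settings an old pod is removed only after its replacement reports ready, which trades a slower rollout for steady capacity.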

Blue-green deployment

Stand up a second complete environment (green) running the new version, while the old (blue) still serves traffic. When green is verified healthy, switch the load balancer. Zero downtime; instant rollback (switch back to blue).

flowchart LR
    LB[Load balancer]
    LB -->|currently routing here| Blue[Blue<br/>old version]
    LB -.->|after switch| Green[Green<br/>new version]
    Green -.->|if problem<br/>switch back to blue| Blue

Blue-green: two environments, one switch flip to deploy.
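One possible way to express the switch in Kubernetes, assuming two Deployments whose pods carry the labels version: blue and version: green (names and ports are hypothetical). The Service plays the role of the load balancer in the diagram.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web
    version: blue    # change to "green" to switch traffic; change back to roll back
  ports:
  - port: 80
    targetPort: 8080
```

Editing one label selector is the whole deploy, which is why rollback is fast. The cost is running both environments at full size during the switch.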

Canary deployment

Send a small percentage of traffic (1%, 5%, 25%) to the new version while the rest goes to the old. Monitor metrics; if healthy, ramp up. If not, route everyone back to the old version.

Canary is the safest pattern for systems with significant traffic but adds operational complexity. For most small teams, rolling deploy with good metrics and fast rollback is sufficient.
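The ramp described above could be written down declaratively. The sketch below uses Argo Rollouts, which the article does not name; it is only one of several tools that can do this, and the service name, image, and pause durations are assumptions.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: web                  # hypothetical service name
spec:
  replicas: 20
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: ghcr.io/example/web:3f2a9c1   # placeholder image
  strategy:
    canary:
      steps:
      - setWeight: 5             # send about 5% to the new version
      - pause: {duration: 10m}   # watch metrics before ramping
      - setWeight: 25
      - pause: {duration: 10m}
      # after the last step the rollout is promoted to 100%
```

Without a traffic router configured, the weights are approximated by pod counts, so small percentages need enough replicas to be meaningful.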

Common antipatterns

  • Slow tests. A pipeline that takes 30 minutes is a pipeline engineers stop running. Aim for under 10 minutes total. Parallelise. Cache aggressively.
  • Flaky tests. Tests that fail randomly train engineers to retry until green, defeating the purpose. Mark known flakies as flaky; fix them. Zero tolerance for new flakies.
  • Manual deploys to staging. If staging requires a human to deploy, no one deploys to staging, and staging diverges from production. Auto-deploy main to staging always.
  • Deploy from main without verification. Auto-deploying main to production is reasonable only when you have strong canary or rollback infrastructure. Otherwise, gate production deploys on a manual approval.
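  • Tests passing in CI but failing locally (or the reverse). A sign that your test setup depends on the environment or is non-deterministic. The fix is to close the gap between CI and local (same runtime versions, same containers, same test data), not to widen it.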
  • Building images without tagging. Tag with the git SHA. Always know exactly which commit produced which image.
  • One environment per PR, adopted without measuring. Sounds great in theory and is genuinely useful for some teams, but it is operationally expensive. Try it; measure whether your team uses it; cut it if not.

Pipeline performance tips

  • Cache dependencies. npm, pip, Maven, Gradle, Cargo, Go modules — all cacheable. Saves 30+ seconds per run.
  • Cache Docker layers. BuildKit's cache-from + cache-to dramatically speed up image builds.
  • Run jobs in parallel. Lint, unit tests, type check can all run simultaneously. Wall-clock time matters more than CPU cost.
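  • Use larger runners for the slow steps. GitHub Actions offers larger runners (4-core, 8-core, 16-core) on paid plans, billed per minute at a higher rate than standard runners. Paying a few cents per minute for an 8-core runner to cut a 5-minute test run to 2 minutes is usually a good trade; check current pricing before relying on a specific figure.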
  • Skip work when nothing relevant changed. If only docs changed, skip the test pipeline. Path filters do this in YAML.
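Several of these tips can be combined in one workflow. The sketch below is illustrative rather than a drop-in file: the npm script names, ignored paths, and image tag are assumptions.

```yaml
name: CI
on:
  pull_request:
    paths-ignore:        # skip the workflow when only docs changed
    - 'docs/**'
    - '**.md'

jobs:
  # No "needs" between these three jobs, so they run in parallel.
  lint:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v4
    - uses: actions/setup-node@v4
      with: { node-version: 20, cache: npm }   # restores the npm download cache
    - run: npm ci
    - run: npm run lint

  typecheck:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v4
    - uses: actions/setup-node@v4
      with: { node-version: 20, cache: npm }
    - run: npm ci
    - run: npm run typecheck                   # assumed script name

  unit:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v4
    - uses: actions/setup-node@v4
      with: { node-version: 20, cache: npm }
    - run: npm ci
    - run: npm test

  image:
    needs: [lint, typecheck, unit]
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v4
    - uses: docker/setup-buildx-action@v3      # BuildKit builder, needed for cache export
    - uses: docker/build-push-action@v5
      with:
        push: false
        tags: example/web:ci                   # placeholder tag
        cache-from: type=gha                   # reuse layers from earlier runs
        cache-to: type=gha,mode=max            # save all layers, including intermediate stages
```

One caveat with path filters: if a skipped workflow is also a required status check, the pull request can sit in a pending state, so pair the filter with branch protection rules that account for it.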

Frequently Asked Questions

How fast should my pipeline be?

Under 10 minutes for the lint+test+build path is reasonable. Under 5 minutes is excellent. Above 15 minutes, engineers start working around the pipeline (pushing untested commits, batching changes). Treat pipeline speed as a developer experience metric.

Should I deploy automatically to production?

Depends on your testing maturity. If your tests genuinely cover the user-visible surface, automatic deploys are reasonable and increase shipping velocity. If your tests are thin, gate deploys on manual approval. Most small teams sit in the middle: auto-deploy to staging, manual approval for prod.

What is the difference between CI and CD?

CI (Continuous Integration) is automated build and test on every commit. CD (Continuous Delivery) is automated deploy to a non-prod environment, ready for human release. CD (Continuous Deployment) is fully automated deploy to production. The first two are universal; the third is a maturity step.

How do I handle database migrations?

Two patterns. First: backward-compatible migrations only, applied automatically on deploy. The new code must work with both the old and new schema, so you migrate in stages: add the new columns, deploy code that handles both schemas, then drop the old columns once nothing reads them. Second: gated migrations, with explicit approval for any non-trivial schema change.

Should I use mono-repo or multi-repo for CI?

Either works; CI patterns differ. Mono-repos need path-based triggers so a change in one service does not test all services. Multi-repos have simpler per-service pipelines but cross-repo coordination becomes a problem. Both are common at scale; pick based on team preferences, not CI considerations.
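For the mono-repo case, a path-based trigger in GitHub Actions might look like the following. The directory layout (services/api, packages/shared) and the workflow file name are hypothetical.

```yaml
# .github/workflows/api.yml (assumed layout: one workflow per service)
name: api
on:
  pull_request:
    paths:
    - 'services/api/**'
    - 'packages/shared/**'          # shared code the API depends on
    - '.github/workflows/api.yml'   # re-run when the workflow itself changes

jobs:
  test:
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: services/api
    steps:
    - uses: actions/checkout@v4
    - uses: actions/setup-node@v4
      with:
        node-version: 20
        cache: npm
        cache-dependency-path: services/api/package-lock.json
    - run: npm ci
    - run: npm test
```

Listing shared packages under paths matters: leave them out and a change to shared code will not re-test the services that depend on it.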
