CI/CD

Guide to integrating Semgrep into CI/CD pipelines on different platforms, covering blocking strategies and results management.

Recommended strategy

Start without blocking: run Semgrep in informational mode to measure the volume of findings
Filter out noise: tune rulesets and exclusions until the false positive rate is acceptable
Enable blocking gradually: block only on ERROR, then add WARNING once the team is ready
Differential scanning on PRs: scan only modified files for fast feedback
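
The gradual rollout above can be sketched as a small shell helper that maps a rollout phase to Semgrep flags. The phase names (`inform`, `block-error`, `block-warning`) and the `SEMGREP_PHASE` variable are hypothetical and exist only for this illustration; the flags themselves (`--severity`, `--error`) are the ones used elsewhere in this guide.

```shell
# Hypothetical helper: choose Semgrep flags for a rollout phase.
# Phase names are invented for this sketch, not part of Semgrep.
semgrep_flags_for_phase() {
  case "$1" in
    inform)        echo "" ;;  # report only: findings never fail the job
    block-error)   echo "--severity=ERROR --error" ;;
    block-warning) echo "--severity=ERROR --severity=WARNING --error" ;;
    *)             echo "unknown phase: $1" >&2; return 2 ;;
  esac
}

# Possible usage in a CI script (SEMGREP_PHASE is an assumed variable):
#   semgrep --config=auto $(semgrep_flags_for_phase "${SEMGREP_PHASE:-inform}") .
```

Keeping the phase in a single variable means that moving from informational to blocking mode is a one-line change in the pipeline configuration.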

GitHub Actions

Full scan on main + differential scan on PRs

name: Semgrep
on:
  pull_request:
  push: { branches: [ main ] }

jobs:
  scan:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      security-events: write  # needed by the SARIF upload step
    container:
      image: semgrep/semgrep
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Scan (diff on PRs, full on main)
        run: |
          mkdir -p reports
          if [ "${{ github.event_name }}" = "pull_request" ]; then
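            # Only files added or modified in the PR (deleted files are skipped)
            git diff --name-only --diff-filter=ACMR origin/${{ github.base_ref }}...HEAD > /tmp/changed.txt
            # Pass the changed files as positional targets; if the list is
            # empty, Semgrep falls back to scanning the current directory
            semgrep --config=auto \
              --sarif -o reports/semgrep.sarif \
              $(cat /tmp/changed.txt) || true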
          else
            # Full scan on main
            semgrep --config=auto \
              --sarif -o reports/semgrep.sarif . || true
          fi

      - name: Upload SARIF
        if: always()
        uses: github/codeql-action/upload-sarif@v3
        with: { sarif_file: reports/semgrep.sarif }

With local rules from the repo

name: Semgrep (custom rules)
on:
  pull_request:

jobs:
  scan:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      security-events: write  # needed by the SARIF upload step
    container:
      image: semgrep/semgrep
    steps:
      - uses: actions/checkout@v4
      - name: Scan with custom + registry rules
        run: |
          semgrep \
            --config=semgrep-rules/ \
            --config=p/owasp-top-ten \
            --config=p/secrets \
            --severity=ERROR \
            --error \
            --sarif -o semgrep.sarif .
      - name: Upload SARIF
        if: always()
        uses: github/codeql-action/upload-sarif@v3
        with: { sarif_file: semgrep.sarif }

Block the PR with a comment

      - name: Comment on PR
        if: failure() && github.event_name == 'pull_request'
        uses: actions/github-script@v7
        with:
          script: |
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: '⚠️ Semgrep found security findings. Review them under Security > Code scanning alerts.'
            })

GitLab CI

stages:
  - security

semgrep:
  stage: security
  image: semgrep/semgrep
  variables:
    SEMGREP_RULES: "p/owasp-top-ten p/secrets"
  script:
    # GitLab's SAST report expects GitLab's own JSON schema, not SARIF
    - semgrep --config=${SEMGREP_RULES// / --config=} --gitlab-sast -o gl-sast-report.json .
    - semgrep --config=${SEMGREP_RULES// / --config=} --json -o semgrep.json .
  artifacts:
    reports:
      sast: gl-sast-report.json
    paths:
      - semgrep.json
    when: always
  rules:
    - if: $CI_MERGE_REQUEST_IID
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
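
The `${SEMGREP_RULES// / --config=}` expansion in the job above turns a space-separated ruleset list into one `--config` flag per ruleset, but it relies on a shell that supports pattern substitution. A portable equivalent might look like the sketch below; the function name `config_flags` is hypothetical.

```shell
# Build one --config flag per ruleset from a space-separated list.
# Uses only POSIX sh features; sets the globals "flags" and "ruleset".
config_flags() {
  flags=""
  # $1 is left unquoted on purpose so the shell splits it on whitespace
  for ruleset in $1; do
    flags="${flags:+$flags }--config=$ruleset"
  done
  printf '%s\n' "$flags"
}

# Possible usage in the job script:
#   semgrep $(config_flags "$SEMGREP_RULES") --json -o semgrep.json .
```

With `SEMGREP_RULES` set as in the job above, the function prints `--config=p/owasp-top-ten --config=p/secrets`.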

pre-commit hook

Local scan before committing, to catch problems before they reach CI.

# .pre-commit-config.yaml
repos:
  - repo: https://github.com/semgrep/semgrep
    rev: v1.67.0
    hooks:
      - id: semgrep
        args: ["--config=auto", "--error", "--severity=ERROR"]

# Install
pip install pre-commit
pre-commit install

# Run manually on all files
pre-commit run semgrep --all-files

Semgrep App (cloud platform)

For teams that need a dashboard, findings management, and centralized triage.

# Log in (generates a token on semgrep.dev)
semgrep login

# Scan connected to the platform
semgrep ci

GitHub Actions with Semgrep App

      - name: Semgrep CI
        env:
          SEMGREP_APP_TOKEN: ${{ secrets.SEMGREP_APP_TOKEN }}
        run: semgrep ci

Platform advantages:

Dashboard with metrics and trends
Triage: mark findings as false positive or accepted risk
Policies: different rulesets per repo or team
Notifications: Slack, email, webhooks

Results management

Processing JSON with jq

# Count findings by severity
semgrep --config=auto --json . | jq '[.results[] | .extra.severity] | group_by(.) | map({(.[0]): length}) | add'

# List unique findings (rule ID + file and line)
semgrep --config=auto --json . | jq -r '.results[] | "\(.check_id)\t\(.path):\(.start.line)"' | sort -u

# Extract only findings with ERROR severity
semgrep --config=auto --json . | jq '.results | map(select(.extra.severity == "ERROR"))'

Comparing scans

# Save a baseline
semgrep --config=auto --json -o baseline.json .

# After making changes, compare
semgrep --config=auto --json -o current.json .
diff <(jq -r '.results[].check_id' baseline.json | sort) \
     <(jq -r '.results[].check_id' current.json | sort)
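
The `diff` above shows every difference in both directions. When only the findings introduced since the baseline matter, a set difference is easier to read. The sketch below is one way to do it; the function name `new_findings` and the `check_id<TAB>path` key format are assumptions made for this example.

```shell
# new_findings BASELINE_KEYS CURRENT_KEYS
# Each argument is a file with one finding key per line, for example
# "check_id<TAB>path". Prints the keys that appear only in CURRENT_KEYS.
new_findings() {
  base_sorted=$(mktemp)
  cur_sorted=$(mktemp)
  sort -u "$1" > "$base_sorted"
  sort -u "$2" > "$cur_sorted"
  # comm -13 hides lines unique to the first file and lines common to both
  comm -13 "$base_sorted" "$cur_sorted"
  rm -f "$base_sorted" "$cur_sorted"
}

# The key files can be produced from the JSON reports, for instance:
#   jq -r '.results[] | "\(.check_id)\t\(.path)"' baseline.json > baseline.keys
#   jq -r '.results[] | "\(.check_id)\t\(.path)"' current.json  > current.keys
#   new_findings baseline.keys current.keys
```

Leaving line numbers out of the key keeps a finding from being counted as new just because unrelated edits shifted it within the file.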

Common CI errors

Timeout: on large repos use --timeout=300 (seconds a rule may spend on a single file) and --timeout-threshold=3 (number of rule timeouts on a file before that file is skipped)
Unexpected exit code 1: --error makes any finding return exit code 1. Without --error, Semgrep returns 0 whenever the scan completes, even if there are findings; other non-zero codes mean the scan itself failed
Slow scans on PRs: use differential scanning (changed files only) instead of a full scan
Empty SARIF: check that the output directory exists (mkdir -p reports) before running the scan
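
To make the exit-code behaviour above concrete, the following sketch separates "findings were reported" from "the scan itself failed", which a blanket `|| true` hides. It assumes Semgrep's documented convention (0 for a completed scan, 1 for findings with --error, any other code for a failure); the function name `classify_semgrep_exit` is hypothetical.

```shell
# Hypothetical helper: turn a Semgrep exit code into a CI decision.
classify_semgrep_exit() {
  case "$1" in
    0) echo "pass" ;;        # scan completed: no findings, or --error not set
    1) echo "findings" ;;    # findings reported while --error was set
    *) echo "scan-error" ;;  # any other code: the scan itself failed
  esac
}

# Possible usage in a CI script:
#   semgrep --config=auto --error . ; status=$?
#   case "$(classify_semgrep_exit "$status")" in
#     scan-error) exit "$status" ;;            # always fail on a broken scan
#     findings)   echo "findings reported" ;;  # fail or not, depending on phase
#   esac
```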
