SAST Tools Comparison: Semgrep vs Checkmarx vs SonarQube vs Snyk Code
A deep technical comparison of leading SAST tools — false positive rates, CI/CD integration, custom rule writing, language coverage, and cost models.
Static Application Security Testing (SAST) analyzes source code, bytecode, or binaries without executing them. The goal is to catch security vulnerabilities during development rather than after deployment. But not all SAST tools are equal — they differ dramatically in accuracy, speed, language support, and how well they fit into a modern CI/CD pipeline.
This post compares four leading tools in depth: Semgrep, Checkmarx, SonarQube, and Snyk Code.
What SAST Actually Does
A SAST engine parses your code into an abstract syntax tree (AST) or a more complex representation like a control-flow graph (CFG) or data-flow graph. It then applies rules that describe patterns associated with vulnerabilities.
The simplest rules are syntactic: "find calls to eval() with user-controlled input." More sophisticated engines track taint flow — they follow data from an untrusted source (HTTP request, environment variable) through transformations to a sensitive sink (SQL query, shell command, HTML output).
The quality of taint tracking is the single biggest differentiator between tools. Poor taint tracking means either false positives (flagging safe code) or false negatives (missing real vulnerabilities).
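To make the source-to-sink idea concrete, here is a minimal sketch (hypothetical code, not from any tool's docs) of the two query-building styles a taint engine distinguishes — string concatenation that lets tainted input reach the SQL sink, and parameterization that keeps it inert:

```javascript
// Source: an untrusted value such as req.query.name.
// Sink:   the SQL string handed to the database driver.

// Unsafe: the tainted value is spliced into the SQL grammar itself.
function buildQuery(name) {
  return "SELECT * FROM users WHERE name = '" + name + "'";
}

// Safe: a parameterized query keeps the value as data, so a well-tuned
// taint engine should not report a flow here.
function buildParameterized(name) {
  return { text: "SELECT * FROM users WHERE name = $1", values: [name] };
}

const evil = "'; DROP TABLE users; --";
console.log(buildQuery(evil));         // payload becomes part of the SQL
console.log(buildParameterized(evil)); // payload stays inert in values[]
```

A taint engine reports the first function because untrusted data flows into the query string unmodified; the second cuts the flow at the parameter boundary.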
Semgrep: Pattern-Matching With Lightweight Taint
Semgrep is open-source and uses a pattern-matching approach with optional taint mode. It's deliberately simple: rules are YAML files with patterns that look like the code they're matching.
```yaml
# semgrep rule: detect SQL injection via string concatenation
rules:
  - id: sql-injection-string-concat
    patterns:
      - pattern: |
          $DB.query($QUERY + $USER_INPUT)
      - pattern-not: |
          $DB.query($QUERY + sanitize($USER_INPUT))
    message: "Potential SQL injection: user input concatenated into query"
    languages: [javascript, typescript]
    severity: ERROR
    metadata:
      cwe: CWE-89
      owasp: A03:2021
```
Writing custom Semgrep rules takes minutes, not days. The pattern language supports metavariables ($X), ellipsis matching (...), and negative patterns. You can also write rules that match across multiple files or track function arguments across call boundaries with taint mode:
```yaml
rules:
  - id: taint-xss
    mode: taint
    pattern-sources:
      - pattern: req.params.$X
      - pattern: req.query.$X
      - pattern: req.body.$X
    pattern-sinks:
      - pattern: res.send($SINK)
      - pattern: res.html($SINK)
    pattern-sanitizers:
      - pattern: DOMPurify.sanitize(...)
    message: "XSS: unsanitized user input reaches HTML output"
    languages: [javascript, typescript]
    severity: ERROR
```
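For intuition, here is a hypothetical Express-style handler of the shape that taint rule is meant to flag — `req.query` (a source) flows into `res.send` (a sink) with no sanitizer on the path. The mock `req`/`res` objects below are just stand-ins so the snippet runs without Express:

```javascript
// Reflected XSS: tainted req.query.name reaches res.send unsanitized.
// Routing the value through DOMPurify.sanitize() would cut the flow
// and suppress the finding.
function greet(req, res) {
  const name = req.query.name;                  // taint source
  res.send("<h1>Hello " + name + "</h1>");      // taint sink
}

// Minimal mocks to exercise the handler outside a real server.
const req = { query: { name: "<script>alert(1)</script>" } };
let sent;
const res = { send: (body) => { sent = body; } };
greet(req, res);
console.log(sent); // attacker-controlled markup reflected verbatim
```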
Semgrep's false positive rate on well-tuned rules is low. The tradeoff is that it requires maintaining your rule library, and deep interprocedural analysis (tracking data across many function calls) is limited compared to enterprise tools.
CI/CD integration is straightforward. Semgrep provides a Docker image and GitHub Actions:
```yaml
# .github/workflows/semgrep.yml
name: Semgrep SAST
on: [push, pull_request]
jobs:
  semgrep:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: returntocorp/semgrep-action@v1
        with:
          config: >-
            p/default
            p/owasp-top-ten
            p/nodejs
          auditOn: push
        env:
          SEMGREP_APP_TOKEN: ${{ secrets.SEMGREP_APP_TOKEN }}
```
Semgrep OSS is free with no scan limits. Semgrep Code (the paid tier) adds the managed platform, cross-file analysis, and priority support.
Checkmarx: Enterprise-Grade Deep Analysis
Checkmarx SAST (historically CxSAST, now part of the Checkmarx One platform) is a standard choice in regulated industries. It performs full interprocedural taint analysis with a query language (CxQL) that gives security teams fine-grained control.
CxQL queries look like SQL and allow you to define exactly what constitutes a source, sink, and sanitizer:
```
// CxQL: custom rule for detecting unsafe deserialization in Java
result = All.FindByName("ObjectInputStream")
            .GetParameters()
            .FindByType("InputStream")
            .GetCaller()
            .FindByName("readObject");
```
The depth of Checkmarx's analysis means it can track vulnerabilities across dozens of function calls and even across file boundaries. This is valuable for large enterprise codebases, but it comes with a cost: scans are slow (30-60 minutes for large repos is common) and require significant infrastructure.
False positive rates in Checkmarx are higher out of the box compared to Semgrep. The tool is powerful but requires tuning. Security teams typically spend considerable time writing exclusions and triaging results before developers trust the output.
Language coverage is broad: Java, C#, C/C++, JavaScript/TypeScript, Python, PHP, Go, Kotlin, Swift, and more. This makes it suitable for polyglot enterprises.
Cost model: Checkmarx is priced per line of code or per developer seat, typically in the range of tens of thousands of dollars annually for large teams. It's an enterprise procurement.
SonarQube: Code Quality Meets Security
SonarQube sits at the intersection of code quality and security. It's deployed as a server that developers interact with through IDE plugins (SonarLint) and CI pipeline integrations.
SonarQube ships rules mapped to the OWASP Top 10, CWE, and SANS Top 25 catalogs. The security rules perform some taint analysis, but SonarQube's primary strength is code quality: complexity, duplication, maintainability. Security is a secondary concern.
```properties
# sonar-project.properties
sonar.projectKey=my-app
sonar.sources=src
sonar.exclusions=**/*.test.ts,**/node_modules/**
sonar.javascript.lcov.reportPaths=coverage/lcov.info

# Security hotspot tracking
sonar.security.sources.javasecurity=true
```
SonarQube differentiates between "vulnerabilities" (confirmed issues) and "security hotspots" (code that requires manual review). Hotspots are useful because they surface security-sensitive code patterns without asserting they're definitely vulnerable — developers review and mark them safe or vulnerable.
Community Edition is free and covers most languages. The Enterprise and Developer editions add additional language support, branch analysis, and portfolio-level views.
False positive rates: SonarQube tends to generate more noise than Semgrep but less than Checkmarx in raw taint-flow findings. The hotspot model helps manage review burden.
Snyk Code: Developer-First SAST
Snyk Code is designed for developer adoption. It uses a symbolic AI engine (DeepCode's technology, acquired by Snyk), integrates with IDEs via the Snyk plugin, and surfaces results in pull requests through the Snyk GitHub App.
Speed is the headline: Snyk Code scans complete in seconds for most repos. This makes it viable as a blocking PR check without frustrating developers.
```bash
# Snyk Code scan via CLI
snyk code test --severity-threshold=high --json > snyk-results.json
```

```yaml
# Integrated into GitHub Actions
- name: Snyk Code SAST
  uses: snyk/actions/node@master
  env:
    SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
  with:
    command: code test
    args: --severity-threshold=medium
```
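In CI you often post-process the JSON output before deciding whether to fail the build. The snippet below is a generic sketch: the findings array uses simplified, hypothetical field names, so adapt it to the actual structure your `snyk code test --json` run emits.

```javascript
// Filter findings by a minimum severity so only blocking issues fail CI.
// The findings shape here is illustrative, not Snyk's real schema.
function failBuildOn(findings, threshold) {
  const rank = { low: 0, medium: 1, high: 2 };
  return findings.filter((f) => rank[f.severity] >= rank[threshold]);
}

const findings = [
  { id: "javascript/Sqli", severity: "high" },
  { id: "javascript/NoHardcodedPasswords", severity: "medium" },
  { id: "javascript/DebugLog", severity: "low" },
];

const blocking = failBuildOn(findings, "medium");
console.log(blocking.length); // 2 findings at or above the threshold
```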
Snyk Code's rules are powered by machine learning trained on millions of open-source vulnerabilities. This gives it good coverage for common vulnerability patterns but can make it harder to understand why a finding was flagged compared to explicit rule-based tools.
Custom rules are limited in Snyk Code compared to Semgrep or Checkmarx. If you need highly customized rule sets (e.g., internal framework-specific patterns), Semgrep is a better fit.
What SAST Catches vs DAST
Understanding the boundary is critical for designing your security testing strategy:
SAST catches (DAST misses):
- Hardcoded credentials and API keys
- Insecure cryptographic algorithm usage
- Integer overflow in untested code paths
- Injection vulnerabilities in code paths never reached by test suites
- Security issues in dead code or rarely-executed branches
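The hardcoded-credential case illustrates why: SAST reads every line of source, reachable or not, while DAST only sees what executes. A minimal sketch of the kind of syntactic check involved (the regex and the key below are illustrative; the key is a fake placeholder value):

```javascript
// A toy secret scanner: flag AWS-style access key IDs in source text.
// Real tools use many such patterns plus entropy checks.
const AWS_KEY_PATTERN = /AKIA[0-9A-Z]{16}/;

function findHardcodedKeys(source) {
  return source
    .split("\n")
    .map((line, i) => ({ line: i + 1, match: line.match(AWS_KEY_PATTERN) }))
    .filter((hit) => hit.match !== null)
    .map((hit) => ({ line: hit.line, key: hit.match[0] }));
}

const source = [
  'const region = "us-east-1";',
  'const accessKey = "AKIAIOSFODNN7EXAMPLE";', // dead code or not, SAST sees it
].join("\n");

console.log(findHardcodedKeys(source)); // flags line 2
```

A DAST scanner hitting the running app would never observe this constant unless it leaked into a response.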
DAST catches (SAST misses):
- Authentication and session management flaws
- Business logic vulnerabilities
- CSRF issues dependent on browser behavior
- Runtime configuration errors (e.g., TLS misconfiguration, missing headers)
- Second-order injection (stored, then rendered later)
A mature security program uses both. SAST in the CI pipeline catches issues before they're deployed; DAST against staging environments validates runtime behavior.
False Positive Rates in Practice
Across teams using these tools:
| Tool | Typical False Positive Rate | Notes |
|---|---|---|
| Semgrep (community rules) | 5-15% | Improves significantly with custom tuning |
| Semgrep (custom rules) | <5% | If you invest in rule quality |
| SonarQube | 20-35% | Hotspots vs vulnerabilities distinction helps |
| Checkmarx | 30-50% | Requires significant tuning for developer trust |
| Snyk Code | 10-20% | Better for common patterns, weaker on unusual code |
These numbers vary by codebase, language, and how much tuning has been done. Raw out-of-the-box rates are always higher than tuned deployments.
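Some illustrative arithmetic (hypothetical numbers; substitute your own scan volumes) shows why the false positive rate dominates the cost of adoption:

```javascript
// Estimate the triage cost of a scan: how many findings are noise,
// and how many person-hours it takes just to dismiss them.
function triageBurden(findings, fpRate, minutesPerTriage) {
  const falsePositives = Math.round(findings * fpRate);
  return {
    falsePositives,
    truePositives: findings - falsePositives,
    hours: (falsePositives * minutesPerTriage) / 60,
  };
}

// 400 raw findings at a 40% FP rate, 10 minutes to dismiss each:
// 160 false positives, roughly 26.7 hours of pure dismissal work.
console.log(triageBurden(400, 0.4, 10));

// Tuned down to 10% FP, the wasted effort falls to under 7 hours.
console.log(triageBurden(400, 0.1, 10));
```

This is the mechanism behind "developers stop trusting the tool": untuned deployments spend most of their review budget on noise.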
Choosing the Right Tool
- Startups and growing teams: Semgrep OSS or Snyk Code. Fast, low friction, developer-friendly.
- Enterprise regulated environments: Checkmarx or Snyk Enterprise. Deep analysis, compliance reporting, dedicated support.
- Teams that already use code quality tools: SonarQube. Adding security rules to existing quality gates reduces tool sprawl.
- Polyglot organizations: Checkmarx or Semgrep (which supports 30+ languages).
Whatever you choose, success depends on tuning: reducing false positives until developers trust the results, and writing rules that cover your specific technology stack rather than relying solely on generic rule packs.