SIEM Setup Guide: Centralized Logging for Security Monitoring
How to set up centralized security logging — log sources, detection rules, alert tuning, and a practical comparison of Splunk, Elastic, and Microsoft Sentinel.
A SIEM (Security Information and Event Management) system is the nerve center of a security operations program. It aggregates logs from across your infrastructure, applies detection logic to identify threats, and provides the query interface investigators use during incident response.
Without a SIEM, security events are scattered across cloud consoles, application logs, identity provider dashboards, and endpoint agents. An attacker can chain together privilege escalation, lateral movement, and data exfiltration, and no single team will see the complete picture.
This guide walks through the practical steps of building a SIEM from log collection through alert tuning, with a comparison of the three leading platforms.
Defining Your Requirements
Before choosing a SIEM, define what you need to achieve:
- Retention requirements — PCI DSS mandates 12 months of audit log retention (3 months immediately available); SOC 2 expects a documented retention policy. Twelve months is a common baseline. Define your minimum.
- Log volume — Estimate your daily ingestion volume. 100 GB/day is a different cost profile than 10 TB/day.
- Query latency — Do analysts need real-time queries (15-second refresh) or is 5-minute latency acceptable?
- Compliance requirements — FedRAMP requires US-hosted services. GDPR may require EU data residency.
- Team size and expertise — Splunk requires skilled administrators. Sentinel is lower operational overhead for Microsoft-heavy shops.
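The volume estimate above can be roughed out from event rate and average event size. A back-of-the-envelope sketch (the rates and sizes below are illustrative assumptions, not benchmarks):

```python
def daily_ingest_gb(events_per_second: float, avg_event_bytes: int) -> float:
    """Estimate daily SIEM ingestion in GB from an event rate and average event size."""
    bytes_per_day = events_per_second * avg_event_bytes * 86_400  # seconds per day
    return bytes_per_day / 1_000_000_000  # decimal GB, as most SIEM pricing uses

# Illustrative numbers: 2,000 events/sec at ~800 bytes per event
volume = daily_ingest_gb(2_000, 800)
print(f"{volume:.1f} GB/day")  # 138.2 GB/day
```

Running this for each log source before signing a contract makes the per-GB pricing discussion concrete.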
Essential Log Sources
Not all logs provide equal security value. Start with these high-priority sources:
Identity and Access
- IdP authentication logs (Okta, Entra ID, Google Workspace) — Login events, MFA events, session tokens issued
- Cloud IAM audit logs (CloudTrail, GCP Audit Logs, Azure Activity Log) — API calls made with cloud credentials
- VPN/ZTNA access logs — Employee remote access sessions
Infrastructure
- Cloud VPC flow logs — Network traffic metadata
- Firewall logs — Allow/deny decisions at the perimeter
- Load balancer access logs — HTTP requests with source IPs and response codes
- DNS query logs — Domains resolved by internal clients
Application
- Application authentication logs — User logins, MFA, password resets
- Authorization failures — 403 responses, access denied events
- High-value actions — Admin console access, bulk data exports, configuration changes
Endpoint
- EDR telemetry (CrowdStrike, SentinelOne) — Process execution, file modification, network connections
- OS audit logs — /var/log/auth.log (Linux), Windows Security Event Log
- Email gateway logs — Inbound messages, spam/phishing disposition
- Exchange/Gmail audit logs — Email forwarding rules, delegate access, suspicious downloads
Log Normalization
Logs arrive in dozens of formats — JSON, CEF, syslog, W3C access logs. Before correlation rules can work, logs must be normalized to a common schema.
Common schemas:
- ECS (Elastic Common Schema) — Used by the Elastic Stack
- OCSF (Open Cybersecurity Schema Framework) — AWS-led, supported by Splunk, IBM, others
- CEF (Common Event Format) — ArcSight standard, widely supported
At minimum, normalize these fields:
- timestamp — ISO 8601 UTC
- source.ip, destination.ip
- user.name, user.id
- event.action (what happened)
- event.outcome (success/failure)
- host.name, host.ip
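As a sketch of what normalization looks like in practice, the function below maps a raw IdP-style login event into the minimum schema fields listed above. The raw field names (epoch_ms, client_ip, and so on) are illustrative, not an exact vendor format:

```python
from datetime import datetime, timezone

def normalize_auth_event(raw: dict) -> dict:
    """Map a raw login event (illustrative field names) to common-schema fields."""
    return {
        # ISO 8601 UTC timestamp from a millisecond epoch
        "timestamp": datetime.fromtimestamp(
            raw["epoch_ms"] / 1000, tz=timezone.utc
        ).isoformat(),
        "source.ip": raw.get("client_ip"),
        "destination.ip": raw.get("server_ip"),
        "user.name": raw.get("login"),
        "user.id": raw.get("actor_id"),
        "event.action": "user.session.start",
        "event.outcome": "success" if raw.get("result") == "SUCCESS" else "failure",
        "host.name": raw.get("hostname"),
    }

event = normalize_auth_event({
    "epoch_ms": 1700000000000,
    "client_ip": "203.0.113.10",
    "login": "alice@example.com",
    "actor_id": "00u1abc",
    "result": "SUCCESS",
    "hostname": "idp-gateway-1",
})
print(event["event.outcome"])  # success
```

Real pipelines do this in the collector (Logstash, Cribl, or the SIEM's own ingest processors), but the shape of the transformation is the same.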
Platform Comparison: Splunk vs. Elastic vs. Sentinel
Splunk Enterprise Security
Strengths:
- Extremely powerful SPL (Search Processing Language)
- Rich ecosystem of apps and integrations
- Mature compliance and executive reporting
- Strong SOC workflow features (Notable Events, Adaptive Response)
Weaknesses:
- High licensing cost (priced per GB/day ingested)
- Significant operational expertise required
- Self-managed version requires hardware investment
Best for: Large enterprises with dedicated security teams and significant SIEM budgets.
Sample SPL detection (impossible travel):
index=auth sourcetype=okta event_type=user.session.start
| stats earliest(_time) as first_login, latest(_time) as last_login,
values(client_ip) as ips, dc(client_ip) as ip_count,
values(geo_city) as cities
by user_id
| where ip_count > 1
| eval time_diff = last_login - first_login
| where time_diff < 3600
| where mvcount(cities) > 1
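The SPL above flags a user who logs in from multiple locations inside an hour; a more precise check computes the implied travel speed between consecutive logins. A minimal sketch of that logic (the login record fields and the 900 km/h threshold are illustrative assumptions):

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometers."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

def is_impossible_travel(login_a, login_b, max_kmh=900):
    """Flag two logins whose implied speed exceeds a commercial flight (~900 km/h).

    Each login is a dict with `time` (epoch seconds), `lat`, `lon` —
    illustrative fields, typically derived from IP geolocation.
    """
    hours = abs(login_b["time"] - login_a["time"]) / 3600
    distance = haversine_km(login_a["lat"], login_a["lon"],
                            login_b["lat"], login_b["lon"])
    if hours == 0:
        return distance > 0  # same timestamp, different place
    return distance / hours > max_kmh

# New York -> London within one hour: ~5,570 km implies ~5,570 km/h
ny = {"time": 0, "lat": 40.7128, "lon": -74.0060}
ldn = {"time": 3600, "lat": 51.5074, "lon": -0.1278}
print(is_impossible_travel(ny, ldn))  # True
```

Speed-based checks reduce false positives from adjacent cities that a simple "distinct locations" count would flag.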
Elastic Security (ELK Stack)
Strengths:
- Open source (self-hosted) or Elastic Cloud (managed)
- Excellent full-text search and log analytics
- ECS provides strong normalization out of the box
- Lower cost than Splunk for equivalent data volumes
- Kibana dashboards are highly customizable
Weaknesses:
- Detection rule library less mature than Splunk or Sentinel
- Self-hosted requires significant operational investment (index management, sizing)
- Query language (KQL/EQL) less powerful than SPL for complex correlations
Best for: Engineering-oriented teams comfortable with self-managed infrastructure; multi-cloud and cloud-neutral environments.
Sample EQL detection (lateral movement):
sequence by host.name with maxspan=5m
[process where process.name == "cmd.exe" and
process.args : ("net use*", "net view*")]
[network where destination.port == 445 and
   network.direction == "outbound"]
Microsoft Sentinel
Strengths:
- Native Azure integration — 1-click connectors for Microsoft 365, Entra ID, Azure services
- SOAR built in (Azure Logic Apps automation)
- KQL is powerful and consistent with Azure Monitor
- No infrastructure to manage
- Competitive pricing via data ingestion tiers
Weaknesses:
- KQL learning curve if team is unfamiliar
- Non-Microsoft log source connectors vary in quality
- Cost can escalate with high data volumes
Best for: Microsoft-centric organizations (Azure, M365, Entra ID); teams wanting low operational overhead.
Sample KQL detection (brute force):
SigninLogs
| where ResultType != "0" // Failed logins
| summarize FailureCount = count(), DistinctIPs = dcount(IPAddress)
by UserPrincipalName, bin(TimeGenerated, 5m)
| where FailureCount > 10
| join kind=leftouter (
SigninLogs
| where ResultType == "0"
| summarize SuccessCount = count() by UserPrincipalName
) on UserPrincipalName
| where isnull(SuccessCount) // Only failures, no successes
Building Detection Rules
Start with the MITRE ATT&CK framework to structure your detection coverage. Each ATT&CK technique maps to one or more log sources and detection patterns.
High-value rules to implement first (highest ROI):
| Technique | Detection | Log Source |
|---|---|---|
| T1078 Valid Accounts | Impossible travel | IdP logs |
| T1110 Brute Force | 10+ failures in 5 min | Auth logs |
| T1190 Public-Facing Exploit | WAF alerts + app errors | WAF, app logs |
| T1136 Create Account | New admin user created | IdP, CloudTrail |
| T1098 Account Manipulation | New IAM role/policy | CloudTrail |
| T1567 Exfil to Cloud | Large upload to S3/GCS from unusual source | VPC flow logs |
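Threshold rules like T1110 above all share a common shape: group events by entity, count within a sliding window, compare to a threshold. A minimal sketch of that pattern over a batch of login events (field names are illustrative):

```python
from collections import defaultdict

def brute_force_hits(events, threshold=10, window_s=300):
    """Return users with `threshold`+ failed logins inside any `window_s` span.

    `events` is an iterable of dicts with `user`, `time` (epoch seconds),
    and `outcome` fields — illustrative names, not a specific SIEM schema.
    """
    failures = defaultdict(list)
    for e in events:
        if e["outcome"] == "failure":
            failures[e["user"]].append(e["time"])

    flagged = set()
    for user, times in failures.items():
        times.sort()
        left = 0
        for right in range(len(times)):
            # shrink the window until it spans at most window_s seconds
            while times[right] - times[left] > window_s:
                left += 1
            if right - left + 1 >= threshold:
                flagged.add(user)
                break
    return flagged

events = [{"user": "bob", "time": t, "outcome": "failure"} for t in range(0, 110, 10)]
print(brute_force_hits(events))  # {'bob'}
```

SIEM query languages express this same logic declaratively (the `bin(TimeGenerated, 5m)` in the KQL example above), but working it out imperatively once makes the window and threshold semantics easier to tune.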
Alert Tuning Workflow
In the first week after a SIEM goes live, every rule fires constantly. Alert tuning is an ongoing process:
- Establish whitelist exceptions for known-good behavior (IT automation accounts, scheduled jobs, known scanner IPs)
- Adjust thresholds — if "10 failed logins in 5 minutes" fires too often, increase the threshold to 20 or narrow the time window
- Track false positive rate — for each rule, record how many alerts resulted in investigation vs. immediate dismissal over 30 days
- Retire or rebuild rules with less than a 10% true positive rate
- Create detection improvement tickets — when a real incident is detected manually that a rule should have caught, file a ticket to write the rule
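The false-positive tracking and rule-retirement steps above reduce to a simple ratio per rule. A sketch of that bookkeeping (the 10% cutoff comes from the workflow above; the alert records are illustrative):

```python
def true_positive_rate(alerts):
    """Fraction of alerts that led to a real investigation finding."""
    if not alerts:
        return 0.0
    return sum(1 for a in alerts if a["true_positive"]) / len(alerts)

def rules_to_retire(alerts_by_rule, cutoff=0.10):
    """Rules whose 30-day true positive rate falls below the cutoff."""
    return sorted(
        rule for rule, alerts in alerts_by_rule.items()
        if true_positive_rate(alerts) < cutoff
    )

# Illustrative 30-day alert history per rule
history = {
    "impossible_travel": [{"true_positive": i < 3} for i in range(10)],  # 30% TP
    "dns_tunneling":     [{"true_positive": i < 1} for i in range(20)],  # 5% TP
}
print(rules_to_retire(history))  # ['dns_tunneling']
```

In practice the `true_positive` flag comes from analyst disposition in the case management tool; the point is that the metric is cheap to compute once dispositions are recorded consistently.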
A mature SIEM is not defined by the number of rules it runs — it is defined by the signal-to-noise ratio and how quickly analysts can triage real threats.