SIEM Setup Guide: Centralized Logging for Security Monitoring

A SIEM (Security Information and Event Management) system is the nerve center of a security operations program. It aggregates logs from across your infrastructure, applies detection logic to identify threats, and provides the query interface investigators use during incident response.

Without a SIEM, security events are scattered across cloud consoles, application logs, identity provider dashboards, and endpoint agents. An attacker can chain together a privilege escalation, a lateral movement, and a data exfiltration — and no single team would see the complete picture.

This guide walks through the practical steps of setting up a SIEM, from log collection through alert tuning, and compares three widely used platforms.

Defining Your Requirements

Before choosing a SIEM, define what you need to achieve:

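Retention requirements — Retention periods vary by framework. PCI DSS requires at least 12 months of audit log history, with the most recent 3 months immediately available for analysis. SOC 2 expects audit logs to be retained but does not prescribe a fixed period. Define your minimum.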
Log volume — Estimate your daily ingestion volume. 100 GB/day is a different cost profile than 10 TB/day.
Query latency — Do analysts need real-time queries (15-second refresh) or is 5-minute latency acceptable?
Compliance requirements — FedRAMP workloads need a FedRAMP-authorized service, which in practice usually means US-hosted. GDPR may require EU data residency.
Team size and expertise — Splunk requires skilled administrators. Sentinel is lower operational overhead for Microsoft-heavy shops.
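
To make the retention and volume items concrete, the arithmetic can be sketched in a few lines of Python. This is only a rough sizing aid: the function name and the `overhead_factor` parameter (a stand-in for indexing overhead, replication, or compression) are illustrative assumptions, not part of any vendor's pricing model.

```python
def estimate_storage_gb(daily_ingest_gb: float, retention_days: int,
                        overhead_factor: float = 1.0) -> float:
    """Return the approximate storage needed to keep logs for the full window.

    overhead_factor is a hypothetical multiplier for indexing or replication
    overhead (above 1.0) or compression (below 1.0).
    """
    if daily_ingest_gb < 0 or retention_days < 0:
        raise ValueError("inputs must be non-negative")
    return daily_ingest_gb * retention_days * overhead_factor


# The two ingestion profiles mentioned above, each retained for 12 months.
small = estimate_storage_gb(100, 365)       # 100 GB/day
large = estimate_storage_gb(10_000, 365)    # 10 TB/day, expressed in GB
print(f"100 GB/day for 365 days: {small / 1000:.1f} TB")
print(f"10 TB/day for 365 days: {large / 1000:.1f} TB")
```

Running the numbers before talking to vendors makes it easier to compare ingestion-based and storage-based pricing on equal terms.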

Essential Log Sources

Not all logs provide equal security value. Start with these high-priority sources:

Identity and Access

IdP authentication logs (Okta, Entra ID, Google Workspace) — Login events, MFA events, session tokens issued
Cloud IAM audit logs (CloudTrail, GCP Audit Logs, Azure Activity Log) — API calls made with cloud credentials
VPN/ZTNA access logs — Employee remote access sessions

Infrastructure

Cloud VPC flow logs — Network traffic metadata
Firewall logs — Allow/deny decisions at the perimeter
Load balancer access logs — HTTP requests with source IPs and response codes
DNS query logs — Domains resolved by internal clients

Application

Application authentication logs — User logins, MFA, password resets
Authorization failures — 403 responses, access denied events
High-value actions — Admin console access, bulk data exports, configuration changes

Endpoint

EDR telemetry (CrowdStrike, SentinelOne) — Process execution, file modification, network connections
OS audit logs — /var/log/auth.log or /var/log/secure (Linux, depending on distribution), Windows Security Event Log

Email

Email gateway logs — Inbound messages, spam/phishing disposition
Exchange/Gmail audit logs — Email forwarding rules, delegate access, suspicious downloads

Log Normalization

Logs arrive in dozens of formats — JSON, CEF, syslog, W3C access logs. Before correlation rules can work, logs must be normalized to a common schema.

Common schemas:

ECS (Elastic Common Schema) — Used by the Elastic Stack
OCSF (Open Cybersecurity Schema Framework) — AWS-led, supported by Splunk, IBM, others
CEF (Common Event Format) — ArcSight standard, widely supported

At minimum, normalize these fields:

timestamp — ISO 8601 UTC
source.ip, destination.ip
user.name, user.id
event.action (what happened)
event.outcome (success/failure)
host.name, host.ip
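
As an illustration of what normalization involves, here is a minimal Python sketch that maps one vendor-style login event onto the field list above. The input keys (`epoch_seconds`, `client_ip`, `login`, and so on) are invented for the example and do not correspond to any specific product's log format. In practice this mapping usually lives in the SIEM's ingest pipeline rather than in standalone code.

```python
from datetime import datetime, timezone

# Hypothetical vendor result strings mapped to normalized outcomes.
OUTCOMES = {"SUCCESS": "success", "FAILURE": "failure"}


def normalize_event(raw: dict) -> dict:
    """Map a hypothetical vendor login event onto the minimum field set."""
    ts = datetime.fromtimestamp(raw["epoch_seconds"], tz=timezone.utc)
    return {
        "timestamp": ts.strftime("%Y-%m-%dT%H:%M:%SZ"),  # ISO 8601, UTC
        "source.ip": raw.get("client_ip"),
        "destination.ip": raw.get("server_ip"),
        "user.name": raw.get("login"),
        "user.id": raw.get("uid"),
        "event.action": raw.get("event_type"),
        # Anything that is not a known result string becomes "unknown".
        "event.outcome": OUTCOMES.get(raw.get("result"), "unknown"),
        "host.name": raw.get("hostname"),
        "host.ip": raw.get("host_ip"),
    }


sample = {
    "epoch_seconds": 1700000000,
    "client_ip": "203.0.113.7",
    "login": "alice",
    "uid": "u-1001",
    "event_type": "user.login",
    "result": "SUCCESS",
}
print(normalize_event(sample))
```

Missing inputs become `None` rather than raising, so a correlation rule can distinguish "field absent" from "field empty".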

Platform Comparison: Splunk vs. Elastic vs. Sentinel

Splunk Enterprise Security

Strengths:

Extremely powerful SPL (Search Processing Language)
Rich ecosystem of apps and integrations
Mature compliance and executive reporting
Strong SOC workflow features (Notable Events, Adaptive Response)

Weaknesses:

High licensing cost (priced per GB/day ingested)
Significant operational expertise required
Self-managed version requires hardware investment

Best for: Large enterprises with dedicated security teams and significant SIEM budgets.

Sample SPL detection (impossible travel):

index=auth sourcetype=okta event_type=user.session.start
| stats earliest(_time) as first_login, latest(_time) as last_login,
        values(client_ip) as ips, dc(client_ip) as ip_count,
        values(geo_city) as cities, dc(geo_city) as city_count
  by user_id
| where ip_count > 1 AND city_count > 1
| eval time_diff = last_login - first_login
| where time_diff < 3600
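
Matching on city names is only a rough approximation of impossible travel. A more general formulation compares the speed implied by two logins against a plausible maximum. The Python sketch below illustrates that idea under stated assumptions: each login already carries a geolocated latitude and longitude, and the 900 km/h ceiling (roughly airliner cruising speed) is an arbitrary choice to tune. It is not a description of how any SIEM implements this rule.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0
MAX_PLAUSIBLE_KMH = 900.0  # assumption: roughly airliner cruising speed


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def is_impossible_travel(login_a, login_b, max_kmh=MAX_PLAUSIBLE_KMH):
    """Each login is a tuple of (epoch_seconds, latitude, longitude)."""
    (t1, lat1, lon1), (t2, lat2, lon2) = sorted([login_a, login_b])
    distance = haversine_km(lat1, lon1, lat2, lon2)
    hours = (t2 - t1) / 3600
    if hours == 0:
        # Simultaneous logins are only suspicious if the locations differ.
        return distance > 0
    return distance / hours > max_kmh


new_york = (40.7128, -74.0060)
london = (51.5074, -0.1278)
# Logins 30 minutes apart from New York and London.
print(is_impossible_travel((0, *new_york), (1800, *london)))
# The same pair 12 hours apart is within reach of a commercial flight.
print(is_impossible_travel((0, *new_york), (12 * 3600, *london)))
```

VPN egress points and mobile carrier geolocation tend to produce false positives for this kind of rule, so it usually needs exceptions for known corporate egress IPs.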

Elastic Security (ELK Stack)

Strengths:

Open source (self-hosted) or Elastic Cloud (managed)
Excellent full-text search and log analytics
ECS provides strong normalization out of the box
Lower cost than Splunk for equivalent data volumes
Kibana dashboards are highly customizable

Weaknesses:

Detection rule library less mature than Splunk or Sentinel
Self-hosted requires significant operational investment (index management, sizing)
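Query languages (Kibana Query Language, EQL) less powerful than SPL for complex correlations; note that Kibana's KQL is unrelated to the Kusto Query Language used by Sentinel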

Best for: Engineering-oriented teams comfortable with self-managed infrastructure; multi-cloud and cloud-neutral environments.

Sample EQL detection (lateral movement):

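sequence by host.name with maxspan=5m
  // process.args holds individual tokens, so match the full command line
  [process where process.name : "cmd.exe" and
   process.command_line : ("*net use*", "*net view*")]
  // direction values differ between ECS versions and endpoint agents
  [network where destination.port == 445 and
   network.direction : ("egress", "outgoing")]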
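
For readers unfamiliar with sequence semantics, the following Python sketch approximates what the rule above expresses: on the same host, a share-enumeration command followed by an outbound SMB connection within five minutes. The event shape (`host`, `ts`, `kind`) and the two `kind` labels are invented for illustration. The real EQL engine handles ordering, state, and expiry differently.

```python
def find_sequences(events, maxspan_seconds=300):
    """Return (host, first_ts, second_ts) for each two-step match.

    events: dicts with hypothetical keys "host", "ts" (epoch seconds) and
    "kind", where kind is "net_command" (step one) or "smb_outbound" (step two).
    """
    pending = {}  # host -> timestamp of the most recent step-one event
    hits = []
    for ev in sorted(events, key=lambda e: e["ts"]):
        host = ev["host"]
        if ev["kind"] == "net_command":
            pending[host] = ev["ts"]
        elif ev["kind"] == "smb_outbound" and host in pending:
            if ev["ts"] - pending[host] <= maxspan_seconds:
                hits.append((host, pending[host], ev["ts"]))
            # Consume the pending step so one command yields at most one hit.
            del pending[host]
    return hits


events = [
    {"host": "ws-01", "ts": 0, "kind": "net_command"},
    {"host": "ws-01", "ts": 120, "kind": "smb_outbound"},
    {"host": "ws-02", "ts": 0, "kind": "net_command"},
    {"host": "ws-02", "ts": 400, "kind": "smb_outbound"},  # outside the window
    {"host": "ws-03", "ts": 50, "kind": "smb_outbound"},   # no preceding command
]
print(find_sequences(events))
```

Only `ws-01` matches here, because the second step on `ws-02` arrives after the five-minute window and `ws-03` never ran the first step.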

Microsoft Sentinel

Strengths:

Native Azure integration — 1-click connectors for Microsoft 365, Entra ID, Azure services
SOAR built in (Azure Logic Apps automation)
KQL (Kusto Query Language) is powerful and consistent with Azure Monitor
No infrastructure to manage
Competitive pricing via data ingestion tiers

Weaknesses:

KQL learning curve if team is unfamiliar
Non-Microsoft log source connectors vary in quality
Cost can escalate with high data volumes

Best for: Microsoft-centric organizations (Azure, M365, Entra ID); teams wanting low operational overhead.

Sample KQL detection (brute force):

SigninLogs
| where ResultType != "0"  // Failed sign-ins (ResultType "0" means success)
| summarize FailureCount = count(), DistinctIPs = dcount(IPAddress)
  by UserPrincipalName, bin(TimeGenerated, 5m)
| where FailureCount >= 10  // 10 or more failures in one 5-minute bin
| join kind=leftouter (
    SigninLogs
    | where ResultType == "0"
    | summarize SuccessCount = count() by UserPrincipalName
  ) on UserPrincipalName
| where isnull(SuccessCount)  // Only failures, no successes
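
The same logic can be prototyped outside the SIEM, for example to test a threshold against exported sign-in data before deploying the rule. The Python sketch below mirrors the query's structure (fixed 5-minute bins, a failure threshold, and exclusion of users with any successful sign-in). The event keys `user`, `ts`, and `success` are assumptions for the example, not the SigninLogs schema.

```python
from collections import Counter


def brute_force_candidates(signins, threshold=10, bin_seconds=300):
    """Return sorted (user, bin_start, failure_count) tuples.

    signins: dicts with hypothetical keys "user", "ts" (epoch seconds)
    and "success" (bool).
    """
    failures = Counter()
    users_with_success = set()
    for ev in signins:
        if ev["success"]:
            users_with_success.add(ev["user"])
        else:
            # Floor the timestamp to the start of its fixed-size bin.
            bucket = ev["ts"] - ev["ts"] % bin_seconds
            failures[(ev["user"], bucket)] += 1
    return sorted(
        (user, bucket, count)
        for (user, bucket), count in failures.items()
        if count >= threshold and user not in users_with_success
    )


signins = (
    [{"user": "alice", "ts": t, "success": False} for t in range(12)]
    + [{"user": "bob", "ts": t, "success": False} for t in range(12)]
    + [{"user": "bob", "ts": 20, "success": True}]
    + [{"user": "carol", "ts": t, "success": False} for t in range(3)]
)
print(brute_force_candidates(signins))
```

Fixed bins can split a burst across two windows and miss it. A sliding window catches those cases at the cost of more computation.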

Building Detection Rules

Start with the MITRE ATT&CK framework to structure your detection coverage. Each ATT&CK technique maps to one or more log sources and detection patterns.

High-value rules to implement first (highest ROI):

Technique	Detection	Log Source
T1078 Valid Accounts	Impossible travel	IdP logs
T1110 Brute Force	10+ failures in 5 min	Auth logs
T1190 Exploit Public-Facing Application	WAF alerts + app errors	WAF, app logs
T1136 Create Account	New admin user created	IdP, CloudTrail
T1098 Account Manipulation	New IAM role/policy	CloudTrail
T1567 Exfiltration Over Web Service	Large upload to S3/GCS from unusual source	VPC flow logs
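
One way to use this table is as a checklist: given the log sources already onboarded, which rules can be enabled today and which are blocked on missing data? The Python sketch below encodes the table and answers that question. The short source labels are invented for the example, and it assumes that every source listed for a technique is required, which is a conservative reading of the table.

```python
# Technique ID -> log sources listed in the table (labels are illustrative).
RULE_SOURCES = {
    "T1078": {"idp"},
    "T1110": {"auth"},
    "T1190": {"waf", "app"},
    "T1136": {"idp", "cloudtrail"},
    "T1098": {"cloudtrail"},
    "T1567": {"vpc_flow"},
}


def coverage(onboarded):
    """Split techniques into ready ones and ones blocked on missing sources."""
    ready, blocked = [], {}
    for technique, needed in RULE_SOURCES.items():
        missing = needed - onboarded
        if missing:
            blocked[technique] = sorted(missing)
        else:
            ready.append(technique)
    return ready, blocked


ready, blocked = coverage({"idp", "auth", "cloudtrail"})
print("Ready:", ready)
print("Blocked:", blocked)
```

The blocked list doubles as a prioritized onboarding backlog for log sources.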

Alert Tuning Workflow

In the first week after a SIEM goes live, expect many rules to fire far more often than is useful. Alert tuning is an ongoing process, not a one-time task:

Establish allowlist exceptions for known-good behavior (IT automation accounts, scheduled jobs, known scanner IPs)
Adjust thresholds — If "10 failed logins in 5 minutes" fires too often, increase to 20 or narrow the time window
Track false positive rate — For each rule, record how many alerts resulted in investigation vs. immediate dismissal over 30 days
Retire or rebuild rules with less than 10% true positive rate
Create detection improvement tickets — When a real incident is detected manually that a rule should have caught, file a ticket to write the rule
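
The tracking and retirement steps above reduce to simple bookkeeping. As a minimal sketch, assuming each rule's alerts over the 30-day window have already been tallied as true positives versus total alerts, the following Python flags rules that fall below the 10% true positive rate. The rule names and counts are invented for illustration.

```python
def rules_to_review(outcomes, min_true_positive_rate=0.10):
    """Return (rule, rate) pairs below the threshold, worst first.

    outcomes maps rule name -> (true_positives, total_alerts) for the
    review window.
    """
    flagged = []
    for rule, (true_positives, total) in outcomes.items():
        if total == 0:
            continue  # a rule that never fired has no rate to judge
        rate = true_positives / total
        if rate < min_true_positive_rate:
            flagged.append((rule, round(rate, 3)))
    return sorted(flagged, key=lambda item: item[1])


# Hypothetical 30-day tallies.
outcomes = {
    "brute_force": (4, 200),
    "impossible_travel": (6, 40),
    "new_admin_user": (0, 0),
}
print(rules_to_review(outcomes))
```

Rules that never fired are skipped here rather than flagged. They deserve a separate check, since silence can mean either a quiet environment or a broken log source.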

A mature SIEM is not defined by the number of rules it runs — it is defined by the signal-to-noise ratio and how quickly analysts can triage real threats.
