SIEM Setup Guide: Centralized Logging for Security Monitoring

How to set up centralized security logging — log sources, detection rules, alert tuning, and a practical comparison of Splunk, Elastic, and Microsoft Sentinel.

March 9, 2026 · 6 min read · ShipSafer Team

A SIEM (Security Information and Event Management) system is the nerve center of a security operations program. It aggregates logs from across your infrastructure, applies detection logic to identify threats, and provides the query interface investigators use during incident response.

Without a SIEM, security events are scattered across cloud consoles, application logs, identity provider dashboards, and endpoint agents. An attacker can chain privilege escalation, lateral movement, and data exfiltration together, and no single team would see the complete picture.

This guide walks through the practical steps of building a SIEM from log collection through alert tuning, with a comparison of the three leading platforms.

Defining Your Requirements

Before choosing a SIEM, define what you need to achieve:

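  • Retention requirements — Retention rules vary by framework. PCI DSS requires at least 12 months of audit log history, with the most recent three months immediately available for analysis. SOC 2 sets no fixed period but expects a documented retention policy that you actually follow. Define your minimum.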
  • Log volume — Estimate your daily ingestion volume. 100 GB/day is a different cost profile than 10 TB/day.
  • Query latency — Do analysts need real-time queries (15-second refresh) or is 5-minute latency acceptable?
  • Compliance requirements — FedRAMP workloads need a FedRAMP-authorized service, and agencies often require US data residency. GDPR may require EU data residency.
  • Team size and expertise — Splunk requires skilled administrators. Sentinel has lower operational overhead for Microsoft-heavy shops.
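
The retention and log volume items above combine into a simple sizing calculation. The sketch below is a rough estimate only: the function names, the flat per-GB price, and the compression ratio are illustrative assumptions, not any vendor's pricing model.

```python
def retained_storage_tb(daily_ingest_gb: float, retention_days: int,
                        compression_ratio: float = 1.0) -> float:
    """Approximate stored volume in TB (1 TB = 1000 GB) for a retention period.

    compression_ratio is an assumption: 1.0 means no compression,
    2.0 means stored data takes half the raw size.
    """
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    return daily_ingest_gb * retention_days / compression_ratio / 1000


def monthly_ingest_cost(daily_ingest_gb: float, price_per_gb: float) -> float:
    """Approximate monthly ingestion cost, assuming a flat per-GB price
    (hypothetical) and a 30-day month."""
    return daily_ingest_gb * 30 * price_per_gb


if __name__ == "__main__":
    # Compare the two volumes mentioned above: 100 GB/day vs. 10 TB/day.
    for daily_gb in (100, 10_000):
        tb = retained_storage_tb(daily_gb, retention_days=365)
        print(f"{daily_gb} GB/day -> {tb:.1f} TB retained over 365 days")
```

Real pricing usually involves commitment tiers and separate hot and archive storage rates, so treat the output as an order-of-magnitude check rather than a quote.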

Essential Log Sources

Not all logs provide equal security value. Start with these high-priority sources:

Identity and Access

  • IdP authentication logs (Okta, Entra ID, Google Workspace) — Login events, MFA events, session tokens issued
  • Cloud IAM audit logs (CloudTrail, GCP Audit Logs, Azure Activity Log) — API calls made with cloud credentials
  • VPN/ZTNA access logs — Employee remote access sessions

Infrastructure

  • Cloud VPC flow logs — Network traffic metadata
  • Firewall logs — Allow/deny decisions at the perimeter
  • Load balancer access logs — HTTP requests with source IPs and response codes
  • DNS query logs — Domains resolved by internal clients

Application

  • Application authentication logs — User logins, MFA, password resets
  • Authorization failures — 403 responses, access denied events
  • High-value actions — Admin console access, bulk data exports, configuration changes

Endpoint

  • EDR telemetry (CrowdStrike, SentinelOne) — Process execution, file modification, network connections
  • OS audit logs — /var/log/auth.log (Linux), Windows Security Event Log

Email

  • Email gateway logs — Inbound messages, spam/phishing disposition
  • Exchange/Gmail audit logs — Email forwarding rules, delegate access, suspicious downloads

Log Normalization

Logs arrive in dozens of formats — JSON, CEF, syslog, W3C access logs. Before correlation rules can work, logs must be normalized to a common schema.

Common schemas:

  • ECS (Elastic Common Schema) — Used by the Elastic Stack
  • OCSF (Open Cybersecurity Schema Framework) — Open standard launched by AWS, Splunk, and other vendors including IBM
  • CEF (Common Event Format) — ArcSight standard, widely supported

At minimum, normalize these fields:

  • timestamp — ISO 8601 UTC
  • source.ip, destination.ip
  • user.name, user.id
  • event.action (what happened)
  • event.outcome (success/failure)
  • host.name, host.ip

Platform Comparison: Splunk vs. Elastic vs. Sentinel

Splunk Enterprise Security

Strengths:

  • Extremely powerful SPL (Search Processing Language)
  • Rich ecosystem of apps and integrations
  • Mature compliance and executive reporting
  • Strong SOC workflow features (Notable Events, Adaptive Response)

Weaknesses:

  • High licensing cost (priced per GB/day ingested)
  • Significant operational expertise required
  • Self-managed version requires hardware investment

Best for: Large enterprises with dedicated security teams and significant SIEM budgets.

Sample SPL detection (impossible travel):

index=auth sourcetype=okta event_type=user.session.start
| stats earliest(_time) as first_login, latest(_time) as last_login,
        values(client_ip) as ips, dc(client_ip) as ip_count,
        values(geo_city) as cities, dc(geo_city) as city_count
  by user_id
| where ip_count > 1 AND city_count > 1
| eval time_diff = last_login - first_login
| where time_diff < 3600
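
The idea behind an impossible-travel rule is a speed check: two logins are suspicious when the distance between them could not be covered in the elapsed time. The sketch below shows that logic in Python, assuming each login already carries GeoIP coordinates. The 900 km/h speed limit, the 100 km noise floor, and the function names are illustrative assumptions, not part of any platform.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_impossible_travel(login_a, login_b,
                         max_speed_kmh: float = 900.0,
                         min_distance_km: float = 100.0) -> bool:
    """Each login is a tuple (epoch_seconds, latitude, longitude).

    Returns True when the implied travel speed exceeds max_speed_kmh.
    Short distances are ignored because GeoIP data is imprecise.
    """
    (t1, lat1, lon1), (t2, lat2, lon2) = sorted((login_a, login_b))
    distance = haversine_km(lat1, lon1, lat2, lon2)
    if distance < min_distance_km:
        return False
    hours = (t2 - t1) / 3600
    if hours == 0:
        return True  # far apart at the same instant
    return distance / hours > max_speed_kmh


if __name__ == "__main__":
    new_york = (40.7128, -74.0060)
    london = (51.5074, -0.1278)
    # Two logins 30 minutes apart from New York and London
    print(is_impossible_travel((0, *new_york), (1800, *london)))
```

VPN exits and corporate proxies will trigger this check regularly, so in practice it needs an exception list for known egress IPs.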

Elastic Security (ELK Stack)

Strengths:

  • Open source (self-hosted) or Elastic Cloud (managed)
  • Excellent full-text search and log analytics
  • ECS provides strong normalization out of the box
  • Lower cost than Splunk for equivalent data volumes
  • Kibana dashboards are highly customizable

Weaknesses:

  • Detection rule library less mature than Splunk or Sentinel
  • Self-hosted requires significant operational investment (index management, sizing)
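  • Query languages (Kibana Query Language and EQL) are less powerful than SPL for complex correlations. Note that Kibana's KQL is a different language from the Kusto-based KQL used by Sentinel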

Best for: Engineering-oriented teams comfortable with self-managed infrastructure; multi-cloud and cloud-neutral environments.

Sample EQL detection (lateral movement):

sequence by host.name with maxspan=5m
  [process where process.name : ("cmd.exe", "net.exe") and
   process.command_line : ("*net use*", "*net view*")]
  [network where destination.port == 445 and
   network.direction : ("egress", "outgoing")]

Microsoft Sentinel

Strengths:

  • Native Azure integration — 1-click connectors for Microsoft 365, Entra ID, Azure services
  • SOAR built in (Azure Logic Apps automation)
  • KQL (Kusto Query Language) is powerful and shared with Azure Monitor
  • No infrastructure to manage
  • Competitive pricing via data ingestion tiers

Weaknesses:

  • KQL learning curve if the team is unfamiliar with it
  • Non-Microsoft log source connectors vary in quality
  • Cost can escalate with high data volumes

Best for: Microsoft-centric organizations (Azure, M365, Entra ID); teams wanting low operational overhead.

Sample KQL detection (brute force):

SigninLogs
| where ResultType != "0"  // Failed logins
| summarize FailureCount = count(), DistinctIPs = dcount(IPAddress)
  by UserPrincipalName, bin(TimeGenerated, 5m)
| where FailureCount > 10
| join kind=leftouter (
    SigninLogs
    | where ResultType == "0"
    | summarize SuccessCount = count() by UserPrincipalName
  ) on UserPrincipalName
| where isempty(SuccessCount)  // Only failures, no successes

Building Detection Rules

Start with the MITRE ATT&CK framework to structure your detection coverage. Each ATT&CK technique maps to one or more log sources and detection patterns.

High-value rules to implement first (highest ROI):

Technique | Detection | Log Source
T1078 Valid Accounts | Impossible travel | IdP logs
T1110 Brute Force | 10+ failures in 5 min | Auth logs
T1190 Exploit Public-Facing Application | WAF alerts + app errors | WAF, app logs
T1136 Create Account | New admin user created | IdP, CloudTrail
T1098 Account Manipulation | New IAM role/policy | CloudTrail
T1567.002 Exfiltration to Cloud Storage | Large upload to S3/GCS from unusual source | VPC flow logs
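
To make the brute-force row concrete, here is a minimal sliding-window sketch of the "10+ failures in 5 min" pattern. It assumes events arrive sorted by time as `(epoch_seconds, user, outcome)` tuples; that tuple shape and the function name are assumptions for illustration, and a SIEM would normally express this in its own query language.

```python
from collections import defaultdict, deque


def brute_force_alerts(events, threshold: int = 10, window_seconds: int = 300):
    """Return (user, timestamp) pairs where a user reached `threshold`
    failed logins within `window_seconds`.

    `events` must be sorted by timestamp; each event is a tuple of
    (epoch_seconds, user, outcome) with outcome "success" or "failure".
    """
    recent_failures = defaultdict(deque)  # user -> failure timestamps in window
    alerts = []
    for ts, user, outcome in events:
        if outcome != "failure":
            continue
        window = recent_failures[user]
        window.append(ts)
        # Drop failures that have aged out of the window
        while window and ts - window[0] > window_seconds:
            window.popleft()
        if len(window) >= threshold:
            alerts.append((user, ts))
            window.clear()  # avoid re-alerting on every further failure
    return alerts


if __name__ == "__main__":
    # Ten failures for one account, ten seconds apart
    burst = [(t, "alice", "failure") for t in range(0, 100, 10)]
    print(brute_force_alerts(burst))
```

Clearing the window after an alert is one way to suppress duplicates; a production rule might instead apply a per-user cooldown.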

Alert Tuning Workflow

In the first weeks after a SIEM goes live, expect many rules to fire far more often than analysts can triage. Alert tuning is an ongoing process:

  1. Establish allowlist exceptions for known-good behavior (IT automation accounts, scheduled jobs, known scanner IPs)

  2. Adjust thresholds — If "10 failed logins in 5 minutes" fires too often, increase to 20 or narrow the time window

  3. Track false positive rate — For each rule, record how many alerts resulted in investigation vs. immediate dismissal over 30 days

  4. Retire or rebuild rules with less than 10% true positive rate

  5. Create detection improvement tickets — When a real incident is detected manually that a rule should have caught, file a ticket to write the rule
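
Steps 3 and 4 above amount to a per-rule true positive rate with a cutoff. The sketch below assumes alert dispositions have been exported as `(rule_name, is_true_positive)` pairs covering the 30-day review period; that export format, the function names, and the rule names in the example are hypothetical.

```python
from collections import Counter


def true_positive_rates(dispositions) -> dict:
    """Compute the true positive rate per rule.

    `dispositions` is an iterable of (rule_name, is_true_positive) pairs,
    one per alert.
    """
    totals, true_positives = Counter(), Counter()
    for rule, is_true_positive in dispositions:
        totals[rule] += 1
        if is_true_positive:
            true_positives[rule] += 1
    return {rule: true_positives[rule] / totals[rule] for rule in totals}


def rules_to_review(dispositions, min_tp_rate: float = 0.10) -> list:
    """Return rule names whose true positive rate is below min_tp_rate."""
    rates = true_positive_rates(dispositions)
    return sorted(rule for rule, rate in rates.items() if rate < min_tp_rate)


if __name__ == "__main__":
    sample = (
        [("impossible_travel", True)] + [("impossible_travel", False)] * 19
        + [("brute_force", True)] * 5 + [("brute_force", False)] * 15
    )
    print(rules_to_review(sample))
```

A rule with only a handful of alerts gives a noisy rate, so it can be worth requiring a minimum alert count before retiring anything.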

A mature SIEM is not defined by the number of rules it runs — it is defined by the signal-to-noise ratio and how quickly analysts can triage real threats.
