Detecting breaches, responding to them, and recovering. This is where "accounting" from Chapter 8 becomes a real defence instead of just log files gathering dust.
Logs = records of events ("user X logged in", "file Y accessed"). Monitoring = actively watching logs and alerts for suspicious activity. SIEM (Security Information and Event Management) = software that centralises and correlates logs from many sources. Incident Response (IR) = the structured process of dealing with a breach: Prepare → Identify → Contain → Eradicate → Recover → Lessons Learned. Almost every major breach was "in the logs"; the failure was not watching them.
15.1 Why Monitoring Matters
You can't prevent every attack. Even a perfectly designed system has zero-days, insider threats, and human mistakes. So the second line of defence is detecting attacks quickly, before they become catastrophes. This is called the detect-and-respond paradigm, and for mature security programs it's as important as prevention.
A useful stat: industry surveys consistently find that attackers dwell inside networks for weeks or months before discovery. That's a long window where the attacker is exploring, stealing data, and setting up persistence. Good monitoring shortens that window. In well-run organisations it drops from months to hours.
BRINGING BACK THE FRAMEWORK: Remember the third A of AAA from Chapter 8? Accounting: "what did you do?" Monitoring is how accounting becomes useful. Logs without review are just storage costs. Monitoring turns logs into detection. This is why Chapter 8 warned: "Logs are only useful if someone is looking at them." (In exam terms: monitoring operationalises the Accounting pillar.)
15.2 What Gets Logged
Modern networks produce mountains of log data. Typical sources:
Authentication systems: successful and failed logins, account lockouts
Firewalls: allowed and denied connections in and out of the network
Endpoints (via EDR agents): process launches and file changes on laptops and servers
DNS servers: which domains internal hosts look up
File and application servers: who accessed what, and when
Cloud services: API calls and configuration changes
Email gateways: phishing, business email compromise, data exfiltration via email
No single log tells the whole story. An attacker's footprint spans multiple systems: they might authenticate to one server, pivot through a firewall, access files on a file server, and exfiltrate data out through an email gateway. Catching this requires correlating logs across sources.
15.3 SIEM: The Log Aggregator and Correlator
SIEM (Security Information and Event Management) is the software category that collects logs from every source and correlates them in near-real-time. Examples: Splunk, Microsoft Sentinel, Elastic SIEM, LogRhythm, IBM QRadar.
What a SIEM does:
Collect logs from many sources into one searchable store
Normalise them so that different log formats can be queried together
Alert humans when a rule matches (or an anomaly-detection model flags something)
Provide dashboards for ongoing awareness of what's happening
Store for long periods to support later investigations and compliance
Without SIEM, analysts have to log into every system separately and manually spot patterns. With SIEM, patterns surface automatically.
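The "normalise" step above can be sketched in a few lines. This is a minimal illustration, not any real SIEM's parser: the sshd line format is genuine syslog-style output, but the firewall record shape and the common event schema (source, event, user, src_ip) are assumptions made up for this example.

```python
import re

def normalise_ssh_log(line):
    # Parse a syslog-style sshd line, e.g.:
    # "Jan 12 03:14:07 host1 sshd[812]: Failed password for root from 203.0.113.9 port 52144 ssh2"
    m = re.search(r"(Failed|Accepted) password for (\S+) from (\S+)", line)
    if not m:
        return None  # not an auth event we recognise
    outcome, user, src_ip = m.groups()
    return {
        "source": "sshd",
        "event": "auth_failure" if outcome == "Failed" else "auth_success",
        "user": user,
        "src_ip": src_ip,
    }

def normalise_firewall_log(record):
    # Hypothetical firewall record, already parsed into a dict
    return {
        "source": "firewall",
        "event": "deny" if record["action"] == "DENY" else "allow",
        "user": None,  # firewalls usually don't know the user
        "src_ip": record["src"],
    }
```

Once both feeds are in the same shape, one query ("all events from src_ip 203.0.113.9") spans every source, which is exactly what makes correlation possible.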
Example correlation rules
The power of SIEM is in combining weak signals into strong ones. Each individual event might be normal; the combination is suspicious.
Rule → What it catches
10+ failed logins from one IP in 5 minutes, then a success → brute-force or credential-stuffing success
Successful login from a new country within 1 hour of one from the home country → "impossible travel" (stolen credentials)
User logs in, then admin tools run shortly after → possible privilege escalation
Large outbound data transfer to an unusual destination → data exfiltration
Internal host making DNS queries to a known C2 domain → malware phoning home
Unusual file access pattern (one user touching hundreds of files in minutes) → ransomware in progress, or insider data theft
SOAR: the automated response layer
SOAR (Security Orchestration, Automation, and Response) is a newer category that sits on top of SIEM. When an alert fires, SOAR can automatically do things: isolate the device from the network, disable the user account, trigger a password reset, create a ticket. This matters because during a ransomware attack, minutes count; waiting for a human to click "isolate device" may be too slow.
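A SOAR playbook is, at heart, a scripted sequence of those actions. Here is a minimal sketch: the connector functions (isolate_host, disable_account, open_ticket) and the returned ticket ID are hypothetical stand-ins for whatever EDR, identity, and ticketing APIs an organisation actually uses.

```python
def isolate_host(host):
    # Hypothetical EDR API call: cut the device off the network
    print(f"isolating {host}")

def disable_account(user):
    # Hypothetical identity-provider call: lock the account
    print(f"disabling {user}")

def open_ticket(summary):
    # Hypothetical ticketing call: hand off to the human responders
    print(f"ticket: {summary}")
    return "INC-0001"

def ransomware_playbook(alert):
    """Automated first response: contain first, then notify humans."""
    isolate_host(alert["host"])
    disable_account(alert["user"])
    return open_ticket(f"Ransomware behaviour on {alert['host']} ({alert['user']})")
```

The design point is the ordering: containment actions run in seconds, before any human is awake, and the ticket brings people into the loop for everything that needs judgment.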
15.4 IDS vs IPS: Watching the Network Itself
Logs come from systems. Another layer watches the network traffic itself, looking for attack patterns in the packets. An IDS (Intrusion Detection System) sits off to the side, receives a copy of the traffic, and alerts when it sees suspicious patterns; an IPS (Intrusion Prevention System) sits in-line and actively blocks matching traffic.
Modern firewalls often include IPS capability and are sometimes called "Next-Gen Firewalls" (NGFW). The firewall already inspects every packet; adding pattern matching is a natural extension.
TRAP: "An IPS will stop everything." No. IPS is good at blocking known bad patterns, i.e. signatures of specific exploits. It's much weaker against novel attacks, encrypted traffic it can't inspect, and attacks using legitimate tools (so-called "living off the land"). IPS is part of defence-in-depth, not a silver bullet.
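Signature matching itself is conceptually simple, which is why it is both fast and easy to evade. A toy illustration, with made-up signatures (real engines work on reassembled streams with thousands of tuned rules):

```python
# Invented example signatures: byte patterns -> human-readable rule names
SIGNATURES = {
    b"/etc/passwd": "path traversal attempt",
    b"' OR '1'='1": "SQL injection probe",
    b"\x90\x90\x90\x90": "NOP sled (possible shellcode)",
}

def inspect(payload: bytes):
    """Return (verdict, reason). An IDS would merely alert on 'drop';
    an in-line IPS would actually discard the packet."""
    for pattern, name in SIGNATURES.items():
        if pattern in payload:
            return ("drop", name)
    return ("pass", None)
```

Notice what this cannot do: if the attack is novel (no signature yet), encrypted, or carried out with legitimate admin tools, the payload matches nothing and sails through, which is exactly the limitation the TRAP box describes.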
15.5 The Incident Response Cycle
When a breach is detected (or suspected), organisations follow a structured incident response (IR) process. A widely used framework is the six-phase cycle popularised by SANS (NIST SP 800-61 groups the same activities into four phases).
A cycle, not a line. Every incident's lessons feed back into preparation for the next one.
1. Preparation
The work you do before any incident. This includes:
Writing an incident response plan: who does what, in what order
Identifying an IR team (may be internal or a retained external firm)
Setting up tools: SIEM, EDR, backup systems, communications channels
Running tabletop exercises, walking through hypothetical incidents
Ensuring logs are collected and retained long enough to support investigation
Organisations without an IR plan panic during breaches. Organisations with a tested plan follow it. The difference in damage can be enormous.
2. Identification
Detecting that an incident is happening or has happened. Sources of detection include:
SIEM alerts
EDR flags on endpoints
User reports ("my computer is acting weird")
Third-party notification (a partner or law enforcement calls)
Public disclosure (your data is on a leak site)
The goal is to confirm the incident is real (not a false positive), classify its severity, and start the response clock. Many "potential incidents" turn out to be benign โ part of this phase is triage.
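The triage described above, confirming the incident is real and classifying its severity, is often reduced to a simple scoring rubric. A sketch with invented weights (real organisations tune these to their own assets and risk appetite):

```python
def triage(alert):
    """Map an alert's attributes to a severity so the response
    clock starts at the right urgency. Weights are illustrative."""
    score = 0
    if alert.get("confirmed"):       # verified real, not a false positive
        score += 2
    if alert.get("asset_critical"):  # e.g. domain controller, payment system
        score += 2
    if alert.get("data_at_risk"):    # personal or regulated data involved
        score += 1
    return "high" if score >= 4 else "medium" if score >= 2 else "low"
```

The point is not the particular numbers but that triage is decided by stated criteria, not by whoever happens to be on call feeling worried.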
3. Containment
Stopping the incident from getting worse. Two phases are common:
Short-term containment: isolate the affected device immediately (unplug from network, disable the account, block the IP). Buy time.
Long-term containment: apply patches, rotate credentials, tighten firewall rules; bring the environment into a state where investigation can continue safely.
CONTAINMENT TRADE-OFF: Containing too aggressively can alert the attacker (they go quiet and hide; you lose visibility). Too gently and they keep causing damage. In advanced cases, teams may watch the attacker briefly to understand the full scope before containing, but this requires serious skill. For most organisations, immediate containment is the right call.
4. Eradication
Removing the threat entirely. Includes:
Deleting malware, rootkits, backdoors
Removing attacker-created accounts
Closing the entry vector (patch the exploited vulnerability, change compromised credentials)
In severe cases: wiping and rebuilding affected systems
This phase requires confidence that you've found everything. Advanced attackers plant multiple backdoors specifically so eradication misses one and they can return. This is why post-incident investigation is so important โ you're looking for what else you missed.
5. Recovery
Bringing systems and business operations back to normal. Tasks:
Restore from clean backups (this is where backup quality is tested!)
Validate that systems are clean before reconnecting to production
Monitor closely for signs the attacker returns
Communicate status to stakeholders, customers, regulators
Recovery often takes far longer than containment or eradication. Ransomware victims frequently spend weeks or months rebuilding.
6. Lessons Learned
The most-skipped phase and one of the most valuable. Within a few weeks of the incident, run a post-incident review:
What happened? Timeline.
What detections worked? What didn't?
What controls should be added or strengthened?
Should the IR plan be updated?
What training or awareness would have helped?
The goal is not to blame, but to improve. Then feed those improvements back into Preparation, closing the cycle.
15.6 Australian Legal Dimension: The NDB Scheme
In Australia, the Notifiable Data Breaches (NDB) scheme under the Privacy Act 1988 requires certain organisations to notify the OAIC (Office of the Australian Information Commissioner) and affected individuals when an "eligible data breach" occurs, generally where there's a likely risk of serious harm to individuals.
This means incident response in Australia is not just a technical exercise. It includes:
Assessing within 30 days whether the breach is "notifiable"
If so, notifying the OAIC and the affected individuals
Providing specific information: what data was involved, likely consequences, what individuals should do
Facing potential penalties for non-notification or late notification
This legal obligation is the reason Optus and Medibank notified customers so quickly in 2022, and why late notification in other breaches has drawn public anger and regulatory action. Chapter 17 covers the legal framework in more depth. (cross-reference: Ch 10 Optus/Medibank cases; Ch 17 Privacy Act + NDB deep dive.)
EXAM-QUALITY PHRASING: "Under the Notifiable Data Breaches scheme (Privacy Act 1988), organisations must notify the OAIC and affected individuals of eligible data breaches, meaning those likely to result in serious harm. Incident response must therefore include a legal assessment pathway alongside the technical response." That's a crisp, Australian-specific sentence worth marks.
15.7 What a Mature Monitoring Setup Looks Like
An exam-grade answer for "describe an effective monitoring and response capability":
Centralised logging from authentication, firewalls, endpoints, DNS, cloud, and applications, with retention long enough to support post-incident investigation (typically 12 months minimum)
SIEM with tuned correlation rules: not default rules, but rules tailored to this organisation's threat model
EDR deployed on all endpoints, so attacks on individual laptops are detected
24/7 monitoring: either an in-house SOC (Security Operations Centre) or an outsourced MSSP (Managed Security Service Provider). Attackers don't only work business hours.
Written IR plan with named roles, tested at least annually through tabletop exercises
Playbooks for common scenarios (ransomware, BEC, lost device, insider incident) so responders aren't figuring it out under pressure
Legal pathway mapped: who decides if a breach is "notifiable"? Who contacts the OAIC? Who coordinates with law enforcement?
Relationships established in advance: external IR firm on retainer, law firm briefed, communications/PR team in the loop
EXAM PATTERN โ "How does monitoring improve security?":
"Monitoring operationalises the Accounting pillar of AAA โ turning logs from passive records into active detections. SIEM correlates events across sources to spot multi-step attacks that individual logs would miss. When combined with a tested incident response plan, it shortens the attacker's dwell time from months to hours, limiting damage even for attacks that prevention missed. This is why mature security follows a 'prevent + detect + respond' model โ assuming some breaches will succeed and designing for fast detection and response."
Four distinct points, each grounded: a high-scoring answer.
15.8 Quiz Time
A school collects firewall and authentication logs but no one looks at them until after an incident. What core principle is being violated and what's the fix?
The violation is treating Accounting (AAA's third pillar) as passive storage. Logs without monitoring can't detect anything in real-time. The fix is to (a) centralise logs in a SIEM, (b) set up alerts for suspicious patterns (failed logins + success, impossible travel, large outbound transfers), and (c) have a person or service responsible for reviewing/responding to those alerts 24/7 (or at least during business hours with alerting out-of-hours). Many major breaches were "in the logs"; the gap was nobody watching. (In exam terms: monitoring is the active use of logs; without it, accounting provides only forensics, not defence.)
What's the difference between an IDS and an IPS, and when would you choose each?
IDS (Intrusion Detection System) sits off to the side, receives a copy of traffic, and alerts when it sees suspicious patterns. It can't block. Used in monitoring-heavy environments where false positives must not disrupt users. IPS (Intrusion Prevention System) sits in-line and actively blocks traffic matching attack signatures. Used where automated blocking is acceptable โ typically modern firewalls (NGFWs) include IPS functionality.
Choose IDS when you want visibility without disrupting traffic (e.g., highly-tuned production); choose IPS when you want active defence and accept occasional false-positive blocks (e.g., most perimeter firewalls). In practice, mature organisations use both.
A ransomware attack hits at 2am on a Saturday. Walk through the first four phases of IR that should happen.
1. Identify: SIEM/EDR flags ransomware behaviour (mass file modifications, known ransomware signatures). The on-call analyst receives the alert and confirms the incident is real. The IR plan is activated and the incident commander notified.
2. Contain (short-term): Isolate affected hosts from the network immediately (EDR action or network ACL). Disable involved accounts. Block the C2 IPs/domains at the firewall. The goal is to stop lateral spread within minutes.
3. Contain (long-term): Identify the attack's entry vector and close it. Rotate any compromised credentials. Validate that segmentation held.
4. Eradicate: Remove the ransomware and any persistence mechanisms. Rebuild compromised endpoints from clean images rather than "cleaning" them. Verify all backdoors are gone.
(Then recovery and lessons learned.) Meanwhile, legal/compliance is already running a parallel track to assess NDB notification requirements. (In exam terms: a practised response follows a defined playbook, not improvisation.)
Why is the "Lessons Learned" phase often skipped, and why is that a mistake?
It's skipped because by the time the incident is contained and recovered, everyone is exhausted and eager to move on. It's also politically uncomfortable: the review usually identifies things that could have gone better, which can feel like blame. But skipping it means the organisation learns nothing from the experience and remains vulnerable to similar attacks. Mature organisations treat post-incident review as non-optional, frame it as blameless (focus on systems, not people), and feed findings back into Preparation for the next cycle. Every serious incident should produce concrete action items; otherwise the organisation pays the cost of the breach without reaping the learning benefit.