Introduction
Discovering that a server has been compromised is one of the most stressful moments in operations. The decisions we make in the first 30 minutes determine whether we contain the incident quickly or allow the attacker to spread laterally through the network. This guide provides a structured approach to detecting, triaging, and responding to a compromised Linux server.
Recognizing the Signs
Not every anomaly means a breach, but the following indicators warrant immediate investigation:
- Unexpected spikes in outbound network traffic
- Unknown processes consuming CPU or memory
- Modified system binaries (e.g., ls, ps, netstat behaving oddly)
- New user accounts or SSH keys we did not create
- Cron jobs we did not schedule
- Log files that have been truncated or deleted
- Files with recent modification timestamps in system directories
- Connections to known-bad IP addresses
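As an illustration of one indicator above (recent modification timestamps in system directories), here is a minimal sketch. The function name `recently_modified` and the 7-day default window are assumptions for this example, not part of any standard tool. Modification times can be forged by an attacker, so an empty result does not prove a directory is clean.

```shell
# Hypothetical helper: list regular files under a directory that were
# modified within the last N days (default 7).
# find's "-mtime -N" matches files modified less than N*24 hours ago.
recently_modified() {
    dir="$1"
    days="${2:-7}"
    find "$dir" -type f -mtime "-$days" 2>/dev/null | sort
}

# Example usage (paths are illustrative):
#   recently_modified /usr/bin 3
#   recently_modified /etc 1
```

Reviewing the output against known package upgrades helps separate legitimate changes from tampering.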
Phase 1: Initial Triage (Do Not Reboot)
The first rule of incident response is to preserve evidence. Rebooting destroys volatile data (running processes, network connections, memory contents).
1.1 Check Active Connections
# List established TCP/UDP connections with process names
# (do not pass -l here: it restricts output to listening sockets)
ss -tunp state established
# Look for connections to unusual ports or IPs
netstat -anp | grep ESTABLISHED
1.2 Inspect Running Processes
# Full process tree
ps auxf
# Look for processes running from /tmp, /dev/shm, or hidden directories
ls -la /tmp /dev/shm
find /tmp /dev/shm -type f -executable
# Check for processes launched from relative paths or temporary directories
ps aux | grep -E '\./|/tmp/|/dev/shm/'
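The process checks above can be complemented by reading the `/proc/<PID>/exe` symlinks directly, which also reveals processes whose binary has been deleted from disk (the link target then ends in ` (deleted)`). The following is a minimal sketch; the function names `is_suspicious_exe` and `scan_proc_exes` and the list of suspicious directories are assumptions for illustration. On a host with a kernel-level rootkit, `/proc` itself may be unreliable.

```shell
# Hypothetical helper: succeed if an executable path looks suspicious,
# i.e. it lives in a world-writable temp location or was deleted after launch.
is_suspicious_exe() {
    case "$1" in
        /tmp/*|/var/tmp/*|/dev/shm/*|*' (deleted)') return 0 ;;
        *) return 1 ;;
    esac
}

# Walk /proc and print "PID<TAB>executable" for every suspicious process.
# Links we cannot read (kernel threads, other users' processes when not
# running as root) are skipped.
scan_proc_exes() {
    for exe in /proc/[0-9]*/exe; do
        target=$(readlink "$exe" 2>/dev/null) || continue
        if is_suspicious_exe "$target"; then
            pid=${exe#/proc/}
            pid=${pid%/exe}
            printf '%s\t%s\n' "$pid" "$target"
        fi
    done
}

# Example usage (run as root to see all processes):
#   scan_proc_exes
```

A deleted binary is not always malicious (a package upgrade can cause it for long-running services), so each hit still needs manual review.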
1.3 Review Recent User Activity
# Who is currently logged in
w
# Recent logins
last -20
# Failed login attempts
lastb -20
# List accounts with a login shell and look for any we did not create
grep -v "nologin\|false" /etc/passwd
# Look for unauthorized SSH keys (print each file's path before its contents)
find / -name "authorized_keys" -type f -print -exec cat {} \; 2>/dev/null
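Beyond new accounts, it is worth checking for extra accounts with UID 0, since any such account has full root privileges regardless of its name. A minimal sketch follows; the function name `uid0_accounts` is an assumption for illustration, and the optional file argument exists only so the check can be run against a copy of the file.

```shell
# Hypothetical helper: print the names of all accounts whose UID is 0.
# Field 3 of a passwd entry is the numeric UID.
uid0_accounts() {
    passwd_file="${1:-/etc/passwd}"
    awk -F: '$3 == 0 { print $1 }' "$passwd_file"
}

# Example usage:
#   uid0_accounts                       # checks /etc/passwd
#   uid0_accounts /mnt/evidence/passwd  # checks a preserved copy
```

On most systems the only expected line of output is `root`; anything else warrants investigation.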
1.4 Examine Cron Jobs
# Check all user crontabs
for user in $(cut -f1 -d: /etc/passwd); do
echo "=== $user ==="
crontab -l -u "$user" 2>/dev/null
done
# Check system cron directories
ls -la /etc/cron.d/ /etc/cron.daily/ /etc/cron.hourly/
Phase 2: Evidence Preservation
Before making any changes, capture evidence. This is critical for forensic analysis and may be required for legal proceedings.
# Create an evidence directory on an external volume
EVIDENCE="/mnt/evidence/$(hostname)_$(date +%Y%m%d)"
mkdir -p "$EVIDENCE"
# Dump running processes
ps auxf > "$EVIDENCE/ps_auxf.txt"
# Dump network connections (-a captures established sockets, not only listening ones)
ss -tuanp > "$EVIDENCE/ss_tuanp.txt"
netstat -anp > "$EVIDENCE/netstat_anp.txt"
# Dump open files
lsof > "$EVIDENCE/lsof.txt"
# Copy auth logs
cp /var/log/auth.log "$EVIDENCE/"
cp /var/log/syslog "$EVIDENCE/"
# Dump currently loaded kernel modules
lsmod > "$EVIDENCE/lsmod.txt"
# Hash critical system binaries for later comparison
sha256sum /usr/bin/ps /usr/bin/ls /usr/bin/netstat /usr/bin/ss > "$EVIDENCE/binary_hashes.txt"
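Because the evidence may later be used in forensic analysis or legal proceedings, it helps to be able to show that the collected files were not altered after capture. One way to do that, sketched below, is to write a checksum manifest over the evidence directory and verify it later. The function names `seal_evidence` and `verify_evidence` and the manifest file name `MANIFEST.sha256` are assumptions for illustration; the sketch relies on GNU `find`, `sort`, `xargs`, and `sha256sum`.

```shell
# Hypothetical helper: write a SHA-256 manifest covering every file in the
# evidence directory (except the manifest itself).
seal_evidence() {
    dir="$1"
    (
        cd "$dir" || exit 1
        find . -type f ! -name MANIFEST.sha256 -print0 \
            | sort -z \
            | xargs -0 -r sha256sum > MANIFEST.sha256
    )
}

# Hypothetical helper: re-check every file against the manifest.
# Returns non-zero if any file is missing or has changed.
verify_evidence() {
    ( cd "$1" && sha256sum --quiet -c MANIFEST.sha256 )
}

# Example usage:
#   seal_evidence "$EVIDENCE"
#   verify_evidence "$EVIDENCE" && echo "evidence intact"
```

Storing a copy of the manifest somewhere other than the evidence volume makes the check harder to defeat.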
Phase 3: Containment
Once evidence is preserved, we contain the breach to prevent further damage.
3.1 Network Isolation
# Block all inbound and outbound traffic except to and from our management IP
# (replace OUR_MGMT_IP with the real address before running)
sudo iptables -F
sudo iptables -A INPUT -s OUR_MGMT_IP -j ACCEPT
sudo iptables -A OUTPUT -d OUR_MGMT_IP -j ACCEPT
sudo iptables -A INPUT -j DROP
sudo iptables -A OUTPUT -j DROP
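The `iptables` rules above only filter IPv4; IPv6 traffic is governed separately by `ip6tables`. A mistake during isolation can also lock us out of the host, so one cautious approach is to generate the full command list first and review it before applying anything. The sketch below does only that: it prints commands and changes nothing. The function name `isolation_rules` is an assumption for illustration, and dropping all IPv6 traffic assumes the management connection uses IPv4.

```shell
# Hypothetical helper: print (not execute) the isolation commands for a
# given IPv4 management address, so they can be reviewed before use.
isolation_rules() {
    mgmt_ip="$1"
    if [ -z "$mgmt_ip" ]; then
        echo "usage: isolation_rules <management IPv4 address>" >&2
        return 1
    fi
    cat <<EOF
iptables -F
iptables -A INPUT -s $mgmt_ip -j ACCEPT
iptables -A OUTPUT -d $mgmt_ip -j ACCEPT
iptables -A INPUT -j DROP
iptables -A OUTPUT -j DROP
ip6tables -F
ip6tables -P INPUT DROP
ip6tables -P OUTPUT DROP
ip6tables -P FORWARD DROP
EOF
}

# Example usage (192.0.2.10 is a documentation address, not a real host):
#   isolation_rules 192.0.2.10            # review the output
#   isolation_rules 192.0.2.10 | sudo sh  # apply once reviewed
```

Saving the existing ruleset with `iptables-save` into the evidence directory before flushing preserves any rules the attacker may have added.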
3.2 Kill Malicious Processes
# Identify the PID of the suspicious process
kill -9 <PID>
# If the process respawns, check for a parent process or cron job
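Killing a process destroys information about it, so it can be useful to freeze it and record its details first. The following is a minimal sketch under stated assumptions: the function name `snapshot_process` and the output layout are hypothetical, and it relies on the Linux `/proc` filesystem. Copying `/proc/<PID>/exe` recovers the running binary even when the file has been deleted from disk.

```shell
# Hypothetical helper: pause a process with SIGSTOP and save basic details
# about it under <outdir>/pid_<PID> before it is killed.
snapshot_process() {
    pid="$1"
    out="$2/pid_$pid"
    if [ ! -d "/proc/$pid" ]; then
        echo "no such process: $pid" >&2
        return 1
    fi
    mkdir -p "$out" || return 1
    # Freeze the process so it cannot react or clean up after itself
    kill -STOP "$pid" || return 1
    # Command line arguments are NUL-separated; make them readable
    tr '\0' ' ' < "/proc/$pid/cmdline" > "$out/cmdline.txt"
    readlink "/proc/$pid/exe" > "$out/exe_path.txt"
    readlink "/proc/$pid/cwd" > "$out/cwd.txt"
    ls -l "/proc/$pid/fd" > "$out/fds.txt" 2>/dev/null
    # Copy the binary itself for later analysis
    cp "/proc/$pid/exe" "$out/exe.bin" 2>/dev/null || true
}

# Example usage (EVIDENCE is the directory created during evidence preservation):
#   snapshot_process 4242 "$EVIDENCE" && kill -9 4242
```

Stopping the process first also prevents a watchdog inside the same process from respawning helpers while we collect data, although a separate parent process can still restart it.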
3.3 Revoke Compromised Credentials
# Lock user accounts
passwd -l suspicious_user
# Remove unauthorized SSH keys
# Rotate all passwords and API keys that were on this server
Phase 4: Rootkit Detection
# Install and run rkhunter
sudo apt install -y rkhunter
sudo rkhunter --update
sudo rkhunter --check --skip-keypress
# Install and run chkrootkit
sudo apt install -y chkrootkit
sudo chkrootkit
# Verify installed package files against the checksums shipped with their packages
# (requires the debsums package; -c lists only files that have changed)
debsums -c
Phase 5: Recovery
After containment and analysis, we have two options:
- Clean and patch the existing server (only if we are confident we have identified the full scope of the compromise).
- Rebuild from scratch using a clean image and restore data from a known-good backup (the recommended approach for production servers).
We strongly recommend the second option, a rebuild from scratch, for production systems. An attacker who gained root access may have installed persistent backdoors that are extremely difficult to detect.
Phase 6: Incident Report
Every security incident should result in a written report that includes:
- Timeline: When the compromise occurred, when it was detected, when it was contained
- Scope: Which systems and data were affected
- Root cause: How the attacker gained access (unpatched vulnerability, stolen credentials, misconfiguration)
- Evidence: Logs, process dumps, hashes
- Remediation: What was done to contain and recover
- Lessons learned: What we will change to prevent recurrence
Conclusion
Incident response is not something we want to figure out during an actual incident. We should have runbooks prepared, practice them regularly, and ensure the team knows who to contact and what steps to take. The commands in this guide form the foundation of a Linux incident response toolkit.