Linux Perf Cheat Sheet
Brendan Gregg's USE method, condensed into the commands you actually run when a Linux box is misbehaving.
First-minute scan
Brendan Gregg's classic 60-second checklist. Run these in order and you'll have a working hypothesis before you finish typing the last one.
- `uptime`: load averages (1, 5, 15 min); if the 1 min average is far above the 15 min one, load is climbing
- `dmesg | tail`: kernel messages; OOM kills, hardware errors, and NIC resets land here
- `vmstat 1 5`: five 1-second samples; watch `r`, `b`, `si`, `so`, `us`, `sy`, `id`, `wa`
- `mpstat -P ALL 1 3`: per-CPU breakdown; uneven load means a single hot core
- `pidstat 1 3`: per-process CPU over 3 seconds
- `iostat -xz 1 3`: per-device IO; watch `%util` and `await`
- `free -m`: memory in MB; `available` is what matters, not `free`
- `sar -n DEV 1 3`: network throughput per interface
- `sar -n TCP,ETCP 1 3`: TCP retransmits and errors
- `top` or `htop`: interactive view to confirm
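The whole scan can be scripted. A minimal sketch (the wrapper and its `run` helper are mine, not part of the canonical checklist) that runs each command in order and skips tools that aren't installed; sysstat provides `mpstat`, `pidstat`, `iostat`, and `sar`:

```shell
#!/bin/sh
# first-minute.sh: run the 60-second checklist top to bottom.
# Tools from sysstat (mpstat, pidstat, iostat, sar) may be missing,
# so each command is skipped gracefully when not installed.
run() {
  tool=${1%% *}                    # first word of the command, e.g. "vmstat"
  if command -v "$tool" >/dev/null 2>&1; then
    printf '\n=== %s ===\n' "$1"
    sh -c "$1"
  else
    printf '\n=== %s: not installed, skipped ===\n' "$tool"
  fi
}
run 'uptime'
run 'dmesg | tail'                 # may need root on locked-down kernels
run 'vmstat 1 5'
run 'mpstat -P ALL 1 3'
run 'pidstat 1 3'
run 'iostat -xz 1 3'
run 'free -m'
run 'sar -n DEV 1 3'
run 'sar -n TCP,ETCP 1 3'
```

The interactive `top`/`htop` step is left out on purpose; it doesn't belong in a batch script.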
CPU
Which core, which process, which kind of work: user, system, IO wait, steal.
- `top`: sort by CPU with `P`, by memory with `M`
- `htop`: color, scrolling, tree view; install it
- `mpstat -P ALL 1`: per-CPU; `%steal` > 0 means the hypervisor is taking your time
- `pidstat 1`: per-process per-second CPU
- `uptime`: load average = runnable + uninterruptible processes, not just CPU
- `cat /proc/loadavg`: same numbers, scriptable
- `perf top`: live profile of where kernel and user code spend CPU
- `perf record -F 99 -p <pid> -g -- sleep 30 && perf report`: flamegraph-ready stack samples
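The counters behind `mpstat` live in `/proc/stat` as cumulative jiffies, so overall busy time is just the delta between two samples. A sketch (the `cpu_pct` helper name is mine) that computes busy percent from two `cpu` lines:

```shell
# cpu_pct: given two "cpu ..." lines sampled from /proc/stat, print the
# percentage of time the CPUs were busy over the interval. Fields after
# "cpu" are cumulative jiffies; fields 5 and 6 are idle and iowait.
cpu_pct() {
  printf '%s\n%s\n' "$1" "$2" | awk '
    { total = 0
      for (i = 2; i <= NF; i++) total += $i
      idle = $5 + $6
      if (NR == 1) { t0 = total; i0 = idle }
      else printf "%.0f\n", 100 * ((total - t0) - (idle - i0)) / (total - t0)
    }'
}

# live usage (one 1-second sample):
#   s0=$(grep '^cpu ' /proc/stat); sleep 1; s1=$(grep '^cpu ' /proc/stat)
#   cpu_pct "$s0" "$s1"
```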
Memory
The `free` output confuses everyone at least once: page cache is not "used" in any meaningful sense.
- `free -h`: human-readable; `available` = what apps can actually claim
- `vmstat 1`: `si`/`so` > 0 means swapping (bad)
- `cat /proc/meminfo`: every counter; `MemAvailable`, `SwapFree`, `Dirty`, `Slab`
- `ps aux --sort=-rss | head`: top RSS consumers
- `smem -tk`: proportional set size; better than RSS for shared-memory apps
- `slabtop`: kernel slab allocations; useful when "memory is gone but no process owns it"
- `dmesg | grep -i kill`: recent OOM kills with the killed PID
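The `available` column in `free` comes from `MemAvailable` in `/proc/meminfo`. A sketch (the helper name is mine) that turns it into a percentage you can alert on:

```shell
# mem_avail_pct: read /proc/meminfo-format text on stdin and print
# MemAvailable as a percentage of MemTotal, i.e. roughly how much RAM
# applications could still claim before the kernel reclaims hard.
mem_avail_pct() {
  awk '/^MemTotal:/ { t = $2 } /^MemAvailable:/ { a = $2 }
       END { printf "%.0f\n", 100 * a / t }'
}

# live usage:
#   mem_avail_pct < /proc/meminfo
```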
Disk IO
`%util` at 100% doesn't mean "saturated" on SSDs; it means the device had at least one request in flight at every sample. Look at `await` and queue depth instead.
- `iostat -xz 1`: extended per-device stats, hiding idle devices
- `iotop`: per-process IO (needs root)
- `biotop` (bcc-tools): per-process block IO with latency
- `biolatency` (bcc-tools): block IO latency histogram
- `df -h`: filesystem usage
- `df -i`: inode usage; "disk full" while `df -h` shows free space is usually inodes
- `du -sh /var/log/*`: find what's eating disk
- `lsof | head`: open files; massive output, pipe through `grep`
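The `du -sh /var/log/*` pattern generalizes into a small helper (the name `biggest` is mine) that ranks the largest entries under any directory:

```shell
# biggest: print the N largest entries directly under a directory
# (defaults: current dir, top 5). Sizes come from du in KB and are
# shown in MB; paths containing spaces get truncated by awk.
biggest() {
  dir=${1:-.}; n=${2:-5}
  du -sk -- "$dir"/* 2>/dev/null | sort -rn | head -n "$n" |
    awk '{ printf "%10.1f MB  %s\n", $1 / 1024, $2 }'
}

# usage: biggest /var/log 10
```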
Network
Replace `netstat` with `ss`. It's faster, the flags are saner, and it ships on anything modern.
- `ss -tunap`: TCP and UDP, all states, with PIDs
- `ss -ltn`: listening TCP sockets, numeric
- `ss -s`: summary counts by state
- `ss -tn state established '( dport = :443 or sport = :443 )'`: filter by state and port
- `sar -n DEV 1`: per-interface throughput
- `sar -n TCP,ETCP 1`: retransmits and resets
- `tcpdump -i any -nn -s 0 port 443`: raw packets; add `-w file.pcap` to save a capture for Wireshark
- `mtr <host>`: traceroute + ping over time
- `ethtool -S eth0 | grep -i drop`: NIC-level drops
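`ss -s` gives the summary, but parsing `ss -tan` yourself lets you watch one state in particular. A sketch (the function name `conn_states` is mine):

```shell
# conn_states: read `ss -tan` output on stdin and count TCP sockets by
# state (ESTAB, TIME-WAIT, CLOSE-WAIT, ...), most common first.
# A growing pile of CLOSE-WAIT usually means an app is leaking sockets.
conn_states() {
  awk 'NR > 1 { c[$1]++ } END { for (s in c) print c[s], s }' | sort -rn
}

# live usage:
#   ss -tan | conn_states
```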
Per-process
When you've narrowed to one PID, these are the deep-dive tools.
- `cat /proc/<pid>/status`: VmRSS, VmSize, threads, state
- `cat /proc/<pid>/limits`: the ulimits the process is actually running under
- `ls -l /proc/<pid>/fd | wc -l`: open file descriptors
- `strace -p <pid> -f -e trace=open,read,write -o trace.out`: syscalls; expensive, use briefly
- `ltrace -p <pid>`: library calls
- `lsof -p <pid>`: files, sockets, and pipes the process holds
- `gdb -p <pid>` then `thread apply all bt`: stack traces of every thread
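The fd-count and limits checks combine naturally into one command. A sketch (the helper name `fd_pressure` is mine) showing open descriptors against the soft limit:

```shell
# fd_pressure: print a process's open file-descriptor count next to its
# soft "Max open files" limit, read from /proc/<pid>/fd and
# /proc/<pid>/limits (field 4 of that line is the soft limit).
fd_pressure() {
  pid=$1
  open=$(ls "/proc/$pid/fd" 2>/dev/null | wc -l)
  limit=$(awk '/^Max open files/ { print $4 }' "/proc/$pid/limits" 2>/dev/null)
  printf '%s open of %s allowed\n' "$open" "${limit:-?}"
}

# usage: fd_pressure 1234
```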
Deeper tools
When the basics don't answer, eBPF tools usually do.
- `execsnoop`: every new process system-wide
- `opensnoop`: every `open()` syscall
- `tcplife`: connection lifetimes with throughput
- `runqlat`: scheduler run-queue latency histogram
- `profile` (bcc-tools): CPU profiling without the perf overhead
- `perf sched record -- sleep 10` then `perf sched latency`: scheduling latencies per task
- `bpftrace -e 'tracepoint:syscalls:sys_enter_* { @[probe] = count(); }'`: syscall counts by probe, a rough heatmap
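The bpftrace one-liner needs root and an installed bpftrace, so it's worth wrapping. A sketch (the wrapper and its name `syscall_top` are mine; only the one-liner itself is from the sheet):

```shell
# syscall_top: run the bpftrace syscall-count one-liner for N seconds,
# bailing out cleanly when bpftrace or root privileges are missing.
syscall_top() {
  secs=${1:-5}
  command -v bpftrace >/dev/null 2>&1 || { echo 'bpftrace not installed' >&2; return 1; }
  [ "$(id -u)" -eq 0 ] || { echo 'needs root' >&2; return 1; }
  # SIGINT makes bpftrace print its maps before exiting
  timeout -s INT "$secs" \
    bpftrace -e 'tracepoint:syscalls:sys_enter_* { @[probe] = count(); }'
}

# usage: syscall_top 10
```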