Look for GC cycle time, %CPU in GC, and heap before/after/goal.
Sample gctrace lines (good vs bad)
# good: short STW, stable live heap
gc 73 @2.184s 4%: 0.18+1.3+0.07 ms clock, 0.8+0.9/1.6/0.0+0.3 ms cpu, 6->7->4 MB, 8 MB goal, 4 P
# bad: long pauses, large live heap, high GC share
gc 74 @2.684s 18%: 1.8+9.4+1.1 ms clock, 12+8/18/1+6 ms cpu, 220->260->180 MB, 360 MB goal, 4 P
Clock vs CPU example (read the gap)
gc 75 @3.012s 6%: 0.20+2.0+0.10 ms clock, 1.6+3.2/4.8/0.2+0.8 ms cpu, 10->12->7 MB, 14 MB goal, 8 P
If CPU >> clock, GC work is parallelized or includes heavy assists.
If clock is high but CPU modest, the pause itself is likely dominating.
Gctrace field map (table)
Segment
Meaning
Notes
gc 74
GC cycle number
increments each cycle
@2.684s
time since start
wall time
18%
GC CPU share
since last line
1.8+9.4+1.1 ms clock
STW start + concurrent mark + STW end
wall time
... ms cpu
CPU time across Ps
includes assists
220->260->180 MB
heap start -> heap end -> live heap
live heap drives cost
360 MB goal
next heap target
based on GOGC
4 P
GOMAXPROCS
active Ps
Good vs bad signals (scan)
Signal
Good case
Bad case
STW clock
sub-ms to low-ms
multi-ms or spiky
Live heap
small and stable
large or growing
GC CPU %
low single digits
double digits sustained
Reading gctrace for p95/p99
Track max STW clock and its frequency; correlate spikes with latency tails.
If p99 latency aligns with GC spikes, reduce live heap or allocation rate.
Watch GC CPU % at high load; sustained double digits often shows up in tail latency.
Use time-series parsing (log pipeline or custom script) to build p95/p99 of STW clock.
Quick gctrace parser (CSV)
bash
GODEBUG=gctrace=1 ./your-binary 2>&1 | python3 - <<'PY'import re,syspat = re.compile(r'^gc (\d+) @([0-9.]+)s ([0-9.]+)%: ([0-9.]+)\+([0-9.]+)\+([0-9.]+) ms clock, .*?, ([0-9.]+)->([0-9.]+)->([0-9.]+) MB,')print('cycle,stw_ms,mark_ms,live_mb,gc_cpu_pct')for line in sys.stdin: m = pat.match(line.strip()) if not m: continue cycle, t, gc_pct, stw1, mark, stw2, h1, h2, live = m.groups() stw = float(stw1) + float(stw2) print(f"{cycle},{stw:.2f},{mark},{live},{gc_pct}")PY
p95/p99 from gctrace (STW clock)
bash
GODEBUG=gctrace=1 ./your-binary 2>&1 | python3 - <<'PY'import re,sys,mathpat = re.compile(r'^gc (\d+) @([0-9.]+)s ([0-9.]+)%: ([0-9.]+)\+([0-9.]+)\+([0-9.]+) ms clock,')vals = []for line in sys.stdin: m = pat.match(line.strip()) if not m: continue stw1, stw2 = float(m.group(4)), float(m.group(6)) vals.append(stw1 + stw2)def pct(xs,p): if not xs: return None xs = sorted(xs) k = (len(xs)-1) * p f, c = math.floor(k), math.ceil(k) if f == c: return xs[int(k)] return xs[f] + (xs[c] - xs[f]) * (k - f)p95 = pct(vals, 0.95)p99 = pct(vals, 0.99)print(f"p95_stw_ms={p95:.2f} p99_stw_ms={p99:.2f} samples={len(vals)}")PY
Quick plots from gctrace CSV
Python (matplotlib):
bash
GODEBUG=gctrace=1 ./your-binary 2>&1 | python3 - <<'PY'import re,sysimport matplotlib.pyplot as pltpat = re.compile(r'^gc (\d+) @([0-9.]+)s ([0-9.]+)%: ([0-9.]+)\+([0-9.]+)\+([0-9.]+) ms clock, .*?, ([0-9.]+)->([0-9.]+)->([0-9.]+) MB,')t, stw = [], []for line in sys.stdin: m = pat.match(line.strip()) if not m: continue t.append(float(m.group(2))) stw.append(float(m.group(4)) + float(m.group(6)))plt.plot(t, stw)plt.xlabel('time (s)')plt.ylabel('stw (ms)')plt.title('GC STW over time')plt.show()PY
Python with p95/p99 lines:
bash
GODEBUG=gctrace=1 ./your-binary 2>&1 | python3 - <<'PY'import re,sys,mathimport matplotlib.pyplot as pltpat = re.compile(r'^gc (\d+) @([0-9.]+)s ([0-9.]+)%: ([0-9.]+)\+([0-9.]+)\+([0-9.]+) ms clock,')t, stw = [], []for line in sys.stdin: m = pat.match(line.strip()) if not m: continue t.append(float(m.group(2))) stw.append(float(m.group(4)) + float(m.group(6)))def pct(xs,p): xs = sorted(xs) k = (len(xs)-1) * p f, c = math.floor(k), math.ceil(k) if f == c: return xs[int(k)] return xs[f] + (xs[c] - xs[f]) * (k - f)p95 = pct(stw, 0.95)p99 = pct(stw, 0.99)plt.plot(t, stw, label='stw')plt.axhline(p95, color='orange', ls='--', label='p95')plt.axhline(p99, color='red', ls='--', label='p99')plt.xlabel('time (s)')plt.ylabel('stw (ms)')plt.title('GC STW over time')plt.legend()plt.show()PY
Python multi-axis (STW + live heap + GC CPU):
bash
GODEBUG=gctrace=1 ./your-binary 2>&1 | python3 - <<'PY'import re,sysimport matplotlib.pyplot as pltpat = re.compile(r'^gc (\d+) @([0-9.]+)s ([0-9.]+)%: ([0-9.]+)\+([0-9.]+)\+([0-9.]+) ms clock, .*?, ([0-9.]+)->([0-9.]+)->([0-9.]+) MB,')t, stw, live, gc = [], [], [], []for line in sys.stdin: m = pat.match(line.strip()) if not m: continue t.append(float(m.group(2))) stw.append(float(m.group(4)) + float(m.group(6))) live.append(float(m.group(9))) gc.append(float(m.group(3)))fig, ax1 = plt.subplots()ax2 = ax1.twinx()l1, = ax1.plot(t, stw, color='C0', label='stw_ms')l2, = ax2.plot(t, live, color='C1', label='live_mb')l3, = ax2.plot(t, gc, color='C2', label='gc_cpu_pct')ax1.set_xlabel('time (s)')ax1.set_ylabel('stw (ms)')ax2.set_ylabel('live MB / GC CPU %')ax1.set_title('GC STW + Live Heap + GC CPU')lines = [l1, l2, l3]ax1.legend(lines, [l.get_label() for l in lines])plt.show()PY
Gnuplot (requires CSV from parser):
bash
GODEBUG=gctrace=1 ./your-binary 2>&1 | python3 - <<'PY'import re,syspat = re.compile(r'^gc (\d+) @([0-9.]+)s ([0-9.]+)%: ([0-9.]+)\+([0-9.]+)\+([0-9.]+) ms clock, .*?, ([0-9.]+)->([0-9.]+)->([0-9.]+) MB,')print('time_s,stw_ms')for line in sys.stdin: m = pat.match(line.strip()) if not m: continue t = m.group(2) stw = float(m.group(4)) + float(m.group(6)) print(f"{t},{stw:.2f}")PY > gctrace.csvgnuplot -e "set datafile separator ','; set term png size 900,300; set output 'stw.png'; set xlabel 'time (s)'; set ylabel 'stw (ms)'; plot 'gctrace.csv' using 1:2 with lines title 'stw'"
Gnuplot with p95/p99 lines:
bash
GODEBUG=gctrace=1 ./your-binary 2>&1 | python3 - <<'PY' > gctrace.csvimport re,syspat = re.compile(r'^gc (\d+) @([0-9.]+)s ([0-9.]+)%: ([0-9.]+)\+([0-9.]+)\+([0-9.]+) ms clock, .*?, ([0-9.]+)->([0-9.]+)->([0-9.]+) MB,')print('time_s,stw_ms')for line in sys.stdin: m = pat.match(line.strip()) if not m: continue t = m.group(2) stw = float(m.group(4)) + float(m.group(6)) print(f"{t},{stw:.2f}")PYread P95 P99 <<EOF$(python3 - <<'PY'import csv,mathwith open('gctrace.csv') as f: next(f) vals=[float(r.split(',')[1]) for r in f if r.strip()]vals.sort()def pct(xs,p): k=(len(xs)-1)*p f=math.floor(k); c=math.ceil(k) if f==c: return xs[int(k)] return xs[f]+(xs[c]-xs[f])*(k-f)print(f"{pct(vals,0.95)} {pct(vals,0.99)}")PY)EOFgnuplot -e "set datafile separator ','; set term png size 900,300; set output 'stw.png'; set xlabel 'time (s)'; set ylabel 'stw (ms)'; p95=$P95; p99=$P99; plot 'gctrace.csv' using 1:2 with lines title 'stw', p95 with lines lt 1 lc rgb 'orange' title 'p95', p99 with lines lt 1 lc rgb 'red' title 'p99'"
// alloc-heavyfunc buildMany(n int) []string { out := make([]string, 0, n) for i := 0; i < n; i++ { out = append(out, fmt.Sprintf("id=%d", i } return out}// lower-allocfunc buildManyBetter(n int) []string { out := make([]string, 0, n) var b strings.Builder for i := 0; i < n; i++ { b.Reset() b.WriteString("id=") b.WriteString(strconv.Itoa(i)) out = append(out, b.String()) } return out}
Escape analysis is the fastest lever to reduce heap pressure.
Tune with data: pprof + gctrace + load tests.
Tags: golang, performance, runtime, gc
homelabird
Sharing hands-on cloud infrastructure and DevOps experience. Writing about Kubernetes, Terraform, and observability, and documenting lessons learned as a solo operator.