
Behavior

How pybenchx runs, measures, and keeps noise in check.

Profiles define repeat counts, warmup passes, and the calibration budget per variant.

profile    repeat  warmup  calibration   best for
smoke      3       0       off           quick iteration during development
thorough   30      2       ~1 s/variant  pre-merge validation, CI checks

Use --profile smoke while exploring; switch to --profile thorough (or a custom JSON profile) when you want stable numbers.

  • --budget 50ms sets the target time per variant across repeats.
  • --max-n 1_000_000 caps the number of loop iterations.
  • --min-time 2ms enforces a floor so fast functions still run long enough.
  • --repeat/--warmup override profile settings for quick what-if experiments.
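Put together, invocations might look like the following (illustrative only: the entry-point name `pybench` and the benchmark path are assumptions, not confirmed by this page; the flags are the ones listed above):

```shell
# Quick iteration: smoke profile with a short per-variant budget
pybench benches/ --profile smoke --budget 50ms

# Stable pre-merge numbers: thorough profile with explicit caps
pybench benches/ --profile thorough --max-n 1_000_000 --min-time 2ms

# What-if experiment: override the profile's repeat/warmup counts
pybench benches/ --profile thorough --repeat 10 --warmup 1
```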

Profiles live in pybench/profiles.py; they are simple dataclasses, so shipping a new profile boils down to adding a JSON file with the same fields.
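A custom profile file might look like the sketch below. The field names are assumptions inferred from the table above; check `pybench/profiles.py` for the authoritative dataclass fields.

```json
{
  "repeat": 10,
  "warmup": 1,
  "calibration": "500ms"
}
```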

  • calibrate_n runs the benchmark in small bursts, growing n exponentially until the target budget is met. A refinement step probes ±20% to avoid overshooting.
  • Context mode calls the benchmark once to detect whether BenchContext.start()/end() is used; if not, pybenchx reverts to whole-function timing.
  • Warmups run before sampling (unless disabled with --no-warmup) to warm caches and JITs.
  • Each repeat wraps the tight loop with perf_counter_ns (or perf_counter as a fallback) for precise timings.
  • Raw samples feed percentile calculations (p50/p75/p95/p99) via linear interpolation.
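The calibration step above can be sketched as follows. This is a simplified illustration, not pybenchx's actual implementation: `_time_n` is a hypothetical helper, and the real refinement rule lives in the source.

```python
import time


def _time_n(fn, n):
    """Time one burst of n calls, in nanoseconds."""
    start = time.perf_counter_ns()
    for _ in range(n):
        fn()
    return time.perf_counter_ns() - start


def calibrate_n(fn, budget_ns=50_000_000, max_n=1_000_000):
    # Grow n exponentially until a single burst fills the budget.
    n = 1
    while n < max_n and _time_n(fn, n) < budget_ns:
        n = min(n * 2, max_n)
    # Refinement: probe roughly +/-20% around the candidate and keep
    # the smallest n that still meets the budget, to avoid overshoot.
    lo, hi = max(1, int(n * 0.8)), min(max_n, int(n * 1.2))
    for candidate in sorted({lo, n, hi}):
        if _time_n(fn, candidate) >= budget_ns:
            return candidate
    return n
```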
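The percentile step reduces to plain linear interpolation over the sorted samples. A minimal sketch (`percentile` here is a hypothetical helper, not pybenchx's API):

```python
def percentile(samples, q):
    """Percentile via linear interpolation, q in [0, 100]."""
    xs = sorted(samples)
    if not xs:
        raise ValueError("no samples")
    # Fractional rank into the sorted samples.
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    frac = pos - lo
    # Interpolate between the two neighboring order statistics.
    return xs[lo] + (xs[hi] - xs[lo]) * frac
```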
  • Clock – monotonic, high-resolution (perf_counter_ns on Python ≥3.7).
  • GC – the runner performs gc.collect() before measurements and may freeze GC during timed sections (Python ≥3.11) to reduce pauses.
  • Environment – colors only print on TTY; use --no-color in CI logs. The CLI also respects PYBENCH_DISABLE_GC when you want to opt out of GC tweaks entirely.
  • Noise hints – the table highlights outliers (p99 vs mean) so you can spot unstable cases quickly.
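The GC hygiene described above can be sketched as a context manager. This is an illustration of the idea, not pybenchx's actual code; `gc.freeze` is guarded with `hasattr` so the sketch degrades gracefully on interpreters that lack it.

```python
import contextlib
import gc


@contextlib.contextmanager
def quiet_gc():
    # Collect up front so pending garbage doesn't pause the timed section.
    gc.collect()
    was_enabled = gc.isenabled()
    if hasattr(gc, "freeze"):
        gc.freeze()  # move survivors out of the collector's reach
    gc.disable()
    try:
        yield
    finally:
        if was_enabled:
            gc.enable()
        if hasattr(gc, "unfreeze"):
            gc.unfreeze()
```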
  • Exceptions inside benchmarks bubble up with context; failing cases never poison other variants.
  • --fail-fast stops the run on the first error.
  • --fail-on mean:7%,p99:12% applies thresholds during comparisons and exits non-zero when budgets are exceeded—ideal for CI.
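The threshold grammar accepted by --fail-on can be mimicked in a few lines. A sketch only: `parse_fail_on` and `exceeds` are hypothetical names, not pybenchx functions.

```python
def parse_fail_on(spec):
    """Parse a spec like 'mean:7%,p99:12%' into {metric: fraction}."""
    out = {}
    for part in spec.split(","):
        metric, _, pct = part.partition(":")
        out[metric.strip()] = float(pct.strip().rstrip("%")) / 100.0
    return out


def exceeds(thresholds, deltas):
    """Return the metrics whose relative regression blows its budget."""
    return [m for m, limit in thresholds.items() if deltas.get(m, 0.0) > limit]
```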
  • The CLI table groups benchmarks by group and annotates baselines with a star (★).
  • ≈ same shows up when the relative mean differs by ≤1% and the statistical test agrees.
  • All reporters consume the same Run object; adding a format is as easy as implementing a new reporter class.
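The "≈ same" verdict is a two-part check: a relative-mean tolerance and agreement from the statistical test. In the sketch below the test's outcome is passed in as a boolean, since this page does not specify which test pybenchx runs; `is_same` is a hypothetical name.

```python
def is_same(base_mean, new_mean, stat_test_agrees, tol=0.01):
    """True when the relative mean delta is within tol (default 1%)
    AND the statistical test also found no difference."""
    rel = abs(new_mean - base_mean) / base_mean
    return rel <= tol and stat_test_agrees
```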