# Getting Started
## Install

```bash
pip install pybenchx
# or
uv pip install pybenchx
```
## First benchmark (function mode)

Create `examples/hello_bench.py`:

```python
from pybench import bench

@bench(name="hello", n=1_000, repeat=10)
def hello():
    return sum(range(50))
```
Run:

```bash
pybench run examples/
```
Filter and tweak at runtime:

```bash
pybench run examples/ -k hello -P repeat=5 -P n=10_000
```
## Isolating the hot path (context mode)

```python
from pybench import Bench, BenchContext

suite = Bench("math")

@suite.bench(name="baseline", baseline=True, repeat=10)
def baseline(b: BenchContext):
    setup = list(range(100))
    b.start()
    sum(setup)  # timed
    b.end()
```
Why: context mode excludes per-iteration setup from timing, improving signal.
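For contrast, here is the same idea written in function mode; a minimal sketch (the case name and body are illustrative, not part of the suite above). With no `b.start()`/`b.end()` window, the list construction is timed on every iteration, which is exactly the noise the context-mode version excludes:

```python
from pybench import bench

# Function mode: the whole body is timed, so the setup below is counted
# in every iteration.
@bench(name="baseline-with-setup", n=1_000, repeat=10)
def baseline_with_setup():
    setup = list(range(100))  # setup cost is included in the measurement
    return sum(setup)
```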
## Discovery and naming

- Directories expand to `**/*bench.py`.
- Name cases with `name=`. Use `group=` to cluster related cases:

```python
@suite.bench(name="join-basic", group="strings")
```
## Baseline and groups

- Set `baseline=True` on one case per group; the other cases in the group are reported "vs base" (see the sketch after this list).
- Without an explicit baseline, a case whose name includes "baseline" or "base" may be used.
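As a sketch of how a grouped suite with an explicit baseline might look (the case bodies and the `WORDS` constant are illustrative; the names mirror the compare output shown below):

```python
from pybench import Bench, BenchContext

suite = Bench("strings")
WORDS = ["alpha", "beta", "gamma"] * 10

# Exactly one case in the group carries baseline=True; the others are
# reported relative to it ("vs base").
@suite.bench(name="join-baseline", group="strings", baseline=True)
def join_baseline(b: BenchContext):
    b.start()
    "-".join(WORDS)
    b.end()

@suite.bench(name="join_plus", group="strings")
def join_plus(b: BenchContext):
    b.start()
    out = ""
    for w in WORDS:
        out += w + "-"
    b.end()
```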
## Profiles you'll use

```bash
pybench run examples/ --profile thorough  # ~1s per variant, repeat=30
pybench run examples/ --profile smoke     # no calibration, repeat=3 (default)
```
## Save, export, and compare

```bash
# Save a run and a baseline
pybench run examples/ --save runA
pybench run examples/ --save-baseline main

# Export the latest run to Markdown, CSV, and an interactive chart
pybench run examples/ --export md:bench.md
pybench run examples/ --export csv:bench.csv
pybench run examples/ --export chart:bench.html

# Compare against a named baseline and enforce thresholds
pybench run examples/ --compare main --fail-on mean:7%,p99:12%

# Quick comparisons against history
pybench run examples/ --vs main
pybench run examples/ --vs last
```
## Compare output example

```text
$ pybench run examples/ --compare main --fail-on mean:5%,p99:10%
comparing against: main
strings/join-baseline:  Δ=+0.00%   p=n/a    [same]
strings/join_split:     Δ=-2.10%   p=0.12   [same]
strings/join_plus:      Δ=+98.50%  p=0.000  [worse]
❌ thresholds violated
```
- Δ: percentage change in mean time (positive = slower, i.e. a regression).
- p: p-value (Mann–Whitney U, approximate).
- `same`/`better`/`worse` requires statistical significance (α=0.05) on top of the 1% threshold.
- `--fail-on mean:%,p99:%` applies per-metric limits; `p99` uses the actual P99 delta (sketched after this list).
- Artifacts live under `.pybenchx/` (created automatically).
- P-values (Mann–Whitney U, approximate) are shown in compare output; the `p99` policy uses the actual P99 delta.
- Use `pybench list`, `pybench stats`, and `pybench clean --keep N` to inspect and prune history.
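To make the threshold semantics concrete, here is a minimal sketch of the check described above. This is assumed behavior for illustration, not pybenchx's actual implementation (which also factors in the significance test):

```python
# Illustrative sketch only, not pybenchx internals.
def violates(base_mean: float, cur_mean: float,
             base_p99: float, cur_p99: float,
             mean_limit: float = 0.05, p99_limit: float = 0.10) -> bool:
    """Return True if either per-metric regression exceeds its limit."""
    mean_delta = (cur_mean - base_mean) / base_mean  # Δ of the mean time
    p99_delta = (cur_p99 - base_p99) / base_p99      # actual P99 delta
    return mean_delta > mean_limit or p99_delta > p99_limit

# A +98.5% mean regression trips a 5% limit, so the run fails:
print(violates(1.00, 1.985, 1.00, 1.02))  # True
```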
## Next steps

- Read CLI for options (`--no-color`, `--sort`, `--budget`, `--max-n`).
- See API for parameterization (`params={...}`) and suites; a hedged preview follows this list.
- Explore Examples for common patterns.
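As a preview of parameterization, a hedged sketch (the exact `params` semantics are documented in the API page; the assumption here is that each value runs as its own variant and is passed to the function by name):

```python
from pybench import bench

# Hypothetical parameterized case; exact behavior may differ from this sketch.
@bench(name="sum-range", params={"size": [10, 100, 1_000]})
def sum_range(size: int):
    return sum(range(size))
```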