Timing Benchmarks

pyCSRML fingerprinting speed is measured on five benchmark sets extracted from the CLinventory and stratified by molecule size (heavy-atom count). Each set contains 500 molecules. The reported time for a set is the median of 5 repetitions of fingerprint_batch(), expressed in milliseconds per molecule.
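The median-of-repetitions measurement described above can be sketched as follows. This is a minimal illustration, not the project's benchmark script: `median_ms_per_mol` and `dummy_fingerprint` are hypothetical names, and the real `fingerprint_batch()` signature is not shown in this document, so the sketch accepts any callable that takes a batch of molecules.

```python
import statistics
import time
from typing import Callable, Sequence


def median_ms_per_mol(
    fingerprint_fn: Callable[[Sequence[str]], object],
    molecules: Sequence[str],
    reps: int = 5,
) -> float:
    """Return the median wall-clock time per molecule, in milliseconds."""
    per_mol_ms = []
    for _ in range(reps):
        start = time.perf_counter()
        fingerprint_fn(molecules)  # one full batch call per repetition
        elapsed_s = time.perf_counter() - start
        per_mol_ms.append(1000.0 * elapsed_s / len(molecules))
    return statistics.median(per_mol_ms)


# Hypothetical stand-in workload so the sketch runs without pyCSRML installed.
def dummy_fingerprint(batch: Sequence[str]) -> list:
    return [len(smiles) for smiles in batch]


result = median_ms_per_mol(dummy_fingerprint, ["CCO", "c1ccccc1"] * 250)
print(f"{result:.4f} ms/mol")
```

Taking the median rather than the mean keeps a single slow repetition (for example, one disturbed by background activity) from skewing the reported figure.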

Benchmark sets

Sets are generated from the CLinventory by scripts/create_size_benchmarks.py and stored in tests/test_data/size_benchmarks/.

| Set          | Heavy-atom range | Molecules |
|--------------|------------------|-----------|
| bench_tiny   | 1 – 10           | 500       |
| bench_small  | 11 – 20          | 500       |
| bench_medium | 21 – 35          | 500       |
| bench_large  | 36 – 60          | 500       |
| bench_xlarge | 61+              | 500       |
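The heavy-atom ranges listed above amount to a simple lookup from atom count to set name. The sketch below shows that mapping only; `SIZE_BINS` and `size_bin` are hypothetical names, and how scripts/create_size_benchmarks.py actually counts atoms or samples the 500 molecules per set is not described here. With RDKit, the count would typically come from `Mol.GetNumHeavyAtoms()`, which is an assumption about the script rather than a documented fact.

```python
from typing import Optional

# Bin boundaries copied from the ranges above: (set name, min, max).
# A max of None marks the open-ended top bin.
SIZE_BINS = [
    ("bench_tiny", 1, 10),
    ("bench_small", 11, 20),
    ("bench_medium", 21, 35),
    ("bench_large", 36, 60),
    ("bench_xlarge", 61, None),
]


def size_bin(heavy_atoms: int) -> Optional[str]:
    """Return the benchmark set for a heavy-atom count, or None if out of range."""
    for name, low, high in SIZE_BINS:
        if heavy_atoms >= low and (high is None or heavy_atoms <= high):
            return name
    return None  # e.g. a count of 0 belongs to no set


for count in (7, 20, 21, 60, 150):
    print(count, size_bin(count))
```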

pyCSRML timing results

Measured on Snapdragon X Elite X1E78100 (ARM64, 12 cores, ~32 GB RAM), Python 3.14.2, RDKit 2025.09.3, NumPy 2.3.5.

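| Set          | Heavy atoms | ToxPrint v2 (ms/mol) | TxP_PFAS v1 (ms/mol) |
|--------------|-------------|----------------------|----------------------|
| bench_tiny   | 1 – 10      | 3.76                 | 0.73                 |
| bench_small  | 11 – 20     | 5.47                 | 1.01                 |
| bench_medium | 21 – 35     | 8.23                 | 1.53                 |
| bench_large  | 36 – 60     | 12.32                | 2.19                 |
| bench_xlarge | 61+         | 23.20                | 4.46                 |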

The TxP_PFAS v1 fingerprinter (129 bits) is roughly 5× faster than ToxPrint v2 (729 bits) in every size bin, with per-bin ratios between about 5.2× and 5.6×. For both fingerprinters the time per molecule grows approximately linearly with heavy-atom count: ToxPrint v2 ranges from 3.76 ms/mol (tiny) to 23.20 ms/mol (xlarge), and TxP_PFAS v1 from 0.73 ms/mol to 4.46 ms/mol, a ~6× increase over the full size range in each case.
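The ratios quoted above follow directly from the timing results. The short calculation below reproduces them from the published numbers; the variable names are illustrative only and are not part of pyCSRML.

```python
# (ToxPrint v2, TxP_PFAS v1) timings in ms/mol, copied from the results above.
TIMINGS_MS_PER_MOL = {
    "bench_tiny": (3.76, 0.73),
    "bench_small": (5.47, 1.01),
    "bench_medium": (8.23, 1.53),
    "bench_large": (12.32, 2.19),
    "bench_xlarge": (23.20, 4.46),
}

# Per-bin speed advantage of TxP_PFAS v1 over ToxPrint v2.
speedups = {
    name: toxprint / pfas
    for name, (toxprint, pfas) in TIMINGS_MS_PER_MOL.items()
}

# Growth in time per molecule from the smallest to the largest bin.
tiny = TIMINGS_MS_PER_MOL["bench_tiny"]
xlarge = TIMINGS_MS_PER_MOL["bench_xlarge"]
toxprint_growth = xlarge[0] / tiny[0]
pfas_growth = xlarge[1] / tiny[1]

for name, ratio in speedups.items():
    print(f"{name}: TxP_PFAS is {ratio:.2f}x faster")
print(f"ToxPrint v2 tiny -> xlarge: {toxprint_growth:.2f}x")
print(f"TxP_PFAS v1 tiny -> xlarge: {pfas_growth:.2f}x")
```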

Baseline file: tests/test_data/size_benchmarks/pycsrml_timing_baseline.json.

Reproducing the benchmarks

# Create benchmark sets (one-time; requires CLinventory CSV)
python scripts/create_size_benchmarks.py

# Time pyCSRML (saves pycsrml_timing_baseline.json)
python scripts/benchmark_pycsrml_timing.py          # 5 reps (default)
python scripts/benchmark_pycsrml_timing.py --reps 3  # fewer reps, faster

# Run regression tests (require zips from ChemoTyper)
pytest tests/test_benchmark_regression.py -v -m slow

Timing regression tests (tests/test_benchmark_regression.py) fail if any set runs more than 30 % slower than the saved baseline. They skip gracefully until pycsrml_timing_baseline.json exists.
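The 30 % rule can be illustrated with a small comparison function. This is a sketch under stated assumptions, not the contents of tests/test_benchmark_regression.py: `find_regressions` is a hypothetical helper, and the flat mapping of set name to ms/mol used here is an assumed shape, since the actual layout of pycsrml_timing_baseline.json is not documented in this section.

```python
from typing import Dict, List


def find_regressions(
    current_ms: Dict[str, float],
    baseline_ms: Dict[str, float],
    tolerance: float = 0.30,
) -> List[str]:
    """Return the sets whose current time exceeds baseline * (1 + tolerance)."""
    regressed = []
    for name, baseline in baseline_ms.items():
        if name not in current_ms:
            continue  # nothing measured for this set, so nothing to compare
        if current_ms[name] > baseline * (1.0 + tolerance):
            regressed.append(name)
    return regressed


# Assumed baseline shape: set name -> ms/mol (ToxPrint v2 values from above).
baseline = {"bench_tiny": 3.76, "bench_xlarge": 23.20}

# 4.80 is within 30 % of 3.76 (limit 4.888); 31.00 exceeds 23.20 * 1.3 = 30.16.
current = {"bench_tiny": 4.80, "bench_xlarge": 31.00}

print(find_regressions(current, baseline))
```

A relative tolerance lets one threshold serve all five sets, even though their absolute timings differ by a factor of about six.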

System information

Full details are recorded in tests/test_data/size_benchmarks/SYSTEM_INFO.md.

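| Property       | Value                                        |
|----------------|----------------------------------------------|
| Host           | ZenbookA14                                   |
| OS             | Windows 11                                   |
| CPU            | Snapdragon X Elite X1E78100 (Qualcomm Oryon) |
| Architecture   | ARM64                                        |
| Physical cores | 12                                           |
| RAM            | ~32 GB                                       |
| Python         | 3.14.2                                       |
| RDKit          | 2025.09.3                                    |
| NumPy          | 2.3.5                                        |
| pyCSRML        | 0.1.0 (editable install)                     |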