Script Reference

This document maps scripts and notebooks to their intended role and current stability.

Maturity Legend

Canonical
    expected reproducible workflow entry point
Stable utility
    maintained and useful, but not necessarily part of the default article profile
Research
    useful experiment runner or analysis tool, but more likely to evolve
Optional
    depends on external stacks such as FastSolv or SolProp
Infrastructure
    internal helper, not a user-facing workflow

Preferred CLI Layout

The preferred human-facing CLI surface is now grouped by purpose:

- `scripts/data/`
- `scripts/training/`
- `scripts/evaluation/`
- `scripts/experiments/`
- `scripts/external/`

Legacy top-level `scripts/*.py` entry points remain available as compatibility wrappers because tests, script-to-script imports, and shell drivers such as `reproduce.sh` still rely on them.
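For the command-line case, a compatibility wrapper can be very thin. The sketch below shows one way a legacy top-level script could delegate to its grouped counterpart using the standard-library `runpy` module. The helper name `run_grouped_script` and the example target path are assumptions for illustration, not the repository's actual wrapper code, and a wrapper that must also support script-to-script imports would additionally need to re-export the target's names.

```python
"""Hypothetical legacy wrapper, e.g. a top-level scripts/train.py (illustrative only)."""
import runpy
import sys
from pathlib import Path


def run_grouped_script(legacy_file: str, grouped_relpath: str) -> dict:
    """Execute the grouped script as if it had been invoked directly.

    legacy_file: path of the wrapper itself (normally ``__file__``).
    grouped_relpath: location of the real script relative to the wrapper's folder.
    Returns the executed script's global namespace.
    """
    target = Path(legacy_file).resolve().parent / grouped_relpath
    if not target.is_file():
        raise FileNotFoundError(f"grouped script not found: {target}")
    # Point argv[0] at the real script so argparse usage text stays accurate.
    sys.argv[0] = str(target)
    # run_name="__main__" triggers the target's `if __name__ == "__main__":` block.
    return runpy.run_path(str(target), run_name="__main__")


# In a real wrapper the call might look like this (the path is an assumption):
# run_grouped_script(__file__, "training/train.py")
```

Because the wrapper forwards `sys.argv` untouched, existing invocations keep working with the same flags.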

Canonical Workflow

Entry point	Role	Status	Notes
`scripts/data/prepare_data.py`	Build processed splits from raw sources	Canonical	Writes all supported split families
`scripts/training/train.py`	Train one TGNN-Solv model	Canonical	Three-phase curriculum
`scripts/experiments/run_seeds.py`	Multi-seed wrapper	Canonical	Can call other train scripts too
`scripts/evaluation/evaluate_complete.py`	Quick checkpoint evaluation	Canonical	Figure-ready arrays
`scripts/experiments/run_split_comparisons.py`	Fair split-wise comparison	Canonical	TGNN, DirectGNN, RF modes
`scripts/experiments/reproduce_paper.py`	Structured article-reproduction runner	Canonical	Supports `core`, `article`, and `full` profiles
`scripts/experiments/generate_paper_figures.py`	Figure generation	Canonical	Consumes result JSONs
`reproduce.sh`	Compatibility shell driver	Canonical	Delegates to `scripts/experiments/reproduce_paper.py --profile article`
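The invocation that `reproduce.sh` delegates to can also be assembled from Python. The sketch below only builds and prints the command line, using the `--profile` flag and the three profile names listed in the table. The helper name `reproduce_command` is an assumption, and the relative script path presumes the working directory is the repository root.

```python
import shlex
import sys
from typing import List

# Profile names as listed for reproduce_paper.py in the table above.
PROFILES = ("core", "article", "full")


def reproduce_command(profile: str = "article") -> List[str]:
    """Build the argv list for the structured reproduction runner."""
    if profile not in PROFILES:
        raise ValueError(f"unknown profile {profile!r}; expected one of {PROFILES}")
    return [
        sys.executable,
        "scripts/experiments/reproduce_paper.py",
        "--profile",
        profile,
    ]


# Print the command instead of running it; pass the list to subprocess.run
# from the repository root to actually launch the workflow.
print(shlex.join(reproduce_command("article")))
```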

Stable Utilities

Entry point	Role	Status	Notes
`scripts/training/train_directgnn.py`	Train DirectGNN baseline	Stable utility	Supports descriptor augmentation
`scripts/training/train_with_pretrain.py`	Train TGNN-Solv with Stage 0 enabled by default	Stable utility	Thin wrapper over `train.py --pretrain --run-descriptor-probe`; useful for GPS and descriptor-augmented TGNN warm starts too
`scripts/training/run_resume_safe_train.sh`	Resume-safe TGNN wrapper for cloud sessions	Stable utility	Wraps `train.py --resume`
`scripts/evaluation/benchmark_tgnn_solv.py`	Rich benchmark via `Evaluator`	Stable utility	Use when you want more than quick eval
`scripts/evaluation/benchmark_adapter_model.py`	Benchmark a formal Python adapter	Stable utility	Preferred custom-model path when you want fit/predict/report in one contract
`scripts/evaluation/analyze_benchmark.py`	Text summary of benchmark JSON	Stable utility	Lightweight reporting helper
`scripts/evaluation/compare_models.py`	Compare multiple TGNN checkpoints	Stable utility	Wraps benchmark logic
`scripts/training/diagnose_training.py`	Dataset stats and overfit sanity check	Stable utility	Good pre-flight tool
`scripts/evaluation/probe_gsol_descriptor_recovery.py`	Ridge linear probe from `g_sol` to RDKit descriptors	Stable utility	Useful for encoder-capacity diagnostics
`scripts/evaluation/run_thermo_stress_suite.py`	Stress slices on canonical prediction bundles	Stable utility	Reads `predictions.csv`, writes slice metrics JSON
`scripts/experiments/run_optuna.py`	Hyperparameter tuning	Stable utility	Supports TGNN, GPS TGNN, descriptor-augmented TGNN, and DirectGNN families
`scripts/launch_lab.py`	Launch the maintained Streamlit control surface	Stable utility	Preferred GUI entry point
`scripts/gui/launch_lab.py`	Namespaced launcher for the same lab	Stable utility	Same behavior, alternate path

Research Experiment Runners

Entry point	Role	Status	Notes
`scripts/experiments/run_ablation.py`	Multi-seed ablation sweeps	Research	Includes `fixed_group_priors` and `direct_gnn`
`scripts/experiments/run_full_budget_experiment.py`	Full-budget TGNN-vs-DirectGNN diagnostic study	Research	Exports TGNN intermediates and oracle diagnostics
`scripts/experiments/run_medium_budget_comparison.py`	Full-split medium-budget architecture comparison	Research	4 TGNN variants, 2 DirectGNN variants, RF baseline
`scripts/evaluation/validate_physics.py`	Physics-parameter diagnostics	Research	Useful for TGNN checkpoint inspection
`scripts/evaluation/error_analysis.py`	Detailed residual analysis	Research	Consumes evaluation JSON
`scripts/experiments/learning_curves.py`	Data-efficiency study	Research	Multi-fraction, multi-seed
`scripts/experiments/temperature_extrapolation.py`	Temperature extrapolation study	Research	Uses a combined dataset CSV
`scripts/experiments/statistical_tests.py`	Paired significance testing	Research	Used by the `full` reproduction profile, but still analysis-oriented
`scripts/experiments/generate_supplementary.py`	Supplementary table generation	Research	Consumes produced result JSONs
`scripts/experiments/build_benchmark_release.py`	Freeze a checksum-based benchmark release manifest	Research	Best when preparing a paper-ready artifact snapshot
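The idea behind a checksum-based release manifest, as named for `build_benchmark_release.py`, can be sketched with the standard library alone. The manifest layout and the function names below are illustrative assumptions; the script's actual output format is not documented on this page.

```python
import hashlib
from pathlib import Path
from typing import List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large artifacts do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict:
    """Record path, size, and SHA-256 for every file under root (hypothetical schema)."""
    files = sorted(p for p in root.rglob("*") if p.is_file())
    return {
        "files": [
            {
                "path": p.relative_to(root).as_posix(),
                "bytes": p.stat().st_size,
                "sha256": sha256_of(p),
            }
            for p in files
        ]
    }


def verify_manifest(root: Path, manifest: dict) -> List[str]:
    """Return the relative paths that are missing or whose checksum changed."""
    changed = []
    for entry in manifest["files"]:
        path = root / entry["path"]
        if not path.is_file() or sha256_of(path) != entry["sha256"]:
            changed.append(entry["path"])
    return changed
```

Freezing the manifest next to the results lets a later reader confirm that the artifacts behind a figure or table have not drifted.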

Optional External Baseline Wrappers

Entry point	Role	Status	Notes
`scripts/external/run_fastsolv.py`	Predict, train, or compare FastSolv	Optional	Preferred FastSolv wrapper
`scripts/external/compare_fastsolv_tgnn.py`	Lightweight TGNN-vs-FastSolv comparison	Optional	Older convenience wrapper
`scripts/external/run_solprop.py`	Zero-shot, calibrated, or native-retrained SolProp	Optional	Usually run in a separate environment

Infrastructure

Entry point	Role	Status	Notes
`scripts/_bootstrap.py`	Adds repo `src/` to `sys.path` for CLIs	Infrastructure	Imported by most scripts
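A bootstrap helper of this kind usually amounts to a few lines. The following is a minimal sketch of the pattern, not the contents of `scripts/_bootstrap.py`; the function name `add_src_to_path` and the `levels_up` parameter are assumptions.

```python
import sys
from pathlib import Path


def add_src_to_path(script_file: str, levels_up: int = 1) -> Path:
    """Prepend <repo>/src to sys.path and return the directory.

    levels_up is the number of directories between the script's folder and
    the repository root: 1 for scripts/foo.py, 2 for scripts/training/foo.py.
    """
    repo_root = Path(script_file).resolve().parents[levels_up]
    src_dir = repo_root / "src"
    src_str = str(src_dir)
    # Avoid duplicate entries when several scripts import the helper.
    if src_str not in sys.path:
        sys.path.insert(0, src_str)
    return src_dir
```

A CLI would call the helper before importing the package, for example `add_src_to_path(__file__, levels_up=2)` from a script in `scripts/training/`.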

Maintained Library Utilities

Some important maintained surfaces are not exposed as standalone CLIs today. They are available through the Python API and are demonstrated in notebooks.

Module / API	Role	Notes
`tgnn_solv.pretrain.Pretrainer`	Stage 0 encoder/readout pretraining core	Used by `train.py --pretrain`, `train_with_pretrain.py`, and `notebooks/02_train.ipynb`
`tgnn_solv.pretrain_pipeline`	Stage 0 checkpoint save/load helpers	Used by the maintained TGNN training CLI
`tgnn_solv.pretrain.download_zinc250k`	Pretraining SMILES acquisition with fallback	Falls back to BigSolDB SMILES if needed
`tgnn_solv.inference.load_model`	Checkpoint loading	Reconstructs config and compatible weights
`tgnn_solv.inference.predict_solubility`	Single-system inference	Returns intermediates, not only final `ln(x2)`
`tgnn_solv.inference.temperature_scan`	Multi-temperature inference	Useful for van't Hoff style inspection
`tgnn_solv.inference.interpret_prediction`	Human-readable prediction report	Good for manual case review
`tgnn_solv.uncertainty.MCDropoutPredictor`	Single-checkpoint uncertainty	Covered in `notebooks/04_evaluation.ipynb`
`tgnn_solv.uncertainty.EnsemblePredictor`	Multi-checkpoint uncertainty	Now works for both `TGNN-Solv` and `DirectGNN` families
`tgnn_solv.uncertainty.calibration_report`	Interval calibration summary	Accepts MC-dropout or ensemble outputs
`tgnn_solv.domain.ApplicabilityDomain`	Inference-time OOD / AD scoring	Covered in `notebooks/03_inference.ipynb` and `notebooks/04_evaluation.ipynb`
`tgnn_solv.benchmark_adapters`	Formal custom-model adapter contract	Lets arbitrary models participate in canonical benchmark bundles
`tgnn_solv.artifacts`	Run manifests and benchmark/model cards	Supplies machine-readable provenance sidecars
`tgnn_solv.stress.build_stress_suite`	Thermodynamic stress slices for benchmark bundles	Used after `predictions.csv` already exists

The same maintained surfaces are also exposed together through `tools/experiment_lab/app.py`; the GUI is an orchestration layer, not a separate model implementation.
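To make the notion of an interval calibration summary concrete, the sketch below computes empirical coverage of central prediction intervals from per-sample means and standard deviations, which is the kind of output MC-dropout or ensemble predictors typically produce. It assumes Gaussian predictive distributions, and the function name and return shape are illustrative; the actual signature and output of `tgnn_solv.uncertainty.calibration_report` are not shown on this page.

```python
from statistics import NormalDist
from typing import Dict, Sequence


def interval_coverage(
    y_true: Sequence[float],
    mean: Sequence[float],
    std: Sequence[float],
    levels: Sequence[float] = (0.5, 0.8, 0.95),
) -> Dict[float, float]:
    """Fraction of targets inside the central Gaussian interval at each level."""
    if not y_true or not (len(y_true) == len(mean) == len(std)):
        raise ValueError("inputs must be non-empty and of equal length")
    coverage = {}
    for level in levels:
        if not 0.0 < level < 1.0:
            raise ValueError("levels must lie strictly between 0 and 1")
        # Half-width of the central interval in units of the predicted std.
        z = NormalDist().inv_cdf(0.5 + level / 2.0)
        hits = sum(abs(y - m) <= z * s for y, m, s in zip(y_true, mean, std))
        coverage[level] = hits / len(y_true)
    return coverage
```

A well-calibrated predictor yields coverage close to each nominal level; coverage well below the level indicates overconfident intervals.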

High-Signal Usage Notes

`scripts/experiments/run_seeds.py`

- default train script is `scripts/training/train.py`
- can also launch `scripts/training/train_directgnn.py`
- aggregates `mae`, `rmse`, `r2`, and `pearson_r`
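The aggregation step can be pictured as a per-metric mean and spread over seeds. This is a minimal sketch under stated assumptions: the function name, the output layout, and the use of the sample standard deviation are illustrative choices, since the page does not say how `run_seeds.py` summarizes the four metrics.

```python
from statistics import mean, stdev
from typing import Dict, List

# Metric names as listed for run_seeds.py.
METRICS = ("mae", "rmse", "r2", "pearson_r")


def aggregate_seed_metrics(per_seed: List[Dict[str, float]]) -> Dict[str, Dict[str, float]]:
    """Summarize each metric across seeds as mean, sample std, and count."""
    if not per_seed:
        raise ValueError("need at least one seed result")
    summary = {}
    for name in METRICS:
        values = [run[name] for run in per_seed]
        summary[name] = {
            "mean": mean(values),
            # stdev needs two or more points; report 0.0 for a single seed.
            "std": stdev(values) if len(values) > 1 else 0.0,
            "n": len(values),
        }
    return summary
```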

`scripts/training/train_directgnn.py`

- computes descriptor normalization stats automatically when `use_descriptor_augmentation=True`
- saves `descriptor_mean` and `descriptor_std` into the checkpoint
- supports `--checkpoint-every` and `--resume`
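Descriptor normalization of this kind is typically a column-wise standardization fitted on the training split and reused at inference time. The sketch below illustrates that pattern in plain Python; only the checkpoint keys `descriptor_mean` and `descriptor_std` come from the notes above, while the function names, the population-variance choice, and the guard for constant columns are assumptions.

```python
from math import sqrt
from typing import List, Sequence, Tuple


def fit_descriptor_stats(
    rows: Sequence[Sequence[float]], eps: float = 1e-8
) -> Tuple[List[float], List[float]]:
    """Column-wise mean and std over the training descriptors."""
    if not rows:
        raise ValueError("need at least one descriptor row")
    n, n_cols = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(n_cols)]
    stds = []
    for j in range(n_cols):
        var = sum((r[j] - means[j]) ** 2 for r in rows) / n
        std = sqrt(var)
        # Constant columns would divide by zero; fall back to unit scale.
        stds.append(std if std > eps else 1.0)
    return means, stds


def normalize(rows, means, stds) -> List[List[float]]:
    """Apply stored statistics, e.g. ones reloaded from a checkpoint."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


train_rows = [[1.0, 5.0], [3.0, 5.0]]
mean_vec, std_vec = fit_descriptor_stats(train_rows)
# Storing the stats alongside the weights keeps inference consistent with training.
checkpoint_extras = {"descriptor_mean": mean_vec, "descriptor_std": std_vec}
```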

`scripts/training/train.py`

- supports `--checkpoint-every` and `--resume`
- optionally runs Stage 0 with `--pretrain`
- can warm-start from `--pretrain-checkpoint`
- can launch the existing descriptor-recovery probe with `--run-descriptor-probe`
- saves reusable Stage 0 encoder/readout checkpoints through `tgnn_solv.pretrain_pipeline`
- stores TGNN descriptor normalization stats in the checkpoint when `use_descriptor_augmentation=True`
- fits `gc_prior_tm_scale` / `gc_prior_tm_bias` on the training split when `use_gc_priors_crystal=True`
- preserves those calibrated GC settings inside the saved config
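The interplay of periodic checkpointing and resuming can be illustrated without any deep-learning framework. The loop below is a stand-in sketch: the function name, the JSON state file, and the fake loss are all assumptions, and it does not reflect how `train.py` serializes model or optimizer state.

```python
import json
from pathlib import Path


def run_training(total_epochs: int, ckpt_path: Path,
                 checkpoint_every: int = 0, resume: bool = False) -> dict:
    """Toy training loop that saves state every N epochs and can resume."""
    state = {"epoch": 0, "loss_history": []}
    if resume and ckpt_path.is_file():
        # Continue from the last completed checkpoint instead of epoch 0.
        state = json.loads(ckpt_path.read_text())
    for epoch in range(state["epoch"], total_epochs):
        loss = 1.0 / (epoch + 1)  # placeholder for a real optimization step
        state["loss_history"].append(loss)
        state["epoch"] = epoch + 1
        if checkpoint_every and state["epoch"] % checkpoint_every == 0:
            # Write to a temp file, then rename, so an interrupted write
            # cannot leave a truncated checkpoint behind.
            tmp_path = ckpt_path.with_suffix(".tmp")
            tmp_path.write_text(json.dumps(state))
            tmp_path.replace(ckpt_path)
    return state
```

Work done after the most recent checkpoint is repeated on resume, which is why a smaller checkpoint interval suits sessions that may be interrupted.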

`scripts/experiments/run_ablation.py`

- resolves canonical variant aliases
- automatically enables any optional dataset feature paths required by the selected variants
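Alias resolution and feature auto-enabling can be sketched as two small lookups. Only the variant names `fixed_group_priors` and `direct_gnn` appear on this page; every alias spelling and the feature name `group_contribution_priors` below are hypothetical placeholders, as are the function names.

```python
from typing import Iterable, List, Set

# Hypothetical alias table mapping alternative spellings to canonical names.
ALIASES = {
    "fixed-group-priors": "fixed_group_priors",
    "directgnn": "direct_gnn",
}

# Hypothetical mapping from canonical variant to the dataset features it needs.
REQUIRED_FEATURES = {
    "fixed_group_priors": {"group_contribution_priors"},
    "direct_gnn": set(),
}


def resolve_variants(requested: Iterable[str]) -> List[str]:
    """Map user-supplied names to canonical variants, dropping duplicates."""
    resolved: List[str] = []
    for name in requested:
        key = name.strip().lower()
        canonical = ALIASES.get(key, key)
        if canonical not in REQUIRED_FEATURES:
            raise KeyError(f"unknown ablation variant: {name!r}")
        if canonical not in resolved:
            resolved.append(canonical)
    return resolved


def required_features(variants: Iterable[str]) -> Set[str]:
    """Union of the optional dataset features needed by the selected variants."""
    features: Set[str] = set()
    for variant in variants:
        features |= REQUIRED_FEATURES[variant]
    return features
```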

`scripts/experiments/run_full_budget_experiment.py`

- trains TGNN-Solv and DirectGNN on matched budgets
- exports `metrics.json`, `diagnostics.json`, and `tgnn_intermediates.csv`
- passes `--checkpoint-every` through to the training CLIs
- resumes from existing per-seed checkpoints when available

`scripts/experiments/run_medium_budget_comparison.py`

- runs the medium-budget full-scaffold comparison under `results/medium_budget`
- derives a no-oracle training config from `paper_config_combined.yaml`
- writes `summary.json`, `comparison_table.md`, and per-model artifacts
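Producing a machine-readable summary next to a human-readable table is straightforward. The sketch below writes files with the names mentioned above, but the JSON schema, the table columns, and the function names are assumptions rather than the script's actual output format.

```python
import json
from pathlib import Path
from typing import Dict


def render_comparison_table(results: Dict[str, Dict[str, float]]) -> str:
    """Render {model: {metric: value}} as a Markdown table, one row per model."""
    metrics = sorted({name for row in results.values() for name in row})
    lines = [
        "| Model | " + " | ".join(metrics) + " |",
        "|---|" + "---|" * len(metrics),
    ]
    for model, row in results.items():
        # Models may lack a metric; mark the gap instead of failing.
        cells = [f"{row[m]:.3f}" if m in row else "n/a" for m in metrics]
        lines.append(f"| {model} | " + " | ".join(cells) + " |")
    return "\n".join(lines) + "\n"


def write_outputs(out_dir: Path, results: Dict[str, Dict[str, float]]) -> None:
    """Write summary.json and comparison_table.md into out_dir."""
    out_dir.mkdir(parents=True, exist_ok=True)
    (out_dir / "summary.json").write_text(json.dumps(results, indent=2, sort_keys=True))
    (out_dir / "comparison_table.md").write_text(render_comparison_table(results))
```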

Notebook Reference

Notebook	Role	Recommended usage
`notebooks/01_prepare_data.ipynb`	Data preparation	Canonical interactive equivalent of `prepare_data.py`
`notebooks/02_train.ipynb`	TGNN training walkthrough	Interactive training plus optional Stage 0 pretraining
`notebooks/03_inference.ipynb`	Inference examples	Manual inspection, temperature scans, and single-query AD checks
`notebooks/04_evaluation.ipynb`	Evaluation workflow	Stratified metrics, MC-dropout, calibration, and AD analysis
`notebooks/05_baselines.ipynb`	Baseline experiments	Exploratory DirectGNN, descriptor, RF, and external-baseline work
`notebooks/06_ablations.ipynb`	Ablation experiments	Exploratory ablations including maintained split-late comparison
`notebooks/07_temperature.ipynb`	Temperature analysis	Research notebook for van't Hoff and multi-temperature behavior
`notebooks/08_optuna_tuning.ipynb`	Optuna tuning	Interactive tuning
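For the van't Hoff style inspection mentioned for the temperature notebook and for multi-temperature inference, the underlying relation is ln(x2) = -ΔH/(R·T) + ΔS/R, so a straight-line fit of ln(x2) against 1/T yields apparent dissolution enthalpy and entropy. The sketch below fits that line with ordinary least squares on synthetic values; the function name and the data are illustrative, and it does not call any `tgnn_solv` API.

```python
from typing import Dict, Sequence

R_GAS = 8.314462618  # molar gas constant, J/(mol K)


def vant_hoff_fit(temps_k: Sequence[float], ln_x2: Sequence[float]) -> Dict[str, float]:
    """Least-squares fit of ln(x2) = slope * (1/T) + intercept."""
    if len(temps_k) != len(ln_x2) or len(set(temps_k)) < 2:
        raise ValueError("need matching inputs with at least two distinct temperatures")
    inv_t = [1.0 / t for t in temps_k]
    n = len(inv_t)
    mean_x = sum(inv_t) / n
    mean_y = sum(ln_x2) / n
    sxx = sum((x - mean_x) ** 2 for x in inv_t)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(inv_t, ln_x2))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return {
        "delta_h_j_per_mol": -slope * R_GAS,       # slope = -dH / R
        "delta_s_j_per_mol_k": intercept * R_GAS,  # intercept = dS / R
    }


# Synthetic curve generated from known values, used only to exercise the fit.
temps = [280.0, 290.0, 300.0, 310.0, 320.0]
ln_x = [-20000.0 / (R_GAS * t) + 30.0 / R_GAS for t in temps]
fit = vant_hoff_fit(temps, ln_x)
```

Systematic curvature in the residuals of such a fit suggests a temperature-dependent enthalpy, which a single straight line cannot capture.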