Experiment Lab

Experiment Lab is the maintained interactive control surface for TGNN-Solv. It uses the same scripts, checkpoints, processed CSVs, and Python APIs documented elsewhere on this site, but exposes them as one visual workspace.

Use it when you want:

  • one place to launch training, evaluation, experiments, and reproduction jobs
  • interactive single-system inference with RDKit structures and atom graphs
  • post-hoc uncertainty inspection, calibration review, and OOD checks
  • application-facing solvent screening for synthesis and formulation work
  • artifact browsing, lineage tracing, and side-by-side result comparison
  • visual DAG planning and model-architecture editing
  • a repo-backed kanban board and experiment schedule
  • access to the documentation site from inside the app itself

Install

Install the GUI-enabled extras:

pip install -e ".[gui,dev]"

The core model runtime still follows the same PyTorch / PyG installation path described in Installation.

Launch

Preferred launchers:

python scripts/launch_lab.py

or:

python scripts/gui/launch_lab.py

Runtime Model

The Streamlit UI process and the model-execution process do not need to use the same interpreter.

The sidebar exposes a Python command field. That command is used for:

  • training and evaluation subprocesses
  • checkpoint inspection
  • detailed inference
  • uncertainty and calibration helpers
  • environment diagnostics

This is useful when the GUI runs in a lightweight environment while model code executes in the full tgnn-solv environment.
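As a sketch of how that interpreter split might be wired, the snippet below hands a helper invocation to whichever interpreter you configure. The function name and wiring here are illustrative assumptions, not the lab's internal API:

```python
import subprocess
import sys

def run_model_command(args, python_cmd=None):
    """Run a model-side helper in a (possibly different) interpreter.

    python_cmd mirrors the sidebar's Python command field; when unset,
    the GUI's own interpreter is used. Illustrative sketch only -- the
    real lab routes jobs through its own launch helpers.
    """
    cmd = [python_cmd or sys.executable] + list(args)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.strip()

# Example: ask the configured interpreter a question, the way an
# environment diagnostic might.
print(run_model_command(["-c", "import sys; print(sys.version_info.major)"]))
```

Pointing `python_cmd` at, say, a conda environment's interpreter keeps the Streamlit process light while training and inference run against the full dependency stack.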

Main Workspaces

Data

  • processed split coverage
  • representative rows and temperature / solubility scatter
  • split-family navigation for scaffold, solute, and solvent holdout CSVs

Training

  • tuned TGNN-Solv and DirectGNN launchers
  • Stage 0 launch controls, warm-start checkpoint loading, and descriptor-probe export for TGNN
  • GPS-aware config inspection and TGNN descriptor-augmentation visibility in the config summary
  • curriculum-aware training setup
  • the same grouped training entry points documented in Training

Experiments

  • multi-seed, split-comparison, medium-budget, full-budget, and Optuna launchers
  • external baseline benchmark launcher for FastSolv and SolProp, including native SolProp retraining on TGNN-Solv targets
  • custom-model benchmark launcher for arbitrary prediction CSVs or command-generated outputs
  • formal adapter-based custom-model benchmarking through tgnn_solv.benchmark_adapters
  • canonical benchmark bundles that drop straight into the same results registry and compare views as maintained models
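For the custom-model launcher, a prediction CSV can be as small as the sketch below. The column names are assumptions for illustration; match whatever schema the launcher in your checkout actually expects:

```python
import csv

# Minimal custom-model prediction CSV. Column names are illustrative
# assumptions, not a documented schema.
rows = [
    {"solute_smiles": "CCO", "solvent_smiles": "O",
     "temperature_K": 298.15, "logS_pred": 0.8},
    {"solute_smiles": "c1ccccc1", "solvent_smiles": "O",
     "temperature_K": 298.15, "logS_pred": -1.6},
]
with open("custom_predictions.csv", "w", newline="") as fh:
    writer = csv.DictWriter(fh, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
```

A CSV like this can then be pointed at by the custom-model benchmark launcher, or wrapped formally through a tgnn_solv.benchmark_adapters adapter.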

Pipeline Studio

  • Airflow-style DAG editor with drag, connect, and inline block editing
  • repo-backed preset saving under tools/experiment_lab/presets/pipelines/
  • shell / JSON export
  • import of saved inference, uncertainty, or calibration artifacts as DAG nodes
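A saved pipeline preset is just a repo-tracked JSON file. The shape below is a hypothetical example of what such a preset might contain; the actual schema is defined by Pipeline Studio and its node kinds and keys may differ:

```python
import json
from pathlib import Path

# Hypothetical DAG preset -- node/edge keys are illustrative only.
preset = {
    "name": "train-then-calibrate",
    "nodes": [
        {"id": "train", "label": "Train TGNN-Solv"},
        {"id": "calibrate", "label": "Calibration review"},
    ],
    "edges": [{"from": "train", "to": "calibrate"}],
}

# Presets live under the repo-backed preset folder named above.
out_dir = Path("tools/experiment_lab/presets/pipelines")
out_dir.mkdir(parents=True, exist_ok=True)
(out_dir / "train_then_calibrate.json").write_text(json.dumps(preset, indent=2))
```

Because presets are plain JSON in the repo, they diff cleanly in review and can be exported back out as shell or JSON from the editor.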

Model Architect

  • TGNN-Solv and DirectGNN config editing
  • explicit encoder_type switching between mpnn and gps
  • GPS positional-encoding controls and TGNN descriptor-augmentation branches
  • Stage 0 warm-start planning and launch-time export through the same maintained training CLI
  • visual branch diff for the maintained architecture comparison
  • real RDKit structure and graph previews derived from the current SMILES

Results & Plots

  • artifact registry across results/, checkpoints/, figures/, and tables/
  • Benchmark Studio for canonical benchmark bundles with leaderboard, parity/residual plots, temperature/error views, and stratified metrics across FastSolv, SolProp, TGNN-family, and custom-model outputs
  • benchmark-card and run-manifest inspection directly from the focused bundle view
  • lineage graph linking configs, jobs, checkpoints, lab histories, and planner follow-ups
  • artifact diff for checkpoints, JSON reports, and CSV tables

Inference

  • single-system prediction with decomposition and temperature scan
  • explicit DirectGNN run-and-inspect path for the matched no-physics baseline
  • persistent history for saved inference runs
  • Uncertainty lab for ensemble and MC-dropout review
  • Calibration dashboard for PICP_90, MPIW, MAE, RMSE, and parity plots
  • OOD / applicability-domain scoring through tgnn_solv.domain
  • structure drawing/editing with a sanitized RDKit preview before the model ever sees the SMILES
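The calibration metrics named above are standard interval statistics. The standalone sketch below shows the math only (it is not the lab's internal implementation): PICP_90 is the fraction of true values inside the 90% prediction interval, and MPIW is the mean interval width.

```python
def calibration_metrics(y_true, lower, upper, y_pred):
    """Interval-calibration metrics from per-sample 90% bounds."""
    n = len(y_true)
    # Coverage: fraction of targets inside [lower, upper].
    picp = sum(l <= y <= u for y, l, u in zip(y_true, lower, upper)) / n
    # Sharpness: mean width of the prediction interval.
    mpiw = sum(u - l for l, u in zip(lower, upper)) / n
    mae = sum(abs(y - p) for y, p in zip(y_true, y_pred)) / n
    rmse = (sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / n) ** 0.5
    return {"PICP_90": picp, "MPIW": mpiw, "MAE": mae, "RMSE": rmse}

# Toy log-solubility values for illustration.
m = calibration_metrics(
    y_true=[-1.0, -2.0, -3.0, -4.0],
    lower=[-1.5, -2.4, -3.1, -5.0],
    upper=[-0.5, -1.6, -2.0, -4.2],
    y_pred=[-1.0, -2.0, -2.5, -4.6],
)
print(m)
```

A well-calibrated 90% interval should give PICP_90 near 0.90 with MPIW as small as possible; the dashboard plots these alongside MAE, RMSE, and parity views.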

Current scope note:

  • Run & inspect, Uncertainty lab, and Calibration dashboard now support both TGNN-Solv and DirectGNN
  • solver decomposition, GC priors, and OOD / applicability-domain scoring remain TGNN-Solv-specific because they rely on the physics-facing model path

Applications

  • synthesis-route solvent screening: ranks explicit solvents for hot-to-cold isolation windows per intermediate
  • developability / oral dose-pressure proxies: uses water plus a small explicit-solvent panel
  • solvent-swap screening: estimates crash-out pressure during workup or antisolvent transfer

Important scope note:

  • this is a translation layer on top of solubility prediction
  • it is not a full retrosynthesis planner
  • it is not a mechanistic PK/PD simulator

Planner

  • repo-backed kanban board
  • experiment todo list with scheduling
  • intake from saved inference / uncertainty / calibration history
  • launch of selected follow-up tasks directly from the board

Documentation

  • local Markdown rendering from docs/
  • embedded published documentation site

Reproduce

  • structured article-reproduction launcher with core, article, and full profiles
  • step-graph view sourced from the same maintained reproduction module used by reproduce.sh and scripts/experiments/reproduce_paper.py
  • targeted launch of selected reproduction steps when you do not want the whole profile

History and Lineage

The lab persists its own analysis sessions under:

  • results/lab_runs/inference_history/
  • results/lab_runs/uncertainty_history/
  • results/lab_runs/calibration_history/

These saved JSON artifacts are reused throughout the app:

  • they appear in the results registry
  • they can be compared and downloaded from the inference workbench
  • they can be imported into Pipeline Studio
  • they can be turned into follow-up tasks in Planner
  • they appear in the lineage graph together with checkpoints and configs
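Because the history folders hold plain JSON, saved sessions are also easy to inspect outside the app. The sketch below scans the three folders named above; the payload keys in the demo artifact are illustrative, since real artifacts follow the lab's own schema:

```python
import json
from pathlib import Path

def list_saved_sessions(root="results/lab_runs"):
    """Collect saved lab sessions from the three history folders."""
    sessions = []
    for folder in ("inference_history", "uncertainty_history",
                   "calibration_history"):
        for path in sorted(Path(root, folder).glob("*.json")):
            sessions.append((folder, path.name, json.loads(path.read_text())))
    return sessions

# Demo: drop one fake artifact in place and list it. The payload keys
# are assumptions, not the lab's real artifact schema.
demo = Path("results/lab_runs/inference_history")
demo.mkdir(parents=True, exist_ok=True)
(demo / "example.json").write_text(json.dumps({"smiles": "CCO", "logS": -0.2}))
for folder, name, payload in list_saved_sessions():
    print(folder, name, sorted(payload))
```

The same files are what the registry, compare views, Pipeline Studio import, and Planner intake read, so external tooling can consume them directly.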

When To Use the Lab vs the CLI

Use the lab when you need:

  • visual monitoring
  • ad hoc single-system analysis
  • artifact comparison
  • external baseline or custom-model benchmarking without hand-writing wrapper commands
  • planning, scheduling, and manual review loops

Use the CLI when you need:

  • unattended execution on a server
  • scripted reproducibility
  • simple copy-paste commands for training or evaluation
  • integration into external automation

Both surfaces are maintained and point to the same underlying project assets.