Configuration and CLI
SchNetPack models and tasks can be defined using hierarchical Hydra config files in YAML format and modified using command line arguments. Here, we will introduce the structure and basic syntax of the configuration. Please refer to the Hydra documentation for more extensive information.
We will explain the structure of the config at the example of training PaiNN on QM9 with the command:
$ spktrain experiment=qm9_atomwise
Before going through the config step-by-step, we show the full config as printed by the command:
⚙ Running with the following config:
├── run
│ └── work_dir: ${hydra:runtime.cwd}
│ data_dir: ${run.work_dir}/data
│ path: runs
│ id: ${uuid:1}
│
├── globals
│ └── model_path: best_model
│ cutoff: 5.0
│ lr: 0.0005
│ property: energy_U0
│
├── data
│ └── _target_: schnetpack.datasets.QM9
│ datapath: ${run.data_dir}/qm9.db
│ data_workdir: null
│ batch_size: 100
│ num_train: 110000
│ num_val: 10000
│ num_test: null
│ num_workers: 8
│ num_val_workers: null
│ num_test_workers: null
│ remove_uncharacterized: false
│ distance_unit: Ang
│ property_units:
│ energy_U0: eV
│ energy_U: eV
│ enthalpy_H: eV
│ free_energy: eV
│ homo: eV
│ lumo: eV
│ gap: eV
│ zpve: eV
│ transforms:
│ - _target_: schnetpack.transform.SubtractCenterOfMass
│ - _target_: schnetpack.transform.RemoveOffsets
│ property: ${globals.property}
│ remove_atomrefs: true
│ remove_mean: true
│ - _target_: schnetpack.transform.MatScipyNeighborList
│ cutoff: ${globals.cutoff}
│ - _target_: schnetpack.transform.CastTo32
│
├── model
│ └── representation:
│ _target_: schnetpack.representation.PaiNN
│ n_atom_basis: 128
│ n_interactions: 3
│ shared_interactions: false
│ shared_filters: false
│ radial_basis:
│ _target_: schnetpack.nn.radial.GaussianRBF
│ n_rbf: 20
│ cutoff: ${globals.cutoff}
│ cutoff_fn:
│ _target_: schnetpack.nn.cutoff.CosineCutoff
│ cutoff: ${globals.cutoff}
│ _target_: schnetpack.model.NeuralNetworkPotential
│ input_modules:
│ - _target_: schnetpack.atomistic.PairwiseDistances
│ output_modules:
│ - _target_: schnetpack.atomistic.Atomwise
│ output_key: ${globals.property}
│ n_in: ${model.representation.n_atom_basis}
│ aggregation_mode: sum
│ postprocessors:
│ - _target_: schnetpack.transform.CastTo64
│ - _target_: schnetpack.transform.AddOffsets
│ property: ${globals.property}
│ add_mean: true
│ add_atomrefs: true
│
├── task
│ └── optimizer_cls: torch.optim.AdamW
│ optimizer_args:
│ lr: ${globals.lr}
│ weight_decay: 0.0
│ scheduler_cls: schnetpack.train.ReduceLROnPlateau
│ scheduler_monitor: val_loss
│ scheduler_args:
│ mode: min
│ factor: 0.8
│ patience: 80
│ threshold: 0.0001
│ threshold_mode: rel
│ cooldown: 10
│ min_lr: 0.0
│ smoothing_factor: 0.0
│ _target_: schnetpack.AtomisticTask
│ outputs:
│ - _target_: schnetpack.task.ModelOutput
│ name: ${globals.property}
│ loss_fn:
│ _target_: torch.nn.MSELoss
│ metrics:
│ mae:
│ _target_: torchmetrics.regression.MeanAbsoluteError
│ mse:
│ _target_: torchmetrics.regression.MeanSquaredError
│ loss_weight: 1.0
│ warmup_steps: 0
│
├── trainer
│ └── _target_: pytorch_lightning.Trainer
│ devices: 1
│ min_epochs: null
│ max_epochs: 100000
│ enable_model_summary: true
│ profiler: null
│ gradient_clip_val: 0
│ accumulate_grad_batches: 1
│ val_check_interval: 1.0
│ check_val_every_n_epoch: 1
│ num_sanity_val_steps: 0
│ fast_dev_run: false
│ overfit_batches: 0
│ limit_train_batches: 1.0
│ limit_val_batches: 1.0
│ limit_test_batches: 1.0
│ track_grad_norm: -1
│ detect_anomaly: false
│ amp_backend: native
│ amp_level: null
│ precision: 32
│ accelerator: auto
│ num_nodes: 1
│ tpu_cores: null
│ deterministic: false
│ resume_from_checkpoint: null
│
├── callbacks
│ └── model_checkpoint:
│ _target_: schnetpack.train.ModelCheckpoint
│ monitor: val_loss
│ save_top_k: 1
│ save_last: true
│ mode: min
│ verbose: false
│ dirpath: checkpoints/
│ filename: '{epoch:02d}'
│ model_path: ${globals.model_path}
│ early_stopping:
│ _target_: pytorch_lightning.callbacks.EarlyStopping
│ monitor: val_loss
│ patience: 1000
│ mode: min
│ min_delta: 0.0
│ lr_monitor:
│ _target_: pytorch_lightning.callbacks.LearningRateMonitor
│ logging_interval: epoch
│
├── logger
│ └── tensorboard:
│ _target_: pytorch_lightning.loggers.tensorboard.TensorBoardLogger
│ save_dir: tensorboard/
│ name: default
│
└── seed
└── None
Train config and config groups
The config printed above is the flattened final config that the SchNetPack receives as a dictionary. However, it is not necessary to write down the whole config in a file when specifying a run. Instead Hydra uses config groups that are hierarchically ordered in directories and allow to predefine templates for parts of the config.
The default configs for SchNetPack are located in the directory src/schnetpack/configs
.
The config train.yaml
is used as a basis for all training runs and sets the
following default config groups:
defaults:
- run: default_run
- globals: default_globals
- trainer: default_trainer
- callbacks:
- checkpoint
- earlystopping
- lrmonitor
- task: default_task
- model: null
- data: custom
- logger: tensorboard
- experiment: null
Here is a description of the purpose of the different config groups:
run: defines run-specific variables, such as the run
id
, or working and data directoriesglobals: defines custom variables that can be reused across the whole config by making use of the interpolation syntax
${globals.variable}
data: defines the
data.AtomsDataModule
to be usedmodel: defines the
model.AtomisticModel
to be usedtask: defines the
task.AtomisticTask
trainer: configure the PyTorchLightning
Trainer
callbacks: a list of callbacks for the PyTorchLightning
Trainer
logger: a dictionary of training logger that is passed to the trainer
seed: the random seed
experiment: define experiment templates by overriding the train.yaml config
A special role plays the config group experiment
, which does not occur in the config
shown above. This is because experiment
is used to overwrite the defaults of
train.yaml
and can be used to build pre-defined configs, such as for the QM9 case
shown above.
The config groups data
, model
, task
, trainer
, callback
and
logger
directly define objects using the special key _target_
, which specifies
a class, while the remaining key-value pairs define the arguments passed to the
__init__
.
Defining experiments
We will now take a look at the QM9 experiment config:
# @package _global_
defaults:
- override /model: nnp
- override /data: qm9
The first line indicates that the experiment config should be placed at the base level of the hierarchy, i.e. it directly overrides train.yaml`. Then, the defaults for the model and data config groups are overridden.
The configs for the model.NeuralNetworkPotential
(nnp
) and the
dataset.AtomsDataModule
(qm9
) are predefined in the respective directories
of their config groups. E.g., the data config loads the predefined
AtomsDataModule
for QM9 that automatically downloads the dataset and sets the
units that the property should be converted to:
defaults:
- custom
_target_: schnetpack.datasets.QM9
datapath: ${run.data_dir}/qm9.db # data_dir is specified in train.yaml
batch_size: 100
num_train: 110000
num_val: 10000
remove_uncharacterized: False
# convert to typically used units
distance_unit: Ang
property_units:
energy_U0: eV
energy_U: eV
enthalpy_H: eV
free_energy: eV
homo: eV
lumo: eV
gap: eV
zpve: eV
In the next section of the experiment config, the run path and global variables are set:
run.path: runs/qm9_${globals.property}
globals:
cutoff: 5.
lr: 5e-4
property: energy_U0
These variables will be used in the following sections. There, the defaults for model and data loaded above are overridden:
data:
transforms:
- _target_: schnetpack.transform.SubtractCenterOfMass
- _target_: schnetpack.transform.RemoveOffsets
property: ${globals.property}
remove_atomrefs: True
remove_mean: True
- _target_: schnetpack.transform.MatScipyNeighborList
cutoff: ${globals.cutoff}
- _target_: schnetpack.transform.CastTo32
model:
output_modules:
- _target_: schnetpack.atomistic.Atomwise
output_key: ${globals.property}
n_in: ${model.representation.n_atom_basis}
aggregation_mode: sum
postprocessors:
- _target_: schnetpack.transform.CastTo64
- _target_: schnetpack.transform.AddOffsets
property: ${globals.property}
add_mean: True
add_atomrefs: True
All parameters not set here are kept from the default configs.
The data config is altered by setting a custom list of pre-processing transforms.
These are suitable for prediction of the energy and similar extensive targets.
The list includes removing of the single atom reference and the mean energy per atom,
computing the neighborlist and finally casting to float32
.
The output-specific part of the model is set to predict the energy as a sum of atomwise
contributions. The key in the output dictionary output_key
is set to the name of the
property to be predicted. Finally, the specified post-processor casts back to double and
adds the removed offsets to the prediction.
The missing part is to define the task that should be solved during the training:
task:
outputs:
- _target_: schnetpack.task.ModelOutput
name: ${globals.property}
loss_fn:
_target_: torch.nn.MSELoss
metrics:
mae:
_target_: torchmetrics.regression.MeanAbsoluteError
mse:
_target_: torchmetrics.regression.MeanSquaredError
loss_weight: 1.
This last section modifies the config of task.AtomisticTask
by setting a custom
list of model outputs. In this case, we use the mean squared error as a loss on
the predicted property. When the target_name
is nor explicitly set, it is assumed to
be identical with the name
of the prediction.
In conclusion, the hierarchical structure of the configuration allows us to prepare templates for common use case of the individual components and then assemble and modify them as required for a specific experiment. In the next section, we will show some examples of how to override a given experiment with the command line.
Overriding arguments with the CLI
If you are running several variation of an experiment, it is convenient to make these directly at the command line instead of creating a separate config file for each of them. When changing a single value, such as the learning rate, you can use the following notation:
$ spktrain experiment=qm9_atomwise globals.lr=1e-4
Alternatively, one can also change a whole config group. The syntax for this is slightly different:
$ spktrain experiment=qm9_atomwise model/representation=schnet
The difference here is that schnet
refers to a pre-defined subconfig, instead of a
single value. The config would be changed by this as follows:
...
├── model
│ └── representation:
│ _target_: schnetpack.representation.SchNet
│ n_atom_basis: 128
│ n_interactions: 6
│ radial_basis:
│ _target_: schnetpack.nn.radial.GaussianRBF
│ n_rbf: 20
│ cutoff: ${globals.cutoff}
│ cutoff_fn:
│ _target_: schnetpack.nn.cutoff.CosineCutoff
│ cutoff: ${globals.cutoff}
...
Working with your own config
Hydra is looking for config YAMLs in the config directory of SchNetPack, where
tha main config train.yaml
is located, as well as in the paths specified in
train.yaml
:
hydra:
searchpath:
- file://${oc.env:PWD}
- file://${oc.env:PWD}/configs
This means that Hydra is looking also in the current working directory as well as a
subdirectory ./configs
.
This makes it convenient to create working directories for your projects where all
your custom configs and runs are located.
If you want to specify a different config location, this is possible with:
$ spktrain --config-dir=/path/to/configs experiment=my_config
Note, that in any case the config path points to the base directory of the hierarchical config. The config files need to be placed in the respective directories matching the config groups. This could look something like this:
# create main config directory and subdirectory for experiment config group
$ mkdir configs
$ mkdir configs/experiment
# create experiment and edit config
$ vim my_configs/experiments/my_experiment.yaml
# explicitly stating config dir is not required since it is in current working dir
$ spktrain experiment=my_experiment
Within the configs you are not restricted to using SchNetPack modules, but you can also
set your own implementations as _target_
, as long as they conform with the required
interface, e.g. are subclasses of SchNetPack base classes. Unfortunately, this can not
be checked statically at the time, but will lead to errors when calling spktrain
.
This kind of flexibilty enables a convenient extension of SchNetPack with your own
config and code. As an example, you can check out the package
schnetpack-gschnet,
which implements new classes and integrates them in a custom config to build generative
neural networks for molecules.
Internal precision of float32 matrix multiplications
When utilizing GPUs, it can be advantageous to manage the precision of floating-point
and matrix multiplication. PyTorch Lightning offers the option to set floating-point
precision by defining the alias trainer.precision
in the Hydra config file.
In addition to the explicit floating point precision, you can adjust the internal
precision for matrix multiplication operations with float32 by defining matmul_precision
in the config file or the argument +matmul_precision
when using the CLI.
It’s important to note that this PyTorch variable applies only to NVIDIA GPUs with specific
capabilities, like A100, that allow the adjustment of internal precision for matrix multiplications.
Further details can be found in the PyTorch documentation for torch.set_float32_matmul_precision
.