Command line interface

CASCADE can also be run as a command-line tool. The following sections show the help message of each subcommand, including all of its options.

General help

cascade --help
usage: cascade [-h]
               {discover,acyclify,tune,counterfactual,design,design_brute_force,upgrade,devmgr}
               ...

CASCADE: Causality Aware Single-Cell Automatic Discovery/Deduction/Design
Engine

options:
  -h, --help            show this help message and exit

subcommands:
  Select a function

  {discover,acyclify,tune,counterfactual,design,design_brute_force,upgrade,devmgr}

Please check the help message of each subcommand for more details.

Causal discovery

cascade discover --help
usage: cascade discover [-h] -d DATA -m MODEL [-i INFO]
                        [--interv-key INTERV_KEY]
                        [--use-covariate USE_COVARIATE] [--use-size USE_SIZE]
                        [--use-weight USE_WEIGHT] [--use-layer USE_LAYER]
                        [--n-particles N_PARTICLES] [--n-layers N_LAYERS]
                        [--hidden-dim HIDDEN_DIM] [--latent-dim LATENT_DIM]
                        [--dropout DROPOUT] [--beta BETA]
                        [--scaffold-mod {Bilinear,Edgewise}]
                        [--sparse-mod {ScaleFree,L1}]
                        [--acyc-mod {LogDet,SpecNorm,TrExp}]
                        [--latent-mod {GCNLatent,EmbLatent,NilLatent}]
                        [--lik-mod {NegBin,Normal}]
                        [--kernel-mod {RBF,KroneckerDelta}]
                        [--scaffold-graph SCAFFOLD_GRAPH]
                        [--scaffold-tau SCAFFOLD_TAU]
                        [--bilinear-emb-dim BILINEAR_EMB_DIM]
                        [--spec-norm-n-iter SPEC_NORM_N_ITER]
                        [--latent-data LATENT_DATA]
                        [--gcn-latent-emb-dim GCN_LATENT_EMB_DIM]
                        [--gcn-latent-n-layers GCN_LATENT_N_LAYERS]
                        [--random-seed RANDOM_SEED] [--log-dir LOG_DIR]
                        [--lam LAM] [--alpha ALPHA] [--gamma GAMMA]
                        [--cyc-tol CYC_TOL] [--prefit] [--opt OPT] [--lr LR]
                        [--batch-size BATCH_SIZE]
                        [--weight-decay WEIGHT_DECAY]
                        [--accumulate-grad-batches ACCUMULATE_GRAD_BATCHES]
                        [--log-adj {mean,particles,both,none}]
                        [--val-check-interval VAL_CHECK_INTERVAL]
                        [--val-frac VAL_FRAC] [--max-epochs MAX_EPOCHS]
                        [--n-devices N_DEVICES] [--log-subdir LOG_SUBDIR]
                        [--random-sleep RANDOM_SLEEP] [-v]

Run causal discovery

options:
  -h, --help            show this help message and exit

Input/output options:
  -d DATA, --data DATA  Input dataset (.h5ad)
  -m MODEL, --model MODEL
                        Output discovered model (.pt)
  -i INFO, --info INFO  Output run information (.yaml)

Dataset configuration options:
  --interv-key INTERV_KEY
                        Interventional target key in adata.obs
  --use-covariate USE_COVARIATE
                        Covariate key in adata.obs
  --use-size USE_SIZE   Size key in adata.obs
  --use-weight USE_WEIGHT
                        Weight key in adata.obs
  --use-layer USE_LAYER
                        Data key in adata.layers

Model construction options:
  --n-particles N_PARTICLES
                        Number of SVGD particles
  --n-layers N_LAYERS   Number of MLP layers in the structural equations
  --hidden-dim HIDDEN_DIM
                        MLP hidden layer dimension in the structural equations
  --latent-dim LATENT_DIM
                        Dimension of the latent variable
  --dropout DROPOUT     Dropout rate
  --beta BETA           KL weight of the latent variable
  --scaffold-mod {Bilinear,Edgewise}
                        Scaffold graph module
  --sparse-mod {ScaleFree,L1}
                        Sparse prior module
  --acyc-mod {LogDet,SpecNorm,TrExp}
                        Acyclic prior module
  --latent-mod {GCNLatent,EmbLatent,NilLatent}
                        Latent module
  --lik-mod {NegBin,Normal}
                        Causal likelihood module
  --kernel-mod {RBF,KroneckerDelta}
                        SVGD kernel module
  --scaffold-graph SCAFFOLD_GRAPH
                        Scaffold graph of the scaffold graph module (.gml)
  --scaffold-tau SCAFFOLD_TAU
                        Gumbel sigmoid temperature of the scaffold graph
                        module
  --bilinear-emb-dim BILINEAR_EMB_DIM
                        Embedding dimension of the `Bilinear` scaffold graph
                        module (only effective when the `Bilinear` module is
                        used)
  --spec-norm-n-iter SPEC_NORM_N_ITER
                        Number of power iterations for the `SpecNorm` acyclic
                        prior module (only effective when the `SpecNorm`
                        module is used)
  --latent-data LATENT_DATA
                        Depending on the latent module used, it can be the
                        latent embedding of the `EmbLatent` module (.csv), or
                        the latent graph of the `GCNLatent` module (.gml)
  --gcn-latent-emb-dim GCN_LATENT_EMB_DIM
                        Embedding dimension of the `GCNLatent` module (only
                        effective when the `GCNLatent` module is used)
  --gcn-latent-n-layers GCN_LATENT_N_LAYERS
                        Number of layers for the `GCNLatent` module (only
                        effective when the `GCNLatent` module is used)
  --random-seed RANDOM_SEED
                        Random seed
  --log-dir LOG_DIR     Directory to store tensorboard logs

Model fitting options:
  --lam LAM             Sparse gradient coefficient
  --alpha ALPHA         Acyclicity gradient coefficient
  --gamma GAMMA         Kernel gradient coefficient
  --cyc-tol CYC_TOL     Tolerance for cyclic penalty
  --prefit              Whether to prefit the model on covariates only
  --opt OPT             Optimizer type
  --lr LR               Learning rate
  --batch-size BATCH_SIZE
                        Mini-batch size
  --weight-decay WEIGHT_DECAY
                        Weight decay
  --accumulate-grad-batches ACCUMULATE_GRAD_BATCHES
                        Number of batches to accumulate before optimizer step
  --log-adj {mean,particles,both,none}
                        Type of adjacency matrix to write to tensorboard logs
  --val-check-interval VAL_CHECK_INTERVAL
                        Validation check interval in training steps
  --val-frac VAL_FRAC   Fraction of data to use for validation
  --max-epochs MAX_EPOCHS
                        Maximal number of training epochs
  --n-devices N_DEVICES
                        Number of GPU devices to use
  --log-subdir LOG_SUBDIR
                        Subdirectory to store tensorboard logs

Miscellaneous options:
  --random-sleep RANDOM_SLEEP
                        Sleep a random amount of time before starting
  -v, --verbose         Enable verbose output
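
As a minimal sketch, a discovery run could be assembled from the options above as follows. The file names (`data.h5ad`, `discover.pt`, `discover.yaml`, `latent.csv`, `log_dir`), the `knockout` and `ncounts` keys, and the `run` helper are hypothetical placeholders, not part of CASCADE. The helper only prints the command unless `CASCADE_RUN=1` is set, so the snippet can be tried without any data.

```shell
# Hypothetical helper: print the command by default; set CASCADE_RUN=1 to
# execute it (requires CASCADE to be installed and the input files to exist).
run() {
  if [ "${CASCADE_RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Discover a causal model from an interventional dataset.
# "knockout" and "ncounts" are assumed column names in adata.obs.
run cascade discover \
  -d data.h5ad -m discover.pt -i discover.yaml \
  --interv-key knockout --use-size ncounts \
  --sparse-mod L1 --acyc-mod SpecNorm \
  --latent-mod EmbLatent --latent-data latent.csv \
  --random-seed 0 --log-dir log_dir -v
```

Only `-d` and `-m` are required according to the usage line; every other option shown here is optional and chosen purely for illustration.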

Graph acyclification

cascade acyclify --help
usage: cascade acyclify [-h] -m MODEL -g GRAPH [-i INFO]
                        [--random-sleep RANDOM_SLEEP] [-v]

Acyclify discovered causal graph

options:
  -h, --help            show this help message and exit

Input/output options:
  -m MODEL, --model MODEL
                        Input discovered model (.pt)
  -g GRAPH, --graph GRAPH
                        Output acyclified causal graph (.gml)
  -i INFO, --info INFO  Output run information (.yaml)

Miscellaneous options:
  --random-sleep RANDOM_SLEEP
                        Sleep a random amount of time before starting
  -v, --verbose         Enable verbose output
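
A possible invocation, sketched under the assumption that `discover.pt` was written by an earlier `cascade discover` run. The file names and the `run` helper are hypothetical. The helper prints the command unless `CASCADE_RUN=1` is set.

```shell
# Hypothetical helper: print the command by default; set CASCADE_RUN=1 to
# execute it (requires CASCADE to be installed and the input files to exist).
run() {
  if [ "${CASCADE_RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Read the discovered model and write the acyclified graph as GML,
# together with a YAML file of run information.
run cascade acyclify -m discover.pt -g dag.gml -i acyclify.yaml
```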

Model tuning

cascade tune --help
usage: cascade tune [-h] -d DATA -g GRAPH -m INPUT_MODEL -o OUTPUT_MODEL
                    [-i INFO] [--interv-key INTERV_KEY]
                    [--use-covariate USE_COVARIATE] [--use-size USE_SIZE]
                    [--use-weight USE_WEIGHT] [--use-layer USE_LAYER]
                    [--tune-ctfact] [--stratify STRATIFY] [--opt OPT]
                    [--lr LR] [--batch-size BATCH_SIZE]
                    [--weight-decay WEIGHT_DECAY]
                    [--accumulate-grad-batches ACCUMULATE_GRAD_BATCHES]
                    [--log-adj {both,none,mean,particles}]
                    [--val-check-interval VAL_CHECK_INTERVAL]
                    [--val-frac VAL_FRAC] [--max-epochs MAX_EPOCHS]
                    [--n-devices N_DEVICES] [--log-subdir LOG_SUBDIR]
                    [--random-seed RANDOM_SEED] [--random-sleep RANDOM_SLEEP]
                    [-v]

Tune acyclified model

options:
  -h, --help            show this help message and exit

Input/output options:
  -d DATA, --data DATA  Input dataset (.h5ad)
  -g GRAPH, --graph GRAPH
                        Input acyclified causal graph (.gml)
  -m INPUT_MODEL, --input-model INPUT_MODEL
                        Input discovered model (*.pt)
  -o OUTPUT_MODEL, --output-model OUTPUT_MODEL
                        Output tuned model (.pt)
  -i INFO, --info INFO  Output run information (.yaml)

Dataset configuration options:
  --interv-key INTERV_KEY
                        Interventional target key in adata.obs
  --use-covariate USE_COVARIATE
                        Covariate key in adata.obs
  --use-size USE_SIZE   Size key in adata.obs
  --use-weight USE_WEIGHT
                        Weight key in adata.obs
  --use-layer USE_LAYER
                        Data key in adata.layers

Model fitting options:
  --tune-ctfact         Tune the model in counterfactual mode
  --stratify STRATIFY   Stratify counterfactual pairs based on the given key
                        in adata.obs
  --opt OPT             Optimizer type
  --lr LR               Learning rate
  --batch-size BATCH_SIZE
                        Mini-batch size
  --weight-decay WEIGHT_DECAY
                        Weight decay
  --accumulate-grad-batches ACCUMULATE_GRAD_BATCHES
                        Number of batches to accumulate before optimizer step
  --log-adj {both,none,mean,particles}
                        Type of adjacency matrix to write to tensorboard logs
  --val-check-interval VAL_CHECK_INTERVAL
                        Validation check interval in training steps
  --val-frac VAL_FRAC   Fraction of data to use for validation
  --max-epochs MAX_EPOCHS
                        Maximal number of training epochs
  --n-devices N_DEVICES
                        Number of GPU devices to use
  --log-subdir LOG_SUBDIR
                        Subdirectory to store tensorboard logs
  --random-seed RANDOM_SEED
                        Random seed

Miscellaneous options:
  --random-sleep RANDOM_SLEEP
                        Sleep a random amount of time before starting
  -v, --verbose         Enable verbose output
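
The inputs listed above suggest that tuning takes both the discovered model and the acyclified graph as input. A minimal sketch, with hypothetical file names, hypothetical `adata.obs` keys (`knockout`, `batch`), and a hypothetical `run` helper that prints the command unless `CASCADE_RUN=1` is set:

```shell
# Hypothetical helper: print the command by default; set CASCADE_RUN=1 to
# execute it (requires CASCADE to be installed and the input files to exist).
run() {
  if [ "${CASCADE_RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Tune the discovered model on the acyclified graph.
run cascade tune \
  -d data.h5ad -g dag.gml -m discover.pt -o tune.pt -i tune.yaml \
  --interv-key knockout --random-seed 0 -v

# Variant: tune in counterfactual mode, stratifying counterfactual pairs
# by an assumed "batch" column in adata.obs.
run cascade tune \
  -d data.h5ad -g dag.gml -m discover.pt -o tune_ctfact.pt \
  --interv-key knockout --tune-ctfact --stratify batch
```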

Counterfactual deduction

cascade counterfactual --help
usage: cascade counterfactual [-h] -d DATA -m MODEL [-u DESIGN_MODULE] -p PRED
                              [-i INFO] [--interv-key INTERV_KEY]
                              [--use-covariate USE_COVARIATE]
                              [--use-size USE_SIZE] [--use-weight USE_WEIGHT]
                              [--use-layer USE_LAYER]
                              [--fixed-genes FIXED_GENES] [--sample]
                              [--ablate-latent] [--ablate-interv]
                              [--ablate-graph] [--batch-size BATCH_SIZE]
                              [--n-devices N_DEVICES]
                              [--random-sleep RANDOM_SLEEP] [-v]

Run counterfactual prediction

options:
  -h, --help            show this help message and exit

Input/output options:
  -d DATA, --data DATA  Input dataset (.h5ad)
  -m MODEL, --model MODEL
                        Input tuned model (*.pt)
  -u DESIGN_MODULE, --design-module DESIGN_MODULE
                        Input intervention design module (*.pt)
  -p PRED, --pred PRED  Output counterfactual prediction (.h5ad)
  -i INFO, --info INFO  Output run information (.yaml)

Dataset configuration options:
  --interv-key INTERV_KEY
                        Interventional target key in adata.obs
  --use-covariate USE_COVARIATE
                        Covariate key in adata.obs
  --use-size USE_SIZE   Size key in adata.obs
  --use-weight USE_WEIGHT
                        Weight key in adata.obs
  --use-layer USE_LAYER
                        Data key in adata.layers

Model prediction options:
  --fixed-genes FIXED_GENES
                        Comma-separated genes to fix in counterfactual
                        prediction
  --sample              Use random samples rather than mean for counterfactual
                        prediction
  --ablate-latent       Ablate latent contribution during counterfactual
                        prediction
  --ablate-interv       Ablate direct intervention during counterfactual
                        prediction
  --ablate-graph        Ablate graph contribution during counterfactual
                        prediction
  --batch-size BATCH_SIZE
                        Mini-batch size
  --n-devices N_DEVICES
                        Number of GPU devices to use

Miscellaneous options:
  --random-sleep RANDOM_SLEEP
                        Sleep a random amount of time before starting
  -v, --verbose         Enable verbose output
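
One way the ablation switches might be used is to run the same prediction twice and compare the outputs. The sketch below assumes a tuned model `tune.pt` and an input dataset `ctfact_input.h5ad`; these file names, the `knockout` key, and the `run` helper are hypothetical. The helper prints the command unless `CASCADE_RUN=1` is set.

```shell
# Hypothetical helper: print the command by default; set CASCADE_RUN=1 to
# execute it (requires CASCADE to be installed and the input files to exist).
run() {
  if [ "${CASCADE_RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Baseline counterfactual prediction with the full model.
run cascade counterfactual \
  -d ctfact_input.h5ad -m tune.pt -p pred_full.h5ad -i pred_full.yaml \
  --interv-key knockout

# Same prediction with the graph contribution ablated, for comparison.
run cascade counterfactual \
  -d ctfact_input.h5ad -m tune.pt -p pred_no_graph.h5ad \
  --interv-key knockout --ablate-graph
```

Comparing `pred_full.h5ad` with `pred_no_graph.h5ad` would then indicate how much of the predicted effect is attributed to the causal graph, assuming the flag behaves as its help text describes.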

Intervention design

cascade design --help
usage: cascade design [-h] -d DATA -m MODEL -t TARGET [--pool POOL]
                      [--init INIT] -o OUTPUT_DESIGN [-u OUTPUT_MODULE]
                      [-i INFO] [--interv-key INTERV_KEY]
                      [--use-covariate USE_COVARIATE] [--use-size USE_SIZE]
                      [--use-weight USE_WEIGHT] [--use-layer USE_LAYER]
                      [--design-size DESIGN_SIZE] [--design-scale-bias]
                      [--target-weight TARGET_WEIGHT] [--stratify STRATIFY]
                      [--opt OPT] [--lr LR] [--batch-size BATCH_SIZE]
                      [--weight-decay WEIGHT_DECAY]
                      [--accumulate-grad-batches ACCUMULATE_GRAD_BATCHES]
                      [--log-adj {mean,both,particles,none}]
                      [--val-check-interval VAL_CHECK_INTERVAL]
                      [--val-frac VAL_FRAC] [--max-epochs MAX_EPOCHS]
                      [--n-devices N_DEVICES] [--log-subdir LOG_SUBDIR]
                      [--random-seed RANDOM_SEED]
                      [--random-sleep RANDOM_SLEEP] [-v]

Targeted intervention design

options:
  -h, --help            show this help message and exit

Input/output options:
  -d DATA, --data DATA  Input source dataset (.h5ad)
  -m MODEL, --model MODEL
                        Input tuned model (*.pt)
  -t TARGET, --target TARGET
                        Input design target (*.h5ad)
  --pool POOL           Input candidate variable pool to intervene (*.txt)
  --init INIT           Input initial design (*.txt)
  -o OUTPUT_DESIGN, --output-design OUTPUT_DESIGN
                        Output interventional design (.csv)
  -u OUTPUT_MODULE, --output-module OUTPUT_MODULE
                        Output intervention design module (*.pt)
  -i INFO, --info INFO  Output run information (.yaml)

Dataset configuration options:
  --interv-key INTERV_KEY
                        Interventional target key in adata.obs
  --use-covariate USE_COVARIATE
                        Covariate key in adata.obs
  --use-size USE_SIZE   Size key in adata.obs
  --use-weight USE_WEIGHT
                        Weight key in adata.obs
  --use-layer USE_LAYER
                        Data key in adata.layers

Model fitting options:
  --design-size DESIGN_SIZE
                        Expected number of perturbation targets to design
  --design-scale-bias   Whether to design interventional scale and bias
  --target-weight TARGET_WEIGHT
                        Key in target.var containing the weight of each gene
  --stratify STRATIFY   Stratify design pairs based on the given key in
                        adata.obs
  --opt OPT             Optimizer type
  --lr LR               Learning rate
  --batch-size BATCH_SIZE
                        Mini-batch size
  --weight-decay WEIGHT_DECAY
                        Weight decay
  --accumulate-grad-batches ACCUMULATE_GRAD_BATCHES
                        Number of batches to accumulate before optimizer step
  --log-adj {mean,both,particles,none}
                        Type of adjacency matrix to write to tensorboard logs
  --val-check-interval VAL_CHECK_INTERVAL
                        Validation check interval in training steps
  --val-frac VAL_FRAC   Fraction of data to use for validation
  --max-epochs MAX_EPOCHS
                        Maximal number of training epochs
  --n-devices N_DEVICES
                        Number of GPU devices to use
  --log-subdir LOG_SUBDIR
                        Subdirectory to store tensorboard logs
  --random-seed RANDOM_SEED
                        Random seed

Miscellaneous options:
  --random-sleep RANDOM_SLEEP
                        Sleep a random amount of time before starting
  -v, --verbose         Enable verbose output
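
A minimal sketch of a design run, followed by a counterfactual prediction that consumes the saved design module through the `-u` option of `cascade counterfactual`. All file names, the design size of 2, and the `run` helper are hypothetical. The format of `pool.txt` is not specified in the help message and is not assumed here.

```shell
# Hypothetical helper: print the command by default; set CASCADE_RUN=1 to
# execute it (requires CASCADE to be installed and the input files to exist).
run() {
  if [ "${CASCADE_RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Design interventions that move the source cells towards the target,
# restricted to a candidate pool, and save the design module for reuse.
run cascade design \
  -d source.h5ad -m tune.pt -t target.h5ad --pool pool.txt \
  -o design.csv -u design.pt -i design.yaml \
  --design-size 2 --random-seed 0

# Predict the outcome of the designed intervention with the saved module.
run cascade counterfactual \
  -d source.h5ad -m tune.pt -u design.pt -p design_pred.h5ad
```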

cascade design_brute_force --help
usage: cascade design_brute_force [-h] -d DATA -m MODEL -t TARGET
                                  [--pool POOL] -o OUTPUT_DESIGN [-p PRED]
                                  [-i INFO] [--interv-key INTERV_KEY]
                                  [--use-covariate USE_COVARIATE]
                                  [--use-size USE_SIZE]
                                  [--use-weight USE_WEIGHT]
                                  [--use-layer USE_LAYER]
                                  [--design-size DESIGN_SIZE] [-k K]
                                  [--n-neighbors N_NEIGHBORS]
                                  [--batch-size BATCH_SIZE]
                                  [--n-devices N_DEVICES]
                                  [--random-sleep RANDOM_SLEEP] [-v]

Targeted intervention design with brute-force search

options:
  -h, --help            show this help message and exit

Input/output options:
  -d DATA, --data DATA  Input source dataset (.h5ad)
  -m MODEL, --model MODEL
                        Input tuned model (*.pt)
  -t TARGET, --target TARGET
                        Input design target (*.h5ad)
  --pool POOL           Input candidate variable pool to intervene (*.txt)
  -o OUTPUT_DESIGN, --output-design OUTPUT_DESIGN
                        Output interventional design (.csv)
  -p PRED, --pred PRED  Output counterfactual prediction (.h5ad)
  -i INFO, --info INFO  Output run information (.yaml)

Dataset configuration options:
  --interv-key INTERV_KEY
                        Interventional target key in adata.obs
  --use-covariate USE_COVARIATE
                        Covariate key in adata.obs
  --use-size USE_SIZE   Size key in adata.obs
  --use-weight USE_WEIGHT
                        Weight key in adata.obs
  --use-layer USE_LAYER
                        Data key in adata.layers

Model prediction options:
  --design-size DESIGN_SIZE
                        Expected number of perturbation targets to design
  -k K                  Number of cells to predict for each possible
                        intervention
  --n-neighbors N_NEIGHBORS
                        Number of counterfactual neighbors to consider for
                        each target cell
  --batch-size BATCH_SIZE
                        Mini-batch size
  --n-devices N_DEVICES
                        Number of GPU devices to use

Miscellaneous options:
  --random-sleep RANDOM_SLEEP
                        Sleep a random amount of time before starting
  -v, --verbose         Enable verbose output
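
The brute-force variant takes the same core inputs as `cascade design` and can also write the counterfactual prediction directly via `-p`. A sketch with hypothetical file names and illustrative values for `--design-size`, `-k`, and `--n-neighbors` (these values are not recommended defaults). The `run` helper is hypothetical and prints the command unless `CASCADE_RUN=1` is set.

```shell
# Hypothetical helper: print the command by default; set CASCADE_RUN=1 to
# execute it (requires CASCADE to be installed and the input files to exist).
run() {
  if [ "${CASCADE_RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Score candidate interventions from the pool by brute-force search.
run cascade design_brute_force \
  -d source.h5ad -m tune.pt -t target.h5ad --pool pool.txt \
  -o design_bf.csv -p design_bf_pred.h5ad -i design_bf.yaml \
  --design-size 1 -k 30 --n-neighbors 1
```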

Model file upgrade

cascade upgrade --help
usage: cascade upgrade [-h] -m MODEL

Upgrade saved CASCADE model

options:
  -h, --help            show this help message and exit
  -m MODEL, --model MODEL
                        Model to be upgraded (*.pt)
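
The help message lists no output option, so the sketch below assumes the model file is rewritten in place and keeps a backup copy first. That assumption, the file name `tune.pt`, and the `run` helper are not taken from the CASCADE documentation. The helper prints each command unless `CASCADE_RUN=1` is set.

```shell
# Hypothetical helper: print the command by default; set CASCADE_RUN=1 to
# execute it (requires CASCADE to be installed and the input files to exist).
run() {
  if [ "${CASCADE_RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Keep a copy in case the upgrade modifies the file in place (assumption).
run cp tune.pt tune.pt.bak

# Upgrade the saved model.
run cascade upgrade -m tune.pt
```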

Device manager

cascade devmgr --help
usage: cascade devmgr [-h] {init,acquire,release} ...

Device manager

options:
  -h, --help            show this help message and exit

subcommands:
  Select a function

  {init,acquire,release}
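
The options of the three `devmgr` subcommands are not shown above. Assuming each one accepts `--help` like the other commands on this page, their help messages could be listed as sketched below. The `run` helper is hypothetical and prints each command unless `CASCADE_RUN=1` is set.

```shell
# Hypothetical helper: print the command by default; set CASCADE_RUN=1 to
# execute it (requires CASCADE to be installed).
run() {
  if [ "${CASCADE_RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Show the help message of every device manager subcommand.
for sub in init acquire release; do
  run cascade devmgr "$sub" --help
done
```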