
CLI

CurryBO Core is the software that contains all of the CurryBO logic. It is used by currybo-benchmarks and CurryBO web, and it also provides a CLI for direct interaction.

A command like

currybo \
  --measurements denmark_measurements.csv \
  --options denmark_options.csv \
  --substrates name=Thiol,type=smiles name=Imine,type=smiles \
  --conditions name=Catalyst,type=smiles \
  --targets name=Delta_Delta_G,type=scalar \
  --objectives name=Delta_Delta_G,abs_threshold=1,maximize=True \
  --batch-size 2

might result in this:

{
  "estimated_current_optimum": {
    "point": {
      "Catalyst": "O=P1(O)OC2=C(C3=C(F)C=C(OC)C=C3F)C=C4C(C=CC=C4)=[C@]2[C@]5=C(O1)C(C6=C(F)C=C(OC)C=C6F)=CC7=C5C=CC=C7"
    },
    "value": {
      "Delta_Delta_G": 1.2688742979134011
    }
  },
  "next_points": [
    {
      "point": {
        "Thiol": "SC1CCCCC1",
        "Imine": "O=C(C1=CC=CC=C1)/N=C/C2=CC=C(Cl)C=C2Cl",
        "Catalyst": "O=P1(O)OC2=C(C3=C(F)C=C(OC)C=C3F)C=C4C(C=CC=C4)=[C@]2[C@]5=C(O1)C(C6=C(F)C=C(OC)C=C6F)=CC7=C5C=CC=C7"
      },
      "value": {
        "Delta_Delta_G": {
          "mean": 1.2686868041322046,
          "stdev": 0.021312972361864586
        }
      }
    },
    {
      "point": {
        "Thiol": "SC1CCCCC1",
        "Imine": "O=C(C1=CC=CC=C1)/N=C/C2=CC=C(C(F)(F)F)C=C2",
        "Catalyst": "O=P1(O)OC2=C(CC3=CC(C(F)(F)F)=CC(C(F)(F)F)=C3)C=C4C(CCCC4)=C2C5=C(O1)C(CC6=CC(C(F)(F)F)=CC(C(F)(F)F)=C6)=CC7=C5CCCC7"
      },
      "value": {
        "Delta_Delta_G": {
          "mean": 1.267070886641988,
          "stdev": 0.021635567155093717
        }
      }
    }
  ]
}

Output

In the output above, CurryBO returns

  • estimated_current_optimum: The condition that CurryBO currently estimates to give the best general target value(s), reported as point (the condition) and value (the target value for each target)
  • next_points: A list of --batch-size points (substrates + conditions) to measure next in order to find the optimum as quickly as possible, together with the values (as mean and standard deviation) of all targets it currently expects for these points.
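As a rough illustration of how this structure might be consumed by a downstream script, the following Python sketch parses output of the shape shown above. The helper name `summarize_output` is hypothetical and not part of CurryBO, and the sample is abridged from the example output (only the first proposed point is kept).

```python
import json

# Abridged copy of the example output above (first next point only).
CATALYST = (
    "O=P1(O)OC2=C(C3=C(F)C=C(OC)C=C3F)C=C4C(C=CC=C4)=[C@]2[C@]5=C(O1)"
    "C(C6=C(F)C=C(OC)C=C6F)=CC7=C5C=CC=C7"
)
SAMPLE_OUTPUT = json.dumps({
    "estimated_current_optimum": {
        "point": {"Catalyst": CATALYST},
        "value": {"Delta_Delta_G": 1.2688742979134011},
    },
    "next_points": [
        {
            "point": {
                "Thiol": "SC1CCCCC1",
                "Imine": "O=C(C1=CC=CC=C1)/N=C/C2=CC=C(Cl)C=C2Cl",
                "Catalyst": CATALYST,
            },
            "value": {
                "Delta_Delta_G": {
                    "mean": 1.2686868041322046,
                    "stdev": 0.021312972361864586,
                }
            },
        }
    ],
})


def summarize_output(raw: str) -> list[tuple[dict, dict]]:
    """Hypothetical helper: return (point, {target: (mean, stdev)}) per proposal."""
    result = json.loads(raw)
    summary = []
    for entry in result["next_points"]:
        predicted = {
            target: (stats["mean"], stats["stdev"])
            for target, stats in entry["value"].items()
        }
        summary.append((entry["point"], predicted))
    return summary


best = json.loads(SAMPLE_OUTPUT)["estimated_current_optimum"]
print("current optimum:", best["point"], best["value"])
for point, predicted in summarize_output(SAMPLE_OUTPUT):
    print(point["Thiol"], point["Imine"], predicted)
```

Note that the optimum reports a single number per target, while each entry in next_points reports a mean and a standard deviation.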

Synopsis

usage: currybo [-h] [--measurements MEASUREMENTS] [--options OPTIONS]
               [--conditions CONDITIONS [CONDITIONS ...]]
               [--substrates SUBSTRATES [SUBSTRATES ...]]
               [--targets TARGETS [TARGETS ...]]
               [--objectives OBJECTIVES [OBJECTIVES ...]]
               [--final-objective FINAL_OBJECTIVE] [--seed SEED]
               [--surrogate {SimpleGP,AdditiveStructureGP}]
               [--kernel {TanimotoKernel}]
               [--likelihood {GaussianLikelihood}]
               [--x-utility {Random,SimpleRegret,UncertaintyUtility,QuantileUtility,QuantitativeImprovement,QualitativeImprovement}]
               [--x-utility-kwargs X_UTILITY_KWARGS]
               [--w-utility {Random,SimpleRegret,UncertaintyUtility,QuantileUtility}]
               [--w-utility-kwargs W_UTILITY_KWARGS]
               [--utility {Random,SimpleRegret,UncertaintyUtility,QuantileUtility,QuantitativeImprovement,QualitativeImprovement}]
               [--utility-kwargs UTILITY_KWARGS]
               [--acquisition {SequentialAcquisition,SequentialLookaheadAcquisition,JointLookaheadAcquisition}]
               [--aggregation {Mean,Sigmoid,MSE,Min}]
               [--batch-size BATCH_SIZE]
               [--batch-strategy {QSequentialAcquisition,QProbabilityOfOptimality}]
               [--qpo-num-samples QPO_NUM_SAMPLES] [--silent]

Find general parameters in synthesis using Bayesian Optimization

options:
  -h, --help            show this help message and exit
  --measurements MEASUREMENTS
                        Measurements .csv file
  --options OPTIONS     Options for substrate and condition columns
  --conditions CONDITIONS [CONDITIONS ...]
                        Condition columns of data set, as keyval. Specify
                        [name, type (smiles, scalar, array)], e.g.
                        `name=Catalyst,type=smiles`
  --substrates SUBSTRATES [SUBSTRATES ...]
                        Substrate columns that should be evaulated for
                        generality. Specify [name, type (smiles, scalar,
                        array)], e.g. `name=Ketone,type=smiles`
  --targets TARGETS [TARGETS ...]
                        Target columns of data set. Specify [name, type
                        (scalar)], e.g. `name=Yield,type=scalar`
  --objectives OBJECTIVES [OBJECTIVES ...]
                        Objectives for optimization. Specify [name,
                        threshold, lower_bound, upper_bound, maximize]
  --final-objective FINAL_OBJECTIVE
                        Objective index to optimize when all objectives
                        reached their threshold
  --seed SEED           Seed for RNG
  --surrogate {SimpleGP,AdditiveStructureGP}
                        Surrogate Model Type, defaults to `SimpleGP`
  --kernel {TanimotoKernel}
                        Covariance Kernel for the Surrogate Model
  --likelihood {GaussianLikelihood}
                        Likelihood
  --x-utility {Random,SimpleRegret,UncertaintyUtility,QuantileUtility,QuantitativeImprovement,QualitativeImprovement}
                        Utility function Type for x. Defaults to
                        QuantileUtility
  --x-utility-kwargs X_UTILITY_KWARGS
                        Arguments to pass to the x utility, as a keyval
                        string
  --w-utility {Random,SimpleRegret,UncertaintyUtility,QuantileUtility}
                        Utility function Type for w. Defaults to
                        UncertaintyUtility
  --w-utility-kwargs W_UTILITY_KWARGS
                        Arguments to pass to the w utility, as a keyval
                        string
  --utility {Random,SimpleRegret,UncertaintyUtility,QuantileUtility,QuantitativeImprovement,QualitativeImprovement}
                        Utility function for Joint Acquisitions
  --utility-kwargs UTILITY_KWARGS
                        Arguments to pass to the utility, as a keyval string
  --acquisition {SequentialAcquisition,SequentialLookaheadAcquisition,JointLookaheadAcquisition}
                        Acquisition Strategy, defaults to
                        `SequentialAcquisition`
  --aggregation {Mean,Sigmoid,MSE,Min}
                        Aggregation Function, defaults to `Mean`
  --batch-size BATCH_SIZE
                        Batch Size, defaults to 1
  --batch-strategy {QSequentialAcquisition,QProbabilityOfOptimality}
                        Batch Strategy, defaults to QSequentialAcquisition
  --qpo-num-samples QPO_NUM_SAMPLES
                        Nuber of samples for qPO
  --silent              Do not generate any output. Useful for automated
                        runs.

Arguments

--help

Print the help message and exit.

--measurements

required

e.g. --measurements measurements-file.csv

Specify which measurements file to use. This file defines what values (at least 1) were already measured. Provide this file as a .csv (comma-separated) with parameter names as column headers. Each measurement is one line.

Lists of values that correspond to one parameter (array type) are space-separated. All SUBSTRATE, CONDITION, and TARGET columns must be included here.

Example CSV:

measurements.csv
substrate,base,fluoride,yield
OCCCCC1=CC=CC=C1,N12CCCN=C1CCCCC2,ClC1=CC=C(S(=O)(F)=O)C=C1,0.42
OCCCCC1=CC=CC=C1,N12CCCN=C1CCCCC2,O=S(C1=CC=CC=N1)(F)=O,0.48
...
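Because every substrate, condition, and target column has to appear in the measurements file, it can be useful to check the header before starting a run. The sketch below does this with Python's standard `csv` module. The helper `missing_columns` is hypothetical (it is not part of CurryBO), and the assignment of `substrate` as a substrate and `base`/`fluoride` as conditions is assumed purely for illustration.

```python
import csv
import io

# First rows of the example measurements.csv above.
MEASUREMENTS = """substrate,base,fluoride,yield
OCCCCC1=CC=CC=C1,N12CCCN=C1CCCCC2,ClC1=CC=C(S(=O)(F)=O)C=C1,0.42
OCCCCC1=CC=CC=C1,N12CCCN=C1CCCCC2,O=S(C1=CC=CC=N1)(F)=O,0.48
"""


def missing_columns(csv_text, substrates, conditions, targets):
    """Hypothetical check: declared column names absent from the CSV header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = set(reader.fieldnames or [])
    declared = list(substrates) + list(conditions) + list(targets)
    return [name for name in declared if name not in header]


# All declared columns are present, so nothing is reported as missing.
print(missing_columns(MEASUREMENTS, ["substrate"], ["base", "fluoride"], ["yield"]))

# "temperature" is declared but has no column in the file.
print(missing_columns(MEASUREMENTS, ["substrate"], ["base", "temperature"], ["yield"]))
```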

--options

required

e.g. --options options-file.csv

Specify which options file to use. This file defines what options (at least 1 per column) CurryBO should consider. Provide this file as a .csv (comma-separated) with parameter names as column headers. Each option is one line in a column. Options in different columns and the same row have no correlation.

Lists of values that correspond to one parameter (array type) are space-separated. Only SUBSTRATE and CONDITION columns should be included here.

Note that duplicates in a column are automatically removed by CurryBO.

options.csv
substrate,base,fluoride
OCCCCC1=CC=CC=C1,N12CCCN=C1CCCCC2,ClC1=CC=C(S(=O)(F)=O)C=C1
OCCCCC1=CC=CC=C1,N12CCCN=C1CCCCC2,O=S(C1=CC=CC=N1)(F)=O
OCCCCC1=CC=CC=C1,N12CCCN=C1CCCCC2
OCCCCC1=CC=CC=C1
OCCCCC1=CC=CC=C1
OC(C)CCC1=CC=CC=C1
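To make the column-wise reading of this file concrete, here is a minimal Python sketch that collects the distinct options for each column independently of the rows they appear in. It reflects one reading of the description above, not CurryBO's actual parser: the helper `options_per_column` is hypothetical, and the sketch assumes the example file has one row per line, with shorter columns simply ending earlier.

```python
import csv
import io

# The example options.csv above, assuming one row per line.
OPTIONS = """substrate,base,fluoride
OCCCCC1=CC=CC=C1,N12CCCN=C1CCCCC2,ClC1=CC=C(S(=O)(F)=O)C=C1
OCCCCC1=CC=CC=C1,N12CCCN=C1CCCCC2,O=S(C1=CC=CC=N1)(F)=O
OCCCCC1=CC=CC=C1,N12CCCN=C1CCCCC2
OCCCCC1=CC=CC=C1
OCCCCC1=CC=CC=C1
OC(C)CCC1=CC=CC=C1
"""


def options_per_column(csv_text):
    """Hypothetical helper: distinct options per column, in first-seen order."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]
    options = {name: [] for name in header}
    for row in body:
        # zip stops at the end of a short row, so missing cells are skipped.
        for name, cell in zip(header, row):
            cell = cell.strip()
            if cell and cell not in options[name]:
                options[name].append(cell)
    return options


for name, values in options_per_column(OPTIONS).items():
    print(name, len(values), values)
```

With the example file, the repeated substrate and base entries collapse to two substrates, one base, and two fluorides.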

--conditions

required

e.g. --conditions name=fluoride,type=smiles name=temperature,type=scalar

Specify which columns of your measurements/options files should be treated as conditions. For each condition, specify a name (matching a column name in your input files) and a type (one of smiles, scalar or array), separated by a comma. Do not use spaces around the = or ,. Column names that contain spaces can be handled by quoting the argument, e.g. --conditions "name=my condition,type=smiles".

  • smiles: A molecule, defined by its SMILES string
  • array: A list of values that correspond to the same parameter, e.g. a list of descriptors for a molecule. Space-separated, e.g. 2.3 4.5 6.7
  • scalar: A number, e.g. a temperature
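The key-value notation and the three column types can be illustrated with a small Python sketch. Both helpers, `parse_keyval` and `parse_cell`, are hypothetical and only mirror the rules described above; CurryBO's own parsing may differ in details such as error handling.

```python
def parse_keyval(spec: str) -> dict[str, str]:
    """Split e.g. 'name=Catalyst,type=smiles' into a dict of keys and values."""
    pairs = {}
    for item in spec.split(","):
        key, sep, value = item.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {item!r}")
        pairs[key] = value
    return pairs


def parse_cell(raw: str, column_type: str):
    """Interpret one CSV cell according to its declared column type."""
    if column_type == "smiles":
        return raw  # kept as a SMILES string
    if column_type == "scalar":
        return float(raw)
    if column_type == "array":
        # array values are space-separated within a single cell
        return [float(part) for part in raw.split()]
    raise ValueError(f"unknown type {column_type!r}")


print(parse_keyval("name=my condition,type=smiles"))
print(parse_cell("2.3 4.5 6.7", "array"))
print(parse_cell("25", "scalar"))
```

Because spaces are only meaningful inside a value, a quoted argument such as "name=my condition,type=smiles" still splits cleanly on , and =.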

--substrates

required

e.g. --substrates name=substrate,type=smiles name=temperature,type=scalar

Same as --conditions, except for defining substrates.

--targets

required

e.g. --targets name=yield,type=scalar

Same as --conditions, except for defining targets. Targets must always be of type scalar.

--objectives

required

e.g. --objectives name=yield,abs_threshold=0.9,maximize=True name=stereoselectivity,rel_threshold=0.6

Specify what CurryBO should optimize for. More information on Multi-Objective BO can be found here. Use the same key-value notation as described above.

If only one objective is given, CurryBO will optimize this objective.

If multiple objectives are given, CurryBO will apply the following order of rules:

  • Optimize the first objective until its threshold is reached
  • Optimize the second objective until its threshold is reached
  • …
  • Optimize final-objective (below) to its optimum

If an objective cannot reach its threshold, CurryBO will optimize it as far as possible and then stop.
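The ordering rule above can be sketched as a small selection function. This is an illustration of the described behaviour under stated assumptions, not CurryBO's implementation: the function names are hypothetical, thresholds are assumed to be already expressed as absolute values, "reached" is assumed to mean at least the threshold when maximizing and at most the threshold when minimizing, and the current best values in the usage lines are made up for the example. The stopping behaviour for unreachable thresholds is not modelled.

```python
def reached(value: float, threshold: float, maximize: bool = True) -> bool:
    """Assumed meaning of 'threshold reached' for one objective."""
    return value >= threshold if maximize else value <= threshold


def active_objective(best_values, thresholds, maximize, final_objective=0):
    """Index of the objective to optimize next.

    Objectives are checked in the order given; the first one that has not
    reached its threshold is selected. If all thresholds are reached, the
    objective at index `final_objective` is optimized further.
    """
    for index, (value, threshold, is_max) in enumerate(
        zip(best_values, thresholds, maximize)
    ):
        if not reached(value, threshold, is_max):
            return index
    return final_objective


thresholds = [0.9, 0.6]   # e.g. yield, stereoselectivity
maximize = [True, True]

print(active_objective([0.70, 0.50], thresholds, maximize))     # first objective not reached
print(active_objective([0.95, 0.50], thresholds, maximize))     # second objective not reached
print(active_objective([0.95, 0.80], thresholds, maximize, 1))  # all reached, use final objective
```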

Possible keys:

  • name (required): Name of the column, usually a TARGET.
  • abs_threshold (required*): Defines what value this target should at least have.
  • rel_threshold (required*): Defines what value between 0 and 1 this target should at least have. Here, 0 is the lowest measured value and 1 the highest. If bounds are set, 0 is lower_bound and 1 is upper_bound.
  • lower_bound: Lower bound of the scalarizer. If not set, this value is the lowest measurement.
  • upper_bound: Upper bound of the scalarizer. If not set, this value is the highest measurement.
  • maximize: Whether this objective should be maximized (default) or minimized (maximize=False).
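To show how rel_threshold relates to the bounds, the following sketch converts a relative threshold into an absolute one. It assumes a linear scale between the lower and upper bound, which is one natural reading of the description above but is not confirmed by it; the function `absolute_threshold` is hypothetical. The measured values are the two yields from the example measurements.csv.

```python
def absolute_threshold(rel_threshold, measured, lower_bound=None, upper_bound=None):
    """Map a relative threshold in [0, 1] onto the target's scale.

    Without explicit bounds, 0 corresponds to the lowest and 1 to the
    highest measured value. Linear interpolation is an assumption.
    """
    low = min(measured) if lower_bound is None else lower_bound
    high = max(measured) if upper_bound is None else upper_bound
    return low + rel_threshold * (high - low)


yields = [0.42, 0.48]

# Bounds taken from the measurements: roughly 60 % of the way from 0.42 to 0.48.
print(absolute_threshold(0.6, yields))

# Explicit bounds 0 and 1: the relative threshold equals the absolute one.
print(absolute_threshold(0.6, yields, lower_bound=0.0, upper_bound=1.0))
```

Setting explicit bounds keeps the meaning of a relative threshold stable as new measurements extend the observed range.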

--final-objective

e.g. --final-objective 1

When all objectives have reached their thresholds, further optimize the objective at this index. Indices start at 0 (the first objective). Defaults to 0.

--seed

e.g. --seed 1234

Sets the seed for all random number generations.

--surrogate

Choose from {SimpleGP,AdditiveStructureGP}

e.g. --surrogate SimpleGP

Sets the surrogate model type. More info here.

--kernel

Choose from {TanimotoKernel}

e.g. --kernel TanimotoKernel

Sets the kernel for the surrogate model.

--likelihood

Choose from {GaussianLikelihood}

e.g. --likelihood GaussianLikelihood

Sets the likelihood for the surrogate model.

--x-utility

Choose from {Random,SimpleRegret,UncertaintyUtility,QuantileUtility,QuantitativeImprovement,QualitativeImprovement}

e.g. --x-utility QualitativeImprovement

Sets the Condition utility function for SequentialAcquisition or SequentialLookaheadAcquisition. Defaults to QuantileUtility. More info here.

--x-utility-kwargs

e.g. --x-utility-kwargs beta=5

Some utility functions (e.g. QuantileUtility) can be configured with arguments. Pass these as key-value strings here.

--w-utility

Choose from {Random,SimpleRegret,UncertaintyUtility,QuantileUtility}

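e.g. --w-utility UncertaintyUtility

Sets the Substrate utility function for SequentialAcquisition or SequentialLookaheadAcquisition. Defaults to UncertaintyUtility. Otherwise the same as --x-utility.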

--w-utility-kwargs

See --x-utility-kwargs

--utility

Choose from {Random,SimpleRegret,UncertaintyUtility,QuantileUtility,QuantitativeImprovement,QualitativeImprovement}

e.g. --utility QualitativeImprovement

Sets the Condition and Substrate utility function for JointLookaheadAcquisition. Defaults to QuantileUtility. More info here.

--utility-kwargs

See --x-utility-kwargs

--acquisition

Choose from {SequentialAcquisition,SequentialLookaheadAcquisition,JointLookaheadAcquisition}

e.g. --acquisition SequentialLookaheadAcquisition

Sets the acquisition strategy. Defaults to SequentialAcquisition. More info here.

Keep in mind that SequentialAcquisition and SequentialLookaheadAcquisition use --x-utility and --w-utility while JointLookaheadAcquisition uses --utility.
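For scripts that assemble a currybo command line, the pairing between acquisition strategy and utility flags can be captured in a small helper. The sketch below only builds an argument list from the flags documented on this page; the function `utility_args` is hypothetical and nothing here calls CurryBO itself.

```python
SEQUENTIAL = {"SequentialAcquisition", "SequentialLookaheadAcquisition"}
JOINT = {"JointLookaheadAcquisition"}


def utility_args(acquisition, x_utility=None, w_utility=None, utility=None):
    """Return the utility-related CLI arguments that apply to `acquisition`."""
    if acquisition not in SEQUENTIAL | JOINT:
        raise ValueError(f"unknown acquisition strategy {acquisition!r}")
    args = ["--acquisition", acquisition]
    if acquisition in JOINT:
        # Joint acquisition uses a single utility for conditions and substrates.
        if utility is not None:
            args += ["--utility", utility]
    else:
        # Sequential strategies use separate condition (x) and substrate (w) utilities.
        if x_utility is not None:
            args += ["--x-utility", x_utility]
        if w_utility is not None:
            args += ["--w-utility", w_utility]
    return args


print(utility_args("SequentialAcquisition",
                   x_utility="QuantileUtility", w_utility="UncertaintyUtility"))
print(utility_args("JointLookaheadAcquisition", utility="QualitativeImprovement"))
```

Flags that do not apply to the chosen strategy are left out of the list, so defaults are only overridden where they have an effect.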

--aggregation

Choose from {Mean,Sigmoid,MSE,Min}

e.g. --aggregation Min

Sets the aggregation function. Defaults to Mean. More info here.

--batch-size

e.g. --batch-size 5

Sets the number of conditions/substrates CurryBO proposes for the next round of measurements. Defaults to 1. More info here.

--batch-strategy

Choose from {QSequentialAcquisition,QProbabilityOfOptimality}

e.g. --batch-strategy QProbabilityOfOptimality

Sets the batching strategy. Defaults to QSequentialAcquisition. More info here.

--qpo-num-samples

e.g. --qpo-num-samples 20

Sets the number of samples QProbabilityOfOptimality should use. Ignored if --batch-strategy is not QProbabilityOfOptimality. Defaults to 10. More info here.

--silent

Do not generate any console output. Useful for automated runs.
