Auto-tuning

Auto-tuning is featured in the end-to-end example Jupyter notebooks:

Heterogeneous Node Classification with HashGNN and Autotuning

Node Classification Pipelines, Node Regression Pipelines, and Link Prediction Pipelines are trained using supervised machine learning methods which have multiple configurable parameters that affect training outcomes. To obtain models with high quality, setting good values for the hyper-parameters can have a large impact. Auto-tuning is generally preferable over manual search for such values, as that is a time-consuming and hard thing to do.

It is possible to combine manual and automatic tuning when adding model candidates to Node Classification, Node Regression, or Link Prediction. For the manual part, configurations with fixed values for all hyper-parameters are added to the pipeline. To fully leverage automatic search, hyper-parameters can be specified to lie in ranges instead of having fixed values. For some parameters, ranges are interpreted in log-scale. This applies to parameters that are conventionally tuned on a log scale.

If any model candidate hyper-parameter is specified as a range, auto-tuning is applied when training the pipeline. The configurations with only fixed values are evaluated first, and subsequently the remaining configurations with ranges are repeatedly selected and evaluated. For configurations that have at least one range, fixed values from the ranges are selected before the evaluation. Each such evaluation is called a trial. In the case at least one range is present, the number of trials is the value of the maxTrials configuration parameter of gds.alpha.pipeline.nodeClassification.configureAutoTuning, gds.alpha.pipeline.noderegression.configureAutoTuning, and gds.alpha.pipeline.linkPrediction.configureAutoTuning respectively. If no range is present in any model configuration, all of the configurations are tried, regardless of maxTrials. Once the all the trials have been completed, the best model candidate configuration is selected as the winner.

For details on specific hyper-parameters, please see the supported training methods.