Code

Submodules

variable_dropout.variable_dropout

class variable_dropout.variable_dropout.DropoutType

Method of variable dropout loss representation. One of the following:

RAW - raw value of variable dropout loss,

RATIO - ratio of the variable dropout loss to the loss of the unperturbed model,

DIFFERENCE - difference between variable dropout loss and unperturbed model loss
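As a rough illustration of the three representations, the sketch below writes each one as a plain function. The names loss_dropout (loss with one variable shuffled) and loss_full (loss of the unperturbed model) are hypothetical and are not taken from the package; this is not the enumeration's actual implementation.

```python
def raw(loss_dropout: float, loss_full: float) -> float:
    # RAW: report the dropout loss unchanged.
    return loss_dropout


def ratio(loss_dropout: float, loss_full: float) -> float:
    # RATIO: dropout loss relative to the unperturbed model's loss.
    return loss_dropout / loss_full


def difference(loss_dropout: float, loss_full: float) -> float:
    # DIFFERENCE: how much the loss grows when the variable is shuffled.
    return loss_dropout - loss_full
```

With a dropout loss of 3.0 and an unperturbed loss of 2.0, these give 3.0, 1.5 and 1.0 respectively.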

variable_dropout.variable_dropout.variable_dropout(estimator: Any, X: pandas.core.frame.DataFrame, y: Iterable[Any], loss_function: Callable[[Iterable[Any], Iterable[Any]], float] = <function mean_squared_error>, dropout_type: variable_dropout.variable_dropout.DropoutType = <DropoutType.RAW: (<function DropoutType.<lambda>>,)>, n_sample: int = 1000, n_iters: int = 100, random_state: Union[int, mtrand.RandomState, NoneType] = None, label=None) → pandas.core.frame.DataFrame

Determines the importance of each variable in the model. A model trained on all variables is used to predict the result variable for data in which one variable has been randomly shuffled. The worse the result with a particular variable shuffled, the more important that variable is.

Parameters:
  • estimator – any fitted classification or regression model with a predict method.
  • X – samples.
  • y – result variable for samples.
  • loss_function – a function taking vectors of real and predicted results. The better the prediction, the smaller the returned value.
  • dropout_type – method of loss representation. One of the values specified in the DropoutType enumeration.
  • n_sample – number of samples to predict for. The given number of samples is randomly chosen from X with replacement.
  • n_iters – number of iterations. The final result is the mean of the results of all iterations.
  • random_state – ensures deterministic results if run twice with the same value.
Returns:

a DataFrame of variable dropout losses, sorted in descending order.
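The procedure described above can be sketched without any dependencies. This is a minimal illustration under stated assumptions, not the package's implementation: dropout_loss_sketch, predict and seed are hypothetical names, the data is a dict of column lists rather than a DataFrame, and only the RAW representation is computed.

```python
import random
from statistics import mean
from typing import Callable, Dict, List, Optional, Sequence

Columns = Dict[str, List[float]]


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    # Smaller values mean better predictions.
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def dropout_loss_sketch(
    predict: Callable[[Columns], List[float]],  # stands in for estimator.predict
    X: Columns,
    y: Sequence[float],
    loss_function: Callable[[Sequence[float], Sequence[float]], float] = mean_squared_error,
    n_sample: int = 1000,
    n_iters: int = 100,
    seed: Optional[int] = None,
) -> Dict[str, float]:
    rng = random.Random(seed)
    n_rows = len(y)
    totals = {name: 0.0 for name in X}
    for _ in range(n_iters):
        # Draw n_sample row indices from the data with replacement.
        rows = [rng.randrange(n_rows) for _ in range(n_sample)]
        sample = {name: [col[i] for i in rows] for name, col in X.items()}
        y_sample = [y[i] for i in rows]
        for name in X:
            # Shuffle one variable, keep the others intact, and score the model.
            shuffled = dict(sample)
            shuffled[name] = rng.sample(sample[name], len(sample[name]))
            totals[name] += loss_function(y_sample, predict(shuffled))
    # Average over iterations and sort from largest to smallest loss.
    losses = {name: total / n_iters for name, total in totals.items()}
    return dict(sorted(losses.items(), key=lambda item: item[1], reverse=True))
```

A variable the model ignores keeps a loss equal to that of the unperturbed model, while a variable the model relies on produces a larger loss and therefore appears first in the result.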

variable_dropout.plot_variable_dropout

variable_dropout.plot_variable_dropout.plot_variable_dropout(*args, max_vars: Union[int, NoneType] = 10, include_baseline_and_full: bool = True) → None

Plots the results of variable_dropout.

Parameters:
  • args – any number of variable_dropout results.
  • max_vars – maximum number of variables to plot per classifier, or None to plot all of them.
  • include_baseline_and_full – whether to include _baseline_ and _full_model_ in the plot.