CategoricalBinarizer

class paralytics.preprocessing.CategoricalBinarizer(keywords_true=None, keywords_false=None)

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Finds categorical columns with binary-like response and converts them.

Scans the categorical columns of the DataFrame and selects those whose categories all correspond to the passed boolean keywords.

Parameters
keywords_{true, false}: list, optional (default=None)

Lists of category names to be interpreted as the logical values True and False, respectively.

Attributes
columns_binarylike_: list

List of column names that should be mapped to boolean.

Methods Summary

fit(self, X[, y])

Fits selection of binary-like columns.

transform(self, X)

Applies boolean conversion to binary-like category columns.

Methods Documentation

fit(self, X, y=None)

Fits selection of binary-like columns.

Parameters
X: pd.DataFrame, shape = (n_samples, n_features)

Data with n_samples as its number of samples and n_features as its number of features.

y: ignored

Returns
self: object

Returns the instance itself.
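
The selection step described above can be pictured with a minimal pandas sketch. It is not the paralytics source: the helper name `find_binarylike_columns`, the restriction to columns of `category` dtype, and the exact (case-sensitive) keyword match are all assumptions made for illustration.

```python
import pandas as pd


def find_binarylike_columns(X, keywords_true, keywords_false):
    """Hypothetical helper: names of categorical columns holding only keyword values."""
    allowed = set(keywords_true) | set(keywords_false)
    selected = []
    # Assumption: only columns with the pandas `category` dtype are inspected.
    for col in X.select_dtypes(include="category").columns:
        observed = set(X[col].dropna())
        # Keep the column when it has values and every one is a known keyword.
        if observed and observed <= allowed:
            selected.append(col)
    return selected


X = pd.DataFrame({
    "smoker": pd.Categorical(["yes", "no", "yes"]),
    "city": pd.Categorical(["Oslo", "Rome", "Oslo"]),
    "age": [31, 45, 27],
})
columns_binarylike = find_binarylike_columns(X, ["yes"], ["no"])
```

In this sketch only `smoker` qualifies: `city` is categorical but holds values outside the keyword lists, and `age` is not categorical. The result plays the role of the `columns_binarylike_` attribute.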

transform(self, X)

Applies boolean conversion to binary-like category columns.

Columns of X that contain only binary-like string values are mapped to booleans: each value becomes True or False according to the keyword list it belongs to.

Parameters
X: pd.DataFrame, shape = (n_samples, n_features)

Data with n_samples as its number of samples and n_features as its number of features.

Returns
X_new: pd.DataFrame, shape = (n_samples, n_features)

Copy of X in which each binary-like category column is replaced by its corresponding boolean values.
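
As a rough illustration of the mapping step, the sketch below converts keyword values to booleans with plain pandas. The helper name `to_boolean` is hypothetical, and the handling of values outside the keyword lists (left as missing here) is an assumption, not a statement about what the library does.

```python
import pandas as pd


def to_boolean(X, columns, keywords_true, keywords_false):
    """Hypothetical helper: map keyword categories in `columns` to booleans."""
    mapping = {key: True for key in keywords_true}
    mapping.update({key: False for key in keywords_false})
    X_new = X.copy()  # leave the caller's DataFrame untouched
    for col in columns:
        # Cast to object first so the result is not a categorical column;
        # values absent from the mapping come out as missing (assumption).
        X_new[col] = X_new[col].astype(object).map(mapping)
    return X_new


X = pd.DataFrame({
    "smoker": pd.Categorical(["yes", "no", "yes"]),
    "age": [31, 45, 27],
})
X_new = to_boolean(X, ["smoker"], keywords_true=["yes"], keywords_false=["no"])
```

The returned frame has the same shape as the input, with only the listed columns changed. This matches the `(n_samples, n_features)` shape documented for `X_new`.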