ColumnProjector

class paralytics.preprocessing.ColumnProjector(manual_projection=None, num_to_float=True)[source]

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Projects variable types onto basic dtypes.

Unless specified otherwise, numeric features are projected onto float, boolean features onto bool, and categorical features onto the ‘category’ dtype.
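
The default rule can be pictured with plain pandas. The snippet below is only a sketch of the described behaviour, not the paralytics implementation; the column names and the use of `select_dtypes` to detect each kind of feature are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "age": [21, 35, 48],               # numeric feature
    "is_member": [True, False, True],  # boolean feature
    "city": ["Oslo", "Lima", "Oslo"],  # categorical feature
})

# Assumed detection step: split the columns by their current dtype.
numeric_cols = df.select_dtypes(include="number").columns
bool_cols = df.select_dtypes(include="bool").columns
categorical_cols = df.select_dtypes(exclude=["number", "bool"]).columns

# Build a column -> dtype mapping that mirrors the default rule.
column_dtypes = {}
column_dtypes.update({col: float for col in numeric_cols})
column_dtypes.update({col: bool for col in bool_cols})
column_dtypes.update({col: "category" for col in categorical_cols})

projected = df.astype(column_dtypes)
print(projected.dtypes)
```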

Parameters
manual_projection: dictionary, optional (default=None)

Dictionary whose keys are the target dtypes and whose values are lists of the names of the columns to be projected onto each dtype. Example usage:

>>> manual_projection = {
...    float: ['foo', 'bar'],
...    'category': ['baz'],
...    int: ['qux'],
...    bool: ['quux']
... }
num_to_float: boolean, optional (default=True)

Specifies whether numerical variables should be projected onto float (if True) or onto int (if False).
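
To make the shape of manual_projection concrete, the sketch below applies such a dictionary with plain pandas. It is an illustration under assumptions, not the library's code: the sample data is invented, and inverting the mapping so that `DataFrame.astype` can consume it is one possible approach.

```python
import pandas as pd

df = pd.DataFrame({
    "foo": [1, 2, 3],
    "bar": [4, 5, 6],
    "baz": ["a", "b", "a"],
    "qux": [1.0, 2.0, 3.0],
    "quux": [0, 1, 1],
})

manual_projection = {
    float: ["foo", "bar"],
    "category": ["baz"],
    int: ["qux"],
    bool: ["quux"],
}

# Invert dtype -> columns into column -> dtype, the form DataFrame.astype accepts.
column_dtypes = {
    col: dtype
    for dtype, cols in manual_projection.items()
    for col in cols
}
projected = df.astype(column_dtypes)

# num_to_float only selects the target for numeric columns that are not
# assigned manually (assumed here to be a simple float/int switch).
num_to_float = False
numeric_target = float if num_to_float else int
```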

Attributes
automatic_projection_: dict

Dictionary whose keys are dtype names and whose values are the columns automatically chosen for projection onto that dtype. When manual_projection is specified, the manual assignment takes precedence.
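
The precedence of the manual assignment can be read as follows. This is a hedged sketch: it assumes both mappings have the form dtype -> list of column names, and the variable names `automatic`, `manual` and `final` are hypothetical.

```python
# Assumed automatic choice, in the form dtype -> list of columns.
automatic = {
    float: ["age", "income"],
    "category": ["city"],
    bool: ["is_member"],
}

# The user overrides a single column.
manual = {"category": ["age"]}

# Columns named in the manual mapping are removed from the automatic one.
manually_assigned = {col for cols in manual.values() for col in cols}

final = {}
for dtype, cols in automatic.items():
    kept = [col for col in cols if col not in manually_assigned]
    if kept:
        final[dtype] = kept

# The manual assignments are then added on top.
for dtype, cols in manual.items():
    final.setdefault(dtype, []).extend(cols)

print(final)
```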

Methods Summary

fit(self, X[, y])

Fits corresponding dtypes to X.

transform(self, X)

Applies variable projection to X.

Methods Documentation

fit(self, X, y=None)[source]

Fits corresponding dtypes to X.

Parameters
X: pd.DataFrame, shape = (n_samples, n_features)

Training data of independent variable values.

y: ignored

Not used; present only for compatibility with the scikit-learn estimator API.

Returns
self: object

Returns the instance itself.

transform(self, X)[source]

Applies variable projection to X.

Parameters
X: pd.DataFrame, shape = (n_samples, n_features)

New data with n_samples as its number of samples.

Returns
X_new: pd.DataFrame, shape = (n_samples, n_features)

X with its columns projected onto the specified dtypes.
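
The split between fit and transform can be illustrated with a small stand-in class. `DtypeProjectorSketch` is hypothetical and written with pandas only; it imitates the documented interface (dtypes chosen in fit, applied in transform) and should not be taken as the actual ColumnProjector code.

```python
import pandas as pd


class DtypeProjectorSketch:
    """Hypothetical illustration of the fit/transform contract."""

    def fit(self, X, y=None):
        # Decide a target dtype for every column of the training data.
        self.projection_ = {}
        for col in X.columns:
            if pd.api.types.is_bool_dtype(X[col]):
                self.projection_[col] = bool
            elif pd.api.types.is_numeric_dtype(X[col]):
                self.projection_[col] = float
            else:
                self.projection_[col] = "category"
        return self

    def transform(self, X):
        # Apply the dtypes chosen during fit; astype returns a new DataFrame.
        return X.astype(self.projection_)


train = pd.DataFrame({"age": [21, 35], "city": ["Oslo", "Lima"]})
new = pd.DataFrame({"age": [50], "city": ["Oslo"]})

projector = DtypeProjectorSketch().fit(train)
new_projected = projector.transform(new)
print(new_projected.dtypes)
```

Because fit returns the instance itself, the two calls can be chained, and the result keeps the shape (n_samples, n_features) of the input.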