ColumnProjector

class paralytics.preprocessing.ColumnProjector(manual_projection=None, num_to_float=True)

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin

Projects variable types onto basic dtypes.

If no projection is specified, numeric features are projected onto float, boolean features onto bool, and categorical features onto the ‘category’ dtype.

Parameters

manual_projection: dictionary, optional (default=None)

Dictionary whose keys are the dtypes onto which columns will be projected and whose values are lists of the names of the variables to project onto each dtype. Example usage:

>>> manual_projection = {
...     float: ['foo', 'bar'],
...     'category': ['baz'],
...     int: ['qux'],
...     bool: ['quux'],
... }

num_to_float: boolean, optional (default=True)

Specifies whether numerical variables should be projected onto float (if True) or onto int (if False).
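The shape of manual_projection (dtype as key, list of column names as value) can be made concrete by flattening it into a per-column mapping, which is the form that pandas `DataFrame.astype` accepts. The sketch below is illustrative only: `invert_projection` is a hypothetical helper, not part of paralytics, and the duplicate check is an assumption about what a sensible validation might look like.

```python
def invert_projection(manual_projection):
    """Turn {dtype: [columns]} into {column: dtype} (hypothetical helper)."""
    column_to_dtype = {}
    for dtype, columns in manual_projection.items():
        for column in columns:
            # Assumption: a column listed under two dtypes is treated as an error.
            if column in column_to_dtype:
                raise ValueError(
                    f"column {column!r} is assigned to more than one dtype"
                )
            column_to_dtype[column] = dtype
    return column_to_dtype


manual_projection = {
    float: ['foo', 'bar'],
    'category': ['baz'],
    int: ['qux'],
    bool: ['quux'],
}
column_to_dtype = invert_projection(manual_projection)
```

A mapping of this form could then be passed to `DataFrame.astype` on a frame that contains those columns.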

Attributes

automatic_projection_: dict: Dictionary whose keys are the dtypes, chosen automatically, onto which columns will be projected. When manual_projection is specified, the manual assignment takes precedence.
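The precedence rule for manual over automatic assignments can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library's code: `resolve_projection` is a hypothetical name, and both dictionaries are assumed to map a dtype to a list of column names.

```python
def resolve_projection(automatic, manual):
    """Combine two {dtype: [columns]} dicts, letting manual entries win."""
    manual = manual or {}
    manually_assigned = {col for cols in manual.values() for col in cols}

    # Drop every manually assigned column from its automatic dtype.
    resolved = {
        dtype: [col for col in cols if col not in manually_assigned]
        for dtype, cols in automatic.items()
    }
    # Then add the manual assignments under their requested dtypes.
    for dtype, cols in manual.items():
        resolved.setdefault(dtype, []).extend(cols)
    return resolved


automatic = {float: ['foo', 'qux'], 'category': ['baz']}
manual = {int: ['qux']}
resolved = resolve_projection(automatic, manual)
```

Here 'qux' would be projected onto float automatically, but the manual entry moves it to int.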

Methods Summary

`fit`(self, X[, y])	Fits corresponding dtypes to X.
`transform`(self, X)	Applies the variable projection to X.

Methods Documentation

fit(self, X, y=None)

Fits corresponding dtypes to X.

Parameters

X: pd.DataFrame, shape = (n_samples, n_features): Training data of independent variable values.
y: Ignored; present only for scikit-learn API compatibility.

Returns

self: object: Returns the instance itself.

transform(self, X)

Applies the variable projection to X.

Parameters

X: pd.DataFrame, shape = (n_samples, n_features): New data with n_samples as its number of samples.

Returns

X_new: pd.DataFrame, shape = (n_samples, n_features): X with its columns projected onto the specified dtypes.
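The fit/transform contract described above can be illustrated with a small stand-in built on pandas alone. This is a sketch under assumptions, not the paralytics implementation: `SketchProjector` and its `dtypes_` attribute are hypothetical names, and the dtype-detection rules are a guess at behaviour consistent with the class description.

```python
import pandas as pd
from pandas.api.types import is_bool_dtype, is_numeric_dtype


class SketchProjector:
    """Hypothetical stand-in that mimics the documented behaviour."""

    def __init__(self, manual_projection=None, num_to_float=True):
        self.manual_projection = manual_projection
        self.num_to_float = num_to_float

    def fit(self, X, y=None):
        numeric_target = float if self.num_to_float else int
        self.dtypes_ = {}  # hypothetical attribute: column name -> dtype
        for col in X.columns:
            # Check bool first, because pandas also counts bool as numeric.
            if is_bool_dtype(X[col]):
                self.dtypes_[col] = bool
            elif is_numeric_dtype(X[col]):
                self.dtypes_[col] = numeric_target
            else:
                self.dtypes_[col] = 'category'
        # Manual assignments override the automatic ones.
        for dtype, cols in (self.manual_projection or {}).items():
            for col in cols:
                self.dtypes_[col] = dtype
        return self

    def transform(self, X):
        # astype returns a new frame; the input is left unchanged.
        return X.astype(self.dtypes_)


df = pd.DataFrame({'foo': [1, 2], 'flag': [True, False], 'baz': ['a', 'b']})
out = SketchProjector().fit(df).transform(df)
manual_out = SketchProjector(manual_projection={'category': ['foo']}).fit(df).transform(df)
```

With the defaults, the integer column 'foo' ends up as float; passing it under 'category' in manual_projection overrides that choice.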