This article collects typical usage examples of the Python method preprocessor.Preprocessor.fill_nans. If you are wondering how Preprocessor.fill_nans is used in practice, the example below may help. You can also look further into usage examples of the class the method belongs to, preprocessor.Preprocessor.
One code example of the Preprocessor.fill_nans method is shown below.
Example 1: get_important_vars
# Required import: from preprocessor import Preprocessor [as alias]
# Or: from preprocessor.Preprocessor import fill_nans [as alias]
import pandas as pd

from preprocessor import Preprocessor  # project-local module


def get_important_vars(cfg, dat):
    '''
    Selects features by their correlation with the target column.

    cfg['target'] names the binary target column of the DataFrame dat.
    '''
    # Balances the dataset: all positives plus an equal number of negatives
    idxs_pos = dat[cfg['target']] == 1
    pos = dat[idxs_pos]
    neg = dat[dat[cfg['target']] == 0].iloc[:idxs_pos.sum()]
    # Concatenates pos and neg, it's already shuffled
    sub_dat = pd.concat([pos, neg], ignore_index=True)
    # Imputes the data and fills in the missing values
    sub_dat = Preprocessor.fill_nans(sub_dat)
    # Changes categorical vars to a numerical form
    X = pd.get_dummies(sub_dat)

    #### Correlation-based Feature Selection ####
    # Computes correlation between cfg['target'] and the predictors
    target_corr = X.corr()[cfg['target']]
    # Sorts by absolute correlation and picks the first x features
    # TODO: get optimal x value automatically
    tmp = target_corr.abs().sort_values(ascending=False)
    important_vars = [tmp.index[0]]
    important_vars.extend(list(tmp.index[2:52]))  # skips index 1 to remove the other target

    #### Variance-based Feature Selection ####
    #sel = VarianceThreshold(threshold = 0.005)
    #X_new = sel.fit_transform(X)

    #### Univariate Feature Selection ####
    #y = X.TARGET_B
    #X = X.drop("TARGET_B", axis = 1)
    #X_new = SelectKBest(chi2, k = 10).fit_transform(X.values, y.values)

    #### Tree-based Feature Selection ####
    #clf = ExtraTreesClassifier()
    #X_new = clf.fit(X.values, y.values).transform(X.values)
    #aux = dict(zip(X.columns, clf.feature_importances_))
    #important_vars = [i[0] for i in sorted(
    #    aux.items(), key = operator.itemgetter(0))]

    return important_vars
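
The example depends on Preprocessor.fill_nans, whose implementation is not shown on this page. To make the balancing, imputation and correlation-ranking steps easier to try out, here is a minimal self-contained sketch on toy data. Note the assumptions: fill_nans_sketch is a hypothetical stand-in (median for numeric columns, most frequent value otherwise) and may not match what the real fill_nans does; rank_by_target_corr and the toy column names are also invented for illustration.

```python
import numpy as np
import pandas as pd


def fill_nans_sketch(df):
    """Hypothetical stand-in for Preprocessor.fill_nans (an assumption, not the real method)."""
    out = df.copy()
    for col in out.columns:
        if out[col].isna().all():
            continue  # nothing to impute from
        if pd.api.types.is_numeric_dtype(out[col]):
            # Numeric columns: fill gaps with the column median
            out[col] = out[col].fillna(out[col].median())
        else:
            # Other columns: fill gaps with the most frequent value
            out[col] = out[col].fillna(out[col].mode().iloc[0])
    return out


def rank_by_target_corr(df, target, k):
    """Return the k predictors with the largest absolute correlation to target."""
    X = pd.get_dummies(df, dtype=float)  # one-hot encode categorical columns
    corr = X.corr()[target].drop(target).abs()
    return list(corr.sort_values(ascending=False).index[:k])


toy = pd.DataFrame({
    "target": [1, 1, 1, 0, 0, 0, 0, 0],
    "signal": [5.0, 6.0, np.nan, 1.0, 0.0, 1.0, 0.0, 1.0],
    "noise": [1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, np.nan],
    "group": ["a", "b", None, "a", "b", "a", "b", "a"],
})

# Balance: keep every positive row and an equal number of negatives
pos = toy[toy["target"] == 1]
neg = toy[toy["target"] == 0].iloc[:len(pos)]
balanced = pd.concat([pos, neg], ignore_index=True)

filled = fill_nans_sketch(balanced)
top = rank_by_target_corr(filled, "target", k=2)
print(top)
```

On this toy data the signal column should rank ahead of noise, since it was built to track the target. Unlike get_important_vars, the sketch drops the target from the ranking explicitly instead of relying on its position in the sorted index.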