This article collects typical usage examples of the preprocessor.Preprocessor.fill_nans method in Python. If you are wondering how Preprocessor.fill_nans is used in practice, the selected example below may help. You can also look further into usage examples of preprocessor.Preprocessor, the class that defines the method.
One code example of the Preprocessor.fill_nans method is shown below.
Example 1: get_important_vars
# Required imports (pandas is used below but was missing from the original listing)
import pandas as pd
from preprocessor import Preprocessor


def get_important_vars(cfg, dat):
    '''
    Performs feature selection and returns the names of the selected columns.
    '''
    # Balances the dataset: keeps every positive row and an equal number of negatives.
    # (The original slice [1:n] skipped the first negative row and kept only n - 1.)
    idxs_pos = dat[cfg['target']] == 1
    pos = dat[idxs_pos]
    neg = dat[dat[cfg['target']] == 0].iloc[:idxs_pos.sum()]
    # Concatenates pos and neg; the source assumes dat is already shuffled.
    # DataFrame.append was removed in pandas 2.0, so pd.concat is used instead.
    sub_dat = pd.concat([pos, neg], ignore_index=True)
    # Imputes the data and fills in the missing values
    sub_dat = Preprocessor.fill_nans(sub_dat)
    # Changes categorical vars to a numerical form
    X = pd.get_dummies(sub_dat, dtype=float)

    #### Correlation-based Feature Selection ####
    # Computes correlation between cfg['target'] and the predictors
    target_corr = X.corr()[cfg['target']]
    # Sorts by absolute correlation and picks the first x features
    # (Series.sort was removed from pandas; sort_values returns a sorted copy)
    # TODO: get optimal x value automatically
    tmp = target_corr.abs().sort_values(ascending=False)
    important_vars = [tmp.index[0]]  # the target itself (correlation 1.0)
    important_vars.extend(list(tmp.index[2:52]))  # removes other target

    #### Variance-based Feature Selection ####
    #sel = VarianceThreshold(threshold = 0.005)
    #X_new = sel.fit_transform(X)

    #### Univariate Feature Selection ####
    #y = X.TARGET_B
    #X = X.drop("TARGET_B", axis = 1)
    #X_new = SelectKBest(chi2, k = 10).fit_transform(X.values, y.values)

    #### Tree-based Feature Selection ####
    #clf = ExtraTreesClassifier().fit(X.values, y.values)
    #X_new = SelectFromModel(clf, prefit = True).transform(X.values)
    #aux = dict(zip(X.columns, clf.feature_importances_))
    # Sorts by importance (itemgetter(1)), most important first
    #important_vars = [i[0] for i in sorted(
    #    aux.items(), key = operator.itemgetter(1), reverse = True)]

    return important_vars
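
The example above depends on a project-specific Preprocessor class whose fill_nans implementation is not shown, so it cannot be run on its own. The following is a minimal, self-contained sketch of the same two steps (imputation, then correlation-based selection) on synthetic data. The names fill_nans_sketch and top_correlated, the median/mode imputation rule, and the toy columns are all assumptions made for illustration; they are not taken from the original project, and the real Preprocessor.fill_nans may behave differently.

```python
import numpy as np
import pandas as pd


def fill_nans_sketch(df):
    """Hypothetical stand-in for Preprocessor.fill_nans (real logic unknown).

    Assumed rule: numeric columns get their median,
    all other columns get their most frequent value.
    """
    out = df.copy()
    for col in out.columns:
        if pd.api.types.is_numeric_dtype(out[col]):
            out[col] = out[col].fillna(out[col].median())
        else:
            out[col] = out[col].fillna(out[col].mode().iloc[0])
    return out


def top_correlated(df, target, k):
    """Return the k predictors with the largest absolute correlation to target."""
    # One-hot encode categorical columns so every column is numeric
    X = pd.get_dummies(df, dtype=float)
    # Drop the target's self-correlation before ranking
    corr = X.corr()[target].drop(target).abs()
    return list(corr.sort_values(ascending=False).index[:k])


# Synthetic data: "signal" determines the label, the other columns do not
rng = np.random.default_rng(0)
n = 200
signal = rng.normal(size=n)
dat = pd.DataFrame({
    "signal": signal,
    "noise": rng.normal(size=n),
    "region": rng.choice(["north", "south"], size=n),
    "label": (signal > 0).astype(int),
})
# Inject some missing values in a numeric and a categorical column
dat.loc[dat.index % 10 == 0, "noise"] = np.nan
dat.loc[dat.index % 25 == 5, "region"] = np.nan

clean = fill_nans_sketch(dat)
selected = top_correlated(clean, "label", k=2)
print(selected)
```

Unlike get_important_vars, this sketch drops the target by name before ranking instead of skipping fixed index positions, which avoids relying on the target (and a second target column) landing in the first two slots of the sorted result.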