当前位置: 首页>>代码示例>>Python>>正文


Python Preprocessor.fill_nans方法代码示例

本文整理汇总了Python中preprocessor.Preprocessor.fill_nans方法的典型用法代码示例。如果您正苦于以下问题:Python Preprocessor.fill_nans方法的具体用法?Python Preprocessor.fill_nans怎么用?Python Preprocessor.fill_nans使用的例子?那么, 这里精选的方法代码示例或许可以为您提供帮助。您也可以进一步了解该方法所在preprocessor.Preprocessor的用法示例。


在下文中一共展示了Preprocessor.fill_nans方法的1个代码示例,这些例子默认根据受欢迎程度排序。您可以为喜欢或者感觉有用的代码点赞,您的评价将有助于系统推荐出更棒的Python代码示例。

示例1: get_important_vars

# 需要导入模块: from preprocessor import Preprocessor [as 别名]
# 或者: from preprocessor.Preprocessor import fill_nans [as 别名]
    def get_important_vars(cfg, dat):
        '''
        This method does Feature Selection.
        '''

        # Balances the dataset
        idxs_pos = dat[cfg['target']] == 1
        pos = dat[idxs_pos]
        neg = dat[dat[cfg['target']] == 0][1:sum(idxs_pos)]

        # Concatenates pos and neg, it's already shuffled
        sub_dat = pos.append(neg, ignore_index = True)

        # Imputes the data and fills in the missing values
        sub_dat = Preprocessor.fill_nans(sub_dat)

        # Changes categorical vars to a numerical form
        X = pd.get_dummies(sub_dat)

        #### Correlation-based Feature Selection ####

        # Computes correlation between cfg['target'] and the predictors
        target_corr = X.corr()[cfg['target']].copy()
        target_corr.sort(ascending = False)

        # Sorts and picks the first x features
        # TODO: get optimal x value automatically
        tmp = abs(target_corr).copy()
        tmp.sort(ascending = False)
        important_vars = [tmp.index[0]]
        important_vars.extend(list(tmp.index[2:52])) # removes other target

        #### Variance-based Feature Selection ####

        #sel = VarianceThreshold(threshold = 0.005)
        #X_new = sel.fit_transform(X)

        #### Univariate Feature Selection ####

        #y = X.TARGET_B
        #X = X.drop("TARGET_B", axis = 1)

        #X_new = SelectKBest(chi2, k = 10).fit_transform(X.values, y.values)

        #### Tree-based Feature Selection ####

        #clf = ExtraTreesClassifier()
        #X_new = clf.fit(X.values, y.values).transform(X.values)

        #aux = dict(zip(X.columns, clf.feature_importances_))
        #important_vars = [i[0] for i in sorted(
        #    aux.items(), key = operator.itemgetter(0))]

        return important_vars
开发者ID:kantSunthad,项目名称:kdd98cup,代码行数:56,代码来源:analyser.py


注:本文中的preprocessor.Preprocessor.fill_nans方法示例由纯净天空整理自Github/MSDocs等开源代码及文档管理平台,相关代码片段筛选自各路编程大神贡献的开源项目,源码版权归原作者所有,传播和使用请参考对应项目的License;未经允许,请勿转载。