當前位置: 首頁>>代碼示例>>Python>>正文


Python Preprocessor.fill_nans方法代碼示例

本文整理匯總了Python中preprocessor.Preprocessor.fill_nans方法的典型用法代碼示例。如果您正苦於以下問題:Python Preprocessor.fill_nans方法的具體用法?Python Preprocessor.fill_nans怎麽用?Python Preprocessor.fill_nans使用的例子?那麽, 這裏精選的方法代碼示例或許可以為您提供幫助。您也可以進一步了解該方法所在preprocessor.Preprocessor的用法示例。


在下文中一共展示了Preprocessor.fill_nans方法的1個代碼示例,這些例子默認根據受歡迎程度排序。您可以為喜歡或者感覺有用的代碼點讚,您的評價將有助於係統推薦出更棒的Python代碼示例。

示例1: get_important_vars

# 需要導入模塊: from preprocessor import Preprocessor [as 別名]
# 或者: from preprocessor.Preprocessor import fill_nans [as 別名]
    def get_important_vars(cfg, dat):
        '''
        This method does Feature Selection.
        '''

        # Balances the dataset
        idxs_pos = dat[cfg['target']] == 1
        pos = dat[idxs_pos]
        neg = dat[dat[cfg['target']] == 0][1:sum(idxs_pos)]

        # Concatenates pos and neg, it's already shuffled
        sub_dat = pos.append(neg, ignore_index = True)

        # Imputes the data and fills in the missing values
        sub_dat = Preprocessor.fill_nans(sub_dat)

        # Changes categorical vars to a numerical form
        X = pd.get_dummies(sub_dat)

        #### Correlation-based Feature Selection ####

        # Computes correlation between cfg['target'] and the predictors
        target_corr = X.corr()[cfg['target']].copy()
        target_corr.sort(ascending = False)

        # Sorts and picks the first x features
        # TODO: get optimal x value automatically
        tmp = abs(target_corr).copy()
        tmp.sort(ascending = False)
        important_vars = [tmp.index[0]]
        important_vars.extend(list(tmp.index[2:52])) # removes other target

        #### Variance-based Feature Selection ####

        #sel = VarianceThreshold(threshold = 0.005)
        #X_new = sel.fit_transform(X)

        #### Univariate Feature Selection ####

        #y = X.TARGET_B
        #X = X.drop("TARGET_B", axis = 1)

        #X_new = SelectKBest(chi2, k = 10).fit_transform(X.values, y.values)

        #### Tree-based Feature Selection ####

        #clf = ExtraTreesClassifier()
        #X_new = clf.fit(X.values, y.values).transform(X.values)

        #aux = dict(zip(X.columns, clf.feature_importances_))
        #important_vars = [i[0] for i in sorted(
        #    aux.items(), key = operator.itemgetter(0))]

        return important_vars
開發者ID:kantSunthad,項目名稱:kdd98cup,代碼行數:56,代碼來源:analyser.py


注:本文中的preprocessor.Preprocessor.fill_nans方法示例由純淨天空整理自Github/MSDocs等開源代碼及文檔管理平台,相關代碼片段篩選自各路編程大神貢獻的開源項目,源碼版權歸原作者所有,傳播和使用請參考對應項目的License;未經允許,請勿轉載。