This page collects typical usage examples of the Dataset.Dataset.getTrainAndTestSets method in Python. If you are wondering how Dataset.getTrainAndTestSets is used in practice, the selected example below may help. You can also look into further usage examples of Dataset.Dataset, the class that contains this method.
One code example of the Dataset.getTrainAndTestSets method is shown below; examples are sorted by popularity by default.
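The Dataset class itself is not part of the example, so its implementation is unknown here. As a rough sketch of what a call such as getTrainAndTestSets(0.8, seed=1) presumably does, the following shuffles the rows with a seeded generator and puts the first 80% into the training set. The function name split_train_test, its parameters, and the toy data are all assumptions made for illustration; this is not the original method.

```python
import numpy as np


def split_train_test(X, Y, train_fraction, seed=None):
    """Hypothetical stand-in for Dataset.getTrainAndTestSets.

    Assumes X has one row per post and Y has one label per row.
    """
    # A seeded generator makes the shuffle reproducible across runs.
    rng = np.random.RandomState(seed)
    order = rng.permutation(X.shape[0])

    # The first train_fraction of the shuffled rows become the training set.
    n_train = int(round(train_fraction * X.shape[0]))
    train_idx, test_idx = order[:n_train], order[n_train:]
    return X[train_idx], Y[train_idx], X[test_idx], Y[test_idx]


# Toy data (assumed): 10 posts, 4 word counts each, labels +1 / -1.
X = np.arange(40).reshape(10, 4)
Y = np.array([1, -1] * 5).reshape(10, 1)

Xtrain, Ytrain, Xtest, Ytest = split_train_test(X, Y, 0.8, seed=1)
print(Xtrain.shape, Xtest.shape)
```

Passing the same seed again yields the same split, which is presumably why the example below fixes seed=1.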
Example 1: Rivera
# Required import: from Dataset import Dataset [as alias]
# Or: from Dataset.Dataset import getTrainAndTestSets [as alias]
#!/usr/bin/python
# CIS 521 Homework 7: Learning Machine Learning
# Cory Rivera (rcor) and Sam Panzer (panzers)
from numpy import *
from Dataset import Dataset
d = Dataset("comp.sys.ibm.pc.hardware.txt",
            "rec.sport.baseball.txt", cutoff=10)
#d = Dataset("comp.sys.mac.hardware.txt", "comp.sys.ibm.pc.hardware.txt", cutoff=2000)
(Xtrain, Ytrain, Xtest, Ytest) = d.getTrainAndTestSets(0.8, seed=1)
wordlist = d.getWordList()
def trainNaiveBayes(X, Y):
    # First, count frequencies given the category.
    # Each row is a post, and each column is a word.
    # To count the number of words from every post, sum up the values from each
    # column for a given category.
    # Flatten Y so that it is easier to iterate over
    yFlat = Y.flatten()
    yPos = yFlat == 1
    yNeg = yFlat == -1
    # X.shape[1] is the number of columns of the matrix
    numColumns = X.shape[1]
    # Indexing with a boolean array like yPos selects only the indices that are True