

Python Parser.getWordList Method Code Examples

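This article collects typical usage examples of the Parser.Parser.getWordList method in Python. If you are wondering how Parser.getWordList is used in practice, or are looking for working examples of it, the selected code examples here may help. You can also look further into usage examples of Parser.Parser, the class that contains this method.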


Two code examples of the Parser.getWordList method are shown below, sorted by popularity by default.
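Both examples below follow the same call pattern: build a Parser from one large string, then call getWordList() to obtain a list of words. The Parser class itself is not reproduced in this article, so the following is only a hypothetical stand-in, assuming that getWordList splits the text on whitespace and leaves punctuation attached to the words. The real Recommendit class may behave differently.

```python
class Parser(object):
    """Hypothetical stand-in; NOT the Recommendit implementation."""

    def __init__(self, text):
        # Keep the raw text that was passed to the constructor
        self.text = text

    def getWordList(self):
        # Assumption: words are separated by whitespace only,
        # so punctuation stays attached to each word
        return self.text.split()


parser = Parser("Hello world, hello Reddit")
words = parser.getWordList()
print(words)
```

Under this assumption the caller still has to strip punctuation itself, which matches what Example 1 does after calling getWordList().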

Example 1: getInfo

# Required import: from Parser import Parser [as alias]
# Or: from Parser.Parser import getWordList [as alias]
    def getInfo(self, uName):
        r = praw.Reddit('Reddit User Data Scraper')
        try:
            user = r.get_redditor(uName)
        except Exception:
            # Treat any lookup failure as an invalid user name
            self.isValid = False
            return
         
        comments = user.get_comments(limit = None)
        totStr = ""
        
        for c in comments:
            data = vars(c)['body']
            data = data.encode('ascii','ignore')
            data = data.strip()
            totStr = totStr + data + " "

            # Count this comment's subreddit (case-insensitive)
            srName = vars(vars(c)['subreddit'])['display_name']
            srName = srName.lower()
            self.userSubreddits[srName] = self.userSubreddits.get(srName, 0) + 1

        submissions = user.get_submitted(limit = None)

        for s in submissions:
            data = vars(s)['title']
            data = data.encode('ascii', 'ignore')
            data = data.strip()
            totStr = totStr + data + " "

            # Count this submission's subreddit (case-insensitive)
            srName = vars(vars(s)['subreddit'])['display_name']
            srName = srName.lower()
            self.userSubreddits[srName] = self.userSubreddits.get(srName, 0) + 1
        
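        # Sort ascending by count, then prepend each name so that
        # srList ends up ordered from most to least active subreddit
        srListTemp = sorted(self.userSubreddits.iteritems(), key=operator.itemgetter(1))
        for sr in srListTemp:
            self.srList.insert(0, sr[0])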

        parser = Parser(totStr)        
        self.allWords = parser.getWordList()
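        tempWords = []
        for word in self.allWords:
            if len(word) == 0:
                continue
            # Strip at most one leading and one trailing punctuation character
            if word[0] in self.punc:
                word = word[1:]
            if word and word[-1] in self.punc:
                word = word[:-1]
            # Skip words that consisted only of punctuation
            if word:
                tempWords.append(word)
        self.allWords = tempWords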
Developer: ben444422, Project: Recommendit, Lines of code: 59, Source file: Recommender.py
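The two bookkeeping steps in Example 1, ranking subreddits by activity and trimming punctuation from word edges, can be sketched on their own in Python 3. This is a minimal sketch, not the project's code: the function names rank_subreddits and strip_edge_punctuation are hypothetical, and string.punctuation is only an assumed substitute for self.punc, whose contents are not shown in the example.

```python
from collections import Counter
import string


def rank_subreddits(names):
    """Return subreddit names ordered from most to least frequent."""
    # Lowercase first so "Python" and "python" are counted together
    counts = Counter(name.lower() for name in names)
    return [name for name, _ in counts.most_common()]


def strip_edge_punctuation(words, punc=string.punctuation):
    """Remove at most one punctuation character from each end of a word."""
    cleaned = []
    for word in words:
        if word and word[0] in punc:
            word = word[1:]
        if word and word[-1] in punc:
            word = word[:-1]
        # Drop words that were empty or consisted only of punctuation
        if word:
            cleaned.append(word)
    return cleaned


ranked = rank_subreddits(["Python", "python", "AskReddit", "python"])
cleaned = strip_edge_punctuation(["(hello)", "world!", "!", "", "ok"])
print(ranked)
print(cleaned)
```

Counter replaces the manual dictionary updates, and most_common() replaces the sort-then-prepend loop. How ties are ordered may differ from the original loop.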

Example 2: str

# Required import: from Parser import Parser [as alias]
# Or: from Parser.Parser import getWordList [as alias]
        if submissions is None:
            continue
    
        sys.stderr.write("Processing subreddit # " + str(i) + ": " + subreddit[:-1] + "\n")
        text = "" 
        for submission in submissions:
            data = vars(submission)['title']
            data = data.encode('ascii', 'ignore')
            data = data.strip()
            text = text + data + " "
            
        if text == "":
            continue

        parser = Parser(text)    
        wordList = parser.getWordList()
        
        text = " ".join(wordList)
        srName = subreddit[:-1]
        srName = srName.lower()
        
        srDatum = { "name" : srName,
                    "text" : text }
        pprint(wordList[:10])
            
        if srData.find_one({"name": srName}) is None:
            srData.insert(srDatum)
        else:
            srData.update({ "name": srName }, srDatum)
    except Exception:
        sys.stderr.write("Error in processing subreddit # " + str(i) + ": " + subreddit[:-1] + "\n")        
Developer: ben444422, Project: Recommendit, Lines of code: 33, Source file: DataGetter.py
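Both examples build their input text by calling encode('ascii', 'ignore') on each title and concatenating the result with a str. That only works in Python 2; in Python 3, encode returns bytes, which cannot be concatenated with str. A minimal Python 3 sketch of the same text-building step follows. The helper name join_ascii is hypothetical, and unlike the original loop it leaves no trailing space.

```python
def join_ascii(titles):
    """Drop non-ASCII characters from each title and join them with spaces."""
    parts = []
    for title in titles:
        # encode() yields bytes in Python 3, so decode back to str
        ascii_title = title.encode("ascii", "ignore").decode("ascii")
        parts.append(ascii_title.strip())
    return " ".join(parts)


text = join_ascii(["Café news ", "  naïve idea"])
print(text)
```

The resulting string can then be passed to Parser(text) in the same way as in the examples above.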


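Note: The Parser.Parser.getWordList examples in this article were compiled by 纯净天空 from open-source code and documentation platforms such as Github and MSDocs. The code snippets were selected from open-source projects contributed by their respective developers, and copyright in the source code belongs to the original authors. For distribution and use, refer to the License of the corresponding project. Do not reproduce without permission.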