Python Dictionary.bestEnglishWordForSpanishWordToken Method Code Examples

This article collects typical usage examples of the Python method Dictionary.Dictionary.bestEnglishWordForSpanishWordToken. If you are wondering how Dictionary.bestEnglishWordForSpanishWordToken is used in practice, the curated example here may help. You can also look further into usage examples of the enclosing class, Dictionary.Dictionary.


One code example of the Dictionary.bestEnglishWordForSpanishWordToken method is shown here; examples are sorted by popularity by default.

Example 1: __init__

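# Required import: from Dictionary import Dictionary [as alias]
# Or: from Dictionary.Dictionary import bestEnglishWordForSpanishWordToken [as alias]
import re  # used by createWordLookup; the original listing omits this import
# Corpus is also used below and must be imported from the project (its module path is not shown in the source).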
class ModifiedTranslator:
    def __init__(self):
        self.dictionary = Dictionary()

    # Pre processing strategies
    def createWordLookup(self, foreignSentence):
        corpus = Corpus()
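        tokenDictList = []

        # Split on runs of non-word characters. The capturing group keeps those
        # separators (spaces, punctuation) in the list as tokens of their own.
        if isinstance(foreignSentence, bytes):
            foreignSentence = foreignSentence.decode('utf-8')
        spanishTokens = re.compile(r'(\W+)', re.UNICODE).split(foreignSentence)
        # A sentence ending in punctuation leaves a trailing empty string; drop only that.
        if spanishTokens and spanishTokens[-1] == '':
            spanishTokens.pop()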
        
        for idx, token in enumerate(spanishTokens):
            tokenDict = dict()
            tokenDict['originalToken'] = token
            tokenDict['spanish_POS'] = corpus.spanishTags().get(token, None)
            # Remember whether the source token started with a capital letter.
            tokenDict['upper'] = len(token) > 0 and token[0].isupper()
            tokenDictList.append(tokenDict)
            
        self.tokenDictList = tokenDictList
            
    def translateSentence(self, foreignSentence):
        # Pre processing
        self.createWordLookup(foreignSentence)
        
        # Translation
        translatedSentence = ""
        for spanishToken in self.tokenDictList:
            originalToken = spanishToken['originalToken']
            
            translatedWord = self.dictionary.bestEnglishWordForSpanishWordToken(spanishToken)
            if translatedWord:
                spanishToken['translatedToken'] = translatedWord
            else:
                spanishToken['translatedToken'] = originalToken
                          
        # Post processing
        # Strategies
        # 1 - Preserve capitalization
        self.capitalizeWords()
        
        # 2 - Flip object pronouns (le dio -> dio le, to match English 'gave him')
        self.flipObjectPronouns()
        
        # 3 - Flip adjectives and nouns (libro entero -> entero libro, to match English 'whole book')
        self.flipAdjectivesAndNouns()

        # 4 - For infinitive, future, and conditional verbs, insert 'to', 'will', and 'would', respectively
        self.correctVerbForm()
        
        # Eventually turn into a sentence
        for token in self.tokenDictList:
            translatedToken = token['translatedToken']
            translatedSentence = translatedSentence + translatedToken
        
        return translatedSentence
    
    # Post processing strategies
    def capitalizeWords(self):
        for token in self.tokenDictList:
            if token['upper']:    
                translatedToken = token['translatedToken']
                shouldCapitalize = True
                if (len(translatedToken) > 1):
                    if translatedToken[1].isupper():
                        shouldCapitalize = False
                        
                if shouldCapitalize:
                    token['translatedToken'] = translatedToken.capitalize()
                    
    def flipObjectPronouns(self):
        pronounIdx = None
        verbIdx = None

        # Find an object pronoun (tag 'pp...') followed, one separator token later,
        # by a verb (tag 'v...'). If several pairs match, the last one wins.
        for idx, token in enumerate(self.tokenDictList):
            spanishPOS = token['spanish_POS']
            if spanishPOS and spanishPOS.startswith('pp'):
                possibleVerbTokenIndex = idx + 2  # idx + 1 is the separator token
                if len(self.tokenDictList) > possibleVerbTokenIndex:
                    possibleVerbTag = self.tokenDictList[possibleVerbTokenIndex]['spanish_POS']
                    if possibleVerbTag and possibleVerbTag[0] == 'v':
                        pronounIdx, verbIdx = idx, possibleVerbTokenIndex

        # Swap by position: list.index() on dicts compares by value and could
        # pick an earlier, equal token instead of the one found above.
        if pronounIdx is not None:
            self.tokenDictList[pronounIdx], self.tokenDictList[verbIdx] = (
                self.tokenDictList[verbIdx], self.tokenDictList[pronounIdx])
    
    
    def flipAdjectivesAndNouns(self):
        noun = None
#......... part of the code is omitted here .........
Developer ID: danielsht86, project: cs124-pa6, lines of code: 103, source file: ModifiedTranslator.py
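
The listing rebuilds the sentence by concatenating tokens, and flipObjectPronouns looks two positions ahead for the verb. Both follow from the tokenization step, which keeps separators as tokens. The following is a minimal, self-contained Python 3 sketch of that idea, not the project's code. It assumes tags are plain strings in which a `pp` prefix marks a personal pronoun and a `v` prefix a verb, as the checks in the listing suggest. `TAGS`, `tokenize`, and `flip_object_pronoun` are hypothetical stand-ins, and the tag values are made up for illustration.

```python
import re

# Hypothetical tag table standing in for corpus.spanishTags(); the tag strings
# are illustrative only ('pp...' = personal pronoun, 'v...' = verb).
TAGS = {"le": "pp3", "dio": "vm", "libro": "nc"}


def tokenize(sentence):
    """Split on runs of non-word characters, keeping the separators as tokens."""
    parts = re.split(r"(\W+)", sentence)
    # A sentence ending in punctuation leaves a trailing empty string; drop it.
    if parts and parts[-1] == "":
        parts.pop()
    return [{"text": p, "pos": TAGS.get(p.lower())} for p in parts]


def flip_object_pronoun(tokens):
    """Swap the last pronoun/verb pair separated by exactly one separator token."""
    pair = None
    for idx, tok in enumerate(tokens):
        pos = tok["pos"]
        if pos and pos.startswith("pp") and idx + 2 < len(tokens):
            verb_pos = tokens[idx + 2]["pos"]
            if verb_pos and verb_pos.startswith("v"):
                pair = (idx, idx + 2)
    if pair:
        i, j = pair
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens


sentence = "Juan le dio el libro."
tokens = tokenize(sentence)
# Separators are tokens too, so joining restores the input exactly.
roundtrip = "".join(t["text"] for t in tokens)
flipped = "".join(t["text"] for t in flip_object_pronoun(tokens))
print(roundtrip)
print(flipped)
```

In this sketch the pronoun sits at index 2 and the verb at index 4 because the space between them is a token of its own, which is why the lookahead is `idx + 2` rather than `idx + 1`.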


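Note: The Dictionary.Dictionary.bestEnglishWordForSpanishWordToken examples in this article were compiled by 純淨天空 from open-source code and documentation platforms such as GitHub and MSDocs. The code snippets were selected from open-source projects contributed by their developers, and copyright of the source code belongs to the original authors. For distribution and use, refer to the corresponding project's License. Do not reproduce without permission.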