当前位置: 首页>>代码示例>>Python>>正文


Python FileUtils.recursivefilegetter方法代码示例

本文整理汇总了Python中FileUtils.recursivefilegetter方法的典型用法代码示例。如果您正苦于以下问题:Python FileUtils.recursivefilegetter方法的具体用法?Python FileUtils.recursivefilegetter怎么用?Python FileUtils.recursivefilegetter使用的例子?那么恭喜您, 这里精选的方法代码示例或许可以为您提供帮助。您也可以进一步了解该方法所在FileUtils的用法示例。


在下文中一共展示了FileUtils.recursivefilegetter方法的1个代码示例,这些例子默认根据受欢迎程度排序。您可以为喜欢或者感觉有用的代码点赞,您的评价将有助于系统推荐出更棒的Python代码示例。

示例1: main

# 需要导入模块: import FileUtils [as 别名]
# 或者: from FileUtils import recursivefilegetter [as 别名]
def main():
    import FileCabinet
    import FileUtils
    import Volume2
    import Context
    import sys
    import os

    # DEFINE CONSTANTS.
    delim = '\t'
    debug = False
    felecterrors = ['fee', 'fea', 'fay', 'fays', 'fame', 'fell', 'funk', 'fold', 'haft', 'fat', 'fix', 'chafe', 'loft']
    selecttruths = ['see', 'sea', 'say', 'says', 'same', 'sell', 'sunk', 'sold', 'hast', 'sat', 'six', 'chase', 'lost']

    # Locate ourselves in the directory structure.

    cwd = os.getcwd()
    cwdparent = os.path.abspath(os.path.join(cwd, os.pardir))

    # We need to find a directory called 'rulesets,' which we expect to be located
    # either within the working directory or adjacent to it.

    if os.path.isdir(os.path.join(cwd, "rulesets")):
        rulepath = os.path.join(cwd, "rulesets")
    elif os.path.isdir(os.path.join(cwdparent, "rulesets")):
        rulepath = os.path.join(cwdparent, "rulesets")
    else:
        user = input("Please specify a path to the ruleset directory: ")
        if os.path.isdir(user):
            rulepath = user
        else:
            print("Invalid path.")
            sys.exit()

    # Use rulepath to load relevant rules inside modules.

    Volume2.importrules(rulepath)
    Context.importrules(rulepath)

    # Now we enter dialogue with the user. This is all a little 1982,
    # but what can I say? Wetware upgrades are expensive.

    def prompt(promptstring, options):
        user = input(promptstring)
        if user not in options:
            user = prompt(promptstring, options)
        return user

    # Ask the user to tell us how to find files to process.
    print("****************** CorrectOCR 0.1 ******************")
    print()
    print("Do you want the full spiel (explanations, caveats, etc.)")
    user = prompt("y/n : ", ["y", "n"])
    
    if user.lower() == "y":
        spielpath = os.path.join(cwd, "spiel.txt")
        with open(spielpath, encoding = 'utf-8') as file:
            filelines = file.readlines()
        for line in filelines:
            print(line, end='')

    print("\nThis script will correct .txt files, or extract text")
    print("from zipped archives containing one txt file for each page.")
    print("In either case it writes the cleaned files back to their")
    print("original locations with the new suffix '.clean.txt'.")
    print("\nDo you want to unpack .zip files or .txt files?")
    user = prompt("zip or txt: ", ["zip", "txt"])
    suffix = "." + user
    suffixlen = len(suffix)
    
    print("\nThere are two ways to identify the location of the")
    print("files to be corrected.")
    print("\n1. Provide the path to a folder that contains them. I'll")
    print("recursively search subdirectories of that folder as well. Or,")
    print("\n2. Provide a file holding a list of pairtree file identifiers,")
    print("e.g. HathiTrust Volume IDs. I can use those identifiers to infer")
    print("the paths to the files themselves.\n")

    user = prompt("Which option do you prefer (1 or 2)? ", ["1", "2"])
    
    if user == "1":
        rootpath = input("Path to the folder that contains source files: ")
        filelist = FileUtils.recursivefilegetter(rootpath, suffix)
 
    else:
        print("I expect the pairtree identifiers to be listed one per line,")
        print("and to be the only item on a line.")  
        filepath = input("Path to the file that contains pairtree identifiers: ")
        filelist = list()
        with open(filepath, encoding = 'utf-8') as file:
            filelines = file.readlines()

        print("Now I need a path to the folder that contains the pairtree structure.")
        print("If you have multiple folders for different libraries, this should be")
        print("the folder above them all. It should end with a slash.")
        rootpath = input("Path to the folder that contains pairtree: ")
        for line in filelines:
            line = line.rstrip()
            filepath, postfix = FileCabinet.pairtreepath(line, rootpath)
            filename = filepath + postfix + '/' + postfix + suffix
#.........这里部分代码省略.........
开发者ID:Anterotesis,项目名称:DataMunging,代码行数:103,代码来源:OCRnormalizer.py


注:本文中的FileUtils.recursivefilegetter方法示例由纯净天空整理自Github/MSDocs等开源代码及文档管理平台,相关代码片段筛选自各路编程大神贡献的开源项目,源码版权归原作者所有,传播和使用请参考对应项目的License;未经允许,请勿转载。