

Python Employee.profile method code examples

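This article collects typical usage examples of the models.Employee.profile method in Python. If you are wondering how Employee.profile is used in practice, or are looking for working examples of it, the selected code examples here may help. You can also look further into usage examples of models.Employee, the class it belongs to.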


Four code examples of Employee.profile are shown below, sorted by popularity by default.
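All of the examples assign to `employee.profile`, so `profile` appears to be a plain attribute holding a homepage URL rather than a callable method. The `Employee` model itself is not shown on this page. The following is a minimal sketch, assuming a simple data class with the fields the examples touch (`name`, `url`, `profile`, `title`, `research`, `tel`, `email`). The field names come from the examples; the defaults, the class layout, and the sample URLs are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Employee:
    """Hypothetical stand-in for models.Employee (the real class is not shown)."""
    name: str = ""
    url: str = ""
    profile: str = ""   # personal homepage URL, set by the handlers
    title: str = ""
    research: str = ""
    tel: str = ""
    email: str = ""


# Made-up values, for illustration only.
employee = Employee(name="Alice", url="http://example.com/staff/alice")
employee.profile = "http://example.com/~alice"
```

With string defaults like these, code such as `employee.tel += c` in Example 4 works without a prior None check; whether the real model behaves this way is not confirmed by the source.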

Example 1: handler

# Required import: from models import Employee [as alias]
# Or: from models.Employee import profile [as alias]
def handler(tag):
    employee = Employee()
    # The first link with class "orangea" supplies the name and the profile URL.
    anchors = tag.find_all('a', class_="orangea")
    if anchors:
        # Remove all whitespace from the displayed name.
        employee.name = ''.join(anchors[0].get_text().split())
        employee.profile = anchors[0]['href']

    # The first link with class "black01" supplies the text lines for the parser.
    anchors = tag.find_all('a', class_="black01")
    if anchors:
        lines = anchors[0].stripped_strings
        parser = ProfileParser(lines=lines, employee=employee)
        employee = parser.parse()
    return employee
Developer: Jumbo-WJB, Project: EduParser, Lines of code: 16, Source: MyHandler.py
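Example 1 cleans the name with `''.join(name.split())`. Here is a small sketch of what that idiom does, using a made-up name and a hypothetical helper `remove_whitespace`. Called with no arguments, `str.split()` splits on any run of Unicode whitespace, so joining the pieces removes all whitespace, including the full-width space (U+3000) that is common on Chinese pages.

```python
def remove_whitespace(text):
    """Drop every whitespace character, not just leading and trailing ones."""
    return "".join(text.split())


# "\u3000" is the full-width (ideographic) space.
cleaned = remove_whitespace(" 张\u3000三 \n")
```

This differs from `str.strip()`, which would leave the space between the two characters in place.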

Example 2: handler

# Required import: from models import Employee [as alias]
# Or: from models.Employee import profile [as alias]
def handler(tag):
    employee = Employee()

    # stripped_strings is a generator: items read here are not seen again later.
    lines = tag.stripped_strings

    anchors = tag.find_all(name="a", attrs={"class": "dt_text_tit"})
    if not anchors:
        # No title link: the first line is the name.
        for line in lines:
            employee.name = line
            break
    else:
        employee.name = anchors[0].string
        employee.profile = anchors[0]["href"]
        employee.url = employee.profile

    parser = ProfileParser(lines=lines, employee=employee)
    employee = parser.parse()
    return employee
Developer: yixiaoyang, Project: EduParser, Lines of code: 21, Source: MyHandler.py
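One detail worth noting in Example 2: `stripped_strings` in BeautifulSoup is a generator, so reading the first item for the name consumes it, and the parser then only receives the remaining lines. The sketch below illustrates this with a plain Python generator standing in for `tag.stripped_strings`. The function `fake_stripped_strings` and the sample strings are made up for illustration.

```python
def fake_stripped_strings():
    """Stand-in for tag.stripped_strings: yields each line lazily, once."""
    yield "Alice"
    yield "Tel: 12345"
    yield "alice@example.com"


lines = fake_stripped_strings()
name = next(lines, None)   # consumes the first line (None if there is none)
remaining = list(lines)    # what a later consumer of `lines` would receive
```

If a later consumer also needs the first line, one option is to materialise the generator first, for example `lines = list(tag.stripped_strings)`, and index into the list instead.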

Example 3: profile_handler

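# Required import: from models import Employee [as alias]
# Or: from models.Employee import profile [as alias]
import os
from bs4 import BeautifulSoup

# Config, ProfileParser and set_attr_hook are not shown in this snippet.
def profile_handler(doc, name, url, path):
    filename = os.path.join(path, name + ".html")
    employee = Employee(name=name, url=url)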

    # Only keep the name and personal homepage; the resume file is saved separately in the current directory.
    soup = BeautifulSoup(doc, Config.SOUP_PARSER)
    tables = soup.find_all(name="table", attrs={"width": "96%", "cellspacing": "0"}, limit=1)
    if not tables:
        print("main table not found")
        div = soup
    else:
        div = tables[0]

    if not os.path.exists(filename):
        # prettify() returns text, so write in text mode with an explicit encoding.
        with open(filename, 'w', encoding='utf-8') as fp:
            fp.write(div.prettify())

    tables = soup.find_all(name="table", attrs={"width": "96%", "cellspacing": "1"}, limit=1)
    if not tables:
        print("main table not found")
        div = soup
    else:
        div = tables[0]

    # Find the link whose text is "点击此处访问" ("click here to visit").
    anchors = div.find_all('a', text="点击此处访问")
    if anchors:
        employee.profile = anchors[0]['href']
        print('Got profile:' + employee.profile)

    # Process the content as plain text.
    lines = div.stripped_strings
    # text = div.get_text(strip=True)
    parser = ProfileParser(lines=lines, employee=employee, set_attr_hook=set_attr_hook, max_line=256)
    return parser.parse()
Developer: Jumbo-WJB, Project: EduParser, Lines of code: 38, Source: MyHandler.py

Example 4: profile_handler

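# Required import: from models import Employee [as alias]
# Or: from models.Employee import profile [as alias]
import os
from bs4 import BeautifulSoup

def profile_handler(doc, name, url, path):
    # Page label -> Employee attribute. Python 3.7+ dicts keep insertion order,
    # so u'电话:' is tried before u'电话' and the full-width colon is stripped.
    symbols = {
        u'个人主页:': 'profile',   # personal homepage
        u'研究方向:': 'research',  # research interests
        u'电话:': 'tel',          # telephone (full-width colon)
        u'电话': 'tel'             # telephone (no colon)
    }
    filename = os.path.join(path, name + ".html")

    employee = Employee(name=name, url=url)
    # The page is too messy: only keep the name and personal homepage; the resume file is saved separately in the current directory.
    soup = BeautifulSoup(doc, Config.SOUP_PARSER)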
    divs = soup.find_all(id="sub_main",limit=1)
    if not divs:
        # Fallback: the document is XML with a <member> element.
        members = soup.find_all(name="member", limit=1)
        if not members:
            print("id:main or sub_main not found")
            # print(doc)
            return employee
        member = members[0]
        # The title follows the first space in the <name> element.
        names = member.find_all('name')
        if names:
            name_text = names[0].string
            if name_text:
                idx = name_text.find(' ')
                if idx != -1:
                    employee.title = name_text[idx:]
        if member.field:
            employee.research = member.field.string or ''
        if member.homepage:
            employee.profile = member.homepage.string or ''
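        if member.contact:
            if member.contact.string:
                # Keep only the digits of the contact string as the phone number.
                for c in member.contact.string:
                    if c.isdigit():
                        employee.tel += c

        # prettify() returns text, so write in text mode with an explicit encoding.
        with open(filename, 'w', encoding='utf-8') as fp:
            fp.write(member.prettify())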
        return employee
    
    div = divs[0]
    # prettify() returns text, so write in text mode with an explicit encoding.
    with open(filename, 'w', encoding='utf-8') as fp:
        fp.write(div.prettify())

    h4s = div.find_all('h4')
    if h4s:
        heading = h4s[0].string
        idx = heading.find(' ') if heading else -1
        if idx != -1:
            # The title follows the first space in the heading; drop its whitespace.
            employee.title = ''.join(heading[idx:].split())

            
    lis = div.find_all("li", limit=8)
    if not lis:
        return employee
    # Parse the detail fields from the children of the first <li>.
    for tag in lis[0].children:
        text = tag.string
        if not text:
            continue
        text = ''.join(text.split())
        if '@' in text:
            employee.email = text
            continue

        # Use `attr` rather than `name` to avoid shadowing the function argument.
        for symbol, attr in symbols.items():
            idx = text.find(symbol)
            if idx != -1:
                value = text[idx + len(symbol):]
                if hasattr(employee, attr):
                    setattr(employee, attr, value)
                    print(attr + ":" + value)
                else:
                    print("no attr %s in employee" % attr)
                break
    return employee
Developer: yixiaoyang, Project: pyScripts, Lines of code: 89, Source: MyHandler.py
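The label matching at the end of Example 4 can be isolated into a small pure function, which makes it easier to test without any HTML. This is a sketch under assumptions: `extract_field`, `SYMBOLS`, and the English labels are hypothetical and only mirror the lookup loop in the example; they are not part of the project's code.

```python
def extract_field(text, symbols):
    """Return (attribute, value) for the first label found in text, else None."""
    for symbol, attr in symbols.items():
        idx = text.find(symbol)
        if idx != -1:
            # The value is everything after the label.
            return attr, text[idx + len(symbol):]
    return None


# Hypothetical labels. "Tel:" is listed before the bare "Tel" so that the
# colon is stripped when present (dicts keep insertion order in Python 3.7+).
SYMBOLS = {
    "Homepage:": "profile",
    "Tel:": "tel",
    "Tel": "tel",
}

result = extract_field("Tel:010-1234", SYMBOLS)
missing = extract_field("no label here", SYMBOLS)
```

Returning the pair instead of calling `setattr` directly leaves the decision of what to do with unknown attributes to the caller.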


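Note: The models.Employee.profile examples in this article were compiled by 纯净天空 from open-source code and documentation platforms such as Github and MSDocs. The code snippets were selected from open-source projects contributed by their respective developers, and copyright in the source code belongs to the original authors. For distribution and use, refer to the License of the corresponding project. Do not reproduce without permission.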