The Python urlparse Function Explained

Python `urlparse` function declaration

urlparse.urlparse(urlstring[, scheme[, allow_fragments]])

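Parses a URL into six components and returns a 6-tuple. This corresponds to the general structure of a URL: scheme://netloc/path;parameters?query#fragment. Each tuple item is a string, possibly empty. The components are not broken up into smaller parts (for example, the network location is a single string), and % escapes are not expanded. The delimiters shown above are not part of the result, except for a leading slash in the path component, which is retained if present.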

Usage example 1:

>>> from urlparse import urlparse
>>> o = urlparse('http://www.cwi.nl:80/%7Eguido/Python.html')
>>> o   
ParseResult(scheme='http', netloc='www.cwi.nl:80', path='/%7Eguido/Python.html',
            params='', query='', fragment='')
>>> o.scheme
'http'
>>> o.port
80
>>> o.geturl()
'http://www.cwi.nl:80/%7Eguido/Python.html'
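
The example above leaves params, query and fragment empty. The following is a minimal sketch that fills all six components and unpacks the result like an ordinary tuple. The URL is a made-up illustration, and the try/except import is an assumption added so that the snippet also runs on Python 3, where the same function lives in `urllib.parse`.

```python
try:
    from urlparse import urlparse  # Python 2 module, as used in this article
except ImportError:
    from urllib.parse import urlparse  # Python 3 home of the same function

# Hypothetical URL, chosen only so that no component is empty.
url = 'http://example.com/%7Euser/index.html;type=a?lang=en&page=2#intro'

parts = urlparse(url)

# The result behaves like a plain 6-tuple, so it can be unpacked directly.
scheme, netloc, path, params, query, fragment = parts

print(scheme)    # http
print(netloc)    # example.com
print(path)      # /%7Euser/index.html  (leading slash kept, %7E not expanded)
print(params)    # type=a
print(query)     # lang=en&page=2
print(fragment)  # intro
```

Note that the delimiters `;`, `?` and `#` do not appear in any of the printed values.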

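Following the syntax specification in RFC 1808, urlparse recognizes a netloc only if it is properly introduced by '//'. Otherwise the input is presumed to be a relative URL and therefore to start with a path component.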

Usage example 2:

>>> from urlparse import urlparse
>>> urlparse('//www.cwi.nl:80/%7Eguido/Python.html')
ParseResult(scheme='', netloc='www.cwi.nl:80', path='/%7Eguido/Python.html',
           params='', query='', fragment='')
>>> urlparse('www.cwi.nl/%7Eguido/Python.html')
ParseResult(scheme='', netloc='', path='www.cwi.nl/%7Eguido/Python.html',
           params='', query='', fragment='')
>>> urlparse('help/Python.html')
ParseResult(scheme='', netloc='', path='help/Python.html', params='',
           query='', fragment='')

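If the scheme argument is specified, it gives the default addressing scheme, which is used only if the URL does not specify one. The default value for this argument is the empty string.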
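
A short sketch of the default-scheme behaviour, assuming `'https'` as an arbitrary default. The try/except import is an addition so the snippet also runs on Python 3.

```python
try:
    from urlparse import urlparse  # Python 2 module, as used in this article
except ImportError:
    from urllib.parse import urlparse  # Python 3 home of the same function

# The URL has no scheme of its own, so the default is used.
relative = urlparse('//www.cwi.nl/%7Eguido/Python.html', scheme='https')

# The URL already names a scheme, so the default is ignored.
absolute = urlparse('http://www.cwi.nl/%7Eguido/Python.html', scheme='https')

print(relative.scheme)  # https
print(absolute.scheme)  # http
```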

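If the allow_fragments argument is false, fragment identifiers are not recognized and are instead parsed as part of the preceding component, even if the URL's addressing scheme normally supports them. The default value for this argument is True.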
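
To illustrate, here is a small sketch comparing the default behaviour with `allow_fragments=False`. The `#section` and `?q=python#results` suffixes are invented for the demonstration, and the try/except import is an addition so the snippet also runs on Python 3.

```python
try:
    from urlparse import urlparse  # Python 2 module, as used in this article
except ImportError:
    from urllib.parse import urlparse  # Python 3 home of the same function

url = 'http://www.cwi.nl/%7Eguido/Python.html#section'

default = urlparse(url)
no_frag = urlparse(url, allow_fragments=False)

print(default.path, default.fragment)  # fragment split off as 'section'
print(no_frag.path, no_frag.fragment)  # '#section' stays in the path

# When a query is present, the '#...' text stays attached to the query.
with_query = urlparse('http://www.cwi.nl/search?q=python#results',
                      allow_fragments=False)
print(with_query.query)
```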

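The return value is actually an instance of a subclass of tuple. This class has the following additional read-only convenience attributes: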

Attributes	Index	Value	Value if not present
scheme	0	URL scheme specifier	scheme parameter
netloc	1	Network location part	empty string
path	2	Hierarchical path	empty string
params	3	Parameters for last path element	empty string
query	4	Query component	empty string
fragment	5	Fragment identifier	empty string
username		User name	None
password		Password	None
hostname		Host name (lower case)	None
port		Port number as integer, if present	None
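
The sketch below reads the convenience attributes from the table. The credentials `guido:secret` and port `8080` are made up for illustration, and the try/except import is an addition so the snippet also runs on Python 3.

```python
try:
    from urlparse import urlparse  # Python 2 module, as used in this article
except ImportError:
    from urllib.parse import urlparse  # Python 3 home of the same function

# Hypothetical credentials and port, embedded in the network location.
o = urlparse('http://guido:secret@WWW.CWI.NL:8080/%7Eguido/Python.html')

print(o.netloc)    # the raw network location, still a single string
print(o.username)  # guido
print(o.password)  # secret
print(o.hostname)  # www.cwi.nl (lower-cased)
print(o.port)      # 8080, as an integer

# Index access and attribute access refer to the same tuple items.
print(o[1] == o.netloc)  # True

# Attributes that are absent from the URL come back as None.
plain = urlparse('http://www.cwi.nl/')
print(plain.username, plain.password, plain.port)
```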

For more information on the result object, see the section on the results of urlparse() and urlsplit().

Additional notes

Changed in version 2.5: Added attributes to the return value.
Changed in version 2.7: Added IPv6 URL parsing capabilities.
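
As a rough illustration of the IPv6 support mentioned above, the following sketch parses a URL whose host is a bracketed IPv6 literal. The address `2001:db8::1` comes from the documentation-only range and is purely an example, and the try/except import is an addition so the snippet also runs on Python 3.

```python
try:
    from urlparse import urlparse  # Python 2 module, as used in this article
except ImportError:
    from urllib.parse import urlparse  # Python 3 home of the same function

# IPv6 literals are written in square brackets inside the network location.
o = urlparse('http://[2001:db8::1]:8080/index.html')

print(o.netloc)    # [2001:db8::1]:8080
print(o.hostname)  # 2001:db8::1 (brackets removed)
print(o.port)      # 8080
```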
