mirror of
https://github.com/m-lamonaca/dev-notes.git
synced 2025-04-07 03:16:41 +00:00
Urllib Module Cheatsheet
Module Structure
urllib is a package that collects several modules for working with URLs:
- urllib.request for opening and reading URLs
- urllib.error containing the exceptions raised by urllib.request
- urllib.parse for parsing URLs
- urllib.robotparser for parsing robots.txt files
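As a rough illustration of the two modules not covered further in these notes, the sketch below feeds urllib.robotparser a hand-written rule set and checks how the exception classes in urllib.error relate to each other. The robots.txt rules and the example.com URLs are assumptions made up for the example; nothing is fetched from the network.

```python
import urllib.error
import urllib.robotparser

# Hypothetical robots.txt content, supplied inline so no request is made
rules = [
    "User-agent: *",
    "Disallow: /private/",
]

parser = urllib.robotparser.RobotFileParser()
parser.parse(rules)  # parse() accepts an iterable of lines

# can_fetch(useragent, url) reports whether the rules allow the URL
allowed = parser.can_fetch("*", "http://example.com/index.html")
blocked = parser.can_fetch("*", "http://example.com/private/data")

# HTTPError is a subclass of URLError, so catching URLError covers both
http_error_is_url_error = issubclass(urllib.error.HTTPError, urllib.error.URLError)
```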
urllib.request
Opening a URL
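import urllib.request

url = "https://example.com"  # placeholder URL
# read() returns only the response body; the headers are not included
with urllib.request.urlopen(url) as response:
    data = response.read().decode()  # bytes -> str (UTF-8 by default)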
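A slightly more defensive variant might wrap the call in a helper that sets a timeout and handles failures. This is only a sketch: the fetch_text name, the User-Agent string and the 10 second default are assumptions, and a data: URL stands in for a real address so the snippet runs without network access.

```python
import urllib.error
import urllib.request


def fetch_text(url, timeout=10.0):
    """Return the decoded body of url, or None if the request fails."""
    # Hypothetical User-Agent value; some servers reject the default one
    request = urllib.request.Request(url, headers={"User-Agent": "dev-notes-example"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            # Use the charset declared by the response, falling back to UTF-8
            charset = response.headers.get_content_charset() or "utf-8"
            return response.read().decode(charset)
    except urllib.error.URLError as err:
        # HTTPError (4xx/5xx responses) is a subclass of URLError
        print(f"request failed: {err}")
        return None


# data: URLs are handled by urlopen itself, so nothing leaves the machine
text = fetch_text("data:text/plain;charset=utf-8,hello")
```

Returning None on failure keeps the call site short; code that needs to distinguish an HTTP status from a connection problem would catch urllib.error.HTTPError separately before URLError.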
Reading Headers
response = urllib.request.urlopen(url)
# store headers as a dict (for repeated header names only the last value is kept)
headers = dict(response.getheaders())
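Individual headers can also be looked up through the response's headers attribute, which behaves like a case-insensitive mapping. The sketch below assumes a data: URL as a stand-in for a real HTTP address so it runs offline; a real server would return many more headers.

```python
import urllib.request

# data: URL used as an offline stand-in; urlopen still builds a headers object
with urllib.request.urlopen("data:text/plain;charset=utf-8,hello") as response:
    headers = response.headers

    # Header lookup ignores the case of the name
    content_type = headers.get("content-type")

    # Helper that extracts the charset parameter from Content-Type
    charset = headers.get_content_charset()

    # Plain dict copy of all (name, value) pairs
    as_dict = dict(headers.items())
```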
urllib.parse
URL Encoding
Encode a query in a URL
import urllib.parse

url = "http://www.addres.x/_?"
value = "some value"  # placeholder value
# urlencode() turns key-value pairs into a percent-encoded query string
encoded = url + urllib.parse.urlencode({"key": value})
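To show the encoding round trip, here is a small sketch that builds a query string, splits the resulting URL and decodes the query again. The parameter names and the www.example.com address are made up for illustration.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

# Hypothetical query parameters; non-string values are converted with str()
params = {"q": "python urllib", "page": 2}

query = urlencode(params)  # spaces in values become "+"
url = "http://www.example.com/search?" + query

# Split the URL into scheme, netloc, path, query and fragment
parts = urlsplit(url)

# parse_qs reverses urlencode; every value comes back as a list of strings
decoded = parse_qs(parts.query)

# quote() percent-encodes a single component, e.g. a path segment
path_segment = quote("my file.txt")
```

Note that urlencode and quote treat spaces differently: the former emits "+", the latter "%20".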