A Complete Guide to Emulating cURL Requests with Python Requests: From Basics to Advanced

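Introduction

In modern web development and data exchange, we often need to talk to APIs or mimic browser behavior to fetch data. cURL (Client URL) is a powerful command-line tool that can send many kinds of HTTP requests and is widely used for testing APIs, downloading files, and automating tasks. Its simplicity and power make it the first choice for many developers.

However, once a task grows more complex, has to be integrated into a larger application, or needs data processing, conditional logic, or loops, the limits of a command-line tool start to show. This is where Python's requests library becomes an ideal alternative. Built around the "HTTP for Humans" philosophy, requests offers a concise, intuitive API that makes sending HTTP requests from Python straightforward.

This guide looks at how to use requests to reproduce the request scenarios you would normally handle with cURL. We start with cURL basics, introduce the core features of requests, and then compare common cURL options with the corresponding requests parameters and methods. Whether you are an API developer, data analyst, automation engineer, or web-scraping enthusiast, this article aims to be a comprehensive and practical reference.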

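Part 1: A cURL Refresher

Before diving into requests, let's quickly review the basics of cURL and its most common options. This makes it easier to see how each cURL feature maps to Python code.

The basic syntax is curl [options] [URL]. The options control the request method, headers, body, authentication, and so on.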

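Common cURL options at a glance:

-X <METHOD> or --request <METHOD>: Sets the HTTP request method, such as GET, POST, PUT, or DELETE.
- Example: curl -X POST http://example.com/api
-H <HEADER> or --header <HEADER>: Adds a custom request header. Repeat the option to add several headers.
- Example: curl -H "Content-Type: application/json" -H "Authorization: Bearer mytoken" http://example.com/api
-d <DATA> or --data <DATA>: Sends data in the body of a POST request. The body is sent as application/x-www-form-urlencoded unless you set another Content-Type.
- Example (form data): curl -d "param1=value1&param2=value2" http://example.com/submit
- Example (raw data; add -H "Content-Type: application/json" if the server expects JSON): curl -d '{"key": "value"}' http://example.com/jsonapi
--data-raw <DATA>: Like -d, but does not treat a leading @ specially (with -d, @filename reads the data from a file).
--data-urlencode <DATA>: Like -d, but URL-encodes the data before sending it.
-F <NAME=CONTENT> or --form <NAME=CONTENT>: Sends a multipart/form-data request, commonly used for file uploads.
- Example: curl -F "file=@/path/to/local/file.jpg" -F "description=My image" http://example.com/upload
-u <USER:PASSWORD> or --user <USER:PASSWORD>: Supplies credentials for HTTP Basic authentication.
- Example: curl -u "username:password" http://example.com/protected
-b <FILE|DATA> or --cookie <FILE|DATA>: Sends cookies, given either as a file path or as a literal cookie string.
- Example: curl -b "session_id=abc; csrf_token=xyz" http://example.com/dashboard
-c <FILE> or --cookie-jar <FILE>: Saves the cookies from the response to a file.
- Example: curl -c cookies.txt http://example.com/login
-L or --location: Follows HTTP redirects (cURL does not follow them unless this option is given).
- Example: curl -L http://shorturl.at/abc
-k or --insecure: Disables SSL/TLS certificate verification. Typically used in test environments or with self-signed certificates.
- Example: curl -k https://self-signed.example.com
-o <FILE> or --output <FILE>: Saves the response body to the given file.
-O or --remote-name: Saves the response body to a local file named after the remote file.
-s or --silent: Silent mode; hides the progress meter and error messages.
-v or --verbose: Verbose mode; shows details of the request and response, including headers and the TLS handshake.
-x <PROXY_URL> or --proxy <PROXY_URL>: Sends the request through a proxy server.
- Example: curl --proxy http://myproxy.com:8080 http://example.com
--connect-timeout <SECONDS>: Sets the maximum time allowed for establishing the connection.
--max-time <SECONDS>: Sets the maximum time allowed for the whole operation.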

Knowing these options is the key to translating a cURL command into requests code.

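Part 2: Introducing the Python Requests Library

requests is a widely used third-party Python library for sending HTTP requests. Its design goal is to make HTTP requests simple and intuitive.

Installing Requests:

If you have not installed requests yet, install it with pip:

pip install requests

Basic usage:

requests provides convenience functions for the different HTTP methods:

requests.get(url, ...): sends a GET request
requests.post(url, ...): sends a POST request
requests.put(url, ...): sends a PUT request
requests.delete(url, ...): sends a DELETE request
requests.head(url, ...): sends a HEAD request
requests.options(url, ...): sends an OPTIONS request
requests.request(method, url, ...): generic function that accepts any HTTP method

All of them return a Response object that holds the server's reply: status code, response headers, response body, and so on.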

```python
import requests

# GET request
response = requests.get('https://api.github.com/users/octocat')
print(f"Status code: {response.status_code}")
print(f"Body: {response.json()}")  # parse the body as JSON

# POST request
data = {'key': 'value'}
response = requests.post('https://httpbin.org/post', data=data)
print(f"POST response: {response.json()}")
```

Part 3: Mapping cURL Options to Requests Parameters

We now go through the common cURL options one by one and show how to get the same behavior with requests.

1. HTTP Request Method (`-X`)

cURL: curl -X POST http://example.com/api
Requests: the library has a dedicated function for each common HTTP method. For anything else, use requests.request().

```python
import requests

url = "http://example.com/api"

# GET request
response = requests.get(url)
print(f"GET Status: {response.status_code}")

# POST request
response = requests.post(url)
print(f"POST Status: {response.status_code}")

# PUT request
response = requests.put(url)
print(f"PUT Status: {response.status_code}")

# DELETE request
response = requests.delete(url)
print(f"DELETE Status: {response.status_code}")

# Generic request function (PATCH here; requests.patch() also exists)
response = requests.request('PATCH', url)
print(f"PATCH Status: {response.status_code}")
```

2. Request Headers (`-H`)

cURL: curl -H "Content-Type: application/json" -H "Accept: application/xml" http://example.com/api
Requests: use the headers parameter, a dictionary whose keys and values should both be strings.

```python
import requests

url = "https://api.github.com/users/octocat"
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36",
    "Accept-Language": "en-US,en;q=0.9",
    "Referer": "https://www.google.com/"
}

response = requests.get(url, headers=headers)
print(f"Status Code: {response.status_code}")
print(f"Response Headers: {response.headers}")
```

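Common request headers:

User-Agent: Identifies the client software. Sites often use it to tell browsers from scripts, so scrapers commonly set a browser-like value.
Content-Type: The media type of the request body, for example application/json, application/x-www-form-urlencoded, or multipart/form-data.
Accept: The response media types the client is willing to receive.
Authorization: Credentials, such as a Bearer token or Basic auth.
Cookie: Sends session cookies.
Referer: The page the request originated from.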

3. Query Parameters (`?key=value` in the URL)

cURL: curl "http://example.com/search?q=python&category=programming"
Requests: use the params parameter, a dictionary or a list of tuples. requests takes care of the URL encoding.

```python
import requests

base_url = "https://www.google.com/search"
query_params = {
    "q": "Python Requests Tutorial",
    "oq": "python requests",
    "sourceid": "chrome"
}

response = requests.get(base_url, params=query_params)
print(f"Request URL: {response.url}")  # the full URL including the query string
print(f"Status code: {response.status_code}")

# print(response.text)  # print the HTML of the search results page
```
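The list-of-tuples form of params matters when the same key has to appear more than once, which a plain dictionary cannot express. Below is a minimal sketch, assuming a hypothetical https://example.com/search endpoint. It builds the request with requests.Request(...).prepare() so the final URL can be inspected without sending anything.

```python
import requests

# Hypothetical endpoint, used only to show how the URL is assembled.
base_url = "https://example.com/search"

# A list of tuples keeps repeated keys in the order given.
query_params = [("tag", "python"), ("tag", "http"), ("q", "requests tutorial")]

# prepare() builds the request locally; nothing is sent over the network.
prepared = requests.Request("GET", base_url, params=query_params).prepare()
print(prepared.url)
```

The printed URL contains tag twice, and the space in the q value is percent-encoded by requests rather than by hand.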

4. Request Body (`-d`, `--data`, `--data-raw`, `--data-binary`)

This is the most involved part, because a request body can take several forms: form data, JSON, raw data, and so on.

a) Form data (application/x-www-form-urlencoded)

cURL: curl -X POST -d "name=Alice&age=30" http://example.com/submit
Requests: pass a dictionary to the data parameter. requests URL-encodes it and sets Content-Type to application/x-www-form-urlencoded.

```python
import requests

url = "https://httpbin.org/post"
form_data = {
    "name": "Alice",
    "age": "30",
    "city": "New York"
}

response = requests.post(url, data=form_data)
print(f"Status code: {response.status_code}")
print(f"Response JSON: {response.json()}")  # httpbin.org echoes the data it received
```

b) JSON data (application/json)

cURL: curl -X POST -H "Content-Type: application/json" -d '{"item":"book", "price": 29.99}' http://example.com/add
Requests: pass a Python dictionary or list to the json parameter. requests serializes it to a JSON string and sets Content-Type to application/json.

```python
import requests

url = "https://httpbin.org/post"
json_data = {
    "product_id": "P12345",
    "name": "Laptop",
    "details": {
        "color": "silver",
        "storage": "512GB SSD"
    },
    "tags": ["electronics", "computer"]
}

response = requests.post(url, json=json_data)
print(f"Status code: {response.status_code}")
print(f"Response JSON: {response.json()}")
```

**Important:**
* When a dictionary is passed via data, requests sends it as application/x-www-form-urlencoded.
* When a dictionary or list is passed via json, requests sends it as application/json.
* If your JSON is already a string (for example read from a file), pass it via data and set the Content-Type: application/json header yourself.

```python
import requests

url = "https://httpbin.org/post"

# Suppose a JSON string was read from a file or some other source
raw_json_string = '{"status": "active", "message": "hello world"}'

headers = {"Content-Type": "application/json"}
response = requests.post(url, data=raw_json_string, headers=headers)
print(f"Status code: {response.status_code}")
print(f"Response JSON: {response.json()}")
```
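The difference between data and json can be checked without touching the network. The sketch below prepares the same dictionary both ways and prints the Content-Type and body that requests would send. The httpbin.org URL is only a placeholder here and is never contacted.

```python
import json
import requests

url = "https://httpbin.org/post"  # never contacted; the requests are only prepared
payload = {"name": "Alice", "age": "30"}

form_req = requests.Request("POST", url, data=payload).prepare()
json_req = requests.Request("POST", url, json=payload).prepare()

print(form_req.headers["Content-Type"])  # application/x-www-form-urlencoded
print(form_req.body)                     # name=Alice&age=30
print(json_req.headers["Content-Type"])  # application/json
print(json.loads(json_req.body))         # the original dictionary, round-tripped
```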

c) Raw / binary data (--data-raw, --data-binary)

cURL: curl --data-raw "This is a plain text body" http://example.com/raw
Requests: pass a string, a bytes object, or a file object to the data parameter. You need to set the Content-Type header yourself.

```python
import requests

url = "https://httpbin.org/post"

# Send plain text
raw_text_data = "Hello, this is a plain text message."
headers = {"Content-Type": "text/plain"}
response = requests.post(url, data=raw_text_data, headers=headers)
print(f"Text request status: {response.status_code}")
print(f"Text request response: {response.json()}")

# Send binary data (for example an image or other file content)
with open('example.png', 'rb') as f:  # assumes example.png exists
    binary_data = f.read()
headers = {"Content-Type": "image/png"}
response = requests.post(url, data=binary_data, headers=headers)
print(f"Binary request status: {response.status_code}")
print(f"Binary request response: {response.json()}")
```

5. File Upload (`-F`, `--form`)

cURL: curl -F "image=@/path/to/local/image.jpg" -F "caption=My photo" http://example.com/upload
Requests: use the files parameter, a dictionary. Each value can be a file object or a tuple of the form (filename, file_object, content_type, custom_headers).

```python
import requests

url = "https://httpbin.org/post"

# Assume an image named 'test_image.jpg' in the current directory.
# Create a fake image file for testing.
with open("test_image.jpg", "wb") as f:
    f.write(b"fake image data")

# Open the file in a with block so it is closed after the upload
with open('test_image.jpg', 'rb') as image_file:
    files = {
        'image': ('my_photo.jpg', image_file, 'image/jpeg', {'Expires': '0'}),
        'caption': (None, 'This is my uploaded photo')  # plain form fields need no filename or MIME type
    }
    response = requests.post(url, files=files)

print(f"Status code: {response.status_code}")
print(f"Response JSON: {response.json()}")
```

The values in the files dictionary can take several forms:
* {'field_name': open('file.txt', 'rb')}: the simplest form; requests takes the filename from the file object.
* {'field_name': ('filename.txt', open('file.txt', 'rb'))}: sets the filename explicitly.
* {'field_name': ('filename.txt', open('file.txt', 'rb'), 'text/plain')}: sets the filename and the MIME type explicitly.
* {'field_name': ('filename.txt', open('file.txt', 'rb'), 'text/plain', {'X-Custom': 'header'})}: additionally sets custom headers for that part.
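To see what a files upload looks like on the wire, the request can be prepared locally and inspected. The sketch below uses an in-memory io.BytesIO object as a stand-in for a real file and passes the plain caption field through data, which is an alternative to the (None, value) tuple form. The URL is a placeholder and is never contacted.

```python
import io
import requests

# In-memory stand-in for a real file, so the sketch needs nothing on disk.
fake_image = io.BytesIO(b"fake image data")

prepared = requests.Request(
    "POST",
    "https://httpbin.org/post",  # never contacted; the request is only prepared
    files={"image": ("my_photo.jpg", fake_image, "image/jpeg")},
    data={"caption": "My photo"},  # ordinary form fields can go in data=
).prepare()

# requests generates the multipart boundary and sets the header itself.
print(prepared.headers["Content-Type"])
print(b'name="caption"' in prepared.body)
print(b'filename="my_photo.jpg"' in prepared.body)
```

Because requests generates the boundary, the Content-Type header should not be set by hand for multipart uploads.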

6. Cookie Handling (`-b`, `-c`)

cURL:
- Send: curl -b "session_id=xyz; username=test" http://example.com/data
- Save: curl -c cookies.txt http://example.com/login
Requests:
- Send: pass a dictionary to the cookies parameter.
- Receive/manage: the response.cookies attribute holds a RequestsCookieJar, which can be used like a dictionary. For anything beyond one-off requests, use requests.Session().

```python
import requests

# Send cookies
url_send = "https://httpbin.org/cookies"
my_cookies = {
    "session_id": "abc123xyz",
    "user_pref": "dark_mode"
}
response = requests.get(url_send, cookies=my_cookies)
print(f"Response after sending cookies: {response.json()}")

# Receive cookies
# /cookies/set replies with a redirect that carries the Set-Cookie headers.
# response.cookies only holds cookies set by that one response, so redirects
# are disabled here to look at the redirect response itself.
url_get = "https://httpbin.org/cookies/set?key1=value1&key2=value2"
response = requests.get(url_get, allow_redirects=False)
print(f"Cookies received: {response.cookies.get_dict()}")

# Use a Session object for persistent cookie handling (strongly recommended)
session = requests.Session()
login_url = "https://httpbin.org/cookies/set?auth=true&user=admin"
dashboard_url = "https://httpbin.org/cookies"

# The "login" request: the session stores the cookies
session.get(login_url)
print(f"Cookies in the session (after login): {session.cookies.get_dict()}")

# Visit a page that needs the cookies: the session sends them automatically
response = session.get(dashboard_url)
print(f"Dashboard response: {response.json()}")
```
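cURL's -c and -b options can also work with a cookie file on disk. One way to get something similar in Python is to back the session with http.cookiejar.MozillaCookieJar from the standard library, which reads and writes the Netscape cookies.txt format. The sketch below is an illustration under assumptions: the cookie is added by hand (with a made-up name, value, and domain) so the example runs without network access, and the file lives in a temporary directory.

```python
import os
import tempfile
from http.cookiejar import MozillaCookieJar

import requests
from requests.cookies import create_cookie

cookie_file = os.path.join(tempfile.mkdtemp(), "cookies.txt")

# Roughly `curl -c cookies.txt`: back the session with a file-based jar.
session = requests.Session()
session.cookies = MozillaCookieJar(cookie_file)
# A real script would call session.get(...) or session.post(...) here; a
# cookie is added by hand instead so the sketch runs offline.
session.cookies.set_cookie(create_cookie("session_id", "abc123", domain="example.com"))
# Session cookies have no expiry, so ignore_discard=True is needed to keep them.
session.cookies.save(ignore_discard=True, ignore_expires=True)

# Roughly `curl -b cookies.txt`: load the file into a fresh session.
restored = requests.Session()
restored.cookies = MozillaCookieJar(cookie_file)
restored.cookies.load(ignore_discard=True, ignore_expires=True)
print({cookie.name: cookie.value for cookie in restored.cookies})
```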

7. Authentication (`-u`, `--user`)

cURL: curl -u "username:password" http://example.com/protected
Requests: use the auth parameter. For Basic authentication, pass a (username, password) tuple. Requests also supports Digest auth and other authentication mechanisms.

```python
import requests
from requests.auth import HTTPBasicAuth, HTTPDigestAuth

# HTTP Basic Authentication
# The tuple is shorthand for auth=HTTPBasicAuth('user', 'passwd')
url_basic = "https://httpbin.org/basic-auth/user/passwd"
response = requests.get(url_basic, auth=('user', 'passwd'))
print(f"Basic Auth status: {response.status_code}")
print(f"Basic Auth response: {response.json()}")

# HTTP Digest Authentication (if the server supports it)
# url_digest = "http://example.com/digest-auth"
# response = requests.get(url_digest, auth=HTTPDigestAuth('user', 'passwd'))
# print(f"Digest Auth status: {response.status_code}")
```
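Passing auth=(user, password) has the same effect as curl -u: the credentials end up Base64-encoded in an Authorization: Basic header. The sketch below checks this locally by preparing the request instead of sending it.

```python
import base64
import requests

prepared = requests.Request(
    "GET",
    "https://httpbin.org/basic-auth/user/passwd",  # never contacted here
    auth=("user", "passwd"),
).prepare()

# Basic auth is "Basic " followed by base64("user:password").
expected = "Basic " + base64.b64encode(b"user:passwd").decode("ascii")
print(prepared.headers["Authorization"])
print(prepared.headers["Authorization"] == expected)  # True
```

Base64 is an encoding, not encryption, so Basic auth should only be used over HTTPS.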

8. Redirects (`-L`, `--location`)

cURL: curl -L http://shorturl.at/abc (cURL only follows redirects when -L is given)
Requests: requests follows redirects by default for every method except HEAD. Pass allow_redirects=False to turn this off.

```python
import requests

# Requests follows redirects by default
url_redirect = "https://httpbin.org/redirect/3"  # redirects 3 times
response = requests.get(url_redirect)
print(f"Status code after redirects: {response.status_code}")
print(f"Final URL: {response.url}")
print(f"Redirect history: {response.history}")  # list of the intermediate Response objects

# Disable redirects
response_no_redirect = requests.get(url_redirect, allow_redirects=False)
print(f"Status code without redirects: {response_no_redirect.status_code}")  # usually 3xx
print(f"Redirect target: {response_no_redirect.headers.get('Location')}")
```

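9. SSL/TLS Certificate Verification (`-k`, `--insecure`)

cURL: curl -k https://self-signed.example.com
Requests: use the verify parameter. The default is verify=True (certificates are verified). Set it to False to disable verification.

```python
import requests

# Warning: disabling SSL verification is a security risk and is not
# recommended in production. Use it only in test environments or when you
# clearly understand the risk.

# Suppose a site uses a self-signed certificate
# url_insecure = "https://self-signed.example.com"
# response = requests.get(url_insecure, verify=False)
# print(f"Insecure Request Status: {response.status_code}")

# Use a custom CA bundle (to verify a certificate the system does not trust)
# cacert_path = "/path/to/my_custom_cert.pem"
# response = requests.get("https://your-server.com", verify=cacert_path)
```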

With verify=False, each request triggers an InsecureRequestWarning from urllib3. The warning can be silenced with urllib3.disable_warnings().

```python
import requests
import urllib3

urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

# response = requests.get("https://self-signed.example.com", verify=False)
# print("Insecure request completed without warnings.")
```

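10. Timeouts (`--connect-timeout`, `--max-time`)

cURL: --connect-timeout 5 (connection timeout) --max-time 10 (limit for the whole operation)
Requests: use the timeout parameter. A single number applies to both the connect and the read timeout; a tuple sets them separately as (connect_timeout, read_timeout). The read timeout limits how long requests waits for the server to send data, not the total download time, so there is no exact equivalent of --max-time.

```python
import requests
from requests.exceptions import Timeout, ConnectionError

url_slow = "https://httpbin.org/delay/5"  # simulates a 5-second delay

try:
    # Connect timeout of 2 seconds, read timeout of 10 seconds
    response = requests.get(url_slow, timeout=(2, 10))
    print(f"Request succeeded, status code: {response.status_code}")
except Timeout:
    print("The request timed out!")
except ConnectionError:
    print("Connection error!")
except Exception as e:
    print(f"Unexpected error: {e}")

# A single value of 3 seconds for both timeouts
# (the connection succeeds, but the read times out)
try:
    response = requests.get(url_slow, timeout=3)
    print(f"Request succeeded, status code: {response.status_code}")
except Timeout:
    print("The request timed out!")
```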
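The order of the except clauses matters because of how the timeout exceptions are related. The short check below prints the relevant parts of the requests exception hierarchy; it needs no network access.

```python
from requests.exceptions import (
    ConnectionError,
    ConnectTimeout,
    ReadTimeout,
    RequestException,
    Timeout,
)

# ConnectTimeout is raised while the connection is being established.
print(issubclass(ConnectTimeout, Timeout))          # True
print(issubclass(ConnectTimeout, ConnectionError))  # True
# ReadTimeout is raised while waiting for the server to send data.
print(issubclass(ReadTimeout, Timeout))             # True
print(issubclass(ReadTimeout, ConnectionError))     # False
# Everything derives from RequestException.
print(issubclass(Timeout, RequestException))        # True
```

Since ConnectTimeout is both a Timeout and a ConnectionError, an `except ConnectionError` clause placed before `except Timeout` would catch connect timeouts first.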

11. Proxies (`-x`, `--proxy`)

cURL: curl --proxy http://myproxy.com:8080 http://example.com
Requests: use the proxies parameter, a dictionary whose keys are URL schemes ('http' or 'https') and whose values are proxy URLs.

```python
import requests

url = "http://httpbin.org/ip"  # shows your outgoing IP address

# Proxy configuration (replace with your own proxy addresses)
proxies = {
    "http": "http://your_http_proxy.com:8080",
    "https": "http://your_https_proxy.com:8080",
    # If the proxy requires authentication:
    # "http": "http://user:password@your_http_proxy.com:8080",
}

try:
    # response = requests.get(url, proxies=proxies)
    # print(f"IP seen through the proxy: {response.json().get('origin')}")
    print("Replace the placeholders with a real proxy address to test")
except Exception as e:
    print(f"Proxy request failed: {e}")
```

12. File Download (`-o`, `-O`)

cURL: curl -o local_file.html http://example.com/page.html
Requests: pass stream=True and read the body step by step with iter_content() or iter_lines(), writing it to a file as you go. This matters for large downloads, because it avoids loading the whole file into memory.

```python
import requests

url_download = "https://www.python.org/static/img/python-logo.png"  # a small image file
file_name = "python-logo.png"

try:
    response = requests.get(url_download, stream=True)
    response.raise_for_status()  # raise if the request failed

    with open(file_name, 'wb') as f:
        for chunk in response.iter_content(chunk_size=8192):  # read 8 KB at a time
            if chunk:  # skip empty keep-alive chunks
                f.write(chunk)
    print(f"File '{file_name}' downloaded successfully.")

except requests.exceptions.RequestException as e:
    print(f"Download failed: {e}")
```
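cURL's -O names the local file after the remote one, while requests leaves the filename to you. A small helper can derive it from the URL path. The function name remote_name and the index.html fallback are assumptions made for this sketch, not part of requests or cURL.

```python
import posixpath
from urllib.parse import unquote, urlparse


def remote_name(url, default="index.html"):
    """Return the last path segment of url, or default if there is none."""
    path = unquote(urlparse(url).path)
    # basename drops any directory part, including ones hidden by %2F escapes.
    name = posixpath.basename(path)
    return name or default


print(remote_name("https://www.python.org/static/img/python-logo.png"))  # python-logo.png
print(remote_name("https://example.com/"))  # index.html
```

The result can be passed as file_name to a streaming download. Query strings are ignored because only the path component is used.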

Part 4: Advanced Usage and Best Practices

1. Session Management (Sessions)

A requests.Session() object keeps certain state across requests, such as cookies, default headers, and the connection pool. This is useful for simulating actions after a user logs in, for better performance, and for simpler code.

```python
import requests

# Create a Session object
session = requests.Session()

# Simulate a login (assuming the server sets cookies on success)
login_url = "https://httpbin.org/cookies/set?user=john_doe&auth_token=xyz123"
session.get(login_url)
print(f"Cookies in the session after login: {session.cookies.get_dict()}")

# Use the same Session for other pages: the cookies are sent automatically
dashboard_url = "https://httpbin.org/cookies"
response = session.get(dashboard_url)
print(f"Dashboard response (with cookies): {response.json()}")

# A Session can also carry default headers and proxies
session.headers.update({"X-Custom-Header": "MyValue"})
# session.proxies = {"http": "http://localhost:8080"}

response = session.get("https://httpbin.org/headers")
print(f"Custom header seen by the server: {response.json().get('headers', {}).get('X-Custom-Header')}")

session.close()  # close the session and release its resources
```

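2. Error Handling and Exceptions

requests raises exceptions for network problems and timeouts. HTTP error status codes (4xx and 5xx) do not raise on their own; call response.raise_for_status() to turn them into an HTTPError. Good error handling is essential for robust code.

```python
import requests
from requests.exceptions import HTTPError, ConnectionError, Timeout, RequestException

url_fail = "https://httpbin.org/status/500"  # simulates an internal server error

try:
    response = requests.get(url_fail, timeout=5)
    response.raise_for_status()  # raises HTTPError for 4xx and 5xx status codes
    print(f"Request succeeded: {response.status_code}")
except HTTPError as e:
    print(f"HTTP error: {e}")
    print(f"Response status code: {e.response.status_code}")
except Timeout as e:  # listed before ConnectionError so connect timeouts land here
    print(f"Request timed out: {e}")
except ConnectionError as e:
    print(f"Connection error (DNS failure, connection refused, etc.): {e}")
except RequestException as e:  # catches every other requests exception
    print(f"General Requests error: {e}")
except Exception as e:  # catches anything else
    print(f"Unexpected error: {e}")
```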

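3. Debugging and Logging

Print the response: response.status_code, response.headers, response.text, response.json(), and response.url are the most commonly used debugging aids.
cURL conversion tools: some sites (for example curlconverter.com) convert a cURL command directly into Python requests code, which is handy when copying a complex cURL request.
Internal logging: enabling DEBUG-level logging shows more of the request/response process. Most of this output comes from urllib3, which requests uses underneath.

```python
import requests
import logging

# Turn on debug logging
logging.basicConfig(level=logging.DEBUG)
# logging.getLogger("requests").setLevel(logging.DEBUG)
# logging.getLogger("urllib3").setLevel(logging.DEBUG)

response = requests.get("https://httpbin.org/get")

# The terminal now shows the debug log lines for the connection and request
```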
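For something closer to the request side of curl -v, the request that requests builds can be printed directly. The helper below, dump_request, is a hypothetical function written for this sketch. It formats a PreparedRequest, prepared locally here so nothing is sent.

```python
import requests


def dump_request(prepared):
    """Format a PreparedRequest roughly the way `curl -v` shows outgoing data."""
    lines = [f"> {prepared.method} {prepared.path_url}"]
    lines += [f"> {name}: {value}" for name, value in prepared.headers.items()]
    if prepared.body:
        body = prepared.body
        if isinstance(body, bytes):
            body = body.decode("utf-8", errors="replace")
        lines += [">", body]
    return "\n".join(lines)


prepared = requests.Request(
    "POST", "https://httpbin.org/post?debug=1", json={"key": "value"}
).prepare()
print(dump_request(prepared))
```

After a real call, the same helper can be applied to response.request, which holds the PreparedRequest that was actually sent. The sketch assumes a string or bytes body and does not handle streamed uploads.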

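4. Performance

Use requests.Session(): it reuses the underlying TCP connections, which saves connection setup time. The effect is most noticeable when sending many requests to the same host.
Set a timeout: avoids waiting indefinitely on an unresponsive server and makes the program more responsive and robust.
Stream downloads: for large files, use stream=True so the body is not loaded into memory all at once.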
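Connection reuse in a Session is handled by transport adapters, and the pool size can be tuned by mounting an HTTPAdapter. The sketch below only shows the wiring: the values 10 and 20 are illustrative, not recommendations, and no request is sent.

```python
import requests
from requests.adapters import HTTPAdapter

session = requests.Session()

# pool_connections: how many per-host connection pools to cache.
# pool_maxsize: how many connections each pool keeps for reuse.
adapter = HTTPAdapter(pool_connections=10, pool_maxsize=20)
session.mount("https://", adapter)
session.mount("http://", adapter)

# Requests to matching URLs now go through the mounted adapter.
print(session.get_adapter("https://httpbin.org/get") is adapter)  # True
session.close()
```

Larger pools mainly help when one Session is shared across threads that hit the same host concurrently.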

5. User-Agent

When mimicking a browser, setting a suitable User-Agent header is important; it can get you past some simple anti-scraping checks.

```python
import requests

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
}
response = requests.get("https://httpbin.org/user-agent", headers=headers)
print(f"User-Agent echoed back: {response.json().get('user-agent')}")
```

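Part 5: A Practical Example: Simulating an API Request with a Login Session

Suppose we need to call an API that requires login. The steps are:
1. POST to the login endpoint to obtain a session ID or token.
2. Use that session ID or token to call the other protected endpoints.

With cURL:

```bash
# 1. Log in
curl -X POST -d "username=myuser&password=mypass" \
     -c cookies.txt \
     http://example.com/login

# 2. Access the protected resource (using the saved cookies)
curl -b cookies.txt \
     -H "Accept: application/json" \
     http://example.com/api/dashboard
```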

With Python Requests:

```python
import requests

# Placeholder URLs (replace with your real endpoints)
LOGIN_URL = "https://httpbin.org/post"  # httpbin.org/post only echoes the request; it does not really log in
DASHBOARD_URL = "https://httpbin.org/headers"  # httpbin.org/headers echoes the request headers

# 1. Create a Session object
session = requests.Session()

# 2. Prepare the login data
login_payload = {
    "username": "my_test_user",
    "password": "my_secret_password"
}

# 3. Send the login request
print("Trying to log in...")
try:
    login_response = session.post(LOGIN_URL, data=login_payload, timeout=5)
    login_response.raise_for_status()  # raise on 4xx/5xx login responses
    print(f"Login status code: {login_response.status_code}")
    print(f"Login response body: {login_response.json()}")

    # Check whether the session now holds cookies (set by the server on success)
    if session.cookies:
        print(f"Login succeeded! Session cookies: {session.cookies.get_dict()}")
    else:
        print("Login succeeded, but no cookies were found. The API may return a token instead.")
        # For token auth, extract the token from login_response.json() and add it
        # to the headers of the following requests
        # token = login_response.json().get('access_token')
        # session.headers.update({"Authorization": f"Bearer {token}"})

    # 4. Access the protected resource
    print("\nTrying to access the dashboard...")
    # Custom headers such as Accept can be added here
    session.headers.update({"Accept": "application/json"})
    dashboard_response = session.get(DASHBOARD_URL, timeout=5)
    dashboard_response.raise_for_status()  # raise on 4xx/5xx dashboard responses

    print(f"Dashboard status code: {dashboard_response.status_code}")
    print(f"Dashboard response body: {dashboard_response.json()}")

    # Verify that the session cookie or token was sent
    if 'Cookie' in dashboard_response.json().get('headers', {}):
        print("Session cookie sent successfully!")
    elif 'Authorization' in dashboard_response.json().get('headers', {}):
        print("Authorization token sent successfully!")
    else:
        print("No Cookie or Authorization header detected; check the API response and auth mechanism.")

except requests.exceptions.HTTPError as errh:
    print(f"HTTP Error: {errh}")
    print(f"Response Body: {errh.response.text}")
except requests.exceptions.ConnectionError as errc:
    print(f"Error Connecting: {errc}")
except requests.exceptions.Timeout as errt:
    print(f"Timeout Error: {errt}")
except requests.exceptions.RequestException as err:
    print(f"Something went wrong: {err}")
finally:
    session.close()  # remember to close the Session
```

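Summary

This guide showed how to reproduce cURL's features with Python's requests library. Starting from the basic cURL options, we mapped each one to the corresponding requests parameter or method, covering request methods, headers, query parameters, request bodies (form, JSON, raw data), file uploads, cookie handling, authentication, redirects, SSL verification, timeouts, and proxies.

With its "HTTP for Humans" philosophy, requests makes complex HTTP requests simple and intuitive in Python. It covers the same ground as the cURL command-line tool and, more importantly, places those features inside a full programming environment, which makes the following easy:

Automation: write scripts that send requests in bulk and process the results.
API interaction: build API clients for testing and integration.
Web scraping: mimic browser behavior and fetch page content.
Data processing: combine request results with Python's data-handling tools.

With the techniques in this guide you can use requests flexibly for a wide range of web-related tasks. In real applications, keep an eye on error handling, performance, and security best practices to build robust, reliable programs.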
