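curl to Python: A Quick-Start Guide and Practical Tips

Introduction: From the Command Line to Automation

In day-to-day development and system administration, curl is a powerful and widely used command-line tool for sending and receiving HTTP requests. Whether you are testing an API endpoint, downloading a file, or talking to a web service, curl is valued for being concise and flexible. You are probably used to typing commands like this in a terminal:

```bash
curl "https://api.example.com/data" -H "Authorization: Bearer mytoken" -X GET -d '{"param": "value"}'
```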

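Things get harder when the task is more complex, for example when you need to:

* send a series of related requests (such as simulating what a user does after logging in)
* build the next request dynamically from the result of the previous one
* process and analyze large amounts of response data
* integrate with existing scripts or applications
* implement non-trivial error handling and retry logic
* automate repetitive network interactions

At that point, relying on the curl command line alone becomes clumsy and hard to maintain, and it pays to move the same functionality into a more expressive programming language. Python is a good fit thanks to its rich ecosystem, concise syntax, and strong networking libraries. The requests library in particular makes HTTP requests simple and pleasant to write; its own tagline is "HTTP for Humans™".

This article shows how to "translate" the operations you know from curl into Python code, and offers a set of practical tips to help you get productive quickly.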

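Why Use Python for Network Requests?

Turning a curl command into Python code is more than a change of tool. It brings clear advantages:

* Automation and scripting: A Python script can run a whole series of network operations without anyone typing commands, which suits automated testing, data scraping, and scheduled jobs.
* Powerful control flow: With conditionals, loops, and functions you can adapt each request to earlier results or to external conditions.
* Data processing and integration: Python handles data well (for example pandas and the built-in json module), so it is easy to parse, transform, and use what you fetch, and to combine it with other systems or data sources.
* Maintainability and extensibility: Python code is clearly structured and easier to organize, read, and change, which helps team collaboration and long-term maintenance.
* Error handling and robustness: Python's exception handling (try...except) lets a script deal gracefully with network errors, timeouts, and similar problems.
* A rich library ecosystem: Beyond requests, libraries such as BeautifulSoup and Scrapy, along with the standard-library asyncio module, extend what you can do over the network.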

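Getting Ready: Installing the `requests` Library

Although urllib.request in Python's standard library can also send HTTP requests, requests offers a simpler, more intuitive API and is the usual first choice for most HTTP tasks. If it is not installed in your Python environment yet, install it with pip:

```bash
pip install requests
```

Once it is installed, import it in your script and you are ready to go:

```python
import requests
```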

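Core Concept: Mapping `curl` Options to `requests` Parameters

Much of curl's power comes from its command-line options (flags), which control the request method, headers, data, authentication, redirects, and more. Converting curl to Python comes down to understanding what each option does and finding the matching function or parameter in requests.

The sections below cover the most common and important curl options and their requests counterparts.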

1. Specifying the Request Method (`-X`, `--request`)

curl uses -X or --request to set the HTTP method, such as GET, POST, PUT, or DELETE.

```bash
# GET request (the default method, so -X GET can usually be omitted)
curl "https://api.example.com/resource"

# POST request
curl -X POST "https://api.example.com/resource"

# PUT request
curl -X PUT "https://api.example.com/resource/123"
```

In requests, each common HTTP method has its own function: requests.get(), requests.post(), requests.put(), requests.delete(), requests.head(), requests.options(), and so on.

```python
import requests

# GET request
response = requests.get("https://api.example.com/resource")

# POST request
response = requests.post("https://api.example.com/resource")

# PUT request
response = requests.put("https://api.example.com/resource/123")

# DELETE request
response = requests.delete("https://api.example.com/resource/456")
```

Each of these functions takes the URL as its first argument.
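
When the method comes from a variable, as it does with curl -X, the generic requests.request(method, url) function can stand in for the per-method helpers. The sketch below only prepares a request without sending it, so it runs offline; the helper name build_request is hypothetical and the URL is the placeholder used above.

```python
import requests


def build_request(method: str, url: str) -> requests.PreparedRequest:
    """Prepare (but do not send) a request for an arbitrary HTTP method."""
    return requests.Request(method, url).prepare()


prepared = build_request("patch", "https://api.example.com/resource/123")
print(prepared.method, prepared.url)

# To actually send it, either call requests.request("PATCH", url)
# or pass the prepared request to requests.Session().send(prepared).
```

requests upper-cases the method name while preparing the request, so "patch" and "PATCH" behave the same.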

2. Sending Request Headers (`-H`, `--header`)

curl uses -H or --header to add a custom HTTP request header. Repeat -H to add several headers.

```bash
curl "https://api.example.com/data" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer mytoken" \
  -H "User-Agent: MyPythonScript/1.0"
```

In requests, headers are passed through the headers parameter, a dictionary that maps header names to header values.

```python
import requests

url = "https://api.example.com/data"
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer mytoken",
    "User-Agent": "MyPythonScript/1.0",
}

response = requests.get(url, headers=headers)

# The same parameter works for POST, PUT, and the other methods:
# response = requests.post(url, headers=headers, ...)
```

3. Sending a Request Body (`-d`, `--data`, `--data-raw`, `--data-urlencode`, `--json`, `-F`, `--form`)

This is one of the more involved groups of curl options. They send data in POST, PUT, and similar requests.

* -d, --data: sends the data with Content-Type: application/x-www-form-urlencoded, as in a form submission. curl does not encode the value for you, and a value starting with @ is read from a file.
* --data-raw: like -d, but with no special handling of a leading @.
* --data-urlencode: URL-encodes the data before sending it.
* --json: available in curl 7.82.0 and later; sends the data with Content-Type: application/json.
* -F, --form: sends multipart/form-data, commonly used for file uploads.

In requests, how you send the body depends on the kind of data:

Form data (application/x-www-form-urlencoded): use the data parameter with a dict or a list of tuples.

```bash
# Send form data
curl -X POST "https://api.example.com/submit" \
  -d "username=testuser&password=mypassword"

# Or let curl URL-encode each field
curl -X POST "https://api.example.com/submit" \
  --data-urlencode "username=testuser" \
  --data-urlencode "password=mypassword with spaces"
```

```python
import requests

url = "https://api.example.com/submit"

# Using a dict
form_data = {
    "username": "testuser",
    "password": "mypassword",
}
response = requests.post(url, data=form_data)  # requests handles URL encoding and Content-Type

# Using a list of tuples (useful when several fields share a name)
# form_data_list = [("username", "testuser"), ("password", "mypassword with spaces")]
# response = requests.post(url, data=form_data_list)
```

When data is a dict or a list of tuples, requests sets Content-Type: application/x-www-form-urlencoded automatically and URL-encodes the values.
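
To see what a form-encoded body looks like on the wire, the standard library's urlencode can be applied to the same kind of field list. This is an illustrative sketch rather than a dump of what requests sends; the repeated tag field is a made-up example.

```python
from urllib.parse import urlencode

# A list of tuples keeps repeated field names, which a dict cannot express.
fields = [
    ("tag", "python"),
    ("tag", "http"),
    ("password", "mypassword with spaces"),
]

body = urlencode(fields)
print(body)  # spaces become "+", and each tuple becomes one name=value pair
```

Passing the same list as data=fields to requests.post() should yield an equivalent body.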

JSON data (application/json): use the json parameter with a Python dict, list, or any other JSON-serializable object.

```bash
# Send JSON (curl 7.82.0 and later)
curl -X POST "https://api.example.com/jsonendpoint" \
  --json '{"name": "Alice", "age": 30}'

# Send JSON with older curl versions: set Content-Type by hand and use -d or --data-raw
curl -X POST "https://api.example.com/jsonendpoint" \
  -H "Content-Type: application/json" \
  -d '{"name": "Alice", "age": 30}'
```

```python
import json  # only needed for the manual alternative below

import requests

url = "https://api.example.com/jsonendpoint"
json_payload = {
    "name": "Alice",
    "age": 30,
}

# Recommended: the json parameter serializes the payload and sets Content-Type: application/json
response = requests.post(url, json=json_payload)

# Alternative: serialize by hand and use data (not recommended unless you have a special need)
# headers = {"Content-Type": "application/json"}
# response = requests.post(url, data=json.dumps(json_payload), headers=headers)
```

The json parameter is the simplest way to send JSON: requests takes care of both serialization and the Content-Type header.
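
The two approaches can be compared without touching the network by preparing the requests and inspecting them. This is a sketch under the assumption that the endpoint URL is the placeholder used above; nothing is sent.

```python
import json

import requests

url = "https://api.example.com/jsonendpoint"
payload = {"name": "Alice", "age": 30}

# Variant 1: let requests serialize the payload.
via_json = requests.Request("POST", url, json=payload).prepare()

# Variant 2: serialize by hand and set the header yourself.
via_data = requests.Request(
    "POST",
    url,
    data=json.dumps(payload),
    headers={"Content-Type": "application/json"},
).prepare()

print(via_json.headers["Content-Type"])
print(json.loads(via_json.body) == json.loads(via_data.body))
```

Both prepared requests carry the same JSON document; the json parameter simply saves the manual steps.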
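
Raw data (--data-raw): to send a body that is neither form data nor JSON (XML, plain text, and so on), pass a string or bytes object through the data parameter. You normally have to set the Content-Type header yourself in this case.

```bash
# Send XML data (the payload here is a placeholder for your XML document)
curl -X POST "https://api.example.com/xmldata" \
  -H "Content-Type: application/xml" \
  --data-raw 'value'
```

```python
import requests

url = "https://api.example.com/xmldata"
xml_payload = "value"  # placeholder for your XML document
headers = {"Content-Type": "application/xml"}

response = requests.post(url, data=xml_payload, headers=headers)
```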

File uploads (-F, --form): curl uses -F or --form to mimic an HTML form upload (multipart/form-data).

```bash
# Upload a file
curl -X POST "https://api.example.com/upload" \
  -F "file=@/path/to/your/file.txt" \
  -F "description=A document file"
```

The @ prefix tells curl to read the content of a local file.

In requests, file uploads go through the files parameter, a dict whose keys are form field names and whose values can be:
* an open file object
* a tuple (filename, file_object)
* a tuple (filename, file_content)
* a tuple with a content type: (filename, file_content, content_type)
* a tuple with a content type and extra headers: (filename, file_content, content_type, headers)

The most common forms are (filename, file_object) and (filename, file_content).

```python
import os

import requests

url = "https://api.example.com/upload"
file_path = "/path/to/your/file.txt"

# Option 1: pass an open file object (the simplest form)
try:
    with open(file_path, "rb") as f:
        files = {
            # (filename, file object); add a third element such as "text/plain" to set the content type
            "file": (os.path.basename(file_path), f),
            # a plain field can go in files as (None, value)
            "description": (None, "A document file"),
        }
        response = requests.post(url, files=files)
except FileNotFoundError:
    print(f"Error: File not found at {file_path}")
except Exception as e:
    print(f"An error occurred: {e}")

# Option 2: pass content you have already read, with a custom file name and content type
# with open(file_path, "rb") as f:
#     file_content = f.read()
# files = {
#     "file": ("my_uploaded_file.txt", file_content, "text/plain"),
#     "description": (None, "A document file"),
# }
# response = requests.post(url, files=files)
```

With files, requests sets Content-Type: multipart/form-data (including the boundary) and builds the body for you. Non-file fields, like those passed with curl -F, can either go in the files dict as (None, field_value) tuples or be passed separately through the data parameter. Note that requests assembles the multipart body in memory in both options above; for streaming very large uploads, the requests documentation points to the separate requests-toolbelt package.
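
The multipart body that requests builds can be inspected offline by preparing the request. In this sketch an in-memory io.BytesIO object stands in for a real file (an assumption made so the example needs no file on disk), and the plain field travels through data rather than files.

```python
import io

import requests

url = "https://api.example.com/upload"
fake_file = io.BytesIO(b"hello from an in-memory file")  # stands in for open(path, "rb")

prepared = requests.Request(
    "POST",
    url,
    data={"description": "A document file"},              # plain form field
    files={"file": ("file.txt", fake_file, "text/plain")},  # file part
).prepare()

content_type = prepared.headers["Content-Type"]
print(content_type.split(";")[0])                    # media type without the boundary
print(b'name="description"' in prepared.body)        # the plain field is one part
print(b'filename="file.txt"' in prepared.body)       # the file is another part
```

The boundary string is generated for each request, which is why the Content-Type header should be left for requests to set.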

4. Sending URL Parameters (`-G`, `--get`)

With curl, GET parameters usually go straight into the URL. For convenience, -G or --get makes curl append the data given with -d to the URL as a query string instead of sending it as a body.

```bash
# GET request with parameters in the URL
curl "https://api.example.com/items?category=electronics&status=instock"

# Use -G to turn -d data into query parameters
curl -G "https://api.example.com/items" -d "category=electronics" -d "status=instock"
```

In requests, query parameters are passed through the params parameter, typically a dict.

```python
import requests

url = "https://api.example.com/items"
params = {
    "category": "electronics",
    "status": "instock",
}

response = requests.get(url, params=params)

# requests builds the final URL:
# https://api.example.com/items?category=electronics&status=instock
```

Using params is the recommended approach: requests URL-encodes the values for you, which is safer and more reliable than concatenating strings by hand.
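
The encoding step is easiest to appreciate with values that contain reserved characters. The sketch below prepares a request without sending it and prints the resulting URL; the parameter values are made up for illustration.

```python
from urllib.parse import parse_qs, urlsplit

import requests

prepared = requests.Request(
    "GET",
    "https://api.example.com/items",
    # "&" and spaces need encoding; a list value repeats the key
    params={"category": "home & garden", "tag": ["new", "sale"]},
).prepare()

print(prepared.url)

# Decoding the query string gives back the original values.
decoded = parse_qs(urlsplit(prepared.url).query)
print(decoded)
```

A hand-built URL with an unescaped "&" in the value would have been split into two parameters by the server.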

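5. Handling Authentication (`-u`, `--user`, `--basic`, `--digest`, `--negotiate`, `--ntlm`)

curl uses -u or --user to supply a user name and password for HTTP authentication. The default scheme is Basic.

```bash
# Basic authentication
curl -u "username:password" "https://api.example.com/protected"
```

In requests, Basic authentication is straightforward: pass a (username, password) tuple through the auth parameter.

```python
import requests

url = "https://api.example.com/protected"
auth = ("username", "password")

response = requests.get(url, auth=auth)

# requests base64-encodes the credentials and adds the Authorization: Basic ... header
```

requests also supports other schemes such as Digest authentication; the corresponding classes live in the requests.auth module. More complex flows such as OAuth2 usually need an additional library or a hand-built sequence of requests.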
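
The header that Basic authentication produces can be checked offline, and the Digest variant differs only in the object passed as auth. A small sketch, using the placeholder credentials from above and sending nothing:

```python
import base64

import requests
from requests.auth import HTTPDigestAuth

url = "https://api.example.com/protected"

# Basic: a (username, password) tuple becomes an Authorization header.
prepared = requests.Request("GET", url, auth=("username", "password")).prepare()
expected = "Basic " + base64.b64encode(b"username:password").decode("ascii")
print(prepared.headers["Authorization"] == expected)

# Digest (curl --digest): pass an auth object instead of a tuple.
digest_auth = HTTPDigestAuth("username", "password")
# response = requests.get(url, auth=digest_auth)
```

Because Basic credentials are only base64-encoded, not encrypted, they should travel over HTTPS.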

6. Handling Cookies (`-b`, `--cookie`, `-c`, `--cookie-jar`)

curl uses -b or --cookie to send cookies, and -c or --cookie-jar to save the cookies from a response to a file.

```bash
# Send cookies
curl -b "sessionid=abc123; othercookie=def456" "https://api.example.com/profile"

# Save response cookies to a file, then send them with the next request
curl -c cookies.txt "https://api.example.com/login" -d "..."
curl -b cookies.txt "https://api.example.com/profile"
```

In requests, you send cookies by passing a dict through the cookies parameter. Cookies set by the response are available in response.cookies (a RequestsCookieJar instance), which you can read or reuse in later requests.

```python
import requests

url = "https://api.example.com/profile"

# Send cookies
cookies_to_send = {
    "sessionid": "abc123",
    "othercookie": "def456",
}
response = requests.get(url, cookies=cookies_to_send)

# Read cookies from the response
print(response.cookies.get("sessionid"))
print(response.cookies)  # a RequestsCookieJar object

# Reuse response cookies in a later request (a Session is the better tool, see below),
# for example by passing response.cookies along by hand:
# next_response = requests.get("https://api.example.com/another_page", cookies=response.cookies)
```

When cookie state has to persist across several requests (for example a logged-in session), use a requests.Session() object. It manages cookies and the connection pool for you and simplifies the code considerably.
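
curl -c writes cookies in the Netscape cookie file format, which the standard library's http.cookiejar.MozillaCookieJar also reads and writes. The sketch below saves one cookie to a temporary file and loads it into a Session, mirroring the -c / -b pair; the cookie itself is made up, and how well files written by curl load may depend on your Python version (for instance lines carrying an HttpOnly prefix).

```python
import http.cookiejar
import os
import tempfile

import requests
from requests.cookies import create_cookie

cookie_file = os.path.join(tempfile.mkdtemp(), "cookies.txt")

# Save: roughly what `curl -c cookies.txt` does after a response sets cookies.
jar = http.cookiejar.MozillaCookieJar(cookie_file)
jar.set_cookie(create_cookie("sessionid", "abc123", domain="api.example.com", path="/"))
jar.save(ignore_discard=True, ignore_expires=True)

# Load: roughly what `curl -b cookies.txt` does before the next request.
loaded = http.cookiejar.MozillaCookieJar(cookie_file)
loaded.load(ignore_discard=True, ignore_expires=True)

session = requests.Session()
session.cookies.update(loaded)  # copy the file's cookies into the session
print(session.cookies.get("sessionid"))
```

The ignore_discard flag matters here because session cookies without an expiry date are otherwise skipped when saving and loading.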

7. Following Redirects (`-L`, `--location`)

By default curl does not follow HTTP redirects (3xx status codes). The -L or --location option tells curl to follow them until it reaches the final resource.

```bash
# Follow redirects
curl -L "http://shorturl.example.com/redirect_me"
```

requests follows redirects by default, so you usually need no extra setting.

```python
import requests

url = "http://shorturl.example.com/redirect_me"
response = requests.get(url)  # follows redirects by default

# To disable redirect following, set allow_redirects=False:
# response = requests.get(url, allow_redirects=False)
```

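8. Skipping SSL Certificate Verification (`-k`, `--insecure`)

curl uses -k or --insecure to skip SSL/TLS certificate verification. This can be useful when testing an internal service with a self-signed certificate, but it is strongly discouraged in production because it leaves the connection open to man-in-the-middle attacks.

```bash
# Skip SSL verification (warning: insecure!)
curl -k "https://untrusted-cert.example.com/"
```

In requests, the verify parameter controls certificate verification. It defaults to True (verify the certificate); setting it to False skips verification.

```python
import requests
import urllib3

url = "https://untrusted-cert.example.com/"

# Skip SSL verification (warning: insecure!)
response = requests.get(url, verify=False)

# With verification disabled, urllib3 emits an InsecureRequestWarning on each request.
# You can silence it through urllib3 before repeating the call:
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
response = requests.get(url, verify=False)
```

Important: unless you know exactly what you are doing and are working in a controlled, safe environment, never set verify=False in production code. If you run into certificate problems, the right fix is to trust the correct certificate authority (verify also accepts the path to a CA bundle file) or to install a valid certificate on the server.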

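9. Setting Timeouts (`--connect-timeout`, `--max-time`)

curl uses --connect-timeout to limit how long connection setup may take, and --max-time to cap the whole operation (connecting, sending, and receiving).

```bash
# Set a connect timeout and an overall time limit
curl --connect-timeout 5 --max-time 10 "https://slow-api.example.com/"
```

In requests, use the timeout parameter. It can be a single value, applied to both the connect timeout and the read timeout, or a tuple (connect_timeout, read_timeout). The read timeout limits how long requests waits for the server to send data, not the total duration of the transfer, so requests has no direct equivalent of --max-time. Without a timeout, requests waits indefinitely.

```python
import requests

url = "https://slow-api.example.com/"

# Use 10 seconds for both the connect timeout and the read timeout
try:
    response = requests.get(url, timeout=10)
except requests.exceptions.Timeout:
    print("Request timed out")
except requests.exceptions.RequestException as e:
    print(f"An error occurred: {e}")

# Use a 5-second connect timeout and a 10-second read timeout
try:
    response = requests.get(url, timeout=(5, 10))
except requests.exceptions.ConnectTimeout:
    print("Connection timed out")
except requests.exceptions.ReadTimeout:
    print("Read timed out")
except requests.exceptions.RequestException as e:
    print(f"An error occurred: {e}")
```

Setting a sensible timeout is an important part of writing robust network code, because it keeps the program from hanging on an unresponsive server.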
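
An overall limit in the spirit of --max-time can be approximated by streaming the body and checking a deadline between chunks. The helper below is a hypothetical sketch, not a requests feature: it only checks the clock when a chunk arrives, so a stalled read is still bounded by the read timeout rather than by the deadline, and in the commented usage the clock starts once the response headers have arrived.

```python
import time


def read_with_deadline(chunks, max_seconds):
    """Join byte chunks, raising TimeoutError once max_seconds have elapsed."""
    deadline = time.monotonic() + max_seconds
    parts = []
    for chunk in chunks:
        if time.monotonic() > deadline:
            raise TimeoutError(f"transfer exceeded {max_seconds} seconds")
        parts.append(chunk)
    return b"".join(parts)


# Hypothetical usage with requests (not executed here):
# with requests.get(url, stream=True, timeout=(5, 10)) as response:
#     body = read_with_deadline(response.iter_content(chunk_size=8192), max_seconds=10)

# Offline demonstration with a plain list of chunks:
print(read_with_deadline([b"ab", b"cd"], max_seconds=5))
```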

10. Using a Proxy (`-x`, `--proxy`)

curl uses -x or --proxy to specify a proxy.

```bash
# Use a proxy
curl -x "http://myproxy.com:8080" "https://api.example.com/"

# Proxy with authentication
curl -x "http://user:password@myproxy.com:8080" "https://api.example.com/"
```

In requests, use the proxies parameter: a dict whose keys are URL schemes (such as http or https) and whose values are proxy addresses.

```python
import requests

url = "https://api.example.com/"
proxies = {
    "http": "http://myproxy.com:8080",
    "https": "http://myproxy.com:8080",  # https requests also go through the http proxy (CONNECT method)
    # "https": "https://myproxy.com:8081",  # if the proxy itself accepts https connections
}

# Proxy with authentication
# proxies = {
#     "http": "http://user:password@myproxy.com:8080/",
#     "https": "http://user:password@myproxy.com:8080/",
# }

response = requests.get(url, proxies=proxies)
```

11. Saving the Response to a File (`-o`, `-O`)

curl uses -o to save the response body to a named file, and -O to save it under a file name taken from the URL.

```bash
# Save to a named file
curl -o output.html "https://example.com/"

# Save under the name from the URL (here, logo.png)
curl -O "https://example.com/images/logo.png"
```

In requests, you read the response body and write it to a file yourself.

```python
import requests

url = "https://example.com/images/logo.png"
file_path = "logo.png"  # or any file name you want to save to

try:
    # stream=True avoids loading a large file into memory in one go
    response = requests.get(url, stream=True)
    response.raise_for_status()  # check that the request succeeded

    with open(file_path, "wb") as f:  # "wb" because the data is binary
        for chunk in response.iter_content(chunk_size=8192):  # write chunk by chunk
            if chunk:  # filter out keep-alive chunks
                f.write(chunk)

    print(f"Successfully downloaded {url} to {file_path}")
except requests.exceptions.RequestException as e:
    print(f"Download failed: {e}")
```

Combining stream=True with iter_content() is the recommended way to download large files, because it keeps memory use under control.
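
The file-name choice that -O makes can be imitated with the standard library by taking the last path segment of the URL. This is a sketch of one reasonable approach, not a reproduction of curl's exact rules; the helper name and the index.html fallback are assumptions.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def filename_from_url(url, default="index.html"):
    """Return the last path segment of url, or default when the path ends in '/'."""
    path = urlsplit(url).path            # drops the query string and fragment
    name = posixpath.basename(unquote(path))
    return name or default


print(filename_from_url("https://example.com/images/logo.png?size=2"))
print(filename_from_url("https://example.com/"))
```

The result can be used as file_path in the download code above.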

12. Inspecting Request and Response Details (`-v`, `--verbose`)

curl's -v or --verbose option prints detailed information about the request and the response, including headers and the SSL handshake.

requests has no single option equivalent to -v that prints everything, but you can get similar information in these ways:

* Request details: the response.request object exposes the headers, method, URL, and body of the request that was sent.
* Response details: the response object has attributes such as response.status_code, response.headers, and response.text (or response.content for binary data).
* Low-level logs: requests is built on urllib3, and you can configure Python's logging module to show urllib3's detailed connection messages.

```python
import http.client
import logging

import requests

# Option 1: inspect the response and response.request objects
url = "https://example.com/"
response = requests.get(url)

print("--- Request Details ---")
print(f"Method: {response.request.method}")
print(f"URL: {response.request.url}")
print(f"Headers: {response.request.headers}")
# print(f"Body: {response.request.body}")  # the body may be large or empty

print("\n--- Response Details ---")
print(f"Status Code: {response.status_code}")
print(f"Headers: {response.headers}")
# print(f"Body (first 200 chars): {response.text[:200]}...")

# Option 2: turn on debug logging (closer to the detail of curl -v); uncomment to enable
# http.client.HTTPConnection.debuglevel = 1  # prints the raw request and response headers
# logging.basicConfig()
# logging.getLogger().setLevel(logging.DEBUG)
# urllib3_log = logging.getLogger("urllib3")
# urllib3_log.setLevel(logging.DEBUG)
# urllib3_log.propagate = True
# response = requests.get(url)  # debug output now appears on the console
# http.client.HTTPConnection.debuglevel = 0  # switch it off again when done
```

In most cases, inspecting the attributes of the response object gives you everything you need. Low-level logging is for deeper debugging.
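
For a quick look that resembles the outgoing lines of curl -v, a prepared request can be rendered as text before it is sent. The helper below is a hypothetical sketch: the "> " prefix imitates curl's output, the HTTP/1.1 label is an assumption, and headers added later by the transport layer (such as Host) do not appear.

```python
import requests


def format_request(prepared):
    """Render a prepared request roughly the way `curl -v` prints outgoing lines."""
    lines = [f"> {prepared.method} {prepared.path_url} HTTP/1.1"]
    lines += [f"> {name}: {value}" for name, value in prepared.headers.items()]
    return "\n".join(lines)


prepared = requests.Request(
    "GET",
    "https://example.com/search",
    params={"q": "curl"},
    headers={"Accept": "application/json"},
).prepare()

print(format_request(prepared))
```

The same function also accepts response.request, which is the prepared request that was actually sent.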

Practical Tips and Advanced Usage

With the basic curl-to-requests mappings in hand, here are some further tips and more advanced requests features that make your Python network code more efficient and more robust.

1. Using `requests.Session()`

When you talk to the same host repeatedly and want to keep state such as cookies and pooled connections, a requests.Session() object is the best practice.

```python
import requests

# Create a Session object; the with statement closes its connections on exit
with requests.Session() as session:
    # Log in (assuming a successful login sets an authentication cookie)
    login_url = "https://api.example.com/login"
    login_data = {"username": "test", "password": "pwd"}
    session.post(login_url, data=login_data)

    # Later requests automatically send the cookies stored in the session
    profile_url = "https://api.example.com/profile"
    response1 = session.get(profile_url)
    print(f"Profile status: {response1.status_code}")

    # Another request, again with the cookies attached
    orders_url = "https://api.example.com/orders"
    response2 = session.get(orders_url)
    print(f"Orders status: {response2.status_code}")

    # A Session can also hold default headers, parameters, and authentication,
    # which apply to every request sent through it
    session.headers.update({"X-My-Custom-Header": "SessionValue"})
    session.auth = ("session_user", "session_pwd")
    session.params.update({"common_param": "value"})

    # This request carries the header, credentials, and URL parameter set above
    response3 = session.get("https://api.example.com/some_endpoint")
    print(f"Endpoint status: {response3.status_code}")
```

Benefits of using a Session:
* Cookie persistence: cookie state is kept between requests without manual handling.
* Connection pooling: the underlying TCP connections are reused, which cuts connection setup overhead.
* Default settings: shared headers, authentication, and parameters are set once for a whole series of requests.

2. Checking Status Codes and Handling Errors

The response object exposes the HTTP status code through its status_code attribute. Always check it to make sure the request succeeded.

```python
import requests

url = "https://api.example.com/sometimes_fails"

try:
    response = requests.get(url)

    # Check the status code by hand
    if response.status_code == 200:
        print("Request successful!")
        print(response.text)
    elif response.status_code == 404:
        print("Error: Resource not found.")
    elif response.status_code == 401:
        print("Error: Authentication required.")
    else:
        print(f"Error: Received status code {response.status_code}")
        print(response.text)  # the body may contain an error message

    # requests also offers the shortcut raise_for_status(): it raises HTTPError
    # for 4xx and 5xx responses, which pairs well with try...except
    response.raise_for_status()
    print("Request successful (using raise_for_status)!")
    print(response.text)

except requests.exceptions.RequestException as e:
    # RequestException is the base class of all requests exceptions,
    # including ConnectTimeout, ReadTimeout, and HTTPError
    print(f"An error occurred during the request: {e}")
```

Combining response.raise_for_status() with try...except requests.exceptions.RequestException handles most of the errors you are likely to meet: connection problems, timeouts, and 4xx or 5xx status codes.

3. Parsing Response Data (JSON, Text, Binary)

The response object offers several ways to read the body:

* response.text: the body as a Unicode string. requests infers the encoding from the response headers (the charset in Content-Type).
* response.content: the body as raw bytes. Use it for binary data (images, files) or when response.text has encoding problems.
* response.json(): parses the body as JSON and returns a Python dict or list; it is meant for responses whose Content-Type is application/json or a similar JSON type. If the body is not valid JSON it raises an exception: requests.exceptions.JSONDecodeError in recent versions, json.JSONDecodeError in older ones. Both are subclasses of ValueError.

```python
import requests

# Assume this API returns JSON
json_url = "https://api.example.com/items/123"

try:
    response = requests.get(json_url)
    response.raise_for_status()  # make sure the request succeeded
except requests.exceptions.HTTPError as e:
    print(f"HTTP error occurred: {e}")
except requests.exceptions.RequestException as e:
    print(f"Request error occurred: {e}")
else:
    try:
        data = response.json()  # parse the JSON body
    except ValueError:  # covers both JSONDecodeError variants
        print("Error: Response is not valid JSON.")
        print("Response content:", response.text)  # show the non-JSON body for debugging
    else:
        print("Received JSON data:")
        print(type(data))  # usually dict or list
        print(data)
        print(f"Item name: {data.get('name', 'N/A')}")

# Assume this page returns plain text
text_url = "https://example.com/"
response = requests.get(text_url)
print("\nReceived Text content (first 200 chars):")
print(response.text[:200] + "...")

# Assume this URL returns an image
image_url = "https://www.python.org/static/community_logos/python-logo-master-v3-TM.png"
response = requests.get(image_url)
response.raise_for_status()
print("\nReceived Binary content (first 10 bytes):")
print(response.content[:10])  # .content holds the whole body in memory

# Save the image to a file (for large files, use stream=True with iter_content() instead)
# with open("python_logo.png", "wb") as f:
#     f.write(response.content)
# print("Image saved as python_logo.png")
```

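4. Managing Secrets

Putting API keys, user names, and passwords directly into a curl command is risky, because they can end up in shell history or in script files. In Python, manage such secrets more safely, for example with:

* Environment variables: read the sensitive value from the environment.
* Configuration files: use a dedicated configuration file (and make sure it is never committed to version control) or a configuration management library.
* Secret management services: in production, use the secret management solution offered by your cloud provider.

```python
import os

import requests

# Read the API key from an environment variable and stop if it is missing
api_key = os.getenv("MY_API_KEY")
if not api_key:
    raise SystemExit("Error: MY_API_KEY environment variable not set.")

url = "https://api.example.com/secured_data"
headers = {
    "Authorization": f"Bearer {api_key}",
}

try:
    response = requests.get(url, headers=headers)
    response.raise_for_status()
    print("Data:", response.json())
except requests.exceptions.RequestException as e:
    print(f"Request failed: {e}")
```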

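Tools That Convert `curl` Commands to Python Code

Converting curl commands by hand is a good way to understand HTTP requests and the requests library, but for complex commands it is slow and error-prone. Online tools can automate the process:

* curlconverter.com: a popular online tool. Paste a curl command and it generates code for Python requests, Node.js, PHP, and other languages.

Such tools are a good starting point, but the generated code is not always optimal and may not use the more advanced requests features such as Session. Treat the output as a reference and refine it with the techniques from this article.

Summary: Putting Python's Networking Power to Work

Moving from curl on the command line to Python means moving from sending individual requests to building network programs that are complex, smart, and automatable. The requests library lowers the barrier to HTTP in Python considerably, and its concise API makes porting curl commands intuitive and efficient.

This article showed how common curl options map to requests parameters: specifying the method, setting headers and the body, and handling authentication, cookies, redirects, timeouts, and proxies. It also covered practical techniques such as using requests.Session(), handling errors, parsing responses in different formats, and managing secrets safely.

With this knowledge you can convert existing curl commands into Python and go further, using Python to handle more complex business logic and to build robust network clients for data collection, API integration, automated testing, and other network-related work.

Practice is the best teacher. Convert the curl commands you use most, apply these techniques in real projects, and working with Python and requests will soon feel natural.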
