Running curl Commands from Python and Printing the Output Line by Line: A Detailed Tutorial

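In network programming and automation work we often use curl to send HTTP requests and fetch web pages, API data, and so on. curl is powerful on its own, but sometimes we need to run it from a Python script and process its output line by line, for example to monitor a request as it progresses or to parse the JSON it returns. This article explains how to run curl from Python and print its output line by line, covering several approaches and some best practices.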

I. Why Run curl Commands from Python?

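- Integration: the request logic lives inside the Python script, so it combines easily with existing code and automated workflows.
- Flexibility: Python's libraries and tools can post-process curl's output in many ways, such as parsing data, handling errors, and writing logs.
- Maintainability: wrapping a complex curl command in a Python function makes the code easier to read and maintain.
- Real-time feedback: reading curl's output line by line lets you watch a request as it runs and spot problems early.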

II. Common Ways to Run curl from Python

Python offers several ways to run external commands, including os.system, os.popen, subprocess.run, and subprocess.Popen. When curl's output has to be handled line by line as it arrives, subprocess.Popen is the best fit.

1. Using os.system (not recommended)

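os.system runs a command through the system shell, but it returns only the exit status. The command's output goes straight to the terminal and cannot be captured, so this approach is unsuitable for line-by-line processing.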

```python
import os

# The output is written directly to the terminal; only the exit status is returned.
command = "curl https://www.example.com"
os.system(command)
```

2. Using os.popen (not recommended)

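os.popen runs a command and returns a file-like object from which the output can be read. Its error handling and resource management, however, are weaker than those of the subprocess module.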

```python
import os

command = "curl https://www.example.com"
process = os.popen(command)
# Iterate over the command's standard output one line at a time.
for line in process:
    print(line.strip())
process.close()
```

3. Using subprocess.run (not recommended for line-by-line output)

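subprocess.run was added in Python 3.5 and makes it easy to run a command and collect its result. However, it waits for the command to finish and reads all of its output into memory at once, so it is a poor fit for large outputs or for output that must be shown in real time.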

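```python
import subprocess

command = "curl https://www.example.com"
# capture_output and text require Python 3.7 or later.
result = subprocess.run(command, shell=True, capture_output=True, text=True)
# The output is only available after the command has finished.
for line in result.stdout.splitlines():
    print(line.strip())
```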

4. Using subprocess.Popen (recommended)

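subprocess.Popen starts a child process to run the command and lets us read its output through a pipe while the command is still running. This makes line-by-line output possible and gives finer control over the process.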

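```python
import subprocess

command = "curl https://www.example.com"
process = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)

# Read standard output line by line
for line in process.stdout:
    print(line.strip())

# Read standard error line by line (optional).
# Note: stderr is drained only after stdout is exhausted, so a command that
# writes a lot to stderr could block before this loop is reached.
# for line in process.stderr:
#     print(f"Error: {line.strip()}")

process.wait()  # Wait for the child process to exit
```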
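The loop above can be wrapped in a small reusable generator. What follows is a minimal sketch rather than a definitive implementation: the name `stream_lines` is hypothetical, the command is passed as an argument list so no shell is involved, and stderr is merged into stdout so that only one pipe has to be drained.

```python
import subprocess
from typing import Iterator, Sequence


def stream_lines(argv: Sequence[str]) -> Iterator[str]:
    """Yield the command's output one line at a time while it is running."""
    with subprocess.Popen(
        list(argv),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout: one pipe to read
        text=True,
        bufsize=1,  # line-buffered on the Python side
    ) as process:
        assert process.stdout is not None
        for line in process.stdout:
            yield line.rstrip("\n")
    # Leaving the with-block closes the pipe and waits for the child to exit.
    if process.returncode != 0:
        raise subprocess.CalledProcessError(process.returncode, list(argv))
```

With curl this could be used as `for line in stream_lines(["curl", "--silent", "https://www.example.com"]): print(line)`. Raising on a non-zero exit code is a design choice for this sketch; returning the code to the caller would work equally well.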

III. subprocess.Popen in Detail

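subprocess.Popen(command, ...): starts a new process that runs command.
- command: the command to run, given as a string or as a list of arguments. A string that contains arguments needs shell=True; on POSIX systems, a string without shell=True is treated as the program name alone.
- shell=True: runs the command through the shell. This is useful when the command relies on shell features such as pipes or redirection, but it carries a security risk, so never pass untrusted input this way.
- stdout=subprocess.PIPE: connects the child's standard output to a pipe, which can be read through process.stdout.
- stderr=subprocess.PIPE: connects the child's standard error to a pipe, which can be read through process.stderr.
- text=True: opens the pipes in text mode, so bytes are decoded to strings automatically (available since Python 3.7).
process.stdout: a file object for reading the child's standard output.
process.stderr: a file object for reading the child's standard error.
process.wait(): waits for the child process to exit and returns its exit code.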
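When both stdout and stderr are connected to pipes, reading one to the end before touching the other can stall if the child fills the unread pipe. One common way around this is to drain stderr in a background thread. The sketch below illustrates the idea; the function name `run_and_collect` is hypothetical, and it collects lines into lists only to keep the example short.

```python
import subprocess
import threading
from typing import List, Sequence, Tuple


def run_and_collect(argv: Sequence[str]) -> Tuple[int, List[str], List[str]]:
    """Run argv and return (exit code, stdout lines, stderr lines)."""
    process = subprocess.Popen(
        list(argv), stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    err_lines: List[str] = []

    def drain_stderr() -> None:
        # Runs in a separate thread so stderr never fills up unread.
        for line in process.stderr:
            err_lines.append(line.rstrip("\n"))

    reader = threading.Thread(target=drain_stderr, daemon=True)
    reader.start()

    out_lines: List[str] = []
    for line in process.stdout:  # each line arrives as soon as it is written
        out_lines.append(line.rstrip("\n"))

    reader.join()
    returncode = process.wait()
    process.stdout.close()
    process.stderr.close()
    return returncode, out_lines, err_lines
```

If the output does not need to be processed while the command runs, `process.communicate()` handles both pipes safely and is simpler.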

IV. Example: Fetching JSON with curl and Printing Fields Line by Line

Suppose we need to fetch JSON from an API endpoint and print some of its fields, one per line.

```python
import subprocess
import json

command = "curl https://jsonplaceholder.typicode.com/todos/1"  # Replace with your API URL
process = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)

try:
    # Read all of standard output
    output = process.stdout.read()
    # Parse the JSON data
    data = json.loads(output)

    # Print the selected fields, one per line
    print(f"userId: {data['userId']}")
    print(f"id: {data['id']}")
    print(f"title: {data['title']}")
    print(f"completed: {data['completed']}")

except json.JSONDecodeError as e:
    print(f"Error decoding JSON: {e}")
except Exception as e:
    print(f"An error occurred: {e}")
finally:
    # Make sure the child process has exited
    process.wait()
```

V. Example: Monitoring Download Progress in Real Time

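When downloading a file, curl can display a progress bar through its -# (--progress-bar) option. The progress information is produced by curl itself, and a Python script can read it while the download is running.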

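```python
import subprocess

command = "curl -# -o output.file https://example.com/large_file.zip"  # Replace with your download URL
process = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)

# Read standard error line by line (curl writes its progress information to stderr).
# In text mode a carriage return also ends a line, so each bar update arrives separately.
for line in process.stderr:
    print(line.strip())

process.wait()
```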
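To turn the progress lines into numbers, the percentage can be extracted with a regular expression. This is only a sketch: it assumes each progress update ends with a percentage such as `42.5%`, which should be checked against the curl version in use, and the helper name `parse_percent` is hypothetical.

```python
import re
from typing import Optional

# Assumption: a progress update contains a percentage like " 42.5%".
_PERCENT = re.compile(r"(\d{1,3}(?:\.\d+)?)%")


def parse_percent(line: str) -> Optional[float]:
    """Return the last percentage found in a progress line, or None."""
    matches = _PERCENT.findall(line)
    return float(matches[-1]) if matches else None
```

Inside the loop above, `percent = parse_percent(line)` would then give a float to log or compare, or `None` for lines that carry no percentage.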

VI. Error and Exception Handling

Running curl can fail in many ways: network errors, API errors, JSON parsing errors, and so on. Handling these errors properly keeps the program robust.

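- Check the exit code: process.wait() returns the child's exit code. An exit code of 0 means the command succeeded; any other value means it failed.
- Read standard error: the child's standard error (process.stderr) usually contains the error message, so read and print it when debugging.
- Use try...except blocks: catch the exceptions that can occur, such as json.JSONDecodeError when parsing the response. Note that subprocess.CalledProcessError is raised by subprocess.run(..., check=True), subprocess.check_call, and subprocess.check_output on a non-zero exit code; Popen itself does not raise it.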
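The three points above can be combined in one helper. The following is a minimal sketch under stated assumptions: `run_checked` is a hypothetical name, it uses subprocess.run because the full output is wanted only after the command finishes, and the 30-second timeout is an arbitrary default.

```python
import subprocess
from typing import Sequence


def run_checked(argv: Sequence[str], timeout: float = 30.0) -> str:
    """Run argv and return its stdout, or raise RuntimeError with details."""
    try:
        completed = subprocess.run(
            list(argv),
            capture_output=True,
            text=True,
            check=True,  # raise CalledProcessError on a non-zero exit code
            timeout=timeout,
        )
    except FileNotFoundError as exc:
        raise RuntimeError(f"command not found: {argv[0]}") from exc
    except subprocess.TimeoutExpired as exc:
        raise RuntimeError(f"timed out after {timeout} seconds") from exc
    except subprocess.CalledProcessError as exc:
        # exc.stderr holds whatever the command wrote to standard error.
        raise RuntimeError(f"exit code {exc.returncode}: {exc.stderr.strip()}") from exc
    return completed.stdout
```

For curl, a call such as `run_checked(["curl", "--fail", "--silent", "--show-error", url])` makes HTTP errors show up as a non-zero exit code with a message on stderr, because `--fail` turns HTTP error responses into failures and `--show-error` keeps the message visible despite `--silent`.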

VII. Security Considerations

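- Avoid shell=True: use it only when shell features are really required. If you must use it, make sure the command string comes from a trusted source so that no malicious code can run.
- Validate input: if the command contains user input, validate and escape it to prevent command injection.
- Limit privileges: run the Python script with the least privileges it needs to reduce the impact of any security problem.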
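As an illustration of the first two points, user input can be validated and then passed to curl as a single list element, so the shell never sees it. This is a sketch, not a complete defense: the helper name `build_curl_argv` and the allowed-scheme set are assumptions made for the example.

```python
from typing import List
from urllib.parse import urlparse

# Assumption for this example: only plain web URLs are acceptable.
ALLOWED_SCHEMES = {"http", "https"}


def build_curl_argv(url: str) -> List[str]:
    """Validate a user-supplied URL and return an argument list for curl."""
    parsed = urlparse(url)
    if parsed.scheme not in ALLOWED_SCHEMES or not parsed.netloc:
        raise ValueError(f"refusing to fetch untrusted URL: {url!r}")
    # The URL is one list element, so shell metacharacters in it are inert.
    return ["curl", "--fail", "--silent", "--show-error", "--url", url]
```

The resulting list can be passed directly to subprocess.Popen or subprocess.run without shell=True. Requiring an http or https scheme also rules out input that begins with a dash and would otherwise be read as a curl option.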

VIII. More Advanced Usage

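Parse the command string with shlex.split(): shlex.split() splits a command string into a list of arguments using shell-like quoting rules, so the command can be run without shell=True and its security risks.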

```python
import subprocess
import shlex

command = "curl 'https://www.example.com?param=value with space'"
# The quoted URL stays together as a single list element.
command_list = shlex.split(command)
process = subprocess.Popen(command_list, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
```
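To see what shlex.split() produces, it helps to print the resulting list. The short example below uses a made-up curl command with a quoted header, and also shows shlex.quote(), which does the reverse job of making one value safe to embed in a shell string.

```python
import shlex

command = "curl -H 'Accept: application/json' https://www.example.com"
argv = shlex.split(command)
print(argv)
# ['curl', '-H', 'Accept: application/json', 'https://www.example.com']

# shlex.quote wraps a value so the shell treats it as one word.
quoted = shlex.quote("value with space")
print(quoted)
# 'value with space'
```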
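Use the third-party library curl_cffi: curl_cffi is a cffi-based binding for curl. It exposes curl's functionality to Python directly, without spawning a separate curl process, and aims to offer better performance and a more Pythonic API.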

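```python
from io import BytesIO

from curl_cffi import Curl, CurlInfo, CurlOpt

buffer = BytesIO()
curl = Curl()
curl.setopt(CurlOpt.URL, b"https://www.example.com")
curl.setopt(CurlOpt.WRITEDATA, buffer)  # collect the response body in memory
curl.perform()

response_code = curl.getinfo(CurlInfo.RESPONSE_CODE)
curl.close()

content = buffer.getvalue().decode("utf-8")
print(response_code)
print(content)
```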
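Use the requests library: although this article focuses on the curl command, requests is often a more convenient and safer alternative. It offers a higher-level API for all kinds of HTTP requests and takes care of details such as encoding and cookies automatically.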

```python
import requests

response = requests.get("https://www.example.com")
response.raise_for_status()  # Raise an exception if the request failed
print(response.text)
```

IX. Summary

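This article showed how to run curl from Python and print its output line by line, along with several implementation options and best practices. subprocess.Popen is the best fit for line-by-line output, because it lets us read the command's output through a pipe and process it however we need. In practice, choose the approach that matches your requirements, and pay attention to error handling, exception handling, and security. curl is powerful, but the requests library is also worth considering as a more convenient and safer option. Hopefully this article helps you use curl more effectively in Python scripts for network programming and automation tasks.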

