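Python JSON Serialization and Deserialization (loads/dumps): An In-Depth Guide

Introduction

Data exchange plays a central role in modern software development. Communication between a web front end and back end, data passing between microservices, configuration storage, and API calls all need a standard, lightweight, easily parsed data format. JSON (JavaScript Object Notation) fits these needs well. Its concise syntax, good readability, and cross-language compatibility have made it the de facto standard for data exchange.

Python ships with a built-in json module dedicated to handling JSON data. The module converts Python objects into JSON-formatted strings (serialization) and converts JSON-formatted strings back into Python objects (deserialization). json.dumps() and json.loads() are the core functions for working with strings in memory, while json.dump() and json.load() work with file streams.

This article looks closely at the core features of the json module, especially json.dumps() and json.loads(): their usage, parameters, underlying behavior, common use cases, error handling, and related best practices.

JSON Basics Revisited

Before looking at the Python implementation, here is a quick review of JSON's basic structures and data types: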

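Object: Enclosed in curly braces {}, containing an unordered set of key-value pairs. Keys must be strings (wrapped in double quotes "), and values can be any JSON-supported data type. Pairs are separated by commas.
```json
{ "name": "Alice", "age": 30, "isStudent": false, "courses": ["Math", "Physics"] }
```
Array: Enclosed in square brackets [], containing an ordered sequence of values. Values can be any JSON-supported data type and are separated by commas.
```json
[ "apple", "banana", "cherry" ]
```
String: A sequence of Unicode characters wrapped in double quotes ". Special characters must be escaped (for example \", \\, \n).
```json
"Hello, World!"
```
Number: An integer or floating-point number. Standard JSON does not support NaN (Not a Number) or Infinity, although some parsers accept them as an extension.
```json
42
3.14159
-10
1.2e-5
```
Boolean: true or false (lowercase).
```json
true
false
```
null: Represents an empty value.
```json
null
```

JSON's simplicity and strictness make it easy for machines to parse and generate, and relatively easy for people to read and write.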

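The Python `json` Module and Data Type Mapping

Python's json module converts between Python data structures and the JSON format. The conversion follows these mapping rules:

| Python | JSON |
| --- | --- |
| `dict` | object |
| `list`, `tuple` | array |
| `str` | string |
| `int`, `float` | number |
| `True` | `true` |
| `False` | `false` |
| `None` | `null` |

Points to note:
* A Python tuple is converted to a JSON array when serialized.
* When deserializing, a JSON array is always converted to a Python list.
* A JSON object is always converted to a Python dict.
* The JSON standard does not distinguish between integers and floating-point numbers; both are simply number. When deserializing, Python decides between int and float from the literal itself: a number with a fractional part or an exponent becomes a float, anything else becomes an int.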
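
The mapping rules above can be checked with a quick round trip. This is a minimal sketch; the variable names and sample values are illustrative only.

```python
import json

# A tuple, an int, and a float go in ...
original = {"point": (1, 2), "count": 3, "ratio": 3.0}

text = json.dumps(original)
restored = json.loads(text)

print(text)
print(restored)

# ... and the tuple comes back as a list, while int and float keep their types.
print(type(restored["point"]).__name__)
print(type(restored["count"]).__name__, type(restored["ratio"]).__name__)

# Number literals with a fraction or an exponent are parsed as float.
parsed_types = [type(v).__name__ for v in json.loads("[1, 1.0, 1e3]")]
print(parsed_types)
```

Because the tuple does not survive the round trip, code that compares the restored data with the original should not rely on `restored == original` when tuples are involved.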

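Serialization: Converting Python Objects to JSON Strings (`json.dumps()`)

Serialization (also called encoding) is the process of converting an in-memory Python data structure into a JSON-formatted string. json.dumps() is the core tool for this.

Basic usage: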

```python
import json

# A dictionary containing a variety of Python data types
python_data = {
    "name": "Bob",
    "age": 25,
    "salary": 50000.50,
    "is_manager": True,
    "departments": ["HR", "Finance"],
    "projects": ({"id": 1, "name": "Project Alpha"}, {"id": 2, "name": "Project Beta"}),  # a tuple also becomes an array
    "address": None
}

# Serialize with dumps
json_string = json.dumps(python_data)

print(f"Python Data Type: {type(python_data)}")
print(f"JSON String Type: {type(json_string)}")
print("--- JSON String ---")
print(json_string)

# Output:
# Python Data Type: <class 'dict'>
# JSON String Type: <class 'str'>
# --- JSON String ---
# {"name": "Bob", "age": 25, "salary": 50000.5, "is_manager": true, "departments": ["HR", "Finance"], "projects": [{"id": 1, "name": "Project Alpha"}, {"id": 2, "name": "Project Beta"}], "address": null}
```

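The Python dictionary has been converted into a compact JSON string: Python's True became true, None became null, and both the list and the tuple became JSON arrays [].

Common json.dumps() parameters in detail:

json.dumps() accepts several optional parameters that control the output format and the serialization behavior: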

indent: (integer) Enables pretty printing. If a non-negative integer n is given, the output JSON string is indented by n spaces per nesting level, which makes the structure much easier to read.
```python
pretty_json_string = json.dumps(python_data, indent=4)
print("\n--- Pretty JSON String (indent=4) ---")
print(pretty_json_string)

# The output has line breaks and 4-space indentation:
# --- Pretty JSON String (indent=4) ---
# {
#     "name": "Bob",
#     "age": 25,
#     "salary": 50000.5,
#     "is_manager": true,
#     "departments": [
#         "HR",
#         "Finance"
#     ],
#     "projects": [
#         {
#             "id": 1,
#             "name": "Project Alpha"
#         },
#         {
#             "id": 2,
#             "name": "Project Beta"
#         }
#     ],
#     "address": null
# }
```
separators: (tuple (item_separator, key_separator)) Customizes the separators. The default is (', ', ': ') when indent is None. To produce the most compact JSON (no extra whitespace), use (',', ':').
```python
compact_json_string = json.dumps(python_data, separators=(',', ':'))
print("\n--- Compact JSON String (separators=(',', ':')) ---")
print(compact_json_string)

# Output:
# --- Compact JSON String (separators=(',', ':')) ---
# {"name":"Bob","age":25,"salary":50000.5,"is_manager":true,"departments":["HR","Finance"],"projects":[{"id":1,"name":"Project Alpha"},{"id":2,"name":"Project Beta"}],"address":null}
```
Note: when indent is set and separators is not passed, the default becomes (',', ': '), so lines do not end with a trailing space. Separators that you pass explicitly are always used as given.
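
The interaction between indent and separators can be seen in a small sketch. The sample dictionary is illustrative only.

```python
import json

data = {"a": 1, "b": 2}

# indent only: items are separated by "," plus a newline, keys by ": "
default_pretty = json.dumps(data, indent=2)

# indent plus explicit separators: the explicit values are used as given
tight_pretty = json.dumps(data, indent=2, separators=(",", ":"))

print(default_pretty)
print(tight_pretty)
```

In the first result no line ends with a space, and in the second the space after each colon is gone as well.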
sort_keys: (boolean) If True, the keys of each JSON object (Python dictionary) in the output are sorted. This is useful for version control, for comparing JSON data, and for keeping output consistent. The default is False.
```python
sorted_json_string = json.dumps(python_data, indent=4, sort_keys=True)
print("\n--- Sorted JSON String (sort_keys=True) ---")
print(sorted_json_string)

# The keys appear in the order "address", "age", "departments", "is_manager", "name", "projects", "salary":
# --- Sorted JSON String (sort_keys=True) ---
# {
#     "address": null,
#     "age": 25,
#     "departments": [
#         "HR",
#         "Finance"
#     ],
#     "is_manager": true,
#     "name": "Bob",
#     "projects": [
#         {
#             "id": 1,
#             "name": "Project Alpha"
#         },
#         {
#             "id": 2,
#             "name": "Project Beta"
#         }
#     ],
#     "salary": 50000.5
# }
```
ensure_ascii: (boolean) Defaults to True. When True, all non-ASCII characters (such as Chinese characters) are escaped as \uXXXX. When False, those characters are written as-is. Set it to False when you want JSON that contains native Unicode characters (Chinese, Japanese, and so on), and make sure the downstream environment (file writes, HTTP responses) handles UTF-8 correctly.
```python
data_with_unicode = {"城市": "北京", "greetings": "你好世界"}
ascii_json = json.dumps(data_with_unicode, ensure_ascii=True)
unicode_json = json.dumps(data_with_unicode, ensure_ascii=False)

print("\n--- ensure_ascii=True ---")
print(ascii_json)    # Output: {"\u57ce\u5e02": "\u5317\u4eac", "greetings": "\u4f60\u597d\u4e16\u754c"}

print("\n--- ensure_ascii=False ---")
print(unicode_json)  # Output: {"城市": "北京", "greetings": "你好世界"}
```
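skipkeys: (boolean) Defaults to False. JSON object keys must be strings. Python dictionary keys of type str, int, float, bool, or None are accepted, and the non-string ones are converted to strings. Any other key type (a tuple, for example) makes json.dumps() raise TypeError when skipkeys is False (the default). When skipkeys is True, key-value pairs with unsupported keys are silently skipped and do not appear in the JSON output. Non-string keys are generally best avoided, and skipkeys=True can hide data problems.
```python
dict_with_non_string_key = {
    "string_key": "value1",
    123: "value2",                # integer key: converted to the string "123"
    ("tuple", "key"): "value3"    # tuple key: not supported
}

try:
    json.dumps(dict_with_non_string_key)
except TypeError as e:
    print(f"\n--- Error with skipkeys=False (default): {e} ---")

skipped_json = json.dumps(dict_with_non_string_key, skipkeys=True)
print("\n--- JSON with skipkeys=True ---")
print(skipped_json)

# Output:
# --- Error with skipkeys=False (default): keys must be str, int, float, bool or None, not tuple ---
# --- JSON with skipkeys=True ---
# {"string_key": "value1", "123": "value2"}   (only the tuple key is skipped)
```
*Note: int, float, bool, and None keys are converted to their JSON string form, while tuples and other complex types raise an error or, with skipkeys=True, are skipped. The best practice is to always use strings as dictionary keys.*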
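
The key conversion described in the note can be observed directly. A minimal sketch with made-up keys; note that the original key types are not recovered on the way back.

```python
import json

# Keys of type int, float, bool and None are all accepted by dumps ...
mixed_keys = {2: "int", 1.5: "float", True: "bool", None: "none"}

text = json.dumps(mixed_keys)
print(text)

# ... but they come back as plain strings after a round trip.
restored = json.loads(text)
print(list(restored))
```

This is one reason string keys are the safer choice: a dictionary with non-string keys does not compare equal to its own round-tripped copy.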
default: (function) Handles Python object types that the json module cannot serialize on its own (datetime objects, instances of custom classes, set, and so on). You supply a function that receives the unserializable object and returns a JSON-serializable representation of it (a string, a dictionary, etc.). If the function cannot handle the object either, it should raise TypeError.
```python
import datetime
import decimal

complex_data = {
    "name": "Complex Object",
    "timestamp": datetime.datetime.now(),
    "numbers": {decimal.Decimal("10.5"), 1, 2},  # a set containing a Decimal
    "value": decimal.Decimal("99.99")
}

def custom_serializer(obj):
    if isinstance(obj, datetime.datetime):
        return obj.isoformat()  # convert datetime to an ISO 8601 string
    elif isinstance(obj, decimal.Decimal):
        return float(obj)       # convert Decimal to float (precision may be lost)
        # or: return str(obj)   # convert to a string to keep full precision
    elif isinstance(obj, set):
        return list(obj)        # convert set to list
    raise TypeError(f"Object of type {obj.__class__.__name__} is not JSON serializable")

try:
    # Direct serialization fails
    json.dumps(complex_data)
except TypeError as e:
    print(f"\n--- Direct serialization error: {e} ---")

# Serialize using the default function
serialized_complex = json.dumps(complex_data, default=custom_serializer, indent=4)
print("\n--- Serialized with custom 'default' function ---")
print(serialized_complex)

# The output looks similar to this (the timestamp and the set order will vary):
# --- Direct serialization error: Object of type datetime is not JSON serializable ---
# --- Serialized with custom 'default' function ---
# {
#     "name": "Complex Object",
#     "timestamp": "2023-10-27T10:30:00.123456",
#     "numbers": [
#         1,
#         2,
#         10.5
#     ],
#     "value": 99.99
# }
```
cls: (json.JSONEncoder subclass) A more advanced way to customize. You can subclass json.JSONEncoder and override its default() method to implement more complex, type-specific serialization logic. This is more structured and reusable than passing a default function.
```python
class CustomEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, datetime.datetime):
            return obj.isoformat()
        elif isinstance(obj, decimal.Decimal):
            return str(obj)            # use a string to keep full precision
        elif isinstance(obj, set):
            return sorted(list(obj))   # sort for consistent output
        # Let the base class raise TypeError for anything else
        return super().default(obj)

serialized_with_encoder = json.dumps(complex_data, cls=CustomEncoder, indent=4)
print("\n--- Serialized with custom 'cls' (JSONEncoder subclass) ---")
print(serialized_with_encoder)

# The output is similar, but Decimal values are strings and the set is sorted:
# --- Serialized with custom 'cls' (JSONEncoder subclass) ---
# {
#     "name": "Complex Object",
#     "timestamp": "2023-10-27T10:35:00.987654",
#     "numbers": [
#         1,
#         2,
#         "10.5"
#     ],
#     "value": "99.99"
# }
```

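Deserialization: Converting JSON Strings to Python Objects (`json.loads()`)

Deserialization (also called decoding) is the process of parsing a JSON-formatted string back into a Python data structure. json.loads() is used for this.

Basic usage: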

```python
import json

json_string = '{"name": "Charlie", "age": 35, "city": "New York", "skills": ["Python", "Docker", "AWS"], "active": true, "metadata": null}'

# Deserialize with loads
python_object = json.loads(json_string)

print(f"JSON String Type: {type(json_string)}")
print(f"Python Object Type: {type(python_object)}")
print("--- Python Object ---")
print(python_object)
print(f"Name: {python_object['name']}, Type: {type(python_object['name'])}")
print(f"Age: {python_object['age']}, Type: {type(python_object['age'])}")
print(f"Skills: {python_object['skills']}, Type: {type(python_object['skills'])}")
print(f"Active: {python_object['active']}, Type: {type(python_object['active'])}")
print(f"Metadata: {python_object['metadata']}, Type: {type(python_object['metadata'])}")

# Output:
# JSON String Type: <class 'str'>
# Python Object Type: <class 'dict'>
# --- Python Object ---
# {'name': 'Charlie', 'age': 35, 'city': 'New York', 'skills': ['Python', 'Docker', 'AWS'], 'active': True, 'metadata': None}
# Name: Charlie, Type: <class 'str'>
# Age: 35, Type: <class 'int'>
# Skills: ['Python', 'Docker', 'AWS'], Type: <class 'list'>
# Active: True, Type: <class 'bool'>
# Metadata: None, Type: <class 'NoneType'>
```
The JSON string was parsed into a Python dictionary: JSON true maps to Python True, null maps to None, arrays map to list, and objects map to dict.

Common json.loads() parameters in detail:

json.loads() also accepts parameters that customize deserialization:

object_hook: (function) If provided, this function is called each time a JSON object ({...}) has been parsed, with the resulting Python dictionary as its argument. Its return value replaces that dictionary. This is useful for turning JSON objects directly into instances of custom classes.
```python
class User:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __repr__(self):
        return f"User(name='{self.name}', age={self.age})"

json_user_string = '{"name": "David", "age": 40, "__type__": "User"}'

def decode_user(dct):
    if "__type__" in dct and dct["__type__"] == "User":
        return User(dct['name'], dct['age'])
    return dct  # not a User: return the dictionary unchanged

user_instance = json.loads(json_user_string, object_hook=decode_user)
print("\n--- Deserialized with object_hook ---")
print(user_instance)
print(f"Type: {type(user_instance)}")

# Output:
# --- Deserialized with object_hook ---
# User(name='David', age=40)
# Type: <class '__main__.User'>
```
*Note: for nested objects, object_hook is called on the innermost object first.*
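
The inside-out call order can be made visible by recording what the hook receives. A minimal sketch; the hook name `record` and the sample JSON are illustrative only.

```python
import json

seen = []  # keys of every object passed to the hook, in call order

def record(dct):
    seen.append(sorted(dct))
    return dct

json.loads('{"outer": {"inner": {"x": 1}}, "y": 2}', object_hook=record)

print(seen)  # innermost object first, top-level object last
```

Because inner objects have already been replaced by the time the hook sees their parent, a hook that builds custom instances receives those instances as values in the outer dictionaries.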
parse_float: (function) Customizes how JSON floating-point numbers are parsed. The function receives the string form of the float and returns the Python object to use. For example, decimal.Decimal avoids binary floating-point precision issues.
```python
from decimal import Decimal

json_numeric_string = '{"price": 99.95, "quantity": 100}'

# Parse floats as Decimal
data_with_decimal = json.loads(json_numeric_string, parse_float=Decimal)
print("\n--- Deserialized with parse_float=Decimal ---")
print(data_with_decimal)
print(f"Price type: {type(data_with_decimal['price'])}")        # <class 'decimal.Decimal'>
print(f"Quantity type: {type(data_with_decimal['quantity'])}")  # <class 'int'> (integers are not affected by parse_float)

# Output:
# --- Deserialized with parse_float=Decimal ---
# {'price': Decimal('99.95'), 'quantity': 100}
# Price type: <class 'decimal.Decimal'>
# Quantity type: <class 'int'>
```
parse_int: (function) Like parse_float, but for JSON integers. It receives the string form of the integer and returns the Python object to use.
```python
json_int_string = '{"count": 12345678901234567890, "small_count": 5}'

def parse_large_int_as_str(num_str):
    # Suppose we want to keep very large integers as strings
    if len(num_str) > 15:
        return num_str
    return int(num_str)  # otherwise convert to int as usual

data_with_custom_int = json.loads(json_int_string, parse_int=parse_large_int_as_str)
print("\n--- Deserialized with custom parse_int ---")
print(data_with_custom_int)
print(f"Large count type: {type(data_with_custom_int['count'])}")        # <class 'str'>
print(f"Small count type: {type(data_with_custom_int['small_count'])}")  # <class 'int'>

# Output:
# --- Deserialized with custom parse_int ---
# {'count': '12345678901234567890', 'small_count': 5}
# Large count type: <class 'str'>
# Small count type: <class 'int'>
```
parse_constant: (function) Handles the special constants -Infinity, Infinity, and NaN. Standard JSON does not allow them, but some implementations and JavaScript environments produce them. By default json.loads accepts these constants and converts them to the corresponding float values. If parse_constant is given, it is called with the constant's name (as a string) and its return value is used instead (for example float('-inf'), float('inf'), float('nan')). It can also raise an exception to reject such input and enforce strict JSON.
```python
import math

json_special_constants = '{"value1": Infinity, "value2": -Infinity, "value3": NaN}'

def handle_constants(const_str):
    if const_str == 'Infinity':
        return float('inf')
    elif const_str == '-Infinity':
        return float('-inf')
    elif const_str == 'NaN':
        return float('nan')
    raise ValueError(f"Unknown constant: {const_str}")

# json.loads parses Infinity, -Infinity and NaN by default
parsed_constants_direct = json.loads(json_special_constants)
print("\n--- Direct parsing of special constants ---")
print(parsed_constants_direct)

# Use parse_constant when custom handling is needed
parsed_constants = json.loads(json_special_constants, parse_constant=handle_constants)
print("\n--- Deserialized with parse_constant ---")
print(parsed_constants)
print(f"Is value1 infinite? {math.isinf(parsed_constants['value1'])}")
print(f"Is value3 NaN? {math.isnan(parsed_constants['value3'])}")

# Output:
# --- Direct parsing of special constants ---
# {'value1': inf, 'value2': -inf, 'value3': nan}
# --- Deserialized with parse_constant ---
# {'value1': inf, 'value2': -inf, 'value3': nan}
# Is value1 infinite? True
# Is value3 NaN? True
```
object_pairs_hook: (function) Similar to object_hook, but lower level. It is called when a JSON object is parsed, and its argument is a list of (key, value) pairs, list[tuple[str, Any]]. The list preserves the original order of the pairs in the JSON object (Python 3.7+ dictionaries keep insertion order by default, but the JSON standard itself does not guarantee key order). The return value of object_pairs_hook replaces the object. This is useful when you need to preserve the original order (for example with collections.OrderedDict) or handle duplicate keys (discouraged by the JSON standard, but they do occur). If both object_hook and object_pairs_hook are given, object_pairs_hook takes priority.
```python
from collections import OrderedDict

json_ordered_string = '{"c": 3, "a": 1, "b": 2}'

# Preserve order with OrderedDict
ordered_data = json.loads(json_ordered_string, object_pairs_hook=OrderedDict)
print("\n--- Deserialized with object_pairs_hook=OrderedDict ---")
print(ordered_data)
print(list(ordered_data.keys()))  # verify the order

# Output:
# --- Deserialized with object_pairs_hook=OrderedDict ---
# OrderedDict([('c', 3), ('a', 1), ('b', 2)])
# ['c', 'a', 'b']

# Handling duplicate keys (example: keep the last value)
json_duplicate_keys = '{"key": "value1", "key": "value2"}'

def handle_duplicates(pairs):
    d = {}
    for key, value in pairs:
        d[key] = value  # later values overwrite earlier ones
    return d

data_duplicates_handled = json.loads(json_duplicate_keys, object_pairs_hook=handle_duplicates)
print("\n--- Handling duplicate keys with object_pairs_hook ---")
print(data_duplicates_handled)  # {'key': 'value2'}
```
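cls: (json.JSONDecoder subclass) The counterpart of cls in dumps. It lets you supply a json.JSONDecoder subclass to customize the whole decoding process. You can override decode() or raw_decode(), or configure object_hook, parse_float, and the other hooks in the constructor. This gives the most flexibility, but is usually only needed for very complex decoding requirements.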
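
As a rough illustration of the cls parameter, the sketch below defines a decoder subclass that pre-configures parse_float. The class name `MoneyDecoder` and the sample JSON are hypothetical; only `json.JSONDecoder` and its `parse_float` constructor argument come from the standard library.

```python
import json
from decimal import Decimal

class MoneyDecoder(json.JSONDecoder):
    """Hypothetical decoder that parses every JSON float as Decimal."""

    def __init__(self, **kwargs):
        # Respect an explicit parse_float if the caller passes one.
        kwargs.setdefault("parse_float", Decimal)
        super().__init__(**kwargs)

order = json.loads('{"price": 19.99, "qty": 3}', cls=MoneyDecoder)
print(order)
```

Bundling the hooks into a class like this keeps call sites short (`cls=MoneyDecoder`) when the same decoding rules are needed in many places.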

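Working with Files: `json.dump()` and `json.load()`

dumps and loads work with strings in memory, but the json module also provides functions that work directly with files (or file-like objects):

json.dump(obj, fp, *, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, cls=None, indent=None, separators=None, default=None, sort_keys=False, **kw): Serializes the Python object obj as JSON and writes it to the file object fp (which must be opened for writing, e.g. open('data.json', 'w')). The parameters are mostly the same as for dumps.
json.load(fp, *, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, **kw): Reads JSON data from the file object fp (which must be opened for reading, e.g. open('data.json', 'r')) and deserializes it into a Python object. The parameters are mostly the same as for loads.

Example: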

```python
import json

# Prepare the data
my_data = {
    "id": 101,
    "name": "Example Data",
    "items": [1, "two", 3.0, True]
}

file_path = 'data.json'

# --- Write to a file with dump ---
try:
    with open(file_path, 'w', encoding='utf-8') as f:
        json.dump(my_data, f, indent=4, ensure_ascii=False)  # pretty-printed, Unicode kept as-is
    print(f"\nData successfully written to {file_path}")
except IOError as e:
    print(f"Error writing to file: {e}")

# --- Read from the file with load ---
try:
    with open(file_path, 'r', encoding='utf-8') as f:
        loaded_data = json.load(f)
    print(f"\nData successfully loaded from {file_path}:")
    print(loaded_data)
    print(f"Type of loaded data: {type(loaded_data)}")
except FileNotFoundError:
    print(f"Error: File '{file_path}' not found.")
except json.JSONDecodeError as e:
    print(f"Error decoding JSON from file: {e}")
except IOError as e:
    print(f"Error reading from file: {e}")

# data.json will contain:
# {
#     "id": 101,
#     "name": "Example Data",
#     "items": [
#         1,
#         "two",
#         3.0,
#         true
#     ]
# }

# Output:
# Data successfully written to data.json
# Data successfully loaded from data.json:
# {'id': 101, 'name': 'Example Data', 'items': [1, 'two', 3.0, True]}
# Type of loaded data: <class 'dict'>
```
Using dump and load saves you from handling the intermediate string yourself. json.dump writes its output to the file in chunks rather than building one complete string first. Note, however, that json.load reads the entire file content before parsing it, so neither function is a true streaming parser: for very large inputs the whole document still has to fit in memory.
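
json.dump() and json.load() accept any object with a suitable write() or read() method, not only real files. The sketch below uses io.StringIO as an in-memory stand-in for a file, which is also handy in tests; the sample data is illustrative only.

```python
import io
import json

buffer = io.StringIO()  # in-memory text stream standing in for a file

# dump() writes the JSON text to the stream
json.dump({"city": "北京", "ok": True}, buffer, ensure_ascii=False)
written_text = buffer.getvalue()
print(written_text)

# Rewind, then let load() read and parse the whole stream
buffer.seek(0)
loaded = json.load(buffer)
print(loaded)
```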

Error Handling

The following errors are common when using the json module:

TypeError: Raised during serialization (dumps or dump), usually because you tried to serialize a type the json module does not support (a custom object, set, datetime, etc.) without a default function or cls to handle it, or because a dictionary key has an unsupported type and skipkeys=False.
```python
my_set = {1, 2, 3}
try:
    json.dumps(my_set)
except TypeError as e:
    print(f"\nTypeError during dumps: {e}")  # Object of type set is not JSON serializable
```
json.JSONDecodeError: Raised during deserialization (loads or load) when the input string or file content is not valid JSON. The error message usually indicates where the problem is (line and column).
```python
invalid_json_string = '{"name": "Eve", "age": 28, city: "London"}'  # the key city is not quoted
try:
    json.loads(invalid_json_string)
except json.JSONDecodeError as e:
    print(f"\nJSONDecodeError during loads: {e}")  # Expecting property name enclosed in double quotes: line 1 column 28 (char 27)
```
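ValueError: json.JSONDecodeError is a subclass of ValueError, so except ValueError also catches decoding errors. In addition, dumps raises ValueError when allow_nan=False and the data contains NaN or Infinity. loads accepts those constants by default; a parse_constant function can raise ValueError to reject them.
Circular references and RecursionError: If the Python object being serialized contains a circular reference (for example a = []; b = {'list': a}; a.append(b)) and check_circular=True (the default), dumps detects it and raises ValueError. With check_circular=False the check is skipped, and a circular structure leads to infinite recursion and a RecursionError. Extremely deep nesting can also hit the recursion limit.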
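
The two ValueError cases on the serialization side can be reproduced as follows. This is a minimal sketch; the exact wording of the messages may differ between Python versions, so only the exception type should be relied on.

```python
import json

# A list that contains a dict that contains the list again
a = []
b = {"list": a}
a.append(b)

try:
    json.dumps(a)  # check_circular=True is the default
except ValueError as exc:
    circular_error = str(exc)
    print(f"Circular structure rejected: {circular_error}")

try:
    json.dumps({"x": float("nan")}, allow_nan=False)
except ValueError as exc:
    nan_error = str(exc)
    print(f"NaN rejected: {nan_error}")

# Decoding errors are ValueErrors too
print(issubclass(json.JSONDecodeError, ValueError))
```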

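When working with JSON, especially data from external sources (files, network requests), wrap the calls in try...except blocks to catch these errors and keep the program robust.

Advanced Topics and Best Practices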

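Performance: For performance-sensitive applications, the standard library json may not be the fastest choice. Third-party libraries such as ujson and orjson usually serialize and deserialize noticeably faster, especially with large amounts of data. Their APIs are similar to the standard library's but not identical; orjson.dumps, for example, returns bytes rather than str.
```python
# Example (requires: pip install ujson orjson)
# import ujson
# import orjson
# fast_json_string = ujson.dumps(python_data)
# faster_python_object = orjson.loads(json_string_bytes)  # orjson typically works with bytes
```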
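Security:
- Never use eval() to parse JSON strings from untrusted sources! eval() can execute arbitrary Python code and is a serious security risk. Always use json.loads() or json.load(), which only parse data and never execute code.
- Watch out for resource-exhaustion attacks (sometimes called "JSON flooding", comparable in spirit to XML's "Billion Laughs"): JSON that is extremely large or deeply nested can exhaust the parser's memory or CPU. When handling untrusted external JSON, consider limiting input size or nesting depth.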
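
One way to apply the size limit mentioned above is a small wrapper around json.loads(). This is a sketch under stated assumptions: the helper name `safe_loads` and the limit `MAX_JSON_CHARS` are hypothetical, and the right limit depends on the application. It also assumes that extremely deep nesting surfaces as RecursionError, as it does in CPython.

```python
import json

MAX_JSON_CHARS = 1_000_000  # hypothetical limit; tune it for your application

def safe_loads(text, max_chars=MAX_JSON_CHARS):
    """Parse JSON, raising ValueError for oversized, invalid, or too-deep input."""
    if len(text) > max_chars:
        raise ValueError(f"JSON payload too large: {len(text)} characters")
    try:
        return json.loads(text)  # JSONDecodeError is already a ValueError
    except RecursionError as exc:
        raise ValueError("JSON payload is nested too deeply") from exc

print(safe_loads('{"a": 1}'))

try:
    safe_loads("[" + "1," * 50 + "1]", max_chars=10)
except ValueError as exc:
    print(f"Rejected: {exc}")
```

Callers then only need a single `except ValueError` around the call, whatever the reason for rejection.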
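Serializing and deserializing custom objects:
- Serialization: Using the default parameter or a JSONEncoder subclass is the standard approach. A common convention is to add a special field (such as __type__ or _class_) to the object's dictionary to record its original type.
- Deserialization: Use object_hook or object_pairs_hook to check for the special type field and rebuild the corresponding Python object instance.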
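
The type-tag convention described above can be sketched end to end as follows. The `Point` class and the helper names `encode_point` and `decode_point` are hypothetical; the `__type__` field name is just one possible convention.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class Point:
    x: int
    y: int

def encode_point(obj):
    """default= hook: tag Point instances with their type name."""
    if isinstance(obj, Point):
        return {"__type__": "Point", **asdict(obj)}
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

def decode_point(dct):
    """object_hook: rebuild Point instances from tagged dictionaries."""
    if dct.get("__type__") == "Point":
        return Point(dct["x"], dct["y"])
    return dct

shapes = {"origin": Point(0, 0), "tip": Point(3, 4)}
text = json.dumps(shapes, default=encode_point)
restored = json.loads(text, object_hook=decode_point)

print(text)
print(restored)
```

With several classes, a dictionary that maps tag names to constructors keeps the decoding hook short. Looking classes up dynamically from untrusted tag values is best avoided.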
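Encoding: When handling JSON that contains non-ASCII characters, make sure to:
- Use ensure_ascii=False with dumps or dump to get human-readable Unicode characters.
- Specify encoding='utf-8' (or another suitable encoding) explicitly when reading and writing files.
- Set the correct Content-Type header in HTTP responses, such as application/json; charset=utf-8.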
Consistency: sort_keys=True ensures that Python dictionaries with the same content always produce the same JSON output, which makes comparison and testing easier.
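
The effect on comparisons can be shown with two dictionaries that are equal but were built in a different order. A minimal sketch with made-up data:

```python
import json

first = {"b": 1, "a": 2}
second = {"a": 2, "b": 1}

# Equal as dictionaries, but the default output follows insertion order
plain_equal = json.dumps(first) == json.dumps(second)

# With sort_keys=True the text is identical
sorted_equal = json.dumps(first, sort_keys=True) == json.dumps(second, sort_keys=True)

print(first == second, plain_equal, sorted_equal)
```

This matters whenever the JSON text itself is compared, hashed, or stored under version control.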
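Readability: During development and debugging, use the indent parameter to produce formatted JSON, which is much easier to read. In production, indent is usually omitted to save bandwidth.
Choosing loads/dumps vs. load/dump: Use load/dump when the data lives in a file or file-like object, and loads/dumps when you already hold, or need, the JSON text as a string in memory.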

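Conclusion

Python's json module provides a powerful and flexible toolset for handling JSON data. json.dumps() and json.loads() are its core functions: one serializes Python objects to JSON strings, the other deserializes JSON strings into Python objects. By understanding how they work, their parameters (indent, sort_keys, ensure_ascii, default, object_hook, parse_float, and others), and how they relate to the file functions json.dump() and json.load(), developers can integrate JSON data exchange into Python applications efficiently and safely.

JSON serialization and deserialization are basic skills for modern Python developers. Whether you are building web APIs, handling configuration files, or exchanging data with other systems, a solid understanding of the json module pays off. Applying this knowledge and following the best practices above leads to more robust, more efficient, and more maintainable Python code.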
