Pretty Print JSON in Python
Posted in Python by Dirk - last update: Jan 04, 2024
Summary
To pretty print JSON in Python, you can use the json module along with its dumps function. Set the indent parameter to control the number of spaces for indentation. For better readability, you can also use the sort_keys parameter to sort the keys alphabetically.
What is JSON
JSON (JavaScript Object Notation) is a lightweight data interchange format that is easy for humans to read and write, and easy for machines to parse and generate. It is a text format that is language-independent, making it a popular choice for data exchange between different programming languages.
Use Cases of JSON in Python
- Data Exchange: JSON is commonly used to exchange data between a web server and a web client (browser). Python can use JSON for communication in web applications.
- Configuration Files: JSON is often used for configuration files in Python applications, providing a structured way to store and retrieve settings.
- API Responses: Many APIs return data in JSON format, and Python can easily parse and work with this data using the json module.
Why Prettify JSON
Prettifying JSON makes it more readable and visually organized, especially when dealing with complex nested structures. The output represents the original JSON data with indentation and sorted keys, making it easier to read and understand the structure. It is useful for debugging, understanding the structure of data, and sharing information with others.
Methods to Pretty Print JSON in Python
Using json.dumps:
To convert a Python object to a pretty-printed JSON string, use the json.dumps() function:
json.dumps(obj, *, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, cls=None, indent=None, separators=None, default=None, sort_keys=False, **kw)
The json.dumps function converts a Python object to a JSON-formatted string.
Most important parameters:
- The indent parameter specifies the number of spaces to use for indentation. If None (default), no indentation is used and the output is a single line.
- The sort_keys parameter, when set to True, sorts the keys alphabetically (default: False).
Other parameters
skipkeys: If True, dict keys that are not basic types (str, int, float, bool, None) are skipped instead of raising a TypeError. Default is False.
ensure_ascii: If True (default), all non-ASCII characters are escaped as \uXXXX sequences, so the output contains only ASCII characters. If False, non-ASCII characters are written as-is.
check_circular: If True (default), containers are checked for circular references and a ValueError is raised when one is found. If False, the check is skipped and a circular reference leads to a RecursionError (or worse).
allow_nan: If True (default), NaN, Infinity, and -Infinity are written as the JavaScript literals NaN, Infinity, and -Infinity, which are not valid JSON. If False, a ValueError is raised when such a value is encountered.
cls: If specified, this should be a subclass of json.JSONEncoder. It allows customization of the JSON serialization for non-basic types.
separators: An (item_separator, key_separator) tuple. The default is (', ', ': ') if indent is None and (',', ': ') otherwise. Use (',', ':') for the most compact output.
default: If specified, it should be a function that gets called for objects that are not serializable. It should return a serializable version of the object or raise a TypeError.
kw: Additional keyword arguments. These are passed through to the underlying json.JSONEncoder class.
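The effect of some of these parameters is easiest to see on a small input. The following is a minimal sketch. The sample dictionary and the variable names are made up for illustration and are not part of the json module.

```python
import json

# Illustrative sample data; the tuple key is not a valid JSON key type.
record = {"b": 1, "a": [1, 2], (1, 2): "tuple key"}

# skipkeys=True drops the tuple key instead of raising TypeError;
# separators=(",", ":") removes the whitespace after commas and colons.
compact = json.dumps(record, skipkeys=True, separators=(",", ":"))
print(compact)  # {"b":1,"a":[1,2]}

# With the default allow_nan=True, NaN is written as the JavaScript
# literal NaN, which strict JSON parsers may reject.
nan_text = json.dumps({"value": float("nan")})
print(nan_text)  # {"value": NaN}

# allow_nan=False makes the same input fail loudly instead.
try:
    json.dumps({"value": float("nan")}, allow_nan=False)
    nan_rejected = False
except ValueError:
    nan_rejected = True
print(nan_rejected)  # True
```

If the output is meant for strict JSON consumers, allow_nan=False is probably the safer choice, since the error then surfaces while writing rather than while reading.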
Example with indent and sort_keys:
import json
data = {"name": "John", "age": 30, "city": "New York"}
pretty_json = json.dumps(data, indent=4, sort_keys=True)
print(pretty_json)
Output:
{
    "age": 30,
    "city": "New York",
    "name": "John"
}
The example above is very simple, but it works on more complex objects as well:
data = {
    "person": {
        "name": "Alice",
        "age": 28,
        "address": {
            "city": "Wonderland",
            "zipcode": "12345"
        },
        "contacts": [
            {"type": "email", "value": "[email protected]"},
            {"type": "phone", "value": "+123456789"}
        ]
    },
    "projects": [
        {"title": "Project A", "status": "ongoing", "members": ["Alice", "Bob", "Charlie"]},
        {"title": "Project B", "status": "completed", "members": ["Alice", "David"]}
    ]
}
# Keys keep their insertion order because sort_keys is not set
print(json.dumps(data, indent=4))
Output:
{
    "person": {
        "name": "Alice",
        "age": 28,
        "address": {
            "city": "Wonderland",
            "zipcode": "12345"
        },
        "contacts": [
            {
                "type": "email",
                "value": "[email protected]"
            },
            {
                "type": "phone",
                "value": "+123456789"
            }
        ]
    },
    "projects": [
        {
            "title": "Project A",
            "status": "ongoing",
            "members": [
                "Alice",
                "Bob",
                "Charlie"
            ]
        },
        {
            "title": "Project B",
            "status": "completed",
            "members": [
                "Alice",
                "David"
            ]
        }
    ]
}
In this example, the JSON structure includes nested objects, arrays, and various data types. Pretty printing helps to visually organize the data, making it easier to navigate and comprehend the hierarchical relationships within the JSON structure.
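Often the data does not start out as a Python object but as compact JSON text, for example an API response. In that case it can be parsed with json.loads first and then written out again with json.dumps. The sketch below assumes a made-up input string, and the helper name prettify is hypothetical.

```python
import json

# Hypothetical compact JSON text, e.g. as received from an API
raw = '{"name":"Alice","tags":["a","b"],"active":true}'


def prettify(json_text, indent=4):
    """Parse JSON text and return it re-serialized with indentation."""
    return json.dumps(json.loads(json_text), indent=indent)


pretty = prettify(raw)
print(pretty)
```

For files on disk, the standard library also ships a command-line tool that does the same job: python -m json.tool input.json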
Using json.dump with a File:
The function json.dump is very similar to json.dumps but writes the JSON directly to a file:
import json
data = {"name": "John", "age": 30, "city": "New York"}
with open("output.json", "w") as file:
    json.dump(data, file, indent=4, sort_keys=True)
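To check what actually ends up in the file, the content can be read back with json.load. The following sketch writes to a temporary directory so that it does not touch real files. The sample data and the explicit UTF-8 encoding are assumptions added for illustration.

```python
import json
import os
import tempfile

data = {"name": "José", "age": 30}

# Write to a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "output.json")

    # encoding="utf-8" plus ensure_ascii=False keeps "é" readable in the file.
    with open(path, "w", encoding="utf-8") as file:
        json.dump(data, file, indent=4, sort_keys=True, ensure_ascii=False)

    # Read the raw text to see the formatting that was written.
    with open(path, encoding="utf-8") as file:
        text = file.read()

    # json.load parses the file content back into Python objects.
    with open(path, encoding="utf-8") as file:
        restored = json.load(file)

print(text)
```

Passing the encoding explicitly avoids depending on the platform default, which may not be UTF-8 on every system.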
Potential issues
When using json.dumps (or json.dump) in Python, there are several potential issues or challenges that you may encounter:
Circular References:
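If your data contains circular references (objects referencing each other in a loop), json.dumps raises a ValueError ("Circular reference detected") for dicts and lists. Custom objects fail even earlier with a TypeError, because the json module cannot serialize them at all. In both cases, the fix is to produce a representation without the loop, for example with the default parameter or a custom encoder class.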
Short:
import json
class CustomEncoder(json.JSONEncoder):
    def default(self, obj):
        # Called only for objects json cannot serialize natively.
        # Return a stand-in that does not point back into the loop.
        return repr(obj)
some_object = object()  # placeholder for any object json cannot serialize
data = {"key": "value", "circular_ref": some_object}
pretty_json = json.dumps(data, cls=CustomEncoder, indent=4)
An example:
Circular references occur when there is a loop in the object references. For instance, Object A references Object B, and Object B references Object A, creating a loop. This can cause issues when serializing the objects to JSON using json.dumps because it expects a tree-like structure, and a loop violates this expectation.
Here’s an example to illustrate the issue and how to handle it:
import json
class Person:
    def __init__(self, name):
        self.name = name
        self.friend = None # Circular reference
# Creating objects with a circular reference
alice = Person("Alice")
bob = Person("Bob")
alice.friend = bob
bob.friend = alice
# Attempting to serialize the objects with json.dumps without handling circular references
try:
    json_data = json.dumps({"alice": alice, "bob": bob}, indent=4)
except TypeError as e:
    print(f"Error: {e}")
In this example, we have two Person objects, Alice and Bob, and each has a friend attribute referring to the other, creating a circular reference. If we try to serialize these objects directly with json.dumps, it raises a TypeError, because Person is not a type the json module knows how to serialize:
Error: Object of type Person is not JSON serializable
To handle this, we can provide a custom serialization strategy. We'll create a custom encoder class that extends json.JSONEncoder, override its default method to handle the serialization of our custom objects, and pass the class to json.dumps via the cls parameter:
class PersonEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, Person):
            # Return a dictionary representation of the Person object
            return {"name": obj.name, "friend": obj.friend.name if obj.friend else None}
        return super().default(obj)
# Serialize objects using the custom encoder
json_data = json.dumps({"alice": alice, "bob": bob}, cls=PersonEncoder, indent=4)
print(json_data)
In this example, the PersonEncoder class defines a default method that checks if the object is an instance of the Person class. If it is, it returns a dictionary representation of the Person object, handling the circular reference by representing the friend attribute as the name of the friend. The super().default(obj) call is used to handle other types not explicitly handled in the custom encoder.
This way, we can successfully serialize objects with circular references to JSON by providing a custom serialization strategy.
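A circular reference between plain containers behaves differently from the Person example: dicts and lists are serializable, so json.dumps gets as far as its circular-reference check and raises a ValueError. The sketch below uses a made-up dict that contains itself and shows one possible way to break the loop before serializing.

```python
import json

# A dict that contains itself: a genuine circular reference between containers.
node = {"name": "root"}
node["self"] = node

try:
    json.dumps(node, indent=4)
    error_message = None
except ValueError as e:
    error_message = str(e)

print(error_message)  # Circular reference detected

# One way to break the loop: copy the dict and replace the back-reference
# with a plain value (here, the name of the referenced node).
safe = {**node, "self": node["name"]}
safe_json = json.dumps(safe, indent=4)
print(safe_json)
```

Which value should stand in for the back-reference (a name, an ID, or None) depends on how the JSON will be consumed.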
Non-serializable Types
Some Python data types are not serializable to JSON by default. Examples include datetime objects. You can handle this by passing a custom serialization function via the default parameter:
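import json
from datetime import datetime
data = {"timestamp": datetime.now()}
def serialize_datetime(obj):
    if isinstance(obj, datetime):
        return obj.isoformat()
    # Without this, unsupported types would silently be written as null
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")
pretty_json = json.dumps(data, default=serialize_datetime, indent=4)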
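The same idea extends to other types that the json module does not handle out of the box. The following sketch covers date, Decimal, and set. The function name to_serializable and the chosen conversions (Decimal as a string, set as a sorted list) are assumptions for illustration, not the only reasonable mapping.

```python
import json
from datetime import date, datetime
from decimal import Decimal


def to_serializable(obj):
    """Fallback for json.dumps: convert a few common non-JSON types."""
    if isinstance(obj, (datetime, date)):
        return obj.isoformat()
    if isinstance(obj, Decimal):
        return str(obj)  # a string avoids float rounding
    if isinstance(obj, set):
        # Sets have no order; sorting gives stable output
        # (assumes the elements are mutually comparable).
        return sorted(obj)
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


data = {
    "created": datetime(2024, 1, 4, 12, 30),
    "price": Decimal("19.99"),
    "tags": {"b", "a"},
}
pretty_json = json.dumps(data, default=to_serializable, indent=4)
print(pretty_json)
```

Note that the conversion is one-way: json.loads returns plain strings and lists, so the reading side has to convert them back itself if the original types are needed.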
Encoding Issues:
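By default (ensure_ascii=True), json.dumps escapes every non-ASCII character as a \uXXXX sequence, which is valid JSON but hard to read. Set the ensure_ascii parameter to False to keep such characters as-is, and make sure the result is written with a suitable encoding such as UTF-8.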
data = {"name": "José"}
pretty_json = json.dumps(data, ensure_ascii=False, indent=4)
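The difference between the two settings is easiest to see side by side. This is a small sketch using the same sample data. The variable names are illustrative.

```python
import json

data = {"name": "José"}

escaped = json.dumps(data)                       # default: ensure_ascii=True
readable = json.dumps(data, ensure_ascii=False)  # keep non-ASCII characters

print(escaped)   # {"name": "Jos\u00e9"}
print(readable)  # {"name": "José"}

# Both strings decode to the same data.
same = json.loads(escaped) == json.loads(readable)
print(same)  # True
```

Both forms are valid JSON, so the choice mainly affects human readability and file size.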
Error handling
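Be aware that json.dumps does not catch errors for you: unsupported types raise a TypeError, and exceptions raised inside your own code propagate to the caller. If you specify a custom serialization function with the default parameter, make sure it handles every type it is meant to support and raises a TypeError for everything else.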
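import json
class SomeCustomType:  # placeholder type for illustration
    def __init__(self, value):
        self.value = value
def custom_serializer(obj):
    if isinstance(obj, SomeCustomType):
        # Handle serialization for the custom type
        return {"value": obj.value}
    # Anything else is unsupported: raise instead of silently returning None
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")
some_custom_object = SomeCustomType(42)
data = {"key": "value", "custom_obj": some_custom_object}
pretty_json = json.dumps(data, default=custom_serializer, indent=4)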
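On the calling side, it can help to catch the two exception types json.dumps typically raises and turn them into a readable message. The helper below is a hypothetical sketch (safe_pretty is not part of the json module) and assumes that returning an error string is acceptable for the application.

```python
import json


def safe_pretty(obj, **kwargs):
    """Return (json_text, None) on success or (None, error_message) on failure."""
    try:
        return json.dumps(obj, indent=4, **kwargs), None
    except (TypeError, ValueError) as e:
        # TypeError: unsupported type
        # ValueError: circular reference, or NaN rejected by allow_nan=False
        return None, f"{type(e).__name__}: {e}"


ok_text, ok_error = safe_pretty({"key": "value"})
bad_text, bad_error = safe_pretty({"key": {1, 2, 3}})  # a set is not serializable

print(ok_text)
print(bad_error)
```

Catching only TypeError and ValueError keeps unrelated bugs visible instead of hiding them behind a generic except clause.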