Pretty Print JSON in Python

Posted in Python by Dirk - last update: Jan 04, 2024

Summary

To pretty print JSON in Python, use the dumps function of the json module and set the indent parameter to control the number of spaces for indentation. For better readability, you can also set sort_keys=True to sort the keys alphabetically.

What is JSON

JSON (JavaScript Object Notation) is a lightweight data interchange format that is easy for humans to read and write, and easy for machines to parse and generate. It is a text format that is language-independent, making it a popular choice for data exchange between different programming languages.

Use Cases of JSON in Python

Data Exchange: JSON is commonly used to exchange data between a web server and a web client (browser). Python can use JSON for communication in web applications.
Configuration Files: JSON is often used for configuration files in Python applications, providing a structured way to store and retrieve settings.
API Responses: Many APIs return data in JSON format, and Python can easily parse and work with this data using the json module.

Why Prettify JSON

Prettifying JSON makes it more readable and visually organized, especially when dealing with complex nested structures. The output represents the original JSON data with indentation and sorted keys, making it easier to read and understand the structure. It is useful for debugging, understanding the structure of data, and sharing information with others.

Methods to Pretty Print JSON in Python

Using json.dumps:

To convert a Python object to a pretty-printed JSON string, use the json.dumps() function:

json.dumps(obj, *, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, cls=None, indent=None, separators=None, default=None, sort_keys=False, **kw)

The json.dumps method converts a Python object to a JSON-formatted string. Most important parameters:

The indent parameter specifies the number of spaces to use for each indentation level (a string such as "\t" is also accepted). If None (default), the output is the most compact, single-line representation.
The sort_keys parameter, when set to True, sorts the keys alphabetically (default: False).
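
As a small sketch of these two parameters (the sample dictionary is made up for illustration), the following compares the default output with a few indent and sort_keys settings:

```python
import json

# Hypothetical sample data, used only for illustration
data = {"b": 1, "a": [1, 2]}

# Default: compact, single-line output in insertion order
compact = json.dumps(data)

# indent=2 puts every member and array element on its own line
two_spaces = json.dumps(data, indent=2)

# indent may also be a string, for example a tab character
tabbed = json.dumps(data, indent="\t")

# sort_keys=True orders the keys alphabetically
sorted_keys = json.dumps(data, indent=2, sort_keys=True)

print(compact)
print(two_spaces)
print(sorted_keys)
```

Note that sort_keys only changes the order of the keys; the values themselves are left untouched.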

Other parameters

skipkeys: If True, dict keys that are not basic types (str, int, float, bool, None) are skipped during serialization instead of raising a TypeError. Default is False.
ensure_ascii: If True (default), the output is guaranteed to contain only ASCII characters, and non-ASCII characters are written as \uXXXX escape sequences. If False, non-ASCII characters are written as they are.
check_circular: If True (default), containers are checked for circular references while serializing. If False, the check is skipped, and a circular reference results in a RecursionError (or worse).
allow_nan: If True (default), NaN, Infinity and -Infinity are serialized as their JavaScript equivalents (NaN, Infinity, -Infinity), which is not strictly valid JSON. If False, a ValueError is raised when such a value is encountered.
cls: If specified, this should be a subclass of json.JSONEncoder. It allows customization of the JSON serialization for non-basic types.
separators: An (item_separator, key_separator) tuple. Default is (', ', ': ') if indent is None and (',', ': ') otherwise. Use (',', ':') for the most compact output.
default: If specified, it should be a function that gets called for objects that are not serializable. It should return a serializable version of the object or raise a TypeError.
kw: Additional keyword arguments. These are passed through to the underlying json.JSONEncoder class.
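
To make some of these parameters more concrete, here is a minimal sketch of skipkeys, separators and allow_nan. The sample values and variable names are assumptions chosen only for illustration:

```python
import json

# Hypothetical sample data with a tuple key, which JSON cannot represent
with_tuple_key = {"name": "John", ("x", "y"): "tuple key"}

# skipkeys=True drops the unsupported key instead of raising TypeError
skipped = json.dumps(with_tuple_key, skipkeys=True)
print(skipped)

# separators=(",", ":") removes the whitespace after commas and colons
minified = json.dumps({"a": 1, "b": [1, 2]}, separators=(",", ":"))
print(minified)

# With the default allow_nan=True, NaN is written as the bare token NaN
nan_default = json.dumps({"score": float("nan")})
print(nan_default)

# allow_nan=False rejects NaN and Infinity with a ValueError
try:
    json.dumps({"score": float("nan")}, allow_nan=False)
    nan_rejected = False
except ValueError as exc:
    nan_rejected = True
    print(f"Rejected: {exc}")
```

If the output must be consumed by strict JSON parsers, allow_nan=False is the safer choice, because NaN and Infinity are not part of the JSON standard.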

import json

data = {"name": "John", "age": 30, "city": "New York"}
pretty_json = json.dumps(data, indent=4, sort_keys=True)
print(pretty_json)

Output:

{
    "age": 30,
    "city": "New York",
    "name": "John"
}

The example above is very simple, but it works on more complex objects as well:

data = {
    "person": {
        "name": "Alice",
        "age": 28,
        "address": {
            "city": "Wonderland",
            "zipcode": "12345"
        },
        "contacts": [
            {"type": "email", "value": "[email protected]"},
            {"type": "phone", "value": "+123456789"}
        ]
    },
    "projects": [
        {"title": "Project A", "status": "ongoing", "members": ["Alice", "Bob", "Charlie"]},
        {"title": "Project B", "status": "completed", "members": ["Alice", "David"]}
    ]
}

pretty_json = json.dumps(data, indent=4)
print(pretty_json)

Output:

{
    "person": {
        "name": "Alice",
        "age": 28,
        "address": {
            "city": "Wonderland",
            "zipcode": "12345"
        },
        "contacts": [
            {
                "type": "email",
                "value": "[email protected]"
            },
            {
                "type": "phone",
                "value": "+123456789"
            }
        ]
    },
    "projects": [
        {
            "title": "Project A",
            "status": "ongoing",
            "members": [
                "Alice",
                "Bob",
                "Charlie"
            ]
        },
        {
            "title": "Project B",
            "status": "completed",
            "members": [
                "Alice",
                "David"
            ]
        }
    ]
}

In this example, the JSON structure includes nested objects, arrays, and various data types. Pretty printing helps to visually organize the data, making it easier to navigate and comprehend the hierarchical relationships within the JSON structure.
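
Often the data arrives as a compact JSON string rather than as a Python object, for example in an API response. A minimal sketch for that case, assuming a made-up input string, is to parse it with json.loads and serialize it again with indentation:

```python
import json

# Hypothetical compact JSON text, e.g. as received from an API
raw = '{"title":"Project A","members":["Alice","Bob"],"status":"ongoing"}'

# Parse the text into Python objects, then serialize it again with indentation
parsed = json.loads(raw)
pretty = json.dumps(parsed, indent=4)
print(pretty)
```

The round trip does not change the data: parsing the pretty-printed string yields the same Python object as parsing the original.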

Using json.dump with a File:

The json.dump function works like json.dumps and accepts the same formatting parameters, but it writes the JSON directly to a file object instead of returning a string:

import json

data = {"name": "John", "age": 30, "city": "New York"}
with open("output.json", "w") as file:
    json.dump(data, file, indent=4, sort_keys=True)
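
To check what was written, the file can be read back with json.load. The following is a sketch under a few assumptions: it writes into a temporary directory so that nothing is left behind, and it opens the file with an explicit UTF-8 encoding instead of relying on the platform default.

```python
import json
import os
import tempfile

data = {"name": "John", "age": 30, "city": "New York"}

# Write to a temporary directory so the sketch leaves no files behind
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "output.json")

    # An explicit encoding avoids depending on the platform default
    with open(path, "w", encoding="utf-8") as file:
        json.dump(data, file, indent=4, sort_keys=True)

    # Read the raw text back to inspect the formatting
    with open(path, encoding="utf-8") as file:
        text = file.read()

    # json.load parses the file content back into Python objects
    with open(path, encoding="utf-8") as file:
        restored = json.load(file)

print(text)
print(restored == data)
```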

Potential issues

When using json.dumps (or json.dump) in Python, there are several potential issues that you may encounter:

Circular References:

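If your data contains circular references (objects referencing each other in a loop), json.dumps cannot serialize it. For dictionaries and lists that contain themselves, it raises a ValueError ("Circular reference detected"); for custom objects it raises a TypeError, because they are not serializable at all. To handle this, you can use the cls or default parameter to provide custom serialization that breaks the loop.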

Short:

import json

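class CustomEncoder(json.JSONEncoder):
    def default(self, obj):
        # Called only for objects the json module cannot serialize itself,
        # e.g. custom objects with circular references.
        # As a simple fallback, represent them by their string form.
        return str(obj)

some_object = object()  # placeholder for any non-serializable object
data = {"key": "value", "circular_ref": some_object}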

pretty_json = json.dumps(data, cls=CustomEncoder, indent=4)

An example:

Circular references occur when there is a loop in the object references. For instance, Object A references Object B, and Object B references Object A, creating a loop. This can cause issues when serializing the objects to JSON using json.dumps because it expects a tree-like structure, and a loop violates this expectation.

Here’s an example to illustrate the issue and how to handle it:

import json

class Person:
    def __init__(self, name):
        self.name = name
        self.friend = None  # Circular reference

# Creating objects with a circular reference
alice = Person("Alice")
bob = Person("Bob")
alice.friend = bob
bob.friend = alice

# Attempting to serialize the objects with json.dumps without handling circular references
try:
    json_data = json.dumps({"alice": alice, "bob": bob}, indent=4)
except TypeError as e:
    print(f"Error: {e}")

In this example, we have two Person objects, Alice and Bob, and each has a friend attribute referring to the other, creating a circular reference. If we try to serialize these objects directly with json.dumps, it raises a TypeError, because the json module does not know how to serialize Person instances:

Error: Object of type Person is not JSON serializable

To handle this circular reference issue, we can provide custom serialization logic. We'll create a custom encoder class that extends json.JSONEncoder, override its default method to handle the serialization of our custom objects, and pass the class to json.dumps using the cls parameter:

class PersonEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, Person):
            # Return a dictionary representation of the Person object
            return {"name": obj.name, "friend": obj.friend.name if obj.friend else None}
        return super().default(obj)

# Serialize objects using the custom encoder
json_data = json.dumps({"alice": alice, "bob": bob}, cls=PersonEncoder, indent=4)
print(json_data)

In this example, the PersonEncoder class defines a default method that checks if the object is an instance of the Person class. If it is, it returns a dictionary representation of the Person object, handling the circular reference by representing the friend attribute as the name of the friend. The super().default(obj) call is used to handle other types not explicitly handled in the custom encoder.

This way, we can successfully serialize objects with circular references to JSON by providing a custom serialization strategy.
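
For comparison, here is a sketch of a circular reference built only from plain dictionaries (hypothetical data). In this case the json module can serialize every value, so the loop itself is what fails, and the error is a ValueError rather than a TypeError:

```python
import json

# Two plain dictionaries that reference each other (hypothetical data)
alice = {"name": "Alice"}
bob = {"name": "Bob", "friend": alice}
alice["friend"] = bob

# With the default check_circular=True, the cycle is detected and reported
try:
    json.dumps(alice, indent=4)
    cycle_error = None
except ValueError as exc:
    cycle_error = exc
    print(f"Error: {exc}")

# One way to break the cycle: replace the reference with the friend's name
flat = {"name": alice["name"], "friend": alice["friend"]["name"]}
print(json.dumps(flat, indent=4))
```

Replacing the reference with an identifier, here the name, is the same strategy the PersonEncoder above uses.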

Non-serializable Types

Some Python data types are not serializable to JSON by default, for example datetime objects. You can handle this by passing a custom serialization function via the default parameter:

import json
from datetime import datetime

data = {"timestamp": datetime.now()}

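def serialize_datetime(obj):
    # Convert datetime objects to ISO 8601 strings
    if isinstance(obj, datetime):
        return obj.isoformat()
    # Fail loudly for any other unsupported type instead of returning None
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

pretty_json = json.dumps(data, default=serialize_datetime, indent=4)
print(pretty_json)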

Encoding Issues:

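By default (ensure_ascii=True), json.dumps escapes non-ASCII characters as \uXXXX sequences, which is valid JSON but harder to read. Set ensure_ascii=False to keep the characters as they are. When you write the result to a file, open the file with an explicit encoding such as UTF-8.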

import json

data = {"name": "José"}

pretty_json = json.dumps(data, ensure_ascii=False, indent=4)
print(pretty_json)
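
The difference between the two settings can be seen side by side in this short sketch (the variable names are chosen for illustration):

```python
import json

data = {"name": "José"}

# Default (ensure_ascii=True): non-ASCII characters become \uXXXX escapes
escaped = json.dumps(data)

# ensure_ascii=False keeps the characters as they are
readable = json.dumps(data, ensure_ascii=False)

print(escaped)
print(readable)

# Both strings decode to the same Python object
print(json.loads(escaped) == json.loads(readable))
```

Both variants are valid JSON and carry the same data; the choice only affects how readable the raw text is.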

Error handling

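Be aware that json.dumps does not handle errors for you. Exceptions raised inside a custom serialization function passed via the default parameter propagate to the caller, and a function that silently returns None for an unknown type produces null in the output. Make sure the function handles every type you expect and raises a TypeError for everything else.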

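import json

class Point:  # hypothetical custom type, used only for illustration
    def __init__(self, x, y):
        self.x, self.y = x, y

def custom_serializer(obj):
    if isinstance(obj, Point):
        # Handle serialization for the custom type
        return {"x": obj.x, "y": obj.y}
    # Anything else is unsupported: raise instead of returning None
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

data = {"key": "value", "custom_obj": Point(1, 2)}

pretty_json = json.dumps(data, default=custom_serializer, indent=4)
print(pretty_json)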