How to Convert CSV to JSON in Python: Flat, Nested & NDJSON

Convert CSV to JSON Python Tutorial

CSV (Comma-Separated Values) is a widely used format for tabular data. It’s lightweight, easy to generate, and common in reports, logs, exports, and data feeds. However, modern web applications, APIs, and NoSQL databases prefer JSON for its hierarchical structure, flexibility, and compatibility with JavaScript.

Converting CSV to JSON in Python is a practical skill for developers who need to:

  • Prepare CSV data for APIs and web services
  • Migrate CSV exports into NoSQL databases like MongoDB
  • Transform flat CSV tables into nested JSON objects
  • Enable data exchange between systems that require hierarchical formats

This step-by-step tutorial shows you how to convert CSV files to JSON in Python, including flat JSON, nested JSON, JSON with grouped data, and JSON Lines (NDJSON). By the end, you’ll be able to transform raw CSV datasets into well-structured JSON ready for APIs, applications, or data pipelines.

Table of Contents

Why Convert CSV to JSON?

CSV files are lightweight and tabular, but they lack hierarchy. JSON allows structured, nested data ideal for APIs and applications. Converting CSV to JSON enables:

  • API Integration: Most APIs prefer JSON over CSV
  • Flexible Data Structures: JSON supports nested objects
  • Web Development: JSON works natively with JavaScript
  • Database Migration: NoSQL and cloud databases often require JSON
  • Automation: Python scripts can process JSON efficiently

Python CSV to JSON Converter – Installation

To convert CSV files to JSON in Python, this tutorial uses Spire.XLS for Python to read CSV files and Python’s built-in json module to handle JSON conversion.

Why Spire.XLS?

It simplifies working with CSV files by allowing you to:

  • Load CSV files into a workbook structure for easy access to rows and columns
  • Extract and manipulate data efficiently, cell by cell
  • Convert CSV to JSON in flat, nested, or NDJSON formats
  • Export CSV to Excel, PDF, and other formats if needed

Install Spire.XLS

You can install the library directly from PyPI using pip:

pip install spire.xls

If you need detailed guidance on the installation, refer to this tutorial: How to Install Spire.XLS for Python on Windows.

Once installed, you’re ready to convert CSV data into different JSON formats.

Convert CSV to Flat JSON in Python

Converting a CSV file to flat JSON turns each row into a separate JSON object and uses the first row as keys, making the data organized and easy to work with.

Steps to Convert CSV to Flat JSON

  • Load the CSV file into a workbook using Workbook.LoadFromFile.
  • Select the worksheet.
  • Extract headers from the first row.
  • Iterate through each subsequent row to map values to headers.
  • Append each row dictionary to a list.
  • Write the list to a JSON file using json.dump.

Code Example

from spire.xls import *
import json

# Load the CSV file into a workbook object
workbook = Workbook()
workbook.LoadFromFile("employee.csv", ",")

# Select the desired worksheet
sheet = workbook.Worksheets[0]

# Extract headers from the first row
headers = [sheet.Range[1, j].Text for j in range(1, sheet.LastColumn + 1)]

# Map the subsequent CSV rows to JSON objects
data = []
for i in range(2, sheet.LastRow + 1):
    row = {headers[j-1]: sheet.Range[i, j].Text for j in range(1, sheet.LastColumn + 1)}
    data.append(row)

# Write JSON to file
with open("output_flat.json", "w", encoding="utf-8") as f:
    json.dump(data, f, indent=4, ensure_ascii=False)

# Clean up resources
workbook.Dispose()

Output JSON

Python CSV to Flat JSON example

Convert CSV to Nested JSON in Python

When a single CSV row contains related columns, you can combine these columns into nested JSON objects. For example, merging the Street and City columns into an Address object. Each CSV row produces one JSON object, which can include one or more nested dictionaries. This approach is ideal for scenarios requiring hierarchical data within a single record, such as API responses or application configurations.

Steps to Convert CSV to Nested JSON

  • Load the CSV file and select the worksheet.
  • Decide which columns should form a nested object (e.g., street and city).
  • Iterate over rows and construct each JSON object with a sub-object for nested fields.
  • Append each nested object to a list.
  • Write the list to a JSON file with json.dump.

Code Example

from spire.xls import *
import json

# Create a Workbook instance and load the CSV file (using comma as the delimiter)
workbook = Workbook()
workbook.LoadFromFile("data.csv", ",")

# Get the first worksheet from the workbook
sheet = workbook.Worksheets[0]

# List to store the converted JSON data
data = []

# Loop through rows starting from the second row (assuming the first row contains headers)
for i in range(2, sheet.LastRow + 1):
    # Map each row into a JSON object, including a nested "Address" object
    row = {
        "ID": sheet.Range[i, 1].Text,         # Column 1: ID
        "Name": sheet.Range[i, 2].Text,       # Column 2: Name
        "Address": {                          # Nested object for address
            "Street": sheet.Range[i, 3].Text, # Column 3: Street
            "City": sheet.Range[i, 4].Text    # Column 4: City
        }
    }
    # Add the JSON object to the list
    data.append(row)

# Write the JSON data to a file with indentation for readability
with open("output_nested.json", "w", encoding="utf-8") as f:
    json.dump(data, f, indent=4, ensure_ascii=False)

# Release resources used by the workbook
workbook.Dispose()

Output Nested JSON

Python CSV to Nested JSON example

Convert CSV to JSON with Grouped Data

When multiple CSV rows belong to the same parent entity, you can group these rows under a single parent object. For example, an order with multiple items can store all items in an items array under one order object. Each parent object has a unique key (like order_id), and its child rows are aggregated into an array. This method is useful for e-commerce orders, data pipelines, or any scenario requiring grouped hierarchical data across multiple rows.

Steps to Convert CSV to JSON with Grouped Data

  • Use defaultdict to group rows by a parent key (order_id).
  • Iterate rows and append child items to the parent object.
  • Convert the grouped dictionary to a list of objects.
  • Write the JSON file.

Code Example

from collections import defaultdict
from spire.xls import *
import json

# Create a Workbook instance and load the CSV file (comma-separated)
workbook = Workbook()
workbook.LoadFromFile("orders.csv", ",")

# Get the first worksheet from the workbook
sheet = workbook.Worksheets[0]

# Use defaultdict to store grouped data
# Each order_id maps to a dictionary with customer name and a list of items
data = defaultdict(lambda: {"customer": "", "items": []})

# Loop through rows starting from the second row (skip header row)
for i in range(2, sheet.LastRow + 1):
    order_id = sheet.Range[i, 1].Text      # Column 1: Order ID
    customer = sheet.Range[i, 2].Text      # Column 2: Customer
    item = sheet.Range[i, 3].Text          # Column 3: Item
    
    # Assign customer name (same for all rows with the same order_id)
    data[order_id]["customer"] = customer
    # Append item to the order's item list
    data[order_id]["items"].append(item)

# Convert the grouped dictionary into a list of objects
# Each object contains order_id, customer, and items
result = [{"order_id": oid, **details} for oid, details in data.items()]

# Write the grouped data to a JSON file with indentation for readability
with open("output_grouped.json", "w", encoding="utf-8") as f:
    json.dump(result, f, indent=4, ensure_ascii=False)

# Release resources used by the workbook
workbook.Dispose()

Output JSON with Grouped Data

Python CSV to Grouped JSON example

If you're also interested in saving JSON back to CSV, follow our guide on converting JSON to CSV in Python.

Convert CSV to JSON Lines (NDJSON) in Python

JSON Lines (also called NDJSON – Newline Delimited JSON) is a format where each line is a separate JSON object. It is ideal for large datasets, streaming, and big data pipelines.

Why use NDJSON?

  • Streaming-friendly: Process one record at a time without loading the entire file into memory.
  • Big data compatibility: Tools like Elasticsearch, Logstash, and Hadoop natively support NDJSON.
  • Error isolation: If one line is corrupted, the rest of the file remains valid.

Code Example

from spire.xls import *
import json

# Create a Workbook instance and load the CSV file (comma-separated)
workbook = Workbook()
workbook.LoadFromFile("employee.csv", ",")

# Get the first worksheet from the workbook
sheet = workbook.Worksheets[0]

# Extract headers from the first row to use as JSON keys
headers = [sheet.Range[1, j].Text for j in range(1, sheet.LastColumn + 1)]

# Open a file to write JSON Lines (NDJSON) format
with open("output.ndjson", "w", encoding="utf-8") as f:
    # Loop through each row in the worksheet, starting from the second row
    for i in range(2, sheet.LastRow + 1):
        # Map each header to its corresponding cell value for the current row
        row = {headers[j - 1]: sheet.Range[i, j].Text for j in range(1, sheet.LastColumn + 1)}
        
        # Write the JSON object to the file followed by a newline
        # Each line is a separate JSON object (NDJSON format)
        f.write(json.dumps(row, ensure_ascii=False) + "\n")

# Release resources used by the workbook
workbook.Dispose()

Output NDJSON

Python CSV to NDJSON example

Handle Large CSV Files to JSON Conversion

For large CSV files, it’s not always efficient to load everything into memory at once. With Spire.XLS, you can still load the file as a worksheet, but instead of appending everything into a list, you can process rows in chunks and write them to JSON incrementally. This technique minimizes memory usage, making it suitable for big CSV to JSON conversion in Python.

Code Example

from spire.xls import *
import json

# Create a Workbook instance and load the CSV file (comma-separated)
workbook = Workbook()
workbook.LoadFromFile("large.csv", ",")

# Get the first worksheet from the workbook
sheet = workbook.Worksheets[0]

# Open a JSON file for writing, with UTF-8 encoding
with open("large.json", "w", encoding="utf-8") as json_file:
    json_file.write("[\n")  # Start the JSON array
    first = True  # Flag to handle commas between JSON objects

    # Loop through each row in the worksheet, starting from the second row
    # (skip the header row)
    for i in range(2, sheet.LastRow + 1):
        # Create a dictionary mapping each header to its corresponding cell value
        row = {sheet.Range[1, j].Text: sheet.Range[i, j].Text
               for j in range(1, sheet.LastColumn + 1)}

        # Add a comma before the object if it is not the first row
        if not first:
            json_file.write(",\n")

        # Write the JSON object to the file
        json.dump(row, json_file, ensure_ascii=False)
        first = False  # After the first row, set the flag to False

    json_file.write("\n]")  # End the JSON array

# Release resources used by the workbook
workbook.Dispose()

Best Practices for CSV to JSON Conversion

When converting CSV to JSON in Python, follow these best practices can ensure data integrity and compatibility:

  • Always Use CSV headers as JSON keys.
  • Handle missing values with null or default values.
  • Normalize data types (convert numeric strings to integers or floats).
  • Use UTF-8 encoding for JSON files.
  • Stream large CSV files row by row to reduce memory usage.
  • Validate JSON structure after writing, especially for nested JSON.

Conclusion

Converting CSV to JSON in Python helps you work with data more efficiently and adapt it for modern applications. Using Python and libraries like Spire.XLS for Python, you can:

  • Convert flat CSV files into structured JSON objects.
  • Organize related CSV data into nested JSON structures.
  • Group multiple CSV rows into coherent JSON objects for analysis or APIs.
  • Create JSON Lines (NDJSON) for large datasets or streaming scenarios.
  • Process large CSV files efficiently without loading everything into memory.

These approaches let you handle CSV data in a way that fits your workflow, making it easier to prepare, share, and analyze data for APIs, applications, or big data pipelines.

FAQs

Q1: How do I convert CSV to JSON with headers in Python?

A1: If your CSV has headers, use the first row as keys and map subsequent rows to dictionaries. With Spire.XLS, you can access sheet.Range[1, j].Text for headers.

Q2: How do I convert CSV to nested JSON in Python?

A2: Identify related columns (e.g., Street and City) and group them into a sub-object when building JSON. See the Nested JSON example above.

Q3: What’s the best way to handle large CSV files when converting to JSON?

A3: Use a streaming approach where each row is processed and written to JSON immediately, instead of storing everything in memory.

Q4: Can Spire.XLS handle CSV files with different delimiters?

A4: Yes, when loading the CSV with Spire.XLS’s LoadFromFile method, specify the delimiter (e.g., "," or ";").

Q5: How to convert JSON back to CSV in Python?

A5: Use Python’s json module to read the JSON file into a list of dictionaries, then write it back to CSV using Spire.XLS for Python for advanced formatting and export options.

Q6: How to convert CSV to JSON Lines (NDJSON) in Python?

A6: JSON Lines (NDJSON) writes each JSON object on a separate line. Stream each CSV row to the output file line by line, which is memory-efficient and compatible with big data pipelines like Elasticsearch or Logstash.