Handling CSV and JSON Files in Python
Table of Contents
Part 1: Handling CSV Files
1.1 What is a CSV File
1.2 Why CSV Files are Used
1.3 How Python Handles CSV Files
1.4 Reading CSV Files
1.5 Skipping Headers
1.6 Reading CSV as Dictionaries
1.7 Writing to CSV Files
1.8 Custom Delimiters
1.9 Real-World Use Case
Part 2: Handling JSON Files
2.1 What is JSON
2.2 Why JSON
2.3 How Python Handles JSON
2.4 Reading JSON Files
2.5 Reading JSON Strings
2.6 Writing JSON Files
2.7 Pretty Printing JSON
2.8 Converting Between CSV and JSON
2.9 Common Errors
2.10 Real-Life Uses of JSON
Summary
- CSV vs JSON Functions Table
Part 1: Handling CSV Files
1.1 What is a CSV File
A CSV (Comma-Separated Values) file is one of the simplest formats to store structured data, typically in tabular form — that means, data arranged in rows and columns, like in a spreadsheet.
Each row in the file represents a single record, and each column within that row represents a specific attribute or field of the record.
The comma (,) acts as a delimiter — it separates one field from another in a single row.
However, the delimiter doesn’t have to be a comma — it could also be a semicolon (;), tab (\t), or even a pipe (|), depending on the system or application generating the CSV.
Example:
Name,Age,City
Alice,14,Delhi
Bob,15,Mumbai
Charlie,13,Chennai
This file has:
- 3 columns (Name, Age, City)
- 3 records (one per line after the header)
1.2 Why CSV Files are Used (Conceptual Explanation)
CSV is extremely popular for data exchange between applications because:
- Human-readable: It’s plain text — you can open it in any text editor.
- Lightweight: It has no markup or binary structure, only delimited text.
- Universal: Works across almost all software — Excel, Google Sheets, databases, etc.
- Easy to parse: Simple structure — no nested or complex data types.
However, the simplicity also means limitations:
- It can’t represent hierarchical (nested) data.
- It doesn't store data types (every value is read back as a string).
- It lacks metadata: there is no way to declare a schema, data types, or encoding inside the file.
That’s why for complex data, we use formats like JSON or XML.
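To make the "all values are strings" limitation concrete, here is a minimal sketch. It assumes the sample rows from section 1.1 and keeps them in an in-memory string (via io.StringIO) instead of a real file, purely so the snippet runs on its own.

```python
import csv
import io

# In-memory stand-in for a CSV file (same layout as the example in 1.1).
sample = "Name,Age,City\nAlice,14,Delhi\nBob,15,Mumbai\n"

reader = csv.reader(io.StringIO(sample))
next(reader)  # skip the header row

# Every field arrives as a string, so numeric columns must be converted by hand.
ages = [int(row[1]) for row in reader]

print(ages)       # [14, 15]
print(sum(ages))  # 29
```

Without the int() call, the ages would stay as '14' and '15', and adding them would not be possible.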
1.3 How Python Handles CSV Files
Python provides a built-in module named csv, which simplifies reading and writing CSV files.
The csv module converts the plain text file into Python data structures like lists or dictionaries, which can be easily manipulated.
When you read a CSV file:
- Python opens the file using open().
- csv.reader() goes through each line, splits it by the delimiter (default comma ,), and gives a list of field values.
- Each row becomes a list, or if you use DictReader, a dictionary.
When you write a CSV file:
- You create lists or dictionaries in Python.
- The csv.writer() or csv.DictWriter() writes them to the file line by line, separating fields with commas.
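The read and write flow described above can be sketched in one short round trip. This is only an illustration: it uses an in-memory buffer (io.StringIO) in place of a real file so that it runs without touching the disk.

```python
import csv
import io

# Writing: Python lists go in, delimited text comes out.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["Name", "Age", "City"])
writer.writerow(["Alice", 14, "Delhi"])
text = buffer.getvalue()

# Reading: delimited text goes in, lists of strings come out.
rows = list(csv.reader(io.StringIO(text)))
print(rows)  # [['Name', 'Age', 'City'], ['Alice', '14', 'Delhi']]

# With DictReader, each data row becomes a dictionary keyed by the header.
dict_rows = list(csv.DictReader(io.StringIO(text)))
print(dict_rows[0]["City"])  # Delhi
```

Notice that the integer 14 was written as text and comes back as the string '14'.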
1.4 Reading CSV Files (Detailed Example)
Let’s start by reading a CSV file using the csv.reader() function.
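import csv

# newline='' lets the csv module handle line endings itself
with open('students.csv', mode='r', newline='') as file:
    csv_reader = csv.reader(file)
    for row in csv_reader:
        print(row)  # each row is a list of strings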
Step-by-step explanation:
- open('students.csv', 'r') → Opens the file in read mode. Python treats the file as a stream of text.
- csv.reader(file) → Returns an iterator (not the full file at once). Each time you loop, it reads one line, splits it at commas, and returns a list.
- for row in csv_reader: → Iterates through all lines.
- print(row) → Prints each row as a list.
Output:
['Name', 'Age', 'City']
['Alice', '14', 'Delhi']
['Bob', '15', 'Mumbai']
['Charlie', '13', 'Chennai']
Behind the scenes:
- The csv module reads line by line (not all at once — memory efficient).
- It automatically handles quoted values like "New York, USA".
- You can customize the delimiter, quote character, etc.
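Here is a small sketch of the quoted-value handling mentioned above. The two-column sample is made up for illustration, and it is read from an in-memory string so the snippet is self-contained.

```python
import csv
import io

# One field contains a comma, so it is wrapped in double quotes in the text.
sample = 'Name,Location\nAlice,"New York, USA"\n'

rows = list(csv.reader(io.StringIO(sample)))
print(rows[1])  # ['Alice', 'New York, USA']

# A naive split on commas breaks the quoted field into two pieces.
naive = sample.splitlines()[1].split(",")
print(naive)  # ['Alice', '"New York', ' USA"']
```

This is the main reason to prefer the csv module over splitting lines by hand.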
1.5 Skipping Headers
If you want to ignore the first line (header row), use next() once before looping.
import csv

with open('students.csv', 'r', newline='') as file:
    reader = csv.reader(file)
    next(reader)  # Skips the header row
    for row in reader:
        print(row)
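One detail worth knowing: next(reader) raises StopIteration if the file is empty. A possible way around this, sketched below, is to pass a default value to next(). The helper name read_without_header is hypothetical, and the input is an in-memory string rather than a real file.

```python
import csv
import io

def read_without_header(text):
    """Return (header, data_rows); header is None when the input is empty."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)  # the default avoids StopIteration on empty input
    return header, list(reader)

header, rows = read_without_header("Name,Age,City\nAlice,14,Delhi\n")
print(header)  # ['Name', 'Age', 'City']
print(rows)    # [['Alice', '14', 'Delhi']]

empty_header, empty_rows = read_without_header("")
print(empty_header, empty_rows)  # None []
```

Keeping the header in a variable is also handy when you want to reuse the column names later.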
1.6 Reading CSV as Dictionaries
The DictReader class reads each row as a dictionary, mapping column headers to their corresponding values.
import csv

with open('students.csv', 'r', newline='') as file:
    reader = csv.DictReader(file)
    for row in reader:
        # Look up values by column name instead of by index
        print(row['Name'], row['City'])
Output:
Alice Delhi
Bob Mumbai
Charlie Chennai
Theory behind DictReader:
- The first line is treated as keys (column names).
- Each subsequent line becomes a dictionary where keys = header names.
This makes the data easier to work with — instead of remembering column indexes, you use names.
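As a short sketch of working with column names, the snippet below inspects the detected header through the fieldnames attribute and filters rows by name. It also shows the fieldnames parameter, which supplies keys when a file has no header row. The sample data is held in memory and is only illustrative.

```python
import csv
import io

sample = "Name,Age,City\nAlice,14,Delhi\nBob,15,Mumbai\n"

reader = csv.DictReader(io.StringIO(sample))
print(reader.fieldnames)  # ['Name', 'Age', 'City']

students = list(reader)
from_delhi = [s["Name"] for s in students if s["City"] == "Delhi"]
print(from_delhi)  # ['Alice']

# For a file without a header row, pass the column names yourself.
headerless = "Charlie,13,Chennai\n"
no_header_reader = csv.DictReader(
    io.StringIO(headerless), fieldnames=["Name", "Age", "City"]
)
first = next(no_header_reader)
print(first["City"])  # Chennai
```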
1.7 Writing to CSV Files
Using csv.writer()
import csv

data = [
    ['Name', 'Age', 'City'],
    ['Alice', 14, 'Delhi'],
    ['Bob', 15, 'Mumbai']
]

with open('output.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    writer.writerows(data)
How this works:
- The file is opened in write mode ('w').
- writerows() writes multiple rows at once.
- Each list in data becomes a single row.
- newline='' prevents extra blank lines in the output.
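The sketch below contrasts writerow() (one row) with writerows() (many rows), and also shows append mode ('a'), which is not covered above but is a standard open() mode that adds rows without erasing the file. It writes into a temporary directory so that it does not leave files behind; that setup is an assumption made only for this example.

```python
import csv
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "output.csv")

    # Write mode ('w') creates the file or replaces its contents.
    with open(path, "w", newline="") as file:
        writer = csv.writer(file)
        writer.writerow(["Name", "Age", "City"])    # a single row
        writer.writerows([["Alice", 14, "Delhi"]])  # a list of rows

    # Append mode ('a') adds rows after the existing ones.
    with open(path, "a", newline="") as file:
        csv.writer(file).writerow(["Bob", 15, "Mumbai"])

    # Read everything back to check the result.
    with open(path, newline="") as file:
        rows = list(csv.reader(file))

print(rows)
# [['Name', 'Age', 'City'], ['Alice', '14', 'Delhi'], ['Bob', '15', 'Mumbai']]
```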
Using csv.DictWriter()
import csv

students = [
    {'Name': 'Alice', 'Age': 14, 'City': 'Delhi'},
    {'Name': 'Bob', 'Age': 15, 'City': 'Mumbai'}
]

with open('students_output.csv', 'w', newline='') as file:
    fieldnames = ['Name', 'Age', 'City']
    writer = csv.DictWriter(file, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(students)
Explanation:
- DictWriter converts dictionaries to rows.
- writeheader() writes column names.
- writerows() writes each dictionary as a new row.
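Real data is often less tidy than the example above: some dictionaries miss a key, others carry extra ones. The following sketch shows one way to handle both cases with the restval and extrasaction parameters of DictWriter. The 'Grade' key and the "Unknown" placeholder are made up for illustration, and an in-memory buffer stands in for the output file.

```python
import csv
import io

students = [
    {"Name": "Alice", "Age": 14, "City": "Delhi"},
    {"Name": "Bob", "Age": 15},  # 'City' is missing
    {"Name": "Charlie", "Age": 13, "City": "Chennai", "Grade": "8"},  # extra key
]

buffer = io.StringIO()
writer = csv.DictWriter(
    buffer,
    fieldnames=["Name", "Age", "City"],
    restval="Unknown",      # written when a key is missing
    extrasaction="ignore",  # skip keys not in fieldnames (the default raises ValueError)
)
writer.writeheader()
writer.writerows(students)

lines = buffer.getvalue().splitlines()
print(lines)
# ['Name,Age,City', 'Alice,14,Delhi', 'Bob,15,Unknown', 'Charlie,13,Chennai']
```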
1.8 Custom Delimiters
A delimiter is a character that separates values (or fields) in a text file, most commonly used in CSV (Comma-Separated Values) files.
By default, the CSV module in Python uses a comma ( , ) as the delimiter between columns.
However, sometimes your data may already contain commas (for example, in addresses or names).
The csv module copes with this by wrapping such fields in quotes, but choosing a different delimiter, such as a semicolon ( ; ), tab ( \t ), or pipe ( | ), keeps the raw file easier to read and avoids problems with tools that do not understand quoting.
That’s where custom delimiters come in.
Explanation:
When you create a CSV writer object in Python using:
writer = csv.writer(file)
It assumes that fields are separated by commas.
But if your data contains commas, you can define a different delimiter like this:
writer = csv.writer(file, delimiter=';')
Here, each column in your CSV file will be separated by a semicolon ( ; ) instead of a comma.
Example:
import csv

data = [
    ['Name', 'Age', 'City'],
    ['Alice', '24', 'New York'],
    ['Bob', '30', 'Los Angeles']
]

with open('people.csv', 'w', newline='') as file:
    writer = csv.writer(file, delimiter=';')
    writer.writerows(data)
Output file (people.csv):
Name;Age;City
Alice;24;New York
Bob;30;Los Angeles
Now, the data values are separated using ; instead of ,.
Why It Is Used:
✅ Avoid confusion when data already contains commas (,) inside text.
✅ Match regional or software standards — e.g., in some countries (like parts of Europe), ; is the default CSV separator.
✅ Make data easier to parse for systems or programs expecting a specific delimiter.
✅ Handle special cases like tab-separated files (TSV files), where the delimiter is \t.
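The delimiter argument works the same way when reading. The sketch below reads semicolon-separated and tab-separated text, and also shows what the default comma writer does with a field that itself contains a comma. All sample data is in memory and invented for the example.

```python
import csv
import io

# Reading semicolon-separated text: pass the same delimiter to the reader.
semicolon_text = "Name;Age;City\nAlice;24;New York\n"
rows = list(csv.reader(io.StringIO(semicolon_text), delimiter=";"))
print(rows[1])  # ['Alice', '24', 'New York']

# Tab-separated values (TSV) only need a different delimiter.
tsv_text = "Name\tAge\nBob\t30\n"
tsv_rows = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
print(tsv_rows[1])  # ['Bob', '30']

# With the default comma delimiter, a field containing a comma gets quoted.
buffer = io.StringIO()
csv.writer(buffer).writerow(["Alice", "12 Main St, Springfield"])
quoted_line = buffer.getvalue().strip()
print(quoted_line)  # Alice,"12 Main St, Springfield"
```

If a file is written with one delimiter and read with another, each line comes back as a single field, so the reader and writer settings must match.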
1.9 Real-World Use Case
CSV files are often used for:
- Data analysis: Import/export data from Excel.
- Machine learning: Datasets like Titanic or Iris are in CSV format.
- Database exchange: Exporting tables from SQL.
Part 2: Handling JSON Files
2.1 What is JSON
JSON (JavaScript Object Notation) is a lightweight format used for data exchange between systems — especially web APIs.
It was derived from JavaScript syntax, but it’s completely language-independent, meaning any programming language (like Python, Java, or C++) can read/write JSON data.
A JSON file stores data as key-value pairs (like a Python dictionary) and supports:
- Objects → {} represent dictionaries
- Arrays → [] represent lists
- Strings, Numbers, Booleans, and null
Example (data.json):
{
"name": "Alice",
"age": 14,
"subjects": ["Math", "Science"],
"is_active": true
}
2.2 Why JSON?
- Human-readable and structured.
- Machine-friendly: Easy for computers to parse and generate.
- Cross-language: Used in APIs, web servers, and mobile apps.
- Supports nested data structures, unlike CSV.
2.3 How Python Handles JSON
Python provides a built-in module json that allows you to:
- Convert JSON strings/files → Python objects (load, loads)
- Convert Python objects → JSON strings/files (dump, dumps)
| JSON | Python Equivalent |
|---|---|
| Object {} | dict |
| Array [] | list |
| String | str |
| Number | int/float |
| true / false | True / False |
| null | None |
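The mapping in the table can be checked directly. This is a small sketch; the "gpa" and "nickname" keys are invented here just to cover the float and null cases.

```python
import json

text = (
    '{"name": "Alice", "age": 14, "gpa": 3.5, '
    '"subjects": ["Math"], "is_active": true, "nickname": null}'
)
data = json.loads(text)

# Record the Python type that each JSON value was converted to.
types = {key: type(value).__name__ for key, value in data.items()}
print(types)
# {'name': 'str', 'age': 'int', 'gpa': 'float', 'subjects': 'list',
#  'is_active': 'bool', 'nickname': 'NoneType'}
```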
2.4 Reading JSON Files
import json

with open('data.json', 'r') as file:
    data = json.load(file)

print(data)
print(data['name'])
Explanation:
- json.load(file) parses the file's JSON text into the matching Python object (here a dictionary, because the top-level JSON value is an object).
- Once converted, you can access keys as usual.
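Accessing nested values works just like with ordinary dictionaries and lists. The sketch below uses io.StringIO as a stand-in for the opened data.json file so that it runs without any file on disk. The "nickname" key is hypothetical and is used only to show a safe lookup.

```python
import io
import json

# In-memory stand-in for the data.json example above.
file = io.StringIO(
    '{"name": "Alice", "age": 14, "subjects": ["Math", "Science"], "is_active": true}'
)
data = json.load(file)

first_subject = data["subjects"][0]     # index into the nested list
nickname = data.get("nickname", "n/a")  # .get() avoids a KeyError for missing keys
print(first_subject, nickname)          # Math n/a

# If the top-level JSON value is an array, json.load() returns a list instead.
records = json.load(io.StringIO('[{"name": "Bob"}, {"name": "Charlie"}]'))
print(type(records).__name__, len(records))  # list 2
```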
2.5 Reading JSON Strings
import json
json_string = '{"name": "Bob", "age": 15}'
data = json.loads(json_string)
print(data['age'])
json.loads() converts a JSON string (not file) into a Python object.
2.6 Writing JSON Files
import json

student = {
    "name": "Alice",
    "age": 14,
    "subjects": ["Math", "Science"],
    "is_active": True
}

with open('student.json', 'w') as file:
    json.dump(student, file)
This writes the dictionary into a JSON file.
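To see what actually ends up in the file, you can use json.dumps(), which returns the same JSON text as a string instead of writing it. A minimal sketch:

```python
import json

student = {
    "name": "Alice",
    "age": 14,
    "subjects": ["Math", "Science"],
    "is_active": True,
}

text = json.dumps(student)
print(text)
# {"name": "Alice", "age": 14, "subjects": ["Math", "Science"], "is_active": true}

# Parsing the text again gives back an equal dictionary.
restored = json.loads(text)
print(restored == student)  # True
```

Note how Python's True is written as JSON's lowercase true.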
2.7 Pretty Printing JSON
import json

# student is the dictionary defined in section 2.6
with open('pretty.json', 'w') as file:
    json.dump(student, file, indent=4, sort_keys=True)
Explanation:
- indent=4 → adds newlines and spaces for readability.
- sort_keys=True → alphabetically sorts keys.
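The effect of these two options is easiest to see on screen. The sketch below uses json.dumps() with a shortened version of the student dictionary and prints the result.

```python
import json

student = {"name": "Alice", "age": 14, "is_active": True}

pretty = json.dumps(student, indent=4, sort_keys=True)
print(pretty)
# {
#     "age": 14,
#     "is_active": true,
#     "name": "Alice"
# }
```

The keys now appear in alphabetical order, each on its own line with four spaces of indentation.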
2.8 Converting Between CSV and JSON
CSV → JSON
import csv
import json

data = []
with open('students.csv', 'r') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        data.append(row)

with open('students.json', 'w') as jsonfile:
    json.dump(data, jsonfile, indent=4)
JSON → CSV
import csv
import json

with open('students.json', 'r') as jsonfile:
    data = json.load(jsonfile)

# Assumes data is a non-empty list of dictionaries that share the same keys
with open('students.csv', 'w', newline='') as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=data[0].keys())
    writer.writeheader()
    writer.writerows(data)
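One thing to watch in the CSV → JSON direction: because CSV values are read as strings, a column like Age ends up as "14" in the JSON output rather than the number 14. A possible fix, sketched below with in-memory data, is to convert known numeric columns before dumping. Which columns to convert is an assumption you have to make about your own data.

```python
import csv
import io
import json

csv_text = "Name,Age,City\nAlice,14,Delhi\nBob,15,Mumbai\n"

records = []
for row in csv.DictReader(io.StringIO(csv_text)):
    row["Age"] = int(row["Age"])  # convert the numeric column explicitly
    records.append(row)

json_text = json.dumps(records)
print(json_text)
# [{"Name": "Alice", "Age": 14, "City": "Delhi"}, {"Name": "Bob", "Age": 15, "City": "Mumbai"}]
```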
2.9 Common Errors
| Error | Meaning | Fix |
|---|---|---|
| JSONDecodeError | Invalid JSON format | Check commas, quotes, brackets |
| UnicodeDecodeError | Encoding mismatch | Use encoding='utf-8' in open() |
| Extra blank lines between CSV rows | File opened without newline='' | Pass newline='' to open() when writing |
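As a sketch of how the first error in the table can be handled in code, the snippet below wraps json.loads() in a try/except block. The helper name parse_json is hypothetical. json.JSONDecodeError and its lineno, colno, and msg attributes are part of the standard json module.

```python
import json

def parse_json(text):
    """Return the parsed object, or None if the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as error:
        print(f"Invalid JSON at line {error.lineno}, column {error.colno}: {error.msg}")
        return None

good = parse_json('{"name": "Alice"}')
bad_quotes = parse_json("{'name': 'Alice'}")      # single quotes are not valid JSON
trailing_comma = parse_json('{"name": "Alice",}')  # trailing commas are not allowed

print(good)            # {'name': 'Alice'}
print(bad_quotes)      # None
print(trailing_comma)  # None
```

Single quotes and trailing commas are two of the most common causes of JSONDecodeError, because both are legal in Python dictionaries but not in JSON.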
2.10 Real-Life Uses of JSON
- API responses and requests (e.g., from web servers)
- Configuration files (config.json)
- Data serialization in web and mobile apps
- Transferring data between programming languages
Summary:
| Task | CSV Function | JSON Function |
|---|---|---|
| Read file | csv.reader(), DictReader() | json.load() |
| Write file | csv.writer(), DictWriter() | json.dump() |
| Read string | — | json.loads() |
| Write string | — | json.dumps() |
