1. Introduction to CSV and JSON
What is CSV (Comma-Separated Values)?
- CSV files store tabular data in plain text, with each row represented on a new line. Columns are separated by a delimiter, typically a comma (,) but sometimes a tab (\t) or semicolon (;).
- CSV files are widely used in data analysis, spreadsheets, and lightweight data storage.
What is JSON (JavaScript Object Notation)?
- JSON is a lightweight format for data exchange between systems.
- It represents data as key-value pairs, arrays, and nested objects, making it ideal for hierarchical and structured data.
2. Why Use CSV and JSON?
Advantages of CSV:
- Simple to create and edit using any text editor or spreadsheet software.
- Lightweight and human-readable for small to medium-sized datasets.
- Supported by most data analysis and spreadsheet tools like Excel, Google Sheets, and Python.
Advantages of JSON:
- Widely used in web applications, where it is the most common format for API responses and is also popular for configuration files.
- Can handle complex, nested data structures.
- Easily convertible to Python dictionaries and lists for further manipulation.
3. Working with CSV Files in Python
Python’s built-in csv module simplifies the handling of CSV files for both reading and writing.
Opening a File for CSV Operations
To work with a CSV file, you must first open it using the open() function. The way you open a file depends on the operation you intend to perform:
- Read Mode ("r"): Opens the file for reading.
- Write Mode ("w"): Opens the file for writing. Any existing content will be overwritten.
- Append Mode ("a"): Opens the file for appending data at the end.
- Read+Write Mode ("r+"): Opens the file for both reading and writing.
Example
# Opening a file in read mode
file = open("data.csv", "r")
# Opening a file in write mode
file = open("output.csv", "w")
Using with Statement
The recommended way to open files is by using the with statement. This ensures the file is properly closed after the operation is completed.
with open("data.csv", "r") as file:
    # Perform file operations
    pass
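To make the closing behaviour concrete, here is a minimal sketch. It writes to a temporary scratch path (an assumption made so the example does not touch real data) and checks the file's closed attribute inside and after the with block.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "data.csv")

with open(path, "w", newline="") as file:
    file.write("Name,Age\nAlice,30\n")
    inside = file.closed   # False: the file is still open inside the block

after = file.closed        # True: the with statement closed it on exit

print(inside, after)       # False True
```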
Reading CSV Files
You can use csv.reader() to read a CSV file one row at a time.
Example:
import csv
with open("data.csv", "r") as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)  # Each row is returned as a list
Explanation:
- open("data.csv", "r"): Opens the file in read mode.
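- csv.reader(file): Returns a reader object that yields one row per iteration.
- Each row is a list of strings whose elements correspond to the columns. Numbers are not converted automatically.
- The csv module documentation also recommends passing newline="" to open() when reading, so that line breaks inside quoted fields are handled correctly.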
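Because every value comes back as a string, numeric columns usually need an explicit conversion. The sketch below uses an in-memory io.StringIO object with made-up sample rows in place of data.csv so that it runs on its own.

```python
import csv
import io

# In-memory stand-in for data.csv (sample contents are made up for illustration).
sample = io.StringIO("Name,Age,City\nAlice,30,New York\nBob,25,Los Angeles\n")

reader = csv.reader(sample)
header = next(reader)                  # ['Name', 'Age', 'City']
rows = list(reader)

print(rows[0])                         # ['Alice', '30', 'New York'] (note that '30' is a string)

ages = [int(row[1]) for row in rows]   # convert the Age column explicitly
print(sum(ages) / len(ages))           # 27.5
```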
Advanced Reading Options
Sometimes, CSV files use delimiters other than commas or include headers that you might want to skip.
Custom Delimiters:
with open("data.csv", "r") as file:
    reader = csv.reader(file, delimiter=";")  # Use semicolon as delimiter
    for row in reader:
        print(row)
Skipping Headers:
with open("data.csv", "r") as file:
    reader = csv.reader(file)
    next(reader)  # Skips the first row (header)
    for row in reader:
        print(row)
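The two options can be combined, and the header can be kept rather than thrown away. The following sketch assumes a semicolon-separated sample (made up here and held in memory) and pairs each header name with the matching value in every row.

```python
import csv
import io

# Made-up semicolon-separated sample standing in for a real file.
sample = io.StringIO("Name;Age\nAlice;30\nBob;25\n")

reader = csv.reader(sample, delimiter=";")
header = next(reader, None)   # returns None instead of raising if the file is empty

# Pair each column name with the value in the same position.
records = [dict(zip(header, row)) for row in reader]
print(records)
# [{'Name': 'Alice', 'Age': '30'}, {'Name': 'Bob', 'Age': '25'}]
```

Passing a default to next() avoids a StopIteration error on an empty file.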
Writing to CSV Files
The csv.writer() function returns a writer object that lets you write data to a CSV file row by row.
Basic Writing:
import csv
with open("output.csv", "w", newline="") as file:
    writer = csv.writer(file)
    writer.writerow(["Name", "Age", "City"])  # Writing a header
    writer.writerow(["Alice", 30, "New York"])  # Writing a data row
Explanation:
- newline="": Lets the csv module control line endings itself, which prevents the extra blank line between rows that otherwise appears on Windows.
- writer.writerow(): Writes a single row to the file.
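One detail worth seeing in practice is what happens when a value itself contains the delimiter. With its default settings the writer wraps such a field in double quotes, and the reader removes them again. The sketch below writes to an in-memory buffer rather than a file; the sample values are made up.

```python
import csv
import io

# In-memory stand-in for a file opened with newline="".
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
writer.writerow(["Name", "City"])
writer.writerow(["Alice", "New York, NY"])   # this value contains a comma

text = buffer.getvalue()
print(text)
# Name,City
# Alice,"New York, NY"

# Reading the text back restores the original value without the quotes.
read_back = list(csv.reader(io.StringIO(text, newline="")))
print(read_back[1])   # ['Alice', 'New York, NY']
```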
Writing Multiple Rows
Use writer.writerows() to write multiple rows at once.
Example:
data = [
    ["Name", "Age", "City"],
    ["Alice", 30, "New York"],
    ["Bob", 25, "Los Angeles"],
    ["Charlie", 35, "Chicago"]
]
with open("output.csv", "w", newline="") as file:
    writer = csv.writer(file)
    writer.writerows(data)
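Write mode replaces the file each time, so adding rows to an existing file calls for append mode ("a") instead. A minimal sketch, assuming a scratch file in a temporary directory so nothing real is overwritten:

```python
import csv
import os
import tempfile

# Hypothetical scratch location for the example.
path = os.path.join(tempfile.mkdtemp(), "output.csv")

# "w" creates the file (or overwrites it) and writes the initial rows.
with open(path, "w", newline="") as file:
    csv.writer(file).writerows([["Name", "Age", "City"], ["Alice", 30, "New York"]])

# "a" keeps the existing rows and adds the new one at the end.
with open(path, "a", newline="") as file:
    csv.writer(file).writerow(["Bob", 25, "Los Angeles"])

with open(path, "r", newline="") as file:
    rows = list(csv.reader(file))

print(len(rows))   # 3
print(rows[-1])    # ['Bob', '25', 'Los Angeles']
```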
Working with CSV Files as Dictionaries
Using csv.DictReader and csv.DictWriter allows you to handle CSV data as dictionaries.
Opening a File for Dictionary Operations
To work with CSV files, you must first open them using the open() function, specifying the appropriate mode:
- Read Mode ("r"): To read data from the file.
- Write Mode ("w"): To write data to the file.
- Append Mode ("a"): To add new data to an existing file.
Using with Statement
The recommended way to open a file is with the with statement, as it automatically handles closing the file.
Example
# Opening a file in read mode
with open("data.csv", "r") as file:
    # Perform operations
    pass
# Opening a file in write mode
with open("output.csv", "w") as file:
    # Perform operations
    pass
Reading as Dictionary:
import csv
with open("data.csv", "r") as file:
    reader = csv.DictReader(file)
    for row in reader:
        print(f"Name: {row['Name']}, Age: {row['Age']}")
Writing as Dictionary:
import csv
data = [
    {"Name": "Alice", "Age": 30, "City": "New York"},
    {"Name": "Bob", "Age": 25, "City": "Los Angeles"}
]
with open("output.csv", "w", newline="") as file:
    fieldnames = ["Name", "Age", "City"]
    writer = csv.DictWriter(file, fieldnames=fieldnames)
    writer.writeheader()  # Write the header row
    writer.writerows(data)  # Write the data rows
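Real data is not always as tidy as the example above. By default, csv.DictWriter fills a missing key with an empty string and raises ValueError for a key that is not in fieldnames. The sketch below shows the restval and extrasaction parameters that adjust this; the "N/A" placeholder, the Email key, and the in-memory buffer are choices made for illustration.

```python
import csv
import io

# In-memory stand-in for a file opened with newline="".
buffer = io.StringIO(newline="")
writer = csv.DictWriter(
    buffer,
    fieldnames=["Name", "Age", "City"],
    restval="N/A",            # value written when a row lacks a field
    extrasaction="ignore",    # silently drop keys not listed in fieldnames
)
writer.writeheader()
writer.writerow({"Name": "Alice", "Age": 30})   # City is missing
writer.writerow({"Name": "Bob", "Age": 25, "City": "Los Angeles", "Email": "bob@example.com"})

text = buffer.getvalue()
print(text)
# Name,Age,City
# Alice,30,N/A
# Bob,25,Los Angeles
```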
Advantages:
- Easier to work with well-defined column names.
- Reduces indexing errors.
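As an illustration of the second point, the sketch below reads the same made-up records from two in-memory samples whose columns are in a different order. The helper name read_ages is hypothetical. Because it looks values up by column name, it returns the same result for both layouts, whereas code using a fixed index such as row[1] would not.

```python
import csv
import io

def read_ages(text):
    """Return the Age column as integers, looked up by name rather than position."""
    return [int(row["Age"]) for row in csv.DictReader(io.StringIO(text))]

original = "Name,Age,City\nAlice,30,New York\nBob,25,Los Angeles\n"
reordered = "City,Name,Age\nNew York,Alice,30\nLos Angeles,Bob,25\n"

print(read_ages(original))    # [30, 25]
print(read_ages(reordered))   # [30, 25]
```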
4. Working with JSON Files in Python
Python’s json module is used for handling JSON data effectively.
Opening a File for JSON Operations
To work with JSON files, the open() function is used to specify the file's path and the mode of operation:
- Read Mode ("r"): To read JSON data from a file.
- Write Mode ("w"): To write JSON data to a file.
Using with Statement
The with statement ensures that files are properly closed after the operations, even if an error occurs.
Example
# Opening a file in read mode
with open("data.json", "r") as file:
    # Perform JSON reading operations
    pass
# Opening a file in write mode
with open("output.json", "w") as file:
    # Perform JSON writing operations
    pass
Reading JSON Files
The json.load() function reads JSON data from a file and converts it into the corresponding Python object: a JSON object becomes a dictionary and a JSON array becomes a list.
Example:
import json
with open("data.json", "r") as file:
    data = json.load(file)  # Parse JSON into a Python dictionary
print(data)
Accessing JSON Data:
print(data["name"]) # Accessing a key in the JSON object
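Two things commonly go wrong at this step: the expected key is absent, or the file does not contain valid JSON. A minimal sketch of handling both, using in-memory text in place of data.json (the sample contents and the email key are made up):

```python
import io
import json

# In-memory stand-in for the contents of data.json.
file = io.StringIO('{"name": "Alice", "age": 30}')
data = json.load(file)

# dict.get returns a default instead of raising KeyError for a missing key.
email = data.get("email", "unknown")
print(email)   # unknown

# Malformed input raises json.JSONDecodeError, which can be caught explicitly.
problem = None
try:
    json.load(io.StringIO("{'name': 'Alice'}"))   # single quotes are not valid JSON
except json.JSONDecodeError as error:
    problem = str(error)

print(problem is not None)   # True
```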
Writing to JSON Files
The json.dump() function writes a Python object to a file in JSON format.
Example:
data = {"name": "Alice", "age": 30, "city": "New York"}
with open("output.json", "w") as file:
    json.dump(data, file, indent=4)  # Pretty-prints JSON with 4 spaces
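Writing and then reading the same file is a quick way to see how Python values map onto JSON. In the sketch below, which uses a scratch file in a temporary directory and made-up fields, True and None are stored as true and null, and the tuple comes back as a list because JSON has only one array type.

```python
import json
import os
import tempfile

# Hypothetical scratch location for the example.
path = os.path.join(tempfile.mkdtemp(), "output.json")

data = {"name": "Alice", "active": True, "manager": None, "scores": (90, 85)}

with open(path, "w", encoding="utf-8") as file:
    json.dump(data, file, indent=4)

with open(path, "r", encoding="utf-8") as file:
    restored = json.load(file)

print(restored["scores"])   # [90, 85] (the tuple was written as a JSON array)
print(restored == data)     # False, because a list is not equal to a tuple
```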
Converting Python Objects to JSON Strings
Use json.dumps() to convert Python objects to JSON strings.
Example:
python_dict = {"name": "Bob", "skills": ["Python", "ML"]}
json_string = json.dumps(python_dict, indent=4)
print(json_string)
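The reverse operation is json.loads(), which parses a JSON string back into a Python object. The sketch below also shows the separators parameter of json.dumps(), which gives a compact string when indentation is not wanted.

```python
import json

python_dict = {"name": "Bob", "skills": ["Python", "ML"]}

# Compact form: no spaces after the commas and colons.
compact = json.dumps(python_dict, separators=(",", ":"))
print(compact)    # {"name":"Bob","skills":["Python","ML"]}

# json.loads turns a JSON string back into the equivalent Python object.
restored = json.loads(compact)
print(restored == python_dict)   # True
```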
Working with Nested JSON
Access nested keys using chained indexing.
Example:
data = {
    "person": {"name": "Alice", "details": {"age": 30, "city": "New York"}}
}
print(data["person"]["details"]["age"])  # Output: 30
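Chained indexing raises KeyError as soon as one level is missing. Where some records may be incomplete, chaining dict.get() with defaults is one way to stay safe. The people list and the "unknown" placeholder below are made up for illustration.

```python
data = {
    "people": [
        {"name": "Alice", "details": {"age": 30, "city": "New York"}},
        {"name": "Bob", "details": {"age": 25}},   # no city recorded
    ]
}

# Each .get supplies a fallback, so a missing level never raises KeyError.
cities = [
    person.get("details", {}).get("city", "unknown")
    for person in data["people"]
]
print(cities)   # ['New York', 'unknown']
```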
5. Key Differences Between CSV and JSON
Feature | CSV | JSON |
---|---|---|
Structure | Tabular (rows and columns) | Hierarchical (key-value pairs) |
Use Case | Data analysis, spreadsheets | APIs, nested configurations |
File Size | Smaller for simple datasets | Larger, because key names repeat in every record |
Human Readability | Simple for tabular data | More readable for complex data |
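The file size row can be checked with a quick sketch. It serializes the same two sample records both ways in memory and compares the lengths. The exact numbers depend on the data, so only the comparison is shown.

```python
import csv
import io
import json

records = [
    {"Name": "Alice", "Age": 30, "City": "New York"},
    {"Name": "Bob", "Age": 25, "City": "Los Angeles"},
]

# CSV: the column names appear once, in the header row.
buffer = io.StringIO(newline="")
writer = csv.DictWriter(buffer, fieldnames=["Name", "Age", "City"])
writer.writeheader()
writer.writerows(records)
csv_text = buffer.getvalue()

# JSON: the key names are repeated in every record.
json_text = json.dumps(records)

print(len(csv_text) < len(json_text))   # True
```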
Which Format to Choose?
CSV is better when:
- Your data is flat (rows and columns).
- You want smaller file sizes.
- You prioritize human readability and simplicity.
- Use case: Exporting tabular data like a spreadsheet.
JSON is better when:
- Your data has a hierarchical or nested structure.
- You need to store a variety of data types (e.g., arrays, booleans).
- You want to exchange data between systems or APIs.
- Use case: Storing structured data like a user profile or API response.
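Data often starts in one format and is needed in the other. A minimal sketch of converting CSV text to JSON with the two modules covered above, assuming made-up sample rows and an Age column that should become a number:

```python
import csv
import io
import json

csv_text = "Name,Age,City\nAlice,30,New York\nBob,25,Los Angeles\n"

# Read each CSV row as a dictionary keyed by the header names.
rows = list(csv.DictReader(io.StringIO(csv_text)))

# CSV values are always strings, so convert numeric columns before exporting.
for row in rows:
    row["Age"] = int(row["Age"])

json_text = json.dumps(rows, indent=4)
print(json_text)
```

Going the other way works only for flat data. Nested JSON has to be flattened into columns before it can be written as CSV.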
Key takeaways:
- CSV Files:
  - Best for simple, tabular data.
  - Use csv.reader, csv.writer, and their dictionary counterparts for efficient handling.
- JSON Files:
  - Ideal for structured, hierarchical data.
  - Use json.load, json.dump, and json.dumps for parsing and serialization.