Basic Python For ML December 12, 2024

1. Introduction to CSV and JSON

What is CSV (Comma-Separated Values)?

  • CSV files store tabular data in plain text, with each row represented on a new line. Columns are separated by a delimiter, typically a comma (,) but sometimes a tab (\t) or semicolon (;).
  • CSV files are widely used in data analysis, spreadsheets, and lightweight data storage.

What is JSON (JavaScript Object Notation)?

  • JSON is a lightweight format for data exchange between systems.
  • It represents data as key-value pairs, arrays, and nested objects, making it ideal for hierarchical and structured data.
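To make the contrast concrete, here is a minimal sketch that renders the same two records in both formats. The names and values are made-up sample data, and io.StringIO stands in for a real file.

```python
import csv
import io
import json

# Hypothetical sample records, used only for illustration
people = [
    {"name": "Alice", "age": 30},
    {"name": "Bob", "age": 25},
]

# CSV: one header line, then one line per record
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["name", "age"], lineterminator="\n")
writer.writeheader()
writer.writerows(people)
csv_text = buffer.getvalue()
print(csv_text)

# JSON: a list of objects, each carrying its own keys
json_text = json.dumps(people, indent=2)
print(json_text)
```

The CSV version states the column names once, while the JSON version repeats the keys in every object.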

2. Why Use CSV and JSON?

Advantages of CSV:

  • Simple to create and edit using any text editor or spreadsheet software.
  • Lightweight and human-readable for small to medium-sized datasets.
  • Supported by most data analysis and spreadsheet tools like Excel, Google Sheets, and Python.

Advantages of JSON:

  • Widely used in web applications: it is the most common format for web APIs and a frequent choice for configuration files.
  • Can handle complex, nested data structures.
  • Easily convertible to Python dictionaries and lists for further manipulation.

3. Working with CSV Files in Python

Python’s built-in csv module simplifies the handling of CSV files for both reading and writing.

Opening a File for CSV Operations

To work with a CSV file, you must first open it using the open() function. The way you open a file depends on the operation you intend to perform:

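  • Read Mode ("r"): Opens the file for reading. This is the default mode, and the file must already exist.
  • Write Mode ("w"): Opens the file for writing, creating it if it does not exist. Any existing content will be overwritten.
  • Append Mode ("a"): Opens the file for appending data at the end, creating it if it does not exist.
  • Read+Write Mode ("r+"): Opens an existing file for both reading and writing without truncating it.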

Example

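# Opening a file in read mode
file = open("data.csv", "r")
file.close()  # A file opened this way must be closed explicitly

# Opening a file in write mode (creates or overwrites output.csv)
file = open("output.csv", "w")
file.close()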

Using with Statement

The recommended way to open files is by using the with statement. This ensures the file is properly closed after the operation is completed.

with open("data.csv", "r") as file:
    # Perform file operations
    pass
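The following sketch shows how the "w" and "a" modes behave, and that a file is closed once its with block ends. It works inside a temporary directory so no real files are touched; the file name modes_demo.csv is a placeholder.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "modes_demo.csv")  # placeholder file name

    # "w" creates the file (or empties an existing one)
    with open(path, "w") as file:
        file.write("Name,Age\n")
    closed_after_with = file.closed  # the with block has already closed the file

    # "a" keeps the existing content and adds to the end
    with open(path, "a") as file:
        file.write("Alice,30\n")
    with open(path, "r") as file:
        after_append = file.read()

    # "w" again discards everything written so far
    with open(path, "w") as file:
        file.write("Name,City\n")
    with open(path, "r") as file:
        after_overwrite = file.read()

print(closed_after_with)
print(after_append)
print(after_overwrite)
```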

Reading CSV Files

You can use csv.reader() to read a CSV file row by row.

Example:

import csv  

with open("data.csv", "r") as file:  
    reader = csv.reader(file)  
    for row in reader:  
        print(row)  # Each row is returned as a list

Explanation:

  • open("data.csv", "r"): Opens the file in read mode.
  • csv.reader(file): Returns a reader object that yields one row at a time as you iterate over it.
  • Each row is a list of strings, where the elements correspond to the columns.

Advanced Reading Options

Sometimes, CSV files use delimiters other than commas or include headers that you might want to skip.

Custom Delimiters:

with open("data.csv", "r") as file:  
    reader = csv.reader(file, delimiter=";")  # Use semicolon as delimiter  
    for row in reader:  
        print(row)
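As a quick illustration of why the delimiter matters, this sketch parses the same tab-separated text twice: once with delimiter="\t" and once with the default comma. The sample text is hypothetical, and io.StringIO stands in for an opened file.

```python
import csv
import io

# Hypothetical tab-separated content; io.StringIO stands in for a real file
tsv_text = "Name\tAge\tCity\nAlice\t30\tNew York\n"

# Correct delimiter: each row is split into three columns
rows = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
print(rows)

# Default comma delimiter: no commas are found, so each row is a single column
unsplit_rows = list(csv.reader(io.StringIO(tsv_text)))
print(unsplit_rows)
```

A row that comes back as a one-element list is a common sign that the delimiter does not match the file.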

Skipping Headers:

with open("data.csv", "r") as file:  
    reader = csv.reader(file)  
    next(reader)  # Skips the first row (header)  
    for row in reader:  
        print(row)
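Instead of discarding the header, you can keep it and use it to locate columns. The sketch below assumes a column named Age exists in the sample text; next(reader, None) is used so that an empty input returns None rather than raising StopIteration.

```python
import csv
import io

# Hypothetical sample data; io.StringIO stands in for a real file
csv_text = "Name,Age,City\nAlice,30,New York\nBob,25,Los Angeles\n"

reader = csv.reader(io.StringIO(csv_text))
header = next(reader, None)  # None if the input has no rows at all

ages = []
if header is not None:
    age_index = header.index("Age")  # position of the Age column
    # Values arrive as strings, so convert them before doing arithmetic
    ages = [int(row[age_index]) for row in reader]

print(header)
print(ages)
```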

Writing to CSV Files

The csv.writer() function returns a writer object that lets you write data to a CSV file row by row.

Basic Writing:

import csv  

with open("output.csv", "w", newline="") as file:  
    writer = csv.writer(file)  
    writer.writerow(["Name", "Age", "City"])  # Writing a header  
    writer.writerow(["Alice", 30, "New York"])  # Writing a data row

Explanation:

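  • newline="": Lets the csv module control line endings, which prevents extra blank lines between rows (important on Windows). The csv documentation recommends it for files opened for reading as well.
  • writer.writerow(): Writes a single row to the file.

Writing Multiple Rows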

Use writer.writerows() to write multiple rows at once.

Example:

data = [  
    ["Name", "Age", "City"],  
    ["Alice", 30, "New York"],  
    ["Bob", 25, "Los Angeles"],  
    ["Charlie", 35, "Chicago"]  
]  

with open("output.csv", "w", newline="") as file:  
    writer = csv.writer(file)  
    writer.writerows(data)
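One reason to use csv.writer rather than joining strings with commas yourself is that it quotes fields containing the delimiter or quote characters. A small sketch with made-up rows, written to an in-memory buffer instead of a file:

```python
import csv
import io

rows = [
    ["Name", "Note"],
    ["Alice", "Likes Python, ML"],  # contains the delimiter
    ["Bob", 'Said "hello"'],        # contains quote characters
]

buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(rows)
written = buffer.getvalue()
print(written)

# Reading the text back recovers the original fields
read_back = list(csv.reader(io.StringIO(written)))
print(read_back == rows)
```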

Working with CSV Files as Dictionaries

Using csv.DictReader and csv.DictWriter allows you to handle CSV data as dictionaries.

Opening a File for Dictionary Operations

To work with CSV files, you must first open them using the open() function, specifying the appropriate mode:

  • Read Mode ("r"): To read data from the file.
  • Write Mode ("w"): To write data to the file.
  • Append Mode ("a"): To add new data to an existing file.

Using with Statement

The recommended way to open a file is with the with statement, as it automatically handles closing the file.

Example

# Opening a file in read mode
with open("data.csv", "r") as file:
    # Perform operations
    pass

# Opening a file in write mode
with open("output.csv", "w", newline="") as file:
    # Perform operations
    pass

Reading as Dictionary:

import csv  

with open("data.csv", "r") as file:  
    reader = csv.DictReader(file)  
    for row in reader:  
        print(f"Name: {row['Name']}, Age: {row['Age']}")
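Note that csv.DictReader takes its keys from the first row and returns every value as a string. The sketch below, using hypothetical sample data in an in-memory buffer, converts Age to an integer while building a list of records.

```python
import csv
import io

# Hypothetical sample data; io.StringIO stands in for a real file
csv_text = "Name,Age,City\nAlice,30,New York\nBob,25,Los Angeles\n"

reader = csv.DictReader(io.StringIO(csv_text))
print(reader.fieldnames)  # column names taken from the header row

people = []
for row in reader:
    # row["Age"] is the string "30", so convert it explicitly
    people.append({"name": row["Name"], "age": int(row["Age"])})

print(people)
```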

Writing as Dictionary:

import csv  

data = [  
    {"Name": "Alice", "Age": 30, "City": "New York"},  
    {"Name": "Bob", "Age": 25, "City": "Los Angeles"}  
]  

with open("output.csv", "w", newline="") as file:  
    fieldnames = ["Name", "Age", "City"]  
    writer = csv.DictWriter(file, fieldnames=fieldnames)  
    writer.writeheader()  # Write the header row  
    writer.writerows(data)  # Write the data rows

Advantages:

  • Easier to work with well-defined column names.
  • Reduces indexing errors.
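Append mode pairs naturally with csv.DictWriter when records arrive one at a time. The helper below, append_row, is a hypothetical name introduced for this sketch; it writes the header only when the file is new or empty, and runs inside a temporary directory.

```python
import csv
import os
import tempfile

fieldnames = ["Name", "Age", "City"]

def append_row(path, row):
    """Append one dictionary row, writing the header only for a new or empty file."""
    needs_header = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="") as file:
        writer = csv.DictWriter(file, fieldnames=fieldnames)
        if needs_header:
            writer.writeheader()
        writer.writerow(row)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "people.csv")  # placeholder file name
    append_row(path, {"Name": "Alice", "Age": 30, "City": "New York"})
    append_row(path, {"Name": "Bob", "Age": 25, "City": "Los Angeles"})

    with open(path, "r", newline="") as file:
        rows = list(csv.DictReader(file))

print(rows)
```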

4. Working with JSON Files in Python

Python’s built-in json module reads and writes JSON data.

Opening a File for JSON Operations

To work with JSON files, the open() function is used to specify the file's path and the mode of operation:

  • Read Mode ("r"): To read JSON data from a file.
  • Write Mode ("w"): To write JSON data to a file.

Using with Statement

The with statement ensures that files are properly closed after the operations, even if an error occurs.

Example

# Opening a file in read mode
with open("data.json", "r") as file:
    # Perform JSON reading operations
    pass

# Opening a file in write mode
with open("output.json", "w") as file:
    # Perform JSON writing operations
    pass
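The claim that the file is closed even if an error occurs can be checked directly. This sketch raises a simulated error inside the with block and then inspects file.closed; the file name and the error are placeholders.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.json")  # placeholder file name
    try:
        with open(path, "w") as file:
            file.write("{}")
            raise ValueError("simulated failure")  # error inside the with block
    except ValueError:
        pass
    closed_after_error = file.closed

print(closed_after_error)
```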

Reading JSON Files

The json.load() function reads JSON data from a file and converts it into a Python object: a dictionary for a JSON object, a list for a JSON array.

Example:

import json  

with open("data.json", "r") as file:  
    data = json.load(file)  # Parse JSON into a Python dictionary  
    print(data)

Accessing JSON Data:

print(data["name"])  # Accessing a key in the JSON object
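Real files are not always valid JSON, and keys are not always present. The sketch below wraps json.load in a hypothetical helper, load_json, that catches json.JSONDecodeError, and uses dict.get to read an optional key. io.StringIO stands in for an opened file.

```python
import io
import json

def load_json(file):
    """Return parsed JSON from a file object, or None if the text is invalid."""
    try:
        return json.load(file)
    except json.JSONDecodeError as error:
        print(f"Invalid JSON: {error.msg} (line {error.lineno}, column {error.colno})")
        return None

good = load_json(io.StringIO('{"name": "Alice", "age": 30}'))
bad = load_json(io.StringIO('{"name": "Alice", "age": }'))  # value is missing

print(good.get("name"))
print(good.get("email", "not provided"))  # default for a missing key
print(bad)
```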

Writing to JSON Files

The json.dump() function writes a Python object to a file in JSON format.

Example:

import json

data = {"name": "Alice", "age": 30, "city": "New York"}

with open("output.json", "w") as file:  
    json.dump(data, file, indent=4)  # Pretty-prints JSON with 4 spaces
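Writing and then reading back shows how Python types map to JSON. In this sketch (sample values are made up, and the file lives in a temporary directory), True becomes true, None becomes null, and a tuple comes back as a list.

```python
import json
import os
import tempfile

record = {"name": "Alice", "active": True, "nickname": None, "scores": (90, 85)}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "record.json")  # placeholder file name
    with open(path, "w") as file:
        json.dump(record, file, indent=4)

    with open(path, "r") as file:
        raw_text = file.read()      # the JSON text as stored on disk
        file.seek(0)                # rewind so json.load can read it again
        restored = json.load(file)

print(raw_text)
print(restored)
```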

Converting Python Objects to JSON Strings

Use json.dumps() to convert Python objects to JSON strings.

Example:

import json

python_dict = {"name": "Bob", "skills": ["Python", "ML"]}
json_string = json.dumps(python_dict, indent=4)  
print(json_string)
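The reverse direction uses json.loads(), which parses a JSON string into Python objects. The sketch also shows the separators argument of json.dumps() for compact output, which can be useful when size matters more than readability.

```python
import json

json_string = '{"name": "Bob", "skills": ["Python", "ML"]}'

# Parse a JSON string into Python objects
python_dict = json.loads(json_string)
print(type(python_dict).__name__)
print(python_dict["skills"][0])

# Serialize without the default spaces after "," and ":"
compact = json.dumps(python_dict, separators=(",", ":"))
print(compact)
```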

Working with Nested JSON

Access nested keys using chained indexing.

Example:

data = {  
    "person": {"name": "Alice", "details": {"age": 30, "city": "New York"}}  
}  

print(data["person"]["details"]["age"])  # Output: 30
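Chained indexing raises KeyError as soon as one level is missing. A minimal sketch of a safer lookup follows; get_nested is a hypothetical helper written for this example, not part of the json module.

```python
data = {
    "person": {"name": "Alice", "details": {"age": 30, "city": "New York"}}
}

def get_nested(obj, *keys, default=None):
    """Follow keys through nested dictionaries, returning default if any is missing."""
    for key in keys:
        if not isinstance(obj, dict) or key not in obj:
            return default
        obj = obj[key]
    return obj

print(get_nested(data, "person", "details", "age"))
print(get_nested(data, "person", "address", "zip", default="unknown"))
```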

 

5. Key Differences Between CSV and JSON

 

Feature           | CSV                         | JSON
Structure         | Tabular (rows and columns)  | Hierarchical (key-value pairs)
Use Case          | Data analysis, spreadsheets | APIs, nested configurations
File Size         | Smaller for simple datasets | Larger due to detailed structures
Human Readability | Simple for tabular data     | More readable for complex data

 

Which Format to Choose?

CSV is better when:

  • Your data is flat (rows and columns).
  • You want smaller file sizes.
  • You prioritize human readability and simplicity.
  • Use case: Exporting tabular data like a spreadsheet.

JSON is better when:

  • Your data has a hierarchical or nested structure.
  • You need to store a variety of data types (e.g., arrays, booleans).
  • You want to exchange data between systems or APIs.
  • Use case: Storing structured data like a user profile or API response.
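When nested JSON needs to end up in a spreadsheet, it has to be flattened into rows first. The sketch below assumes every profile has the same name and details keys; the sample data is made up, and io.StringIO stands in for an output file.

```python
import csv
import io
import json

# Hypothetical API-style response with one level of nesting
json_text = """
[
  {"name": "Alice", "details": {"age": 30, "city": "New York"}},
  {"name": "Bob", "details": {"age": 25, "city": "Los Angeles"}}
]
"""
profiles = json.loads(json_text)

# Flatten each nested object into a single-level dictionary
flat_rows = [
    {"name": p["name"], "age": p["details"]["age"], "city": p["details"]["city"]}
    for p in profiles
]

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["name", "age", "city"], lineterminator="\n")
writer.writeheader()
writer.writerows(flat_rows)
csv_text = buffer.getvalue()
print(csv_text)
```

Going the other way is simpler: a list of csv.DictReader rows can be passed to json.dumps(), keeping in mind that every value will be a string.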

Key takeaways:

  • CSV Files:
    • Best for simple, tabular data.
    • Use csv.reader, csv.writer, and their dictionary counterparts for efficient handling.
  • JSON Files:
    • Ideal for structured, hierarchical data.
    • Use json.load, json.dump, and json.dumps for parsing and serialization.

       


 

Purnima