File Handling
File handling connects your Python scripts to real data: CSV exports from Excel, JSON from APIs, text logs from systems. Business analysts read files to analyze data, write files to share results, and automate report generation. Master file I/O and you'll never copy-paste data by hand again.
Estimated reading time: 25–30 minutes
File Operations
- open() with 'r' → read existing files
- open() with 'w' → create/overwrite files
- open() with 'a' → append to existing files
- with statement → auto-close files (safer)
Great for: loading data, saving reports
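Of the modes listed above, append ('a') is the only one not demonstrated later in this chapter. A minimal sketch of using it for a running log file (the filename activity.log and the log lines are illustrative):

```python
# Append mode ("a") creates the file if it doesn't exist,
# otherwise it adds to the end without erasing earlier content
with open("activity.log", "a", encoding="utf-8") as f:
    f.write("2024-01-15 09:00 report generated\n")

# Re-opening in append mode preserves the earlier line
with open("activity.log", "a", encoding="utf-8") as f:
    f.write("2024-01-15 09:05 report emailed\n")

with open("activity.log", "r", encoding="utf-8") as f:
    print(f.read())
```

Contrast with 'w', which would truncate the file on every open; that's why 'a' is the right choice for logs that accumulate across runs.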
Common Formats
- CSV → tabular data (Excel exports, databases)
- JSON → structured data (APIs, configs)
- TXT → logs, notes, simple reports
- Encoding → use utf-8 for international characters
Great for: ETL pipelines, data integration
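The encoding bullet above matters in practice: without an explicit encoding, Python falls back to a platform default that varies between systems. A short sketch of a round-trip with non-ASCII text (the filename and names are illustrative):

```python
# Writing text containing accented and non-Latin characters
with open("names.txt", "w", encoding="utf-8") as f:
    f.write("José,München,北京\n")

# Reading it back with the same encoding round-trips cleanly;
# omitting encoding= can garble these characters on some platforms
with open("names.txt", "r", encoding="utf-8") as f:
    content = f.read()

print(content)
```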
Reading and Writing Text Files
Text files are the simplest format. Use the with statement to auto-close files.
# Writing a text file
with open("report.txt", "w") as f:
    f.write("Sales Report\n")
    f.write("Total: $125,000\n")

# Reading a text file
with open("report.txt", "r") as f:
    content = f.read()
    print(content)

# Reading line by line (memory efficient)
with open("report.txt", "r") as f:
    for line in f:
        print(line.strip())

Working with CSV Files
CSV is the standard format for tabular data. Python's csv module handles parsing and writing.
import csv

# Reading CSV
with open("sales.csv", "r") as f:
    reader = csv.DictReader(f)  # returns dicts with column names as keys
    for row in reader:
        print(row['name'], ":", row['amount'])

# Writing CSV
data = [
    {"name": "Ana", "amount": 1200},
    {"name": "Bob", "amount": 1500},
]
with open("output.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["name", "amount"])
    writer.writeheader()
    writer.writerows(data)

CSV Best Practices
- Use DictReader/DictWriter for named columns (clearer than lists)
- Always set newline="" when writing (prevents blank rows on Windows)
- Specify encoding="utf-8" for international characters
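Putting the three practices together, a sketch of a CSV round-trip that applies all of them (the filename clients.csv and the data are illustrative):

```python
import csv

rows = [
    {"name": "José", "amount": 1200},
    {"name": "Zoë", "amount": 950},
]

# newline="" prevents blank rows on Windows; utf-8 keeps accented names intact
with open("clients.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["name", "amount"])
    writer.writeheader()
    writer.writerows(rows)

# Reading back with the same settings; note that CSV values come back as strings
with open("clients.csv", "r", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        print(row["name"], row["amount"])
```

One detail worth remembering: everything read from CSV is a string, so numeric columns like amount need an explicit float() or int() before doing math on them.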
Working with JSON Files
JSON stores structured data (dicts, lists). Common for API responses and configs.
import json

# Reading JSON
with open("config.json", "r") as f:
    data = json.load(f)
    print(data["api_key"])

# Writing JSON
report = {
    "date": "2024-01-15",
    "sales": [
        {"rep": "Ana", "amount": 1200},
        {"rep": "Bob", "amount": 1500}
    ],
    "total": 2700
}
with open("report.json", "w") as f:
    json.dump(report, f, indent=2)  # indent=2 for readability

JSON Tips
- Use indent=2 for human-readable output
- JSON keys must be strings (not ints or tuples)
- Use json.dumps() to convert to a string without writing to a file
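The last two tips can be shown in isolation. A short sketch of json.dumps() and its inverse json.loads() (the report values are illustrative):

```python
import json

report = {"total": 2700, "reps": ["Ana", "Bob"]}

# dumps() returns a string instead of writing to a file -
# useful for sending over HTTP or logging
text = json.dumps(report, indent=2)
print(text)

# loads() is the inverse: it parses a JSON string back into Python objects
assert json.loads(text) == report

# Non-string dict keys are silently converted to strings on the way out,
# so a round-trip does NOT give you back the original int key
roundtrip = json.loads(json.dumps({1: "one"}))
print(roundtrip)
```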
Error Handling for Files
Files can be missing, locked, or corrupted. Handle errors gracefully.
try:
    with open("data.csv", "r") as f:
        content = f.read()
except FileNotFoundError:
    print("Error: File not found")
except PermissionError:
    print("Error: No permission to read file")
except Exception as e:
    print("Unexpected error:", e)

Cornerstone Project — Sales Report Generator (step-by-step)
Build a tool that reads sales data from CSV, calculates totals and top performers, and exports results to both JSON (for APIs) and TXT (for email). This is a real workflow analysts use to automate weekly reports.
Step 1 — Create sample input CSV
Simulate a sales export from your CRM.
import csv

# Create sample data
sales_data = [
    {"rep": "Ana", "region": "North", "amount": 1200},
    {"rep": "Bob", "region": "South", "amount": 1500},
    {"rep": "Carol", "region": "North", "amount": 980},
    {"rep": "Ana", "region": "North", "amount": 1100},
]
with open("sales.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["rep", "region", "amount"])
    writer.writeheader()
    writer.writerows(sales_data)

Step 2 — Read and parse CSV
Load data into a list of dicts for processing.
def load_sales(filename):
    with open(filename, "r") as f:
        reader = csv.DictReader(f)
        return [row for row in reader]

sales = load_sales("sales.csv")
print("Loaded", len(sales), "records")

Step 3 — Calculate totals by rep
Aggregate sales by rep name.
from collections import defaultdict

def totals_by_rep(sales):
    totals = defaultdict(float)
    for row in sales:
        rep = row["rep"]
        amount = float(row["amount"])
        totals[rep] += amount
    return dict(totals)

rep_totals = totals_by_rep(sales)
print(rep_totals)  # {"Ana": 2300.0, "Bob": 1500.0, "Carol": 980.0}

Step 4 — Find top performers
Sort by total sales descending.
def top_performers(totals, n=3):
    sorted_reps = sorted(totals.items(), key=lambda x: x[1], reverse=True)
    return sorted_reps[:n]

top_3 = top_performers(rep_totals)
print("Top performers:", top_3)

Step 5 — Export to JSON
Save structured data for APIs or dashboards.
import json

def export_json(totals, top, filename):
    report = {
        "date": "2024-01-15",
        "total_sales": sum(totals.values()),
        "rep_totals": totals,
        "top_performers": [{"rep": r, "amount": a} for r, a in top]
    }
    with open(filename, "w") as f:
        json.dump(report, f, indent=2)
    print("JSON report saved to", filename)

export_json(rep_totals, top_3, "sales_report.json")

Step 6 — Export to TXT for email
Create a human-readable summary.
def export_txt(totals, top, filename):
    with open(filename, "w", encoding="utf-8") as f:
        f.write("Sales Report - Week of Jan 15\n")
        f.write("=" * 40 + "\n\n")
        total = sum(totals.values())
        f.write("Total Sales: $" + str(round(total, 2)) + "\n\n")
        f.write("Top Performers:\n")
        for i, (rep, amount) in enumerate(top, 1):
            f.write("  " + str(i) + ". " + rep + ": $" + str(round(amount, 2)) + "\n")
        f.write("\nAll Reps:\n")
        for rep, amount in sorted(totals.items()):
            f.write("  • " + rep + ": $" + str(round(amount, 2)) + "\n")
    print("Text report saved to", filename)

export_txt(rep_totals, top_3, "sales_report.txt")

Step 7 — Put it all together
Combine into a single pipeline function.
def generate_sales_report(input_csv):
    # Load
    sales = load_sales(input_csv)

    # Process
    totals = totals_by_rep(sales)
    top = top_performers(totals, n=3)

    # Export
    export_json(totals, top, "sales_report.json")
    export_txt(totals, top, "sales_report.txt")
    print("Report generation complete!")

# Run it
generate_sales_report("sales.csv")

How this helps at work
- Weekly automation → run every Monday, email results to team
- Multi-format output → JSON for dashboards, TXT for humans
- Scalable → handles 10 rows or 10,000 rows the same way
- Reusable pattern → adapt for expenses, inventory, customer data
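For the weekly-automation scenario above, the pipeline should survive a missing or unreadable export rather than crash. A sketch combining the loader with the error handling shown earlier; the function name load_sales_safe and the return-empty-list fallback are assumptions, not part of the project code:

```python
import csv

def load_sales_safe(filename):
    """Load CSV rows as dicts, returning [] with a message if the file is unusable."""
    try:
        with open(filename, "r", newline="", encoding="utf-8") as f:
            return list(csv.DictReader(f))
    except FileNotFoundError:
        print("Error:", filename, "not found - did the CRM export run?")
        return []
    except PermissionError:
        print("Error: no permission to read", filename)
        return []

# A missing file no longer crashes the weekly run
rows = load_sales_safe("missing.csv")
print("Loaded", len(rows), "records")
```

With this guard in place, a scheduled Monday-morning run produces a clear message instead of a traceback when the upstream export is late.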
Key Takeaways
- with statement → auto-closes files, prevents leaks
- CSV → use DictReader/DictWriter for named columns
- JSON → structured data, perfect for APIs and configs
- Error handling → catch FileNotFoundError and PermissionError
- Encoding → always specify utf-8 for international text
- Cornerstone → sales report generator automates weekly workflows
Next Steps
You have mastered file I/O. Next, explore error handling to make your file scripts robust, or dive into working with libraries like pandas for advanced CSV/Excel processing.