Introduction to Automation

Automation eliminates repetitive tasks. Business analysts automate file organization, report generation, data backups, and email sending. Master automation and you save hours every week while reducing human error.

Estimated reading time: 25–30 minutes

Automation Tools

  • os module → file/directory operations
  • shutil → copy, move, delete files
  • pathlib → modern path handling
  • schedule → run tasks on schedule

Great for: file management, scheduled tasks

Common Use Cases

  • File organization → sort by type, date
  • Backup automation → copy files regularly
  • Report generation → create daily/weekly reports
  • Data cleanup → remove old files

Great for: saving time, consistency

File Operations with os and shutil

Basic file and directory operations.

python
import os
import shutil

# List files in directory
files = os.listdir('.')
print(files)

# Create directory
os.makedirs('reports/2024', exist_ok=True)

# Copy file
shutil.copy('data.csv', 'backup/data.csv')

# Move file
shutil.move('old_location/file.txt', 'new_location/file.txt')

# Delete file
os.remove('temp.txt')

# Delete directory
shutil.rmtree('old_folder')

Modern Path Handling with pathlib

pathlib provides object-oriented path operations.

python
from pathlib import Path

# Create Path object
data_dir = Path('data')
reports_dir = Path('reports')

# Create directory
reports_dir.mkdir(exist_ok=True)

# List files
for file in data_dir.glob('*.csv'):
    print(file.name)

# Check if exists
if data_dir.exists():
    print("Directory exists")

# Join paths
output_file = reports_dir / 'summary.txt'
print(output_file)

Scheduling Tasks

Run tasks automatically at specific times.

python
import schedule
import time

def backup_files():
    print("Running backup...")
    # Your backup code here

def generate_report():
    print("Generating report...")
    # Your report code here

# Schedule tasks
schedule.every().day.at("09:00").do(backup_files)
schedule.every().monday.at("08:00").do(generate_report)
schedule.every(30).minutes.do(backup_files)

# Run scheduler
while True:
    schedule.run_pending()
    time.sleep(60)  # check every minute

Cornerstone Project — Automated File Organizer (step-by-step)

Build a tool that watches a downloads folder, automatically organizes files by type into subfolders, renames files with timestamps, and logs all actions. Perfect for keeping your workspace clean.

Step 1 — Define file categories

python
from pathlib import Path
import shutil
from datetime import datetime

# File type mappings
FILE_CATEGORIES = {
    'Documents': ['.pdf', '.doc', '.docx', '.txt', '.xlsx', '.pptx'],
    'Images': ['.jpg', '.jpeg', '.png', '.gif', '.svg'],
    'Videos': ['.mp4', '.avi', '.mov', '.mkv'],
    'Archives': ['.zip', '.rar', '.7z', '.tar', '.gz'],
    'Code': ['.py', '.js', '.html', '.css', '.json']
}

def get_category(file_extension):
    for category, extensions in FILE_CATEGORIES.items():
        if file_extension.lower() in extensions:
            return category
    return 'Others'

# Test it
print(get_category('.pdf'))  # Documents
print(get_category('.jpg'))  # Images

Step 2 — Create organization function

python
def organize_file(file_path, base_dir):
    file_path = Path(file_path)
    
    # Skip if not a file
    if not file_path.is_file():
        return None
    
    # Get category
    category = get_category(file_path.suffix)
    
    # Create category folder
    category_dir = base_dir / category
    category_dir.mkdir(exist_ok=True)
    
    # Generate new filename with timestamp
    timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
    new_name = timestamp + '_' + file_path.name
    new_path = category_dir / new_name
    
    # Move file
    shutil.move(str(file_path), str(new_path))
    
    return new_path

# Test it
# result = organize_file('download.pdf', Path('organized'))

Step 3 — Organize entire directory

python
def organize_directory(source_dir):
    source_dir = Path(source_dir)
    organized_dir = source_dir / 'Organized'
    organized_dir.mkdir(exist_ok=True)
    
    organized_count = 0
    
    # Process all files in source directory
    for file_path in source_dir.iterdir():
        # Skip the organized folder itself
        if file_path == organized_dir:
            continue
        
        # Skip directories
        if file_path.is_dir():
            continue
        
        # Organize file
        result = organize_file(file_path, organized_dir)
        if result:
            print("Organized:", file_path.name, "->", result)
            organized_count += 1
    
    print("\nOrganized", organized_count, "files")
    return organized_count

# Test it
# organize_directory('Downloads')

Step 4 — Add logging

python
import logging

# Setup logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(message)s',
    handlers=[
        logging.FileHandler('file_organizer.log'),
        logging.StreamHandler()
    ]
)

def organize_file_with_logging(file_path, base_dir):
    try:
        result = organize_file(file_path, base_dir)
        if result:
            logging.info("Moved: %s -> %s", file_path.name, result)
        return result
    except Exception as e:
        logging.error("Error organizing %s: %s", file_path, e)
        return None

Step 5 — Add duplicate handling

python
def organize_file_safe(file_path, base_dir):
    file_path = Path(file_path)
    
    if not file_path.is_file():
        return None
    
    category = get_category(file_path.suffix)
    category_dir = base_dir / category
    category_dir.mkdir(exist_ok=True)
    
    # Generate unique filename
    timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
    new_name = timestamp + '_' + file_path.name
    new_path = category_dir / new_name
    
    # Handle duplicates
    counter = 1
    while new_path.exists():
        new_name = timestamp + '_' + str(counter) + '_' + file_path.name
        new_path = category_dir / new_name
        counter += 1
    
    # Move file
    shutil.move(str(file_path), str(new_path))
    logging.info("Organized: %s -> %s", file_path.name, new_path)
    
    return new_path

Step 6 — Add scheduling

python
import schedule
import time

def auto_organize():
    downloads_dir = Path.home() / 'Downloads'
    organize_directory(downloads_dir)

# Schedule to run every hour
schedule.every().hour.do(auto_organize)

# Or run once daily at specific time
schedule.every().day.at("18:00").do(auto_organize)

print("File organizer started. Press Ctrl+C to stop.")

while True:
    schedule.run_pending()
    time.sleep(60)

How this helps at work

  • Clean workspace → files automatically organized
  • Time saved → no manual sorting needed
  • Audit trail → log shows all file movements
  • Customizable → add your own categories and rules

Key Takeaways

  • os and shutil → file operations (copy, move, delete)
  • pathlib → modern, object-oriented path handling
  • schedule → run tasks automatically on schedule
  • logging → track what automation does
  • Error handling → automation must be robust
  • Cornerstone → file organizer keeps workspace clean

Next Steps

You have mastered automation basics. Next, explore web scraping with BeautifulSoup, or dive intoemail automation with smtplib to send automated reports.