Modules and Packages

Modules organize code into reusable files, and packages group related modules into folders. Business analysts use them to share utility functions across projects, import third-party libraries such as pandas and requests, and structure large automation systems. Once your helpers live in modules, you can import them instead of copy-pasting code between projects.

Estimated reading time: 25–30 minutes

Module Basics

  • import module → load entire module
  • from module import func → import specific items
  • import module as alias → shorter names
  • Custom modules → any .py file can be imported

Great for: organizing code, reusing functions

Package Management

  • pip install → add third-party libraries
  • requirements.txt → track dependencies
  • Virtual environments → isolate project dependencies
  • Standard library → built-in modules (no install needed)

Great for: reproducible projects, collaboration

Importing Built-in Modules

Python's standard library ships with more than 200 modules for common tasks. No installation is needed.

python
import math
import random
import datetime
from pathlib import Path

# Math operations
print(math.sqrt(16))        # 4.0
print(math.ceil(4.2))       # 5

# Random data
print(random.randint(1, 10))
print(random.choice(["A", "B", "C"]))

# Dates
today = datetime.date.today()
print(today.strftime("%Y-%m-%d"))

# File paths
data_dir = Path("data")
print(data_dir / "sales.csv")  # data/sales.csv (data\sales.csv on Windows)
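
The same pattern extends to other standard-library modules. As a rough sketch, the example below combines csv, statistics, and collections to summarize a few sales rows. The inline data, including the region column, is made up for illustration.

```python
import csv
import io
import statistics
from collections import Counter

# Made-up sample data; in practice this would come from a CSV file on disk.
raw_csv = "rep,region,amount\nAna,West,1200\nBob,East,1500\nCarol,West,980\n"

# csv.DictReader turns each row into a dictionary keyed by the header row.
rows = list(csv.DictReader(io.StringIO(raw_csv)))
amounts = [float(row["amount"]) for row in rows]

average_sale = statistics.mean(amounts)
sales_per_region = Counter(row["region"] for row in rows)

print(round(average_sale, 2))      # 1226.67
print(sales_per_region["West"])    # 2
```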

Import Styles

Different ways to import, each with trade-offs.

python
# Import entire module (explicit, clear)
import math
print(math.pi)

# Import specific items (shorter, but can cause name conflicts)
from math import pi, sqrt
print(pi)

# Import with alias (common for long names)
import datetime as dt
print(dt.date.today())

# Import all (avoid in production, unclear what is imported)
from math import *
print(pi)  # where did this come from?
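
To make the name-conflict risk concrete, here is a small sketch using two standard-library modules that both define sqrt. The second from-import silently replaces the first, while the fully qualified math.sqrt keeps working.

```python
import math

from math import sqrt
real_root = sqrt(16)        # this is math.sqrt

from cmath import sqrt      # rebinds the name sqrt without any warning
complex_root = sqrt(16)     # this is now cmath.sqrt

print(real_root)            # 4.0
print(complex_root)         # (4+0j)
print(math.sqrt(16))        # 4.0, the qualified name is unaffected
```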

Creating Custom Modules

Any .py file is a module. Put reusable functions in separate files.

utils.py (your custom module):

python
def clean_email(email):
    return email.strip().lower()

def validate_sku(sku):
    return sku.startswith("SKU") and len(sku) == 8  # "SKU" plus 5 characters

TAX_RATE = 0.08

main.py (using your module):

python
import utils

email = utils.clean_email("  Ana@Company.COM  ")
print(email)  # "ana@company.com"

is_valid = utils.validate_sku("SKU12345")
print(is_valid)  # True

tax = 100 * utils.TAX_RATE
print(tax)  # 8.0
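
Python finds utils.py because the folder containing main.py is on its module search path (sys.path). The sketch below illustrates that mechanism in a self-contained way: it writes a small module into a temporary folder, adds that folder to sys.path, and imports it. The module name demo_utils is a placeholder chosen for this example.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Source for a throwaway module (demo_utils is a hypothetical name).
MODULE_SOURCE = '''
def clean_email(email):
    return email.strip().lower()

TAX_RATE = 0.08
'''

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "demo_utils.py").write_text(MODULE_SOURCE)
    sys.path.insert(0, tmp)           # make the folder importable
    importlib.invalidate_caches()     # recommended after creating modules at runtime
    try:
        demo_utils = importlib.import_module("demo_utils")
    finally:
        sys.path.remove(tmp)          # leave sys.path as we found it

cleaned = demo_utils.clean_email("  Ana@Company.COM  ")
print(cleaned)                # ana@company.com
print(demo_utils.__name__)    # demo_utils
```

In a normal project you do not touch sys.path at all; keeping utils.py next to main.py is enough.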

Packages — Organizing Multiple Modules

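A package is a folder that contains an __init__.py file (the file can be empty). Python 3.3+ can also import folders without one as namespace packages, but including __init__.py is the conventional, explicit choice.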

text
# Project structure
my_project/
  main.py
  analytics/
    __init__.py
    sales.py
    customers.py
  utils/
    __init__.py
    validation.py
    formatting.py

main.py imports from packages:

python
from analytics.sales import calculate_revenue
from utils.validation import validate_email

revenue = calculate_revenue([1200, 1500, 980])
print(f"Revenue: ${revenue:,.2f}")
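
The lesson does not show what analytics/sales.py contains. As an assumption, a minimal calculate_revenue could simply add up the sale amounts, as sketched here:

```python
# analytics/sales.py (hypothetical contents; not shown in the lesson)

def calculate_revenue(amounts):
    """Return the total of a list of sale amounts."""
    return sum(amounts)


# Quick check with the figures used in main.py.
revenue = calculate_revenue([1200, 1500, 980])
print(revenue)  # 3680
```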

Installing Third-Party Packages

Use pip to install libraries from PyPI (Python Package Index).

bash
# Install a package
pip install requests

# Install specific version
pip install pandas==2.0.0

# Install from requirements.txt
pip install -r requirements.txt

# List installed packages
pip list

# Uninstall
pip uninstall requests
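
After installing, you may want to confirm from inside Python which version is available. One way, sketched below, uses importlib.metadata from the standard library (Python 3.8+). The helper name installed_version is made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# The results depend on what is installed in your environment.
for name in ("requests", "pandas"):
    print(name, installed_version(name))
```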

Managing Dependencies with requirements.txt

Track project dependencies for reproducibility.

bash
# Generate requirements.txt from the packages currently installed
pip freeze > requirements.txt

# Example requirements.txt contents:
#   requests==2.31.0
#   pandas==2.0.3
#   openpyxl==3.1.2

# Install all dependencies
pip install -r requirements.txt
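
To see what the file actually records, the sketch below reads exact pins (name==version) into a dictionary. It is deliberately simplified: real requirements files also allow version ranges, extras, and environment markers, which this hypothetical parse_pinned_requirements helper ignores.

```python
def parse_pinned_requirements(text):
    """Return {package_name: version} for lines of the form name==version."""
    pins = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()   # drop comments and whitespace
        if "==" not in line:
            continue                                # skip blanks and unpinned lines
        name, pinned_version = line.split("==", 1)
        pins[name.strip().lower()] = pinned_version.strip()
    return pins


EXAMPLE = """\
# Example requirements.txt
requests==2.31.0
pandas==2.0.3
openpyxl==3.1.2
"""

pins = parse_pinned_requirements(EXAMPLE)
print(pins)  # {'requests': '2.31.0', 'pandas': '2.0.3', 'openpyxl': '3.1.2'}
```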

Cornerstone Project — Reusable Analytics Toolkit (step-by-step)

Build a custom package with utility modules for common business tasks: data validation, formatting, and reporting. You will create a mini-library you can import into any project, saving hours of repetitive coding.

Step 1 — Create package structure

Set up folders and __init__.py files.

text
# Create this structure
analytics_toolkit/
  __init__.py
  validation.py
  formatting.py
  reporting.py
main.py

Step 2 — Build validation module

analytics_toolkit/validation.py:

python
def validate_email(email):
    """Check if email has @ and domain."""
    if "@" not in email:
        return False
    parts = email.split("@")
    return len(parts) == 2 and "." in parts[1]

def validate_sku(sku):
    """Check if SKU follows SKU##### format."""
    return sku.startswith("SKU") and len(sku) == 8 and sku[3:].isdigit()

def validate_positive(value):
    """Check if value is positive number."""
    try:
        return float(value) > 0
    except (ValueError, TypeError):
        return False
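
These checks are intentionally simple, so it helps to know where they stop. The sketch below copies two of the functions so it runs on its own, then probes a few edge cases. Note that the email check accepts some strings that are clearly not real addresses.

```python
def validate_email(email):
    """Copy of the function above: check for one @ and a dot in the domain."""
    if "@" not in email:
        return False
    parts = email.split("@")
    return len(parts) == 2 and "." in parts[1]


def validate_positive(value):
    """Copy of the function above: check if value is a positive number."""
    try:
        return float(value) > 0
    except (ValueError, TypeError):
        return False


email_results = {case: validate_email(case)
                 for case in ["ana@co.com", "invalid", "a@b@c.com", "@."]}
print(email_results)
# {'ana@co.com': True, 'invalid': False, 'a@b@c.com': False, '@.': True}

number_results = [validate_positive(v) for v in ["12.5", 0, -3, "abc", None]]
print(number_results)  # [True, False, False, False, False]
```

If stricter email validation matters for your data, treat this function as a first-pass filter rather than a guarantee.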

Step 3 — Build formatting module

analytics_toolkit/formatting.py:

python
def format_currency(amount):
    """Format number as USD currency, e.g. 1234.5 -> '$1,234.50'."""
    return f"${amount:,.2f}"

def format_percent(value, decimals=1):
    """Format decimal as percentage, e.g. 0.125 -> '12.5%'."""
    return f"{value * 100:.{decimals}f}%"

def clean_text(text):
    """Strip whitespace and normalize case."""
    return text.strip().lower()

def truncate(text, length=50):
    """Truncate text with ellipsis."""
    return text[:length] + "..." if len(text) > length else text
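
For number formatting, Python's built-in format specifiers do much of the work. A short sketch of the ones most relevant to helpers like these (the sample values are arbitrary):

```python
amount = 1234.5
share = 0.1234

grouped = format(amount, ",.2f")   # thousands separator, two decimals
as_currency = f"${amount:,.2f}"    # same spec inside an f-string
as_percent = f"{share:.1%}"        # multiplies by 100 and appends %

print(grouped)       # 1,234.50
print(as_currency)   # $1,234.50
print(as_percent)    # 12.3%
```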

Step 4 — Build reporting module

analytics_toolkit/reporting.py:

python
from .formatting import format_currency

def summary_stats(numbers):
    """Calculate basic statistics."""
    if not numbers:
        return None
    return {
        "count": len(numbers),
        "total": sum(numbers),
        "average": sum(numbers) / len(numbers),
        "min": min(numbers),
        "max": max(numbers)
    }

def sales_report(sales_data):
    """Generate formatted sales report."""
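    stats = summary_stats([s["amount"] for s in sales_data])
    if stats is None:  # summary_stats returns None for empty input
        return "Sales Report\n" + "=" * 40 + "\nNo sales data."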
    
    report = []
    report.append("Sales Report")
    report.append("=" * 40)
    report.append("Total Sales: " + format_currency(stats['total']))
    report.append("Average: " + format_currency(stats['average']))
    report.append("Range: " + format_currency(stats['min']) + " - " + format_currency(stats['max']))
    report.append("Transactions: " + str(stats['count']))
    
    return "\n".join(report)

Step 5 — Configure package __init__.py

analytics_toolkit/__init__.py:

python
"""Analytics Toolkit - Reusable business analysis utilities."""

from .validation import validate_email, validate_sku, validate_positive
from .formatting import format_currency, format_percent, clean_text
from .reporting import summary_stats, sales_report

__version__ = "1.0.0"
__all__ = [
    "validate_email",
    "validate_sku", 
    "validate_positive",
    "format_currency",
    "format_percent",
    "clean_text",
    "summary_stats",
    "sales_report"
]
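
The __all__ list controls what a star import (from package import *) brings in. The sketch below demonstrates this with a throwaway in-memory module instead of files on disk; the name demo_toolkit and the lambda stand-ins are assumptions made only so the example runs by itself.

```python
import sys
import types

# Build a throwaway module standing in for a package's __init__.py.
demo = types.ModuleType("demo_toolkit")   # hypothetical name
demo.format_currency = lambda amount: f"${amount:,.2f}"
demo.truncate = lambda text, length=50: text[:length]
demo.__all__ = ["format_currency"]        # truncate is deliberately left out
sys.modules["demo_toolkit"] = demo

# Run a star import into a fresh namespace and see which names arrive.
namespace = {}
exec("from demo_toolkit import *", namespace)
star_imported = sorted(name for name in namespace if not name.startswith("__"))
print(star_imported)                      # ['format_currency']

# Names outside __all__ are still reachable through explicit access.
shortened = demo.truncate("hello", 3)
print(shortened)                          # hel

del sys.modules["demo_toolkit"]           # clean up the demo module
```

In the toolkit above, truncate is not imported in __init__.py at all, so it is available only through analytics_toolkit.formatting.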

Step 6 — Use your toolkit

main.py:

python
from analytics_toolkit import (
    validate_email,
    validate_sku,
    format_currency,
    sales_report
)

# Validate data
emails = ["ana@co.com", "invalid", "bob@co.com"]
valid = [e for e in emails if validate_email(e)]
print("Valid emails:", valid)

# Format output
revenue = 125000
print("Revenue:", format_currency(revenue))

# Generate report
sales = [
    {"rep": "Ana", "amount": 1200},
    {"rep": "Bob", "amount": 1500},
    {"rep": "Carol", "amount": 980}
]
print(sales_report(sales))

How this helps at work

  • Reusable → import toolkit into any project, no copy-paste
  • Consistent → same validation/formatting logic everywhere
  • Testable → test each module independently
  • Shareable → teammates can use your toolkit too
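
As an illustration of the "testable" point, here is a minimal unittest sketch for one function. It copies validate_email from Step 2 so it runs on its own; in a real project you would import it from analytics_toolkit instead. The test class and method names are made up for this example.

```python
import unittest


def validate_email(email):
    """Copy of the Step 2 function so this sketch is self-contained."""
    if "@" not in email:
        return False
    parts = email.split("@")
    return len(parts) == 2 and "." in parts[1]


class ValidateEmailTests(unittest.TestCase):
    def test_accepts_simple_address(self):
        self.assertTrue(validate_email("ana@co.com"))

    def test_rejects_missing_at_sign(self):
        self.assertFalse(validate_email("invalid"))

    def test_rejects_two_at_signs(self):
        self.assertFalse(validate_email("a@b@c.com"))


# Run the tests programmatically so the script continues afterwards.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ValidateEmailTests)
result = unittest.TextTestRunner(verbosity=1).run(suite)
print(result.wasSuccessful())  # True
```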

Key Takeaways

  • Modules → any .py file can be imported
  • Packages → folders with __init__.py containing modules
  • Standard library → 200+ built-in modules (math, datetime, pathlib)
  • pip → install third-party packages from PyPI
  • requirements.txt → track dependencies for reproducibility
  • Cornerstone → custom analytics toolkit for reusable utilities

Next Steps

You now know the essentials of modules and packages. Next, explore working with libraries to use third-party packages like pandas and requests, or dive into virtual environments to isolate project dependencies.