Modules and Packages
Modules organize code into reusable files. Packages group related modules into folders. Business analysts use modules to share utility functions across projects, import powerful libraries (pandas, requests), and structure large automation systems. Once you are comfortable with modules, you rarely need to copy-paste code between projects.
Estimated reading time: 25–30 minutes
Module Basics
- import module → load entire module
- from module import func → import specific items
- import module as alias → shorter names
- Custom modules → any .py file can be imported
Great for: organizing code, reusing functions
Package Management
- pip install → add third-party libraries
- requirements.txt → track dependencies
- Virtual environments → isolate project dependencies
- Standard library → built-in modules (no install needed)
Great for: reproducible projects, collaboration
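Virtual environments are listed above without an example. As a minimal sketch (the helper name in_virtual_env is made up for illustration), Python can report whether it is currently running inside a venv-style environment by comparing sys.prefix with sys.base_prefix, which differ when an environment created with python -m venv is active.

```python
import sys


def in_virtual_env():
    """Return True when the interpreter runs inside a venv-style environment."""
    # In a virtual environment, sys.prefix points at the environment folder,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("Inside a virtual environment:", in_virtual_env())
print("Packages install under:", sys.prefix)
```

Running this before pip install is a quick way to confirm that new packages will land in the project's own environment rather than the system-wide Python.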
Importing Built-in Modules
The Python standard library includes 200+ modules for common tasks. No installation needed.
import math
import random
import datetime
from pathlib import Path
# Math operations
print(math.sqrt(16)) # 4.0
print(math.ceil(4.2)) # 5
# Random data
print(random.randint(1, 10))
print(random.choice(["A", "B", "C"]))
# Dates
today = datetime.date.today()
print(today.strftime("%Y-%m-%d"))
# File paths
data_dir = Path("data")
print(data_dir / "sales.csv") # data/sales.csv
Import Styles
Different ways to import, each with trade-offs.
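One of those trade-offs, the name conflict, is easy to miss. The sketch below uses two standard-library modules that both define sqrt to show how a later "from ... import" silently replaces an earlier one, while whole-module imports keep both available. It illustrates the risk only and is not a recommended pattern.

```python
import cmath
import math

# Importing specific names: the second import silently replaces the first.
from math import sqrt
real_root = sqrt(16)        # math.sqrt returns a float

from cmath import sqrt      # rebinds the name "sqrt" to cmath.sqrt
complex_root = sqrt(16)     # the same call now returns a complex number

print(type(real_root).__name__, type(complex_root).__name__)

# Importing whole modules keeps both functions available and unambiguous.
print(math.sqrt(16), cmath.sqrt(16))
```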
# Import entire module (explicit, clear)
import math
print(math.pi)
# Import specific items (shorter, but can cause name conflicts)
from math import pi, sqrt
print(pi)
# Import with alias (common for long names)
import datetime as dt
print(dt.date.today())
# Import all (avoid in production, unclear what is imported)
from math import *
print(pi) # where did this come from?
Creating Custom Modules
Any .py file is a module. Put reusable functions in separate files.
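To see that an ordinary .py file really is importable, here is a self-contained sketch that writes a small file to a temporary folder and imports it. The module name pricing_demo and the function add_tax are hypothetical. In a real project you would simply save the file next to your script instead of editing sys.path.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Write a tiny module to a temporary folder (pricing_demo is a made-up name).
module_dir = Path(tempfile.mkdtemp())
(module_dir / "pricing_demo.py").write_text(
    "TAX_RATE = 0.08\n"
    "\n"
    "def add_tax(amount):\n"
    "    return amount * (1 + TAX_RATE)\n"
)

# Python searches the folders listed in sys.path when it resolves an import.
sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()
pricing_demo = importlib.import_module("pricing_demo")

print(pricing_demo.TAX_RATE)
print(round(pricing_demo.add_tax(100), 2))
```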
utils.py (your custom module):
def clean_email(email):
    return email.strip().lower()

def validate_sku(sku):
    return sku.startswith("SKU") and len(sku) == 9

TAX_RATE = 0.08
main.py (using your module):
import utils
email = utils.clean_email(" Ana@Company.COM ")
print(email) # "ana@company.com"
is_valid = utils.validate_sku("SKU123456") # "SKU" + 6 digits = 9 characters
print(is_valid) # True
tax = 100 * utils.TAX_RATE
print(tax) # 8.0
Packages: Organizing Multiple Modules
A package is a folder with an __init__.py file (can be empty).
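The sketch below builds a throwaway package in a temporary folder to show the rule in action: a folder, an empty __init__.py, and one module inside it. The package name demo_analytics is hypothetical, and the body of calculate_revenue is an assumption (a plain sum), since its implementation is not given here.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package: a folder, an empty __init__.py, and one module.
root = Path(tempfile.mkdtemp())
package_dir = root / "demo_analytics"      # hypothetical package name
package_dir.mkdir()
(package_dir / "__init__.py").write_text("")
(package_dir / "sales.py").write_text(
    "def calculate_revenue(amounts):\n"
    "    return sum(amounts)  # assumed implementation\n"
)

# The folder that CONTAINS the package must be on sys.path.
sys.path.insert(0, str(root))
importlib.invalidate_caches()
sales = importlib.import_module("demo_analytics.sales")

print(sales.calculate_revenue([1200, 1500, 980]))
```

Note that the import uses dotted notation, package.module, which mirrors the folder and file names on disk.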
# Project structure
my_project/
    main.py
    analytics/
        __init__.py
        sales.py
        customers.py
    utils/
        __init__.py
        validation.py
        formatting.py
main.py imports from packages:
from analytics.sales import calculate_revenue
from utils.validation import validate_email
revenue = calculate_revenue([1200, 1500, 980])
print("Revenue: $", round(revenue, 2))
Installing Third-Party Packages
Use pip to install libraries from PyPI (Python Package Index).
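It can help to check from Python whether a library is already installed, and at which version, before or after running pip. A minimal sketch using the standard library's importlib.metadata (Python 3.8+) follows. The helper name installed_version is made up for this example.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


for name in ["requests", "pandas"]:
    version = installed_version(name)
    print(name, "->", version if version else "not installed")
```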
# Install a package
pip install requests
# Install specific version
pip install pandas==2.0.0
# Install from requirements.txt
pip install -r requirements.txt
# List installed packages
pip list
# Uninstall
pip uninstall requests
Managing Dependencies with requirements.txt
Track project dependencies for reproducibility.
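A requirements.txt file produced by pip freeze is mostly a list of name==version lines. The sketch below reads such lines into a dictionary, which can be handy for comparing two projects. It is a deliberately simplified parser (the function name parse_pinned is hypothetical) and ignores everything except exact pins. pip itself understands a much richer syntax.

```python
def parse_pinned(requirements_text):
    """Map package name -> pinned version for simple 'name==version' lines."""
    pins = {}
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line or "==" not in line:
            continue                            # skip blanks and unpinned lines
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


example = """
requests==2.31.0
pandas==2.0.3
openpyxl==3.1.2
"""
print(parse_pinned(example))
```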
# Generate requirements.txt
pip freeze > requirements.txt
# Example requirements.txt
requests==2.31.0
pandas==2.0.3
openpyxl==3.1.2
# Install all dependencies
pip install -r requirements.txt
Cornerstone Project: Reusable Analytics Toolkit (step-by-step)
Build a custom package with utility modules for common business tasks: data validation, formatting, and reporting. You will create a mini-library you can import into any project, saving hours of repetitive coding.
Step 1: Create package structure
Set up folders and __init__.py files.
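If you prefer to create the layout from a script rather than by hand, the sketch below does it with pathlib. It assumes main.py sits beside the analytics_toolkit folder, which is what the import in main.py requires. The function name scaffold_toolkit is hypothetical, and the demo writes to a temporary folder so it does not touch your project.

```python
import tempfile
from pathlib import Path


def scaffold_toolkit(base_dir):
    """Create the analytics_toolkit folder with empty module files."""
    package_dir = Path(base_dir) / "analytics_toolkit"
    package_dir.mkdir(parents=True, exist_ok=True)
    for filename in ["__init__.py", "validation.py", "formatting.py", "reporting.py"]:
        (package_dir / filename).touch()        # create the file if missing
    (Path(base_dir) / "main.py").touch()        # script that uses the package
    return package_dir


# Demo in a temporary folder; pass your own project folder instead.
demo_root = tempfile.mkdtemp()
created = scaffold_toolkit(demo_root)
print(sorted(p.name for p in created.iterdir()))
```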
# Create this structure
analytics_toolkit/
    __init__.py
    validation.py
    formatting.py
    reporting.py
main.py
Step 2: Build validation module
analytics_toolkit/validation.py:
def validate_email(email):
    """Check if email has exactly one @ and a dot in the domain."""
    if "@" not in email:
        return False
    parts = email.split("@")
    return len(parts) == 2 and "." in parts[1]

def validate_sku(sku):
    """Check if SKU follows SKU###### format (SKU plus six digits)."""
    return sku.startswith("SKU") and len(sku) == 9 and sku[3:].isdigit()

def validate_positive(value):
    """Check if value is a positive number."""
    try:
        return float(value) > 0
    except (ValueError, TypeError):
        return False
Step 3: Build formatting module
analytics_toolkit/formatting.py:
def format_currency(amount):
    """Format number as USD currency."""
    # Always show two decimals and thousands separators, e.g. $1,200.50
    return f"${amount:,.2f}"

def format_percent(value, decimals=1):
    """Format decimal as percentage."""
    return str(round(value * 100, decimals)) + "%"

def clean_text(text):
    """Strip whitespace and normalize case."""
    return text.strip().lower()

def truncate(text, length=50):
    """Truncate text with ellipsis."""
    return (text[:length] + "...") if len(text) > length else text
Step 4: Build reporting module
analytics_toolkit/reporting.py:
from .formatting import format_currency, format_percent

def summary_stats(numbers):
    """Calculate basic statistics."""
    if not numbers:
        return None
    return {
        "count": len(numbers),
        "total": sum(numbers),
        "average": sum(numbers) / len(numbers),
        "min": min(numbers),
        "max": max(numbers)
    }

def sales_report(sales_data):
    """Generate formatted sales report."""
    stats = summary_stats([s["amount"] for s in sales_data])
    report = []
    report.append("Sales Report")
    report.append("=" * 40)
    if stats is None:
        # summary_stats returns None for empty input, so stop before indexing
        report.append("No sales data.")
        return "\n".join(report)
    report.append("Total Sales: " + format_currency(stats['total']))
    report.append("Average: " + format_currency(stats['average']))
    report.append("Range: " + format_currency(stats['min']) + " - " + format_currency(stats['max']))
    report.append("Transactions: " + str(stats['count']))
    return "\n".join(report)
Step 5: Configure package __init__.py
analytics_toolkit/__init__.py:
"""Analytics Toolkit - Reusable business analysis utilities."""
from .validation import validate_email, validate_sku, validate_positive
from .formatting import format_currency, format_percent, clean_text
from .reporting import summary_stats, sales_report

__version__ = "1.0.0"
__all__ = [
    "validate_email",
    "validate_sku",
    "validate_positive",
    "format_currency",
    "format_percent",
    "clean_text",
    "summary_stats",
    "sales_report"
]
Step 6: Use your toolkit
main.py:
from analytics_toolkit import (
    validate_email,
    validate_sku,
    format_currency,
    sales_report
)

# Validate data
emails = ["ana@co.com", "invalid", "bob@co.com"]
valid = [e for e in emails if validate_email(e)]
print("Valid emails:", valid)

# Format output
revenue = 125000
print("Revenue:", format_currency(revenue))

# Generate report
sales = [
    {"rep": "Ana", "amount": 1200},
    {"rep": "Bob", "amount": 1500},
    {"rep": "Carol", "amount": 980}
]
print(sales_report(sales))
How this helps at work
- Reusable → import toolkit into any project, no copy-paste
- Consistent → same validation/formatting logic everywhere
- Testable → test each module independently
- Shareable → teammates can use your toolkit too
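The "testable" point can be made concrete with the standard library's unittest module. The sketch below tests the SKU validator in isolation. To stay runnable on its own it carries a copy of the function, whereas in the real project you would write from analytics_toolkit.validation import validate_sku instead. The test class and method names are illustrative.

```python
import unittest


def validate_sku(sku):
    """Copy of the Step 2 validator so this sketch runs on its own."""
    return sku.startswith("SKU") and len(sku) == 9 and sku[3:].isdigit()


class ValidateSkuTests(unittest.TestCase):
    def test_accepts_sku_with_six_digits(self):
        self.assertTrue(validate_sku("SKU123456"))

    def test_rejects_wrong_length(self):
        self.assertFalse(validate_sku("SKU12345"))

    def test_rejects_non_digit_suffix(self):
        self.assertFalse(validate_sku("SKU12A456"))


# Run the tests programmatically and keep the result for inspection.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ValidateSkuTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print("All tests passed:", result.wasSuccessful())
```

Because each module only depends on its own functions, a failing test points directly at the file that needs fixing.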
Key Takeaways
- Modules → any .py file can be imported
- Packages → folders with __init__.py containing modules
- Standard library → 200+ built-in modules (math, datetime, pathlib)
- pip → install third-party packages from PyPI
- requirements.txt → track dependencies for reproducibility
- Cornerstone → custom analytics toolkit for reusable utilities
Next Steps
You have mastered modules and packages. Next, explore working with libraries to use powerful third-party packages like pandas and requests, or dive into virtual environments to isolate project dependencies.