Testing and Debugging
Testing catches bugs before users do. Debugging fixes them fast. Business analysts write tests for data pipelines, validation rules, and calculations to ensure accuracy. Master testing and your scripts become reliable tools teammates trust.
Estimated reading time: 25–30 minutes
Testing Basics
- Unit tests → test individual functions
- assert statements → verify expected results
- Test coverage → how much code is tested
- Automated testing → run tests on every change
Great for: catching bugs early, confidence in changes
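The ideas in the list above fit together in a few lines. The following is a minimal sketch, not a prescribed setup: `add_tax` and its 10% default rate are hypothetical names invented for illustration, while the loader and runner calls come from Python's built-in unittest module. It shows a unit test, assertions, and a programmatic run of the kind an automated job might perform on every change.

```python
import io
import unittest


def add_tax(amount, rate=0.10):
    """Hypothetical function under test: add tax to an amount."""
    return round(amount * (1 + rate), 2)


class TestAddTax(unittest.TestCase):
    def test_default_rate(self):
        self.assertEqual(add_tax(100), 110.0)

    def test_zero_amount(self):
        self.assertEqual(add_tax(0), 0.0)


def run_suite():
    """Load and run the tests programmatically, as an automated job might."""
    suite = unittest.TestLoader().loadTestsFromTestCase(TestAddTax)
    # Send the runner's report to an in-memory buffer to keep output quiet
    runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
    return runner.run(suite)


result = run_suite()
print("Tests run:", result.testsRun)
print("All passed:", result.wasSuccessful())
```

An automated job could use `result.wasSuccessful()` to decide whether a change is safe to merge.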
Debugging Techniques
- Print debugging → simple, effective for quick checks
- Logging → track program flow in production
- Debugger (pdb) → step through code line by line
- Error messages → read stack traces carefully
Great for: finding and fixing bugs fast
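Two of the techniques above, reading stack traces and stepping through code with pdb, can be tried as follows. This is a sketch built around a hypothetical `average_order_value` function and a hypothetical `capture_trace` helper; the `traceback` module and the built-in `breakpoint()` are standard Python.

```python
import traceback


def average_order_value(total, order_count):
    """Hypothetical function with a bug: no guard for zero orders."""
    # Uncommenting the next line pauses execution here in pdb (Python 3.7+).
    # Inside pdb, 'n' runs the next line, 'p total' prints a variable,
    # and 'c' continues normal execution.
    # breakpoint()
    return total / order_count


def capture_trace(func, *args):
    """Call func and return the stack trace as text if it raises."""
    try:
        func(*args)
    except Exception:
        return traceback.format_exc()
    return ""


trace_text = capture_trace(average_order_value, 500, 0)
print(trace_text)

# Read a trace from the bottom up: the last line names the exception,
# and the frames above it show the chain of calls that led there.
last_line = trace_text.strip().splitlines()[-1]
print("Exception line:", last_line)
```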
Print Debugging
Simplest debugging: print variables to see what is happening.
def calculate_discount(price, discount_pct):
    print("Input price:", price)
    print("Discount %:", discount_pct)
    discount = price * (discount_pct / 100)
    print("Discount amount:", discount)
    final_price = price - discount
    print("Final price:", final_price)
    return final_price

result = calculate_discount(100, 20)
print("Result:", result)

Assert Statements
An assert statement checks a condition that must be true and raises AssertionError if it is false. Python skips assert statements when run with the -O flag, so use them for development checks rather than for validating production input.
def validate_email(email):
    assert "@" in email, "Email must contain @"
    assert "." in email.split("@")[1], "Email must have domain with ."
    return True

# Valid email
validate_email("ana@company.com")  # OK

# Invalid email
try:
    validate_email("invalid-email")  # AssertionError
except AssertionError as e:
    print("Validation failed:", e)

Unit Testing with unittest
Write automated tests that verify your functions work correctly.
import unittest

def calculate_total(items):
    return sum(item['price'] * item['qty'] for item in items)

class TestCalculateTotal(unittest.TestCase):
    def test_empty_list(self):
        result = calculate_total([])
        self.assertEqual(result, 0)

    def test_single_item(self):
        items = [{'price': 10, 'qty': 2}]
        result = calculate_total(items)
        self.assertEqual(result, 20)

    def test_multiple_items(self):
        items = [
            {'price': 10, 'qty': 2},
            {'price': 5, 'qty': 3}
        ]
        result = calculate_total(items)
        self.assertEqual(result, 35)

# Run tests
if __name__ == '__main__':
    unittest.main()

Common Assertions
unittest provides many assertion methods.
import unittest

class TestAssertions(unittest.TestCase):
    def test_equality(self):
        self.assertEqual(2 + 2, 4)
        self.assertNotEqual(2 + 2, 5)

    def test_boolean(self):
        self.assertTrue(10 > 5)
        self.assertFalse(10 < 5)

    def test_membership(self):
        self.assertIn('a', 'abc')
        self.assertNotIn('x', 'abc')

    def test_exceptions(self):
        with self.assertRaises(ZeroDivisionError):
            10 / 0

    def test_approximate(self):
        self.assertAlmostEqual(0.1 + 0.2, 0.3, places=7)

Testing with pytest (simpler syntax)
pytest is a widely used third-party alternative to unittest. Tests are plain functions that use bare assert statements, so there is less boilerplate.
# Install: pip install pytest

# Test file: test_calculations.py
# The function under test is defined in the same file to keep the example
# self-contained; in a real project you would import it.
def calculate_total(items):
    return sum(item['price'] * item['qty'] for item in items)

def test_empty_list():
    assert calculate_total([]) == 0

def test_single_item():
    items = [{'price': 10, 'qty': 2}]
    assert calculate_total(items) == 20

def test_multiple_items():
    items = [
        {'price': 10, 'qty': 2},
        {'price': 5, 'qty': 3}
    ]
    assert calculate_total(items) == 35

# Run: pytest test_calculations.py

Logging for Production
Use logging instead of print for production code.
import logging

# Configure logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s',
    filename='app.log'
)

def process_order(order_id, amount):
    logging.info("Processing order %s", order_id)
    if amount < 0:
        logging.error("Invalid amount: %s", amount)
        return False
    logging.info("Order %s processed: $%s", order_id, amount)
    return True

# Use it
process_order("ORD123", 100)
process_order("ORD124", -50)

Cornerstone Project — Data Validation Test Suite (step-by-step)
Build a comprehensive test suite for a data validation module. The suite verifies that your data cleaning functions work correctly and catches regressions when you change them, which makes it essential for production data pipelines.
Step 1 — Create validation module
validators.py:
def validate_email(email):
    """Validate email format."""
    if not isinstance(email, str):
        return False
    if "@" not in email:
        return False
    parts = email.split("@")
    if len(parts) != 2:
        return False
    local, domain = parts
    # Require a non-empty local part so "@example.com" is rejected
    return bool(local) and "." in domain

def validate_sku(sku):
    """Validate SKU format (SKU + 5 digits)."""
    if not isinstance(sku, str):
        return False
    if not sku.startswith("SKU"):
        return False
    if len(sku) != 8:
        return False
    return sku[3:].isdigit()

def validate_amount(amount):
    """Validate amount is positive number."""
    try:
        value = float(amount)
        return value > 0
    except (ValueError, TypeError):
        return False

def clean_text(text):
    """Clean and normalize text."""
    if not isinstance(text, str):
        return ""
    return text.strip().lower()

Step 2 — Write tests for email validation
test_validators.py:
import unittest
# Import every validator now; the test classes in later steps use them all
from validators import validate_email, validate_sku, validate_amount, clean_text

class TestEmailValidation(unittest.TestCase):
    def test_valid_emails(self):
        """Test valid email formats."""
        valid_emails = [
            "user@example.com",
            "test.user@company.co.uk",
            "admin@test-site.org"
        ]
        for email in valid_emails:
            self.assertTrue(validate_email(email), email + " should be valid")

    def test_invalid_emails(self):
        """Test invalid email formats."""
        invalid_emails = [
            "invalid",
            "@example.com",
            "user@",
            "user@@example.com",
            "",
            None,
            123
        ]
        for email in invalid_emails:
            self.assertFalse(validate_email(email), str(email) + " should be invalid")

Step 3 — Write tests for SKU validation
class TestSKUValidation(unittest.TestCase):
    def test_valid_skus(self):
        """Test valid SKU formats."""
        valid_skus = ["SKU12345", "SKU00001", "SKU99999"]
        for sku in valid_skus:
            self.assertTrue(validate_sku(sku))

    def test_invalid_skus(self):
        """Test invalid SKU formats."""
        invalid_skus = [
            "ABC12345",    # wrong prefix
            "SKU123",      # too short
            "SKU1234567",  # too long
            "SKU1234A",    # non-digit
            "",
            None,
            12345
        ]
        for sku in invalid_skus:
            self.assertFalse(validate_sku(sku))

Step 4 — Write tests for amount validation
class TestAmountValidation(unittest.TestCase):
    def test_valid_amounts(self):
        """Test valid amounts."""
        valid_amounts = [1, 10.5, "100", "99.99", 0.01]
        for amount in valid_amounts:
            self.assertTrue(validate_amount(amount))

    def test_invalid_amounts(self):
        """Test invalid amounts."""
        invalid_amounts = [0, -10, "-5", "abc", None, ""]
        for amount in invalid_amounts:
            self.assertFalse(validate_amount(amount))

Step 5 — Write tests for text cleaning
class TestTextCleaning(unittest.TestCase):
    def test_strip_whitespace(self):
        """Test whitespace removal."""
        self.assertEqual(clean_text(" hello "), "hello")

    def test_lowercase(self):
        """Test case normalization."""
        self.assertEqual(clean_text("HELLO"), "hello")
        self.assertEqual(clean_text("HeLLo"), "hello")

    def test_combined(self):
        """Test strip and lowercase together."""
        self.assertEqual(clean_text(" HELLO WORLD "), "hello world")

    def test_edge_cases(self):
        """Test edge cases."""
        self.assertEqual(clean_text(""), "")
        self.assertEqual(clean_text(None), "")
        self.assertEqual(clean_text(123), "")

Step 6 — Add integration test
Test multiple validators together.
class TestDataPipeline(unittest.TestCase):
    def test_valid_record(self):
        """Test processing valid record."""
        record = {
            'email': 'user@example.com',
            'sku': 'SKU12345',
            'amount': '99.99',
            'name': ' Product Name '
        }
        self.assertTrue(validate_email(record['email']))
        self.assertTrue(validate_sku(record['sku']))
        self.assertTrue(validate_amount(record['amount']))
        self.assertEqual(clean_text(record['name']), 'product name')

    def test_invalid_record(self):
        """Test processing invalid record."""
        record = {
            'email': 'invalid',
            'sku': 'ABC123',
            'amount': '-10'
        }
        self.assertFalse(validate_email(record['email']))
        self.assertFalse(validate_sku(record['sku']))
        self.assertFalse(validate_amount(record['amount']))

Step 7 — Run tests and generate report
# Run all tests
if __name__ == '__main__':
    # Run with verbose output
    unittest.main(verbosity=2)

# Or run from command line:
# python -m unittest test_validators.py -v
# Run specific test class:
# python -m unittest test_validators.TestEmailValidation -v

How this helps at work
- Confidence → know your validation logic works correctly
- Regression prevention → tests catch bugs when you change code
- Documentation → tests show how functions should behave
- Faster debugging → failing test pinpoints exact problem
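As an illustration of regression prevention, the sketch below pins one edge case with a test. Both function versions and the `regression_test` helper are hypothetical stand-ins written for this example. An earlier check that only inspects the domain would accept an address with nothing before the @, and a test for that case keeps the flaw from returning unnoticed.

```python
def validate_email_v1(email):
    """Hypothetical earlier version: checks only the domain part."""
    parts = email.split("@")
    return len(parts) == 2 and "." in parts[1]


def validate_email_v2(email):
    """Hypothetical fixed version: also requires a non-empty local part."""
    parts = email.split("@")
    return len(parts) == 2 and bool(parts[0]) and "." in parts[1]


def regression_test(validator):
    """Return True if the validator handles the pinned edge cases."""
    rejects_empty_local = validator("@example.com") is False
    accepts_normal = validator("user@example.com") is True
    return rejects_empty_local and accepts_normal


print("v1 passes:", regression_test(validate_email_v1))
print("v2 passes:", regression_test(validate_email_v2))
```

Keeping a check like this in the suite means that any later change which reintroduces the old behavior fails immediately, and the failing case points to the exact input that broke.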
Key Takeaways
- Print debugging → simple, effective for quick checks
- Assert statements → verify conditions during development
- unittest → built-in testing framework
- pytest → simpler syntax, widely used
- Logging → track program flow in production
- Cornerstone → validation test suite ensures data quality
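To connect the pytest takeaway with the cornerstone project, here is a sketch of how the SKU checks might look as pytest-style functions. `validate_sku` is copied from validators.py so the block is self-contained, and the test function names are illustrative. The functions use only bare assert statements, so they also run without pytest installed.

```python
def validate_sku(sku):
    """Copy of validate_sku from validators.py (SKU + 5 digits)."""
    if not isinstance(sku, str):
        return False
    if not sku.startswith("SKU"):
        return False
    if len(sku) != 8:
        return False
    return sku[3:].isdigit()


# pytest collects functions whose names start with test_
def test_valid_skus():
    for sku in ["SKU12345", "SKU00001", "SKU99999"]:
        assert validate_sku(sku), f"{sku!r} should be valid"


def test_invalid_skus():
    invalid_skus = ["ABC12345", "SKU123", "SKU1234567", "SKU1234A", "", None, 12345]
    for sku in invalid_skus:
        assert not validate_sku(sku), f"{sku!r} should be invalid"


# Plain functions can also be called directly
test_valid_skus()
test_invalid_skus()
print("SKU checks passed")
```

Saved in a file whose name starts with test_, these functions would be picked up by running pytest on that file.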
Next Steps
You have mastered testing basics. Next, explore test-driven development (TDD), or dive into continuous integration (CI) to run tests automatically on every commit.