Testing and Debugging

Testing catches bugs before users do, and debugging helps you find and fix the ones that slip through. Business analysts write tests for data pipelines, validation rules, and calculations to ensure accuracy. With solid tests in place, your scripts become reliable tools that teammates can trust.

Estimated reading time: 25–30 minutes

Testing Basics

  • Unit tests → test individual functions
  • assert statements → verify expected results
  • Test coverage → how much code is tested
  • Automated testing → run tests on every change

Great for: catching bugs early, confidence in changes

Debugging Techniques

  • Print debugging → simple, effective for quick checks
  • Logging → track program flow in production
  • Debugger (pdb) → step through code line by line
  • Error messages → read stack traces carefully

Great for: finding and fixing bugs fast
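
To make the "read stack traces carefully" bullet concrete, here is a minimal sketch that inspects a traceback with the standard traceback module. The helpers parse_amount, load_row, and failing_frame are hypothetical names invented for this illustration. The frame at the bottom of a trace is where the error was actually raised, which is usually the first place to look.

```python
import traceback

def parse_amount(text):
    # Hypothetical helper: float() raises ValueError on non-numeric text
    return float(text)

def load_row(row):
    return parse_amount(row["amount"])

def failing_frame(row):
    """Return (exception name, function where it was raised), or None if the row loads."""
    try:
        load_row(row)
    except Exception as exc:
        # The last entry of the extracted traceback is the innermost frame
        frames = traceback.extract_tb(exc.__traceback__)
        return type(exc).__name__, frames[-1].name
    return None

print(failing_frame({"amount": "12,50"}))
```

For interactive stepping, calling the built-in breakpoint() at the suspicious line drops you into pdb, where you can inspect variables and advance one line at a time.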

Print Debugging

Simplest debugging: print variables to see what is happening.

python
def calculate_discount(price, discount_pct):
    print("Input price:", price)
    print("Discount %:", discount_pct)
    
    discount = price * (discount_pct / 100)
    print("Discount amount:", discount)
    
    final_price = price - discount
    print("Final price:", final_price)
    
    return final_price

result = calculate_discount(100, 20)
print("Result:", result)
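
A shorter variant of the same idea, assuming Python 3.8 or newer: the = specifier inside an f-string prints both the expression and its value, so the labels do not have to be typed by hand. This is a sketch of the technique applied to the function above, not a replacement for it.

```python
def calculate_discount(price, discount_pct):
    discount = price * (discount_pct / 100)
    final_price = price - discount
    # f"{name=}" expands to "name=<value>" (available from Python 3.8)
    print(f"{price=} {discount_pct=} {discount=} {final_price=}")
    return final_price

result = calculate_discount(100, 20)
```

Remember to remove or comment out debug prints once the problem is found.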

Assert Statements

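An assert statement checks a condition that must be true and raises AssertionError if it is false. Python skips assert statements when run with the -O flag, so use them for development-time checks rather than for validating user input.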

python
def validate_email(email):
    assert "@" in email, "Email must contain @"
    assert "." in email.split("@")[1], "Email must have domain with ."
    return True

# Valid email
validate_email("ana@company.com")  # OK

# Invalid email
try:
    validate_email("invalid-email")  # AssertionError
except AssertionError as e:
    print("Validation failed:", e)
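
Because assert statements are removed when Python runs with the -O flag, checks that must always run are often written as explicit exceptions instead. The sketch below shows that alternative; the name check_email and the choice of ValueError are assumptions made for this example.

```python
def check_email(email):
    # Explicit raises still run under "python -O", unlike assert
    if "@" not in email:
        raise ValueError("Email must contain @")
    if "." not in email.split("@")[1]:
        raise ValueError("Email must have domain with .")
    return True

try:
    check_email("invalid-email")
except ValueError as e:
    print("Validation failed:", e)
```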

Unit Testing with unittest

Write automated tests that verify your functions work correctly.

python
import unittest

def calculate_total(items):
    return sum(item['price'] * item['qty'] for item in items)

class TestCalculateTotal(unittest.TestCase):
    def test_empty_list(self):
        result = calculate_total([])
        self.assertEqual(result, 0)
    
    def test_single_item(self):
        items = [{'price': 10, 'qty': 2}]
        result = calculate_total(items)
        self.assertEqual(result, 20)
    
    def test_multiple_items(self):
        items = [
            {'price': 10, 'qty': 2},
            {'price': 5, 'qty': 3}
        ]
        result = calculate_total(items)
        self.assertEqual(result, 35)

# Run tests
if __name__ == '__main__':
    unittest.main()
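
The tests above cover the happy path. It can also be worth pinning down what happens with bad or awkward input. The sketch below adds two such cases; the class name TestCalculateTotalEdgeCases and the chosen inputs are illustrative assumptions.

```python
import unittest

def calculate_total(items):
    return sum(item['price'] * item['qty'] for item in items)

class TestCalculateTotalEdgeCases(unittest.TestCase):
    def test_missing_qty_raises(self):
        # An item without 'qty' should fail loudly rather than be skipped
        with self.assertRaises(KeyError):
            calculate_total([{'price': 10}])

    def test_float_prices(self):
        # 0.1 * 3 is not exactly 0.3 in floating point, so compare approximately
        items = [{'price': 0.1, 'qty': 3}]
        self.assertAlmostEqual(calculate_total(items), 0.3)

# Run from the command line: python -m unittest <your_test_file>.py
```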

Common Assertions

unittest provides many assertion methods.

python
import unittest

class TestAssertions(unittest.TestCase):
    def test_equality(self):
        self.assertEqual(2 + 2, 4)
        self.assertNotEqual(2 + 2, 5)
    
    def test_boolean(self):
        self.assertTrue(10 > 5)
        self.assertFalse(10 < 5)
    
    def test_membership(self):
        self.assertIn('a', 'abc')
        self.assertNotIn('x', 'abc')
    
    def test_exceptions(self):
        with self.assertRaises(ZeroDivisionError):
            10 / 0
    
    def test_approximate(self):
        self.assertAlmostEqual(0.1 + 0.2, 0.3, places=7)

Testing with pytest (simpler syntax)

pytest is a widely used third-party alternative to unittest. Tests are plain functions that use bare assert statements, so there is less boilerplate.

python
# Install: pip install pytest
# Save as: test_calculations.py

def calculate_total(items):
    return sum(item['price'] * item['qty'] for item in items)

# pytest collects functions whose names start with test_
def test_empty_list():
    assert calculate_total([]) == 0

def test_single_item():
    items = [{'price': 10, 'qty': 2}]
    assert calculate_total(items) == 20

def test_multiple_items():
    items = [
        {'price': 10, 'qty': 2},
        {'price': 5, 'qty': 3}
    ]
    assert calculate_total(items) == 35

# Run: pytest test_calculations.py
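
When several tests differ only in their inputs, pytest can generate them from a table. The sketch below assumes pytest is installed and uses pytest.mark.parametrize and pytest.raises; the test names are illustrative.

```python
import pytest

def calculate_total(items):
    return sum(item['price'] * item['qty'] for item in items)

# Each tuple becomes its own test case in the pytest report
@pytest.mark.parametrize("items, expected", [
    ([], 0),
    ([{'price': 10, 'qty': 2}], 20),
    ([{'price': 10, 'qty': 2}, {'price': 5, 'qty': 3}], 35),
])
def test_calculate_total(items, expected):
    assert calculate_total(items) == expected

def test_missing_key_raises():
    # pytest.raises plays the role of unittest's assertRaises
    with pytest.raises(KeyError):
        calculate_total([{'price': 10}])
```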

Logging for Production

Use logging instead of print in production code: log records carry a timestamp and a severity level, and they can be written to a file.

python
import logging

# Configure logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(levelname)s - %(message)s',
    filename='app.log'
)

def process_order(order_id, amount):
    logging.info("Processing order %s", order_id)
    
    if amount < 0:
        logging.error("Invalid amount: %s", amount)
        return False
    
    logging.info("Order %s processed: $%s", order_id, amount)
    return True

# Use it
process_order("ORD123", 100)
process_order("ORD124", -50)
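
Two refinements are common in larger scripts: a named logger per module, and logger.exception to record a traceback alongside the message. The sketch below illustrates both; the logger name "orders" and the function safe_average are assumptions made for this example.

```python
import logging

logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)

# A named logger makes it clear which part of the program wrote each record
logger = logging.getLogger("orders")

def safe_average(total, count):
    try:
        return total / count
    except ZeroDivisionError:
        # logger.exception logs at ERROR level and appends the traceback
        logger.exception("Could not average %s over %s orders", total, count)
        return None

safe_average(100, 0)
```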

Cornerstone Project — Data Validation Test Suite (step-by-step)

Build a comprehensive test suite for a data validation module. This ensures your data cleaning functions work correctly and catch regressions when you make changes. Essential for production data pipelines.

Step 1 — Create validation module

validators.py:

python
def validate_email(email):
    """Validate email format."""
    if not isinstance(email, str):
        return False
    if "@" not in email:
        return False
    parts = email.split("@")
    if len(parts) != 2:
        return False
    local, domain = parts
    # Reject an empty local part such as "@example.com"
    if not local:
        return False
    return "." in domain

def validate_sku(sku):
    """Validate SKU format (SKU + 5 digits)."""
    if not isinstance(sku, str):
        return False
    if not sku.startswith("SKU"):
        return False
    if len(sku) != 8:
        return False
    return sku[3:].isdigit()

def validate_amount(amount):
    """Validate that amount is a positive number."""
    try:
        value = float(amount)
        return value > 0
    except (ValueError, TypeError):
        return False

def clean_text(text):
    """Clean and normalize text."""
    if not isinstance(text, str):
        return ""
    return text.strip().lower()

Step 2 — Write tests for email validation

test_validators.py:

python
import unittest
from validators import validate_email, validate_sku, validate_amount, clean_text

class TestEmailValidation(unittest.TestCase):
    def test_valid_emails(self):
        """Test valid email formats."""
        valid_emails = [
            "user@example.com",
            "test.user@company.co.uk",
            "admin@test-site.org"
        ]
        for email in valid_emails:
            self.assertTrue(validate_email(email), email + " should be valid")
    
    def test_invalid_emails(self):
        """Test invalid email formats."""
        invalid_emails = [
            "invalid",
            "@example.com",
            "user@",
            "user@@example.com",
            "",
            None,
            123
        ]
        for email in invalid_emails:
            self.assertFalse(validate_email(email), str(email) + " should be invalid")
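
One limitation of looping inside a single test method is that the test stops at the first failing value. unittest's subTest context manager reports each value separately and keeps going. The sketch below is self-contained, so it includes a compact stand-in for validate_email; in the project you would import the real function from validators.py instead.

```python
import unittest

def validate_email(email):
    """Compact stand-in for validators.validate_email (assumption for this sketch)."""
    if not isinstance(email, str) or email.count("@") != 1:
        return False
    local, domain = email.split("@")
    return bool(local) and "." in domain

class TestEmailValidationSubTests(unittest.TestCase):
    def test_invalid_emails(self):
        invalid_emails = ["invalid", "@example.com", "user@",
                          "user@@example.com", "", None, 123]
        for email in invalid_emails:
            # Each value is reported on its own; the loop continues after a failure
            with self.subTest(email=email):
                self.assertFalse(validate_email(email))
```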

Step 3 — Write tests for SKU validation

python
class TestSKUValidation(unittest.TestCase):
    def test_valid_skus(self):
        """Test valid SKU formats."""
        valid_skus = ["SKU12345", "SKU00001", "SKU99999"]
        for sku in valid_skus:
            self.assertTrue(validate_sku(sku))
    
    def test_invalid_skus(self):
        """Test invalid SKU formats."""
        invalid_skus = [
            "ABC12345",  # wrong prefix
            "SKU123",    # too short
            "SKU1234567",  # too long
            "SKU1234A",  # non-digit
            "",
            None,
            12345
        ]
        for sku in invalid_skus:
            self.assertFalse(validate_sku(sku))

Step 4 — Write tests for amount validation

python
class TestAmountValidation(unittest.TestCase):
    def test_valid_amounts(self):
        """Test valid amounts."""
        valid_amounts = [1, 10.5, "100", "99.99", 0.01]
        for amount in valid_amounts:
            self.assertTrue(validate_amount(amount))
    
    def test_invalid_amounts(self):
        """Test invalid amounts."""
        invalid_amounts = [0, -10, "-5", "abc", None, ""]
        for amount in invalid_amounts:
            self.assertFalse(validate_amount(amount))
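
One edge case the lists above do not cover: float("inf") parses successfully and is greater than zero, so validate_amount as written accepts the string "inf". If that matters for your data, a stricter variant could also require a finite value. The function name validate_amount_finite is hypothetical, and this is a sketch of one possible tightening rather than a required change.

```python
import math

def validate_amount_finite(amount):
    """Hypothetical stricter variant: the amount must be positive and finite."""
    try:
        value = float(amount)
    except (ValueError, TypeError):
        return False
    # math.isfinite is False for inf, -inf, and nan
    return math.isfinite(value) and value > 0

for candidate in ["inf", "nan", "1e3", "99.99"]:
    print(candidate, validate_amount_finite(candidate))
```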

Step 5 — Write tests for text cleaning

python
class TestTextCleaning(unittest.TestCase):
    def test_strip_whitespace(self):
        """Test whitespace removal."""
        self.assertEqual(clean_text("  hello  "), "hello")
    
    def test_lowercase(self):
        """Test case normalization."""
        self.assertEqual(clean_text("HELLO"), "hello")
        self.assertEqual(clean_text("HeLLo"), "hello")
    
    def test_combined(self):
        """Test strip and lowercase together."""
        self.assertEqual(clean_text("  HELLO WORLD  "), "hello world")
    
    def test_edge_cases(self):
        """Test edge cases."""
        self.assertEqual(clean_text(""), "")
        self.assertEqual(clean_text(None), "")
        self.assertEqual(clean_text(123), "")

Step 6 — Add integration test

Test multiple validators together.

python
class TestDataPipeline(unittest.TestCase):
    def test_valid_record(self):
        """Test processing valid record."""
        record = {
            'email': 'user@example.com',
            'sku': 'SKU12345',
            'amount': '99.99',
            'name': '  Product Name  '
        }
        
        self.assertTrue(validate_email(record['email']))
        self.assertTrue(validate_sku(record['sku']))
        self.assertTrue(validate_amount(record['amount']))
        self.assertEqual(clean_text(record['name']), 'product name')
    
    def test_invalid_record(self):
        """Test processing invalid record."""
        record = {
            'email': 'invalid',
            'sku': 'ABC123',
            'amount': '-10'
        }
        
        self.assertFalse(validate_email(record['email']))
        self.assertFalse(validate_sku(record['sku']))
        self.assertFalse(validate_amount(record['amount']))

Step 7 — Run tests and generate report

python
# Run all tests
if __name__ == '__main__':
    # Run with verbose output
    unittest.main(verbosity=2)

# Or run from command line:
# python -m unittest test_validators.py -v

# Run specific test class:
# python -m unittest test_validators.TestEmailValidation -v
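
If you want the results as text you can save or share, one option is to run the suite programmatically and capture the runner's output. The sketch below uses unittest's loader and TextTestRunner; the helper run_with_report and the placeholder class TestExample are assumptions made so the example runs on its own.

```python
import io
import unittest

class TestExample(unittest.TestCase):
    # Hypothetical placeholder test; substitute your own TestCase classes
    def test_addition(self):
        self.assertEqual(2 + 2, 4)

def run_with_report(test_case_class):
    """Run one TestCase class and return (result, report text)."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(test_case_class)
    stream = io.StringIO()
    # verbosity=2 lists every test by name in the captured output
    result = unittest.TextTestRunner(stream=stream, verbosity=2).run(suite)
    return result, stream.getvalue()

result, report = run_with_report(TestExample)
print(report)
print("Tests run:", result.testsRun,
      "| Failures:", len(result.failures),
      "| Errors:", len(result.errors))
```

The report string can be written to a file with the usual open(...).write(...) if a saved record is needed.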

How this helps at work

  • Confidence → know your validation logic works correctly
  • Regression prevention → tests catch bugs when you change code
  • Documentation → tests show how functions should behave
  • Faster debugging → failing test pinpoints exact problem

Key Takeaways

  • Print debugging → simple, effective for quick checks
  • Assert statements → verify conditions during development
  • unittest → built-in testing framework
  • pytest → simpler syntax, widely used
  • Logging → track program flow in production
  • Cornerstone → validation test suite ensures data quality

Next Steps

You have mastered testing basics. Next, explore test-driven development (TDD), or dive into continuous integration (CI) to run tests automatically on every commit.