Data structures are containers that organize your data for fast lookups, updates, and analysis. Business analysts use lists for ordered records, dicts for KPI mappings, sets for deduplication, and tuples for immutable configs. Master these four and you can handle most everyday data tasks.

Estimated reading time: 25–30 minutes

When to Use Each Structure

List → ordered, mutable (sales records, task queues)
Dict → key-value lookups (employee ID to name, SKU to price)
Set → unique items, fast membership (dedupe emails, tags)
Tuple → immutable sequence (coordinates, config constants)

Great for: organizing business data by access pattern

Performance & Best Practices

Dict/set lookups are O(1) on average → use for find by ID tasks
List append is amortized O(1); insert/delete away from the end is O(n) → append when possible
Comprehensions are usually cleaner than the equivalent loop, and often somewhat faster
Use tuples for data that should not change (safer, and hashable when their items are hashable)

Great for: writing efficient, bug-resistant code
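
To make the lookup advice concrete, here is a minimal sketch that compares a linear scan of a list with a dict index built once up front. The `employees` records and the `find_by_scan` helper are hypothetical, invented for illustration.

```python
# Hypothetical employee records (illustrative data only).
employees = [
    {"id": 101, "name": "Ana"},
    {"id": 102, "name": "Bob"},
    {"id": 103, "name": "Carol"},
]

def find_by_scan(records, emp_id):
    """Linear scan: checks records one by one, O(n) per lookup."""
    for rec in records:
        if rec["id"] == emp_id:
            return rec
    return None

# Build the index once (O(n)); each later lookup is O(1) on average.
by_id = {rec["id"]: rec for rec in employees}

scan_result = find_by_scan(employees, 103)
index_result = by_id.get(103)

print(scan_result["name"])   # Carol
print(index_result["name"])  # Carol
```

Both approaches return the same record. The index only pays off when you look things up repeatedly, since building it costs one pass over the data.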

Lists — Ordered, Mutable Sequences

Lists hold ordered items you can change. Perfect for sales records, task lists, or any sequence you will modify.

python

# Creating lists
sales = [1200, 1500, 980, 2100]
regions = ["North", "South", "East", "West"]

# Accessing by index (0-based)
first_sale = sales[0]      # 1200
last_sale = sales[-1]      # 2100

# Slicing
top_two = sales[:2]        # [1200, 1500]

# Modifying
sales.append(1800)         # add to end
sales[0] = 1250            # update first
sales.remove(980)          # remove by value
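
A few more list operations come up often in practice. The sketch below assumes the list is in the state left by the example above and shows a guarded removal, an insert, and sorting without altering the original.

```python
sales = [1250, 1500, 2100, 1800]

# remove() raises ValueError when the value is absent, so check first.
if 980 in sales:
    sales.remove(980)

# insert() shifts every later item, which is why it is O(n).
sales.insert(0, 900)             # [900, 1250, 1500, 2100, 1800]

# sorted() returns a new list; the original order is untouched.
ranked = sorted(sales, reverse=True)

print(ranked)       # [2100, 1800, 1500, 1250, 900]
print(sum(sales))   # 7550
print(len(sales))   # 5
```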

Dictionaries — Fast Key-Value Lookups

Dicts map keys to values for instant lookups. Use for employee records, SKU prices, or any find by ID task.

python

# Creating dicts
employee = {"id": 101, "name": "Ana", "dept": "Sales"}
prices = {"SKU123": 49.99, "SKU456": 89.99}

# Accessing
name = employee["name"]           # "Ana"
price = prices.get("SKU123", 0)   # 49.99 (safe with default)

# Adding/updating
employee["email"] = "ana@co.com"
prices["SKU789"] = 120.00

# Iterating
for sku, price in prices.items():
    print(sku, ":", price)
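
A common analyst task is totaling values per key. The sketch below uses `get()` with a default to build running totals; the `rows` data is hypothetical. It also shows what happens when you use square brackets on a missing key.

```python
# Hypothetical (region, amount) sales rows.
rows = [("North", 1200), ("South", 1500), ("North", 980), ("East", 2100)]

totals = {}
for region, amount in rows:
    # get() supplies 0 the first time a region appears.
    totals[region] = totals.get(region, 0) + amount

print(totals)  # {'North': 2180, 'South': 1500, 'East': 2100}

# Square-bracket access raises KeyError for a missing key.
try:
    totals["West"]
except KeyError:
    print("No sales recorded for West")
```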

Sets — Unique Items, Fast Membership

Sets store unique items with O(1) membership checks. Perfect for deduplication and finding overlaps.

python

# Creating sets
emails = {"ana@co.com", "bob@co.com", "ana@co.com"}  # auto-dedupes
print(emails)  # {'ana@co.com', 'bob@co.com'} (sets are unordered, so order may vary)

# Adding/removing
emails.add("carol@co.com")
emails.discard("bob@co.com")  # safe remove (no error if missing)

# Set operations
team_a = {"Ana", "Bob", "Carol"}
team_b = {"Bob", "David", "Eve"}

both = team_a & team_b         # intersection: {"Bob"}
either = team_a | team_b       # union: all 5 names
only_a = team_a - team_b       # difference: {"Ana", "Carol"}
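
Because sets are unordered, deduplicating with `set()` can shuffle your data. When the original order matters, one option is `dict.fromkeys`, sketched below with a hypothetical `signups` list.

```python
signups = ["ana@co.com", "bob@co.com", "ana@co.com", "carol@co.com"]

# A set removes duplicates but does not keep the original order.
unique_unordered = set(signups)

# dict keys are unique and keep insertion order (Python 3.7+),
# so this dedupes while preserving first-seen order.
unique_ordered = list(dict.fromkeys(signups))

print(len(unique_unordered))  # 3
print(unique_ordered)         # ['ana@co.com', 'bob@co.com', 'carol@co.com']
```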

Tuples — Immutable Sequences

Tuples are like lists but cannot be changed. Use for coordinates, config constants, or dict keys.

python

# Creating tuples
coords = (40.7128, -74.0060)  # NYC lat/lon
config = ("prod", 8080, True)

# Accessing
lat, lon = coords  # unpacking
env = config[0]    # "prod"

# Tuples of hashable items are hashable, so they can be dict keys
locations = {
    (40.7128, -74.0060): "New York",
    (34.0522, -118.2437): "Los Angeles"
}
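
To see what "cannot be changed" means in practice, the short sketch below tries to modify a tuple and to use a list as a dict key. Both attempts raise `TypeError`, which is caught here so the script keeps running.

```python
config = ("prod", 8080, True)

# Item assignment on a tuple raises TypeError.
try:
    config[1] = 9090
except TypeError as exc:
    print("Cannot modify tuple:", exc)

# To "change" a tuple, build a new one instead.
new_config = (config[0], 9090, config[2])
print(new_config)  # ('prod', 9090, True)

# A list is mutable and unhashable, so it cannot be a dict key.
try:
    bad = {[40.7128, -74.0060]: "New York"}
except TypeError as exc:
    print("Lists cannot be keys:", exc)
```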

Comprehensions — Concise Data Transforms

Comprehensions create lists/dicts/sets in one line. They are usually cleaner than the equivalent loop and often a little faster.

python

# List comprehension
sales = [1200, 1500, 980, 2100]
high_sales = [s for s in sales if s > 1000]  # [1200, 1500, 2100]

# Dict comprehension
prices = {"A": 10, "B": 20, "C": 15}
discounted = {k: v * 0.9 for k, v in prices.items()}  # 10% off

# Set comprehension
regions = ["North", "South", "North", "East"]
unique = {r.upper() for r in regions}  # {"NORTH", "SOUTH", "EAST"}
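
If the syntax feels dense, it can help to see a comprehension next to the loop it replaces. The sketch below builds the same filtered list both ways, then uses a conditional expression to label every item instead of filtering.

```python
sales = [1200, 1500, 980, 2100]

# Explicit loop version.
high_sales_loop = []
for s in sales:
    if s > 1000:
        high_sales_loop.append(s)

# Equivalent comprehension: [expression for item in iterable if condition]
high_sales = [s for s in sales if s > 1000]

# A conditional expression labels every item instead of filtering.
labels = ["high" if s > 1000 else "low" for s in sales]

print(high_sales == high_sales_loop)  # True
print(labels)  # ['high', 'high', 'low', 'high']
```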

Cornerstone Project — Product Inventory Tracker (step-by-step)

Build a simple inventory system that tracks stock levels, flags low inventory, and finds duplicate SKUs. You will combine lists, dicts, and sets to organize data efficiently—skills you will use daily for dashboards, reports, and ETL.

Step 1 — Define the inventory data

Start with a list of dicts (mimics CSV rows).

python

inventory = [
    {"sku": "A101", "name": "Widget", "qty": 50, "price": 12.99},
    {"sku": "B202", "name": "Gadget", "qty": 5, "price": 24.50},
    {"sku": "C303", "name": "Doohickey", "qty": 120, "price": 8.75},
    {"sku": "A101", "name": "Widget", "qty": 30, "price": 12.99},  # duplicate SKU
]
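
Since the list of dicts mimics CSV rows, here is a rough sketch of how such rows could be produced from CSV text with the standard library's `csv.DictReader`. The `csv_text` content is hypothetical; in practice it would come from a file.

```python
import csv
import io

# Hypothetical CSV content; in practice this would come from a file.
csv_text = """sku,name,qty,price
A101,Widget,50,12.99
B202,Gadget,5,24.50
"""

rows = []
for row in csv.DictReader(io.StringIO(csv_text)):
    # DictReader yields every value as a string, so convert numbers.
    row["qty"] = int(row["qty"])
    row["price"] = float(row["price"])
    rows.append(row)

print(rows[0]["name"], rows[0]["qty"])  # Widget 50
```

Converting `qty` and `price` matters: without it, comparisons like `qty < 10` would fail because the values are still strings.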

Step 2 — Find low-stock items (list comprehension)

Filter items below a threshold in one line.

python

LOW_STOCK = 10
low_items = [item for item in inventory if item["qty"] < LOW_STOCK]

print("Low Stock Alert:")
for item in low_items:
    print(f" • {item['name']} (SKU {item['sku']}): {item['qty']} left")

Step 3 — Build a SKU to details lookup (dict)

Convert list to dict for instant lookups by SKU.

python

# Note: if a SKU repeats, the last row wins (here the second A101 replaces the first)
sku_map = {item["sku"]: item for item in inventory}

# Fast lookup
if "B202" in sku_map:
    print("Found:", sku_map["B202"]["name"])
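
One thing to watch: a dict keeps only one value per key, so when a SKU appears twice the later row replaces the earlier one. If you want to keep every row, one possible approach is to group rows into lists with `setdefault`, sketched below on a shortened copy of the inventory. The names `rows_by_sku` and `qty_by_sku` are illustrative.

```python
inventory = [
    {"sku": "A101", "name": "Widget", "qty": 50, "price": 12.99},
    {"sku": "B202", "name": "Gadget", "qty": 5, "price": 24.50},
    {"sku": "A101", "name": "Widget", "qty": 30, "price": 12.99},
]

# Group rows by SKU so duplicates are kept rather than overwritten.
rows_by_sku = {}
for item in inventory:
    rows_by_sku.setdefault(item["sku"], []).append(item)

# Total quantity per SKU across all of its rows.
qty_by_sku = {
    sku: sum(r["qty"] for r in rows)
    for sku, rows in rows_by_sku.items()
}

print(len(rows_by_sku["A101"]))  # 2
print(qty_by_sku)                # {'A101': 80, 'B202': 5}
```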

Step 4 — Detect duplicate SKUs (set)

Use a set to find SKUs that appear more than once.

python

seen = set()
duplicates = set()

for item in inventory:
    sku = item["sku"]
    if sku in seen:
        duplicates.add(sku)
    else:
        seen.add(sku)

if duplicates:
    print("Duplicate SKUs found:", duplicates)
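
An alternative to the two-set loop is `collections.Counter` from the standard library, which also tells you how many times each SKU appears. This is a sketch of a different technique, not a replacement for the step above; the `skus` list is a hand-written stand-in for the SKUs in the inventory.

```python
from collections import Counter

# SKUs as they would be pulled from the inventory rows.
skus = ["A101", "B202", "C303", "A101"]

# Count occurrences, then keep only SKUs seen more than once.
counts = Counter(skus)
duplicates = {sku: n for sku, n in counts.items() if n > 1}

print(duplicates)  # {'A101': 2}
```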

Step 5 — Calculate total inventory value

Sum up qty times price for all items.

python

total_value = sum(item["qty"] * item["price"] for item in inventory)
print(f"Total inventory value: ${total_value:,.2f}")
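
Floats are fine for a quick estimate, but they cannot represent every decimal amount exactly. If exact currency arithmetic matters, one option is the standard library's `decimal.Decimal`, sketched below on a hypothetical two-item list.

```python
from decimal import Decimal

# Prices are passed as strings so Decimal stores the exact decimal value.
items = [
    {"sku": "A101", "qty": 50, "price": Decimal("12.99")},
    {"sku": "B202", "qty": 5, "price": Decimal("24.50")},
]

total = sum(i["qty"] * i["price"] for i in items)

print(total)               # 772.00
print(f"${total:,.2f}")    # $772.00
```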

Step 6 — Put it all together

Combine into a reusable function that returns a summary dict.

python

def inventory_report(items, low_threshold=10):
    """Summarize low-stock items, duplicate SKUs, and total inventory value."""
    low = [i for i in items if i["qty"] < low_threshold]

    # Track SKUs already seen; a repeat goes into dupes
    seen, dupes = set(), set()
    for i in items:
        if i["sku"] in seen:
            dupes.add(i["sku"])
        else:
            seen.add(i["sku"])

    total = sum(i["qty"] * i["price"] for i in items)

    return {
        "low_stock": low,
        "duplicates": sorted(dupes),  # sorted for a stable order
        "total_value": round(total, 2),
    }

report = inventory_report(inventory)
print("Low stock:", len(report["low_stock"]))
print("Duplicates:", report["duplicates"])
print(f"Total value: ${report['total_value']:,.2f}")

How this helps at work

Instant alerts → spot low stock before customers complain
Data quality → catch duplicate SKUs that break reports
Financial visibility → know your inventory value in seconds
Reusable pattern → adapt for customer lists, order tracking, etc.

Key Takeaways

Lists → ordered, mutable; use for sequences you will modify
Dicts → key-value pairs; O(1) lookups for find by ID tasks
Sets → unique items; fast membership and deduplication
Tuples → immutable sequences; safe for constants and dict keys
Comprehensions → one-line transforms; usually cleaner than loops and often somewhat faster
Cornerstone → inventory tracker combining lists, dicts, and sets

Next Steps

You now know Python's core data structures. Next, explore loops and iteration to process these structures efficiently, or jump to file handling to load real CSV/JSON data into your trackers.
