🔍 Series vs DataFrame

Pandas has two main data structures: Series and DataFrames. Think of them as single columns vs. full spreadsheets. Let's see the difference and when to use each!

📋 Understanding Series

A Series is a single column of data, with an index that labels each value:

import pandas as pd

# Create a Series
scores = pd.Series([85, 92, 78, 88])
print("Simple Series:")
print(scores)
print()

# Series with custom index
names = pd.Series(['Alice', 'Bob', 'Charlie'], index=[1, 2, 3])
print("Series with custom index:")
print(names)
print()

# Check the type
print(f"Type: {type(scores)}")
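To make the "column plus index" idea concrete, here is a minimal sketch that inspects the parts of a Series. The labels and the name 'score' are illustrative choices, not something the examples above define.

```python
import pandas as pd

# A labeled Series; the index labels and the name 'score' are illustrative
scores = pd.Series([85, 92, 78], index=['Alice', 'Bob', 'Charlie'], name='score')

print(scores.index.tolist())  # the labels attached to each value
print(scores.to_numpy())      # the underlying values as a NumPy array
print(scores.dtype)           # one dtype for the whole Series
print(scores.name)            # optional name; becomes the column name in a DataFrame
```

Every Series carries these four pieces, whether or not you set the index and name yourself.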

📊 Understanding DataFrames

A DataFrame is a table of multiple columns, each one a Series, that all share the same index:

import pandas as pd

# Create a DataFrame
students = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'math': [85, 92, 78],
    'english': [90, 87, 85]
})

print("DataFrame:")
print(students)
print()

# Check the type
print(f"Type: {type(students)}")
print(f"Shape: {students.shape}")
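As a quick check of the "columns are Series" idea, the sketch below pulls one column out of the same students table and compares its index with the DataFrame's. The variable name math_col is just for illustration.

```python
import pandas as pd

students = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'math': [85, 92, 78],
    'english': [90, 87, 85]
})

# Selecting one column returns a Series
math_col = students['math']
print(type(math_col).__name__)

# The column uses the same row index as the DataFrame it came from
print(math_col.index.equals(students.index))

# Each column keeps its own dtype, so a DataFrame can mix text and numbers
print(students.dtypes)
```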

🔄 Converting Between Series and DataFrame

You can easily convert between them:

import pandas as pd

# Start with a DataFrame
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'score': [85, 92, 78]
})

print("Original DataFrame:")
print(df)
print()

# Extract a Series (one column)
scores_series = df['score']
print("Extracted Series:")
print(scores_series)
print(f"Type: {type(scores_series)}")
print()

# Convert Series back to DataFrame
# (the argument sets the column name, here renaming 'score' to 'scores')
scores_df = scores_series.to_frame('scores')
print("Series to DataFrame:")
print(scores_df)
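A related detail that often trips people up: single brackets give a Series, while double brackets give a one-column DataFrame. The sketch below shows both, plus squeeze() as another way back to a Series. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'score': [85, 92, 78]
})

as_series = df['score']    # single brackets: a Series
as_frame = df[['score']]   # double brackets: a one-column DataFrame

print(type(as_series).__name__)
print(type(as_frame).__name__)
print(as_frame.shape)

# squeeze() collapses a one-column DataFrame back into a Series
squeezed = as_frame.squeeze('columns')
print(type(squeezed).__name__)
```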

🎯 When to Use Each

Use Series When          | Use DataFrame When
Working with one column  | Working with multiple columns
Simple calculations      | Complex data analysis
Single data type         | Mixed data types
Quick operations         | Building datasets

Series Examples

import pandas as pd

# Perfect for single column operations
temperatures = pd.Series([22, 25, 19, 24, 21])

print("Temperatures:")
print(temperatures)
print()

# Easy calculations
print("Analysis:")
print(f"Average: {temperatures.mean():.1f}°C")
print(f"Max: {temperatures.max()}°C")
print(f"Days above 22°C: {len(temperatures[temperatures > 22])}")

DataFrame Examples

import pandas as pd

# Perfect for multi-column data
weather = pd.DataFrame({
    'day': ['Mon', 'Tue', 'Wed'],
    'temp': [22, 25, 19],
    'humidity': [65, 70, 80]
})

print("Weather Data:")
print(weather)
print()

# Complex analysis
print("Analysis:")
print(f"Average temp: {weather['temp'].mean():.1f}°C")
print(f"Average humidity: {weather['humidity'].mean():.1f}%")

# Add calculated column
weather['comfort'] = weather['temp'] * (100 - weather['humidity']) / 100
print("\nWith comfort index:")
print(weather)

🔍 Accessing Data Differently

The way you access data is slightly different:

import pandas as pd

# Series access
scores = pd.Series([85, 92, 78], index=['Alice', 'Bob', 'Charlie'])
print("Series:")
print(scores)
print(f"Alice's score: {scores['Alice']}")
print()

# DataFrame access
students = pd.DataFrame({
    'math': [85, 92, 78],
    'english': [90, 87, 85]
}, index=['Alice', 'Bob', 'Charlie'])

print("DataFrame:")
print(students)
print(f"Alice's math score: {students.loc['Alice', 'math']}")
print(f"All math scores: {students['math'].tolist()}")
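Rows can be accessed too, and a single row comes back as a Series whose index is the column names. Here is a small sketch using the same students table, with .loc for labels and .iloc for positions.

```python
import pandas as pd

students = pd.DataFrame({
    'math': [85, 92, 78],
    'english': [90, 87, 85]
}, index=['Alice', 'Bob', 'Charlie'])

# One row by label: a Series indexed by the column names
alice = students.loc['Alice']
print(alice)

# The same row by position
first_row = students.iloc[0]
print(first_row.equals(alice))

# A single cell by position: row 0, column 1 (Alice's english score)
print(students.iloc[0, 1])
```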

🛠️ Common Operations Comparison

Here's how the same operations work with both:

import pandas as pd

# Create test data
series_data = pd.Series([10, 20, 30, 40])
df_data = pd.DataFrame({'values': [10, 20, 30, 40]})

print("Series operations:")
print(f"Sum: {series_data.sum()}")
print(f"Mean: {series_data.mean()}")
print(f"Max: {series_data.max()}")
print()

print("DataFrame operations:")
print(f"Sum: {df_data['values'].sum()}")
print(f"Mean: {df_data['values'].mean()}")
print(f"Max: {df_data['values'].max()}")
print()

# Note: selecting a column first gives a single value per operation.
# Calling df_data.sum() directly also works; it returns one result per column.
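Aggregations can also be called on a whole DataFrame without selecting a column. In that case the result is a Series with one value per column, or one value per row with axis=1. A minimal sketch, using a hypothetical two-column table:

```python
import pandas as pd

# Hypothetical two-column table for illustration
df_data = pd.DataFrame({
    'a': [10, 20, 30, 40],
    'b': [1, 2, 3, 4]
})

# Default: aggregate down each column, giving one result per column
col_sums = df_data.sum()
print(col_sums)

# axis=1: aggregate across each row, giving one result per row
row_sums = df_data.sum(axis=1)
print(row_sums)
```

Selecting a column first, as in the comparison above, is the simplest way to get one number instead of a Series of results.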

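🎯 Key Takeaways

A Series is a single labeled column; a DataFrame is a table of Series that share one index.
Selecting one column from a DataFrame (df['score']) gives a Series, and to_frame() turns a Series back into a DataFrame.
Aggregations such as sum(), mean(), and max() work on both; on a DataFrame, select a column first to get a single value.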

🚀 What's Next?

Now you understand the building blocks of Pandas! Let's learn how to get basic information about your DataFrames to understand your data better.

Continue to: Basic DataFrame Info

You're mastering Pandas fundamentals! 📊🔍
