🔍 Series vs DataFrame
Pandas has two main data structures: Series and DataFrame. Think of a Series as a single column and a DataFrame as a full spreadsheet. Let's see the difference and when to use each!
📋 Understanding Series
A Series is just one column of data with an index:
import pandas as pd
# Create a Series
scores = pd.Series([85, 92, 78, 88])
print("Simple Series:")
print(scores)
print()
# Series with custom index
names = pd.Series(['Alice', 'Bob', 'Charlie'], index=[1, 2, 3])
print("Series with custom index:")
print(names)
print()
# Check the type
print(f"Type: {type(scores)}")
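It can help to look at the parts that make up a Series. The sketch below uses made-up quiz scores and a hypothetical name, "quiz", to show the index, the values, the dtype, and the optional name:

```python
import pandas as pd

# Hypothetical data for illustration: three quiz scores labelled by student
scores = pd.Series([85, 92, 78], index=["Alice", "Bob", "Charlie"], name="quiz")

print(scores.index.tolist())   # the labels: ['Alice', 'Bob', 'Charlie']
print(scores.to_numpy())       # the underlying values as a NumPy array
print(scores.dtype)            # one dtype for the whole Series
print(scores.name)             # quiz
```

The name is optional, but it becomes the column name if the Series is later placed in a DataFrame.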
📊 Understanding DataFrames
A DataFrame is multiple columns (each one a Series) that share the same row index:
import pandas as pd
# Create a DataFrame
students = pd.DataFrame({
'name': ['Alice', 'Bob', 'Charlie'],
'math': [85, 92, 78],
'english': [90, 87, 85]
})
print("DataFrame:")
print(students)
print()
# Check the type
print(f"Type: {type(students)}")
print(f"Shape: {students.shape}")
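To make the "columns are Series" idea concrete, here is a small sketch that pulls one column out and then rebuilds part of the table from individual Series. The variable names `math` and `rebuilt` are only for illustration:

```python
import pandas as pd

students = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "math": [85, 92, 78],
    "english": [90, 87, 85],
})

# Each column is a Series that shares the DataFrame's row index
math = students["math"]
print(type(math).__name__)                  # Series
print(math.index.equals(students.index))    # True

# A DataFrame can also be assembled from existing Series
rebuilt = pd.DataFrame({"math": students["math"], "english": students["english"]})
print(rebuilt.equals(students[["math", "english"]]))   # True
```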
🔄 Converting Between Series and DataFrame
Selecting one column of a DataFrame gives you a Series, and to_frame() turns a Series back into a one-column DataFrame:
import pandas as pd
# Start with a DataFrame
df = pd.DataFrame({
'name': ['Alice', 'Bob', 'Charlie'],
'score': [85, 92, 78]
})
print("Original DataFrame:")
print(df)
print()
# Extract a Series (one column)
scores_series = df['score']
print("Extracted Series:")
print(scores_series)
print(f"Type: {type(scores_series)}")
print()
# Convert Series back to DataFrame
scores_df = scores_series.to_frame('scores')
print("Series to DataFrame:")
print(scores_df)
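A common stumbling block is the difference between single and double brackets. The sketch below, which reuses the same sample data, shows that `df["score"]` returns a Series while `df[["score"]]` returns a one-column DataFrame, and that `squeeze("columns")` goes the other way:

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "score": [85, 92, 78],
})

as_series = df["score"]     # single brackets: a Series
as_frame = df[["score"]]    # double brackets (a list of columns): a DataFrame

print(type(as_series).__name__, as_series.shape)   # Series (3,)
print(type(as_frame).__name__, as_frame.shape)     # DataFrame (3, 1)

# to_frame() with no argument reuses the Series name as the column name
print(as_series.to_frame().columns.tolist())       # ['score']

# squeeze("columns") turns a one-column DataFrame back into a Series
back = as_frame.squeeze("columns")
print(back.equals(as_series))                      # True
```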
🎯 When to Use Each
| Use Series When | Use DataFrame When |
|---|---|
| Working with one column | Working with multiple columns |
| Simple calculations | Complex data analysis |
| Single data type | Mixed data types |
| Quick operations | Building datasets |
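The "single data type" versus "mixed data types" row can be checked directly. This is a minimal sketch with made-up weather values: a Series reports one `dtype`, while a DataFrame reports one dtype per column through `dtypes`:

```python
import pandas as pd

temps = pd.Series([22, 25, 19])
weather = pd.DataFrame({
    "day": ["Mon", "Tue", "Wed"],
    "temp": [22, 25, 19],
    "humidity": [65.5, 70.0, 80.2],
})

print(temps.dtype)      # a single dtype for the whole Series
print(weather.dtypes)   # one dtype per column

# Mixing types inside one Series falls back to the generic object dtype
mixed = pd.Series([22, "warm", 19.5])
print(mixed.dtype)      # object
```

A Series can technically hold mixed values, but it loses the fast numeric operations, so keeping each column to one type is usually the better choice.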
Series Examples
import pandas as pd
# Perfect for single column operations
temperatures = pd.Series([22, 25, 19, 24, 21])
print("Temperatures:")
print(temperatures)
print()
# Easy calculations
print("Analysis:")
print(f"Average: {temperatures.mean():.1f}°C")
print(f"Max: {temperatures.max()}°C")
print(f"Days above 22°C: {len(temperatures[temperatures > 22])}")
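The "days above 22°C" line relies on a boolean mask. As a short sketch of what happens in between, the comparison produces a Series of True/False values that can filter the data or be summed to count matches:

```python
import pandas as pd

temperatures = pd.Series([22, 25, 19, 24, 21])

above = temperatures > 22              # boolean Series with the same index
print(above.tolist())                  # [False, True, False, True, False]
print(temperatures[above].tolist())    # [25, 24]
print(int(above.sum()))                # 2, because True counts as 1
```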
DataFrame Examples
import pandas as pd
# Perfect for multi-column data
weather = pd.DataFrame({
'day': ['Mon', 'Tue', 'Wed'],
'temp': [22, 25, 19],
'humidity': [65, 70, 80]
})
print("Weather Data:")
print(weather)
print()
# Complex analysis
print("Analysis:")
print(f"Average temp: {weather['temp'].mean():.1f}°C")
print(f"Average humidity: {weather['humidity'].mean():.1f}%")
# Add calculated column
weather['comfort'] = weather['temp'] * (100 - weather['humidity']) / 100
print("\nWith comfort index:")
print(weather)
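Column arithmetic like the comfort index works element by element and produces a Series. If you would rather not modify the original table, one option is `assign()`, which returns a new DataFrame. The sketch below reuses the same illustrative formula; the name `with_comfort` is just an example:

```python
import pandas as pd

weather = pd.DataFrame({
    "day": ["Mon", "Tue", "Wed"],
    "temp": [22, 25, 19],
    "humidity": [65, 70, 80],
})

# Arithmetic between two columns is applied row by row and yields a Series
comfort = weather["temp"] * (100 - weather["humidity"]) / 100
print(type(comfort).__name__)    # Series

# assign() returns a new DataFrame and leaves the original untouched
with_comfort = weather.assign(comfort=comfort)
print(with_comfort)
print("comfort" in weather.columns)   # False
```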
🔍 Accessing Data Differently
A Series has one axis, so a single label is enough. A DataFrame has two, so you give a row label and a column name:
import pandas as pd
# Series access
scores = pd.Series([85, 92, 78], index=['Alice', 'Bob', 'Charlie'])
print("Series:")
print(scores)
print(f"Alice's score: {scores['Alice']}")
print()
# DataFrame access
students = pd.DataFrame({
'math': [85, 92, 78],
'english': [90, 87, 85]
}, index=['Alice', 'Bob', 'Charlie'])
print("DataFrame:")
print(students)
print(f"Alice's math score: {students.loc['Alice', 'math']}")
print(f"All math scores: {students['math'].tolist()}")
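Label-based access with `loc` has a position-based counterpart, `iloc`. The following sketch, using the same sample scores, shows both on a Series and on a DataFrame, and shows that a single row of a DataFrame comes back as a Series:

```python
import pandas as pd

names = ["Alice", "Bob", "Charlie"]
scores = pd.Series([85, 92, 78], index=names)
students = pd.DataFrame({"math": [85, 92, 78], "english": [90, 87, 85]}, index=names)

# Series: one axis, so one key
print(scores.loc["Bob"])     # by label: 92
print(scores.iloc[1])        # by position: 92

# DataFrame: two axes, so a row key and then a column key
print(students.loc["Bob", "english"])   # 87
print(students.iloc[1, 1])              # 87

# A whole row comes back as a Series indexed by column name
bob = students.loc["Bob"]
print(bob.index.tolist())    # ['math', 'english']
```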
🛠️ Common Operations Comparison
Here's how the same operations work with both:
import pandas as pd
# Create test data
series_data = pd.Series([10, 20, 30, 40])
df_data = pd.DataFrame({'values': [10, 20, 30, 40]})
print("Series operations:")
print(f"Sum: {series_data.sum()}")
print(f"Mean: {series_data.mean()}")
print(f"Max: {series_data.max()}")
print()
print("DataFrame operations:")
print(f"Sum: {df_data['values'].sum()}")
print(f"Mean: {df_data['values'].mean()}")
print(f"Max: {df_data['values'].max()}")
print()
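# Note: select a column first to get a single number from a DataFrame.
# Calling df_data.sum() on the whole DataFrame returns one result per column instead.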
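The same methods can also be called on a whole DataFrame. In that case the result is a Series with one value per column, or one per row with `axis=1`. A minimal sketch with made-up scores:

```python
import pandas as pd

df = pd.DataFrame({"math": [85, 92, 78], "english": [90, 87, 85]})

col_totals = df.sum()          # one result per column, indexed by column name
row_totals = df.sum(axis=1)    # one result per row

print(col_totals.index.tolist())   # ['math', 'english']
print(col_totals.tolist())         # [255, 262]
print(row_totals.tolist())         # [175, 179, 163]
```

This works cleanly here because every column is numeric. With text columns in the mix, it is usually simpler to select the numeric columns first.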
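🎯 Key Takeaways
A Series is a single column of values with an index. A DataFrame is several columns that share one index.
Selecting one column, as in df['score'], returns a Series, and to_frame() turns a Series back into a DataFrame.
Both support the same calculations, such as sum(), mean(), and max(). On a DataFrame, select a column first when you want a single number.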
🚀 What's Next?
Now you understand the building blocks of Pandas! Let's learn how to get basic information about your DataFrames to understand your data better.
Continue to: Basic DataFrame Info
You're mastering Pandas fundamentals! 📊🔍