📥 Loading Data

Real data analysis starts with loading data from external sources! Whether it's a CSV file from a spreadsheet, data from a database, or JSON from a web API, Pandas makes it easy to get your data into DataFrames.

🎯 Why Learn Data Loading?

Creating DataFrames manually is great for learning, but real work involves loading existing data:

import pandas as pd

# Manual creation (what we've been doing)
manual_data = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'score': [85, 92, 78]
})

print("Manual DataFrame:")
print(manual_data)
print()

# Real world: loading from files (what we'll learn)
# df = pd.read_csv('students.csv')
# df = pd.read_excel('sales_data.xlsx')
# df = pd.read_json('api_response.json')

print("🎯 Next: Loading from actual files!")

📋 Most Common File Formats

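| Format | Extension | Used For | Pandas Method |
|--------|-----------|----------|---------------|
| CSV | .csv | Spreadsheet exports, simple data | pd.read_csv() |
| Excel | .xlsx, .xls | Office spreadsheets | pd.read_excel() |
| JSON | .json | Web APIs, configuration | pd.read_json() |
| SQL | .db, .sqlite | Databases | pd.read_sql() |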
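
To make the format-to-reader mapping concrete, here is a minimal sketch that writes the same small DataFrame to a temporary CSV file and a temporary JSON file, then reads each one back with its matching reader. The sample data and file names are made up for illustration.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical sample data, used only for this illustration
original = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'score': [85, 92, 78]
})

with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / 'students.csv'
    json_path = Path(tmp) / 'students.json'

    # Write the same data in two different formats
    original.to_csv(csv_path, index=False)
    original.to_json(json_path, orient='records')

    # Each format has its own reader, but both return a DataFrame
    from_csv = pd.read_csv(csv_path)
    from_json = pd.read_json(json_path)

print("From CSV:", from_csv.shape)
print("From JSON:", from_json.shape)
```

Whatever the source format, the result is an ordinary DataFrame, so everything you do after loading stays the same.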

📂 Understanding File Paths

Before loading files, you need to know where they are:

import pandas as pd
import os

# Check your current directory
print("Current directory:")
print(os.getcwd())
print()

# List files in current directory
print("Files here:")
files = [f for f in os.listdir('.') if f.endswith(('.csv', '.xlsx', '.json'))]
if files:
    print(files)
else:
    print("No data files found in current directory")
print()

# Example file paths
print("File path examples:")
print("Same folder: 'data.csv'")
print("Subfolder: 'data/sales.csv'")
print("Full path: 'C:/Users/YourName/Documents/data.csv'")
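
One way to avoid "file not found" surprises is to build the path with the standard library's pathlib module and check that it exists before loading. This is only a sketch: the 'data' subfolder and 'sales.csv' file name are assumptions borrowed from the path examples above, not files that ship with this lesson.

```python
from pathlib import Path

# Hypothetical location: a 'data' subfolder inside the current directory
data_path = Path('data') / 'sales.csv'

print("Relative path:", data_path)
print("Absolute path:", data_path.resolve())

# Check before loading so the error message is easy to understand
if data_path.exists():
    print("Found it, safe to load")
else:
    print("Not found, check the folder and file name")
```

Printing the absolute path shows exactly where Python is looking, which is often enough to spot a wrong folder.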

🛠️ Basic Loading Pattern

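Every file load follows the same pattern: pass the file's location to the matching pd.read_*() function, get a DataFrame back, then take a quick look at what you loaded.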
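
As a rough sketch of that pattern, the example below loads a small CSV and then inspects it. To keep it runnable without any files on disk, an in-memory text buffer (io.StringIO) stands in for a real file; the contents are made up for illustration.

```python
import io

import pandas as pd

# Stand-in for a file on disk (hypothetical contents)
csv_text = io.StringIO(
    "name,score\n"
    "Alice,85\n"
    "Bob,92\n"
    "Charlie,78\n"
)

# Step 1: load with the reader that matches the format
df = pd.read_csv(csv_text)

# Step 2: check how many rows and columns arrived
print(df.shape)

# Step 3: look at the first few rows
print(df.head())

# Step 4: confirm each column has the type you expect
print(df.dtypes)
```

With a real file, only step 1 changes: replace csv_text with the file path, for example pd.read_csv('students.csv').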

📊 What You'll Learn in This Section

Master loading data from different sources, starting with CSV files and then moving on to Excel and JSON.

🎮 Simple Loading Examples

Here's what loading data looks like in practice:

import pandas as pd

# Simulate loading different file types
print("📄 CSV Loading:")
print("df = pd.read_csv('sales_data.csv')")
print("✅ Loaded 1000 rows, 5 columns")
print()

print("📊 Excel Loading:")
print("df = pd.read_excel('monthly_report.xlsx')")
print("✅ Loaded 500 rows, 8 columns")
print()

print("🌐 JSON Loading:")
print("df = pd.read_json('api_data.json')")
print("✅ Loaded 250 rows, 3 columns")
print()

print("🗄️ Database Loading:")
print("df = pd.read_sql('SELECT * FROM customers', connection)")
print("✅ Loaded 2000 rows, 10 columns")
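
The messages above are simulated. For a version that really loads something, here is a small sketch of the database case using Python's built-in sqlite3 module and an in-memory database. The 'customers' table and its rows are invented for this example.

```python
import sqlite3

import pandas as pd

# In-memory database with a hypothetical 'customers' table
connection = sqlite3.connect(':memory:')
connection.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
connection.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, 'Alice'), (2, 'Bob'), (3, 'Charlie')]
)
connection.commit()

# read_sql runs the query and returns the result as a DataFrame
df = pd.read_sql('SELECT * FROM customers', connection)
connection.close()

# Report the real size instead of a hard-coded message
rows, columns = df.shape
print(f"Loaded {rows} rows, {columns} columns")
```

Reading the counts from df.shape means the message always matches what was actually loaded.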

🔧 Common Loading Issues

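| Problem | Solution |
|---------|----------|
| "File not found" | Check the file path and name |
| "Permission denied" | Make sure the file isn't open in Excel |
| "Encoding error" | Pass the file's actual encoding, e.g. encoding='latin-1' (UTF-8 is already the default) |
| "Wrong separator" | For CSV, try sep=';' or sep='\t' |
| "Date parsing issues" | Use the parse_dates parameter |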
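
A few of these fixes can be seen in a short sketch. The semicolon-separated text below is a made-up export, and the missing file name is deliberately one that should not exist.

```python
import io

import pandas as pd

# Hypothetical semicolon-separated export with a date column
raw = "order_date;amount\n2024-01-15;100\n2024-02-20;250\n"

# With the default comma separator, everything lands in a single column
wrong = pd.read_csv(io.StringIO(raw))
print("Default separator:", wrong.shape)

# sep=';' splits the columns; parse_dates converts the date column
right = pd.read_csv(io.StringIO(raw), sep=';', parse_dates=['order_date'])
print("With sep=';':", right.shape)
print(right.dtypes)

# A missing file raises FileNotFoundError, which you can catch
try:
    pd.read_csv('this_file_does_not_exist.csv')
except FileNotFoundError:
    print("File not found: check the path and name")
```

A DataFrame with only one column right after loading a CSV is a common sign that the separator is wrong.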

🎯 Key Takeaways

🚀 What's Next?

Ready to load real data? Let's start with the most common format - CSV files, then move on to Excel and JSON.

Start with: Reading Files (CSV, Excel, JSON)

Time to work with real data! 📊🚀
