Pandas: DataFrame Operations

Turning Your Data Into a Well-Organized Chaos!

Updated
•5 min read
Pandas DataFrame Analysis

View and analyze your data frames with built-in Pandas methods. Stats made simple!

Example 1: Get a quick statistical summary.

import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
print(df.describe())

# Output:
#          A    B
# count  3.0  3.0
# mean   2.0  5.0
# std    1.0  1.0
# min    1.0  4.0
# 25%    1.5  4.5
# 50%    2.0  5.0
# 75%    2.5  5.5
# max    3.0  6.0

Example 2: Calculate column-wise means.

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
print(df.mean())

# Output:
# A    2.0
# B    5.0
# dtype: float64

Example 3: Return the first n rows in a data frame.

data = {
    "Name": ["John", "Alice", "Bob", "Emma", "Mike"],
    "Age": [25, 30, 35, 28, 32],
    "City": ["New York", "Paris", "London", "Sydney", "Tokyo"],
}
df = pd.DataFrame(data)
print(df.head(3))

# Output:
#     Name  Age      City
# 0   John   25  New York
# 1  Alice   30     Paris
# 2    Bob   35    London

Example 4: Return the last n rows in a data frame.

data = {
    "Name": ["John", "Alice", "Bob", "Emma", "Mike"],
    "Age": [25, 30, 35, 28, 32],
    "City": ["New York", "Paris", "London", "Sydney", "Tokyo"],
}
df = pd.DataFrame(data)
print(df.tail(3))

# Output:
#    Name  Age    City
# 2   Bob   35  London
# 3  Emma   28  Sydney
# 4  Mike   32   Tokyo

Example 5: Get the data frame information.

data = {
    "Name": ["John", "Alice", "Bob", "Emma", "Mike"],
    "Age": [25, 30, 35, 28, 32],
    "City": ["New York", "Paris", "London", "Sydney", "Tokyo"],
}
df = pd.DataFrame(data)
df.info()  # info() prints the summary itself and returns None, so no print() is needed

# Output:
# <class 'pandas.core.frame.DataFrame'>
# RangeIndex: 5 entries, 0 to 4
# Data columns (total 3 columns):
#  #   Column  Non-Null Count  Dtype
# ---  ------  --------------  -----
#  0   Name    5 non-null      object
#  1   Age     5 non-null      int64
#  2   City    5 non-null      object
# dtypes: int64(1), object(2)
# memory usage: 248.0+ bytes
💡 Pro tip: Let Pandas do the math while you grab a coffee.

Pandas DataFrame Manipulation

Add, update, or drop columns and rows with ease.

Example 1: Add and drop columns.

data = {
    "Name": ["John", "Alice", "Bob", "Emma", "Mike"],
    "Age": [25, 30, 35, 28, 32],
    "City": ["New York", "Paris", "London", "Sydney", "Tokyo"],
}
df = pd.DataFrame(data)

df["Country"] = ["United States", "France", "United Kingdom", "Australia", "Japan"]
print(df.head(2))

print("================")

df.drop("Age", axis=1, inplace=True)
# To drop multiple: df.drop(["Age", "City"], axis=1, inplace=True)
# Using columns: df.drop(columns="Age", inplace=True)
print(df.head(2))

# Output:
#     Name  Age      City        Country
# 0   John   25  New York  United States
# 1  Alice   30     Paris         France
# ================
#     Name      City        Country
# 0   John  New York  United States
# 1  Alice     Paris         France

Example 2: Insert and drop rows.

data = {
    "Name": ["John", "Alice", "Bob", "Emma", "Mike"],
    "Age": [25, 30, 35, 28, 32],
    "City": ["New York", "Paris", "London", "Sydney", "Tokyo"],
}
df = pd.DataFrame(data)

df.loc[len(df.index)] = ["Drake", 32, "Bangkok"]
# To replace: df.loc[2] = ["Drake", 32, "Bangkok"]
print(df.tail(2))

print("================")

df.drop(1, axis=0, inplace=True)
# To drop multiple: df.drop([1, 3], axis=0, inplace=True)
# Using index: df.drop(index=1, inplace=True)
print(df.head(2))

# Output:
#     Name  Age     City
# 4   Mike   32    Tokyo
# 5  Drake   32  Bangkok
# ================
#    Name  Age      City
# 0  John   25  New York
# 2   Bob   35    London

Example 3: Rename column names and indexes.

data = {
    "Name": ["John", "Alice", "Bob", "Emma", "Mike"],
    "Age": [25, 30, 35, 28, 32],
    "City": ["New York", "Paris", "London", "Sydney", "Tokyo"],
}
df = pd.DataFrame(data)

df.rename(columns={"City": "Address"}, inplace=True)
# Using mapper: df.rename(mapper={"City": "Address"}, axis=1, inplace=True)
print(df.head(2))

print("================")

df.rename(index={1: 100}, inplace=True)
# Using mapper: df.rename(mapper={1: 100}, axis=0, inplace=True)
print(df.head(2))

# Output:
#     Name  Age   Address
# 0   John   25  New York
# 1  Alice   30     Paris
# ================
#       Name  Age   Address
# 0     John   25  New York
# 100  Alice   30     Paris

Think of it like playing Tetris but with data.


Pandas Indexing and Slicing

Access data with labels or positions, no guessing required.

Example 1: Indexing by column or row.

df = pd.DataFrame({'A': [10, 20], 'B': [30, 40]})
print(df['A'])  # Access column
print(df.iloc[0])  # Access first row by position
print(df.loc[0])  # Access first row by label
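
With the default 0, 1, 2... index, iloc[0] and loc[0] return the same row, which hides the difference between them. Here is a small sketch that uses a made-up frame (the name scores and its shuffled labels are assumptions for illustration) where position and label disagree:

```python
import pandas as pd

# Hypothetical frame: integer labels deliberately out of positional order.
scores = pd.DataFrame({"A": [10, 20, 30]}, index=[2, 0, 1])

by_position = scores.iloc[0]  # first row as stored (its label is 2)
by_label = scores.loc[0]      # the row whose index label is 0 (stored second)

print(by_position["A"])  # 10
print(by_label["A"])     # 20
```

Rule of thumb: iloc counts rows, loc reads labels.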

Example 2: Slice rows and columns.

print(df.iloc[:1])  # First row
print(df[['A', 'B']])  # Selected columns

Indexing in Pandas is like peeling an onion: layer by layer.


Pandas Select

Filter specific rows or data based on conditions.

Example 1: Simple condition.

print(df[df['A'] > 15])

Example 2: Multiple conditions.

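# With the two-row df from the indexing examples, no row meets both conditions,
# so this prints an empty DataFrame.
print(df[(df['A'] > 10) & (df['B'] < 40)])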
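
Beyond &, boolean masks can also be combined with | (or), flipped with ~ (not), or built with isin. A minimal sketch, assuming a slightly larger made-up frame called orders so that each filter actually returns rows:

```python
import pandas as pd

# Hypothetical data, chosen so every filter below matches at least one row.
orders = pd.DataFrame({"A": [10, 20, 30], "B": [30, 40, 50]})

# OR: keep rows where either condition holds (parentheses are required).
either = orders[(orders["A"] > 25) | (orders["B"] < 35)]

# NOT: keep rows where the condition is false.
negated = orders[~(orders["A"] > 15)]

# Membership: keep rows whose value appears in a list.
members = orders[orders["A"].isin([10, 30])]

print(either)
print(negated)
print(members)
```

The parentheses matter because & and | bind more tightly than the comparison operators.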

It's like swiping right on the rows you love.


Pandas Multiindex

For when a single index just isn't enough.

Example: Create and use a MultiIndex.

arrays = [['A', 'A', 'B'], [1, 2, 1]]
index = pd.MultiIndex.from_arrays(arrays, names=('Letter', 'Number'))
df = pd.DataFrame({'Value': [10, 20, 30]}, index=index)
print(df)
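
The example above builds the MultiIndex. To actually use it, here is a rough sketch of three common lookups on the same data (the variable names letter_a, single, and number_1 are just for illustration):

```python
import pandas as pd

arrays = [["A", "A", "B"], [1, 2, 1]]
index = pd.MultiIndex.from_arrays(arrays, names=("Letter", "Number"))
df = pd.DataFrame({"Value": [10, 20, 30]}, index=index)

# Select everything under the outer label "A"; the result is indexed by Number.
letter_a = df.loc["A"]

# Select a single cell with a full (Letter, Number) tuple plus a column name.
single = df.loc[("A", 2), "Value"]

# Cross-section on the inner level: all rows where Number == 1.
number_1 = df.xs(1, level="Number")

print(letter_a)
print(single)  # 20
print(number_1)
```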

MultiIndex is for the overachievers. You know who you are.


Pandas Reshape

Reshape your data with melt and pivot.

Example 1: Use melt for long format.

df = pd.DataFrame({'ID': [1, 2], 'Value': [10, 20]})
melted = pd.melt(df, id_vars='ID')
print(melted)

Example 2: Use pivot for wide format.

pivoted = melted.pivot(index='ID', columns='variable', values='value')
print(pivoted)
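
With only one value column, melt does not change much. Here is a sketch with two measurement columns, where the wide-to-long-to-wide round trip is easier to see. The column names Height, Weight, Measure, and Reading are invented for this illustration:

```python
import pandas as pd

# Hypothetical wide frame: one row per ID, one column per measurement.
wide = pd.DataFrame({"ID": [1, 2], "Height": [170, 180], "Weight": [65, 80]})

# Long format: one row per (ID, Measure) pair, so 2 IDs x 2 measures = 4 rows.
long_df = pd.melt(wide, id_vars="ID", var_name="Measure", value_name="Reading")

# Back to wide: each distinct Measure becomes a column again.
back = long_df.pivot(index="ID", columns="Measure", values="Reading")

print(long_df)
print(back)
```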

Shape your data like a pro.


Pandas Duplicate Values

Find and remove duplicate rows. Because no one likes redundancy.

Example 1: Detect duplicates.

df = pd.DataFrame({'A': [1, 1, 2], 'B': [3, 3, 4]})
print(df.duplicated())

Example 2: Drop duplicates.

df.drop_duplicates(inplace=True)
print(df)
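
By default both methods compare entire rows and keep the first occurrence. The subset and keep parameters change that. A minimal sketch on a made-up frame where only column A repeats:

```python
import pandas as pd

# Hypothetical data: column A repeats, but no full row is duplicated.
df = pd.DataFrame({"A": [1, 1, 2], "B": [3, 5, 4]})

full_row_dupes = df.duplicated()        # compares whole rows: all False here
dupes_on_a = df.duplicated(subset="A")  # compares column A only

# Keep the last occurrence of each A value instead of the first.
keep_last = df.drop_duplicates(subset="A", keep="last")

# keep=False drops every row that has a duplicate in A.
drop_all = df.drop_duplicates(subset="A", keep=False)

print(dupes_on_a.tolist())  # [False, True, False]
print(keep_last)
print(drop_all)
```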

Duplicates, begone!


Pandas Pivot

Reorganize your data with pivot.

Example: Pivot data.

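# 'Metric' is an illustrative column added here: pivot needs one column whose values become the new headers.
data = {'Date': ['2024-01-01', '2024-01-02'], 'Metric': ['Sales', 'Sales'], 'Value': [10, 20]}
df = pd.DataFrame(data)
pivoted = df.pivot(index='Date', columns='Metric', values='Value')
print(pivoted)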

Pivots make data feel fancy.


Pandas Pivot Table

Summarize data with a pivot table.

Example: Create a pivot table.

data = {'Category': ['A', 'A', 'B'], 'Value': [10, 20, 30]}
df = pd.DataFrame(data)
pivot_table = df.pivot_table(values='Value', index='Category', aggfunc='sum')
print(pivot_table)
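
A pivot table can also spread a second key across the columns and apply more than one aggregation. Here is a rough sketch; the Region column and the sales data are made up for illustration:

```python
import pandas as pd

# Hypothetical data with a second grouping key, Region.
sales = pd.DataFrame({
    "Category": ["A", "A", "B", "B"],
    "Region": ["East", "West", "East", "East"],
    "Value": [10, 20, 30, 5],
})

# Rows by Category, columns by Region; missing combinations are filled with 0.
table = sales.pivot_table(
    values="Value", index="Category", columns="Region",
    aggfunc="sum", fill_value=0,
)

# Several aggregations at once: the columns become (aggfunc, value) pairs.
stats = sales.pivot_table(values="Value", index="Category", aggfunc=["sum", "mean"])

print(table)
print(stats)
```

Unlike pivot, pivot_table tolerates repeated index/column pairs because it aggregates them.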

You're basically at Excel-level now.


General Programming

Part 2 of 6

I'm diving into Python, Django, FastAPI, NumPy, Pandas, Docker, and all that good stuff. Think of it as me sharing my coding wins and fails, because who doesn't love a good bug story? If you're into code, you might find these discoveries interesting.
