# Pandas: DataFrame Operations

## Pandas DataFrame Analysis

View and analyze your data frames with built-in Pandas methods. Stats made simple!

Example 1: Get a quick statistical summary.

```python
import pandas as pd  # every example below assumes this import

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
print(df.describe())

# Output:
#          A    B
# count  3.0  3.0
# mean   2.0  5.0
# std    1.0  1.0
# min    1.0  4.0
# 25%    1.5  4.5
# 50%    2.0  5.0
# 75%    2.5  5.5
# max    3.0  6.0
```

Example 2: Calculate column-wise means.

```python
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
print(df.mean())

# Output:
# A    2.0
# B    5.0
# dtype: float64
```
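
The frames used later in this post mix text and numbers, and on recent pandas versions `mean()` can raise an error on a frame like that. Here is a minimal sketch of two ways around it, assuming a made-up frame with a hypothetical `Score` column:

```python
import pandas as pd

# Hypothetical frame mixing a text column with numeric ones.
df = pd.DataFrame(
    {"Name": ["John", "Alice", "Bob"], "Age": [25, 30, 35], "Score": [80.0, 90.0, 85.0]}
)

# Skip non-numeric columns instead of failing on the text column.
numeric_means = df.mean(numeric_only=True)
print(numeric_means)

# Compute several statistics at once on selected numeric columns.
summary = df[["Age", "Score"]].agg(["min", "max", "mean"])
print(summary)
```

`numeric_only=True` is handy when you want a quick look; selecting the columns yourself makes the intent explicit.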

Example 3: Return the first n rows in a data frame.

```python
data = {
    "Name": ["John", "Alice", "Bob", "Emma", "Mike"],
    "Age": [25, 30, 35, 28, 32],
    "City": ["New York", "Paris", "London", "Sydney", "Tokyo"],
}
df = pd.DataFrame(data)
print(df.head(3))

# Output:
#     Name  Age      City
# 0   John   25  New York
# 1  Alice   30     Paris
# 2    Bob   35    London
```

Example 4: Return the last n rows in a data frame.

```python
data = {
    "Name": ["John", "Alice", "Bob", "Emma", "Mike"],
    "Age": [25, 30, 35, 28, 32],
    "City": ["New York", "Paris", "London", "Sydney", "Tokyo"],
}
df = pd.DataFrame(data)
print(df.tail(3))

# Output:
#    Name  Age    City
# 2   Bob   35  London
# 3  Emma   28  Sydney
# 4  Mike   32   Tokyo
```

Example 5: Get the data frame information.

```python
data = {
    "Name": ["John", "Alice", "Bob", "Emma", "Mike"],
    "Age": [25, 30, 35, 28, 32],
    "City": ["New York", "Paris", "London", "Sydney", "Tokyo"],
}
df = pd.DataFrame(data)
df.info()  # info() prints the summary itself and returns None, so no print() is needed

# Output:
# <class 'pandas.core.frame.DataFrame'>
# RangeIndex: 5 entries, 0 to 4
# Data columns (total 3 columns):
#  #   Column  Non-Null Count  Dtype
# ---  ------  --------------  -----
#  0   Name    5 non-null      object
#  1   Age     5 non-null      int64
#  2   City    5 non-null      object
# dtypes: int64(1), object(2)
# memory usage: 248.0+ bytes
```

<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">Pro tip: Let Pandas do the math while you grab a coffee.</div>
</div>

---

## Pandas DataFrame Manipulation

Add, update, or drop columns and rows with ease.

Example 1: Add and drop columns.

```python
data = {
    "Name": ["John", "Alice", "Bob", "Emma", "Mike"],
    "Age": [25, 30, 35, 28, 32],
    "City": ["New York", "Paris", "London", "Sydney", "Tokyo"],
}
df = pd.DataFrame(data)

df["Country"] = ["United States", "France", "United Kingdom", "Australia", "Japan"]
print(df.head(2))

print("================")

df.drop("Age", axis=1, inplace=True)
# To drop multiple: df.drop(["Age", "City"], axis=1, inplace=True)
# Using columns: df.drop(columns="Age", inplace=True)
print(df.head(2))

# Output:
#     Name  Age      City        Country
# 0   John   25  New York  United States
# 1  Alice   30     Paris         France
# ================
#     Name      City        Country
# 0   John  New York  United States
# 1  Alice     Paris         France
```
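
If you would rather not modify the original frame, the same add-and-drop steps can be done without `inplace=True`. This is only a sketch on a cut-down, two-row version of the data; `assign` and `drop` each return a new frame:

```python
import pandas as pd

df = pd.DataFrame({"Name": ["John", "Alice"], "Age": [25, 30]})

# assign() returns a copy with the extra column; df itself is untouched.
with_country = df.assign(Country=["United States", "France"])

# drop() without inplace=True also returns a new frame.
without_age = with_country.drop(columns="Age")

print(df.columns.tolist())
print(without_age.columns.tolist())
```

Keeping the original around makes it easier to rerun a notebook cell without surprises.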

Example 2: Insert and drop rows.

```python
data = {
    "Name": ["John", "Alice", "Bob", "Emma", "Mike"],
    "Age": [25, 30, 35, 28, 32],
    "City": ["New York", "Paris", "London", "Sydney", "Tokyo"],
}
df = pd.DataFrame(data)

df.loc[len(df.index)] = ["Drake", 32, "Bangkok"]
# To replace: df.loc[2] = ["Drake", 32, "Bangkok"]
print(df.tail(2))

print("================")

df.drop(1, axis=0, inplace=True)
# To drop multiple: df.drop([1, 3], axis=0, inplace=True)
# Using index: df.drop(index=1, inplace=True)
print(df.head(2))

# Output:
#     Name  Age     City
# 4   Mike   32    Tokyo
# 5  Drake   32  Bangkok
# ================
#    Name  Age      City
# 0  John   25  New York
# 2   Bob   35    London
```
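
Assigning through `df.loc` works well for one row. For several rows at once, `pd.concat` is one option. A rough sketch, where the second new row ("Mia", 27, "Rome") is made up for illustration:

```python
import pandas as pd

df = pd.DataFrame(
    {"Name": ["John", "Alice"], "Age": [25, 30], "City": ["New York", "Paris"]}
)

# Rows to add; "Mia" is a hypothetical entry, not part of the original data.
new_rows = pd.DataFrame(
    {"Name": ["Drake", "Mia"], "Age": [32, 27], "City": ["Bangkok", "Rome"]}
)

# ignore_index=True renumbers the combined rows 0..3.
combined = pd.concat([df, new_rows], ignore_index=True)

# Drop rows 1 and 3, then renumber what is left.
trimmed = combined.drop(index=[1, 3]).reset_index(drop=True)
print(trimmed)
```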

Example 3: Rename column names and indexes.

```python
data = {
    "Name": ["John", "Alice", "Bob", "Emma", "Mike"],
    "Age": [25, 30, 35, 28, 32],
    "City": ["New York", "Paris", "London", "Sydney", "Tokyo"],
}
df = pd.DataFrame(data)

df.rename(columns={"City": "Address"}, inplace=True)
# Using mapper: df.rename(mapper={"City": "Address"}, axis=1, inplace=True)
print(df.head(2))

print("================")

df.rename(index={1: 100}, inplace=True)
# Using mapper: df.rename(mapper={1: 100}, axis=0, inplace=True)
print(df.head(2))

# Output:
#     Name  Age   Address
# 0   John   25  New York
# 1  Alice   30     Paris
# ================
#       Name  Age   Address
# 0     John   25  New York
# 100  Alice   30     Paris
```

Think of it like playing Tetris but with data.

---

## Pandas Indexing and Slicing

Access data with labels or positions—no guessing required.

Example 1: Indexing by column or row.

```python
df = pd.DataFrame({'A': [10, 20], 'B': [30, 40]})
print(df['A'])  # Access column
print(df.iloc[0])  # Access first row by position
print(df.loc[0])  # Access first row by label
```

Example 2: Slice rows and columns.

```python
print(df.iloc[:1])  # First row
print(df[['A', 'B']])  # Selected columns
```
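
With the default 0, 1, 2... index, `loc` and `iloc` look interchangeable. The difference shows up once the labels are not positions. A small sketch, assuming a frame with made-up string labels `x`, `y`, `z`:

```python
import pandas as pd

# Hypothetical frame whose row labels are strings, so labels != positions.
df = pd.DataFrame({"A": [10, 20, 30], "B": [40, 50, 60]}, index=["x", "y", "z"])

first_by_position = df.iloc[0]    # row at position 0
first_by_label = df.loc["x"]      # row labelled "x"

# iloc slices exclude the end position; loc slices include the end label.
by_position = df.iloc[0:2]        # rows "x" and "y"
by_label = df.loc["x":"y"]        # also rows "x" and "y"

# A single cell, addressed both ways.
cell_by_label = df.loc["y", "B"]
cell_by_position = df.iloc[1, 1]

print(by_position)
print(cell_by_label, cell_by_position)
```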

Indexing in Pandas is like peeling an onion: layer by layer.

---

## Pandas Select

Filter specific rows or data based on conditions.

Example 1: Simple condition.

```python
print(df[df['A'] > 15])
```

Example 2: Multiple conditions.

```python
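# Wrap each condition in parentheses and join with & (and); here no row passes both, so the result is empty.
print(df[(df['A'] > 10) & (df['B'] < 40)])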
```
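
Beyond `&`, a few other filters come up often: `|` for "or", `~` for "not", `isin` for membership, and `query` for a string expression. A minimal sketch on a slightly larger, made-up frame:

```python
import pandas as pd

df = pd.DataFrame({"A": [10, 20, 30], "B": [30, 40, 50]})

either = df[(df["A"] > 25) | (df["B"] < 35)]   # OR: rows 0 and 2
negated = df[~(df["A"] > 15)]                  # NOT: row 0
members = df[df["A"].isin([10, 30])]           # membership: rows 0 and 2
queried = df.query("A > 10 and B < 50")        # string expression: row 1

print(either)
print(queried)
```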

It’s like swiping right on the rows you love.

---

## Pandas MultiIndex

For when a single index just isn’t enough.

Example: Create and use a MultiIndex.

```python
arrays = [['A', 'A', 'B'], [1, 2, 1]]
index = pd.MultiIndex.from_arrays(arrays, names=('Letter', 'Number'))
df = pd.DataFrame({'Value': [10, 20, 30]}, index=index)
print(df)
```
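
Once the MultiIndex exists, you will usually want to pull rows out of it. Here is a sketch of a few common lookups on the same frame; treat it as a starting point rather than the full picture:

```python
import pandas as pd

arrays = [["A", "A", "B"], [1, 2, 1]]
index = pd.MultiIndex.from_arrays(arrays, names=("Letter", "Number"))
df = pd.DataFrame({"Value": [10, 20, 30]}, index=index)

# All rows under the outer label "A"; the result is indexed by Number.
letter_a = df.loc["A"]

# One cell, addressed by a full (Letter, Number) tuple.
single = df.loc[("A", 2), "Value"]

# Cross-section on the inner level: every row where Number == 1.
number_1 = df.xs(1, level="Number")

# Turn the index levels back into ordinary columns.
flat = df.reset_index()

print(letter_a)
print(number_1)
```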

MultiIndex is for the overachievers. You know who you are.

---

## Pandas Reshape

Reshape your data with `melt` and `pivot`.

Example 1: Use `melt` for long format.

```python
df = pd.DataFrame({'ID': [1, 2], 'Value': [10, 20]})
melted = pd.melt(df, id_vars='ID')
print(melted)
```

Example 2: Use `pivot` for wide format.

```python
pivoted = melted.pivot(index='ID', columns='variable', values='value')
print(pivoted)
```
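
`melt` is easier to appreciate with more than one value column. The sketch below uses hypothetical `Height` and `Weight` columns, goes wide to long, then back again with `pivot`:

```python
import pandas as pd

# Hypothetical wide frame: one row per ID, one column per measurement.
wide = pd.DataFrame({"ID": [1, 2], "Height": [170, 180], "Weight": [65, 80]})

# Wide -> long: one row per (ID, Measure) pair.
long = pd.melt(wide, id_vars="ID", var_name="Measure", value_name="Reading")

# Long -> wide again, then tidy up the index and the column-axis name.
restored = long.pivot(index="ID", columns="Measure", values="Reading").reset_index()
restored.columns.name = None

print(long)
print(restored)
```

`var_name` and `value_name` just replace the default `variable` and `value` column names.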

Shape your data like a pro.

---

## Pandas Duplicate Values

Find and remove duplicate rows. Because no one likes redundancy.

Example 1: Detect duplicates.

```python
df = pd.DataFrame({'A': [1, 1, 2], 'B': [3, 3, 4]})
print(df.duplicated())
```

Example 2: Drop duplicates.

```python
df.drop_duplicates(inplace=True)
print(df)
```
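
By default both methods compare whole rows and keep the first copy. The `subset`, `keep`, and `ignore_index` parameters change that. A short sketch on a made-up frame:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 1, 2, 2], "B": [3, 3, 4, 5]})

# keep=False flags every copy of a duplicated row, not just the later ones.
all_copies = df.duplicated(keep=False)

# Compare only column A and keep the last row for each value.
by_column = df.drop_duplicates(subset="A", keep="last")

# Drop full-row duplicates and renumber the index from 0.
deduped = df.drop_duplicates(ignore_index=True)

print(all_copies.tolist())
print(by_column)
print(deduped)
```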

Duplicates, begone!

---

## Pandas Pivot

Reorganize your data with `pivot`.

Example: Pivot data.

```python
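# 'Category' is an illustrative column added here: pivot needs one column for the new headers and one for the cells.
data = {'Date': ['2024-01-01', '2024-01-02'], 'Category': ['A', 'B'], 'Value': [10, 20]}
df = pd.DataFrame(data)
pivoted = df.pivot(index='Date', columns='Category', values='Value')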
print(pivoted)
```
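
One thing to watch for: `pivot` only reshapes, it does not aggregate. If two rows land in the same cell, it raises a `ValueError`. A sketch of that failure and one way around it, using a made-up `sales` frame with a hypothetical `Product` column:

```python
import pandas as pd

# Two rows share the same (Date, Product) pair, so they compete for one cell.
sales = pd.DataFrame(
    {
        "Date": ["2024-01-01", "2024-01-01", "2024-01-02"],
        "Product": ["Pen", "Pen", "Ink"],
        "Value": [10, 5, 20],
    }
)

try:
    sales.pivot(index="Date", columns="Product", values="Value")
    pivot_failed = False
except ValueError:
    pivot_failed = True  # duplicate entries cannot be reshaped

# pivot_table resolves the clash by aggregating the competing rows.
summed = sales.pivot_table(index="Date", columns="Product", values="Value", aggfunc="sum")

print(pivot_failed)
print(summed)
```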

Pivots make data feel fancy.

---

## Pandas Pivot Table

Summarize data with a pivot table.

Example: Create a pivot table.

```python
data = {'Category': ['A', 'A', 'B'], 'Value': [10, 20, 30]}
df = pd.DataFrame(data)
pivot_table = df.pivot_table(values='Value', index='Category', aggfunc='sum')
print(pivot_table)
```
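
Pivot tables get more interesting with a second dimension. The sketch below adds a hypothetical `Region` column, spreads it across the headers, and then asks for two aggregations at once:

```python
import pandas as pd

# Hypothetical data: Region is added purely for illustration.
df = pd.DataFrame(
    {
        "Category": ["A", "A", "B", "B"],
        "Region": ["East", "West", "East", "East"],
        "Value": [10, 20, 30, 40],
    }
)

# Categories down the side, regions across the top; empty cells become 0.
by_region = df.pivot_table(
    values="Value", index="Category", columns="Region", aggfunc="sum", fill_value=0
)

# Several aggregations at once; the columns become (aggfunc, column) pairs.
stats = df.pivot_table(values="Value", index="Category", aggfunc=["sum", "mean"])

print(by_region)
print(stats)
```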

You’re basically at Excel-level now.

---
