# Pandas basics

## Column operations

### Renaming columns

```python
import pandas as pd
import numpy as np
import warnings

warnings.filterwarnings('ignore')

# Build a small example dataframe; the values are random, so they differ on every run
df = pd.DataFrame({
    'a': np.random.randn(6),
    'b': np.random.choice([5, 7, np.nan], 6),
    'c': np.random.choice(['foo', 'bar', 'baz'], 6),
})
df.head()
```

```
       a     b    c
0  1.630   NaN  foo
1  2.378   NaN  foo
2  0.461 7.000  baz
3 -1.176   NaN  baz
4  0.909   NaN  foo
```

Passing a dictionary to `.rename()` maps old column names to new ones:

```python
df.rename(columns={"a": "new_name"}, inplace=True)
df.columns
```

```
Index(['new_name', 'b', 'c'], dtype='object')
```

`.rename()` also accepts a mapping function, which is applied to every column name. In this case `str.upper`:

```python
df.rename(columns=str.upper, inplace=True)
df.columns
```

```
Index(['NEW_NAME', 'B', 'C'], dtype='object')
```

We can also use a lambda. For instance, `lambda x: x.capitalize()` results in:

```python
df.rename(columns=lambda x: x.capitalize(), inplace=True)
df.columns
```

```
Index(['New_name', 'B', 'C'], dtype='object')
```

A list of column names can also be assigned directly to `df.columns`. The list must contain exactly one name per column.

```python
df.columns = ["first", "second", "third"]
df.columns
```

```
Index(['first', 'second', 'third'], dtype='object')
```
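The renaming calls above all use `inplace=True` or direct assignment. As a minimal sketch of the alternatives, the example below uses a small hypothetical dataframe `frame` (not part of the original walkthrough) to show that `.rename()` without `inplace=True` returns a new dataframe, how unmatched keys are handled, and what happens when a list of the wrong length is assigned to `.columns`.

```python
import pandas as pd

# Hypothetical frame used only for this illustration.
frame = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Without inplace=True, rename leaves `frame` untouched and returns a renamed copy.
renamed = frame.rename(columns={"a": "alpha"})

# Keys that match no column are silently ignored by default ...
unchanged = frame.rename(columns={"missing": "x"})

# ... unless errors="raise" is passed, which raises a KeyError instead.
try:
    frame.rename(columns={"missing": "x"}, errors="raise")
    raised = False
except KeyError:
    raised = True

# Assigning a list of the wrong length to .columns raises a ValueError.
try:
    frame.columns = ["only_one"]
    length_checked = False
except ValueError:
    length_checked = True
```

Returning a new dataframe makes `.rename()` easy to use in a method chain, while `errors="raise"` can help catch typos in column names early.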

### Dropping columns

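A column can be dropped using the `.drop()` method with the `columns` keyword. `.drop()` returns a new dataframe and leaves the original unchanged unless `inplace=True` is passed. For instance, given the dataframe `df`: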

```python
df
```

```
      first  second third
0  0.549838     5.0   baz
1  0.658684     NaN   foo
2 -0.784545     NaN   foo
3  0.204787     5.0   foo
4  1.206179     5.0   foo
5 -0.898500     5.0   baz
```

We can drop the `second` column using:

```python
df.drop(columns='second')
```

```
      first third
0  0.549838   baz
1  0.658684   foo
2 -0.784545   foo
3  0.204787   foo
4  1.206179   foo
5 -0.898500   baz
```
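The `columns` keyword also accepts a list of labels. The following is a minimal sketch on a hypothetical dataframe `frame` that reuses the column names above; it shows dropping several columns at once and using `errors="ignore"` to skip labels that do not exist.

```python
import pandas as pd

# Hypothetical frame mirroring the column names used above.
frame = pd.DataFrame({
    "first": [0.5, -0.7],
    "second": [5.0, None],
    "third": ["baz", "foo"],
})

# drop returns a new dataframe; `frame` itself keeps all three columns.
trimmed = frame.drop(columns=["first", "second"])

# A label that does not exist raises a KeyError by default;
# errors="ignore" skips it instead.
safe = frame.drop(columns=["second", "fourth"], errors="ignore")
```

Here `trimmed` keeps only `third`, and `safe` keeps `first` and `third`, because the unknown label `fourth` is skipped.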

The `del` statement is another option. However, `del` modifies the dataframe in place, so we make a copy of the dataframe first.

```python
df_copy = df.copy()
df_copy
```

```
      first  second third
0  0.549838     5.0   baz
1  0.658684     NaN   foo
2 -0.784545     NaN   foo
3  0.204787     5.0   foo
4  1.206179     5.0   foo
5 -0.898500     5.0   baz
```

```python
del df_copy['second']
df_copy
```

```
      first third
0  0.549838   baz
1  0.658684   foo
2 -0.784545   foo
3  0.204787   foo
4  1.206179   foo
5 -0.898500   baz
```
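To illustrate why the copy matters, here is a small sketch contrasting `.copy()` with plain assignment. The names `frame`, `frame_copy` and `alias` are hypothetical and used only for this example.

```python
import pandas as pd

frame = pd.DataFrame({"first": [1, 2], "second": [3, 4], "third": [5, 6]})

frame_copy = frame.copy()   # independent copy (deep by default)
del frame_copy["second"]    # removes the column from the copy only

alias = frame               # plain assignment: both names refer to the same object
del alias["third"]          # so the column disappears from `frame` as well
```

After running this, `frame_copy` holds `first` and `third`, while `frame` has lost `third` through `alias` but still has `second`.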

Yet another possibility is to drop the column by its position, looking up its label through `df.columns`. For instance:

```python
df.drop(columns=df.columns[1])
```

```
      first third
0  0.549838   baz
1  0.658684   foo
2 -0.784545   foo
3  0.204787   foo
4  1.206179   foo
5 -0.898500   baz
```

Or we could use a slice to drop a range of adjacent columns, for instance:

```python
df.drop(columns=df.columns[0:2])
```

```
  third
0   baz
1   foo
2   foo
3   foo
4   foo
5   baz
```
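Positional lookups through `df.columns` are not limited to single positions and slices. As a sketch on a hypothetical dataframe `frame`, the example below drops non-adjacent columns with a list of positions and drops the last column with a negative position.

```python
import pandas as pd

frame = pd.DataFrame({"first": [1, 2], "second": [3, 4], "third": [5, 6]})

# Indexing df.columns with a list of positions selects non-adjacent labels.
outer_dropped = frame.drop(columns=frame.columns[[0, 2]])

# Negative positions count from the end, so this drops the last column.
last_dropped = frame.drop(columns=frame.columns[-1])
```

Note that the position is translated to a label before the drop happens, so if several columns share that label, all of them are removed.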
