
Pandas: How to Remove Special Characters from Column

by Erma Khan

You can use the following basic syntax to remove special characters from a column in a pandas DataFrame:

df['my_column'] = df['my_column'].str.replace(r'\W', '', regex=True)

This particular example removes every character in my_column that is not a letter, digit, or underscore. Note that whitespace is removed as well.
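As a quick check of what the pattern matches, here is a minimal sketch on a throwaway Series (the sample values are made up for illustration). It uses the regex \W, written as a raw string so the backslash is preserved. Underscores survive, while spaces and punctuation are stripped:

```python
import pandas as pd

# hypothetical sample values, chosen only to show what \W matches
s = pd.Series(['a_b', 'New York!', 'x-1'])

# \W matches anything that is not a letter, digit, or underscore
cleaned = s.str.replace(r'\W', '', regex=True)

print(cleaned.tolist())
# ['a_b', 'NewYork', 'x1']
```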

The following example shows how to use this syntax in practice.

Example: Remove Special Characters from Column in Pandas

Suppose we have the following pandas DataFrame that contains information about various basketball players:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team' : ['Mavs$', 'Nets', 'Kings!!', 'Spurs%', '&Heat&'],
                   'points' : [12, 15, 22, 29, 24]})

#view DataFrame
print(df)

      team  points
0    Mavs$      12
1     Nets      15
2  Kings!!      22
3   Spurs%      29
4   &Heat&      24

Suppose we would like to remove all special characters from values in the team column.

We can use the following syntax to do so:

#remove special characters from team column
df['team'] = df['team'].str.replace(r'\W', '', regex=True)

#view updated DataFrame
print(df)

    team  points
0   Mavs      12
1   Nets      15
2  Kings      22
3  Spurs      29
4   Heat      24

Notice that all special characters have been removed from values in the team column.

Note: The regex \W matches all non-word characters, i.e. characters that are not letters, digits, or underscores.

In this example, we replaced each non-word character with an empty string, which is equivalent to removing the non-word characters.
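If the column contains values where spaces should be kept, one possible variation is to swap the pattern for an explicit character class. The sketch below is an assumption about what such a cleanup might look like, not part of the example above: the team names and the team_clean column are hypothetical, and the class [^A-Za-z0-9 ] keeps only ASCII letters, digits, and spaces, so it would also strip underscores and accented letters.

```python
import pandas as pd

# hypothetical data with multi-word team names
df = pd.DataFrame({'team': ['San Antonio_Spurs%', '&Miami Heat&']})

# remove everything except ASCII letters, digits, and spaces
df['team_clean'] = df['team'].str.replace(r'[^A-Za-z0-9 ]', '', regex=True)

print(df['team_clean'].tolist())
# ['San AntonioSpurs', 'Miami Heat']
```

Writing the result to a new column rather than overwriting team makes it easy to compare the original and cleaned values side by side.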

Additional Resources

The following tutorials explain how to perform other common tasks in pandas:

How to Replace NaN Values with Zeros in Pandas
How to Replace Empty Strings with NaN in Pandas
How to Replace Values in Column Based on Condition in Pandas
