You can use the following syntax to convert a date column in a pandas DataFrame to integers in YYYYMMDD format:
#convert date column to datetime
df['date_column'] = pd.to_datetime(df['date_column'])
#convert date to YYYYMMDD format
df['date_column'] = df['date_column'].dt.strftime('%Y%m%d').astype(int)
The following example shows how to use this syntax in practice.
Example: Convert Date to YYYYMMDD Format in Pandas
Suppose we have the following pandas DataFrame that shows the sales made by some company on various dates:
import pandas as pd
#create DataFrame
df = pd.DataFrame({'date': pd.date_range(start='1/1/2022', freq='MS', periods=8),
                   'sales': [18, 22, 19, 14, 14, 11, 20, 28]})
#view DataFrame
print(df)
        date  sales
0 2022-01-01     18
1 2022-02-01     22
2 2022-03-01     19
3 2022-04-01     14
4 2022-05-01     14
5 2022-06-01     11
6 2022-07-01     20
7 2022-08-01     28
Now suppose that we would like to format the values in the date column as YYYYMMDD.
We can use the following syntax to do so:
#convert date column to datetime
df['date'] = pd.to_datetime(df['date'])
#convert date to YYYYMMDD format
df['date'] = df['date'].dt.strftime('%Y%m%d').astype(int)
#view updated DataFrame
print(df)
       date  sales
0  20220101     18
1  20220201     22
2  20220301     19
3  20220401     14
4  20220501     14
5  20220601     11
6  20220701     20
7  20220801     28
Notice that the values in the date column are now integers in YYYYMMDD format.
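Because the syntax ends with .astype(int), the result is an integer column rather than text. The following is a minimal sketch of the difference, assuming a small three-row DataFrame built the same way as the one above; the variable names as_text and as_int are only illustrative:

```python
import pandas as pd

#create a small DataFrame with a datetime column
df = pd.DataFrame({'date': pd.date_range(start='1/1/2022', freq='MS', periods=3)})

#strftime() alone returns the dates as YYYYMMDD strings
as_text = df['date'].dt.strftime('%Y%m%d')

#adding astype(int) turns those strings into integers
as_int = as_text.astype(int)

print(as_text.tolist())
print(as_int.tolist())
```

If the values should stay as text, for example to build file names, the .astype(int) step can simply be left off.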
Note that in this example, the date column already had a dtype of datetime64[ns].
However, we can use the to_datetime() function anyway to ensure that a given column has a datetime dtype before applying a YYYYMMDD format, since the .dt accessor only works on datetime-like values.
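The to_datetime() step matters most when the dates are stored as strings. The following is a short sketch of that case, assuming a hypothetical DataFrame whose date column holds text values rather than datetimes:

```python
import pandas as pd

#create DataFrame where the dates are stored as strings (hypothetical data)
df = pd.DataFrame({'date': ['2022-01-15', '2022-02-20'],
                   'sales': [18, 22]})

#convert the string column to datetime so the .dt accessor is available
df['date'] = pd.to_datetime(df['date'])
is_datetime = pd.api.types.is_datetime64_any_dtype(df['date'])

#convert date to YYYYMMDD format
df['date'] = df['date'].dt.strftime('%Y%m%d').astype(int)

print(is_datetime)
print(df['date'].tolist())
```

Without the to_datetime() call, using .dt on a column of plain strings would raise an error.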
Additional Resources
The following tutorials explain how to perform other common operations in pandas:
How to Add and Subtract Days from a Date in Pandas
How to Select Rows Between Two Dates in Pandas
How to Calculate a Difference Between Two Dates in Pandas