When analyzing time series data, timestamp columns are often stored in a format like 「2016-12-17 09:59:17」. This post shows ** how to split a date in this format into year, month, day, hour, and day of the week ** so that it can be handled by machine learning.
This time, we will use the following dummy data.
time.py
import pandas as pd
df = pd.read_csv('df.csv')
df.head()
#output
    patient	Last UpDdated
0	5.0	    2020-03-22 10:00:00
1	4.0	    2020-03-22 11:00:00
2	6.0	    2020-03-22 12:00:00
3	10.0	2020-03-23 10:00:00
4	3.0	    2020-03-23 11:00:00
df.info()
#output
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 21 entries, 0 to 20
Data columns (total 3 columns):
patient          21 non-null float64
Last UpDdated    21 non-null object
dtypes: float64(2), object(1)
memory usage: 800.0+ bytes
We will split the date in the Last UpDdated column.
■ Steps
** ① Convert the column from object type to datetime64[ns] type **
  pd.to_datetime(df['Last UpDdated'])
** ② Get the year, month, day, hour, and day of the week with .dt.~ **
  df['Last UpDdated'].dt.month
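The two steps can be tried on a small in-memory frame first. This is only a minimal sketch: the frame `demo` and its two rows are made up for illustration and stand in for df.csv. It also shows why step ① is needed, since the `.dt` accessor is not available while the column still holds strings.

```python
import pandas as pd

# Hypothetical in-memory data standing in for df.csv (illustration only)
demo = pd.DataFrame(
    {'Last UpDdated': ['2020-03-22 10:00:00', '2020-03-23 11:00:00']}
)

# Before conversion the column holds strings, so .dt raises AttributeError
try:
    demo['Last UpDdated'].dt.month
    dt_available = True
except AttributeError:
    dt_available = False

# ① Convert the strings to a datetime type
demo['Last UpDdated'] = pd.to_datetime(demo['Last UpDdated'])

# ② After conversion, the .dt accessor exposes the date parts
months = demo['Last UpDdated'].dt.month.tolist()
hours = demo['Last UpDdated'].dt.hour.tolist()

print(dt_available, months, hours)
```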
time.py
df['Last UpDdated'] = pd.to_datetime(df['Last UpDdated']) #Convert type
df.dtypes
#output
patient                 float64
Last UpDdated    datetime64[ns]
dtype: object
#Add column "month"
df['month'] = df['Last UpDdated'].dt.month
#Add column "day"
df['day'] = df['Last UpDdated'].dt.day
#Add column "hour"
df['hour'] = df['Last UpDdated'].dt.hour
#Add column "week" (day of the week as an integer)
df['week'] = df['Last UpDdated'].dt.dayofweek
#Drop the original Last UpDdated column
df = df.drop(['Last UpDdated'], axis=1)
df.head()
#output
	patient	month  day	hour  week
0	5.0	    3	   22	10	  6
1	4.0	    3	   22	11	  6
2	6.0	    3	   22	12	  6
3	10.0	3	   23	10	  0
4	3.0	    3	   23	11	  0
The month, day, hour, and day of the week columns have been added based on the values in the Last UpDdated column! The day of the week is an int from 0 to 6, where 0 is Monday and 6 is Sunday, which is why 2020-03-22 (a Sunday) shows 6.
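The year can be taken the same way with `.dt.year`, and `.dt.day_name()` returns the weekday as text, which makes the 0 to 6 codes easier to check. The following is a small sketch under assumptions: the series `ts` and the column names `year` and `week_name` are hypothetical and not part of the original data.

```python
import pandas as pd

# Hypothetical timestamps matching the format used in this post
ts = pd.to_datetime(
    pd.Series(['2020-03-22 10:00:00', '2020-03-23 10:00:00'])
)

features = pd.DataFrame({
    'year': ts.dt.year,             # calendar year as an integer
    'week': ts.dt.dayofweek,        # weekday as an integer code
    'week_name': ts.dt.day_name(),  # weekday as an English name
})

print(features)
```

Whether to keep the integer code or the name depends on the model. The integer is convenient as a numeric feature, while the name is mainly useful for checking the result by eye.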