Pandas Time Series Flashcards
from datetime import datetime
datetime(year=2015, month=7, day=4)
manually build a date using the datetime type
datetime.datetime(2015, 7, 4, 0, 0)
from dateutil import parser
date = parser.parse("4th of July, 2015")
date
using the dateutil module, you can parse dates from a variety of string formats
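As a small illustration of the "variety of string formats" point, the sketch below parses several spellings of the same date and checks that they agree. The sample strings are illustrative choices, not part of the original card.

```python
from datetime import datetime
from dateutil import parser

# Several spellings of the same calendar date (illustrative inputs).
samples = ["4th of July, 2015", "2015-07-04", "July 4, 2015", "Jul 4 2015"]

# parser.parse returns a datetime.datetime for each string.
parsed = [parser.parse(s) for s in samples]

# Every spelling should resolve to midnight on 2015-07-04.
all_same = all(d == datetime(2015, 7, 4) for d in parsed)
print(all_same)
```

Strings such as '07-04-2015' are ambiguous between day-first and month-first order, so it is safer to check how the parser reads them before relying on the result.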
date.strftime('%A')
Once you have a datetime object, you can do things like printing the day of the week:
datetime(year=1976, month=9, day=13).strftime('%A %B')
'Monday September'
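A few more strftime format codes, as a minimal sketch. The weekday and month names depend on the active locale; the values in the comments assume Python's default C/English locale.

```python
from datetime import datetime

d = datetime(year=2015, month=7, day=4)

weekday = d.strftime("%A")   # full weekday name, 'Saturday' in the default locale
month = d.strftime("%B")     # full month name, 'July' in the default locale
iso = d.strftime("%Y-%m-%d") # zero-padded numeric date, '2015-07-04'

print(weekday, month, iso)
```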
import numpy as np
date = np.array('2015-07-04', dtype=np.datetime64)
date
The NumPy team added a set of native time series data types to NumPy; the datetime64 dtype encodes dates as 64-bit integers
NumPy datetime64
date + np.arange(12)
Once we have the date in this datetime64 form, we can quickly do vectorized operations on it
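The vectorized behaviour can be checked with a short sketch: adding an integer array to a day-resolution datetime64 shifts the date by that many days. The variable names week and gaps are illustrative.

```python
import numpy as np

# A zero-dimensional array holding one date; the unit is inferred as days.
date = np.array("2015-07-04", dtype=np.datetime64)

# Integers are interpreted in the date's own unit (days here).
week = date + np.arange(7)
print(week.dtype)   # day-resolution datetime64
print(week)

# Differences between dates are timedelta64 values.
gaps = np.diff(week)
print(gaps.dtype)
```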
np.datetime64('2015-07-04 12:00')
Here is a minute-based datetime
NumPy will infer the desired unit from the input
np.datetime64('2015-07-04 12:59:59.50', 'ns')
Passing a unit code as the second argument forces that unit; here, a nanosecond-based datetime
Code  Meaning      Time span (relative)  Time span (absolute)
Y     Year         ± 9.2e18 years        [9.2e18 BC, 9.2e18 AD]
M     Month        ± 7.6e17 years        [7.6e17 BC, 7.6e17 AD]
W     Week         ± 1.7e17 years        [1.7e17 BC, 1.7e17 AD]
D     Day          ± 2.5e16 years        [2.5e16 BC, 2.5e16 AD]
h     Hour         ± 1.0e15 years        [1.0e15 BC, 1.0e15 AD]
m     Minute       ± 1.7e13 years        [1.7e13 BC, 1.7e13 AD]
s     Second       ± 2.9e11 years        [2.9e11 BC, 2.9e11 AD]
ms    Millisecond  ± 2.9e8 years         [2.9e8 BC, 2.9e8 AD]
This table, drawn from the NumPy datetime64 documentation, lists the available format codes along with the relative and absolute timespans that they can encode.
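To see how the unit is chosen, the sketch below inspects the dtype of a few datetime64 values. The variable names are illustrative; the last step assumes NumPy's usual promotion to the finer of two units when a timedelta64 is added.

```python
import numpy as np

day = np.datetime64("2015-07-04")                      # unit inferred: days
minute = np.datetime64("2015-07-04 12:00")             # unit inferred: minutes
nano = np.datetime64("2015-07-04 12:59:59.50", "ns")   # unit forced: nanoseconds

print(day.dtype, minute.dtype, nano.dtype)

# Adding a minute-based timedelta to a day-based date yields a minute-based date.
shifted = day + np.timedelta64(90, "m")
print(shifted, shifted.dtype)
```

The trade-off shown in the table is between resolution and range: every value is stored in 64 bits, so a finer unit covers a shorter span of time.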
Pandas TIMESTAMP
import pandas as pd
date = pd.to_datetime("4th of July, 2015")
date
Timestamp('2015-07-04 00:00:00')
NumPy-style vectorized operations can be applied directly to this pandas object
date + pd.to_timedelta(np.arange(12), 'D')
DatetimeIndex(['2015-07-04', '2015-07-05', '2015-07-06', '2015-07-07',
               '2015-07-08', '2015-07-09', '2015-07-10', '2015-07-11',
               '2015-07-12', '2015-07-13', '2015-07-14', '2015-07-15'],
              dtype='datetime64[ns]', freq=None)
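As a cross-check, the same twelve days can be built with pd.date_range, which is not used in the card above and is shown here only for comparison. A minimal sketch:

```python
import numpy as np
import pandas as pd

start = pd.to_datetime("2015-07-04")   # a single pandas Timestamp

# Timestamp plus a TimedeltaIndex gives a DatetimeIndex.
days = start + pd.to_timedelta(np.arange(12), unit="D")

# Equivalent construction for comparison: 12 consecutive daily dates.
same = pd.date_range("2015-07-04", periods=12, freq="D")

print((days == same).all())
print(days[-1])
```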
index = pd.DatetimeIndex(['2014-07-04', '2014-08-04',
                          '2015-07-04', '2015-08-04'])
data = pd.Series([0, 1, 2, 3], index=index)
data
The pandas time series tools become really useful when you begin to index data by timestamps
data['2014-07-04':'2015-07-04']
Any of the usual Series indexing patterns work here, with values that can be coerced into dates; both endpoints of the slice are included
data['2015']
Passing a year returns a slice of all data from that year
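The two indexing patterns can be verified on the same small Series. This sketch uses .loc to make the label-based lookup explicit, and adds a year-month string as a further illustration that is not in the original cards.

```python
import pandas as pd

index = pd.DatetimeIndex(["2014-07-04", "2014-08-04",
                          "2015-07-04", "2015-08-04"])
data = pd.Series([0, 1, 2, 3], index=index)

# Label-based slice: both endpoints are included.
span = data.loc["2014-07-04":"2015-07-04"]

# Partial string: every entry whose timestamp falls in 2015.
year = data.loc["2015"]

# Partial string at month resolution: every entry in August 2014.
month = data.loc["2014-08"]

print(span.tolist(), year.tolist(), month.tolist())
```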
dates = pd.to_datetime([datetime(2015, 7, 3), '4th of July, 2015',
                        '2015-Jul-6', '07-07-2015', '20150708'])
dates
Passing a series of dates to pd.to_datetime() yields a DatetimeIndex by default
dates.to_period('D')
Any DatetimeIndex can be converted to a PeriodIndex with the to_period() method, given a frequency code; here 'D' indicates daily frequency
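The effect of the frequency code is easiest to see by converting the same dates at two frequencies. A minimal sketch; the three ISO-formatted dates are illustrative, and 'M' is the monthly period code.

```python
import pandas as pd

dates = pd.to_datetime(["2015-07-03", "2015-07-04", "2015-07-06"])

daily = dates.to_period("D")     # one period per calendar day
monthly = dates.to_period("M")   # all three dates fall in the same month

print(type(daily).__name__)
print(daily)
print(monthly.unique())
```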
dates - dates[0]
A TimedeltaIndex is created, for example, when one date is subtracted from another; here each entry is the offset from the first date
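A short sketch of the subtraction, using three illustrative ISO-formatted dates. The .days attribute extracts the whole-day component of each offset.

```python
import pandas as pd

dates = pd.to_datetime(["2015-07-03", "2015-07-04", "2015-07-06"])

# Subtracting a single Timestamp from a DatetimeIndex gives a TimedeltaIndex.
elapsed = dates - dates[0]
print(type(elapsed).__name__)

# Whole days elapsed since the first date.
days = list(elapsed.days)
print(days)
```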