Resample Data & OHLC

Notes of AI for Trading, Project 1, Resample Data

Posted by Chauncey on May 11, 2020

New Data

new an example series of days

dates = pd.date_range('10/10/2018', periods=11, freq='D')
close_prices = np.arange(len(dates))
close = pd.Series(close_prices, dates)
close
2018-10-10     0
2018-10-11     1
2018-10-12     2
2018-10-13     3
2018-10-14     4
2018-10-15     5
2018-10-16     6
2018-10-17     7
2018-10-18     8
2018-10-19     9
2018-10-20    10
Freq: D, dtype: int64

Resample

Let’s say we want to bucket these days into one week period.

To do that, we’ll use the DataFrame.resample function. The first parameter in this function is a string called rule, which is a representation of how to resample the data. This string representation is made using an offset alias. You can find a list of them here. To create 1 week periods, we’ll set rule to “W”.

pd.DataFrame({
    'days': close,
    'weeks': close.resample('W').first()})
days weeks
2018-10-10 0.0 NaN
2018-10-11 1.0 NaN
2018-10-12 2.0 NaN
2018-10-13 3.0 NaN
2018-10-14 4.0 0.0
2018-10-15 5.0 NaN
2018-10-16 6.0 NaN
2018-10-17 7.0 NaN
2018-10-18 8.0 NaN
2018-10-19 9.0 NaN
2018-10-20 10.0 NaN
2018-10-21 NaN 5.0

OHLC

Now that you’ve seen how Pandas resamples time series data, we can apply this to Open, High, Low, and Close (OHLC). Pandas provides the Resampler.ohlc function will convert any resampling frequency to OHLC data. Let’s get the Weekly OHLC.

close.resample('W').ohlc()
open high low close
2018-10-14 0 4 0 4
2018-10-21 5 10 5 10

Problem

Can you spot a potential problem with that? It has to do with resampling data that has already been resampled.

We’re getting the OHLC from close data. If we want OHLC data from already resampled data, we should resample the first price from the open data, resample the highest price from the high data, etc..

Converting Tick-By-Tick Data with this function is fine.

OHLC 2.0

Implement days_to_weeks function to resample OHLC price data to weekly OHLC price data. You find find more Resampler functions here for calculating high and low prices.

def days_to_weeks(open_prices, high_prices, low_prices, close_prices):
    """Converts daily OHLC prices to weekly OHLC prices.
    
    Parameters
    ----------
    open_prices : DataFrame
        Daily open prices for each ticker and date
    high_prices : DataFrame
        Daily high prices for each ticker and date
    low_prices : DataFrame
        Daily low prices for each ticker and date
    close_prices : DataFrame
        Daily close prices for each ticker and date

    Returns
    -------
    open_prices_weekly : DataFrame
        Weekly open prices for each ticker and date
    high_prices_weekly : DataFrame
        Weekly high prices for each ticker and date
    low_prices_weekly : DataFrame
        Weekly low prices for each ticker and date
    close_prices_weekly : DataFrame
        Weekly close prices for each ticker and date
    """
    
    open_prices_weekly = open_prices.resample('W').first()
    high_prices_weekly = high_prices.resample('w').max()
    low_prices_weekly = low_prices.resample('w').min()
    close_prices_weekly = close_prices.resample('w').last()
    
    return open_prices_weekly, high_prices_weekly, low_prices_weekly, close_prices_weekly