Magic Formula (Part II)



Quantopian is a resource for financial research, offering access to a broad range of financial data as well as a platform set up to backtest that data. In exploring the Magic Formula as a strategy, we can take advantage of Quantopian's research framework that allows for quick scripting in a Jupyter Notebook.

A significant limitation to consider is the inability to work locally with Quantopian. To prevent abuse and mass downloads of their valuable data sources, users are limited to the online environment for research, which can get quite cumbersome. Furthermore, some data sets, like FactSet's Fundamentals data, have a built-in holdout period that prevents access to the most recent (and relevant) data needed for research.

We replicated the formula as closely as possible, but were only able to match a subset of securities when compared with the Magic Formula stocks listed on Greenblatt's own website.

Get the Book: The Little Book that Beats the Market

Parts:

One , Two , Three , Four , Five


Replicating The Magic Formula Using Python in Quantopian


In Part I of this series, we introduced the Magic Formula and discussed its origins with legendary value investor Joel Greenblatt, founder and manager of Gotham Asset Management LLC, which he started in 1985 (as Gotham Capital) as a fund with a deep value philosophy. His book, The Little Book that Beats the Market, introduced his audience to a strategy for stock selection that focuses on companies that are both “cheap” and “good”. In Part II of this strategy discussion, we take a deep dive into replicating the two-factor screen, and compare our results to those demonstrated by Greenblatt.


Data


Our first consideration is a data source for company fundamentals. There are a number of data providers that can satisfy this need. An earlier version of this post used a license from Sharadar Core US Equities Bundle found on the financial data site Quandl. As of this publication, the price of this data package starts at $70 a month.

Since the original post, we have pivoted to the data provided by Quantopian. Quantopian offers comprehensive fundamentals coverage through its licenses with both Morningstar and FactSet, and we found these sources to be more robust and easier to work with than the Sharadar data, as well as free. There are, however, some downsides to using Quantopian, the biggest of which is that the data can only be accessed within its research environment, a restriction meant to prevent abuse and mass downloads of valuable data sources. This can limit the complexity of the strategies, but should work just fine for factor modeling.


Getting Started


Quantopian offers a research environment where we can test ideas out interactively through a Jupyter Notebook. From there, one can implement and backtest the idea through the IDE provided. The bulk of the work will be done using the pandas library in Python, in conjunction with a number of Quantopian-specific modules. As with anything else, we get started by importing the needed libraries. For the sake of this demonstration, I assume the audience is somewhat familiar with Python and pandas, or has the requisite knowledge to find answers online via resources like StackOverflow. To import the general modules, we start with:


import pandas as pd
import numpy as np
                

And then the Quantopian-specific modules we need:


from quantopian.pipeline import CustomFactor, Pipeline
from quantopian.pipeline.data.factset import Fundamentals as FFundamentals
from quantopian.pipeline.filters import default_us_equity_universe_mask
from quantopian.pipeline.data.morningstar import Fundamentals
from quantopian.pipeline.classifiers.morningstar import Sector
from quantopian.pipeline.domain import US_EQUITIES
from quantopian.pipeline.data.builtin import USEquityPricing
from quantopian.research import run_pipeline
                

In handling large datasets, especially as it relates to screening, Quantopian uses something called the Pipeline API to facilitate fast computation. Specifically, Pipeline makes it easier to compute a value for each asset at a point in time (Factor), reduce a set of assets to a subset for further computation based on some evaluation (Filter), and group assets by properties that may facilitate your desired screen (Classifier). Combined, factors, filters, and classifiers represent a set of functions that will get us a list of Magic Formula stocks at a given point in time. We'll do this by creating a make_pipeline() function.
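
To make the three concepts concrete, here is a minimal plain-Python sketch that does not use the Pipeline API at all. The tickers, market caps, and sector labels are made up for illustration; the only thing taken from this post is the idea that a factor is a number per asset, a classifier is a label per asset, and a filter is a True/False value per asset.

```python
# Toy cross-section for a single day; every value here is hypothetical.
stocks = {
    "AAA": {"market_cap": 5.0e8, "sector": "Technology"},
    "BBB": {"market_cap": 1.0e8, "sector": "Financial"},
    "CCC": {"market_cap": 9.0e8, "sector": "Utilities"},
    "DDD": {"market_cap": 3.0e8, "sector": "Energy"},
}

# Factor: one numeric value per asset.
market_cap = {sym: row["market_cap"] for sym, row in stocks.items()}

# Classifier: one category label per asset.
sector = {sym: row["sector"] for sym, row in stocks.items()}

# Filters: one True/False value per asset, derived from a factor or a classifier.
big_enough = {sym: cap > 200_000_000 for sym, cap in market_cap.items()}
allowed_sector = {
    sym: label not in {"Financial", "Utilities"} for sym, label in sector.items()
}

# Filters are combined with a logical AND to define the screened universe.
universe = sorted(
    sym for sym in stocks if big_enough[sym] and allowed_sector[sym]
)
print(universe)
```

The Pipeline versions of these objects are evaluated lazily for every asset and date at once, but the roles they play in the screen are the same.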

Step One


The first step in the Magic Formula dictates that we establish a minimum market capitalization (usually greater than $50 million). For this screen we use a stricter floor of $200 million. We will use Morningstar's calculation of market cap through Fundamentals.market_cap, and specify that we want the most recent value as of each date by appending the latest attribute to the end of the object.


market_cap_filter = Fundamentals.market_cap.latest > 200000000
                

Step Two


Next, we make use of Morningstar's Sector Classification to exclude Financial and Utility companies.


sector = Sector()
sector_filter = sector.element_of([
    101, #Materials
    102, #Consumer Discretionary
    #103, #Financial
    104, #Real Estate
    205, #Consumer Staples
    206, #Healthcare
    #207, #Utilities
    308, #Telecoms
    309, #Energy
    310, #Industrials
    311, #Technology
])
                

Step Three


Step Three of the magic formula requires that we exclude ADRs, for similar reasons as Step Two in that we cannot be sure of the capital structure of international companies and may not be comparing apples to apples when determining things like EBIT or Rates of Capitalization. To do this, we leverage a pre-screened list of securities in Quantopian called default_us_equity_universe_mask(). This base filter requires that the security be the primary share class for its company. In addition, the security must not be an ADR or an OTC traded symbol. The security must not be for a limited partnership, and it must have a non-zero volume and price for the previous trading day. The mask also has a parameter to pass in a value for minimum market cap, which we'll again set to 200MM.


tradable_filter = default_us_equity_universe_mask(minimum_market_cap=200000000)
                

We then combine the filters to retrieve our stock universe.


universe = market_cap_filter & sector_filter & tradable_filter
                

Step Four


The fourth step of the Magic Formula is determining earnings yield, defined as EBIT/EV. Greenblatt uses a trailing twelve month EBIT, which differs from Morningstar's default fundamental EBIT value that only looks at the most recent quarter. One way to bridge the gap is to write a windowing function that aggregates the quarterly values for the past four quarters. Alternatively, Quantopian's FactSet data provides many fields on a trailing twelve month basis, so we can switch to it for this part of our screen. Note: FactSet data on Quantopian has a one-year holdout period, so your backtest can only run up to one year before the present day. For example, if you are working on this script on October 1st, 2020, then your backtest can only run up until October 1st, 2019. This is not ideal, and it makes the Morningstar fundamentals a much more attractive solution if your intention is to find a list of suitable symbols for investment.


ebit_ltm = FFundamentals.ebit_oper_ltm.latest
ev = Fundamentals.enterprise_value.latest
earnings_yield = ebit_ltm/ev
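
The windowing alternative mentioned above can be sketched in plain Python. This is only an illustration of the arithmetic, not Pipeline code: the function name trailing_twelve_months, the quarterly EBIT figures, and the enterprise value are all hypothetical.

```python
def trailing_twelve_months(quarterly_values):
    """Return the rolling sum of the latest four quarters (oldest first).

    Entries are None until four quarters of history are available.
    """
    ttm = []
    for i in range(len(quarterly_values)):
        if i < 3:
            ttm.append(None)  # not enough history yet
        else:
            ttm.append(sum(quarterly_values[i - 3:i + 1]))
    return ttm


# Hypothetical quarterly EBIT, oldest quarter first.
quarterly_ebit = [10.0, 12.0, 8.0, 15.0, 11.0]
ebit_ttm = trailing_twelve_months(quarterly_ebit)
print(ebit_ttm)

# Earnings yield = trailing twelve month EBIT / enterprise value.
enterprise_value = 400.0  # hypothetical
example_earnings_yield = ebit_ttm[-1] / enterprise_value
print(example_earnings_yield)
```

Each new quarter drops the oldest quarter from the sum, which is what keeps the figure comparable across companies with different fiscal calendars.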
                

Step Five


Greenblatt's return on capital differs from a typical ROE or ROIC value. Within the Magic Formula, a company's return on capital is measured as EBIT/ tangible capital employed. In other words, we're trying to find the tangible costs to the business in generating the reported earnings within the period, where tangible capital employed is more precisely defined as Net Working Capital plus Net Fixed Assets.

Net Working Capital is simply total current assets minus current liabilities, with one adjustment to remove short-term interest-bearing debt from current liabilities, and another to remove excess cash. Greenblatt does not offer details on how excess cash should be determined, but it is often calculated as the cash held beyond a percentage of the sales generated within a period. For our simulation, we'll take Max(Total Cash - Sales_LTM * 0.03, 0).

Net Fixed Assets is then added back to Net Working Capital, in the form of Net PPE.


ppe_net = FFundamentals.ppe_net.latest
sales_ltm = FFundamentals.sales_ltm.latest
total_assets = Fundamentals.total_assets.latest
current_liabilities = Fundamentals.current_liabilities.latest
goodwill_and_intangibles = Fundamentals.goodwill_and_other_intangible_assets.latest
#cash = Fundamentals.cash_and_cash_equivalents.latest
cash = Fundamentals.cash.latest
excess_cash = max((cash-(sales_ltm*0.03)),0)
current_notes_payable = Fundamentals.current_notes_payable.latest
net_working_capital = (total_assets - (current_liabilities - current_notes_payable))

roc = ebit_ltm / (net_working_capital + ppe_net - goodwill_and_intangibles - excess_cash)
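
As a standalone arithmetic check, the definition given in the text (Net Working Capital plus Net Fixed Assets, with excess cash removed) can be worked through for a single company in plain Python. The function names and all balance sheet figures below are hypothetical; only the 3% of sales cash requirement comes from this post. Note that Python's built-in max compares plain numbers, so it is applied here to one company at a time.

```python
def excess_cash(cash, sales_ltm, required_pct=0.03):
    """Cash held beyond required_pct of trailing sales, floored at zero."""
    return max(cash - sales_ltm * required_pct, 0.0)


def return_on_capital(ebit_ltm, current_assets, current_liabilities,
                      short_term_debt, cash, sales_ltm, net_ppe):
    """EBIT divided by tangible capital employed, as described in the text."""
    # Remove excess cash from current assets and interest-bearing
    # short-term debt from current liabilities.
    net_working_capital = (
        (current_assets - excess_cash(cash, sales_ltm))
        - (current_liabilities - short_term_debt)
    )
    tangible_capital_employed = net_working_capital + net_ppe
    return ebit_ltm / tangible_capital_employed


# Hypothetical company, all figures in millions.
example_roc = return_on_capital(
    ebit_ltm=50.0,
    current_assets=300.0,
    current_liabilities=150.0,
    short_term_debt=30.0,
    cash=80.0,
    sales_ltm=1000.0,
    net_ppe=200.0,
)
print(round(example_roc, 4))
```

Here excess cash is 80 - 30 = 50, net working capital is 250 - 120 = 130, and tangible capital employed is 130 + 200 = 330, so the return on capital is 50 / 330.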
                

Step Six


Now we need to rank our universe by the highest earnings yield and highest return on capital.


ey_rank = earnings_yield.rank(ascending=True)
roc_rank = roc.rank(ascending=True)
sum_rank = (ey_rank + roc_rank).rank()
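
The ranking logic can be sketched on a toy universe in plain Python. The helper rank_ascending and the four tickers with their factor values are hypothetical; the sketch assumes, as in the snippet above, that an ascending rank gives the smallest value rank 1, so the best stocks end up with the largest combined rank.

```python
def rank_ascending(values):
    """Rank so the smallest value gets 1 and the largest gets len(values)."""
    ordered = sorted(values, key=values.get)
    return {sym: position for position, sym in enumerate(ordered, start=1)}


# Hypothetical factor values for four stocks.
toy_earnings_yield = {"AAA": 0.12, "BBB": 0.05, "CCC": 0.09, "DDD": 0.15}
toy_roc = {"AAA": 0.30, "BBB": 0.10, "CCC": 0.25, "DDD": 0.40}

toy_ey_rank = rank_ascending(toy_earnings_yield)
toy_roc_rank = rank_ascending(toy_roc)

# Sum the two ranks; a higher total means cheaper and better.
combined = {sym: toy_ey_rank[sym] + toy_roc_rank[sym] for sym in toy_earnings_yield}

# Sort best first, breaking ties alphabetically so the order is deterministic.
best_first = sorted(combined, key=lambda sym: (-combined[sym], sym))
print(best_first)
```

Because only the ordering matters, summing ranks keeps one extreme factor value from dominating the combined score.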
                

Putting It All Together


The code for our screen is complete, and we'll need to return a Pipeline with a sorted list of symbols ranked by these two factors.


import pandas as pd
import numpy as np
from quantopian.pipeline import CustomFactor, Pipeline
from quantopian.pipeline.data.factset import Fundamentals as FFundamentals
from quantopian.pipeline.filters import default_us_equity_universe_mask
from quantopian.pipeline.data.morningstar import Fundamentals
from quantopian.pipeline.classifiers.morningstar import Sector
from quantopian.pipeline.domain import US_EQUITIES
from quantopian.pipeline.data.builtin import USEquityPricing
from quantopian.research import run_pipeline

def make_pipeline():
    
    # Step One
    # Limiting to MarketCap Over 200MM
    market_cap_filter = Fundamentals.market_cap.latest > 200000000
    
    # Step Two
    # Filtering out Financials and Utilities
    sector = Sector()
    sector_filter = sector.element_of([
        101, #Materials
        102, #Consumer Discretionary
        #103, #Financial
        104, #Real Estate
        205, #Consumer Staples
        206, #Healthcare
        #207, #Utilities
        308, #Telecoms
        309, #Energy
        310, #Industrials
        311, #Technology
    ]) #& (Fundamentals.morningstar_industry_code.latest != 30910060)
    
    # Step Three
    # Filtering out ADRs
    tradable_filter = default_us_equity_universe_mask(minimum_market_cap=200000000)
    # Combining Filters Into a Screen
    universe = market_cap_filter & sector_filter & tradable_filter

    
    # Step Four
    # Determine Earnings Yield EBIT/EV
    # EBIT TTM
    ebit_ltm = FFundamentals.ebit_oper_ltm.latest
    ev = Fundamentals.enterprise_value.latest
    earnings_yield = ebit_ltm/ev
    
    # Step Five
    # Determine Return on Capital
    ppe_net = FFundamentals.ppe_net.latest
    sales_ltm = FFundamentals.sales_ltm.latest
    total_assets = Fundamentals.total_assets.latest
    current_liabilities = Fundamentals.current_liabilities.latest
    goodwill_and_intangibles = Fundamentals.goodwill_and_other_intangible_assets.latest
    cash = Fundamentals.cash.latest
    excess_cash = max((cash-(sales_ltm*0.03)),0)
    current_notes_payable = Fundamentals.current_notes_payable.latest
    net_working_capital = (total_assets - (current_liabilities - current_notes_payable))
    
    roc = ebit_ltm / (net_working_capital + ppe_net - goodwill_and_intangibles - excess_cash)

    # Step Six
    # Rank Companies
    ey_rank = earnings_yield.rank(ascending=True)
    roc_rank = roc.rank(ascending=True)
    sum_rank = (ey_rank + roc_rank).rank()
    
    return Pipeline(
        columns={
            'symbol': Fundamentals.primary_symbol.latest,
            'market_cap': Fundamentals.market_cap.latest,
            'earnings_yield' : earnings_yield,            
            'roc': roc,
            'ey_rank': ey_rank,
            'roc_rank': roc_rank,
            'sum_rank': sum_rank,
            'Sector'  : Fundamentals.morningstar_sector_code.latest,
        },
        screen = universe,
    )

my_pipe = make_pipeline()
result = run_pipeline(my_pipe,
                     start_date = '2018-06-07',
                     end_date = '2018-06-07')
top30 = result.sort_values(by = 'sum_rank', ascending=False).head(30)
                

Results


After we have our list of highly ranked symbols, our screen is largely complete.

Now is a good time to mention that Joel Greenblatt has a website with a screener specifically for the Magic Formula. The screener can be found at magicformulainvesting.com, and it is very straightforward: you enter a minimum market cap and choose the number of securities the screen should return.

We have some older data collected from Greenblatt's website, and will use one date to compare the official model with the one we have created. The table below lists the results of the official screen next to the Top 30 stocks from our own screen, run in the newly created Jupyter Notebook for the same date:


Official Website Screen    Our Quantopian Screen
ACOR                       EVC
AEIS                       AGX
AGX                        NHTC
BPT                        PINC
CJREF                      PDLI
DLX                        GME
EGOV                       UTHR
ESRX                       FTSI
EVC                        PTN
FTSI                       HPQ
GILD                       AMCX
HPQ                        OMC
IDCC                       ZAGG
IMMR                       PPC
INVA                       INVA
MIK                        MU
MPAA                       SDI
MSB                        KLIC
MSGN                       BIG
NHTC                       REGI
NLS                        LPX
OMC                        UIS
PTN                        FL
SP                         NRZ
SRNE                       AYI
TUP                        IMMR
TVTY                       MTOR
UIS                        IPG
UTHR                       NLS
VIAB                       ALSN

The results of our screen did not completely replicate those produced by magicformulainvesting.com, most likely because our calculation of tangible capital employed differs from Greenblatt's. Nonetheless, 12 of the 30 securities from the official website (40%) made it into our screen, so we feel confident that our screen would produce securities with conceptually similar characteristics to "cheap" and "good" as described in the book.
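
The overlap between the two lists in the table above can be checked with a few lines of plain Python. The tickers are copied from the table; the variable names are only illustrative.

```python
# Tickers from the official website screen, as listed in the table.
official = """
ACOR AEIS AGX BPT CJREF DLX EGOV ESRX EVC FTSI
GILD HPQ IDCC IMMR INVA MIK MPAA MSB MSGN NHTC
NLS OMC PTN SP SRNE TUP TVTY UIS UTHR VIAB
""".split()

# Tickers from our Quantopian screen, as listed in the table.
ours = """
EVC AGX NHTC PINC PDLI GME UTHR FTSI PTN HPQ
AMCX OMC ZAGG PPC INVA MU SDI KLIC BIG REGI
LPX UIS FL NRZ AYI IMMR MTOR IPG NLS ALSN
""".split()

# Securities that appear in both screens.
overlap = sorted(set(official) & set(ours))
print(len(overlap), overlap)
print("share of official list matched:", len(overlap) / len(official))
```

Comparing sets rather than positions ignores the order in which each screen ranks its picks, which is appropriate here because the official screener does not publish its ranking.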