An Overview of Pairs Trading Strategies
By Hudson & Thames
Summary
Topics Covered
- Pairs Trading Classified by Five Approaches
- Distance Beats Cointegration with Simplicity
- Stochastic Control Finds Optimal Weights
- Copulas Capture Hidden Dependencies
- PCA Isolates Idiosyncratic Mean Reversion
Full Transcript
hi everyone i'm really glad to see all of you today on our first day of the pairs trading workshop let us start so first i think many of you know about hudson and thames and
what we do but just as a quick intro for those of you who maybe don't so we are an engineering company and we build implementations of algorithms that are described out there
in the quantitative finance literature and here you can see our three main packages that we present um and they deliver specific
use cases for users those would be arbitrage lab mlfinlab and portfolio lab uh just a brief description here so arbitrage lab is dedicated to statistical arbitrage and
various trading algorithms and many of the things that we are going to be talking about during these two days of the workshop are related specifically to that topic of quantitative finance then we
have the mlfinlab which is centered around machine learning techniques and it is mainly based on the works of professor lopez de prado so two of his widely known books plus some of the
implementations from his papers that weren't included in those books as well as we added some extra modules be those data generation backtest overfitting as well as networks
and maybe some extra ones for you to use and check out plus there is the portfolio lab package and the sole goal of this package is to implement a lot of algorithms dedicated to portfolio
optimization and we have algorithms ranging from standard markowitz optimization and all the way to some complex clustering algorithms be those hrp from lopez de prado
herc from dr raffinot or nco if you are interested in portfolio optimization i encourage you to look at that for mlfinlab and portfolio lab they are available on pypi
the python package index so you can just quickly pip install them plus a part of the documentation is open to start using them for the arbitrage lab it's a paid for package and details on that you can find on
the hudson and thames webpage now to the series of these two days of presentations um today it's going to be rather
a short one maybe closer to 40 to 45 minutes my presentation regarding an overview of pairs trading strategies the main idea behind this was to introduce you to
pairs trading strategies how are they classified what are the differences between them and so that you would know more about the field that we're going to be discussing tomorrow
today there are not going to be any specific formulas or algorithms it's just an overview so to say but i'll also point you to places where you can find
the algorithms themselves as well as the descriptions of them if you would wish tomorrow is the main event and it's going to be three maybe even three and a half hours
long so it starts at 6 p.m london time and ends i think a bit
later than 9 p.m london time and we're going to cover seven distinct presentations from three of our apprentices so the first three presentations are going
to be presented by johan who was working on developing the distance approach and don't worry right now about the different approaches that i'm mentioning in a few slides you'll understand
completely how the pairs trading field works in our understanding um so the first two presentations are entirely dedicated to distance approach
strategies the third presentation uh is dedicated to hedge ratios estimation models um then the next three presentations would be covered by
vijay and those would be mainly on the stochastic control field so part one and part two but he is also covering partner selection using copulas which is from the copula approach and the
final the seventh presentation will be by franz who i believe is right now in uh this webinar and he'll introduce us to kalman filters in time series analysis
which is essentially a part of the time series approach it will be more of a theoretical overview but still very very interesting the plan for tomorrow's presentations is
about 20 to 25 minutes per presentation plus a five minute q&a after each presentation today my presentation will take maybe 40 minutes maybe a bit less
and then we can have an extended 10 to 15 minute q&a not only related to pairs trading maybe you have some questions about arbitrage lab or other packages but we can in general discuss various topics it
would be i think of great interest let's go on a quick note about me my name is illya barziy and i'm a quantitative research team lead at
hudson and thames i have a master's in computer science and econometrics from the university of warsaw here are some links to my github twitter or linkedin and if you either
want to follow us or to check some of the future events please do that as well as we can carry out some of the conversation outside of this
event today to discuss maybe in some more detail some aspects please uh feel free to finally the main part of the presentation an overview of
pairs trading strategies and here i should probably mention that not all the papers that are in the field are obviously highlighted there is a really really broad range of literature
and what i'm covering here are only the core papers and even then only the core papers that we decided to uh implement showcase and discuss so if you would like
to really dig deep into it you'll have to use some tool for example connected papers and just scroll through various papers to find the exact thing that you're looking
for it's a rather long trip but i hope that after this presentation you'll have a much deeper and broader understanding of what's going on out there so this is the general structure of
today's presentation and as you can see it's divided basically by approach i'll go piece by piece and highlight the core idea of the approach as well as main
papers or textbooks contributing to this particular topic in the end you can also find a full list of references for those of you who would like to check the original sources or
just get a better understanding of the specific algorithm that i was describing introduction this presentation is
based on the paper by christopher krauss and this is a great paper for you to get introduced into the topic and the name is statistical arbitrage pairs trading strategies
um the main idea is that it raises the problem that over the past two decades or so there were many pairs trading strategies and
different authors uh call them in different ways they can call them statistical arbitrage or pairs trading but essentially we call it pairs trading in
this particular presentation uh there are too many of them and they should be in one way or the other categorized based on their nature based on the behavior and based on the logic inside and krauss
proposed five distinct approaches to pairs trading that are divided by their logic and he also provided some core papers and descriptions
in a broad form for you to get to know these approaches better what is a pairs trading strategy so as we start this is a general description uh
from a paper by gatev and others from 2006 which is the most widely cited paper in the domain of pairs trading in general and they define the logic behind pairs trading
strategies as first we would pick two assets whose prices move together historically worth noting that assets can be not only individual elements but also a portfolio
of elements then we would monitor the spread between them during some periods and in general if the prices would diverge we would short the winner and buy the loser
rather simple approach but based on how vague it is you can kind of think of hey there are a lot of things that can be done in one or the other way and that is essentially why there are so many papers on this
topic they can also be classified by the nature of spread construction so to say if you would be trading one asset versus the other it would be a univariate strategy
if you would trade one asset versus a bundle of assets so a portfolio that would be a quasi-multivariate setting and finally if you would be trading portfolios against each other that would
be a truly multivariate setting now um how can pairs trading strategies be classified with respect to the nature of the strategy itself
first there would be a distance approach the main idea is that we're using simple non-parametric distance metrics and then some simple
set of rules to generate trading signals an example of this would be taking squared distances of normalized prices as in the paper by gatev
and then applying some simple thresholds of divergence of these normalized prices to generate trading signals rather simple then we have the co-integration approach
and this usually requires stricter co-movement of the assets or portfolios that we're trading um those would require basically
co-integration testing in this or the other form it may be more strict less strict uh but uh it just requires those and those would be the strategies from the co-integration
approach then there is the time series approach and uh authors in this domain in general omit the question of picking pairs whatsoever and expect that the user already knows
which two assets or portfolios he or she would like to trade and it is solely based on trying to somewhat forecast or model
the price and based on the difference between the model price and the actual price we generate a trading signal then there is a stochastic control approach
and the core idea here is that you would identify some optimal trading rules or optimal weights for elements in a pair related to the statistical properties of the
process of prices or the process of spread itself finally what krauss defines as other approaches is a series of relevant
papers in the field however not enough papers or materials to gather them in a distinct way but we extend this a bit and decide for
the sake of this presentation to call them specific separate approaches those would be a machine learning approach based on the name you would assume that yes right it bases
the trading signal generation on some machine learning technique be that neural networks uh support vector machines whatever then we have the copula approach and
here uh the main idea is just to use a mathematical concept of copula that i'll not deeply explain but i'll link you to the sources that can easily explain that to
you so you would be more familiar with this still as this is great and pretty novel and not that much research was actually done on the topic of copula pairs trading and finally we have
the pca and other approaches and here we mainly have the pca approach and also some other minor papers here and there so i hope after this rather quick
introduction you have a general understanding of what's going on out there in the pairs trading field of papers and methods first the distance approach
tomorrow's presentations as i um already mentioned two of them will be dedicated to the distance approach and essentially what i'm outlining here will be described in much more detail
and jovan uh will do a great job in this so i really encourage you to come here tomorrow and check those presentations out but what is worth mentioning here so the first would be
basic distance strategies and adjustments to them as i've mentioned the paper by gatev and others is the most cited paper in the domain of pairs trading in general
and it proposes a really simple algorithm that was in addition robust to data snooping bias uh it was back tested on i believe a data set of
u.s stocks from the 60s to 2002 and it showed abnormally high returns and uh since a whole basket of stocks was picked for the original strategy
as i said it was robust to data snooping and it produced a great discussion in the field and a lot of subsequent papers went out to
um either empirically test the result or determine what is the nature of those returns unfortunately after about 2005 or so based on
works of other researchers uh the profitability of such an approach degraded uh and also if transaction costs would be taken into account this would be even
worse um what's the upside though is that a paper by do and faff released in 2010 both gives a great overview of the original strategy and proposes various adjustments to it
and those adjustments actually show improved um returns and profitability um so yeah in general it gave a new
breath to the standard method more on that topic you can check out tomorrow on uh jovan's presentation then there is a quasi-multivariate extension so
the previous one was the univariate one and right here authors propose a set of strategies that would extend this to a quasi-multivariate setting
um essentially they're using alternative metrics to identify pairs there such as pearson's correlation and
others you'll essentially check this out tomorrow in more detail but the upside here is that it showed more robust results and plus since the papers were released a bit later
they had a broader back testing time frame which showed improved uh performance in comparison to the standard gatev et al approach uh it is
also worth noting that there are a lot of strategies in this domain but they're usually either an empirical testing of such or such variation of the basic
strategy or an adjustment of the distance strategy to some particular market there are papers just dedicated to
setting this strategy to be optimal in a commodities market for example or adjusting it to a high frequency trading setting and we're not adding them to this presentation
but they're described in the original krauss paper so if you'd like to check them out there's the uh source for you now the second approach that we will describe is the
co-integration approach and uh to get deeper information about this i encourage you to check our youtube recordings from ethan
who did a great video on uh minimum profit optimization plus also sparse mean reverting portfolios you would basically understand the concept itself as we're
doing right now in this webinar the core strategy in the um co-integration approach is described in the
book pairs trading by vidyamurthy and it proposes a three-step framework of pre-selection of potentially co-integrated pairs then testing them for tradability and
essentially optimizing the uh target levels to get the uh optimal um expected return from the strategy uh it's also interesting to note here
that even though it's under the co-integration approach per se the original strategy didn't in particular require
co-integration testing and instead opted for a strong mean reverting property which is rather interesting the second
element of the co-integration approach is the minimum profit optimization and these are a set of strategies that would optimize the minimum profit even though
it doesn't sound really exciting like minimum profit but essentially what we are trying to do here is maximize the worst case scenario
in the trading setting so um the first paper i believe is one of the first in this domain and the second one is an adjustment to the original one to briefly describe how it works we
have a testing period or training period however you would call this and we observe a process of spread and then based on how it behaves we can set
optimal levels to enter or exit the strategy and we will try to get the maximum expected minimum value that we can get out of the trade
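To make the idea of choosing entry levels from a training period concrete, here is a deliberately simplified empirical sketch. It is not the analytical method from the papers: it assumes a spread centered at zero, opens a trade when the spread reaches a candidate level `upper`, closes it when the spread returns to zero or below, and scores each candidate by `trades * upper`, which is a lower bound on the total profit per unit of spread. The function names and the candidate grid are hypothetical.

```python
def count_round_trips(spread, upper):
    """Count completed trades: open when spread >= upper, close when spread <= 0."""
    trades, in_trade = 0, False
    for s in spread:
        if not in_trade and s >= upper:
            in_trade = True
        elif in_trade and s <= 0.0:
            in_trade = False
            trades += 1
    return trades

def best_upper_bound(spread, candidates):
    """Return (upper, total) where total = trades * upper is the largest
    lower bound on total profit over the training spread."""
    scored = [(count_round_trips(spread, u) * u, u) for u in candidates]
    total, upper = max(scored)
    return upper, total
```

The score shows the trade-off directly: a higher entry level earns more per trade but is reached less often, so the best level balances profit per trade against the number of trades.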
there was a critique of the original paper and it was described in the previous work by vidyamurthy that the expected general return from
a pairs trading strategy is a function of the number of trades and the so to say performance or the profit per trade and the original
paper was focusing mainly on the profit per trade and not taking into account the general number of trades so the second paper fixes that issue and
uh proposes a more adjusted framework there are a few downsides however to this uh one of the biggest being that there is a specific requirement for the co-integration error to be
a symmetric process plus an ar(1) process but if your specific problem fulfills that then it's a really good element to use the other downside
is a quite high o(n^3) time complexity of the algorithm however i believe that there is a slightly optimized uh i may be wrong here slightly
optimized algorithm which is o(n^2 log n) that we're using in our implementation uh and then the last element in the co-integration approach
is the sparse mean reverting portfolios it's not widely described in the original krauss paper but we found it of great interest and also some practitioners were mentioning it uh as a core element of the pairs
trading strategies what's interesting about this well probably the most interesting thing is that it actually opts against the standard
co-integration testing as a way to determine uh weights for trading assets because when we would for example use
either standard engle-granger or johansen tests to determine uh weights to trade a pair of elements we would get that these are dense portfolios which means that we have
coefficients for every element in the portfolio and if we are rebalancing frequently these are high transaction costs plus if we have coefficients for each of them
it's lower interpretability not that good you can obviously cut out some of the lower coefficients but you're not
specifically sure how to do this so d'aspremont proposes a great approach to this which is essentially taking into account the
correlation matrix of the assets to then uh generate a sparse mean reverting portfolio so a great portfolio that would at the same time be
mean reverting but it would require as few elements being added to it as possible and as i said there is a great video from yifang on this topic uh this
method so this problem can be solved as described in the original paper by two methods either greedy search or a convex relaxation the first is a bit quicker but gives worse results the second
gives better results but requires a bit more time to run and as i said the upsides for this are improved interpretability and lower transaction costs in general
the time series approach it's worth noting at this point that the time series approach is still being kind of under development in our packages and uh
another thing worth noting is that the kalman filter time series approach strategy will be described tomorrow by franz even though in a more theoretical setting it's still worth
understanding it in more detail and even if you would like to use for example kalman filters in some of your strategies as a separate element not as a part of this trading strategy
it's a really good presentation for you to visit next uh essentially yes pairs trading with the kalman filter and the main paper in the domain of
uh pairs trading with time series is pairs trading by elliott and others it's the most cited in the domain and the main idea behind this is we assume that we can describe the
spread as a specific state space model whose parameters we can estimate using the kalman filter and essentially as i said previously in the time series approaches we would
forecast the value of the spread process or in some cases we would be forecasting maybe prices or returns as processes
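As a rough illustration of the state space idea, the sketch below runs a scalar kalman filter over an observed spread. It assumes a linear gaussian model of the form x[k+1] = a + b*x[k] + c*eps for the hidden spread and y[k] = x[k] + d*omega for the observation, with the parameters `a`, `b`, `c`, `d` already known. In practice they have to be estimated from data, which this sketch does not attempt; the function name and the initial values `x0`, `p0` are assumptions for the example.

```python
def kalman_spread_filter(observations, a, b, c, d, x0=0.0, p0=1.0):
    """Scalar kalman filter for
        x[k+1] = a + b * x[k] + c * eps,   y[k] = x[k] + d * omega,
    with eps and omega standard normal.

    Returns (forecasts, filtered): the one-step-ahead forecast made before
    each observation arrives and the updated estimate after it arrives.
    """
    x_hat, p = x0, p0
    forecasts, filtered = [], []
    for y in observations:
        # Predict the hidden spread and its variance one step ahead.
        x_pred = a + b * x_hat
        p_pred = b * b * p + c * c
        forecasts.append(x_pred)
        # Update with the new observation.
        gain = p_pred / (p_pred + d * d)
        x_hat = x_pred + gain * (y - x_pred)
        p = (1.0 - gain) * p_pred
        filtered.append(x_hat)
    return forecasts, filtered
```

The gap between each observation and its forecast is the quantity a trading filter would then look at.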
um and based on this forecast we can later apply filters and these filters would allow us to generate trading signals those filters can be simple threshold filters they can be
something more complex such as adding an ornstein-uhlenbeck model on top of the uh model so there's like quite a lot of things to check out um and i believe uh if you'll be here
until i present the machine learning approach there is a much larger basket of tools for you to use if you're interested in this specific approach so forecasting a process and then
applying a filter to the forecast and the real price to generate a trading signal and here is also a paper sorry it's rather a textbook
by sarmento and horta it's not that widely mentioned in the literature but we decided to add it here as well it's rather versatile and based on the
original descriptions of the author it showed good success in the empirical setting and the idea here is that it consists of two elements so first you would need any forecasting algorithm
you can uh use a kalman filter you can use arima and anything up to neural networks or maybe some more advanced stuff and then you would check what is the difference between the forecasted
value and the real value and based on which quantile this value falls into you would generate a trading signal uh changing the sensitivity in this method
is done by changing essentially the quantile at which you would be uh deciding to generate a trading signal uh now we can move to the stochastic
control approach two of tomorrow's presentations will be dedicated specifically to this approach uh vijay will do them and you'll be able to get a deeper understanding as i
said of the stochastic control field the first element in this approach is the ornstein-uhlenbeck based models
and a general assumption here is that uh our spread follows an ou process and we have some extra elements added to it those could be
expectations of the trader uh maybe volatility of the market and some extra inputs here and there which allow us to essentially construct a
model with specific use case that would give us uh in some cases that could be optimal weights for the legs of the spread sorry the legs of
the spread yes so the first asset and the second asset as well as maybe entry and exit levels um these two elements support different utility functions so based on the
interest of the agent who is doing the trade they can be adjusted it's also interesting that these models actually show that it's not
always optimal to enter a trade and there might be periods where you would be better off staying out of the trade and essentially vijay will cover this in his presentation tomorrow
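Since these models start from the assumption that the spread follows an ou process, a small sketch of that building block may help. It simulates dX = theta * (mu - X) dt + sigma dW with the exact discretization and recovers the parameters from an ar(1) regression of the path on its lag. The function names are hypothetical, the estimator is a plain least squares one, and it assumes the fitted ar(1) coefficient lies strictly between 0 and 1. It says nothing about the optimal weights or utility functions discussed in the papers.

```python
import math
import random

def simulate_ou(theta, mu, sigma, x0, n, dt, seed=0):
    """Simulate n steps of an ou process using its exact transition."""
    rng = random.Random(seed)
    decay = math.exp(-theta * dt)
    step_sd = sigma * math.sqrt((1.0 - decay ** 2) / (2.0 * theta))
    path = [x0]
    for _ in range(n):
        path.append(mu + (path[-1] - mu) * decay + step_sd * rng.gauss(0.0, 1.0))
    return path

def fit_ou(path, dt):
    """Estimate (theta, mu, sigma) from x[t+1] = alpha + beta * x[t] + e.

    Assumes 0 < beta < 1, i.e. the path is actually mean reverting.
    """
    x, y = path[:-1], path[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = my - beta * mx
    resid_var = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    theta = -math.log(beta) / dt          # speed of mean reversion
    mu = alpha / (1.0 - beta)             # long run mean
    sigma = math.sqrt(resid_var * 2.0 * theta / (1.0 - beta ** 2))
    return theta, mu, sigma
```

The long run mean and the volatility are usually recovered well from a few thousand observations, while the speed of mean reversion is noticeably noisier.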
now the next element here would be the optimal convergence and the main paper in this domain is by liu and timmermann you can think of this as an extension to
some of the previous models however much more advanced because it not only looks at the spread but it assumes that both prices move with a specific model in mind that takes many
elements into account and however hard the derivation of these models may be they actually show good results um it's also worth noting that in these models two opportunity
types are taken into account the recurring and non-recurring which means that um theoretically you can have a setting where you can trade only once and get the most out of the trade
or you can trade multiple times and cumulatively get a lot from those trades so there are solutions for both of them um also here are some extra insights from those models
these models show that actually sometimes it may be profitable to hold both assets in a pair either long or short or maybe it's worth holding only one of the assets whereas not holding the other
which is kind of contradictory to the standard pairs trading strategy logic that i described in the initial slides now the optimal mean reversion trading
which is you can say a standalone element because it's also not included in the krauss overview but we believe that's because the works presented here were
essentially published afterwards so after the original overview paper was released and here we are adding two works so first would be a textbook from professor
leung and li and the other would be a paper from professor lipton and lopez de prado in the first textbook the idea is essentially that hey let's assume that
our spread process you can say um follows a particular process and three processes are covered in the original work those
would be the ornstein-uhlenbeck model which is also widely used in different other pairs trading approaches the exponential ornstein-uhlenbeck model plus the cox-ingersoll-ross
model um even though the first one is the most widely used it proposes optimal levels for this strategy meaning that you provide it with some uh training period
of this spread and you can also add there for example uh stop-loss ratios that you prefer or maybe uh trading constraints or transaction costs that you're
incurring during the trades and you would essentially get as simple as just the output of hey here is the level for you to enter and here is the level for you to exit taking into
account your inputs which would give you the optimal output and the optimal profitability for that trade and it's also worth
noting that this can be adjusted to both single opportunity and multiple opportunity settings as was in the previous uh slide so you can either decide that hey
i want to get the maximum out of one trade or i would preferably trade multiple times and get the cumulative maximum from that the second paper by professor lipton and
lopez de prado takes a bit different approach but still takes into account that our spread may follow an ornstein-uhlenbeck process
and by solving it from a bit different perspective also gives you the optimal levels for you to use in the trading strategy and that's when we're finished with this
stochastic control approach and moving on to machine learning approach maybe a lot of information at the same time be sure i'll answer any of your questions afterwards this is just for you to be informed in general in the
field of pairs trading as i said in the beginning so the machine learning approach here as i said essentially any method using machine learning
would apply but in core it's really similar to um the time series approach as well you can see this from the second slide that i'll be presenting in this topic
if you would like to check out more detailed description of these methods there is a video recording from aaron who is right now on this webinar that you can check out it's on our youtube channel
to get a deeper understanding into machine learning and pair selection methods so first would be uh here machine learning and pair selection and we decided to showcase this as a separate method even though
it is also not covered in the original krauss paper because it provides a really great framework for people who are searching for pairs to trade in their strategies
because essentially if you approach this problem you can either go with a distance approach so some simple non-parametric uh tools or you can use co-integration
strict or like strict in the sense of engle-granger or johansen tests plus maybe less strict as proposed by vidyamurthy in his work you can maybe for the copula approach that we
will cover use some um statistics such as spearman's rho or kendall's tau to determine pairs to trade there are multiple ways but here is a general really good
algorithm and framework for you to use um which addresses essentially the problem of hey how would you find good pairs to trade and it consists of three steps so first would be
a dimensionality reduction that would allow us to get a compact representation of the assets then there would be an unsupervised learning clustering where we would cluster our elements in the
space and in the very end we would get some pairs using our pair selection criteria of which there are multiple out there starting from uh strict co-integration testing
checking the hurst exponent checking how often the spread actually reverts to zero so how often will possibly a trade be performed uh we encourage you to
check our implementation of this paper and to check the original work in general uh the second and here are rather a lot of papers but i assure you
there are many more papers in this domain um they are written by dunis and also other researchers i highlighted here three that are probably
the most important and the most cited in reality i think there are more than six of them being of a maybe similar nature but
uh of a really similar concept so to trade some specific spreads those could be gasoline as you can see soybean corn ethanol etc
there is a general so to say framework to use that would consist of first modeling this spread using some machine learning elements in this case
these are multiple neural networks and there are like a lot of them that can be used in this approach and the second element here would be applying a filter which can be a
symmetrical or asymmetrical filter with different variations that would take the expected so the forecasted value plus the actual value and based on this would generate a trading signal for you
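The filtering step can be illustrated independently of how the forecast is produced. The sketch below takes the current spread values and some model's next-step forecasts (from a neural network or anything else) and applies a threshold filter. The function name, the band parameters and the signal encoding are assumptions for the example, not the exact filters used in the papers.

```python
def threshold_filter(actual, forecast_next, upper, lower=None):
    """Turn forecasts into signals: +1 buy the spread, -1 sell it, 0 stay out.

    actual[i] is the spread observed now and forecast_next[i] is the model's
    forecast for the next step. A trade is signalled only if the expected
    change exceeds the band. With lower=None the filter is symmetrical;
    passing a different lower band makes it asymmetrical.
    """
    if lower is None:
        lower = upper
    signals = []
    for current, predicted in zip(actual, forecast_next):
        expected_change = predicted - current
        if expected_change > upper:
            signals.append(1)
        elif expected_change < -lower:
            signals.append(-1)
        else:
            signals.append(0)
    return signals
```

For example, `threshold_filter([1.0, 1.0, 1.0], [1.5, 0.4, 1.1], upper=0.3)` signals a buy, a sell and no trade, while widening only the lower band to 0.7 suppresses the sell.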
um again encourage you to check uh aaron's presentation on youtube in this regard plus uh this particular element takes into account
the threshold autoregressive model that adjusts the trading process then we have a copula approach
i remember that i said that i'll explain this but there is generally no in my understanding better way to understand copulas than to read one of our blog posts
the introduction to basic copulas for pairs trading and introduction to advanced copulas for pairs trading you can find them on our hudson and thames webpage as well as on youtube they're
great materials by hanson who recorded them and even though they might feel complex it's a great topic to discover because based on some of the latest papers this is a
great tool to use in pairs trading scenarios because it shows more robust results with abnormal excess returns
in some trading settings i'll elaborate on this and give you a reference paper in a bit so right here we have what you can call maybe
basic copula strategies those would be papers by liew and wu stander and others and also an adjustment uh using the mispricing
index by xie and others uh what is interesting here is that okay so essentially a copula is a form of representing the relation between two
assets and they can follow some mathematical statistical relations we have
two elements here one would be a testing sorry not testing but um training periods where we would get the observations and then
see which copula fits our observations the most so that we can kind of see the relationship between the elements in a more strict format and not just the empirical observation and then during the trading period
itself based on where we see uh the prices in this probabilistic space we would generate a trading signal for example this may show us that hey if this particular situation
of different returns or different prices occurred there is a high chance that uh one asset is underpriced and the other is overpriced and this is extended uh using the mispricing
index in the last paper here by xie which adds a general higher level layer on top of the basic copula strategy
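The basic copula signal can be sketched as follows. As a simplifying assumption the example fixes a gaussian copula instead of selecting the best fitting family on the training period, and it uses conditional probability thresholds of 0.05 and 0.95, which are illustrative choices. All function names are hypothetical, and the mispricing index extension is not covered.

```python
import math
from statistics import NormalDist

def pseudo_observations(values):
    """Map a sample to (0, 1) through its ranks: rank / (n + 1)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    u = [0.0] * len(values)
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (len(values) + 1)
    return u

def gaussian_copula_rho(u, v):
    """Correlation of the normal scores, used as the gaussian copula parameter."""
    nd = NormalDist()
    zu = [nd.inv_cdf(x) for x in u]
    zv = [nd.inv_cdf(x) for x in v]
    n = len(zu)
    mu_u, mu_v = sum(zu) / n, sum(zv) / n
    cov = sum((a - mu_u) * (b - mu_v) for a, b in zip(zu, zv))
    var_u = sum((a - mu_u) ** 2 for a in zu)
    var_v = sum((b - mu_v) ** 2 for b in zv)
    return cov / math.sqrt(var_u * var_v)

def conditional_prob(u, v, rho):
    """P(U <= u | V = v) under a gaussian copula with |rho| < 1."""
    nd = NormalDist()
    z = (nd.inv_cdf(u) - rho * nd.inv_cdf(v)) / math.sqrt(1.0 - rho * rho)
    return nd.cdf(z)

def copula_signal(u, v, rho, low=0.05, high=0.95):
    """+1: first asset looks underpriced (long first, short second),
    -1: first asset looks overpriced, 0: no trade."""
    p_u = conditional_prob(u, v, rho)
    p_v = conditional_prob(v, u, rho)
    if p_u < low and p_v > high:
        return 1
    if p_u > high and p_v < low:
        return -1
    return 0
```

In the training period the returns of both assets would be passed through `pseudo_observations` and `gaussian_copula_rho`; in the trading period each new pair of returns is mapped to `(u, v)` with the training distribution and fed to `copula_signal`.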
we can also have a look at vine copula pair selection and trading strategies and vine copula is an extension to the original idea of copulas
both of these so the pair selection and the trading strategies are not just mentioned but basically described in detail in this paper by
uh stübinger and others and you can actually see that one of the authors of this paper is krauss who is the author of the original overview uh that's why it's not included in the original overview because it was
released uh i believe two years later uh so first to the pair selection element this is a great tool again for selecting
pairs for your trading strategies uh the interesting element here is that for each asset that you're selecting a pair for it will generate you three partners and uh
there will be a dedicated presentation for this tomorrow vijay will cover all the details as well as show you some of the examples of how this is used and what the results are
i encourage you to join us tomorrow um and vine copula in general is an improvement over the standard copula strategy and what the authors claim in this paper is
that over a rather long backtesting period they were able to get really robust good trading results from these strategies there are also blog posts on this
topic on our hudson and thames web page in the research part of the web page if you would like to get more into detail but i'll tell you beforehand that
it's a rather complex topic to check out but please check out the standard copula strategies because they're a great tool to apply
to your strategies and to your problems finally the last element of today's presentation will be the pca and other approaches it's rather short because essentially here there will only be the pca approach as
we decided that some of the other elements are not worth mentioning here if you would like to check out a video dedicated to the pca approach there is a
recording on youtube that i think we made a few months ago and there i go into detail on how the pca approach works what results you
should expect and to which problems you can apply the pca approach
so yeah the pca strategy it's written by avellaneda and lee a rather great work the main idea there is
that you can take a set of assets and you can decompose their returns into systematic and idiosyncratic components
and the way you can achieve this split is to regress the returns of the assets on the returns of
some market or sector representative and this can be either a pca eigenportfolio where you would essentially take an eigenvector
of the correlation matrix of your assets and you would adjust it slightly by the volatility of each asset and you would get a set of weights to create an eigenportfolio which is
essentially a natural portfolio split of these assets or you can just use sector etfs so for example if you are looking at u.s
stocks you may take sector etfs for i don't know technology industrials etc and then regress your stock returns on sector etf returns and you would
essentially get the same idea of splitting returns into the systematic and idiosyncratic components and then based on the behavior of that idiosyncratic component that we're
able to model and we apply the ornstein-uhlenbeck process there we're able to generate trading signals and this here essentially is a good example of a quasi multivariate approach
because in the pca setting not the etf setting we would be trading one asset versus a whole portfolio of other assets from the same set there are also various model adjustments in this domain
for example if we have the ornstein-uhlenbeck process we may fix the drift or we may not fix the drift we may adjust the returns that we have from the
original assets by volatility because that would sometimes filter out the kind of false positive
activations of the strategy it's a great paper for you to check out or check some of our materials on the topic and yep unfortunately that's it here
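To make the pca approach described above more concrete, here is a minimal sketch of its three steps: building an eigenportfolio from the correlation matrix, regressing a stock on it to get the idiosyncratic residual, and fitting a discrete ornstein-uhlenbeck (AR(1)) model to the cumulative residual. The function names are hypothetical, the drift and volatility adjustments mentioned in the talk are left out, and the standardized distance from the mean at the end is one common way to turn the fitted process into a signal, assumed here for illustration.

```python
import numpy as np


def eigenportfolio_returns(returns, n_factors=1):
    """Returns of the top eigenportfolios for a (T, N) matrix of asset returns."""
    std = returns.std(axis=0, ddof=1)
    corr = np.corrcoef(returns, rowvar=False)
    _, eigvecs = np.linalg.eigh(corr)        # eigenvalues come in ascending order
    top = eigvecs[:, ::-1][:, :n_factors]    # keep the largest ones
    weights = top / std[:, None]             # scale each weight by asset volatility
    return returns @ weights                 # shape (T, n_factors)


def idiosyncratic_residuals(stock_returns, factor_returns):
    """OLS residuals of one stock's returns regressed on the factor returns."""
    design = np.column_stack([np.ones(len(factor_returns)), factor_returns])
    beta, *_ = np.linalg.lstsq(design, stock_returns, rcond=None)
    return stock_returns - design @ beta


def ou_s_score(residuals):
    """Standardized distance of the cumulative residual from its fitted mean.

    Fits x[t+1] = a + b * x[t] + noise; returns nan when the fit is not
    mean reverting (b outside (0, 1)).
    """
    x = np.cumsum(residuals)
    b, a = np.polyfit(x[:-1], x[1:], 1)      # slope first, then intercept
    if not 0 < b < 1:
        return float("nan")
    noise = x[1:] - (a + b * x[:-1])
    mean = a / (1 - b)
    sigma_eq = np.sqrt(noise.var() / (1 - b ** 2))
    return float((x[-1] - mean) / sigma_eq)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    market = rng.normal(0, 0.01, size=252)
    # Five synthetic stocks driven by one common factor plus own noise.
    rets = market[:, None] * rng.uniform(0.5, 1.5, size=5) \
        + rng.normal(0, 0.005, size=(252, 5))
    factors = eigenportfolio_returns(rets)
    resid = idiosyncratic_residuals(rets[:, 0], factors)
    print(ou_s_score(resid))
```

A large positive score would suggest the stock is rich relative to the eigenportfolio and a large negative one that it is cheap. The entry and exit thresholds are a separate choice that this sketch does not make.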
there is also a long set of references as you can see about four pages i have here
and some of them have links to the original papers so you can click and see them also i just realized that i forgot to post the link to this presentation in
the chat so right after i'm done while i'm reading your questions and discussing them you'll be able to see that link if you'd like to save this presentation and go over it
afterwards you'll have the ability to do so and i think that's it thank you so much for joining the first day of our pairs trading
workshop please come tomorrow as there's a lot to come follow me and follow hudson and thames if you would like to see more of this content
and now i would gladly answer any of your questions regarding the strategies maybe regarding some of the products or just discuss various topics here and there thank
you so much for attending and i'm waiting for your questions