Quant Finance with Python and Pandas | 50 Concepts you NEED to Know in 9 Minutes | [Getting Started]
By Daniel Boctor
Summary
Topics Covered
- Use Total Returns for True Performance
- 1 + r Format Enables Compounding
- Variance Drag Crushes Simple Averaging
- Annualize Returns via Compound Growth
- Drawdowns Track Peak-to-Trough Losses
Full Transcript
Welcome to Computational Finance 101. Over the next nine minutes, you're going to gain a foundation from the ground up in data science techniques, risk and return modeling, data visualization, and more. Whether it's your first time ever dealing with financial concepts inside of a programming language or you just want to brush up on the basics, you're in the right spot. Let's get started.

First, we're going to import the most widely used numerical library in Python, NumPy. We're also going to import a library built on NumPy called pandas, which we'll use to hold our data, as well as matplotlib, which we'll use for data visualization. To import data into pandas, we can use the read_csv function and hand it either a path to a local file or a URL to a remotely hosted resource. In this video, we're going to use a Python library called yfinance to fetch our data, which will put it into a pandas data structure for us. Let's go ahead and download monthly stock prices for the ETF SPY. Adjusted close gives us closing prices that are inclusive of cash flows such as dividends, so we aren't leaving any data out. This way, returns will be calculated as total returns as opposed to price returns.

There are two different pandas data structures that we can use to store data: Series and DataFrames. A Series is one-dimensional and can be used for data from a single stock, while a DataFrame is two-dimensional and can be used to store data from multiple stocks. In this video, we're going to stick with DataFrames.
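As a rough illustration of the Series versus DataFrame distinction, here is a minimal sketch. It assumes made-up month-end prices for two hypothetical tickers, "AAA" and "BBB", in place of the yfinance download used in the video, so that it runs offline.

```python
import numpy as np
import pandas as pd

# Hypothetical month-end prices; "AAA" and "BBB" are made-up tickers.
dates = pd.to_datetime(
    ["2020-01-31", "2020-02-29", "2020-03-31",
     "2020-04-30", "2020-05-31", "2020-06-30"]
)
prices = pd.DataFrame(
    {
        "AAA": [100.0, 102.0, 99.0, 105.0, 110.0, 108.0],
        # BBB has a later inception date, so its first rows are missing.
        "BBB": [np.nan, np.nan, 50.0, 52.0, 51.0, 55.0],
    },
    index=dates,
)

single = prices["AAA"]          # single brackets -> Series (one-dimensional)
both = prices[["AAA", "BBB"]]   # double brackets -> DataFrame (two-dimensional)

print(type(single).__name__, single.ndim)
print(type(both).__name__, both.ndim)
```

The missing values in the second column mimic two securities with different inception dates.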
In cases where we have NA values, like we do here because the two different stocks have different inception dates, we can remove all rows containing an NA by calling the dropna method. When applicable, the inplace parameter can be used to perform the operation on the existing DataFrame as opposed to returning a new one. What we have now is called a price series, because we have prices at consecutive time steps.

Let's take a closer look at our DataFrame. This date column is special and is known as our index. We can change the format of our index to monthly, to match the granularity of the data, with the to_period method. We can now plot our prices with matplotlib by calling .plot on our DataFrame. This doesn't actually help us perform any type of comparison, as our prices start at two different arbitrary points; this is just raw ETF data at the moment. We'll see how to correct this shortly.

When it comes to calculating returns from these prices, let's go over some basic math first. The most common return formula, which you've probably been taught, takes the difference between the final price and the initial price of an asset and scales it by the initial price. This gives you a single-period return expressed as a decimal, which you can then convert into a percentage. When it comes to computational finance, it's helpful to slightly modify the standard formula. If we divide the final price by the initial price, we are left with the exact same number but larger by one. If we then subtract one, we have the exact same result. This is called the 1 + r format, and it's going to be a recurring theme.

All of that was for a single-period return. To calculate a multi-period return, we can add one to each single-period return, multiply them all together, and then subtract one. This is known as a compound or geometric return. It's important to note that in order to calculate the compound return, which is the real return, you cannot just add the returns together, due to something called variance drag. Essentially, if you invest $100 into an asset and earn 30% in time step one, that's 30% of $100. If you then lose 30% in time step two, that's 30% of $130, which is $39, leaving you with $91 rather than the $100 you started with. The process outlined here is called geometric linking.

If you want to turn this price series into a return series, you can take a little shortcut and simply call the pct_change method rather than using the previous formula. When converting from prices to returns, though, you will always lose a single data point, as a return cannot be computed for the first period because no previous closing price is available. We can then use the dropna method to clean this up, and we now have a return series. If you want to see how our returns look, we can once again plot this with .plot. If we now want to compound the entire return series, we can add one to all of our single-period returns, call .prod, which calculates the product of all of them, and then subtract one. If you want to view this more clearly, we can round it and apply the .astype method to change the numbers to strings. These are now our in-sample compound returns.

Coming back to some more DataFrame methods, we can view the first five, or n, elements with .head; the same applies to .tail. If you want to view the total number of elements in the DataFrame, we can use .size, and if you want to view the size of each individual axis, we can use .shape. We also have .index to view our dates and .columns to view our stocks. We can index into a single stock with single brackets and get a Series, or any number of stocks with double brackets to keep it as a DataFrame. We can also index across rows with .loc, given an index label, or .iloc, for integer location, given the position of an index. Both of these methods also work with the colon operator for slicing if you want to select a range.

Turning to the risk side of things now, let's calculate the volatility, measured by standard deviation. We know that standard deviation is just the square root of variance. Once again, we can shorthand the underlying mathematics by using the .std method. Now that we know how to calculate returns and volatility, let's move on to annualizing both of these metrics, as this is what we actually use to analyze performance.
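The steps from prices to returns, compounding, and volatility can be sketched as follows. This is a minimal example under stated assumptions: the prices and the tickers "AAA" and "BBB" are made up, and the variable names are illustrative rather than taken from the video.

```python
import numpy as np
import pandas as pd

# Hypothetical month-end prices; BBB starts one month later than AAA.
dates = pd.to_datetime(
    ["2020-01-31", "2020-02-29", "2020-03-31", "2020-04-30", "2020-05-31"]
)
prices = pd.DataFrame(
    {
        "AAA": [100.0, 130.0, 91.0, 100.1, 110.11],
        "BBB": [np.nan, 50.0, 55.0, 49.5, 54.45],
    },
    index=dates,
)

prices = prices.dropna()                     # drop rows that contain an NA
prices.index = prices.index.to_period("M")   # monthly index to match the data

# Price series -> return series; the first row has no prior price, so drop it.
returns = prices.pct_change().dropna()

# Compound (geometric) return: add one, multiply, subtract one.
compound = (1 + returns).prod() - 1

# Variance drag: +30% followed by -30% does not net out to zero.
two_step = pd.Series([0.30, -0.30])
added = two_step.sum()                  # 0.0 if returns are naively added
linked = (1 + two_step).prod() - 1      # roughly -0.09 when geometrically linked

# Per-period volatility (sample standard deviation).
vol = returns.std()

print(returns.shape)
print(compound.round(4).astype(str))
print(added, linked)
```

The compound return computed this way matches the last price divided by the first price, minus one, which is a quick sanity check on the linking.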
Let's start by calculating annualized return. We know that if we had a single monthly return, we could annualize it by merely compounding it over 12 periods, as this would be equivalent to a compound return over a 12-month period. In reality, though, we have a return series with hundreds or thousands of different per-period returns. What we can do is work our way backward. Given a single monthly return, we would add one to it, raise it to the 12th power, and then subtract one.
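As a quick worked example of annualizing a single monthly return, assuming a hypothetical 1% monthly figure:

```python
# Hypothetical single monthly return of 1%.
monthly_return = 0.01

# Add one, raise to the 12th power, subtract one.
annualized = (1 + monthly_return) ** 12 - 1

print(f"{annualized:.4f}")  # 0.1268
```

The result is a little more than 12%, because each month's gain compounds on the previous months.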
If you want to work backward and calculate this monthly return, we need to raise our entire sample's compound growth to one over the number of periods and then subtract one, where the compound growth is the product of the 1 + r values as we calculated it before, and the number of periods can be determined from the shape of the return series. We can actually do some shorthand here and raise the compound growth to the periods per year over the number of periods. Let's put this into a function where periods per year is a parameter.
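One way such a function might look is sketched below. The name `annualize_rets` and its signature are assumptions for illustration, not necessarily what the video uses.

```python
import pandas as pd


def annualize_rets(returns, periods_per_year):
    """Annualize per-period returns via compound growth (hypothetical helper)."""
    compound_growth = (1 + returns).prod()   # growth of 1 unit over the sample
    n_periods = returns.shape[0]             # number of per-period returns
    return compound_growth ** (periods_per_year / n_periods) - 1


# Two years of a constant, made-up 1% monthly return.
monthly = pd.Series([0.01] * 24)
print(annualize_rets(monthly, periods_per_year=12))
```

With a constant monthly return, the result should agree with annualizing that single return directly, which makes this an easy case to check by hand.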
Annualized volatility, on the other hand, is going to be much easier, as we can just multiply our volatility by the square root of the periods per year. Let's put this into its own function as well. While we're at it, we can calculate the raw Sharpe ratio, which is just a risk-adjusted return measure. To calculate the raw Sharpe ratio, we can just divide our annualized returns by our annualized volatility. Calculating the actual Sharpe ratio would require a time series of T-bill returns, also known as the risk-free rate, which will be covered in an upcoming video.

If you now want to use a return series to create a wealth index, we can add one to all of our returns, but rather than calling .prod, we call .cumprod, which calculates the cumulative product, which is just the running product at each successive time step. We can now see an accurate comparison between securities. This also represents the growth of one dollar. Something to note is that the wealth index doesn't actually start at one, because the first data point already reflects the first period's return. If you want to insert a static one as the first data point in our wealth index, we can take our start date with .index.min, step back to the previous month with pd.DateOffset, and then prepend this to the DataFrame with pd.concat.

For our last topic, let's calculate and plot our drawdowns. Drawdowns are just the return from the previous peak to the current price. We're going to use the wealth index to help us calculate drawdowns, in conjunction with the previous peaks, which can be calculated by calling .cummax on the wealth index. cummax is just a cumulative maximum, which sets each value equal to the highest data point on or before each respective time step. To arrive at our drawdowns, we take the difference between our wealth index and our previous peaks and scale it by our previous peaks. A key stat that is often calculated is the maximum drawdown, which we can calculate by calling .min, since our numbers are negative. Lastly, we can call .idxmin, or index minimum, to get the date associated with the maximum drawdown. If you want to plot all of this together, we can add a matplotlib annotation where we format the max drawdown, give it coordinates, position it, and add an arrow. This turns all of this code, which might be hard to wrap your head around, into a nice data visualization.

If you enjoyed this, I might make this video the first in a series, which would include a wide range of topics spanning from stochastic modeling and portfolio insurance to asset pricing and factor regressions. If you liked the video, consider leaving a thumbs up as well as any additional feedback or ways it can improve. Thanks for watching.
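For reference, the annualized volatility, raw Sharpe ratio, wealth index, and drawdown steps from the last part of the transcript can be sketched as follows. The return values, the ticker "AAA", and the helper names `annualize_rets` and `annualize_vol` are assumptions for illustration; plotting and the prepended starting value of one are left out.

```python
import numpy as np
import pandas as pd

# Hypothetical monthly returns for a made-up ticker.
returns = pd.Series(
    [0.10, -0.20, 0.05, 0.15, -0.10, 0.08],
    index=pd.period_range("2020-01", periods=6, freq="M"),
    name="AAA",
)


def annualize_rets(returns, periods_per_year):
    """Annualized return via compound growth (hypothetical helper)."""
    return (1 + returns).prod() ** (periods_per_year / returns.shape[0]) - 1


def annualize_vol(returns, periods_per_year):
    """Annualized volatility: per-period std times sqrt(periods per year)."""
    return returns.std() * np.sqrt(periods_per_year)


# Raw Sharpe ratio: no risk-free rate is subtracted here.
raw_sharpe = annualize_rets(returns, 12) / annualize_vol(returns, 12)

# Wealth index: running product of (1 + r), i.e. the growth of one dollar.
wealth_index = (1 + returns).cumprod()

# Previous peaks: highest wealth value on or before each time step.
previous_peaks = wealth_index.cummax()

# Drawdown: return from the previous peak to the current value (<= 0).
drawdowns = (wealth_index - previous_peaks) / previous_peaks

max_drawdown = drawdowns.min()          # most negative drawdown
max_drawdown_date = drawdowns.idxmin()  # period in which it occurred

print(f"{max_drawdown:.2%} in {max_drawdown_date}")
```

In this made-up series the first month sets the peak, so the 20% loss in the second month is the maximum drawdown.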