LongCut logo

StatQuest: PCA main ideas in only 5 minutes!!!

By StatQuest with Josh Starmer

Summary

Topics Covered

  • Cells Reveal Differences via Gene Activity
  • Correlations Show Cell Similarities
  • PCA Converts Correlations to 2D Clusters
  • PC1 Captures Most Important Variance

Full Transcript

that quest is the best if you don't think so then we have different opinions hello I'm Josh stommer

and welcome to stat quest today we're gonna be talking about the main ideas behind principle component analysis and we're going to cover those concepts in five minutes if you want more details

than you get here be sure to check out my other PCA video let's say we had some normal cells

if you're not a biologist imagine that these could be people or cars or cities or etc they could be anything even though they look the same we suspect

that there are differences these might be one type of cell or one type of person or car or city etc these might be

another type of cell and lastly these might be a third type of cell unfortunately we can't observe

differences from the outside so we sequence the messenger RNA in each cell to identify which genes are active this

tells us what the cell is doing if they were people we could measure their weight blood pressure reading level etc

okay here's the data each column shows how much each gene is transcribed in each cell for now let's

imagine there are only two cells if we just have two cells then we can plot the measurements for each gene this gene

gene one is highly transcribed in cell one and lowly transcribed in cell two and this gene gene 9 is lowly

transcribed in cell 1 and highly transcribed in cell 2 in general cell one and cell to have an inverse correlation this means that they

are probably two different types of cells since they are using different genes now let's imagine there are three cells

we've already seen how we can plot the first two cells to see how closely they are related now we can also compare cell one to sell three cell one and cell

three are positively correlated suggesting they are doing similar things lastly we can also compare cell two to

cell three the negative correlation suggests that cell two is doing something different from cell 3 alternatively we could try to plot all

three cells at once on a three dimensional graph cell one could be the vertical axis cell two could be the

horizontal axis and sell three could be depth we could then rotate this graph around to see how

the cells are related to each other but what do we do when we have four or more cells draw tons and tons of to sell

plots and try to make sense of them all or draw some crazy graph that has an axis for each cell and makes our brain explode

no both of those options are just plain silly instead we draw a principal component analysis or PCA plot

a PCA plot converts the correlations or lack thereof among the cells into a 2d graph

cells that are highly correlated cluster together this cluster of cells are highly correlated with each other

so are these and so are these to make the clusters easier to see we can color-code them once we've identified

the clusters in the PCA plot we can go back to the original cells and see that they represent three different types of cells doing three different types of

things with their genes BAM here's one last main idea about how to interpret PCA plots the axes are ranked

in order of importance differences among the first principal component access PC one are more

important than differences along the second principal component access PC two if the plot looked like this

where the distance between these two clusters is about the same as the distance between these two clusters then these two clusters are more

different from each other than these two clusters before we go you should know that PCA is just one way to make sense of this type of data there are lots of

other methods that are variations on this theme of dimension reduction these methods include heat maps tea Snee plots

and multiple dimension scaling plots the good news is that I've got stat quests for all of these so you can check those out if you want to learn more note if

the concept of dimension reduction is freaking you out check out the original stat quest on PCA I take it nice and slow so it's clearly explained hooray

we've made it to the end of another exciting stat quest if you like the stat quest and want to see more of them please subscribe and if you have any ideas for additional stat quests well

put them in the comments below until next time quest on

Loading...

Loading video analysis...