Two-dimensional continuous distributions: an introduction

By Ben Lambert

Summary

## Key takeaways

- **Continuous Outcomes Require Continuous Distributions**: When describing random processes with continuous outcomes, like the volume of beer consumed or the amount of body fat, we must use continuous probability distributions, not discrete ones. [00:10]
- **Visualizing 2D Distributions: Surface vs. Contour Plot**: Two-dimensional continuous probability distributions can be visualized as a surface in three dimensions, but contour plots are often preferred because they are easier to analyze; each contour is a line of constant probability density. [02:15], [03:14]
- **Valid 2D Distribution Conditions**: For a two-dimensional continuous distribution to be valid, the probability density function must be non-negative for all outcomes, and the double integral over all possible values must equal one. [05:25], [06:03]
- **Integral Represents Total Probability**: The double integral of a two-dimensional probability density function over all possible values is the total probability, which must equal one because any sampled individual must fall somewhere within the defined outcome space. [06:34], [07:02]

Topics Covered

  • Randomness means uncertainty, not chaos.
  • Visualizing complex data with contour plots.
  • Contour plots simplify 3D probability distributions.
  • Valid probability distributions require non-negativity.
  • The integral of probability density must equal one.

Full Transcript

in this video I want to further explain

what is meant by a 2-dimensional

probability distribution and in this

video we're going to be considering a 2d

probability distribution for continuous

outcomes so as before we're considering

two processes or we're considering the

outcome of two separate processes where

each of those processes is random in

nature and by random I don't mean the

colloquial use of random I mean that the

outcome of that process is uncertain and

unlike the discrete case the individual

outcome of one of these processes

belongs or can be seen as belonging to

one of a continuum of possible outcomes

and because the thing which we're

describing can be measured as being

continuous and because the outcome is

continuous in nature the distributions

that we use to describe it are

unsurprisingly continuous probability

distributions the example that I'm going

to be using here is going to be two

processes one of them is the volume of

beer that an individual drinks per week

and I'm going to use the random variable

B to represent that measurement of the

volume of beer that they drink on an

average week the other thing that we're

going to consider is the number of grams

of body fat that an individual has and

just for completeness I'm going to say

that we measure the volume of beer drunk

in terms of the number of liters so why

are we using probability distributions

to describe the outcomes of these two

processes well the idea is that what

we're imagining we're doing here is that

we are sampling an individual at random

from the population and because before

we actually measure their level of body

fat or ask them to tell us the

level of alcohol that they drink a week

we are uncertain about both of those

outcomes and hence because of this

uncertainty we use probability

distributions so what might a

probability distribution function look

like in this case well we've got two

potential outcomes and so we've got on

our axes here B a random variable which

represents the volume of beer drunk per

week and F which represents an

individual's level of body fat

now the third axis as for the univariate

case represents a probability density so

what might our probability distribution

function look like in this case well it

would be a kind of surface so I might

sort of represent it in this sort of

form here

where we can sort of imagine that the

thing I'm drawing is actually kind of a

three dimensional surface or rather a

surface which exists in three dimensions

but this representation that I've shown

thus far is a bit difficult to analyze

and so what we typically do is we

represent the probability distribution

function in a slightly different form

here what we do is we use something

which is known as a contour plot so here

the bottom axis will be the volume of

beer drunk per week and then the

vertical axis will be an individual's

level of fat then what we do is we draw

contours in a graph and these contours

represent lines which have all got

constant probability density so here

this purple line which I've drawn here

might correspond in the left hand case

to a sort of path which is represented

by the purple line I've shown here on

the left where all of the combinations

of body fat and alcohol drunk per week

that I've shown here by the sort of path

on the sort of horizontal plane

correspond to the same level of

probability density then I could draw a

contour of slightly lower levels of

probability density and represent that

by the green line and perhaps that green

line corresponds to a sort of green line

which I'm now writing over the orange

line on the left-hand side here and then

finally I might draw another contour

which corresponds

on the left here to the sort of

line of a sort of lower level

of probability density here and I

represent that in the right hand case

here by another sort of contour which I

draw here because humans tend to find

two dimensions easier to think in than

three dimensions we often use this kind

of thing on the right hand side here

this visualization trick which is known

as a contour plot but remember whenever

you see a contour plot what essentially

is happening here is that the

probability density is sort of coming

out of the page at you or out of the

screen in this case where in this case

the pink line here corresponds to a high

probability density the green to a

slightly lower one and the light blue to

a slightly lower one still okay so now

we've covered how we kind of visualize a

two-dimensional continuous probability

distribution but what are the conditions

under which that distribution is

actually a valid probability

distribution well much like the discrete

case we require that the values of our

function which is now a function of the

two random variables B and F must be

greater than or equal to zero for all

potential values of B and F and we see that

that's satisfied here because of the fact

that our probability density although I've not

really shown it is always greater

than zero the second condition is

essentially the analogue of the two

dimensional discrete case where we have

to do two summations
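
Written out, with p(b, f) as an assumed notation for the joint density (the video does not name the function), the first condition and the two forms of the second condition might look like this. The continuous form simply replaces each summation with an integral from zero to infinity:

```latex
\begin{align*}
p(b, f) &\ge 0 \quad \text{for all } b \ge 0,\ f \ge 0 \\
\sum_{b} \sum_{f} \Pr(B = b, F = f) &= 1 \quad \text{(discrete case)} \\
\int_{0}^{\infty} \int_{0}^{\infty} p(b, f) \, \mathrm{d}b \, \mathrm{d}f &= 1 \quad \text{(continuous case)}
\end{align*}
```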

now because we're dealing with

continuous variables what we do is we do

two integrals we integrate the

probability distribution function which is now

a function of two variables B and F with

B being between zero and infinity and F

between zero and infinity it's zero here

in both cases because you can't drink a

negative volume of beer nor can you have

a negative level of body fat and this

integral must be equal to one and the

intuition behind this condition is just

saying that any individual that we pick

from any population

must belong somewhere in this kind of

plane that we've drawn here where we've

got positive levels of body fat and

positive levels of beer drunk per week

or at least non-negative values of

both of those individual things so what

actually does this two-dimensional

integral kind of represent well we can

imagine it as being sort of working out

the volume which is contained under this

surface here and above the plane which corresponds

to a probability density of 0 so we're

kind of working out the volume

underneath this particular surface that

I've drawn here
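
As a rough numerical illustration of the two conditions, the sketch below uses a made-up joint density: independent exponential distributions for B and F, with hypothetical means of 2 liters of beer and 15,000 grams of body fat. None of these choices come from the video. It approximates the volume under the surface with a midpoint-rule sum; the function names and the truncation of the infinite upper limits at 20 means are likewise assumptions for illustration only.

```python
import math

# Hypothetical means, chosen only for illustration (not from the video).
MEAN_BEER_LITERS = 2.0      # average liters of beer per week
MEAN_FAT_GRAMS = 15000.0    # average grams of body fat


def joint_density(b, f):
    """Assumed joint density: independent exponentials, zero for negative inputs."""
    if b < 0 or f < 0:
        return 0.0
    density_b = math.exp(-b / MEAN_BEER_LITERS) / MEAN_BEER_LITERS
    density_f = math.exp(-f / MEAN_FAT_GRAMS) / MEAN_FAT_GRAMS
    return density_b * density_f


def volume_under_surface(density, b_max, f_max, steps=400):
    """Midpoint-rule approximation of the double integral over [0, b_max] x [0, f_max]."""
    db = b_max / steps
    df = f_max / steps
    total = 0.0
    for i in range(steps):
        b = (i + 0.5) * db
        for j in range(steps):
            f = (j + 0.5) * df
            total += density(b, f) * db * df
    return total


def is_non_negative(density, b_max, f_max, steps=50):
    """Spot-check condition 1 on a coarse grid (a check, not a proof)."""
    return all(
        density(i * b_max / steps, j * f_max / steps) >= 0.0
        for i in range(steps + 1)
        for j in range(steps + 1)
    )


# Truncate each infinite upper limit at 20 means; the neglected tail is tiny.
B_MAX = 20 * MEAN_BEER_LITERS
F_MAX = 20 * MEAN_FAT_GRAMS

total_probability = volume_under_surface(joint_density, B_MAX, F_MAX)
print(is_non_negative(joint_density, B_MAX, F_MAX))  # condition 1 spot check
print(round(total_probability, 3))                   # condition 2: close to 1
```

Under these assumptions the first printed value should be True and the second should be very close to 1. A candidate function that failed either check would not be a valid probability distribution.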
