An Introduction to Physics-Informed Neural Networks
By Zara Dar
Summary
Topics Covered
- PINNs Conquer Data Scarcity
- Inject Physics into Loss Function
- Physics Loss Enforces DE Everywhere
- Next AI Wave Masters Physics
Full Transcript
Hey, it's Zara. In this video, we'll be talking about physics-informed neural networks. I will explain this concept through a simple example so you get a general idea of how they combine physics with neural networks to make better predictions even with limited data.
Imagine you have received an initial injection of a drug with concentration C₀.
We want to know the concentration in your bloodstream, C(t), at any specific moment. However, in reality, we cannot monitor you continuously. We can only take a blood sample to measure the concentration, let's say every 5 hours.
A classical neural network, which we previously discussed, would effectively complain that there's almost no data: with just a handful of time points and concentrations at those time points, it might interpolate poorly between measurements and extrapolate in completely unphysical ways, because it's just pure data; there's no direct physical input. However, let us assume that we have prior knowledge of the differential equation governing the concentration, dC/dt = -k·C(t), where t is the time elapsed, C(t) is the drug concentration in the blood at time t, and C₀ is the initial concentration at the moment of injection, so at t = 0. k is the elimination rate constant, or how fast the drug clears the body. dC/dt is the rate of change of the concentration, and the negative sign here means that it is a decay, so k in this case is a positive number. While this is a simple differential equation and assumes we have a simple physical model governing the drug concentration decay, it still serves as a nice model for demonstration.
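Written out, the model described here is the standard first-order elimination equation, and separating variables gives its closed-form solution:

```latex
\frac{dC}{dt} = -k\,C(t), \qquad C(0) = C_0
\;\;\Longrightarrow\;\;
C(t) = C_0\, e^{-kt}
```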
So in 2019, researchers proposed a compromise: let a neural network respect the known physics as it learns from the scarce data available. Their idea was physics-informed neural networks, or PINNs, and it has since spread across many fields including fluid mechanics, climate, materials, and finance.
To put this discussion into context, my previous video focused on data-driven neural networks, where the loss function was based on mean squared error, or MSE, without incorporating any physical constraints. This approach made sense because the task involved detecting birds in images, a setting where meaningful physical laws are not applicable. In simple terms, the presence of a bird cannot be inferred through differential equations.
So what is a physics-informed neural network? Let's take an ordinary feed-forward neural network. You simply feed it the coordinates you care about, say spatial position or time, and ask it to output the physical quantity you need, say the temperature or velocity. To qualify as a PINN, the training generally differs from normal deep learning in one key way: by injecting the known physics into the loss function.
The loss function combines the measured-data misfit, meaning a prediction error like regular mean squared error, with the residual of the governing differential equation. Let me explain this using the simple example we discussed earlier: the concentration of a drug in the bloodstream after an initial injection.
We can start with a simple input, time t, and this time goes into our neural network.
Our neural network initially has random or untrained weights w.
The network then produces a prediction for the drug concentration at time t.
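As a minimal sketch of this step in PyTorch (the layer widths and depth here are illustrative assumptions, not necessarily the script referenced later in the video), such a network maps a time t to a predicted concentration:

```python
import torch
import torch.nn as nn

# Minimal fully connected network: time t -> predicted concentration C(t).
# Layer widths are illustrative, not tuned.
class ConcentrationNet(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(1, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def forward(self, t):
        return self.net(t)
```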
The advantage here is that modern neural networks come with automatic differentiation, which is just a fancy term for the chain rule propagated through the network. This means that we can easily compute the derivative dC/dt of the prediction with respect to t.
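Continuing the sketch, PyTorch's autograd can return dC/dt for a batch of time points directly; the 24-hour window and the number of points are assumptions for illustration:

```python
# Automatic differentiation: dC/dt of the network output w.r.t. the input time.
model = ConcentrationNet()
t = torch.linspace(0.0, 24.0, 50).reshape(-1, 1).requires_grad_(True)
C = model(t)
dC_dt = torch.autograd.grad(C, t, grad_outputs=torch.ones_like(C),
                            create_graph=True)[0]  # create_graph lets a loss backprop through dC/dt
```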
Now recall that we have two things we know about this problem: the initial concentration C₀ at time t = 0, and the differential equation governing the drug concentration's variation over time. To train this neural network as a PINN, these known conditions are turned into parts of the loss function.
One part of the loss is the boundary condition loss at t = 0, ensuring that the predicted concentration at time 0 is close to the initial concentration C₀.
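Continuing the sketch, this boundary condition loss is just a squared error at t = 0 (C₀ = 1.0 is an assumed value for illustration):

```python
# Boundary-condition loss: the prediction at t = 0 should match C0.
C0 = 1.0  # assumed initial concentration, for illustration
t0 = torch.zeros(1, 1)
loss_bc = (model(t0) - C0).pow(2).mean()
```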
And the second part of the loss is the physics loss. Here we have to use the known differential equation and enforce it at multiple collocation points. We can select as many of these points as we want, let's say m points. This ensures that the network respects the physics throughout the time domain.
So you can see how I converted the differential equation into a loss by moving everything to the left-hand side, giving the residual dC/dt + k·C(t).
So we take the prediction, compute its derivative, and plug it into the ODE residual. Ideally this residual is zero everywhere, but in practice it won't be zero, and minimizing it becomes part of the training.
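Continuing the sketch, the physics loss is the mean squared ODE residual dC/dt + k·C evaluated at m collocation points; k, m, and the time window below are assumed values:

```python
# Physics loss: mean squared ODE residual dC/dt + k*C at m collocation points.
k = 0.1   # assumed elimination rate constant
m = 100
t_col = (torch.rand(m, 1) * 24.0).requires_grad_(True)
C_col = model(t_col)
dC = torch.autograd.grad(C_col, t_col, grad_outputs=torch.ones_like(C_col),
                         create_graph=True)[0]
loss_phys = (dC + k * C_col).pow(2).mean()
```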
Afterwards, by combining both the boundary condition loss and the physics loss, we form a total loss, and we minimize the total loss using any standard optimizer like stochastic gradient descent or Adam.
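Putting the pieces of the sketch together, a training loop might look like this; the equal weighting of the two losses, the learning rate, and the step count are assumptions:

```python
# Training loop: minimize total loss = boundary-condition loss + physics loss.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(5000):
    optimizer.zero_grad()

    loss_bc = (model(t0) - C0).pow(2).mean()   # boundary condition at t = 0

    C_col = model(t_col)                       # ODE residual at collocation points
    dC = torch.autograd.grad(C_col, t_col, grad_outputs=torch.ones_like(C_col),
                             create_graph=True)[0]
    loss_phys = (dC + k * C_col).pow(2).mean()

    loss = loss_bc + loss_phys                 # equally weighted, as an assumption
    loss.backward()
    optimizer.step()
```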
And then we iterate this process until convergence. So I went ahead and implemented this example in a simple Python script, which you can find on my GitHub. This example is intentionally very simple; in practice, the real challenge is forming the correct differential equation for the problem and choosing the appropriate training strategy and network architecture.
Everything we've discussed so far, adding the residual term to the loss, turns a generic neural network like an FNN into a PINN during training. Physics isn't built into the architecture itself; it's only enforced through the loss function. However, there is a growing class of PINN-specific architectures, such as Fourier PINNs or neural operators, that go way further.
These designs incorporate physics more directly, either through coordinate transformations, basis functions, or so on.
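As one sketch of that general idea (a Fourier feature input embedding, illustrative only and not the exact architectures named above), the network can be fed sines and cosines of the input at several assumed frequencies:

```python
# Fourier feature embedding: map t to sin/cos features at assumed frequency
# scales before the fully connected layers, building structure into the
# inputs themselves rather than only into the loss.
class FourierFeatures(nn.Module):
    def __init__(self, n_freqs=4):
        super().__init__()
        self.freqs = 2.0 ** torch.arange(n_freqs).float()  # assumed scales

    def forward(self, t):
        angles = t * self.freqs  # (batch, 1) * (n_freqs,) -> (batch, n_freqs)
        return torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1)
```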
And here I want to take a moment to quote Jensen Huang. He said that the next wave of AI will require us to understand things like the laws of physics, friction, inertia, and cause and effect.
It's not just about data anymore. It's
about understanding how data interacts with the real world.