An actual Q&A for my video about actually tracking Stealth Fighters with cheap webcams.
By Consistently Inconsistent
Summary
## Key takeaways

- **Low SNR Target Detection Breakthrough**: Previous multi-sensor tracking techniques struggle with low signal-to-noise ratio (SNR) targets, becoming computationally impossible as more cameras are added. This new method allows data to be accumulated and correlated in a voxel grid, boosting SNR and making it feasible to add many cameras. [01:46], [04:12]
- **Beyond Basic Triangulation**: Many existing techniques sound similar but are limited to high-SNR targets, essentially performing basic triangulation. This method excels at detecting low-SNR targets in cluttered environments, a fundamental limitation of prior approaches. [01:21], [01:48]
- **Multi-Camera Advantage for Low SNR**: While adding cameras to older systems exponentially increased computational cost, this new technique's cost grows only linearly per camera. Each added camera exponentially increases the ability to differentiate targets from background noise, making it viable to 'throw more cameras at the problem'. [03:05], [04:21]
- **Versatile Sensing Capabilities**: The technique can be applied at night using thermal vision, through clouds with radar, and even to improve 3D scanning data. It overcomes limitations like diffraction and photon shot noise, offering a significant improvement over existing methods. [01:02], [06:32]
- **Practicality Over Costly Alternatives**: Historically, improving low-SNR detection was often more expensive than simply using higher-quality cameras. This new method, even with cheap webcams, demonstrates a more effective and cost-efficient approach than older, more complex techniques. [03:48], [06:15]
Topics Covered
- New technique detects faint objects beyond 100 km.
- Old tracking methods fail on low signal-to-noise targets.
- Adding more cameras exponentially boosts detection, not cost.
- New method makes stealth fighters detectable.
- This technique is better than existing 3D scanning methods.
Full Transcript
Thanks for the life-changing support in
my last few videos, and sorry for the
double remaster. Sadly, YouTube doesn't
let you edit videos, and it was meant to
be my summer of math exposition
submission, which I was hyped for, and I
didn't know how to fit this into it. As
a probably necessary recap and summary for those who haven't seen the series before (which you can feel free to skip using this timestamp): I made a video where I introduced a technique that lets you detect and track faint objects, such as drones, asteroids, and stealth fighters, far better than any other technique in existence, allowing you to detect and precisely track them from well over 100 km away. It achieves this by first extracting motion values from a series of images, then back-projecting each pixel's motion value in the direction of its origin into a grid of voxels, adding the pixel's motion to every voxel its ray collides with, and then repeating this from multiple different cameras, which allows us to average out and filter out noise and foreground objects such as insects and birds, leaving just the objects that are moving within our grid. This technique can also be used interchangeably at night, from well over 100 km away, using thermal vision, and with radar to majorly improve your ability to see through clouds, not to mention solve a whole bunch of other problems.
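To make that recap concrete, here is a minimal sketch of the back-projection loop in Python. Everything in it is an illustrative assumption rather than the exact implementation from the videos: a pinhole camera model, simple frame differencing as the motion extractor, and naive ray marching through the grid.

```python
import numpy as np

# Minimal sketch: pinhole camera, frame differencing as the "motion value"
# extractor, and naive ray marching. All names and numbers here are
# illustrative assumptions, not the implementation from the videos.

VOXEL_SIZE = 100.0                               # metres per voxel (assumed)
GRID_ORIGIN = np.array([-3200.0, -3200.0, 0.0])  # world position of voxel (0,0,0)
grid = np.zeros((64, 64, 64))                    # shared accumulator

def motion_values(prev_frame, frame):
    """Per-pixel motion signal: absolute temporal difference of two frames."""
    return np.abs(frame.astype(np.float32) - prev_frame.astype(np.float32))

def back_project(motion, cam_pos, cam_rot, focal_px):
    """Add each pixel's motion value to every voxel its viewing ray crosses."""
    h, w = motion.shape
    cy, cx = h / 2.0, w / 2.0
    t = np.arange(0.0, 8000.0, VOXEL_SIZE)       # march ~8 km along each ray
    for v in range(h):
        for u in range(w):
            m = motion[v, u]
            if m < 1.0:                          # skip negligible motion
                continue
            ray = np.array([u - cx, v - cy, focal_px])
            ray = cam_rot @ (ray / np.linalg.norm(ray))  # pixel -> world dir
            pts = cam_pos + t[:, None] * ray             # points along ray
            idx = ((pts - GRID_ORIGIN) / VOXEL_SIZE).astype(int)
            ok = np.all((idx >= 0) & (idx < np.array(grid.shape)), axis=1)
            idx = idx[ok]
            grid[idx[:, 0], idx[:, 1], idx[:, 2]] += m   # accumulate motion
```

Running back_project once per camera per frame pair means voxels that genuinely contain a mover get reinforced from every viewpoint, while insects, birds, and sensor noise only score along a single camera's rays and wash out of the accumulated grid.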
With that out of the way, probably the most important question is: doesn't this already exist? That confusion is understandable, because there are plenty of techniques that on the surface sound exactly like they're doing the same thing as this, and they can perform almost all of the same functions, but only on high signal-to-noise ratio targets, at which point they're just doing basic triangulation. A major mistake in my videos was that I wasn't very clear about what the fundamental limitations of those previous techniques were, and what this one is meant to be best at, which is specifically detecting low signal-to-noise ratio targets in cluttered spaces. As a brief summary of the major weakness of what was previously the best-in-class technique for tracking faint objects like this (also easily the technique you guys commented about most under my last video, and one which sadly doesn't have a single unified name), I'm going to sum up most of the names that get lumped into it as multi-sensor tracking-before-detection triangulation using Kalman filtering on Radon/Hough transforms. The problem with this approach is that it always, inevitably, runs straight into a brick wall of performance issues on ultra-low signal-to-noise ratio targets, because it relies on being able to cull false positives.
And unfortunately, when your object's motion signature is faint, you won't have enough signal relative to noise to decide which pixels in your image should be culled and excluded from further processing. This very rapidly becomes a major problem: each extra point on your image that you're unable to confidently cull adds an exponentially accumulating number of extra associations to check, which will cap out even the best computers before they can process more than a small fraction of the pixels on two cameras, let alone five or six, making this basically useless for ultra-low SNR. But the real killer is that
each extra camera you add increases the exponent to which the total number of candidates you have to check is raised. So it's more or less computationally impossible to see any of the gains you could get from just adding more cameras until it works, which is why you pretty much never see systems using more than one camera.
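To put rough numbers on that scaling (the figures below are made up purely for illustration): if m candidate pixels per camera survive culling, exhaustive cross-camera association grows like m raised to the camera count, while accumulating into a voxel grid only does work proportional to m per camera.

```python
# Illustrative scaling only; m and the camera counts are made-up figures.
m = 5_000                             # un-culled candidate pixels per camera
for n_cams in (2, 3, 6):
    associations = m ** n_cams        # exhaustive cross-camera hypotheses
    voxel_work = m * n_cams           # one back-projection pass per camera
    print(f"{n_cams} cameras: ~{associations:.1e} candidate associations "
          f"vs {voxel_work:,} back-projections")
```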
And if you want an incredible video about the difficulty of finding planes, even using telescopes, then check out the one by DST Studios on the topic, which ironically was what inspired me to realize that the asteroid tracker I was working on could do this. This is why, historically,
it was pretty much never used in practice for low-SNR tasks: a small amount of juice really isn't worth the squeeze, and it was almost always going to be cheaper to just buy a higher-quality camera than to deal with the many, many headaches and impossibilities of the older technique.
But with this new method, you're now able to add all of your data into a voxel grid first to combine, correlate, and boost the signal-to-noise ratio of candidate pixels inside voxels that would otherwise have been culled, which makes it much, much easier to reduce the number of candidate pixels with each added camera, all while incurring only a linear increase in computational cost per additional camera. Especially because each camera you add exponentially increases your ability to differentiate your target from foreground and background objects, if you don't have enough signal from one or two cameras, you can just keep throwing more cameras at the problem until it works, which is why this has always been a theoretical dream technique.
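Here's a toy numpy demonstration of why stacking cameras works, using a synthetic 1-D stand-in for the voxel grid and made-up noise levels: a target with roughly unit SNR in any single view becomes trivially separable once several independent views are summed.

```python
import numpy as np

# Synthetic toy with assumed numbers: each "camera" contributes an
# independent, noisy back-projection of the same row of voxels.
rng = np.random.default_rng(0)
n_voxels, target, signal = 1000, 500, 1.0      # signal comparable to noise

def one_view():
    view = rng.normal(0.0, 1.0, n_voxels)      # background noise, sigma = 1
    view[target] += signal                     # faint target, SNR ~ 1
    return view

for n_cams in (1, 4, 16, 64):
    acc = sum(one_view() for _ in range(n_cams))
    snr = acc[target] / np.std(np.delete(acc, target))
    print(f"{n_cams:3d} cameras: accumulated SNR ~ {snr:4.1f}")

# The accumulated SNR grows like sqrt(n_cams), so the confidence with which
# empty voxels can be culled improves rapidly with every added camera,
# while the accumulation cost stays linear in the number of cameras.
```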
Obviously, there are a lot of things wrong with that summary, and it's a bit more nuanced than that, but we would be here all day. Broadly speaking, though, those really were the limitations of previous techniques: you were either sacrificing precious speed or sacrificing crucial information. Which is why pretty much all of the work you'll find on improving multi-sensor tracking-before-detection triangulation using Kalman filtering on Radon/Hough transforms focuses on the small gains you can get from better culling. And that's not me trying to knock those other techniques: they not only can but absolutely should be used in combination with this to unlock ridiculously more data, which is especially true for enhancing 3D scanning techniques like cryo-electron tomography, where you can make very reasonable assumptions about certain substructures appearing in your sample to help determine its orientation.
And I've never been able to find an example of this, including Andrew's lattice system, that was able to show results anywhere near as good as what I was able to get. I have tried my absolute hardest to find any example of this predating mine so that I could credit them, but even the extremely thorough Center for Strategic and International Studies report on using distributed infrared sensors to track stealth fighters only ever considers the case of triangulating stealth fighters once you've already found them. Moving on to diffraction limits: yes, if you're using thermal cameras at night, diffraction does start to become a problem for ultra-cheap thermal cameras (cheap being relative, of course), but it's still well within the limits of already existing thermal camera systems, while remaining both cheaper and much better than any other technique. The use of webcams is just to show off its power; you would always want to use better-quality cameras to eliminate headaches early. You also gain the added benefit of being able to use the information from multiple cameras to create a very accurate, firm estimate of the position of the object, which allows you to overcome diffraction and other blurring effects such as photon shot noise, and so on.
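As a hedged illustration of how multiple views firm up the position estimate (a toy Gaussian-error model assumed here, not the video's actual estimator): each camera's blurred detection is an independent noisy measurement, so even the simplest fusion shrinks the positional error roughly as one over the square root of the camera count.

```python
import numpy as np

# Toy model with assumed numbers: each camera reports the target position
# with independent Gaussian error standing in for blur and shot noise.
rng = np.random.default_rng(1)
true_pos = np.array([1000.0, 2000.0, 5000.0])   # metres, made-up target
sigma = 50.0                                    # per-camera error, made-up

for n_cams in (1, 4, 16):
    estimates = true_pos + rng.normal(0.0, sigma, (n_cams, 3))
    fused = estimates.mean(axis=0)              # simplest possible fusion
    err = np.linalg.norm(fused - true_pos)
    rms = sigma * np.sqrt(3) / np.sqrt(n_cams)  # expected RMS error
    print(f"{n_cams:2d} cameras: error ~ {err:5.1f} m (RMS ~ {rms:5.1f} m)")
```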
In terms of improving radar's ability to see through clouds: this helps there too, and it is a completely separate technique from multistatic radar, which has been done for decades and is what you're thinking of if you've heard of networked radar dishes before. I would also expect radar to see a drastically bigger boost from this than infrared does, since radar gives you a lot more data to correlate, such as distance and velocity, while infrared only gives you brightness.
Also, pretty much every problem you can think of with this usually has a surprisingly interesting and reasonably easy-to-implement solution if you throw a little brainpower at it. And even the more extreme applications will still be well within the scale of what you can do with a few million dollars to throw at a development program, which is still very cheap compared to what this can do once those problems are solved. And all the hobbyists who replicated their own versions of this said they were able to overcome most of the problems that people, including me, usually think of. While I don't think it is
productive to measure the success of a technique that can be used for plenty of other things against a series of programs that cost hundreds of billions of dollars in total, even if this does make stealth fighters entirely ineffective at being stealthy, I still wouldn't expect the contracts for them to get cancelled for at least another year or two, as I would still want time for feasibility studies on workarounds and alternative use cases, since, technically, if you can knock out enemy radar, you could still be invisible inside cloud cover. Not to mention that it's free money to those who actually matter. Anyways, that being said:
given that, in order to be within range to drop GBU-57s during the Natanz strikes, the B-2s would have had to travel within 20 km of the facility, and that this happened on a night we know had clear skies, you could have used the background stars to plate-solve for orientation stabilization.
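For what plate solving buys you here, a minimal sketch of the orientation-recovery step (assuming star identification is already done; the Kabsch/SVD solution to Wahba's problem below is one standard approach, not necessarily what any particular stabilizer uses):

```python
import numpy as np

def camera_attitude(cam_dirs, catalog_dirs):
    """Best-fit rotation R with R @ cam_dirs[i] ~ catalog_dirs[i] (Kabsch)."""
    H = cam_dirs.T @ catalog_dirs             # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))    # guard against reflections
    return Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

# Self-test with a synthetic attitude and made-up star directions:
rng = np.random.default_rng(2)
stars = rng.normal(size=(20, 3))
stars /= np.linalg.norm(stars, axis=1, keepdims=True)   # catalog unit vectors
a = 0.3                                                 # radians, arbitrary
Rz = np.array([[np.cos(a), -np.sin(a), 0.0],
               [np.sin(a),  np.cos(a), 0.0],
               [0.0,        0.0,       1.0]])
cam = stars @ Rz.T                                      # stars in camera frame
R = camera_attitude(cam, stars)
assert np.allclose(R, Rz.T)                             # attitude recovered
```

With the attitude pinned to the star field frame by frame, pixel rays can be back-projected in a stable world frame even from a shaky camera.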
I do wonder if an unintended consequence of me releasing that video, at least in part, was the sudden, unexpected rush to bomb the facility before a minimum viable tracking system could be developed. But I suppose we will have to wait for Ken Klippenstein on that. Moving on to things that are actually useful: without information about the phase of the wave you're measuring, this doesn't give you the 2D ultra-resolution that you get from interferometry. Then again, we're also only able to perform interferometry with synthetic apertures bigger than a few dozen meters at long wavelengths such as radio waves. So I'd
still say this is more than pretty good, especially for asteroid hunting, because it allows your sensors to be as far from each other as you want, at whatever wavelength you want. And it still gives you a ridiculous amount of extra free information, specifically inside 3D volumes, that you don't get from all the other techniques, which use only the normal images rather than difference images alongside them, as I explained better in my third video.
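For what "difference images" means concretely, a toy sketch with synthetic frames (the numbers are assumed, not measured): a faint mover that is invisible against a bright static scene in the raw frames dominates the frame-to-frame difference, because the static scene cancels out.

```python
import numpy as np

# Synthetic toy: bright static background, faint moving target (made-up values).
rng = np.random.default_rng(3)
background = rng.uniform(0.0, 255.0, (120, 160))          # static scene

def frame(t):
    f = background + rng.normal(0.0, 2.0, background.shape)  # sensor noise
    f[60, 40 + t] += 8.0                    # faint target at column 40 + t
    return f

f0, f1 = frame(0), frame(1)
diff = f1 - f0                              # static scene cancels out
print("target z-score in raw frame:  ", abs(f1[60, 41] - f1.mean()) / f1.std())
print("target z-score in difference: ", abs(diff[60, 41]) / diff.std())
```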
But to expand on what I said in that video: this uses information that isn't accessed by other 3D scanning techniques such as Radon transforms and, in particular, Gaussian splats and NeRFs, which are much better at a lot of other things. This is just better at the things they weren't designed to be good at, or never needed to be good at to begin with: despite their incredibly high visual fidelity, they still struggle to find small, faint objects like this, because why would you need to? You would never notice them anyway. It's also really cool seeing how differently they work from my technique behind the scenes. People have also told me to
patent this, but that ship has sailed.
And let me know if you want me to upload a fully translated version of what is probably my favorite replication of this: an entire devlog series made by a fan from China using cheap webcams. And if you've built something with this yourself, feel free to send it to me by email or on Twitter; if you guys want to see them, I want to make a compilation of submissions.
Speaking of which, something that I
really didn't expect about having a
video blow up is that pretty much all of
the people who send you something send
it within the first 30,000 or so views.
After that, it falls off like crazy; as in, within the first 10,000 views, I'll have people from my own city messaging me. So,
don't be afraid to send something
regardless of the view count. The
closest paper I could find to this was this one, which comes painfully close to doing the same thing, but unfortunately doesn't quite put all the pieces together to do things like accumulating ultra-low-signal objects, and is instead about finding the whole of an object that is already high-SNR, such as a person. The next-closest study I could find was the 2017 pyramid scans, which use a similar back-projection technique but don't add a pseudo-motion between sensors on top to improve their resolution, despite already having close-to-ideal spacing between the scanners.
Anyways, I want to thank you guys again
for your incredibly life-changing
support on the last few videos. I'm
planning on going back to my normal fun
videos in my spare time now that my life
has settled a bit. So, we will be back
to your regularly scheduled
deprogramming, unless Google wants to partner on a video to show off how their tech platform enables progress or something, seeing as I posted this on their platform. Although, the thing that I want to make is so cool and interesting that I might just make it anyway for the fun of it. To recap:
this improves on previous techniques by allowing you to find ultra-low-signal targets, which would previously have cost an impossible amount of computing time to find and wouldn't have let you keep adding more cameras. It works at night from over 100 km away using thermals, and through clouds with radar, and it is separate from other networked-radar techniques. It isn't fazed by diffraction or photon shot noise, while still being cheaper and more effective than the alternatives. It can also be used to extract more information from 3D scanning datasets, provided there is enough parallax between the cameras for objects to appear in different spots; so it won't help you with 2D scanning, like taking images of distant galaxies, which is what you need interferometry for. Even though I tried
to go over some of it again, there's
also a bunch of extra stuff in the third
video that I don't cover here. So, some
of your questions may have already been
covered there with most of the extra
stuff happening at these timestamps. But
yeah, it takes ages to edit videos and
this video is already getting long. So,
I'm going to write out what didn't need
to be said in this video in the
description to finally actually get this
out. And this video has pretty much just
been B-roll. Anyways, thanks again for watching. I'll see you on the next