RTX5090/DLSS4 In-Depth Review: It's All Tech and Trickery!
By 极客湾Geekerwan
Summary
Topics Covered
- Path Tracing Games Need DLSS4 to Be Playable
- Traditional GPU Performance Is Plateauing
- China's 5090D Gaming Performance Matches Global 5090
- DLSS4 Transformer Architecture: Smarter Than CNN
- 4x Frame Generation Is Surprisingly Usable on 5090
Full Transcript
After two and a half years, the RTX50 series graphics cards are finally here.
The two graphics cards in front of me are the new flagship RTX5090... D.
It can't be helped: for now, only the 5090D is officially sold in China, so today we will take a look at how the 5090D performs.
Is it better than the 4090?
How effective is DLSS4?
And most importantly, how different is the gaming performance of the 5090D from that of the 5090?
As it happens, an "uncle" we know very well lent us a 5090 without the D!
So we can also find out how much difference this D makes.
The situation is not the same as with the 4090D and 4090.
On paper, at least, the 5090D is exactly the same as the 5090.
It has exactly the same number of CUDA cores, the same number of Tensor Cores, the same 96MB L2 cache, and the same 32GB of 512-bit GDDR7 video memory, which is basically a scale increase of more than 30% over the 4090.
For details about the Blackwell architecture, you can refer to the content of our previous technology preview, which has been discussed in detail.
The 5090D does not have an official FE version.
This time we collected two RTX5090Ds, from ASUS and Colorful: ASUS's ROG Night God RTX5090D and Colorful's RTX5090D AD.
Colorful's 5090D AD is quite a hefty card.
There is an 11cm fan on each side and a 10cm fan in the middle.
There are vents on the backplate, but no exhaust fan on the back.
The overall thickness is close to four slots.
The AD is positioned as the more mainstream model, and its price will not be much higher than the official guide price.
ROG's Night God 5090D is larger still.
It has a heat dissipation design with three fans on the front and an exhaust fan on the back.
Next, why don’t we disassemble this ROG Night God 5090D?
First, remove the fans and you can see that this card uses an 8x8mm heat dissipation module with quite a large heat dissipation area.
Remove the heatsink and you can see the PCB.
The PCB of this ROG 5090D is quite luxurious.
A total of 16 memory chips make up the 32GB, 512-bit video memory configuration.
The last time I saw this many memory chips on a card was the GTX280!
The PCB carries an 18+6+7-phase power supply.
To make room for the exhaust fan at the rear, this ROG 5090D also uses a cut-out PCB design similar to the reference 4090.
Then we scraped the 7950 phase-change thermal material off the core.
This is the GPU core of the 5090D.
The code name of this core is GB202-250.
Note that it is 250, while the 5090 without the D is 300.
The core code names differ, so it would be quite difficult to turn a 5090D into a 5090.
The surface area of the GB202 core is huge.
We measured a core area of about 750 square millimeters.
Among the flagship cards of all generations, this is a super-large core on par with TU102, and the package size is even larger.
It can almost be regarded as the largest in history.
After all, a 512-bit video memory controller is no joke.
This time, I am very grateful to my three old friends, Uncle Tony, Wan Shilan and Kurnal, for cutting open a GB202 core for us, at what can only be called an explosive cost!
The full-specification GB202 core has a total of 192 SM units, for a total of 24576 CUDA cores.
The huge SRAM area in the middle is a 128MB L2 cache.
The scale of the entire core is much larger than the previous-generation AD102 core.
The 5090 and 5090D enable 21760 of those CUDA cores and 96MB of L2 cache, so the cut-down ratio is almost the same as on the 4090.
The full-scale GB202 core is probably reserved for the future RTX B6000 and "B40" compute cards, so gamers should not get their hopes up.
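The cut-down ratio mentioned above can be checked with some quick arithmetic. The following is a minimal sketch: the figure of 128 CUDA cores per SM is derived from the quoted 192 SMs and 24576 cores, while the AD102 and RTX4090 numbers are public spec-sheet values that are not stated in this video and are included only as an assumption for comparison.

```python
# Derived from the transcript: 24576 CUDA cores across 192 SMs.
CORES_PER_SM = 24576 // 192


def enabled_fraction(enabled: float, full: float) -> float:
    """Share of the full die that is enabled on the shipping card."""
    return enabled / full


# GB202 as described in the video.
rtx5090_sms = 21760 // CORES_PER_SM
gb202_sm_fraction = enabled_fraction(rtx5090_sms, 192)
gb202_l2_fraction = enabled_fraction(96, 128)  # MB of L2 cache

# Assumption (public specs, not from the video): full AD102 has 18432 cores
# and 96 MB of L2; the RTX4090 enables 16384 cores and 72 MB.
ad102_sm_fraction = enabled_fraction(16384, 18432)
ad102_l2_fraction = enabled_fraction(72, 96)

if __name__ == "__main__":
    print(f"Cores per SM: {CORES_PER_SM}")
    print(f"5090 SMs enabled: {rtx5090_sms} of 192 ({gb202_sm_fraction:.1%})")
    print(f"5090 L2 enabled: {gb202_l2_fraction:.0%}")
    print(f"4090 SMs enabled: {ad102_sm_fraction:.1%}, L2: {ad102_l2_fraction:.0%}")
```

Under these assumptions both generations enable a little under nine tenths of the SMs and three quarters of the L2 cache, which is consistent with the "almost the same" remark.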
Next, let's try it on a real machine to see how the performance of the RTX5090D is.
To avoid a CPU bottleneck, we chose the CPU that currently has the undisputed strongest gaming performance, the AMD 9800X3D, paired with two sticks of DDR5 6000 C28 memory and an ROG X870E Hero motherboard.
In addition to the 5090D, we also retested the old flagship RTX4090 as a reference, and set the full 5090 to the same frequency and power limit for comparison.
First, let’s take a look at 3DMark’s classic DX12 test item Time Spy.
The 5090D scores almost 48,000, which is about 32% better than the 4090.
The performance of the 5090 is exactly the same as the 5090D!
In the more demanding Time Spy Extreme, the 5090D still leads the 4090 by more than 30%, while the 5090 remains exactly the same.
However, in the new Steel Nomad test, the two 5090 cards gain a huge advantage over the 4090, as much as 50%, but in the Steel Nomad Light test the gap between the two generations narrows to less than 30%.
We also ran the Port Royal ray tracing test, where the 5090 siblings are about 40% better than the 4090.
This margin is larger than the advantage in the pure rasterization tests.
The ray tracing efficiency of the 50 series has indeed been further improved.
In any case, at least judging from the benchmark scores, the graphics performance of the 5090D and 5090 is exactly the same.
There is no difference.
To be honest, I am a little disappointed just looking at the benchmark scores.
So how does it perform in actual games?
What is the effect of DLSS4?
We tested a total of 10 games this time to compare 5090, 5090D and 4090.
Let’s take a look.
First, a few games that do not support DLSS.
Elden Ring, for example, is one of the few blockbusters that still does not support DLSS, which is understandable, because Elden Ring is normally capped at 60 frames and few people unlock the frame rate, since doing so triggers the anti-cheat and affects online play.
Judging from the results, Elden Ring is indeed a game that gains relatively little from a faster graphics card.
At the highest 4K image quality, the 5090D runs 167 frames without ray tracing, an increase of 20%, and 110 frames with ray tracing, an increase of 22%.
There is an improvement, but it is really not much, and the power consumption of the 5090D is significantly higher than that of the 4090.
It is still very far from the 575W power limit, but the current 500W level is definitely not lower than that of the 3090TI back in the day.
The other game that does not support DLSS is PUBG ("chicken dinner").
This game is more often bottlenecked by the CPU and memory cache.
If it weren't for the 9800X3D, we might not see any difference at all.
Anyway, on this platform the 5090D can run more than 220 frames at 4K with maximum settings, an increase of about 28%.
Then it’s time for us to introduce several games that support DLSS.
The first is the "optimization star" of 2024, Stalker 2: Heart of Chernobyl, a game that really inherits the glorious tradition of Slavic developers.
Even though it does not support ray tracing, the 4090 can only run at 55 frames in 4K epic quality.
The optimization is really bad.
The new 5090D barely reaches an average of 70 frames, a 27% increase, which counts as playable.
Fortunately, STALKER 2 supports DLSS super resolution and frame generation.
With super resolution at the balanced setting, the frame rate finally increases significantly.
The 5090D can run about 109 frames, and with frame generation turned on as well it reaches 163 frames.
It seems that the 5090D is indeed no different from the 5090.
Even the performance under DLSS is about the same.
Call of Duty: Black Ops 6 is a competitive game with a low graphics card threshold but a relatively high ceiling.
The 4090 can already run 123 frames at 4K with maximum settings without DLSS.
The 5090D adds another 21% and runs 149 frames.
With DLSS at the balanced setting it can run 172 frames.
But if you want an e-sports frame rate of two to three hundred frames, with frame rate lows above 165, current graphics cards can still only get there by reducing the image quality.
Red Dead Redemption 2 is our old friend.
As an old game from 7 years ago, it can still put pressure on flagship graphics cards.
Without the help of DLSS, the 4090 only just reaches an average of 125 frames.
The improvement of the 5090D in Red Dead Redemption 2 is rather modest, only close to 20%, and DLSS does not do much in this game either.
With DLSS turned on, the 5090D only runs around 170 frames.
This time we also tested the Unreal 5 game Infinity Nikki, which has been criticized by many for poor optimization.
Not only does this game put a lot of pressure on the graphics card, it also has a serious CPU bottleneck.
Even the 9800X3D can only barely run over 120 frames.
Even at 4K with maximum settings and ray tracing, the 5090D is still CPU-bound, so the improvement looks small, because the 9800X3D simply can't keep up with Nikki.
For the same reason, the frame rate does not improve after turning on DLSS super resolution.
The optimization is really bad.
In the games tested so far, the 5090D's improvements have all been fairly small, which is a bit embarrassing.
So are there any games where the 5090D has a bigger improvement?
Of course: in games that like to embrace the latest graphics technology, the 5090D performs pretty well.
For example, we once again invited our old friend Cyberpunk 2077.
Speaking of which, 2077 is a veteran of four generations.
Do you still remember which graphics card it was first co-branded with?
It was the 2080TI!
Then it launched alongside the 30 series as a ray tracing showcase game.
Later, when the 40 series was released, 2077 was again used to demonstrate the new frame generation feature of DLSS3.
Path tracing and ray reconstruction were also first supported by 2077.
Now the launch title for DLSS4 is still 2077, and the official update also arrives today.
For an old hand, I have to say that 2077 really delivers, at least in terms of graphics technology.
Now the picture quality of 2077 with full path tracing is really amazing.
So far, no other game can match it.
But the price is that, without the blessing of all this tech and trickery, it is hard to get decent performance.
Even without ray tracing, the 4090 runs 75 frames at 4K ultra quality.
The 5090D can run at 108 frames.
To be fair, the improvement is indeed quite large, nearly 44%.
If we step things up and turn on the Overdrive ray tracing setting, performance naturally collapses, needless to say.
At this point the 4090 is left with only 25 frames, and the 5090D averages only 34 frames.
Ray tracing actually looks poor at such a low frame rate, because the number of light samples is too small.
You will see smearing problems caused by ray tracing.
Even if you can tolerate the low frame rate, you will definitely not be able to tolerate the image quality, so DLSS has to be brought in at this point, because only when the frame rate is high and the light sampling is sufficient are artifacts less likely to occur.
First, leave frame generation off and turn on only DLSS super resolution at the balanced setting.
The 5090D improves to 143 frames without ray tracing, and to 78 frames with path tracing, which is finally smooth enough to play.
And if you want to pursue the most extreme gaming experience, this is where the multi-frame generation of DLSS4 comes in.
It can directly bring the frame rate of the 5090D to 248 frames.
Since the base frame rate of 78 frames is high enough, the 248-frame experience with frame generation is indeed silky smooth.
Those who watched our 4090 launch review will know that DLSS frame generation takes up some of the card's native performance, so the 248 frames are not multiplied directly from 78 frames; the base rate first drops to 62 frames and is then quadrupled to 248 frames.
That said, even though the basic performance of the 5090D has not improved much, it is already quite powerful for current games, and with the DLSS balanced setting on it is hard to find a case where the base frame rate cannot reach 60 frames.
Therefore, multi-frame generation is very usable on the 5090D.
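The arithmetic behind the 78, 62 and 248 frame figures can be sketched in a few lines. This is only an illustration of the relationship described above; the function names are made up for this example, and the overhead is simply inferred from the quoted numbers rather than measured.

```python
def implied_overhead(base_fps: float, output_fps: float, multiplier: int) -> float:
    """Fraction of the base frame rate lost to running frame generation."""
    rendered_fps = output_fps / multiplier
    return 1.0 - rendered_fps / base_fps


def generated_fps(base_fps: float, multiplier: int, overhead: float) -> float:
    """Output frame rate after paying the overhead and multiplying frames."""
    return base_fps * (1.0 - overhead) * multiplier


# Figures quoted for Cyberpunk 2077 with path tracing on the 5090D.
base_fps = 78
output_fps = 248
multiplier = 4

rendered_fps = output_fps / multiplier           # frames actually rendered
overhead = implied_overhead(base_fps, output_fps, multiplier)
naive_fps = base_fps * multiplier                # what 4x would give for free

if __name__ == "__main__":
    print(f"Rendered frames per second: {rendered_fps:.0f}")
    print(f"Implied overhead: {overhead:.1%}")
    print(f"Naive 4x estimate: {naive_fps} fps vs. reported {output_fps} fps")
```

The gap between the naive estimate and the reported figure is the cost of frame generation itself, which is why the base frame rate matters so much for lower-end cards.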
I am just more worried about the lower-end 50 series cards.
If the native performance itself cannot be guaranteed, can the result of frame generation be guaranteed?
We will also specifically verify latency and the other aspects everyone is concerned about later.
Another game that attaches great importance to graphics technology is Alan Wake 2.
This is one of the first games on the PC to support mesh shading and path tracing.
And now Alan Wake 2 is the first to be updated with support for Mega Geometry, which is expected to improve the performance overhead and image quality under ray tracing.
We tested it, and even without ray tracing, at the 4K high preset without DLSS, it is a very performance-hungry game.
The 4090 can only run at 76 frames.
The 5090D can improve by about 34% and run above 100 frames.
It gets very miserable once ray tracing is turned on.
Everyone knows the power of path tracing.
At this point, the 4090 only has 36 frames left.
The 5090D runs at 53 frames; the improvement is indeed quite large, over 47%, but obviously it still can't be said to run smoothly.
At this point, we have to rely on DLSS.
First, we use super resolution to raise the frame rate to 97 frames, and then use the multi-frame generation of DLSS4 to raise it to nearly 300 frames.
The monitors of today's players can hardly keep up with the 5090D's frame generation pace.
Another game that supports DLSS4 is Hogwarts Legacy.
This game is also very friendly to the 50 series.
The pure rasterization performance of the 5090D is 35% stronger.
The ray tracing is 29% stronger.
It can even run to 378 frames after turning on multi-frame generation.
To be honest, though, the bottleneck in this game is not so much the graphics card as the CPU, which causes all kinds of weird stutters.
In actual play, DLSS4 can hardly help in this situation.
We can only rely on the game developers to optimize it further.
The finale is of course Black Myth: Wukong.
I believe everyone knows that the performance requirements of Black Myth are really no joke.
With hardware ray tracing off and only UE5's Lumen enabled, at 4K cinematic quality without DLSS, the 4090 has only 43 frames and the 5090D has only 57 frames.
Turn on hardware ray tracing and it simply gives up.
The 5090D can only run a little over 30 native frames.
How refreshing.
After turning on DLSS super resolution, things do ease up a lot, and at the 4K balanced setting the 5090D can finally run close to 70 frames.
Black Myth has also said that it will add support for DLSS4 multi-frame generation, but the current version has not been updated.
With DLSS3 frame generation, the 5090D can for now run an average of 115 frames.
Overall, compared to the 4090, if we put aside the multi-frame generation of DLSS4, the improvement in traditional rasterized games is close to 30%.
The improvement in ray tracing games is 33%.
If I remember correctly, this is probably the smallest generational improvement for a flagship card in the past ten years.
And because the process node has not moved forward, power consumption has risen about as much as performance has.
It is common for demanding games to consume 500W.
Considering that N cards are only updated once every two years, this small improvement would indeed not be enough without the blessing of DLSS4.
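The per-game percentages quoted throughout the test section all come from the same simple ratio. As a minimal sketch, the snippet below recomputes the uplift for a handful of the frame rate pairs mentioned in this review; the dictionary labels are shorthand written for this example, and the selection is not the full set of ten games behind the averages above.

```python
def uplift_percent(old_fps: float, new_fps: float) -> float:
    """Relative improvement of new_fps over old_fps, in percent."""
    return (new_fps / old_fps - 1.0) * 100.0


# (4090 fps, 5090D fps) pairs quoted in the review, native rendering.
quoted_results = {
    "Stalker 2, 4K epic": (55, 70),
    "Black Ops 6, 4K max": (123, 149),
    "Cyberpunk 2077, 4K, no ray tracing": (75, 108),
    "Cyberpunk 2077, 4K, Overdrive": (25, 34),
    "Alan Wake 2, 4K, path tracing": (36, 53),
    "Black Myth: Wukong, 4K, Lumen only": (43, 57),
}

uplifts = {name: uplift_percent(old, new) for name, (old, new) in quoted_results.items()}

if __name__ == "__main__":
    for name, pct in uplifts.items():
        print(f"{name}: +{pct:.1f}%")
```

Running it reproduces the figures stated for individual games, such as the nearly 44% gain in Cyberpunk 2077 without ray tracing.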
At the same time, the China-specific 5090D currently shows no difference in gaming performance compared to the 5090.
The slight differences in DLSS performance can basically be put down to measurement error.
Unlike with the 4090D, the cuts made to the 5090D this time really do not affect gaming performance.
The specific mechanism may not be known until more AI tests are done.
Anyway, all our earlier worries about the 5090D's gaming performance can really be put to rest.
Of course, if Lao Huang dares to sell this thing at a higher price, it is probably because he has confidence in DLSS4.
If its costs can be dealt with, this will indeed be a game changer.
Compared with the mediocre performance improvements, DLSS4 is the real killer feature of the 50 series graphics cards.
We briefly went over the features of DLSS4 in the last preview video.
Today we will actually test the image quality, frame rate and latency of DLSS4.
DLSS4 contains multiple improvements.
The first thing we want to talk about is an upgrade that older 20, 30 and 40 series cards can also enjoy: the super resolution processing has been drastically updated.
This is a new feature that can be used by all RTX graphics cards.
Simply put, it means changing the model used for super resolution from a CNN to a Transformer.
As we said last time, the main reason for switching from CNN to Transformer is to let the model pay attention to the global picture.
The picture information a CNN works with is relatively local, while the Transformer draws on more information from the entire picture.
We first use 2077, with DLSS super resolution uniformly set to the performance level.
In this mode, the internal rendering resolution is only 1080P, upscaled to 4K, which is very suitable for seeing differences in image quality.
Let's look at the first scene.
At night, we tried to observe the image while the wheels were turning.
DLSS under the old CNN model showed a very strange image.
The tire and the ground blurred together mysteriously, completely smeared.
If you look closely at the tire under the Transformer model, you can indeed see a slight smear on the lettering, but overall it is very close to native rendering.
This is actually quite an extreme case, because the difference in brightness between the tire and the ground is very small.
The CNN may not be able to correctly distinguish the boundary between the tire and the ground here, and it does not analyze the movement of the tire as a whole object, so it gets confused.
The Transformer obviously understands the movement of the tire as a whole object, so it is much clearer in this scene.
Another easy-to-find picture error in previous DLSS versions involves thin railings or thin wires in the picture.
When there is significant movement, flickering is easy to notice, as if the game had anti-aliasing turned off.
For example, when we drive forward, this fence does not flicker in the native picture, but it flickers noticeably under the CNN.
The flickering problem is significantly improved in the Transformer version, but some flickering can still be detected.
Nvidia said that they analyzed a large number of examples of the model making wrong choices and trained on them accordingly.
This is one of the examples used in training, and the smearing problem can be greatly improved.
However, the current Transformer version of DLSS has several obvious super resolution failures.
First of all, its handling of translucent objects is even worse than the CNN version's.
For example, look at the red exclamation mark sign in this scene.
In Transformer mode it has a very serious flickering problem, and the exclamation mark even tears.
This may be because the Transformer model pays attention to the high-frequency details of the barbed wire, the neon light pattern and the background image all at once, and after so many details are superimposed it ends up looking like this.
In fact, the native frame already has certain problems here.
But after being "noticed" by the attention mechanism, the small flaws are amplified into a different picture error.
Yet if you turn the camera to face this sign head-on, you will discover that DLSS has actually filled in the missing parts for you.
So DLSS does understand the picture, but its way of patching things up does not look best in every scenario.
We also found that both models are affected by the occlusion of semi-transparent objects.
Without exception, they give the same wrong answer.
To solve this problem, the model may need more targeted training.
The current model can actually understand the picture, but it may not know what a good picture looks like.
We can only say that it is promising for the future.
Another problem we found is that the car roof should reflect the changing neon light panel at the top of the screen.
There is no problem in native mode or with the CNN.
But when we switch to the Transformer, something is wrong.
Why does the reflection on the roof not move?
In the end, the neon light panel in the reflection on the roof actually blends into one strange color.
This is because the previous CNN model could only reason about a small convolution-kernel-sized patch within a single frame.
With the Transformer model, the model can "notice" the entire frame and context from many frames in the past, but it does not know what to do with this information, so it gives a strange "mixed colors" answer, which causes a rendering error.
Having the results of path tracing cancelled out by DLSS is still a bit absurd.
In general, the Transformer version of DLSS has solved some graphics defects that were difficult to fix under the CNN model, once again improving the usability of DLSS super resolution.
The most important thing is that all RTX graphics cards can benefit from this change.
Bear in mind that the current version of DLSS super resolution is already very close to native image quality.
The Transformer has undoubtedly raised the upper limit of DLSS again.
Of course, it still has some minor problems, and time needs to be spent ironing out the issues introduced by the attention mechanism.
Considering that it is still a Beta version, I think we can even look forward to DLSS becoming infinitely close to native rendering in the future, or even surpassing it in details.
Having covered super resolution, it's time to talk about frame generation.
This time, DLSS4's frame generation has replaced every stage with AI models.
In other words, this new solution no longer uses the GPU's built-in hardware optical flow unit like DLSS3 did, and instead uses the Tensor Cores to perform optical flow calculations.
I know that many of you are very resistant to turning on frame generation.
Although DLSS3 is already top-notch in the industry, it still has problems such as broken image edges and unstable frame time curves.
So has DLSS4 improved on it?
Let's make an actual comparison to see whether DLSS4 improves image quality over DLSS3 when both use 2x frame generation.
The easiest place to find flaws in DLSS3's frame generation is at the edges of moving objects and shadows.
When driving in the sun in 2077, you can clearly see that the shadow around the car body breaks up, whereas the edge of the shadow in the native frame is relatively sharp.
DLSS4 is closer to the native frame, though of course a small amount of detail is still lost.
Another example is the side window.
Its details are complicated because the side window is made of a translucent material, with both transmission and reflection.
When the vehicle is moving, there are obvious signs of breakup on the side window with DLSS3.
Look: when this car turns, the side window is almost gone!
In fact, DLSS4 occasionally breaks up too, but the situation is much better.
There is no problem with the native frame here, and DLSS4 is indeed closer to the native picture than before.
You also know that the 50 series graphics cards introduce up to 4x frame generation.
Is the image quality cost of 4x frame generation very high?
Bear in mind that only 1 pixel out of 16 is actually rendered at this point.
In fact, the 4x frame generation of DLSS4 is very usable at around 200fps, that is, with a base frame rate of around 60fps.
Compared closely with DLSS4 2x frame generation, it is hard to see flaws in the picture at a glance.
Only if you look carefully will you notice, for example, perceptible smearing on the antenna at the rear of this car, and even then the picture quality is no worse than the 2x frame generation of DLSS3!
This is definitely the most important technology of this generation.
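The "1 pixel out of 16" figure follows from combining the two techniques used in this test: super resolution at the performance level (1080P rendered, 4K displayed) and 4x frame generation. A minimal sketch of that calculation, with function and variable names chosen purely for illustration:

```python
def rendered_pixel_fraction(render_res, output_res, frame_gen_multiplier):
    """Share of displayed pixels that come from actual rendering.

    render_res and output_res are (width, height) tuples; frame_gen_multiplier
    is the number of displayed frames per rendered frame (1 means no frame
    generation).
    """
    render_w, render_h = render_res
    output_w, output_h = output_res
    spatial = (render_w * render_h) / (output_w * output_h)
    temporal = 1 / frame_gen_multiplier
    return spatial * temporal


# Performance-level super resolution (1080P -> 4K) with 4x frame generation.
fraction = rendered_pixel_fraction((1920, 1080), (3840, 2160), 4)

if __name__ == "__main__":
    print(f"Rendered share of displayed pixels: 1 in {1 / fraction:.0f}")
```

One quarter of the pixels in each rendered frame, times one rendered frame in every four displayed, gives one sixteenth.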
Another thing we are paying attention to is frame pacing stability.
Although DLSS3 on the previous 40 series raised the frame count, the frame times were not necessarily stable, especially when the base frame rate was relatively low.
Moving the mouse with frame generation on still felt a bit uneven.
This time, the 50 series newly introduces the Flip Metering feature, which uses the GPU to control the frame interval, helping DLSS4 stabilize frame pacing and reduce stutter in frame generation.
Is this thing useful?
We also specifically tested it.
This is the frame time in 2077 with DLSS3, without stabilization, and this is the frame time with DLSS4 working together with Flip Metering.
Both use 2x frame generation from a base of 60 frames.
You can see that the frame time dispersion after stabilization is significantly lower.
We calculated the coefficient of variation: with stabilization, frames deviate from the mean by 5%, whereas before they deviated from the mean by 8.7%.
It seems that with the support of the new technology, DLSS4 frame generation has indeed become much more stable.
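The coefficient of variation used above is the standard deviation of the frame times divided by their mean. The following is a minimal sketch of that calculation. The two sample lists are invented purely to illustrate steadier and less steady pacing; they are not the measurements from this review, and whether the review used the population or sample standard deviation is not stated, so the population form is an assumption.

```python
from statistics import mean, pstdev


def coefficient_of_variation(frame_times_ms):
    """Population standard deviation of frame times divided by their mean."""
    return pstdev(frame_times_ms) / mean(frame_times_ms)


# Hypothetical frame time logs in milliseconds (illustrative values only).
steady_pacing = [8.3, 8.4, 8.2, 8.5, 8.3, 8.4]
uneven_pacing = [7.2, 9.6, 7.5, 9.3, 7.0, 9.4]

if __name__ == "__main__":
    for label, samples in (("steady", steady_pacing), ("uneven", uneven_pacing)):
        cv = coefficient_of_variation(samples)
        print(f"{label}: mean {mean(samples):.2f} ms, CV {cv:.1%}")
```

Two runs with the same average frame rate can have very different coefficients of variation, which is why this metric is a better indicator of perceived smoothness than the frame count alone.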
We actually played this at 4K on a 4K resolution monitor.
The experience of 200fps is really great.
The subjective feeling is indeed very close to native 200fps.
Think about it.
Suddenly one day you can play 2077 with path tracing at 4K and 200fps.
And if I didn't tell you, it would be difficult for you to notice any difference.
That is the power of DLSS4.
But what everyone is most concerned about is definitely the latency of DLSS4.
Let's test it with two games and turn on multi-frame generation to see how high the latency is.
In Alan Wake 2, our native frame rate averages 51 frames.
At this point, the measured latency is 56ms.
This game appears to have Reflex turned on by default, so the latency itself is not particularly high.
On this basis, turning on DLSS3 frame generation introduces about 17ms of additional latency, bringing it to 72ms.
On the one hand, the optical flow calculation of frame generation itself increases latency.
On the other hand, as mentioned in our 4090 review, frame generation consumes native GPU performance, which lowers the base frame rate: it drops from 51 frames to 42 frames and is then doubled to 84 frames.
The latency has to be reckoned from 42 frames, so naturally some delay is added.
After updating to DLSS4, we can see that this latency has indeed been significantly reduced.
With 2x frame generation, the latency is now only 9ms higher, at 65ms, so the increase is only half of what it was.
Even with 3x or 4x frame generation, the latency is still lower than with the original DLSS3.
Besides the latency improvement from the new model, DLSS4 also costs less performance.
You can see from the frame rate that the base frame rate does not drop as much as before, so the latency is naturally reduced.
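To get a feel for how much of the added latency could come from the lower base frame rate alone, the frame rates can be converted to frame times. This is a rough sketch built only on the Alan Wake 2 figures quoted above; real input latency depends on the whole render pipeline, so the frame time difference should be read as one contributing term, not as a full latency model.

```python
def frame_time_ms(fps: float) -> float:
    """Time spent on one rendered frame, in milliseconds."""
    return 1000.0 / fps


# Alan Wake 2 figures quoted in the review.
native_fps, dlss3_base_fps = 51, 42
native_latency_ms, dlss3_latency_ms, dlss4_latency_ms = 56, 72, 65

# Extra time per rendered frame once DLSS3 lowers the base frame rate.
frame_time_penalty_ms = frame_time_ms(dlss3_base_fps) - frame_time_ms(native_fps)

# Added latency computed from the quoted totals (the review rounds the
# DLSS3 figure to about 17 ms).
added_dlss3_ms = dlss3_latency_ms - native_latency_ms
added_dlss4_ms = dlss4_latency_ms - native_latency_ms

if __name__ == "__main__":
    print(f"Frame time at {native_fps} fps: {frame_time_ms(native_fps):.1f} ms")
    print(f"Frame time at {dlss3_base_fps} fps: {frame_time_ms(dlss3_base_fps):.1f} ms")
    print(f"Per-frame penalty: {frame_time_penalty_ms:.1f} ms")
    print(f"Added latency, DLSS3 2x: {added_dlss3_ms} ms; DLSS4 2x: {added_dlss4_ms} ms")
```

The per-frame penalty is only a few milliseconds, which fits the explanation that the optical flow step and the reduced base frame rate both contribute to the total.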
Of course, Alan Wake 2 now also supports the new Mega Geometry, which brings a slight performance improvement as well (without frame generation: from 49 frames to 51 frames).
We also tested another game, 2077.
The latency of DLSS4 in this game also improved.
The loss of native performance was reduced, but not as obviously as in Alan Wake 2.
With 2x frame generation, the latency of DLSS4 is only 1ms lower.
However, the latency does not seem to increase much when going up to 4x.
The most important thing is that frame generation forces N cards to turn on Reflex.
As long as Reflex is on, the latency is already much lower than native.
For example, 2077 runs at 60 frames here with a native latency as high as 70ms, and turning on Reflex cuts the latency in half.
Turning on frame generation in effect spends the saved latency on a higher frame rate.
If the base frame rate is high enough, this is not a problem.
But what I think is the most doubtful situation is the performance of the frame replenishment at a low frame rate . For example, from 3 or 40 frames to 4 times the replenishment to 120 frames,
. For example, from 3 or 40 frames to 4 times the replenishment to 120 frames, can it still be played?
In fact, in terms of smoothness, 120 frames is quite similar.
It is indeed very smooth, and it definitely does not have the stuck feeling of 30 frames.
It feels like it has evolved compared to the low frame performance of DLSS3. In
terms of latency, it is 30 frames. The delay
is definitely not that good, but considering that the original frame rate is only a little bit , if you can accept cloud gaming The delay is actually not bad, but if you want to say how good the experience is , it has not been played for a long time. You can still feel the artifacts caused by various low frames.
On the one hand, UI text easily breaks up. On the other hand, in some cases frame prediction simply does not work well.
For example, in Cyberpunk 2077, when you aim down a scope at such a low base frame rate, if the scope's outline is fairly large and happens to cut the image inside it off from the rest of the picture, the image inside the scope jitters in all sorts of ways as you move and cannot be interpolated properly.
It gives the impression that, although everything outside the scope is at 120 fps, the inside of the scope is still only 30 fps, plus all kinds of broken-image artifacts.
This experience is really not very good.
In short, for DLSS 4 frame generation it is best to make sure the frame rate with frame generation turned off stays above 45 fps, so that image corruption does not get too severe.
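The arithmetic behind these scenarios can be sketched in a few lines of Python. This is only a rough illustration, not how DLSS 4 works internally: the function and class names are hypothetical, the 45 fps floor is the rule of thumb from this video rather than an NVIDIA specification, and the sketch assumes the rendered frame rate does not drop when frame generation is turned on.

```python
from dataclasses import dataclass

# Rule-of-thumb floor taken from the video, not an official requirement.
MIN_BASE_FPS = 45.0


@dataclass
class FrameGenEstimate:
    base_fps: float            # frames actually rendered per second
    multiplier: int            # 2 for 2x, 4 for 4x frame generation
    output_fps: float          # frames displayed per second
    base_frame_time_ms: float  # time between two rendered frames
    meets_floor: bool          # True if base_fps reaches the 45 fps floor


def estimate_frame_gen(base_fps: float, multiplier: int) -> FrameGenEstimate:
    """Estimate displayed fps for a given rendered fps and multiplier.

    Simplification: ignores the rendering cost of frame generation itself.
    """
    if base_fps <= 0 or multiplier < 1:
        raise ValueError("base_fps must be positive and multiplier at least 1")
    return FrameGenEstimate(
        base_fps=base_fps,
        multiplier=multiplier,
        output_fps=base_fps * multiplier,
        base_frame_time_ms=1000.0 / base_fps,
        meets_floor=base_fps >= MIN_BASE_FPS,
    )


if __name__ == "__main__":
    # Rendered frame rates discussed in the video, all with 4x generation.
    for fps in (15.0, 30.0, 60.0):
        print(estimate_frame_gen(fps, 4))
```

The displayed frame rate scales with the multiplier, but the time between rendered frames does not change, which is why a 30 fps base still feels like 30 fps in terms of responsiveness.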
However, we still wanted to try something more extreme.
What if I reduce the base frame rate even further?
Say there are only ten to twenty frames per second: what does the picture look like if we generate 4x up to 60 fps?
This is what it looks like.
It really has a weird kind of smoothness.
The smoothness of the 60 fps target, combined with all kinds of strange flickering and broken images, looks just like an AI-generated game video.
Well, it really is AI-generated, although it is only AI optical flow.
Coupled with the terrible latency of a base frame rate in the teens, playing it feels like being drunk on bootleg liquor.
Of course, we are just messing around.
Such a low frame rate is far below what DLSS 4 was designed for.
Frame generation has its limits, so if you want a comfortable experience, make sure the base frame rate is high enough first.
Keep in mind that the whole reason the 5090 gets a D is to limit its AI performance.
So I was also curious about the AI performance of the 5090D.
Will it be cut down to the level of the 4090, and how big is the gap between it and the 5090?
We used Ollama to deploy a 32B Tongyi Qianwen (Qwen) 2.5 model, had it translate the "Chu Shi Biao", and recorded its output speed.
In the end, the 5090 and the 5090D produced almost the same result, about 50% higher than the 4090.
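The video does not say exactly how the output speed was recorded. One plausible way, sketched below, uses the `eval_count` (generated tokens) and `eval_duration` (nanoseconds) fields that Ollama includes in its generate response. The helper names and the sample numbers are hypothetical and are not the video's measurements.

```python
def tokens_per_second(response: dict) -> float:
    """Generation speed computed from an Ollama generate response."""
    eval_count = response["eval_count"]            # number of generated tokens
    eval_duration_ns = response["eval_duration"]   # generation time in nanoseconds
    if eval_duration_ns <= 0:
        raise ValueError("eval_duration must be positive")
    return eval_count / (eval_duration_ns / 1e9)


def relative_gain(new: float, baseline: float) -> float:
    """Fractional speedup of `new` over `baseline` (0.5 means 50% faster)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return new / baseline - 1.0


if __name__ == "__main__":
    # Made-up response for illustration: 500 tokens generated in 10 seconds.
    sample = {"eval_count": 500, "eval_duration": 10_000_000_000}
    print(f"{tokens_per_second(sample):.1f} tokens/s")
```

Comparing two cards then comes down to calling `relative_gain` on their tokens-per-second figures for the same prompt and model.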
The 512-bit GDDR7 memory bandwidth is very powerful.
However, Ollama is a memory-bandwidth-bound scenario after all and does not demand much core compute.
We also used UL's Procyon test suite to look at their AI performance.
The 5090D is still the same as the 5090.
Compared with the 4090, FP16 and FP8 performance improve by nearly 50%.
The FP4 performance optimized by the Blackwell architecture is indeed very eye-catching.
However, FP4-quantized models are not very common at present.
Whether it is practical enough depends on future development.
In any case, these AI tests cannot yet show what exactly has been cut from the 5090D.
The answer can only come after PyTorch is updated.
It just so happened that CUDA Toolkit 12.8 was released today.
So we compiled a PyTorch build compatible with the 5090D at lightning speed and set up a BERT fine-tuning environment to see how the new card performs when training AI models under different data types.
That covers the traditional FP32 and FP16, NVIDIA's more recently introduced TF32, and the BF16 commonly used for large models.
In any case, in every test we ran there is no difference between the 5090D and the 5090.
Maybe the 5090D is restricted in some scenarios, but we can only wait to find out later.
Having covered performance, we should also take a look at the 5090's power consumption.
To be honest, the 575 W TGP this time is downright scary.
We recorded its power draw while running 3DMark Time Spy. It actually does not hit the power limit in the first scene.
Even in the heavily loaded GT2 scene, it only hits the limit very occasionally.
So its actual power draw is below 575 W in most cases. Compared with the previous-generation 4090, however, the 5090's average power draw has risen significantly, by about 30%.
Basically, power consumption has gone up by as much as performance has, so the performance per watt is almost unchanged.
It has also successfully become the most power-hungry "nuclear bomb" generation in history.
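The efficiency claim is simple arithmetic, sketched here with the rounded figures quoted in this video (about 30% more performance and about 30% more average power). The function name is hypothetical.

```python
def perf_per_watt_ratio(perf_gain: float, power_gain: float) -> float:
    """New-to-old performance-per-watt ratio from fractional gains.

    perf_gain and power_gain are fractions, e.g. 0.30 for +30%.
    """
    if power_gain <= -1.0:
        raise ValueError("power_gain must be greater than -1.0")
    return (1.0 + perf_gain) / (1.0 + power_gain)


if __name__ == "__main__":
    # Rounded figures from the video: +30% performance, +30% average power.
    ratio = perf_per_watt_ratio(0.30, 0.30)
    print(f"perf/W ratio vs. previous generation: {ratio:.2f}")
```

A ratio near 1.0 means the extra performance was paid for almost entirely with extra power.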
However, ever since the previous-generation 40 series, the cooling on partner (non-reference) cards has been outrageously overbuilt.
The extremely heavyweight ROG Night God 5090D we have here has a default power limit of 600 W, enough to fully feed the GB202 core.
In the end, we overclocked the core to more than 3.1 GHz, and the video memory was impressive too, going straight to 32 Gbps.
Time Spy reached 53,000 points, and Time Spy Extreme also exceeded 28,000 points.
That is a performance improvement of about 10% over the default state.
It is a considerable gain for a flagship card.
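To put the 32 Gbps memory overclock in perspective, peak memory bandwidth follows directly from bus width and per-pin data rate. The sketch below uses the 512-bit bus and the 32 Gbps figure from this video. The 28 Gbps stock data rate is not stated in the transcript and is an assumption here, and the function name is hypothetical.

```python
def memory_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Theoretical peak bandwidth in GB/s: bus width times per-pin rate, in bytes."""
    if bus_width_bits <= 0 or data_rate_gbps <= 0:
        raise ValueError("bus width and data rate must be positive")
    return bus_width_bits * data_rate_gbps / 8.0


if __name__ == "__main__":
    stock = memory_bandwidth_gb_s(512, 28.0)        # 28 Gbps stock rate is assumed
    overclocked = memory_bandwidth_gb_s(512, 32.0)  # 32 Gbps is from the video
    uplift = overclocked / stock - 1.0
    print(f"stock: {stock:.0f} GB/s, overclocked: {overclocked:.0f} GB/s, "
          f"uplift: {uplift:.1%}")
```

These are theoretical peaks. Real workloads only approach them when they are bandwidth-bound, as the LLM inference test is.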
Unfortunately, there is no 5090D FE in China.
I'm really curious how such a huge nuclear bomb can be tamed by a thin dual-slot cooler.
We can only try to get hold of a 5090 FE later and give it a try.
Overall, I can only describe the 5090's performance and energy-efficiency improvements as lackluster.
NVIDIA has come up with a lot of new ideas in graphics pipeline technology, especially AI, which will work its way ever deeper into the pipeline.
You could say the 50 series marks the beginning of a new era for future game development.
But for enthusiasts who buy it now to play today's games, this starting point is a bit too low.
The die area is 750 square millimeters, yet a chip that large delivers only about a 30% generational improvement after two years.
I have to ask: is the development of traditional GPU performance really coming to an end?
Its only bright spot is the 32 GB of video memory.
It may be worth the price if you buy it for work.
Maybe all future GPU gains will have to come from software.
Fortunately, DLSS 4's multi frame generation really left a deep impression on me. Even in a strict A/B comparison, I did not clearly notice the cost.
My intuitive impression is that smoothness improves noticeably while latency is not much higher.
These "pre-made meal" frames are not unpalatable, but they still require a decent base frame rate to start with.
So multi frame generation is genuinely useful for high-end cards such as the 5090, but not necessarily for future cards such as the 5060.
After all, what mainstream players need more is going from 30 fps to 60 fps, not from 60 fps to 200 fps.
Finally, the China-exclusive 5090D left us relieved.
The restrictions have almost no impact on gaming performance, and DLSS is not affected either.
If you don't do AI at all and just want to play games, you really don't have to worry about the special edition.
In short, this generation's 5090 can be summed up in one sentence.
It remains the king, but there are no surprises.
It all comes down to "technology and hard work."
That is everything for our first test of the 5090.
If you liked this episode, don't forget to like, share, save, and support us.
This episode was very hard to make.
There were so many DLSS screen recordings that it nearly killed us.
We also wish everyone a happy new year.
See you in the next episode.