
#TechUpdate: Inside Broadcom’s 102.4 Tbps “Davisson” Switch: Co-Packaged Optics for AI Networking

By NextGenInfra


Topics Covered

  • AI Traffic Explodes Optical Bandwidth Needs
  • CPO Achieves Million Hours Link Flap Free
  • CPO Delivers 70% Power Reduction
  • CPO Roadmap Scales to 400G Per Lane

Full Transcript

Hi everybody. Broadcom has just unveiled its third-generation co-packaged optics platform, the Tomahawk 6 Davisson, delivering up to 102.4 terabits per second of Ethernet switching capacity for next-generation AI clusters. This is definitely a major milestone in optical integration and power efficiency. Joining me today are Manish Mehta and Rajiv Panki of Broadcom's switch and optical systems division to discuss how this technology is advancing. So, welcome.

>> Thank you. Appreciate it. Thanks, Jim.

>> So, Rajiv and Manish, the first question is about setting the stage: the need for CPO. Can you start by explaining the scale of east-west AI traffic that we're seeing today and why traditional pluggables can't keep pace?

>> Sure. When we launched our CPO development efforts, which we announced about five years ago, we really weren't developing with AI in mind. We were developing solutions for front-end cloud networks, to replace traditional optics, which had some challenges in how they were going to scale. But what we found over the last few years is that AI training scale-out networks just exploded the optical bandwidth requirement versus anything we had seen on front-end cloud. It was about an order of magnitude increase in the optical bandwidth required per processor, and that has led to a significant increase in volume requirements and a rapid increase in how quickly new optical generations have to be released. Traditional pluggable optics are serving the need at the moment, for sure, but they continue to increase in power generation over generation, and the signal integrity challenges continue to become more difficult as signals progress from 100 gig to 200 gig from the core ASICs out to the front panel. CPO is a way of densifying that technology and placing it very close to the ASIC to simplify the interconnect, reduce the need for complex signal processing, increase laser split ratios, and improve upon a lot of the challenges that traditional optics are facing in meeting the needs of that east-west, high-rate, scale-out training traffic.

>> Yeah. It's been a really interesting evolution as well, right? You've gone through the first, the second, and now the third generation of CPO, looking to see where that meets the market demand. You're on track with that now, you think?

>> Yeah, we're pretty excited about our third-generation development. Our first generation was Tomahawk 4 Humboldt. It was our first development of CPO, and we were very fortunate to have a lead partner who did some small-scale deployments. That allowed us to build a manufacturing platform and start building a partner ecosystem in the supply chain so we could ultimately scale CPO. Our second generation was Tomahawk 5 Bailey, and it's been a great story: the first CPO in the industry to really get into multiple customer networks and customer environments and be tested for reliability and link stability. It also proved that we can hit the power targets and power improvements that we were originally going for with CPO. So I think with Gen 3, with Tomahawk 6 Davisson, we now have a product that meets critical customer needs. It's high radix; it supports the 512 ports available off of Tomahawk 6, fully extracted optically; and it's going to be sampling to customers at a time when the next phase of training deployments is being architected. We're pretty excited about the timing of this release.
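As a rough sanity check on the headline numbers, the figures quoted here are self-consistent if you assume the 512 optical ports each run at a 200 Gb/s effective rate (an illustrative breakdown, not an official spec sheet):

```python
# Illustrative arithmetic only: 512 ports at an assumed 200 Gb/s
# each, checked against the quoted 102.4 Tb/s switching capacity.
ports = 512
gbps_per_port = 200  # Gb/s, assumed per-port rate

total_tbps = ports * gbps_per_port / 1000
print(total_tbps)  # 102.4 Tb/s, matching the headline figure
```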

>> Yeah, it looks really exciting. Rajiv, could you perhaps walk us through some of the architecture and packaging that's gone into Davisson, and especially the capabilities that have been brought to bear from TSMC with this COUPE technology?

>> The COUPE process from TSMC is looking very good. The Davisson engine obviously uses that attach, and that's something new we've done on this Gen 3 Davisson. The platform we're using leverages everything from our Gen 1 and Gen 2: the back-end processes, the OSATs, our assembly, our fiber attach. All of that we built upon and hardened. And touching on what you were saying earlier, we brought CPO out to compete on power, cost, and reliability versus pluggables, but this recent data from our main end-customer partners is really showing link stability, and that translates into significant value as you start training these large AI clusters of GPUs. Now you don't have to go back to checkpoints, and you leverage the robustness of the CPO platform, which has significant value over any type of pluggable.

>> Let's talk a little bit more about that link-flapping and reliability issue. I know you were quoting a big number: 1 million, what is it, 1 million hours of testing?

>> Yeah. One of our customers, Meta, published a paper and a study in which they stated that the first million device-hours of operation of Tomahawk 5 Bailey ran link-flap free, and that's a big deal. As Rajiv was mentioning, link flaps have been a sore point in the migration from front-end cloud networking to AI training, where a link event doesn't have redundancy to overcome it: you need to go back to a checkpoint in that training job and restart, and that's lost XPU or GPU utilization. So this was always one of our theses: if you take silicon photonics and mature CMOS nodes, build optical engines using high-volume, proven foundry and OSAT processes, and pair that with what we've now proven through our first two generations of back-end assembly and test, with soldered optical engines on a common substrate with an ASIC, then you can actually get silicon-level performance and reliability for the optical interconnect. That's versus the traditional perspective on optical interconnects, which tend to be discrete devices in a pluggable format. This is the first evidence from a customer, in their own environment, that that hypothesis has a very strong foundation and is being proven in real hardware.

>> And what exactly is the reason, then? Is it because the packaging has hermetically sealed it in and you're not getting dust on the optics?

>> So, it's a tightly integrated solution. The components in a normal pluggable transceiver can operate very well on their own, but the pluggable transceiver is housed in a module that is inserted mechanically into the front panel of, say, a switch system or some other type of ASIC system. There is system-to-system variability on the electrical links from the substrate through the PCB to the front panel. You can have variability within a system between the different traces, and variability system to system on those traces: the design, the PCB vendors, the material systems. There's variability on the mechanical connectors. All of that generates a lot of sources of variation in link performance, and as speeds continue to increase, those challenges emerge as larger pain points in training networks. Here, by contrast, you have a known-good link, what we call TP1 to TP4, at manufacturing. During CPO testing, that electrical-side link is defined and characterized, and we think that's a really big reason why we're seeing significant improvement in performance. They also characterized it in a way that may be more relevant for a broader networking audience that isn't into the details of what a link flap is: they showed experimentally that CPO serviceable failures are 5x lower in this initial study than for pluggable transceivers, and we think that's just going to continue to improve; we're still in the early days of CPO, and we still have some things to work through. And then there's the one I find most exciting: an MTBF improvement that they calculated to deliver a 90% training-efficiency improvement for a 24K GPU cluster.
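The connection between link flaps and training efficiency can be sketched with a simple checkpoint-rollback model. This is a back-of-envelope illustration, not the model from the Meta study; every parameter below is made up:

```python
# Back-of-envelope sketch (not from the Meta paper): each link flap
# forces a rollback to the last checkpoint, so lost time per flap is
# roughly half a checkpoint interval of redone work plus a fixed
# restart overhead. All parameters are illustrative assumptions.
def training_goodput(flaps_per_1k_hours, checkpoint_interval_min=30,
                     restart_overhead_min=10):
    """Fraction of wall-clock time spent on useful training."""
    flaps_per_hour = flaps_per_1k_hours / 1000
    loss_per_flap_h = (checkpoint_interval_min / 2 + restart_overhead_min) / 60
    lost_fraction = flaps_per_hour * loss_per_flap_h
    return max(0.0, 1.0 - lost_fraction)

# A flaky fabric vs. a flap-free one (rates are hypothetical):
print(training_goodput(flaps_per_1k_hours=100))  # noticeably degraded
print(training_goodput(flaps_per_1k_hours=0))    # 1.0 -- no lost work
```

The point of the model: because a flap wastes work from *every* GPU in the job at once, even a small per-link flap rate multiplies across a 24K-GPU cluster into a large utilization loss.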

>> Interesting. I want to go back to the power question, because another really big item in your announcement is the quoted number: a 70% reduction in power compared to pluggables. Is that right? And how impactful is that in terms of building out these really massive AI data centers?

>> Yeah, so it's real. We proved it on Bailey, where a traditional 800-gig transceiver, at the time we released, was operating at about 16 watts, and we hit our target of less than 5.5 watts. That's really two things: you eliminate the DSP, and you're running CMOS for your EIC. Because we're so tightly integrated, we can use pretty advanced CMOS nodes with low power consumption for the TIA and the driver circuitry. And we're expecting the same level of power savings for the Davisson generation as well. I think the value of this is certainly important over the long term. We fully recognize that in the short term, sites and racks are provisioned to support pluggable transceivers, as they should be, because that's the known technology that's going to continue to deploy in very high volume. So in the short term, we think it's really this link performance that is driving a lot of excitement and the next phase of adoption by end users. Over generations, as their comfort level with CPO grows, I think you'll start to see more provisioning done with CPO in mind for power, and that could be very important in improving rack density of GPUs, because you don't have to use as much power on the networking.

>> And Jim, just to add to that: the data that Meta presented actually shows a 65% reduction between the CPO platform that they used, the 100-gig Tomahawk 5 Bailey, and 800-gig and 400-gig modules running at 100 gig per lane. So as you move to 200 gig per lane, that's where we're saying: look, it's 65%, measured, at 100 gig per lane today; you should be in the 70s when you're at 200 gig per lane.
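The per-module figures quoted above are enough to reproduce the claimed reduction. A sketch of the arithmetic, with a hypothetical port count to show the rack-scale impact:

```python
# Illustrative arithmetic using the figures quoted in the interview:
# ~16 W for a traditional 800G pluggable vs. < 5.5 W per equivalent
# CPO port. The 512-port scale-up is a hypothetical example.
pluggable_w = 16.0
cpo_w = 5.5

reduction = 1 - cpo_w / pluggable_w
print(f"{reduction:.0%}")  # ~66%, in line with the measured 65%

ports = 512
saved_kw = ports * (pluggable_w - cpo_w) / 1000
print(f"{saved_kw:.2f} kW saved per switch")  # optics power alone
```

Several kilowatts per switch, multiplied across the hundreds of switches in a large training fabric, is the power budget the speakers suggest could eventually be reallocated to GPU rack density.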

>> Right, right. Everything's moving faster and faster, which leads into my last question: the look ahead, the roadmap. It seems like we're just at 100 gig per lane, now 200 gig per lane, and before you know it people are already planning 400, even 448, gig per lane, right? Is the CPO roadmap going to keep up with this? What do you see looking ahead?

>> We're certainly investing to make it keep up with that. So yes, we absolutely believe that CPO, high-density silicon photonics integration using CMOS and advanced packaging techniques, will be able to keep up with the bandwidth requirements of AI networking. We've already announced that we're in development of our Gen 4, 400-gig-per-lane CPO, and there's a lot of innovation required in these next generations. We're looking at increasing the split ratios of our lasers, and we're looking at the next phase of advanced packaging techniques. So we're pretty excited about not just releasing our Gen 3 Davisson program, but also what lies ahead for Gen 4, and about continuing to innovate and do everything possible to make sure that optics are not the bottleneck for generational advancements in AI networking.

>> All right, fantastic. Lots of good innovations there. Thanks for this update. Really appreciate it.

>> Thank you.

>> Thanks, Jim.

