
Untold story of AI’s fastest chip

By Synapse


Topics Covered

  • Simplify chips for 18x AI speed
  • Groq means deep, intuitive understanding
  • LPUs enable instant AI conversations
  • US-made LPUs slash AI costs

Full Transcript

This is what a standard AI chip looks like. It's a GPU, and you'll notice the complex crossing lines. This design is responsible for the drastic rise in value for its producers, who made the right bet that GPUs could be the engine for AI. Now, this is what a far-from-standard AI chip looks like: uniform, clean, and really, really fast. Like 18 times faster than its competition. But is this underdog capable of taking a bite out of the hot AI chip market?

Jonathan Ross helped invent Google's Tensor Processing Unit, which was the company's underlying chip for machine learning. Its initial purpose was to translate human language into text. We take it for granted now, but the process unlocked a whole world of applications. Based on that experience, Ross founded Groq in 2016 to compete in the semiconductor industry.

The basic idea was that computational needs were changing fast, as self-driving, artificial intelligence, and more were being developed at a rapid pace. A new architecture, more specifically specialist chips, would be needed to pull that off: something tailored deeply to its specific task. Even the name Groq was meant to evoke the kind of subject depth the company was striving to represent. "It's Groq, and we spell it with a Q, and it's because it comes from a science fiction novel, and it means to understand something deeply and with empathy."

A startup in the chip space needs substantial cash; Ross raised $367 million by the end of 2021 from Chamath Palihapitiya, Tiger Global, and more. "So basically, take the chip and make it much, much smaller and cheaper, and then make many of them and connect them together. That was Jonathan's insight."

But there was a serious issue: Groq couldn't get any customers. It wasn't for lack of trying; it's a byproduct of being a deeply technical startup that needs a series of things to go right for a product to be made, much less marketed and sold. The idea of a simplified, hyper-specialized design wasn't connecting. For example, one of their first big attempts to break in was to supply chips for Tesla's driving effort. After exploring the solution, Tesla said no, and as Elon walked out of the room he said it would be a real shame if someone was to take the name and slightly tweak it to be an AI company. Just kidding, but it feels like too much of a coincidence.

So then Groq tried to sell their chips to high-frequency traders, who need information, and to make trades, very quickly. But potential customers in that space also shut the door. The clock was ticking on a startup dedicated to speeding things up. At the time, generative AI accounted for less than 1% of their efforts. Then LLMs hit the mainstream, and Groq created a solution to meet the moment. They captured attention with a jaw-dropping demo.

"You are breaking performance records." "The speed is definitely a differentiator, and people notice it. We've gone viral this week, so it was a really rapid hockey stick. We literally just show people the demo, and they're always shocked at how fast it is compared to what they get on graphics processors. We've built a language processor."

Anyone that regularly uses large language models will notice that this is unlike anything on the market. The millisecond you hit enter, this thing is giving you the entirety of its response. To further put this speed in perspective, Groq chips enable LLMs to write a full book in about 100 seconds. Now, that would require a huge context window that their current open-source LLMs don't have, but that's some insane speed. Groq's site went from hardly any users to over 400,000 signups in no time, as the company set the new gold standard for low latency.
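
For a rough sense of the throughput "a book in about 100 seconds" implies, here is a back-of-the-envelope sketch; the book length and tokens-per-word ratio are assumptions, not figures from the video:

```python
# Back-of-the-envelope throughput implied by "a book in ~100 seconds".
# Assumed figures (not from the video): an 80,000-word book and
# ~1.3 tokens per English word, a common tokenizer rule of thumb.
words_per_book = 80_000
tokens_per_word = 1.3
tokens_per_book = words_per_book * tokens_per_word  # ~104,000 tokens

seconds = 100
print(f"~{tokens_per_book / seconds:,.0f} tokens/s")  # ~1,040 tokens/s
```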

So how did they make it so fast? The speed is enabled through Groq's LPU design, which sequentially processes the language tasks that the LLMs execute. To put it more simply, the chip is designed to do exactly this task, not a lot of other ancillary tasks that increase complexity and cost. It also has memory on the chip, which is a rarity. The LPU is tailored toward inference (think of the actual messaging with a chat interface) rather than training; that phase requires intense GPUs like the ones from Nvidia.
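
One common way to see why on-chip memory matters: at low batch sizes, generating each token means streaming roughly all of the model's weights past the compute units, so decode speed is bounded by memory bandwidth. A minimal sketch, with illustrative numbers that are assumptions rather than figures from the video:

```python
# Rule-of-thumb bound for single-stream LLM decoding:
#   tokens/s <= memory bandwidth / bytes of weights read per token.
# Illustrative inputs (assumptions, not from the video): a 70B-parameter
# model at 8-bit weights, off-chip HBM at ~3.35 TB/s, and a much higher
# aggregate on-chip SRAM bandwidth once a model is sharded across chips.
def max_tokens_per_s(weight_bytes: float, bandwidth_bytes_per_s: float) -> float:
    return bandwidth_bytes_per_s / weight_bytes

weights = 70e9  # ~70 GB of int8 weights
print(f"{max_tokens_per_s(weights, 3.35e12):.0f} tok/s")  # ~48, HBM-class GPU
print(f"{max_tokens_per_s(weights, 50e12):.0f} tok/s")    # ~714, assumed SRAM aggregate
```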

You can pretty easily deduce why speed at this level would matter. The first use that comes to mind is low-latency conversations with AI: no awkward pauses as the machine processes your answer. Ross loves to demo this feature:

"Got it. Tell me something most people don't know." "Um, here's something interesting: did you know that octopuses have three hearts?" "Can you book a reservation at a Michelin-star restaurant?" "Of course. I've made a reservation for you and your guests at a Michelin-star restaurant called Em." "Who's the chef?" "The chef at Em is Demetrius Kraus, he's a renowned chef known..."
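
The "awkward pause" being demoed away is usually measured as time-to-first-token (TTFT). Here is a minimal, provider-agnostic sketch; fake_stream is a hypothetical stand-in for the streaming response of a real LLM client:

```python
import time

def measure_ttft(token_stream):
    """Return seconds until the first token arrives, plus that token.

    `token_stream` is any iterator of generated tokens, e.g. a streaming
    response object from an LLM client library.
    """
    start = time.perf_counter()
    first_token = next(iter(token_stream))  # blocks until generation starts
    return time.perf_counter() - start, first_token

# Toy usage with a simulated stream (stand-in for a real API call):
def fake_stream():
    time.sleep(0.25)  # pretend the model takes 250 ms to start replying
    yield "Hello"
    yield ", world"

ttft, token = measure_ttft(fake_stream())
print(f"TTFT: {ttft * 1000:.0f} ms, first token: {token!r}")
```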

Ross has said that LPU chips will enable significantly lower compute costs: "Because we're so much faster per chip produced, we're able to get a better cost basis and energy basis." This is clearly meant to attract startups and developers, but Chamath said they're also getting a lot of inbound from companies featured in the S&P 500.
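
The logic from speed to cost basis can be made concrete with a simple per-token cost model; all the dollar and throughput figures here are placeholders, not numbers from the video:

```python
# Serving cost per million tokens, given hardware cost and throughput:
#   $/Mtok = hourly hardware cost / tokens generated per hour * 1e6
# Placeholder inputs (assumptions, not from the video).
def dollars_per_million_tokens(hourly_cost_usd: float, tokens_per_s: float) -> float:
    tokens_per_hour = tokens_per_s * 3600
    return hourly_cost_usd / tokens_per_hour * 1e6

# Same hourly cost, 5x the throughput -> 5x lower cost per token.
print(dollars_per_million_tokens(10.0, 100))  # ~$27.78 per Mtok
print(dollars_per_million_tokens(10.0, 500))  # ~$5.56 per Mtok
```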

To do this, they're undergoing a production ramp-up: "Our intention, if we can, is to deploy over 22,000 LPUs this year, and next year we want to do 1.5 million."

There's another interesting part of this story: the entirety of the chip is made in the US, which is very unusual among its chipmaking counterparts, who rely heavily on Taiwan's TSMC foundry.

There are some critical questions that come with Groq's value proposition. Is speed going to matter so much that the company can take meaningful market share? Does a shorter time to render answers make a difference to the users of large language models? Also, when companies order chips from Groq, they have to order a ton of them: what Groq can handle with 578 LPU chips, Nvidia can handle with two of their H100 GPU chips. Does this make scaling unfeasible?
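
The lopsided chip counts follow mostly from memory capacity. A rough check, where the per-chip figures are published ballpark specs used here as assumptions, not numbers from the video:

```python
import math

# Why one model needs hundreds of LPUs but only a couple of GPUs:
# LPUs hold weights in small on-chip SRAM; GPUs carry large HBM stacks.
# Ballpark inputs (assumptions, not from the video).
model_gb = 140      # e.g. a 70B-parameter model at 16-bit weights
lpu_sram_gb = 0.23  # ~230 MB of SRAM per LPU
gpu_hbm_gb = 80     # ~80 GB of HBM per H100

print(math.ceil(model_gb / lpu_sram_gb))  # ~609 chips just to hold weights
print(math.ceil(model_gb / gpu_hbm_gb))   # 2 GPUs
```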

While it's unclear whether Groq will be able to turn its moment of virality into the backbone for fast AI compute, this is a moment that was a long time in the making for Ross and crew. And if we want real innovation, the kind that drives down costs, solves problems, and unlocks new solutions, then we're going to need many more companies like Groq fighting for a spot at the big boys' table.
