A New Chip Cluster Will Make Massive AI Models Possible

When it comes to the neural networks that power today’s artificial intelligence, sometimes the bigger they are, the smarter they are too. Recent leaps in machine understanding of language, for example, have hinged on building some of the most enormous AI models ever and stuffing them with huge gobs of text. A new cluster of computer chips could now help these networks grow to almost unimaginable size—and show whether going ever larger may unlock further AI advances, not only in language understanding, but perhaps also in areas like robotics and computer vision.

Cerebras Systems, a startup that has already built the world’s largest computer chip, has now developed technology that lets a cluster of those chips run AI models that are more than a hundred times bigger than the most gargantuan ones around today.

Cerebras says it can now run a neural network with 120 trillion connections, mathematical simulations of the interplay between biological neurons and synapses. The largest AI models in existence today have about a trillion connections, and they cost many millions of dollars to build and train. But Cerebras says its hardware will run the necessary calculations in about a fiftieth of the time of existing hardware. Its chip cluster, along with its power and cooling requirements, presumably still won’t come cheap, but Cerebras at least claims its tech will be substantially more efficient.
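For a sense of what those numbers mean in hardware terms, a quick back-of-envelope calculation helps. The sketch below is illustrative only, assuming 16-bit weights, which is an assumption for the sake of arithmetic, not a figure from Cerebras:

```python
# Rough memory math for the model sizes mentioned above.
# BYTES_PER_PARAM is an assumption (16-bit weights), not a Cerebras spec.
BYTES_PER_PARAM = 2

for name, params in [("~1 trillion connections (today's largest models)", 1e12),
                     ("120 trillion connections (Cerebras's claim)", 120e12)]:
    terabytes = params * BYTES_PER_PARAM / 1e12
    print(f"{name}: ~{terabytes:,.0f} TB just to store the weights")
```

At roughly 240 terabytes, the weights of a 120-trillion-connection model would dwarf the memory attached to any single processor today, which is part of why the cluster design described below matters.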


“We built it with synthetic parameters,” says Andrew Feldman, founder and CEO of Cerebras, who will present details of the tech at a chip conference this week. “So we know we can, but we haven’t trained a model, because we’re infrastructure builders, and, well, there is no model yet” of that size, he adds.

Today, most AI programs are trained using GPUs, a type of chip originally designed for generating computer graphics but also well suited for the parallel processing that neural networks require. Large AI models are essentially divided up across dozens or hundreds of GPUs, connected using high-speed wiring.
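The snippet below sketches that idea, known as model parallelism, in PyTorch: a toy network is split across two GPUs, with activations hopping between devices during the forward pass. The layer sizes and device names are arbitrary; production systems shard far larger models across many more accelerators.

```python
# A minimal model-parallelism sketch: half the network lives on each GPU.
import torch
import torch.nn as nn

class TwoGPUModel(nn.Module):
    def __init__(self):
        super().__init__()
        # First half of the layers on GPU 0, second half on GPU 1.
        self.part1 = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU()).to("cuda:0")
        self.part2 = nn.Linear(4096, 1024).to("cuda:1")

    def forward(self, x):
        x = self.part1(x.to("cuda:0"))
        # Activations cross the high-speed interconnect between GPUs here.
        return self.part2(x.to("cuda:1"))

model = TwoGPUModel()
output = model(torch.randn(8, 1024))
```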

GPUs still make sense for AI, but as models get larger and companies look for an edge, more specialized designs may find their niches. Recent advances and commercial interest have sparked a Cambrian explosion in new chip designs specialized for AI. The Cerebras chip is an intriguing part of that evolution. While normal semiconductor designers split a wafer into pieces to make individual chips, Cerebras packs in much more computational power by using the entire wafer, which lets its many computational units, or cores, talk to one another more efficiently. A GPU typically has a few hundred cores, but Cerebras’s latest chip, the Wafer Scale Engine Two (WSE-2), has 850,000 of them.

The design can run a big neural network more efficiently than banks of GPUs wired together. But manufacturing and running the chip is a challenge, requiring new methods for etching silicon features, a design that includes redundancies to account for manufacturing flaws, and a novel water system to keep the giant chip chilled.

To build a cluster of WSE-2 chips capable of running AI models of record size, Cerebras had to solve another engineering challenge: how to get data in and out of the chip efficiently. Regular chips have their own memory on board, but Cerebras developed an off-chip memory box called MemoryX. The company also created software that allows a neural network to be partially stored in that off-chip memory, with only the computations shuttled over to the silicon chip. And it built a hardware and software system called SwarmX that wires everything together.
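Cerebras hasn’t published the internals of that software, but the general pattern, keeping parameters in external memory and fetching them one layer at a time, can be sketched in plain Python. Every name and shape below is hypothetical, meant only to illustrate the concept rather than the MemoryX or SwarmX interfaces:

```python
# Conceptual weight-streaming sketch (hypothetical, not Cerebras's API).
import numpy as np

class ExternalWeightStore:
    """Stands in for an off-chip memory box holding every layer's weights."""
    def __init__(self, layer_shapes):
        self.weights = [np.random.randn(*shape).astype(np.float32)
                        for shape in layer_shapes]

    def fetch(self, layer_idx):
        # In real hardware this would be a high-bandwidth transfer to the chip.
        return self.weights[layer_idx]

def forward_streaming(store, num_layers, x):
    for i in range(num_layers):
        w = store.fetch(i)          # stream one layer's weights in...
        x = np.maximum(x @ w, 0.0)  # ...compute, then let them go
    return x

store = ExternalWeightStore([(256, 256)] * 4)
activations = forward_streaming(store, 4,
                                np.random.randn(8, 256).astype(np.float32))
```

The appeal of this arrangement is that the model’s size is bounded by the external store rather than by the processor’s on-board memory.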


“They can improve the scalability of training to huge dimensions, beyond what anybody is doing today,” says Mike Demler, a senior analyst with the Linley Group and a senior editor of The Microprocessor Report.

Demler says it isn’t yet clear how much of a market there will be for the cluster, especially since some potential customers are already designing their own, more specialized chips in-house. He adds that the chip’s real performance, in terms of speed, efficiency, and cost, is as yet unclear, since Cerebras hasn’t published any benchmark results so far.

“There’s a lot of impressive engineering in the new MemoryX and SwarmX technology,” Demler says. “But just like the processor, this is highly specialized stuff; it only makes sense for training the very largest models.”

Cerebras’ chips have so far been adopted by labs that need supercomputing power. Early customers include Argonne National Labs, Lawrence Livermore National Lab, pharma companies including GlaxoSmithKline and AstraZeneca, and what Feldman describes as “military intelligence” organizations.

This shows that the Cerebras chip can be used for more than just powering neural networks; the computations these labs run involve similarly massive parallel mathematical operations. “And they’re always thirsty for more compute power,” says Demler, who adds that the chip could conceivably become important for the future of supercomputing.

David Kanter, an analyst with Real World Technologies and executive director of MLCommons, an organization that measures the performance of different AI algorithms and hardware, says he sees a future market for much bigger AI models. “I generally tend to believe in data-centric ML [machine learning], so we want larger data sets that enable building larger models with more parameters,” Kanter says.

According to Feldman, Cerebras plans to expand by targeting a nascent market for massive natural-language-processing AI algorithms. He says the company has talked to engineers at OpenAI, a firm in San Francisco that has pioneered the use of massive neural networks for language learning as well as robotics and game-playing.

The latest of OpenAI’s algorithms, called GPT-3, can handle language in surprisingly cogent ways, ginning up news articles on a given topic, summarizing content coherently, or even writing computer code, although it is also prone to fits of misunderstanding, misinformation, and occasional misogyny. The neural network behind GPT-3 has around 175 billion parameters.

“From talking to OpenAI, GPT-4 will be about 100 trillion parameters,” Feldman says. “That won’t be ready for several years.”

OpenAI has made GPT-3 accessible to developers and startups via an API, but the company faces increasing competition from startups developing similar language tools. One of the founders of OpenAI, Sam Altman, is an investor in Cerebras. “I certainly think we can make much more progress on current hardware,” Altman says. “But it would be great if Cerebras’ hardware were even more capable.”

Building a model the size of GPT-3 produced some surprising results. Asked whether a version of GPT that’s 100 times larger would necessarily be smarter—perhaps demonstrating fewer errors or a greater understanding of common sense—Altman says it’s hard to be sure, but he’s “optimistic.”

Such advances may be at least a few years away. Nearer term, Cerebras is hoping that enough companies will see a need for hardware designed to supersize all sorts of AI models.

