Luminous sheds light on the optical architecture of the future AI supercomputer

It’s not every day that we hear about a new supercomputer maker with a new architecture, but it sounds like Luminous Computing, a silicon photonics startup that has been fairly secretive about what it is doing, is ready to throw its own architecture into the ring.

When Luminous Computing was founded in 2018, it joined a growing list of established tech companies and startups looking to use silicon photonics to build faster, more power-efficient chips for modern artificial intelligence and machine learning workloads. The company initially focused on what Marcus Gomez – one of the founders of Luminous and now its chief executive – called optical computing: the use of optics to solve the mathematical problems inherent in AI computing.

Using silicon photonics to solve these problems was Gomez’s goal and the subject of research carried out by Mitchell Nahmias, Luminous’s other founder and now the company’s chief technology officer, during his doctoral work at Princeton University and in the years after.

Shortly after the startup raised $9 million in seed capital in January 2019, Luminous turned sharply away from the original idea of tackling the computational side of AI and toward communications – between parts of the chip as well as between systems, racks, and datacenters. That is where the problem lies, and that is where optical technology can do the most good, according to Gomez.

“If you look at this work, we’re 5X to 10X away from the theoretical density limit of digital,” Gomez tells The Next Platform. “Computing isn’t the problem, and it’s a hard pivot from what we’ve traditionally focused on. The computing stuff, it’s a really cool science project. In 2030 it could very well be an important technology, at least for edge computing. But for the modern AI user, the bottleneck is not computation. The first thing we do is use the optics to make our digital carrier before moving to a whole new platform to do logic and computation.”

He points to Nvidia’s A100 GPU accelerator, which is optimized for AI workloads, and estimates that only around 10 percent to 20 percent of the chip’s time is spent actually computing.

“This is a chip whose sole purpose in life is to do calculations, and it’s definitely not calculating most of the time,” he says. “It’s mainly memory, interface, and interconnect. Basically, it’s because you don’t have enough bandwidth into the chip to feed it any faster than that.”
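Gomez’s 10 percent to 20 percent figure is consistent with a simple roofline-style back-of-envelope calculation. The sketch below is illustrative, not a Luminous figure: the peak numbers are Nvidia’s published A100 specifications, and the arithmetic-intensity value is an assumed one for a bandwidth-bound AI kernel.

```python
# Roofline-style estimate: what fraction of an accelerator's peak FLOP/s
# can a memory-bandwidth-bound workload actually use?

PEAK_FLOPS = 312e12  # A100 dense BF16 peak, FLOP/s (published spec)
PEAK_BW = 2.0e12     # A100 HBM2e bandwidth, bytes/s (published spec)

def utilization(arithmetic_intensity):
    """Fraction of peak compute achievable at a given arithmetic
    intensity (FLOPs performed per byte moved to or from memory)."""
    achievable = min(PEAK_FLOPS, PEAK_BW * arithmetic_intensity)
    return achievable / PEAK_FLOPS

# Illustrative assumption: a kernel doing ~25 FLOPs per byte moved.
print(f"{utilization(25):.0%}")  # lands in the 10-20% range Gomez cites
```

At that assumed intensity the chip can only sustain about 16 percent of its peak throughput; no amount of extra compute helps until the bandwidth bottleneck is removed.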

Rather than just looking at the processor and seeing how photonics can help with that, Luminous’ plan now is to build a complete AI supercomputer with its optical technology being the key connection between all the different tiers.

“We’re building the whole computer,” Gomez says. “It’s a rack-mount solution, a complete supercomputer, and we’re building all the parts of it. It involves several digital chips, such as compute chips, switch chips, and memory interface chips. We build all the optics to connect these chips together and scale them properly, we package it all together, and then we build all the software on top of it. So the TensorFlow and PyTorch machine learning frameworks will be ready to use from day one.”

The chips themselves will be digital chips, but the key is that they are optically connected, eliminating inbound and outbound communication constraints and relaxing the stringent requirements of traditional computing architecture, he says.

Marcus Gomez, co-founder and CEO of Luminous Computing

“We use this technology to build super-high-speed data links and insert them into the computer architecture exactly where you are stuck on these communication bottlenecks at every scale – at the lowest level between memory and processor, and board-to-board, box-to-box, and rack-to-rack,” Gomez says.

Gomez, who dropped out of Stanford University’s master’s program to start Luminous, comes to it with a varied background that includes a stint as a research scientist at dating app company Tinder in 2018, research roles in artificial intelligence at Google, and time at the Mayo Clinic. He was also a software engineer at Bloomberg and a network biology researcher at Harvard Medical School. Nahmias earned his PhD in electrical engineering at Princeton, where he studied the relationship between lasers and biological spiking neurons in the field of neuromorphic photonics.

Matt Chang, vice president of photonics at Luminous, also earned a doctorate in electrical engineering from Princeton and spent two years at Apple designing hardware to reduce interference between coexisting wireless radios on the Apple Watch. He left Apple in 2019.

Improving communications within and between components will be key to enabling larger AI models and making the training of those models more accessible.

“Ten years ago, the biggest model you would see in the literature was on the order of 50 million or 100 million parameters,” says Gomez. “That sounds like a lot, but it’s not. You can fit this on a single GPU and you can train it in an hour. Today, the largest models in the literature are on the order of 10 trillion parameters. The larger models we’ve seen take up to a year to train, and they require tens of thousands, if not hundreds of thousands, of machines. Therein lies the problem, because when you start talking about a year of training time, you reach the limit of what a human can reasonably do to conduct an experiment. When you say hundreds of thousands of machines, you’re running out of space, not to mention the cost of hundreds of thousands of machines. You get to the point where basically only a select number of companies are capable of building these giant AI models. It’s getting too expensive. It takes too much time and takes up too much space. There are not enough tokens.”

Organizations running large AI workloads must make a trade-off between performance or cost and programmability and even then, the performance gains achieved through more compute capacity cannot keep up with the growing size of the AI ​​workloads. training models. This is where Luminous sees that the bottleneck of AI is communication and not computation. Hence the hard pivot of the startup.

“Optics has always been the known solution to the data movement problem,” says Gomez. “That’s why they’re laying fiber optic cables in the Atlantic Ocean. Light is efficient at moving data over long distances with high bandwidth and low latency. Once we remove the bottleneck, that gives us two things. The first is that we actually get the magnitude of the performance improvements. You may be able to train models 100 to 1000 times larger on our systems than you can train on any other modern hardware in the next five years. For existing models, we are going to take the training time that used to be years and reduce it to months. We are going to reduce things that used to last for months to a few days.

Additionally, the programming model becomes simpler by reducing the number of distributed systems needed to run the workloads and the system is less expensive as it scales more efficiently.

There is also the software challenge. Luminous needs to ensure that existing machine learning code can run on its systems, which means TensorFlow and PyTorch need to be ported over to them. Gomez says the company has “a pretty tough compiler problem to solve and we have a really extraordinarily talented group of engineers working on it.” The key to the software is going to be making the scaling magic trick – which is indeed what it is – actually appear like a magic trick to the customer. The fact that all communications are high bandwidth and data movement is essentially fixed cost will help because there is no distributed systems thinking and no hierarchical thinking is required. You have just entered the next available resource. In a certain sense, the algorithm simply works.

Luminous has deliberately flown under the radar for much of the past few years and even now Gomez is reluctant to go into too much detail about the technology the company is building or its roadmap, though he’s now more open about the shift to building an entire system rather than focusing on chips.

The company plans to sell production systems within 24 months and it just secured a $105 million injection via Series A funding from a wide range of investors, including Bill Gates. At the end of 2021, Luminous had nearly 90 employees. The newly raised money will help it grow its workforce to more than 100 employees, including doubling its engineering team and adding engineers in photonics design, very large scale digital and analog integration (VSLI), packaging as well as machine learning.

The startup – which has working prototypes in its labs – has moved from the analysis phase to the execution period, which means it will have to be more public about what it does. This is important not only for recruiting and selling, but also for differentiating yourself from others in the field of silicon photonics, which includes not only Ayar Labs, Intel and IBM, but also well-funded startups like Cerebras, Lightmatter, Lightelligence and Celestial. AI who have their own angles on this.

A key differentiator is that while “many companies are looking at using optics for computing, we’ve completely abandoned that,” Gomez says. “But at a more fundamental level, all the IT architecture decisions we’ve made and all the optics we’re introducing, it’s not arbitrarily introduced to just have a physical advantage. It is specifically used to make the user experience of building these massive AI models as easy as possible. It’s a conversation that, as far as we know, no one else in material space is really having.

About Leslie Schwartz

Check Also

Tay Gavin Erickson Fall 2022 Lecture Series Welcomes Belinda Campos: UMass Amherst

On Monday, September 26 at 4 p.m., the Center for Family Research will host Belinda …