€80 million for a startup promising to triple AI chip efficiency. Gimlet Labs didn't just find funding; it claims to have cracked the code on making today's AI infrastructure work smarter.
🚀 When One Chip Isn't Enough
Picture this scenario. You've got an AI chatbot that needs to search for information, process it, and deliver an answer. Each step demands different computing power. But most AI systems in 2026 run everything on the same chip type — usually NVIDIA GPUs.
That's like using a Ferrari to haul furniture. Works, but it's not the ideal tool for the job.
Gimlet Labs, a Silicon Valley startup, claims they found the solution. Their multi-silicon inference cloud (the world's first, they say) lets AI workloads run simultaneously across different hardware types. Instead of waiting for one chip to handle everything, their system splits tasks across CPUs, GPUs, and specialized AI accelerators.
⚡ The Logic Behind Multi-Silicon AI Inference
Break it down to basics. When an AI agent executes a task, it might need dozens of different steps. Each step has different requirements:
- Prefill phase: Limited mostly by raw computing power
- Decode phase: Limited mostly by memory capacity and bandwidth
- Tool calls: Limited mostly by network speed
No single chip can handle all these efficiently. The solution is heterogeneity — and that's exactly what Gimlet is trying to achieve.
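The three bottlenecks above can be made concrete with a toy classifier. This is only a sketch under stated assumptions: the `Step` fields, the 100 FLOPs-per-byte threshold, and the sample numbers are invented for illustration and do not describe Gimlet's actual method.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One step of an agent task. All fields are illustrative assumptions."""
    name: str
    flops: float           # arithmetic work, in floating-point operations
    bytes_moved: float     # data read from or written to device memory
    network_wait_s: float  # seconds spent waiting on remote calls


def classify(step: Step, intensity_threshold: float = 100.0) -> str:
    """Label a step by its dominant bottleneck.

    The threshold (FLOPs per byte) is a made-up cut-off; a real system
    would derive it from the hardware's compute-to-bandwidth ratio.
    """
    if step.network_wait_s > 0 and step.flops == 0:
        return "network-bound"
    if step.bytes_moved == 0:
        # No memory traffic to speak of, so compute is the only cost.
        return "compute-bound"
    intensity = step.flops / step.bytes_moved
    return "compute-bound" if intensity >= intensity_threshold else "memory-bound"


steps = [
    Step("prefill", flops=2e12, bytes_moved=4e9, network_wait_s=0.0),
    Step("decode", flops=4e9, bytes_moved=4e9, network_wait_s=0.0),
    Step("tool_call", flops=0.0, bytes_moved=0.0, network_wait_s=0.3),
]
labels = {s.name: classify(s) for s in steps}
print(labels)
```

With these sample numbers, prefill does far more arithmetic per byte moved than decode, which is why the two phases land in different categories.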
The company doesn't just distribute workloads. It can slice AI models themselves into pieces, running each section on the most suitable hardware. What used to crawl on one GPU now flies across a combination of NVIDIA, AMD, Intel, and specialized chips like Cerebras.
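One simple way to picture slicing a model is to hand each device a contiguous block of layers sized in proportion to its speed. The sketch below assumes exactly that; the function `partition_layers`, the device names, and the relative speeds are hypothetical, and Gimlet has not published how its own partitioning works.

```python
def partition_layers(num_layers: int, speeds: dict[str, float]) -> dict[str, range]:
    """Split layers 0..num_layers-1 into contiguous slices, one per device.

    Each device's share is proportional to its (assumed) relative speed.
    The last device absorbs any rounding remainder.
    """
    total = sum(speeds.values())
    devices = list(speeds)
    plan: dict[str, range] = {}
    start = 0
    for i, device in enumerate(devices):
        if i == len(devices) - 1:
            end = num_layers
        else:
            share = round(num_layers * speeds[device] / total)
            end = min(start + share, num_layers)
        plan[device] = range(start, end)
        start = end
    return plan


# Hypothetical fleet: one fast GPU and two slower devices.
plan = partition_layers(32, {"gpu_a": 2.0, "gpu_b": 1.0, "accelerator": 1.0})
for device, layers in plan.items():
    print(device, f"layers {layers.start}-{layers.stop - 1}")
```

A production system would also have to account for the cost of moving activations between devices, which this sketch ignores.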
Numbers That Matter
The stats are impressive. Gimlet claims it can accelerate AI inference workloads 3x to 10x for the same cost and energy consumption. But here's the real problem they're solving: existing hardware is in active use only 15-30% of the time.
"You're basically wasting hundreds of billions because you're leaving resources idle," as Gimlet CEO Zain Asgar put it. If that sounds dramatic, think about how many GPU farms exist globally just waiting for the next AI task.
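A quick back-of-the-envelope calculation shows why idle time is so expensive. The 15% and 30% utilization figures come from the article; the hourly price per chip is a hypothetical placeholder, not a real quote.

```python
def idle_fraction(utilization: float) -> float:
    """Share of paid-for time in which the chip does no useful work."""
    return 1.0 - utilization


def cost_per_useful_hour(hourly_cost: float, utilization: float) -> float:
    """Effective price of one hour of actual work at a given utilization."""
    return hourly_cost / utilization


HOURLY_COST = 2.0  # hypothetical price per chip-hour, in any currency

for utilization in (0.15, 0.30):
    idle = idle_fraction(utilization)
    effective = cost_per_useful_hour(HOURLY_COST, utilization)
    print(f"utilization {utilization:.0%}: idle {idle:.0%}, "
          f"effective cost {effective:.2f} per useful hour")
```

At 15% utilization every useful hour costs more than six times the list price, which is the gap a better scheduler tries to close.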
📊 Who Paid €80 Million for Multi-Silicon Dreams
Gimlet's Series A came from investors who understand AI infrastructure. Menlo Ventures led the funding round, with participation from Factory (which did the seed round), Eclipse Ventures, Prosperity7, and Triatomic Capital.
But the angel investors tell the real story. Bill Coughran from Sequoia, Stanford professor Nick McKeown, former VMware CEO Raghu Raghuram, even current Intel CEO Lip-Bu Tan. When this many industry veterans write checks, they know something.
The multi-silicon fleet is ready — we just need the software layer to make it work.
Tim Tully, Menlo Ventures
The timing isn't random. The company launched publicly in October 2025 with eight-figure revenues from day one (meaning at least €10 million). In four months, they doubled their customer base, with clients including major model makers and cloud companies they won't name.
🎯 Who This Is For (And Who It's Not)
Gimlet isn't selling to the average developer building a chatbot. Their product targets the big players — AI model labs and data centers running thousands of chips simultaneously.
The company already has partnerships with industry giants:
- NVIDIA: The undisputed leader in AI GPUs
- AMD: The main alternative for high-performance chips
- Intel: With new Arc GPUs and Gaudi AI accelerators
- Cerebras: Maker of massive wafer-scale engines
- d-Matrix: Startup building in-memory computing chips
The Business Model
Gimlet offers their software two ways: as standalone software you install on your infrastructure, or as API access to Gimlet Cloud. Both target enterprise customers with serious AI workloads.
The fact they already have eight-figure revenue four months after launch shows real demand exists. When major cloud providers pay from day one, something works.
🔬 The Tech Behind the Magic
Gimlet's core innovation is orchestration software that slices AI workloads and distributes them intelligently across available hardware. But how exactly does it work?
The system analyzes each AI task in real-time and decides which piece should run where. For example:
- Compute-bound operations: Heavy computational tasks go to GPUs for maximum throughput
- Memory-bound tasks: Operations needing fast data access run on high-memory chips
- Network-bound processes: Tool calls and API integrations run better on CPUs
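In its simplest form, the routing described above is a lookup from bottleneck type to hardware pool. The following is a minimal sketch of that idea; `ROUTING_TABLE`, the pool names, and the task names are assumptions made for the example, not part of any Gimlet API.

```python
from collections import defaultdict

# Hypothetical mapping from bottleneck type to hardware pool.
ROUTING_TABLE = {
    "compute-bound": "gpu",
    "memory-bound": "high-memory-accelerator",
    "network-bound": "cpu",
}


def route(tasks: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Group (task_name, bottleneck) pairs by their target hardware pool."""
    plan: dict[str, list[str]] = defaultdict(list)
    for name, bottleneck in tasks:
        plan[ROUTING_TABLE[bottleneck]].append(name)
    return dict(plan)


tasks = [
    ("prefill", "compute-bound"),
    ("decode", "memory-bound"),
    ("web_search", "network-bound"),
    ("rerank", "compute-bound"),
]
plan = route(tasks)
print(plan)
```

A fixed table like this is easy to reason about, but it cannot react when one pool becomes congested.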
What makes the approach smart is that it doesn't just distribute load statically. The system learns from workload patterns and optimizes distribution dynamically.
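To illustrate what dynamic optimization could look like, here is a sketch that tracks an exponential moving average of observed latency per device and sends the next task to whichever currently looks fastest. The moving average is a stand-in technique chosen for the example; the class `AdaptiveRouter`, its parameters, and the latency numbers are hypothetical, and Gimlet's actual learning method is not public.

```python
class AdaptiveRouter:
    """Pick the device with the lowest smoothed latency seen so far."""

    def __init__(self, devices: list[str], alpha: float = 0.5):
        self.alpha = alpha  # weight given to the newest observation
        self.latency: dict[str, float | None] = {d: None for d in devices}

    def pick(self) -> str:
        # Try every device at least once before trusting the estimates.
        for device, estimate in self.latency.items():
            if estimate is None:
                return device
        return min(self.latency, key=lambda d: self.latency[d])

    def record(self, device: str, observed_s: float) -> None:
        """Fold a new latency measurement into the running estimate."""
        previous = self.latency[device]
        if previous is None:
            self.latency[device] = observed_s
        else:
            self.latency[device] = (
                self.alpha * observed_s + (1 - self.alpha) * previous
            )


router = AdaptiveRouter(["gpu", "cpu"])
router.record("gpu", 0.10)
router.record("cpu", 0.40)
first_choice = router.pick()   # the GPU looks faster at this point

router.record("gpu", 0.90)     # the GPU pool becomes congested
second_choice = router.pick()  # its estimate rises, so the CPU wins

print(first_choice, second_choice)
```

The smoothing factor trades responsiveness for stability: a higher `alpha` reacts faster to congestion but is more easily fooled by a single slow request.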
The Stanford Connection
Zain Asgar isn't new to this game. A Stanford adjunct professor and a founder with a successful exit behind him, he came from Pixie, a startup building observability tools for Kubernetes that sold to New Relic in 2020, just two months after launch.
Co-founders Michelle Nguyen, Omid Azizi, and Natalie Serrino all worked together at Pixie. This explains why they built a functioning product so quickly — the team knew how to work together.
🏁 What This Means for AI's Future
If Gimlet Labs delivers on their promises, they could change how we think about AI infrastructure. Instead of buying more chips, we could use existing ones smarter.
This has massive cost implications. If you can really get 3-10x performance from the same hardware, then AI's economic equation changes radically. Startups without budgets for massive GPU farms suddenly become competitive.
But there's a bigger picture. In 2026, global AI chip shortages remain a problem. If we can make existing chips much more efficient, pressure for new production decreases.
Of course, Gimlet isn't alone in trying to solve this problem. Major companies like Google and Microsoft are working on their own hardware optimization solutions. The difference is that Gimlet is building a hardware-agnostic layer that works with everyone.
With a headcount of 30 and more than €85 million in total funding, the company has the resources to expand aggressively. The real challenge will be proving their technology works reliably in the large production environments they're targeting.
If they succeed, Gimlet could become an essential layer of the modern AI stack. If not, they'll remain an interesting experiment that tried to solve one of AI infrastructure's biggest problems in a completely different way.