
Theorem wants to stop AI-written bugs before they ship — and just raised $6M to do it

Theorem, a San Francisco-based startup that recently completed Y Combinator’s Spring 2025 batch, has closed a $6 million seed funding round led by Khosla Ventures. The company is focused on one problem: ensuring that AI-generated code can be trusted.

The investment comes as AI-powered coding assistants from companies such as GitHub, Amazon, and Google churn out billions of lines of code annually. Adoption of AI in software development is accelerating faster than the industry's ability to verify that the resulting code is correct. That oversight gap poses a significant risk to critical infrastructure such as financial systems and power grids.

Jason Gross, co-founder of Theorem, emphasized the urgency of the situation, pointing out that human reviewers cannot keep up with the sheer volume of AI-generated code. Theorem’s solution combines formal verification, a mathematical technique for proving that software behaves as specified, with AI models trained to automatically generate and check proofs. Verification work that historically required years of specialized engineering can now, according to the company, be completed in a matter of weeks or even days.
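To make the difference between testing and formal verification concrete, here is a minimal illustrative sketch in Lean 4. It is not Theorem's code or tooling; the function `myMax` and the theorem name `le_myMax_left` are invented for this example. A test checks a single input, while a proof must hold for every possible input.

```lean
-- Hypothetical toy function: the larger of two natural numbers.
def myMax (a b : Nat) : Nat := if a ≤ b then b else a

-- A test: checks the behavior on one specific input only.
example : myMax 3 5 = 5 := by decide

-- A proof: for ALL inputs, the result is at least the first argument.
theorem le_myMax_left (a b : Nat) : a ≤ myMax a b := by
  unfold myMax
  split
  · assumption            -- case a ≤ b: the result is b, and a ≤ b holds
  · exact Nat.le_refl a   -- case ¬(a ≤ b): the result is a itself
```

The proof checker rejects the theorem unless every case is covered, which is the kind of guarantee that sampling inputs with tests cannot give.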

Formal verification has traditionally been reserved for high-stakes applications like avionics systems and cryptographic protocols because of its high cost and complexity. Gross, who previously worked on verified cryptography code at MIT, knows those challenges firsthand. By using AI to automate the verification process, Theorem aims to make the technique practical for mainstream software development.

Theorem’s system employs an approach it calls “fractional proof decomposition,” which allocates verification resources according to the importance of each code component. The method has already identified bugs that slipped past traditional testing, as demonstrated in a recent collaboration with Anthropic, the AI safety company.

In one case study, Theorem helped a customer turn a 1,500-page PDF specification into 16,000 lines of trusted code, significantly improving performance without introducing additional errors. The company sees similar potential in fields such as AI research, electronic design automation, and GPU-accelerated computing.

As AI systems become increasingly integrated into critical infrastructure, the need for robust verification mechanisms is more pressing than ever. Gross emphasizes the importance of “asymmetric defense” in software security, highlighting the role of formal verification in safeguarding against vulnerabilities exploited by AI-driven attacks.

Theorem plans to use the seed funding to expand its team, add compute resources for training verification models, and move into new industries such as robotics, renewable energy, cryptocurrency, and drug synthesis. Its focus on scaling software oversight sets it apart in a competitive field where AI and formal verification intersect.

As AI systems grow more capable, Theorem's mission is to hold the machines writing the code to the highest standards of trustworthiness. By closing the oversight gap in AI-generated software, the company hopes to make safety and reliability central to the development process.
