How AWS Uses a 50-Year-Old Logic Engine to Catch Bugs Before They Become Code

Software bugs are expensive, but the most costly ones often don't originate in code—they start with flawed requirements. AWS discovered that a staggering 60% of software defects trace back to ambiguities, contradictions, or gaps in the specifications that guide development. To tackle this, AWS has introduced a new feature called Requirements Analysis in its Kiro agentic development platform. Rather than relying on generative AI alone, AWS combines large language models (LLMs) with a decades-old automated reasoning technique called satisfiability modulo theories (SMT) solving. This hybrid approach transforms vague requirements into precise mathematical logic and then proves whether conflicts exist, catching bugs before a single line of code is written. Here's how it works and why it matters.

Why are requirements the most expensive type of software bug?

Bugs in requirements are particularly insidious because they become embedded in every subsequent stage of development. A contradictory or ambiguous requirement can be interpreted differently by different developers, leading to inconsistent design and code. By the time the issue surfaces in production—as a feature that doesn't work as expected—developers must rewind through weeks of debugging to trace the problem back to its origin. This rework is far more costly than fixing a coding error early. AWS found that correction efforts for requirement bugs consume disproportionate time and resources. Mike Miller, director of AI product management at AWS, explains that these bugs include contradictions (two rules that can't both be true), ambiguities (a statement open to multiple interpretations), and gaps (missing conditions). Addressing them upfront dramatically reduces downstream defects and accelerates delivery.

[Image: article header illustration. Source: thenewstack.io]

What is AWS's Requirements Analysis feature and how does it work?

Requirements Analysis is a new capability within the Kiro platform designed to catch requirement bugs before they propagate. It operates in three stages. First, an LLM takes vague, natural-language requirements and rewrites them into precise, testable criteria. This step clarifies intent and removes ambiguity. Second, those refined requirements are automatically translated into a formal mathematical representation—essentially converting human language into logical statements. Third, an SMT (satisfiability modulo theories) solver, an automated reasoning engine invented over 50 years ago, runs proofs against that logic. The solver can determine definitively whether any set of requirements contains contradictions, ambiguities, undefined behaviors, or gaps. The results are presented to developers as plain-language questions that typically take 10–15 seconds each to resolve, making the process highly efficient.
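The three stages can be sketched in miniature. This is an illustrative toy, not Kiro's actual API: a stand-in for the LLM normalizes each requirement, a translator maps it to a logical constraint, and a consistency check plays the role of the solver. All function names and the "X must [not] be Y" requirement format are assumptions for the sketch.

```python
def refine(raw):
    """Stage 1 (stand-in for the LLM): normalize a vague requirement
    into a canonical 'X must [not] be Y' form."""
    return raw.strip().lower().rstrip(".")

def formalize(requirement):
    """Stage 2: translate refined text into a logical constraint,
    modeled here as a (variable, value, polarity) triple."""
    if " must not be " in requirement:
        var, val = requirement.split(" must not be ")
        return (var, val, False)
    var, val = requirement.split(" must be ")
    return (var, val, True)

def check(constraints):
    """Stage 3 (stand-in for the SMT solver): find every pair of
    constraints that assert and deny the same fact."""
    conflicts = []
    for var, val, positive in constraints:
        if not positive and (var, val, True) in constraints:
            conflicts.append((var, val))
    return conflicts

reqs = ["The session timeout must be 30 minutes.",
        "The session timeout must not be 30 minutes."]
formal = sorted({formalize(refine(r)) for r in reqs})
print(check(formal))  # [('the session timeout', '30 minutes')]
```

A real SMT solver handles far richer theories (arithmetic, arrays, strings) than this pairwise scan, but the shape of the pipeline is the same: natural language in, formal constraints out, proof of consistency or contradiction at the end.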

What is an SMT solver and why is it better than AI for catching contradictions?

An SMT solver (satisfiability modulo theories) is a type of automated reasoning tool that can determine whether a set of logical formulas is satisfiable—meaning there exists some assignment of values that makes all formulas true simultaneously. If no such assignment exists, the solver has proven a contradiction. This is fundamentally different from an LLM, which can only predict whether something looks wrong based on patterns in training data. AWS emphasizes the word "prove" because the SMT solver provides mathematical certainty, not probabilistic guesswork. When the solver flags an issue, it is demonstrably impossible to implement both conflicting rules. By combining the LLM's ability to interpret natural language with the solver's rigorous proof engine, AWS creates a system that leverages the strengths of each: the LLM handles complexity and nuance, while automated reasoning guarantees correctness.
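The core idea of satisfiability can be shown with a brute-force check over boolean variables. The two rules below are hypothetical examples (not from AWS's testing): individually each is fine, but for a premium customer filing a request after hours they demand that `priority` be both true and false, so no assignment satisfies them. Production SMT solvers prove this without enumerating assignments, but the verdict is the same.

```python
from itertools import product

def satisfiable(constraints, variables):
    """Brute-force satisfiability check: return a satisfying assignment,
    or None if the constraints are contradictory."""
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(c(assignment) for c in constraints):
            return assignment
    return None

# Two hypothetical requirements encoded as boolean constraints:
#   R1: premium customers always get priority support
#   R2: after-hours requests are never priority
r1 = lambda a: (not a["premium"]) or a["priority"]          # premium -> priority
r2 = lambda a: (not a["after_hours"]) or not a["priority"]  # after_hours -> not priority

# The scenario: a premium customer files a request after hours.
scenario = [r1, r2, lambda a: a["premium"], lambda a: a["after_hours"]]

result = satisfiable(scenario, ["premium", "after_hours", "priority"])
print(result)  # None: no assignment satisfies both rules, a proven contradiction
```

Because the search is exhaustive, a `None` result is a proof, not a guess: every possible world has been ruled out. That exhaustiveness, made tractable by clever algorithms, is what separates an SMT solver's verdict from an LLM's prediction.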

How many bugs does AWS actually find with this approach?

During internal testing, AWS found that 60% of software requirements contained bugs—contradictions, ambiguities, or gaps that would have led to defects in code. That statistic underscores how pervasive requirement errors are in real-world development. The Requirements Analysis tool surfaced these issues early in the design phase, allowing teams to correct them in minutes rather than weeks of rework later. Each flagged issue is presented as a simple two-option question, typically resolved in 10–15 seconds. Over the course of a project, the cumulative time saved dramatically reduces the cost of quality. Moreover, because the SMT solver provides a mathematical proof, developers can trust the result without needing to manually verify every edge case. The impact extends beyond individual projects: the data helps organizations improve their requirement-writing processes over time.


What is neurosymbolic AI and how does it relate to this approach?

Neurosymbolic AI is the fusion of neural networks (like LLMs) with symbolic reasoning systems (like logic engines). AWS's Requirements Analysis is a textbook example of this paradigm. The LLM handles the unstructured, natural-language side of requirements, while the SMT solver provides deterministic reasoning. This combination overcomes each technology's weaknesses: LLMs can be creative but are prone to hallucination and lack formal verification; symbolic systems are precise but brittle with messy human input. By letting the LLM translate requirements into a formal language and then letting the solver prove properties, AWS achieves a robustness that neither method could deliver alone. Jason Andersen of Moor Insights & Strategy notes that AWS has been pioneering this approach in other areas, such as access control with IAM, and is now extending it to requirements analysis. This method of evaluating LLM outputs using diverse algorithmic models is gaining traction as an alternative to using additional LLMs to inspect outputs.

How does this compare to using AI alone for requirements validation?

Most current approaches to validating LLM outputs rely on using additional LLMs to check for consistency and sensibility—a kind of automated peer review. However, that method still produces probabilistic assessments, not proofs. AWS's Requirements Analysis goes a step further by introducing a formal reasoning engine that mathematically demonstrates contradictions. Mike Miller highlights the philosophy: "The LLM side does what it does best, and automated reasoning does what it does best." The LLM excels at understanding and rewriting ambiguous statements; the SMT solver excels at proving logical consistency. This hybrid approach sets a new standard for requirement quality assurance. It also aligns with broader industry trends toward neurosymbolic architectures. As Jason Andersen points out, the success AWS has seen with automated reasoning in IAM is now spreading to other product lines. The result is a more reliable software development lifecycle built on verified foundations, not just likely ones.
