Any form of data that we can use to make decisions for writing code, be it requirements, specifications, user stories, and the like, must have certain qualities. In agile development, for example, we have the INVEST qualities. More specifically, a user story must be Independent of all others and Negotiable, i.e., not a specific contract for features. It must be Valuable (or vertical) and Estimable (to a good approximation). It must also be Small to fit within an iteration and Testable (in principle, even if there isn't a test for it yet).
This article goes beyond agile, waterfall, rapid application development, and the like. I will summarize a set of general, foundational qualities as a blueprint for software development.
The fundamental principles of software requirements still apply when we leverage AI for code generation, but their emphasis and application must adapt. This ensures that the AI, which lacks human intuition and context, produces code that is not only functional but also robust, maintainable, and aligned with project constraints.
For each fundamental quality, I first explain its purpose and then discuss its usefulness and applicability when code is generated by AI.
The level of detail at which I want to cover this topic necessitates two articles. This article summarizes what we should do; a follow-up article gives an elaborate example of how we can do it.
Documented
Software requirements must be documented and should not just exist in our minds. Documentation can be as lightweight as we like, as long as it is easy to maintain. After all, documentation's purpose is to be a single source of truth.
When we say requirements must be "Documented" for human developers, we mean they need to be written down somewhere accessible (wiki, requirements doc, user stories, etc.). If they only exist in someone's head or if they are scattered across chat messages, they probably won't be very effective. This ensures alignment, provides a reference point, and helps with onboarding. While lightweight documentation is often preferred (like user stories), there's usually an implicit understanding that humans can fill in gaps through conversation, experience, and shared context.
For AI code generation, the "Documented" quality takes on a more demanding role. The written requirements are often the AI's entire context: the model cannot fill gaps through conversation, experience, or shared history, so whatever is not documented effectively does not exist for it.
Correct
We must understand correctly what is required from the system and what is not required. This may seem simple, but how many times have we implemented requirements that were wrong? The Garbage In, Garbage Out (GIGO) rule applies here.
For AI code generation, correctness matters even more: the AI will faithfully implement whatever it is given, so a wrong requirement becomes wrong code without any human sanity check along the way. GIGO applies with full force.
Complete
This is about having no missing attributes or features. While incomplete requirements are an issue, we humans may infer missing details, ask clarifying questions, or rely on implicit knowledge. That is not always the case, however: requirements may remain incomplete even after hours of meetings and discussions. With AI-generated code, I've seen AI assistants go both ways. In some cases, they generate only what is explicitly stated, and the resulting gaps lead to incomplete features or to the AI making potentially incorrect assumptions. In other cases, the AI assistant spotted the missing attributes and made suggestions.
In any case, for completeness, I think it's still worth being as explicit as we can be. Requirements must detail not just the "happy path" but also error conditions, edge cases, invalid inputs, and expected failure behaviour, as the sketch below illustrates.
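As an illustration, consider a hypothetical requirement for registering a username, spelled out beyond the happy path: 3-20 characters, starting with a letter, containing only letters, digits, or underscores, with a specific error message per violation. A validator that makes each rule explicit might look like this (all names here are illustrative):

```java
// Hypothetical validator for an explicitly "complete" username requirement:
// 3-20 characters, starts with a letter, only letters/digits/underscores.
public final class UsernameValidator {

    public static String validate(String username) {
        if (username == null || username.isEmpty()) {
            return "Username must not be empty";        // explicit null/empty case
        }
        if (username.length() < 3 || username.length() > 20) {
            return "Username must be 3-20 characters";  // explicit length bounds
        }
        if (!Character.isLetter(username.charAt(0))) {
            return "Username must start with a letter"; // explicit first-character rule
        }
        if (!username.matches("[A-Za-z0-9_]+")) {
            return "Username may contain only letters, digits, and underscores";
        }
        return null; // null means the username is valid
    }

    public static void main(String[] args) {
        System.out.println(validate("ok_name")); // null (valid)
        System.out.println(validate("1bad"));    // starts with a digit
        System.out.println(validate(""));        // empty
    }
}
```

Each branch corresponds to a sentence in the requirement; a gap in the requirement would show up here as a missing branch.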
Unambiguous
When we read the requirements, we should all understand the same thing. Ambiguous requirements may lead to misunderstandings, long discussions, and meetings for clarification. They may also lead to rework and bugs. In the worst case, requirements are interpreted differently and we develop something different from what was expected. For example, "the system should respond quickly" can mean anything, whereas "the system must respond within 200 ms for 95% of requests" leaves little room for interpretation. For AI assistants, ambiguity looks particularly dangerous: rather than asking for clarification, an assistant may silently commit to one interpretation.
Consistent
Consistency in requirements means using the same terminology for the same concepts. It means that statements don't contradict each other and maintain a logical flow across related requirements. For human teams, minor inconsistencies can often be resolved through discussion or inferred context. In the worst case, inconsistency can also lead to bugs and rework.
However, for AI code generators, consistency is vital: different names for the same concept may be treated as genuinely different concepts, and contradictory statements may be resolved silently and arbitrarily instead of being flagged.
Testable
We must have an idea about how to test that the requirements are fulfilled. A requirement is testable if there are practical and objective ways to determine whether the implemented solution meets it. Testability is paramount for both human-generated code and AI-generated code. Our confidence must primarily come from verifying code behavior. Rigorous testing against clear, testable requirements is the primary mechanism to ensure that the code is reliable and fit for purpose. Testable requirements provide the blueprint for verification.
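To make this concrete, here is a minimal sketch of how a testable requirement maps to verification. Assume the hypothetical requirement "an order's total is the sum of line price times quantity; an empty order totals zero." JUnit 5 tests derived directly from it might look like this:

```java
import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.assertEquals;

// Hypothetical requirement: "an order's total is the sum of
// line price * quantity; an empty order totals zero".
class OrderTotalTest {

    // Minimal implementation under test, inlined to keep the sketch self-contained.
    static double total(double[][] lines) { // each line: {unitPrice, quantity}
        double sum = 0.0;
        for (double[] line : lines) {
            sum += line[0] * line[1];
        }
        return sum;
    }

    @Test
    void emptyOrderTotalsZero() {
        assertEquals(0.0, total(new double[][] {}));
    }

    @Test
    void totalIsSumOfPriceTimesQuantity() {
        double[][] lines = { {2.50, 2}, {1.00, 3} }; // 5.00 + 3.00
        assertEquals(8.0, total(lines));
    }
}
```

Because the requirement states objective outcomes, each test reads almost word for word like the requirement itself.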
Testability calls for smallness, observability, and controllability. A small requirement here implies that it results in a small unit of code under test. This is where decomposability, simplicity, and modularity become important. Smaller, well-defined, and simpler units of code with a single responsibility are inherently easier to understand, test comprehensively, and reason about than large, monolithic, and complex components. If an AI generates a massive, tangled function, even if it "works" for the happy path, verifying all its internal logic and edge cases is extremely difficult. You can't be sure what unintended behaviours might lurk within. For smallness, decompose large requirements into smaller, more manageable sub-requirements. Each sub-requirement should ideally describe a single, coherent piece of functionality with its own testable outcomes.
Observability is the ease with which you can determine the internal state of a component and its outputs, based on its inputs. This holds true before, during, and after a test execution. Essentially, can you "see" what the software is doing and what its results are? To test, we need to be able to observe behaviour or state. If the effects of an action are purely internal and not visible, testing is difficult. For observability, we need clear and comprehensive logging, exposing relevant state via getters or status endpoints. We need to return detailed and structured error messages, implement event publishing, or use debuggers effectively. This way we can verify intermediate steps, understand the flow of execution, and diagnose why a test might be failing.
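A sketch of what observability can look like in code, with all names hypothetical: the service logs each step, exposes its last state via a getter, and returns a structured result instead of a bare boolean, so a test can "see" what happened and why.

```java
import java.util.logging.Logger;

// Hypothetical payment service sketched for observability.
public class PaymentService {

    public enum State { IDLE, VALIDATING, CHARGED, FAILED }

    public record Result(boolean success, String errorCode, String message) {}

    private static final Logger LOG = Logger.getLogger(PaymentService.class.getName());
    private State lastState = State.IDLE;

    public Result charge(double amount) {
        lastState = State.VALIDATING;
        LOG.info(() -> "Validating charge of " + amount);
        if (amount <= 0) {
            lastState = State.FAILED;
            return new Result(false, "INVALID_AMOUNT", "Amount must be positive: " + amount);
        }
        lastState = State.CHARGED;
        LOG.info(() -> "Charged " + amount);
        return new Result(true, null, "OK");
    }

    // Observability hook: tests and monitors can inspect the internal state.
    public State getLastState() {
        return lastState;
    }
}
```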
Controllability is the ease with which we can "steer" a component into specific states or conditions. How easily can we provide a component with the necessary inputs (including states of dependencies) to execute a test and isolate it from external factors that are not part of the test? We can achieve this through techniques like dependency injection (DI), designing clear APIs and interfaces, using mock objects or stubs for dependencies, and providing configuration options. This allows us to easily set up specific scenarios, test individual code paths in isolation, and create deterministic tests.
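And a matching controllability sketch, again with hypothetical names: the exchange-rate dependency is injected through an interface, so a test can substitute a stub and steer the component into any scenario, including failures, without touching a real service.

```java
// Controllability through dependency injection: the dependency is an
// interface, so tests can supply a stub instead of the real implementation.
interface RateProvider {
    double rateFor(String currency);
}

class PriceConverter {
    private final RateProvider rates;

    PriceConverter(RateProvider rates) { // injected, not hard-coded
        this.rates = rates;
    }

    double toEuro(double amount, String currency) {
        return amount * rates.rateFor(currency);
    }
}

class PriceConverterDemo {
    public static void main(String[] args) {
        // A stub puts the component into exactly the state the test needs:
        RateProvider stub = currency -> 2.0; // deterministic, no network
        System.out.println(new PriceConverter(stub).toEuro(10.0, "USD")); // 20.0

        // Error conditions are just as easy to simulate:
        RateProvider failing = currency -> { throw new IllegalStateException("rate service down"); };
        try {
            new PriceConverter(failing).toEuro(10.0, "USD");
        } catch (IllegalStateException e) {
            System.out.println("Simulated failure: " + e.getMessage());
        }
    }
}
```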
In contrast, several common design choices undermine testability.
Hard-Coded Dependencies
Hard-coded dependencies can force you to test your unit along with its real dependencies. This turns unit tests into slow, potentially unreliable integration tests. You can't easily simulate error conditions from the dependency.
Reliance on Global State
If a component reads or writes to global variables or singletons, it's hard to isolate tests. One test might alter the global state, causing subsequent tests to fail or behave unpredictably. Resetting the global state between tests can be complex.
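A minimal sketch of the problem, with hypothetical names: two "tests" share a singleton counter, so their outcome depends on execution order.

```java
// A global singleton makes tests order-dependent: whichever test runs
// first changes what the second one observes.
class GlobalCounter {
    static int value = 0; // shared, mutable global state

    static int increment() {
        return ++value;
    }
}

class GlobalStateDemo {
    public static void main(String[] args) {
        // "Test" A and "test" B both assume they start from zero.
        System.out.println(GlobalCounter.increment()); // 1
        System.out.println(GlobalCounter.increment()); // 2, not 1:
        // B's expectation silently depends on whether A ran before it,
        // so the suite passes or fails based on execution order.
    }
}
```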
Lack of Clear Input Mechanisms
If a component's behaviour is triggered by intricate internal state changes or relies on data from opaque sources rather than clear input parameters, it's difficult to force it into the specific state needed for a particular test.
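For instance, in this hypothetical sketch, a component that reads its input from an opaque source is hard to steer, while the parameterized version can be driven directly by a test.

```java
// Hard to control: the input comes from an opaque source (an environment
// variable), so a test cannot easily force a specific scenario.
class OpaqueGreeter {
    String greet() {
        String name = System.getenv("GREETER_NAME"); // set outside the test's control
        return "Hello, " + (name == null ? "stranger" : name);
    }
}

// Easy to control: the same behaviour driven by an explicit parameter.
class ParameterizedGreeter {
    String greet(String name) {
        return "Hello, " + (name == null ? "stranger" : name);
    }
}

class GreeterDemo {
    public static void main(String[] args) {
        System.out.println(new ParameterizedGreeter().greet("Ada")); // deterministic
    }
}
```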
Consequences
The consequences of these design choices are slow and flaky test suites, poor coverage of error paths, and code whose behaviour is hard to verify, whether it was written by a human or generated by an AI.
Traceable
Traceability in software requirements means being able to follow the life of a requirement both forwards and backwards. You should be able to link a specific requirement to the design elements, code modules, and test cases that implement and verify it. Conversely, looking at a piece of code or a test case, you should be able to trace it back to the requirement(s) it fulfills. Traceability tells us why that code exists and what business rule or functionality it's supposed to implement. Without this link, code can quickly become opaque "magic" that developers are hesitant to touch.
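One lightweight way to keep that link is sketched below, where the @Requirement annotation and the REQ-142 identifier are purely illustrative:

```java
import java.lang.annotation.Documented;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// An illustrative annotation that ties code and tests back to a
// requirement identifier, making the link machine-readable.
@Documented
@Retention(RetentionPolicy.RUNTIME)
@interface Requirement {
    String value(); // e.g., the ID in the requirements tracker
}

class DiscountService {

    @Requirement("REQ-142") // hypothetical: "orders above 100 EUR get a 5% discount"
    double applyDiscount(double total) {
        return total > 100.0 ? total * 0.95 : total;
    }
}
```

Anyone reading applyDiscount can trace it back to REQ-142, and tooling can report which requirements have no corresponding code or tests.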
Viable
A requirement is "Viable" if it can realistically be implemented within the project's given constraints. These constraints typically include available time, budget, personnel skills, existing technology stack, architectural patterns, security policies, industry regulations, performance targets, and the deployment environment.
Wrapping Up
When writing requirements for AI-generated code, the fundamental principles remain, but the emphasis shifts towards explicit completeness, strict unambiguity, rigorous testability, and documentation that can serve as the AI's sole context.
In essence, writing requirements for AI code generation means being more deliberate, detailed, and directive. It's about providing the AI with a high-fidelity blueprint that minimizes guesswork and maximizes the probability of generating correct, secure, efficient, and maintainable code that aligns with project goals and technical standards. This involves amplifying the importance of qualities like completeness, unambiguity, and testability, and evolving the interpretation of understandability to suit an AI "developer."
Currently, it seems that carefully crafting software requirements can also reduce hallucinations in AI-generated code. However, requirements alone are not expected to eliminate hallucinations entirely. The quality and structure of the input prompt (including the requirements) significantly influence how prone the AI is to hallucinate details, but hallucinations also stem from model limitations, training data artifacts, and prompt-context boundaries. Such factors are beyond the scope of this article.