AI Testcase Designer - Assisted Testcase Authoring using LLMs
Test case development for complex automotive systems is a tedious job because of the vast number of requirements, dependencies, and strict compliance standards that must be satisfied. Manually creating these test cases is time-consuming, error-prone, and often leads to gaps in coverage. In this article, we will see how we can leverage LLMs to automate the process.
Introduction
In the automotive domain, software development is tightly governed by standards like ASPICE, where maintaining strict traceability between requirements and acceptance test cases is a core compliance obligation, as shown in the diagram below.
Given the scale and complexity of modern automotive platforms, the number of software and system-level requirements is often substantial, running to over ~5000 across various ECUs and subsystems. These are generally created and managed in ALM (Application Lifecycle Management) software such as JIRA, Polarion, and Codebees.
Problems in Manual Testcase Development
Manually developing test cases in projects with a large number of frequently changing requirements creates persistent, systemic problems. Teams routinely face coverage gaps, duplicated effort, and inconsistent test styles across modules or groups. Because requirements evolve through design updates, optimizations, and bug fixes, test cases must be continuously revised so each change is promptly and correctly validated.
Keeping track of which tests map to which requirements becomes increasingly difficult as the requirement set grows, and manual coordination quickly becomes a maintenance burden. At scale, this slows verification, increases the risk of missed regressions, and undermines compliance in safety‑critical systems.
In short: manual test‑case development struggles with scale, churn, and consistency—posing a significant bottleneck and risk to quality.
Solution: Test Case Generation with LLMs
LLMs can understand natural language requirements and generate structured, consistent test cases automatically.
By leveraging LLMs, we can automate the generation of test cases directly from requirements available in ALMs (like JIRA) and upload the generated test cases with traceable links. This ensures coverage as the requirements evolve, eliminates manual errors, and keeps test cases up to date with the requirements.
LLMs also help ensure full coverage, consistent formatting, and faster updates compared to traditional manual methods.
For testcase generation with LLMs, we need two primary components:
- Model
- Documents (context)
Choosing a Model
There are many model providers, and models range from ones that run on a local machine to ones hosted in data centers. But not all LLMs are created equal: choosing the best model is itself a significant task, and the choice directly determines the quality of the generated testcases.
To find the best model, we use our own LLM evaluation program. This program compares each LLM-generated testcase with Golden testcases through an LLM judge (a state-of-the-art reasoning model such as GPT-5 or DeepSeek) and provides key metrics for test case generation. The Golden testcases are manually curated, verified-as-correct testcases that act as the ground truth for comparison against the LLM-generated output.
Here are those key metrics:
- Requirement Understanding: ability to correctly parse and interpret technical specifications
- Domain Knowledge: grasp of automotive terminology and system behaviors
- Test Coverage: completeness of generated test scenarios
- Consistency: uniformity in test case structure and terminology
- Performance: speed and resource usage during generation
For each model, we compare the model-generated testcases with the Golden testcases using the LLM judge on:
- Sample requirements with known test cases
- Domain-specific terminology checks
- Complex edge case scenarios
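To make this concrete, here is a minimal sketch of the judge-based comparison using LangChain structured output (see References). The judge model name, the JudgeScores schema, and the prompt wording are illustrative assumptions; the actual evaluation program and rubric may differ.

```python
# Hedged sketch of the LLM-judge comparison; schema, model name and prompt
# wording are assumptions for illustration, not the exact evaluation program.
from langchain_openai import ChatOpenAI
from pydantic import BaseModel, Field

class JudgeScores(BaseModel):
    """Scores returned by the judge for one generated testcase (1 = poor, 5 = excellent)."""
    requirement_understanding: int = Field(ge=1, le=5)
    domain_knowledge: int = Field(ge=1, le=5)
    test_coverage: int = Field(ge=1, le=5)
    consistency: int = Field(ge=1, le=5)
    remarks: str

# A state-of-the-art reasoning model acts as the judge (model name is an assumption).
judge = ChatOpenAI(model="gpt-5").with_structured_output(JudgeScores)

def evaluate(requirement: str, golden: str, candidate: str) -> JudgeScores:
    """Score a model-generated testcase against the Golden testcase for one requirement."""
    prompt = (
        "You are reviewing automotive acceptance test cases.\n\n"
        f"Requirement:\n{requirement}\n\n"
        f"Golden test case (ground truth):\n{golden}\n\n"
        f"Candidate test case:\n{candidate}\n\n"
        "Rate the candidate from 1 to 5 on requirement understanding, domain "
        "knowledge, test coverage and consistency, and add short remarks."
    )
    return judge.invoke(prompt)
```

The per-metric scores are then aggregated across the sample requirements to rank the candidate models; generation speed and resource usage are measured separately during the runs.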
Documents: Context for the LLM
Since LLMs don’t inherently know project information, we must feed them structured context:
- Requirements documents: functional specs, user stories, safety conditions
- Datasheets and ECU interface definitions
- Sample test cases or templates, exemplifying your desired syntax and style
- Domain-specific knowledge, including terminology, signal names, and ECU behaviors
- Test setup description, including hardware configuration, tools, and test environment details
Note: If requirements change, you just update the document; the Test Case Designer will then regenerate test cases aligned to the changes at the click of a button.
Technique Used: Cache-Augmented Generation (CAG)
To optimize both performance and contextual relevance, we adopted Cache-Augmented Generation (CAG), sketched in the code after this list:
- The project documents, such as the ALM connection, domain knowledge, datasheets, test setup description, and sample testcases, are fed into the LLM's context.
- The sample testcases act as few-shot examples in the prompt, so test cases can be generated without fine-tuning.
- The output is then post-processed into structured testcases.
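Below is a minimal sketch of this generation step, assuming a Groq-hosted model and LangChain structured output. The file names, the TestCase schema, and the prompt layout are assumptions for illustration; in the tool they come from the uploaded documents and the configured model.

```python
# Hedged sketch of CAG-style testcase generation; file names, schema and
# prompt layout are assumptions made for illustration.
from pathlib import Path
from langchain_groq import ChatGroq
from pydantic import BaseModel

class TestCase(BaseModel):
    id: str
    title: str
    preconditions: list[str]
    steps: list[str]
    expected_results: list[str]

class TestCaseSet(BaseModel):
    testcases: list[TestCase]

# Project documents are loaded once and kept in the prompt context (the "cache").
CONTEXT_FILES = ["domain_knowledge.md", "datasheet.md", "test_setup.md", "sample_testcases.md"]
context = "\n\n".join(Path(name).read_text() for name in CONTEXT_FILES)

llm = ChatGroq(model="llama-3.3-70b-versatile")       # whichever model is chosen in the UI
generator = llm.with_structured_output(TestCaseSet)   # structured output instead of free text

def generate_testcases(requirement: str, instruction: str = "") -> TestCaseSet:
    """Generate structured testcases for one requirement using the cached context."""
    prompt = (
        f"Project context, including sample test cases to imitate:\n{context}\n\n"
        f"Requirement:\n{requirement}\n\n"
        f"Additional instruction: {instruction}\n\n"
        "Generate the acceptance test cases for this requirement."
    )
    return generator.invoke(prompt)
```

Because the documents sit in the context rather than in an external retrieval index, updating a datasheet or sample testcase is enough to change the next generation run, which is what enables the one-click regeneration mentioned above.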
AI Testcase Designer: User Interface
For better usability and for demo purposes, we designed a simple GUI using Streamlit.
Testcase Generation
- Model Selector: Choose the LLM provider and model (e.g., Groq, llama-3.3-70b-versatile).
- Config: Configure ALM and LLM credentials such as URLs and tokens.
- Document Upload: Drag and drop datasheets and sample testcases; these form the model's context.
- Fetch: Select the requirements for which to generate testcases.
- Instruction: Add specific prompts (e.g., consider a specific corner case).
- Generate Button: Triggers test-case generation.
Once the Generate button is clicked, the Test Case Designer generates testcases for the selected requirements and displays them in the Review window.
If test case generation is not satisfactory, the testcases can be regenerated with a Refinement Hint.
- Preview Panel: Displays structured test cases (ID, title, steps, expected results).
- Refine Hint (optional): Apply quick adjustments (e.g., “Add edge-case”).
- Review: Review and update individual testcases.
- Table Download: Export testcases in Excel format for offline review.
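As a rough illustration of how little UI code this screen needs, here is a minimal Streamlit sketch. The widget labels are assumptions, and generate_testcases() is the hypothetical helper from the CAG sketch above; the real tool wires the selected model, credentials, and uploaded documents into that call.

```python
# Minimal Streamlit sketch of the generation screen; labels are illustrative,
# and generate_testcases() is the hypothetical helper from the CAG sketch.
import streamlit as st

st.title("AI Testcase Designer")

model = st.selectbox("Model", ["llama-3.3-70b-versatile", "other-model"])
docs = st.file_uploader("Documents (datasheets, sample testcases)", accept_multiple_files=True)
requirement = st.text_area("Requirement (fetched from the ALM or pasted)")
instruction = st.text_input("Instruction (optional)",
                            placeholder="e.g. consider a specific corner case")

if st.button("Generate"):
    # In the real tool, `model` and `docs` drive the LLM choice and context cache.
    result = generate_testcases(requirement, instruction)
    for tc in result.testcases:
        with st.expander(f"{tc.id} - {tc.title}"):
            st.write("Steps:", tc.steps)
            st.write("Expected results:", tc.expected_results)
```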
Testcase Review
This window enables the user to review individual testcases and to approve or further refine each test case using review comments.
It also provides selection options to merge selected testcases into a single testcase without losing their purpose.
Once the review is completed, the Save and upload all button can be used to upload all the testcases to the configured ALM tool.
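For a JIRA-based ALM, the upload step can be as simple as one REST call per testcase. The sketch below is a hedged illustration: the issue type, field mapping, and authentication scheme are assumptions and depend on the ALM configuration (for example, Xray or Zephyr test issues use additional fields), and traceability links back to the originating requirement would be added through the ALM's issue-linking API.

```python
# Hedged sketch of the "Save and upload all" step for a JIRA-based ALM.
# Issue type, field mapping and auth scheme are assumptions; adapt per project.
import requests

def upload_testcase(base_url: str, token: str, project_key: str, tc) -> str:
    """Create one test issue in JIRA and return its issue key."""
    payload = {
        "fields": {
            "project": {"key": project_key},
            "issuetype": {"name": "Test"},   # assumes a 'Test' issue type exists
            "summary": f"{tc.id}: {tc.title}",
            "description": "Steps:\n" + "\n".join(tc.steps)
                           + "\n\nExpected results:\n" + "\n".join(tc.expected_results),
        }
    }
    resp = requests.post(
        f"{base_url}/rest/api/2/issue",
        json=payload,
        headers={"Authorization": f"Bearer {token}"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["key"]  # new issue key, used to link back to the requirement
```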
Concluding Notes
In summary, the Test Case Designer streamlines the test authoring workflow — fetching requirements from the ALM, generating and refining test cases, and synchronizing them back seamlessly. By leveraging modern LLMs to interpret requirements, user manuals, and test setup descriptions, it significantly reduces the effort of creating and maintaining test cases.
With its intelligent context handling and tight ALM integration, the tool helps teams produce high-quality, traceable test cases at scale. It eliminates manual bottlenecks, keeps QA aligned with evolving requirements, and ensures consistent, compliant outputs — accelerating delivery and enhancing quality. If you are exploring how AI and LLMs can enhance your test case creation and overall QA processes, we would be happy to discuss how our expertise can support you in achieving that.
With our extensive experience working with LLMs and over a decade of deep expertise in building test automation solutions for automotive and embedded systems, we can help design a custom AI tool for test case authoring and reviewing, tailored to your product needs. Reach out to us at sales@zilogic.com to explore further.
References
- Streamlit - https://streamlit.io
- LangChain overview - https://python.langchain.com/docs/introduction/
- LangChain structured output - https://python.langchain.com/docs/concepts/structured_outputs/
- Few-shot prompting - https://www.ibm.com/think/topics/zero-shot-prompting