AI For Testing: A Comparison of Unit Testing Tools

Zartis Team
AI, Software Development

In today’s fast-paced world of software development, efficient and rapid testing has become more important than ever. Organizations are under pressure to deliver high-quality applications within tight deadlines. To meet this challenge, Artificial Intelligence (AI) has emerged as a game-changer in the realm of software testing. By leveraging AI for testing, developers and QAs can streamline their testing processes and significantly improve the accuracy and reliability of their results.

In this article, we take a closer look at three AI tools that can help you increase unit test coverage faster. We evaluate the strengths and weaknesses of each, with the goal of helping you select the tool best suited to the testing activities in your ecosystem. Along the way, we look at how these technologies can make testing more efficient, surface potential issues earlier, and ultimately help teams deliver software more quickly. Let’s get started!

GitHub Copilot

GitHub Copilot is a tool that can assist in code generation and suggest code snippets based on comments and descriptions in your code. However, it does not automatically create unit tests or guarantee the quality of your tests.

Below is a high-level overview of how it works:

Pros:

GitHub Copilot can help with generating unit test code, but it can’t create comprehensive tests automatically because it lacks knowledge of your application’s behavior. You need to provide sufficient information about what to test and how to test it.
An AI-based vulnerability prevention system blocks insecure code patterns in real time, making GitHub Copilot suggestions more secure. It targets the most common vulnerable coding patterns, including hardcoded credentials, SQL injections, and path injections. This matters because the underlying model was trained on publicly available code, so its training set may contain insecure coding patterns, bugs, or references to outdated APIs or idioms.
When you use GitHub Copilot, your code and comments are sent to the GitHub Copilot service. This service needs to understand the context of the file you are editing, as well as other related files in your project. It may also collect information about the repositories or file paths you are using to help identify the context. You can choose to disable some of the data collection in the telemetry settings, but other parts, such as usage information, are necessary for Copilot to work and cannot be turned off.
It can explain why it suggests certain code (see the section called “Explain code”).
Data Set Generation: Copilot can assist in generating data sets for unit tests, simplifying the process of creating test cases with various inputs and scenarios. This is especially valuable for testing different data conditions and edge cases.
Test Setup and Teardown: It can assist in setting up the test environment and tearing it down after the tests are complete, simplifying the testing process.
Test Naming and Organization: It can help you name and organize your tests effectively, making your test suite easier to manage and maintain. The suggested test names are generally understandable, although there is room for improvement.
Mocking and Stubs: Copilot can generate code for creating mock objects and stubs for dependencies, which is crucial for isolating the code under test.
TestPilot (a feature included in the experimental Copilot Labs extension) can help you generate readable unit tests with meaningful assertions for your JavaScript/TypeScript code.
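To make the data set, setup and teardown, and mocking items in the list above concrete, here is a minimal sketch of the kind of test you might steer an assistant towards. It is hand-written for illustration and is not actual Copilot output. `PriceService`, its `convert` method, and the `get_rate` dependency are hypothetical names invented for this example.

```python
import unittest
from unittest.mock import Mock


class PriceService:
    """Hypothetical class under test: converts an amount using a rate provider."""

    def __init__(self, rate_provider):
        self.rate_provider = rate_provider

    def convert(self, amount, currency):
        if amount < 0:
            raise ValueError("amount must be non-negative")
        return round(amount * self.rate_provider.get_rate(currency), 2)


class PriceServiceTest(unittest.TestCase):
    def setUp(self):
        # Test setup: a mock replaces the real rate provider, so the test
        # is isolated from any network or database dependency.
        self.rate_provider = Mock()
        self.rate_provider.get_rate.return_value = 1.5
        self.service = PriceService(self.rate_provider)

    def tearDown(self):
        # Test teardown: clear the recorded calls on the mock.
        self.rate_provider.reset_mock()

    def test_convert_with_data_set(self):
        # Data set: (amount, expected) pairs, including the zero edge case.
        cases = [(0, 0.0), (1, 1.5), (10, 15.0), (2.5, 3.75)]
        for amount, expected in cases:
            with self.subTest(amount=amount):
                self.assertEqual(self.service.convert(amount, "EUR"), expected)

    def test_convert_queries_rate_for_currency(self):
        self.service.convert(10, "EUR")
        self.rate_provider.get_rate.assert_called_once_with("EUR")

    def test_negative_amount_is_rejected(self):
        with self.assertRaises(ValueError):
            self.service.convert(-1, "EUR")
```

Saved to a file, the tests can be run with `python -m unittest <file>`. Whatever the assistant generates, the developer still decides which cases belong in the data set.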

Cons:

You will still have to think of edge cases to test and how to test them.
It is much harder for the AI to come up with a novel solution it has never seen than to interpolate or extrapolate from solutions it has seen before.
Generated code might have performance issues. You may have to ask for optimization explicitly in your prompt.
Sometimes it may suggest irrelevant or vulnerable code, or code with syntax errors, because it lacks sufficient context about the project and user behavior. As developers, we also need to consider how edge cases are covered in unit tests.
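Since a suggestion can contain a vulnerable pattern such as the SQL injection mentioned earlier, it helps to know what to look for during review. The sketch below uses Python's built-in `sqlite3` module with an in-memory database. The `users` table and both function names are assumptions made for this example, not output from any tool.

```python
import sqlite3


def find_user_unsafe(conn, name):
    # Vulnerable pattern: user input is concatenated into the SQL string.
    query = "SELECT id FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Safer pattern: the driver binds the value as a parameter.
    return conn.execute("SELECT id FROM users WHERE name = ?", (name,)).fetchall()


def make_db():
    # In-memory database with two hypothetical users.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    return conn


conn = make_db()
payload = "nobody' OR '1'='1"

leaked = find_user_unsafe(conn, payload)  # the injected OR clause matches every row
blocked = find_user_safe(conn, payload)   # the payload is treated as a plain name
```

A unit test that feeds a payload like this to the generated query code is a cheap way to check that a suggestion uses parameter binding.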

ChatGPT 4.0

ChatGPT is a language model announced by OpenAI on November 30, 2022. While the primary purpose of an AI chatbot is to imitate human conversation, ChatGPT is highly versatile and can perform a wide range of tasks such as coding and debugging software, providing responses to questions, and more.

Pros:

Test Quality: While ChatGPT can generate test cases, the quality and effectiveness of the tests depend on the clarity and completeness of the instructions given to the model. It’s essential to ensure that the generated tests adequately cover the desired scenarios.
ChatGPT is a useful guide when you are working with unfamiliar frameworks or libraries.
Scenarios: The scenarios chosen for unit tests are good. They include boundary testing and strike a good balance, covering enough variations without including unnecessary cases.
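As an illustration of what boundary testing with a balanced set of scenarios can look like, here is a small sketch. The `ticket_price` function and its age tiers are hypothetical, chosen only to give the test some boundaries to probe. The test itself is hand-written and not actual ChatGPT output.

```python
import unittest


def ticket_price(age):
    """Hypothetical function under test: price tiers by age."""
    if age < 0:
        raise ValueError("age must be non-negative")
    if age < 12:
        return 5
    if age < 65:
        return 10
    return 7


class TicketPriceBoundaryTest(unittest.TestCase):
    def test_values_on_each_side_of_every_boundary(self):
        # One value just below and one at each tier boundary, nothing more.
        cases = [(0, 5), (11, 5), (12, 10), (64, 10), (65, 7)]
        for age, expected in cases:
            with self.subTest(age=age):
                self.assertEqual(ticket_price(age), expected)

    def test_below_lower_bound_raises(self):
        with self.assertRaises(ValueError):
            ticket_price(-1)
```

Five values plus one error case cover every tier and every boundary, and adding more ages inside a tier would not exercise any new behavior.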

Cons:

False Positives: Generated test cases may sometimes produce false positives or tests for unintended behaviors.
ChatGPT generates its responses by following patterns and examples in its training data. However, this can sometimes lead to inaccuracies or inconsistencies in the code suggestions due to the limitations of the training process.
Structure/code: It might repeat code.
Code mutation: While fixing one part of the code, it sometimes changes another part that was previously secure, or even rewrites the code using a different framework from the one originally requested.
ChatGPT does not ensure secure coding or conduct security assessments on code. Version 4.0 is slightly more robust than its 3.5 predecessor. However, expert oversight is still needed to correct flaws in the code.
ChatGPT is a cloud-based service provided by OpenAI, and it currently does not support interface VPC endpoints. This means that you cannot connect to ChatGPT directly from a private network without exposing your traffic to the public internet.
ChatGPT is available within IDEs only through unofficial third-party plugins, which import ChatGPT’s chat interface into your IDE but do not add code context. Finding relevant code snippets and pasting them into the chat is tedious.
ChatGPT falls short in terms of security. Be careful about what code you input into ChatGPT. Avoid giving ChatGPT access to any code that you do not want to be shared with third parties. Consider using a different AI model if you are concerned about the privacy of your code. There are instances where information about your code may be shared without explicit consent, such as when legally required or deemed necessary to safeguard ChatGPT’s interests or those of others.
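The risk of tests for unintended behaviors, noted in the list above, is easiest to see with a small example. In this sketch, `apply_discount` is a hypothetical function with a deliberate bug. The first test mirrors what the code does, as a test derived from the implementation might, and the second states what the code should do.

```python
def apply_discount(price, percent):
    """Hypothetical function with a deliberate bug: it adds instead of subtracting."""
    return price + price * percent / 100


def generated_style_test():
    # Derived from the implementation: passes, and so locks in the bug.
    assert apply_discount(100, 10) == 110


def specification_test():
    # Derived from the requirement "10% off 100 is 90": fails, exposing the bug.
    assert apply_discount(100, 10) == 90
```

Both functions are valid tests, but only the second one protects the intended behavior, which is why generated assertions need to be checked against the requirements and not only against a green test run.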

Amazon CodeWhisperer

CodeWhisperer is an AI coding companion that generates real-time, single-line or full-function code suggestions in your integrated development environment (IDE) to help you quickly build software. When it comes to using AI for testing, CodeWhisperer analyses the English language comments and surrounding code to infer what code is needed to complete the task at hand.

Pros:

Better integration with AWS services: CodeWhisperer focuses on first-class support for AWS APIs, providing code recommendations across the most popular services, including Amazon Elastic Compute Cloud (EC2), AWS Lambda, and Amazon Simple Storage Service (S3). This can be helpful if you’re new to AWS or to the cloud in general.
CodeWhisperer prioritizes code security by incorporating a security scanner that can detect hard-to-find issues, including those in the Open Worldwide Application Security Project (OWASP) Top 10.
Data from Professional users is not retained. Users on the free Individual tier should know that their code snippets and comments may be retained for service improvement, depending on their settings, although they can opt out.
Amazon CodeWhisperer primarily looks at the context of the active file and its dependencies. It can also access and incorporate information from project-level files, such as configuration files and global variables, to provide more comprehensive and context-aware suggestions.
There are two ways to generate code: either step by step with specific comments, or in bigger chunks by giving multi-step instructions.
Sample Data: CodeWhisperer can generate sample data following the structure you need in your code (CodeWhisperer won’t be as helpful with complex structures).
It can whip up unit tests to set up testing situations for your code. While the generated test cases may be pretty basic, they’ll help you get some essential test coverage for your code in less time. Additionally, it includes logic for most of the specific unit tests needed and even suggests edge cases that might have been overlooked.
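The two prompting styles and the sample data point from the list above can be sketched as follows. The comments play the role of prompts. The function bodies are hand-written to show what a plausible completion looks like and are not actual CodeWhisperer output. The `key = value` config format and all names are assumptions made for this example.

```python
# Style 1: step by step. Each comment is a small prompt for one function.

# Function to parse a "key = value" line into a (key, value) tuple
def parse_line(line):
    key, _, value = line.partition("=")
    return key.strip(), value.strip()


# Style 2: one multi-step instruction for a larger chunk of code.

# Function that: 1) skips blank lines and lines starting with "#",
# 2) parses the remaining lines as key = value, 3) returns a dict
def parse_config(text):
    config = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        key, value = parse_line(stripped)
        config[key] = value
    return config


# Sample data with the structure the code expects
SAMPLE_CONFIG = """
# database settings
host = localhost
port = 5432
"""

parsed = parse_config(SAMPLE_CONFIG)
```

Smaller, specific comments tend to give you more control over each piece, while a multi-step instruction saves typing at the cost of having more generated code to review at once.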

Cons:

Code snippet suggestions can be less comprehensive.
Does not explain why it suggests certain code.
Supports a narrower range of programming languages (15 popular languages, 5 at full support and 10 at a lesser degree of support).
Code Invention: When using third-party libraries or SDKs, or even custom classes, CodeWhisperer sometimes suggests code that does not correspond to any existing element in those libraries. This can cause compilation errors.
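One inexpensive guard against invented code is to check that the names a suggestion relies on really exist before building on it. The helper below is a minimal sketch using only the Python standard library. `missing_attributes` is a hypothetical helper name, and the `json` module stands in for whichever library the suggestion uses.

```python
import importlib


def missing_attributes(module_name, attribute_names):
    """Return the names that do not exist on the imported module."""
    module = importlib.import_module(module_name)
    return [name for name in attribute_names if not hasattr(module, name)]


# Suppose a suggestion called json.loads, json.dumps and json.parse.
# The first two exist in Python's json module, the third does not.
invented = missing_attributes("json", ["loads", "dumps", "parse"])
```

In a compiled language the build fails on such calls. In a dynamic language like Python the error only appears when the line runs, so a unit test that exercises the suggested code path serves the same purpose.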

A Comparison of AI-Powered Unit Testing Tools:

	GitHub Copilot	ChatGPT	Amazon CodeWhisperer
Supported IDEs	Visual Studio Code, Visual Studio, JetBrains IDEs, and Vim/Neovim.	Third-party integrations only	VS Code and the JetBrains family of IDEs, AWS Cloud9, the Lambda console, JupyterLab, and SageMaker Studio.
Programming languages	Works especially well for Python, JavaScript, TypeScript, Ruby, Go, C# and C++. Also supported: C, Java, PHP, Scala.	Versatile	Works especially well for Python, Java, JavaScript, TypeScript, and C#. Narrower range overall (15 popular languages, 5 at full support and 10 at a lesser degree of support).
Stability	GitHub Copilot chat in Beta for Visual Studio and VS Code.	Stable	Stable
Supports interface VPC endpoints	Yes	No	Yes
Reference tracker for open-source code (repositories and licenses)	Yes	No	Yes
Stated Security & Privacy Policies	No SOC 2 Compliance. You can configure how GitHub uses your telemetry data. GitHub Copilot Business does not retain any Prompts or Suggestions.	No SOC 2 Compliance. Chat history can be disabled. Unsaved chats will be deleted from the systems within 30 days.	Opt-out for code snippet telemetry. No SOC 2 Compliance
Pricing	Copilot Business is $19 USD per user per month and is only available for companies with GitHub Enterprise. Copilot for Individuals is $10 USD per month or $100 USD per year.	Subscription fee of $20 per month (usage capped at 50 messages every three hours). There is also a pay-as-you-go option for ChatGPT 4: $0.03 per 1000 prompt tokens and $0.06 per 1000 sampled tokens.	Individual plan is free (preview version), Professional plan at $19/user/month.
Latency	Good	Good	Can improve
Launched	February 7, 2023	November 30, 2022	April 13, 2023
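To put the pay-as-you-go rates from the pricing row in perspective, here is a short worked calculation. The rates and the $20 subscription come from the table. The workload of 1500 prompt tokens and 500 sampled tokens per request is a purely hypothetical assumption, so substitute your own numbers.

```python
from decimal import Decimal

PROMPT_RATE = Decimal("0.03")   # USD per 1000 prompt tokens (from the table)
SAMPLED_RATE = Decimal("0.06")  # USD per 1000 sampled tokens (from the table)
SUBSCRIPTION = Decimal("20")    # USD per month (from the table)


def request_cost(prompt_tokens, sampled_tokens):
    """Cost in USD of one pay-as-you-go request."""
    return (Decimal(prompt_tokens) * PROMPT_RATE
            + Decimal(sampled_tokens) * SAMPLED_RATE) / 1000


# Hypothetical workload: 1500 prompt tokens and 500 sampled tokens per request.
cost = request_cost(1500, 500)           # 0.045 + 0.030 = 0.075 USD
break_even = int(SUBSCRIPTION / cost)    # whole requests that fit into 20 USD
print(f"{cost} USD per request, {break_even} requests for the subscription price")
```

Under these assumptions a request costs $0.075, so the $20 subscription price buys 266 pay-as-you-go requests per month. Below that volume, pay-as-you-go is the cheaper option.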

Author

Carolina Bonet is a passionate senior QA Engineer. She has strong experience in test automation, working with various languages including C#, TS and JS, and frameworks such as K6, Selenium and Cypress. She has actively contributed to the improvement of QA strategies, optimizing company processes for enhanced product quality. Carolina is known for her passion for learning and adapting to new technologies.
