2023 AI Prototype
Requires the user to describe the scenario to be created
It works by understanding the flow, extracting context from the HTML, and generating test steps that are compatible with gotestpro test steps.
We spent more than one month on this step generation
Success Outcomes:
generated test steps
worked in gotestpro
Failure Outcomes:
inconsistent results: steps don’t always work, and scenarios don’t work
Next steps:
ChatGPT 3.5 has changed, and prompt engineering practice has changed quite a bit
narrow scope to Salesforce Commerce as per conversation with Asif Lala
combine with the ZeroStep prototype
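A rough sketch of what "extracting context from the HTML" could look like, for reference only. This is not the prototype's code: the class name, the function name, the choice of tags, and the output fields are all assumptions made for illustration. It uses only the Python standard library and collects the interactive elements of a page so they could be passed to the model as context.

```python
from html.parser import HTMLParser

# Which tags count as "interactive" is an assumption for this sketch.
INTERACTIVE_TAGS = {"a", "button", "input", "select", "textarea"}


class InteractiveElementCollector(HTMLParser):
    """Collects interactive elements with their attributes and inner text."""

    def __init__(self):
        super().__init__()
        self.elements = []   # one summary dict per interactive element
        self._open = None    # element whose inner text is being collected

    def handle_starttag(self, tag, attrs):
        if tag not in INTERACTIVE_TAGS:
            return
        attr_map = dict(attrs)
        element = {
            "tag": tag,
            "id": attr_map.get("id"),
            "name": attr_map.get("name"),
            "label": attr_map.get("aria-label") or attr_map.get("placeholder") or "",
            "text": "",
        }
        self.elements.append(element)
        # <input> is a void element, so there is no inner text to collect.
        self._open = None if tag == "input" else element

    def handle_data(self, data):
        # Append visible text to the element that is currently open.
        if self._open is not None and data.strip():
            self._open["text"] = (self._open["text"] + " " + data.strip()).strip()

    def handle_endtag(self, tag):
        if self._open is not None and tag == self._open["tag"]:
            self._open = None


def extract_context(html):
    """Return a list of interactive elements found in the HTML string."""
    parser = InteractiveElementCollector()
    parser.feed(html)
    parser.close()
    return parser.elements


sample_html = (
    '<button id="add">Add to cart</button>'
    '<input name="q" placeholder="Search">'
    '<a href="/p/1"><span>Product info</span></a>'
)
context = extract_context(sample_html)
```

Keeping the context to a short list of labelled elements, rather than the full page source, is one way to keep the prompt small and the element labels explicit.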
ZeroStep Prototype
We have a prototype that takes a text request: the user says “click on product info”, and the library interacts with the browser, clicks on the product, and proceeds to the product detail steps.
Success Outcomes:
Generates test steps and executes them in the browser
Can import steps from a CSV file, so a user who already has manual test cases can add them to the sequence
Tested with a sample shoe store e-commerce site (air birds current customer)
Failure Outcomes:
Inconsistent results, especially with menu items or items where labels aren’t clear.
The same request that works one time may not work the next time.
Prompt engineering the text command makes a huge difference.
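The CSV import mentioned under the success outcomes could be sketched roughly as follows. The column names ("step", "instruction") and both function names are hypothetical; the prototype's actual CSV layout is not documented here. The sketch only reads the steps and merges them into an existing sequence of text commands; it does not call the ZeroStep library.

```python
import csv
import io


def load_steps(csv_file):
    """Read manual test steps from a CSV file object, ordered by step number.

    Assumes a header row with the (hypothetical) columns "step" and "instruction".
    """
    steps = []
    for row in csv.DictReader(csv_file):
        instruction = (row.get("instruction") or "").strip()
        if not instruction:
            continue  # skip rows without an instruction
        steps.append({"step": int(row["step"]), "instruction": instruction})
    return sorted(steps, key=lambda s: s["step"])


def insert_steps(sequence, imported, position):
    """Return a new sequence with the imported instructions inserted at position."""
    instructions = [s["instruction"] for s in imported]
    return sequence[:position] + instructions + sequence[position:]


# Example: manual test cases exported to CSV, in arbitrary row order.
manual_csv = io.StringIO(
    "step,instruction\n"
    "2,Add the product to the cart\n"
    "1,Click on product info\n"
)
imported_steps = load_steps(manual_csv)
sequence = insert_steps(
    ["Open the store home page", "Proceed to checkout"],
    imported_steps,
    position=1,
)
```

Each entry in the resulting sequence is still a plain text command, so the wording of the imported steps matters as much as it does for hand-typed commands.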
LLM Cost Performance Benchmarking
The idea is to see whether the current AI setup is cost-effective in a chatbot/LLM application. We want a way to benchmark whether an LLM app is correct AND at what price.
A tool that runs simulated tests against a chatbot
it simulates multiple types of tests:
happy path
confusing questions
inappropriate questions
abort scenarios
it measures chatbot accuracy: did it give correct responses or not?
it also measures number of words and tokens
between test runs you get a price/performance report
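A minimal sketch of how the measurements above could be rolled up into a price/performance report. Everything named here is an assumption for illustration: the ChatResult fields, the report keys, the price_per_1k_tokens parameter, and especially the token estimate, which uses a rough "about 4 characters per token" heuristic instead of a real tokenizer or the usage figures returned by the model API.

```python
from dataclasses import dataclass


@dataclass
class ChatResult:
    category: str   # e.g. "happy path", "confusing", "inappropriate", "abort"
    correct: bool   # did the chatbot give the correct response?
    prompt: str
    response: str


def estimate_tokens(text):
    """Rough token estimate (assumption: about 4 characters per token)."""
    if not text:
        return 0
    return max(1, len(text) // 4)


def build_report(results, price_per_1k_tokens):
    """Summarise accuracy, response words, tokens, and estimated cost for one run."""
    total = len(results)
    correct = sum(1 for r in results if r.correct)
    words = sum(len(r.response.split()) for r in results)
    tokens = sum(
        estimate_tokens(r.prompt) + estimate_tokens(r.response) for r in results
    )
    cost = tokens / 1000 * price_per_1k_tokens
    return {
        "accuracy": correct / total if total else 0.0,
        "words": words,
        "tokens": tokens,
        "cost": cost,
        # Cost per correct answer ties price to performance in one number.
        "cost_per_correct": cost / correct if correct else None,
    }


run = [
    ChatResult("happy path", True, "abcdefgh", "one two three four"),
    ChatResult("confusing", False, "abcd", "no"),
]
report = build_report(run, price_per_1k_tokens=1.0)
```

Running build_report once per test run, with the same test set, would give comparable figures from run to run, for example before and after a prompt or model change.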