Let AI build the UX test app: a safe place to experiment

I set out with my trusty AI team on a mission to create what engineers rarely prioritise, but QA and CX leaders wish they would: a fully automated test app that mirrors the experience of real users.
So many learnings in such a short time.
Many tech leaders are rightly cautious about letting AI loose on the codebase of their core products. I’m one of them. AI developer agents are seriously powerful, but if you’ve spent a career refining your design patterns and coding standards, you’re not about to hand the keys to a machine. You insist on rigour, maintainability and consistency.
But this kind of test app is different. It’s on the periphery of your product tech stack; it’s not your core IP. It interacts with your application from the outside: through browsers, authentication flows, onboarding sequences. It doesn’t touch core logic or data.
Test apps get revised constantly and are often rebuilt from scratch. And they’re used by your QA team, not your customers. That makes them lower risk, and a good place to experiment.
So I gave my AI team the brief: build a resilient, unattended UX testing app that spins up email accounts, activates users, authenticates to the app, navigates the experience, and logs outcomes with screenshots — all wired into CI.
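To make that brief concrete, here is a minimal sketch of the "navigate, then log outcomes with screenshots" part. It is not the app I built, just one way the idea could look. The names run_journey and StepResult are hypothetical, and page is assumed to be a Playwright Page (only its screenshot(path=...) method is used here).

```python
import logging
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Callable

logger = logging.getLogger("ux_test")


@dataclass
class StepResult:
    name: str
    passed: bool
    screenshot: Path
    error: str = ""


def run_journey(
    page: Any,
    steps: list[tuple[str, Callable[[Any], None]]],
    out_dir: Path,
) -> list[StepResult]:
    """Run named steps in order, screenshot after each, stop at the first failure."""
    out_dir.mkdir(parents=True, exist_ok=True)
    results: list[StepResult] = []
    for index, (name, action) in enumerate(steps, start=1):
        shot = out_dir / f"{index:02d}_{name}.png"
        error = ""
        try:
            action(page)
        except Exception as exc:  # record the failure instead of crashing the run
            error = f"{type(exc).__name__}: {exc}"
        # Assumption: page behaves like a Playwright Page here.
        page.screenshot(path=str(shot))
        results.append(StepResult(name, not error, shot, error))
        logger.info("step=%s passed=%s screenshot=%s", name, not error, shot)
        if error:
            break  # later steps depend on earlier ones, so stop here
    return results
```

In a pytest test, the steps would be small functions such as sign_up(page) or complete_onboarding(page), and the test would assert that every returned result passed. Stopping at the first failure keeps the last screenshot pointed at the screen where the journey actually broke.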
Before any coding, I ran a thorough design session by voice with my AI, as I would with a senior engineer. We covered functional scope, tool selection, edge cases, risks and estimates. I told it to be highly critical of my thinking, and to output a detailed README and a developer spec. We discussed my concern that the DOM didn’t expose enough semantic selectors, and it suggested ways to make form submission more robust.
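I won't reproduce the exact ideas from that session here, but one common way to cope with patchy semantic markup is to try locators from most to least semantic and accept only an unambiguous match. The sketch below assumes page is a Playwright Page. The STRATEGIES list and the fill_field helper are hypothetical names for illustration.

```python
from typing import Any, Callable

# Ordered from most to least semantic. Each entry calls a Playwright
# Page method: get_by_label, get_by_placeholder or locator.
STRATEGIES: list[tuple[str, Callable[[Any, str], Any]]] = [
    ("label", lambda page, name: page.get_by_label(name)),
    ("placeholder", lambda page, name: page.get_by_placeholder(name)),
    ("name attribute", lambda page, name: page.locator(f'[name="{name}"]')),
]


def fill_field(page: Any, name: str, value: str) -> str:
    """Fill the field via the first strategy that matches exactly one element.

    Returns the strategy used, so the test log shows when a fallback kicked in.
    """
    for strategy, find in STRATEGIES:
        locator = find(page, name)
        if locator.count() == 1:  # zero or several matches: try the next strategy
            locator.fill(value)
            return strategy
    raise LookupError(f"no unambiguous match for field {name!r}")
```

Logging the returned strategy is the useful part: a run that quietly falls back from "label" to "name attribute" is an early hint that the markup has changed.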
I find this conversational approach incredibly effective. I even bring in a second agent to critique the first — they get competitive trying to improve on each other’s output.
By mid-morning I had a clear plan. And had time for a quick gym break.
The afternoon’s build work was insanely fast, but deliberate. The core automation with Playwright and pytest was well within the AI’s wheelhouse, and it built out email creation using Testmail in no time. Automating Salesforce user creation was messier: I had to step in, correct course, reject bad code and escalate to different models.
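For a flavour of the email step, here is a rough sketch of two small helpers: one mints a unique inbox per test run, the other pulls the activation link out of a message body. It assumes Testmail's namespace.tag@inbox.testmail.app address scheme as I understand it from their docs, so check the current documentation before relying on it. The function names, the "signup-" tag prefix and the example host are hypothetical, and fetching the message from the Testmail API is left out.

```python
import re
import uuid
from typing import Optional
from urllib.parse import urlsplit

URL_PATTERN = re.compile(r"https://[^\s\"'<>]+")


def new_inbox(namespace: str) -> tuple[str, str]:
    """Return (tag, address) for a fresh inbox, unique per call.

    Assumption: addresses follow {namespace}.{tag}@inbox.testmail.app.
    """
    tag = f"signup-{uuid.uuid4().hex[:12]}"
    return tag, f"{namespace}.{tag}@inbox.testmail.app"


def find_activation_link(body: str, host: str) -> Optional[str]:
    """Return the first https link in the body that points at the given host."""
    for raw in URL_PATTERN.findall(body):
        url = raw.rstrip(".,)")  # drop trailing sentence punctuation
        if urlsplit(url).hostname == host:
            return url
    return None
```

A unique tag per run means parallel CI jobs never read each other's mail, and matching on the host avoids clicking a footer or tracking link by mistake.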
But I loved the pace of it, and stayed in the flow for hours.
By evening, I had a working UX test app: automated from sign-up, through onboarding, to first use. In years past, I’d have spent that day drafting a job description for the QA engineer I needed to hire. This was a very different approach. 🙂
Originally shared on LinkedIn