How Senior QA Engineers Are Actually Using AI at Work

Many recruiters are still trying to figure out how to vet automation skills.

Now add AI into the mix.

So I asked a senior quality engineer (QE) how they’re actually using tools like Claude Code and Cursor day to day. The answer wasn’t “it writes code for me.”

It was:

• Writing test plans, test strategies, test cases, one-pagers, and automation roadmaps
• Analyzing repositories and documenting frameworks
• Creating and modifying Cypress and Playwright tests
• Debugging failures and tracing root causes
• Validating implementation decisions before involving engineers

The common theme: AI handles structure and execution. The QE provides context and judgment.

Then we got more specific and looked at Cursor.

How Cursor is actually being used

Cursor is being used for three core workflows: research, test case creation and modification, and running tests to debug failures.

1. Research (repo understanding)

Instead of manually digging through a codebase, Cursor is used to:

• Scan entire repositories
• Break down what the automation framework does
• Surface structure, patterns, and dependencies
• Generate high-level documentation
• Create summaries for team or leadership discussions

This turns a new or unfamiliar repo into something you can reason about quickly, instead of slowly reverse-engineering it.
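
To make the research step concrete, here is a minimal Python sketch of the kind of inventory an assistant builds when it scans a test repo. It illustrates the idea and does not show how Cursor works internally. The file-name markers (`.cy.` for Cypress, `.spec.` for Playwright) are assumed conventions that vary by repo, and `summarize_test_files` is a hypothetical helper.

```python
from collections import Counter
from pathlib import Path

# Assumed naming conventions; real repos may use different patterns.
FRAMEWORK_MARKERS = {
    ".cy.": "cypress",
    ".spec.": "playwright",
}


def summarize_test_files(repo_root: str) -> dict[str, int]:
    """Count test files per framework, guessed from file names only."""
    counts = Counter()
    for path in Path(repo_root).rglob("*"):
        # Skip directories and installed dependencies.
        if not path.is_file() or "node_modules" in path.parts:
            continue
        for marker, framework in FRAMEWORK_MARKERS.items():
            if marker in path.name:
                counts[framework] += 1
                break
    return dict(counts)
```

A summary like this is only a starting point. The value in the workflow above comes from the follow-up questions a QE asks about what the scan found.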

2. Test case creation and modification

Cursor is used to accelerate test development by:

• Generating Cypress or Playwright specs based on existing patterns
• Cloning and adapting existing test cases
• Modifying tests to match new functionality
• Getting tests running locally faster for validation

The goal isn’t to replace test design but to remove repetitive implementation work so QEs can focus on coverage and edge cases.
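
As a rough illustration of cloning and adapting an existing pattern, the Python sketch below fills a spec template with a new title, route, and heading. The template mirrors a typical Playwright test. The `render_spec` helper, the `/billing` route, and the heading text are hypothetical, and a real assistant would work from the repo's own specs rather than a fixed template.

```python
from string import Template

# A spec pattern as it might already exist in a repo (assumed for illustration).
SPEC_TEMPLATE = Template("""\
import { test, expect } from '@playwright/test';

test('$title', async ({ page }) => {
  await page.goto('$route');
  await expect(page.getByRole('heading', { name: '$heading' })).toBeVisible();
});
""")


def render_spec(title: str, route: str, heading: str) -> str:
    """Fill the template; a QE still reviews and runs the generated spec."""
    return SPEC_TEMPLATE.substitute(title=title, route=route, heading=heading)


print(render_spec("shows the billing page", "/billing", "Billing"))
```

The repetitive part (imports, navigation, assertion shape) is generated. Deciding which pages and edge cases deserve a test stays with the QE.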

3. Running tests and debugging failures

This is where Cursor starts to function like a co-developer.

Typical flow:

• Run a test case
• Identify what is failing
• Trace failure back to UI, DOM, or data layer
• Cross-reference repo structure
• Suggest and apply fixes directly in code

Before any changes are made, it first outlines what it plans to modify. That step keeps the QE in control and prevents blind edits.
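
The triage step in that flow can be pictured as a simple classifier. This Python sketch sorts a failure message into the DOM, UI, or data layer using keyword hints. The keywords and the `classify_failure` name are assumptions chosen for illustration; a real assistant reasons over the stack trace and the repo rather than matching strings.

```python
# Hypothetical keyword hints per layer; tune these to your own failure logs.
LAYER_HINTS = {
    "dom": ("selector", "locator", "element not found", "detached"),
    "ui": ("not visible", "covered by", "timeout waiting", "disabled"),
    "data": ("status 500", "fixture", "schema", "null value"),
}


def classify_failure(message: str) -> str:
    """Return the first layer whose hint appears in the message, else 'unknown'."""
    text = message.lower()
    for layer, hints in LAYER_HINTS.items():
        if any(hint in text for hint in hints):
            return layer
    return "unknown"
```

For example, `classify_failure("API returned status 500 for /orders")` returns `"data"`, which tells the QE to check fixtures or the backend before touching selectors.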

Model usage (important nuance)

Cursor can run on models such as Claude Opus and Sonnet.

In practice:

• Sonnet is sufficient for most QE workflows
• It is faster, cheaper, and strong enough for test generation and debugging
• Opus is only needed for deeper architectural or complex reasoning tasks

Most teams overuse heavier models where lighter ones would work fine.
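
One way to express that default is a small routing rule: use the lighter model unless the task is flagged as needing deeper reasoning. The Python sketch below assumes two hypothetical task categories for the heavier tier. The strings `"sonnet"` and `"opus"` are tier labels, not real API model identifiers.

```python
# Assumed examples of work that may justify the heavier model.
HEAVY_TASKS = {"architecture_review", "complex_reasoning"}


def pick_model_tier(task_type: str) -> str:
    """Default to the lighter tier; escalate only for flagged task types."""
    if task_type in HEAVY_TASKS:
        return "opus"
    return "sonnet"
```

With this rule, test generation and debugging stay on the cheaper tier, and escalation becomes a deliberate choice rather than a habit.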

Starter questions recruiters can ask if they’re trying to vet AI usage

1. What LLM do you use most often and why?

The strongest answers are usually Sonnet, sometimes paired with Haiku depending on the task.

If someone immediately defaults to Opus for everything, it can be a signal they haven’t spent much time balancing cost, speed, and capability in real-world workflows.

2. How do you use Cursor with an LLM to perform coding tasks?

Listen for specifics.

Strong candidates will describe using AI to analyze repositories, generate tests, modify automation frameworks, debug failures, and validate ideas before escalating to engineers.

The key is whether they use AI as a coding partner rather than simply a chatbot.

3. What have you built with AI?

It doesn’t need to be production software.

A bug generator, test utility, internal tool, side project, or workflow automation all count.

Building forces people to learn prompting, context management, validation, and iteration. If they’ve never built anything, that’s often a signal their experience is surface-level.

4. How do you give AI context before asking it to produce work?

This is where the difference between junior and senior usage becomes obvious.

Look for answers involving repository context, documentation, examples, requirements, architecture details, or existing test patterns.

Strong outputs usually come from strong context.
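
A minimal Python sketch of that habit: gather the context into labeled sections before stating the task. The section names and the `build_prompt` helper are hypothetical. The only point is that the request comes last, after the material the model needs.

```python
def build_prompt(task: str, context: dict[str, str]) -> str:
    """Join non-empty context sections under headings, then append the task."""
    sections = [
        f"## {name}\n{body.strip()}"
        for name, body in context.items()
        if body.strip()  # drop sections that were left blank
    ]
    sections.append(f"## Task\n{task.strip()}")
    return "\n\n".join(sections)
```

A candidate who works this way can usually name what goes in each section: requirements, an existing test that shows the pattern, and the relevant part of the framework.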

5. Tell me about a time AI gave you a wrong answer. How did you catch it?

Every experienced QE has examples.

The best candidates will talk about validating AI-generated code, testing assumptions, reviewing outputs, and identifying hallucinations before they become defects.

If they’ve never caught AI being wrong, they’re probably not reviewing its work closely enough.

Where QEs get it wrong

1. Using Opus when Sonnet is enough

Sonnet covers:

• Test generation
• Framework modifications
• Debugging workflows
• Repo-level analysis

Using Opus everywhere increases cost without meaningful benefit in most QE work.

2. Not using AI as a co-developer

This is the bigger issue.

Cursor with Claude is effectively:

• A developer
• An API engineer
• A frontend/backend assistant
• A documentation and analysis layer

But only if it’s given real repo context and used deliberately.

Too many QEs still default to:

• Asking engineers for things AI can surface quickly
• Manually tracing issues that could be accelerated
• Treating AI as a tool instead of a collaborator

The skill is no longer just writing automation.

It’s knowing how to direct an AI through a codebase, give it enough context, and validate what it produces.

Closing

The gap in QE performance today isn’t tooling; it’s depth of usage.

Some teams are still using AI to “help write tests.”

Others are using it as a live layer across the codebase to research, debug, generate, and validate work end-to-end.

That gap is widening quickly.

Thanks to Jonathan for sharing how his team is actually applying these tools in practice. It helped ground this in real workflows rather than theory for us recruiters 🙂. I highly recommend that QE professionals check out his guide.
