How I taught Claude to write Maestro tests (so I don't have to)

Sergio Castrillon
April 21, 2026
6 min. read

There's a type of Android work that doesn't feel like engineering. A new feature ships. QA wants E2E coverage. You open the maestro/ folder, stare at an existing test for reference, copy the structure, swap out element IDs, run it, watch it fail because a timeout is too short, bump the timeout, run it again. Repeat until it passes. Submit the PR.

It's not hard. It's just slow and mechanical — exactly the kind of work AI should be eating.

The problem was: every time I tried to get Claude to generate a Maestro test, it would produce something that looked right but was wrong in subtle ways. Wrong selectors. Wrong timeout values. Missing post-login interruption handling.

So I stopped asking Claude to write Maestro tests directly. I built a skill that teaches it how to do it right.

The Setup

The skill — /create-maestro-test — works in two modes. You can describe what you want in plain English:

/create-maestro-test "Test navigating to inbox and opening the first conversation"

Or you can invoke it with no arguments and it walks you through a questionnaire: feature, action, test type (main flow vs. reusable subflow), user type, clean state, recording, build flavor.

Once it has what it needs, it doesn't just write YAML from scratch. It reads your existing tests first — 2 or 3 similar ones — learns your element IDs, your timeout patterns, your label conventions, and builds from those. The output ends up looking like it belongs in your codebase because it actually learned from your codebase.

Building a Test From Scratch, No Prior Knowledge Needed

One of the most powerful aspects of the skill is that you don't need to know anything about the existing test infrastructure to get started. The questionnaire handles all of it.

You answer: what feature, what action, which user type, clean state or not. The skill then does the heavy lifting — it loads the shared user credentials, reads the timeout constants, searches the codebase for similar existing tests, extracts the relevant element IDs from them, and assembles a complete test that already follows your project's conventions.

For a brand new screen with no prior tests to reference, it still works. It falls back to the Compose and XML naming conventions documented in its guides, marks any uncertain IDs with # TODO: Verify element ID, and gives you a scaffold that's already 80% right. The remaining 20% is confirming that the IDs actually exist in the app — something that takes minutes with Layout Inspector.
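As a sketch, a fallback scaffold for an untested screen might look something like this — the appId and both element IDs here are hypothetical placeholders invented for illustration, which is exactly why they carry the TODO markers:

```yaml
# Hypothetical scaffold for a brand new Compose screen with no prior tests.
# appId and element IDs are illustrative guesses pending verification.
appId: com.example.app
---
- launchApp
- tapOn:
    id: "SettingsTab" # TODO: Verify element ID
- assertVisible:
    id: "NotificationToggle" # TODO: Verify element ID with Layout Inspector
```

The TODO comments are the 20%: open Layout Inspector, confirm the IDs, and the scaffold becomes a passing test.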

This is the part I underestimated when we started. The questionnaire isn't just a UX improvement. It's what makes the skill usable by anyone on the team, regardless of how much they know about Maestro or the existing test suite.

The Thing That Surprised Me Most

Before building this skill, I assumed the hard part would be getting Claude to write valid YAML. It wasn't.

The hard part was teaching it how to find the right UI components to test. Specifically, how to distinguish between XML elements and Compose components, and how to know which selector to use for each.

Maestro uses id: as the universal selector for both XML resource IDs and Compose test tags — but there's a critical distinction in how you tag Compose components. If a developer uses Modifier.testTag("ProfileCard"), Maestro will not find it. The element simply doesn't appear. We kept getting "Element not found" errors and couldn't figure out why — the Layout Inspector clearly showed the component.

The fix: developers need to use Modifier.testTagAsId("ProfileCard") instead. Once tagged with testTagAsId, it's reachable in Maestro via id: 'ProfileCard' — the same selector you'd use for an XML view. The naming convention is the only visual cue that tells you which you're dealing with:

  • XML: snake_case (e.g., inbox_conversation_container)
  • Compose: PascalCase (e.g., ProfileCard, SendMessageButton)
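In a flow file, both worlds end up behind the same id: selector — only the casing tells you which one you're dealing with. A minimal sketch, reusing the element IDs above:

```yaml
# XML view: snake_case resource ID
- tapOn:
    id: "inbox_conversation_container"

# Compose component tagged with Modifier.testTagAsId: PascalCase
- tapOn:
    id: "ProfileCard"
```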

Once that distinction was clear and documented in the skill's guides, the component-finding problem was effectively solved. The skill now searches the codebase for testTagAsId usages when targeting Compose screens, and falls back to the Layout Inspector instructions when nothing is found. That's the kind of thing that takes an hour to debug the first time and two seconds to fix once you know — and now nobody on the team has to rediscover it.

The Auto-Fix Loop

This part was inspired by Max's unit test post: the feedback loop.

When a generated test fails, the skill doesn't just report the error. It categorizes it and applies a targeted fix:

  • Timeout / Element Not Found → increase timeout to 50s, try alternative selector (check if XML id should be Compose testTagAsId or vice versa), add an extra wait before the failing step
  • Element Not Tappable → add an extendedWaitUntil before the tap, try tapping by visible text as fallback, check for overlays blocking the element
  • Selector Ambiguity → add index: 0 to select the first matching element, make the selector more specific
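As a sketch, the first and third fix strategies translate into Maestro commands like these — the element ID is reused from the naming examples earlier, and the 50-second value matches the timeout bump described above:

```yaml
# Timeout / Element Not Found: wait explicitly before the failing step
- extendedWaitUntil:
    visible:
      id: "inbox_conversation_container"
    timeout: 50000   # milliseconds, i.e. the 50s bump

# Selector Ambiguity: pick the first of several matching elements
- tapOn:
    id: "inbox_conversation_container"
    index: 0
```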

It retries up to three times, each attempt applying a different fix strategy. After each retry it notifies you what was changed and why. If all three attempts fail, it hands you the full Maestro output with a clear explanation of what category of failure you're dealing with.

Most first-run failures fall into one of those three buckets. The auto-fix resolves the majority of them without any manual intervention, and the ones it can't fix are at least clearly explained so you know exactly where to look.

The Flavor Bug

This one was a real failure moment. When the app wasn't installed and the skill triggered a full build, it was compiling the prod flavor — which points to the production backend. Running E2E tests against prod is not great.

The fix was obvious in retrospect: add a flavor question to the questionnaire and default to debug (dev backend, dev keys). But it only surfaced after someone actually ran the skill end-to-end and noticed the login was hitting real data.

The Gradle task for the default is now explicit:

./gradlew :application:assembleDebug

All three flavors are now documented with plain explanations, so the choice is informed rather than accidental.

What It Feels Like to Use

Here's what the inbox test from this post's intro looks like now:

/create-maestro-test "Navigate to inbox and open a conversation"

The skill asks a few quick questions — test type, user, clean state, recording. It reads regression-users.js and test-data.js for credentials and timeouts. It finds the existing control-open-chat.yaml test, learns that the inbox tab is HomeTabInbox and conversations are inbox_conversation_container. It generates the test, shows it to you for confirmation, writes the files, starts an emulator if needed, asks about installation, and runs maestro test.
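The generated flow ends up looking roughly like this. This is a sketch, not the skill's literal output: the appId, the login subflow path, and the timeout value are stand-ins, while the two element IDs are the real ones the skill learned from control-open-chat.yaml:

```yaml
appId: com.example.app   # stand-in; the real debug-flavor appId comes from the build
---
- launchApp:
    clearState: true
- runFlow: subflows/login.yaml   # hypothetical path to a reusable login subflow
- tapOn:
    id: "HomeTabInbox"
- extendedWaitUntil:
    visible:
      id: "inbox_conversation_container"
    timeout: 30000   # illustrative; the real constant comes from test-data.js
- tapOn:
    id: "inbox_conversation_container"
    index: 0
```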

Total time from prompt to passing test: under 5 minutes, most of which is the emulator booting.

The Takeaway

Building this skill forced us to understand Maestro more deeply than we ever would have by just writing tests manually. The testTagAsId vs testTag distinction, the flavor issue, the auto-fix categories, the selector priority rules — none of these would have ended up documented if we hadn't had to teach them to an AI precisely enough to generate correct output.

That's the underrated benefit of this kind of work. You don't just get automation. You get clarity about what you actually know — and a permanent, shareable record of it.

The questionnaire-driven approach also changed how we think about test authorship. It's no longer a task that requires deep knowledge of the test infrastructure. Anyone who knows what they want to test can produce a valid, passing Maestro test in minutes. That's the real unlock — not the YAML generation, but the democratization of E2E test coverage across the team.

The skill is available in our internal plugin marketplace under android-maestro-testing.
