tags: quality practices, testing
folks I’ve talked w/ about this: Murali Thirunagari, Senthil, Vince Price

Synthetic testing (known elsewhere as “canaries” or “Customer Experience Alarms”/CXAs, and internally as “smoke tests”) is a quality technique that ensures your service is working properly by focusing on golden path workflows. A golden path workflow is something that absolutely must work for customers to be successful. For a file store, it would be things like PutObject or GetObject RPC calls. For a search service, it would be something like “If I search for something that I know is there, does it return?”

Every 5 minutes, an automated process kicks off to run through a single golden path workflow for your service. When it’s complete, it pushes metrics into the metrics system. You hook alarms into these metrics so you find out when a golden pathway is broken. If the golden path fails, the alarm fires… even at 4:35 am.
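As a rough sketch of that loop (the runGoldenPath and emitMetric names below are placeholders, not any particular framework or metrics client):

  // A minimal synthetic runner, sketched: every 5 minutes, run one golden
  // path workflow and publish metrics that alarms can watch.
  const FIVE_MINUTES_MS = 5 * 60 * 1000;

  // Placeholder workflow: in a real synthetic this calls your service the
  // same way a customer would.
  async function runGoldenPath(): Promise<void> {
    // e.g. PutObject followed by GetObject, or a known-good search query
  }

  // Placeholder metrics publisher: swap in your real metrics client.
  function emitMetric(name: string, value: number): void {
    console.log(`metric ${name}=${value}`);
  }

  async function runAndReport(): Promise<void> {
    const start = Date.now();
    try {
      await runGoldenPath();
      emitMetric("golden_path.success", 1);   // alarm when this stops arriving
    } catch {
      emitMetric("golden_path.failure", 1);   // alarm when this shows up
    } finally {
      emitMetric("golden_path.duration_ms", Date.now() - start);
    }
  }

  setInterval(runAndReport, FIVE_MINUTES_MS);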

Let’s take, for example, a git server. One golden pathway is the ability to push data into the system and have it correctly reflected. To orchestrate this test, we might have two copies of the same repository. In copy A, we make a change, commit it, and then push it to the git server. In copy B, we pull down those changes and ensure they are present. If we can’t push the change or pull it down, we emit a failure metric. If we are successful, we emit a success metric. You may want to emit additional metrics, but those aren’t used for alerting.
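A hedged sketch of that workflow (the local paths are assumptions, and emitting the success/failure metric is left to a runner loop like the one sketched above):

  // Sketch of the git golden path: push a change from copy A, pull it in
  // copy B, and fail loudly if the change isn’t reflected. Copy A and copy B
  // are two pre-existing clones of the same repository.
  import { execSync } from "node:child_process";
  import { appendFileSync, readFileSync } from "node:fs";

  const COPY_A = "/var/synthetics/git/copy-a"; // assumed clone locations
  const COPY_B = "/var/synthetics/git/copy-b";

  export function runGitGoldenPath(): void {
    const marker = `synthetic check ${Date.now()}`;

    // Copy A: make a change, commit it, push it to the git server.
    appendFileSync(`${COPY_A}/synthetic.log`, marker + "\n");
    execSync("git add synthetic.log && git commit -m 'synthetic check' && git push", {
      cwd: COPY_A,
    });

    // Copy B: pull the change down and ensure it is present.
    execSync("git pull", { cwd: COPY_B });
    const contents = readFileSync(`${COPY_B}/synthetic.log`, "utf8");
    if (!contents.includes(marker)) {
      throw new Error("change pushed from copy A is not visible in copy B");
    }
  }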

An individual service team likely has around 5 of these golden path workflows; maybe fewer, but probably not many more. That means 5 separate tests running independently: one test isn’t setting up state for the next, and each is self-contained.

These synthetic tests are also a fantastic signal to gate deployments on. Because they’re metric-based, we can hook these alerts into our canary rollout processes.
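For illustration only (the alarm-querying call is hypothetical; the real hook depends on your monitoring and rollout tooling), the gate boils down to something like:

  // Hypothetical deployment gate: before promoting a canary, check whether
  // the synthetic’s alarm is firing. checkAlarmState() is a stand-in for
  // whatever your monitoring system actually exposes.
  type AlarmState = "OK" | "ALARM";

  async function checkAlarmState(alarmName: string): Promise<AlarmState> {
    // placeholder: query your monitoring/alarming system here
    return "OK";
  }

  export async function canPromoteCanary(): Promise<boolean> {
    const state = await checkAlarmState("golden-path-failure");
    return state === "OK"; // block the rollout (or roll back) while the synthetic is failing
  }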

FAQ

Are these the same thing as integration tests?

There’s some overlap, but I think they’re different. First, integration tests often use internal mechanisms to set up state, like talking directly to a database to arrange test data. Synthetics use the service just like a customer would.

Integration tests are also not metric-based, and being metric-based is an important part of how synthetics run and alarm.

Integration tests often have retries, delays, and similar fault-tolerance built in to improve resilience. Synthetics should model what your customers actually do, which is usually harsher than what your integration tests do (i.e. shorter waits and fewer retries).
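A small sketch of the difference (the URL is a placeholder):

  // A synthetic call modeled on customer behavior: one attempt, tight timeout,
  // no retry loop. An integration test might wrap the same call in retries and
  // generous backoff; the synthetic deliberately does not.
  export async function callGoldenPathOnce(): Promise<void> {
    const res = await fetch("https://service.example.com/golden-path", {
      signal: AbortSignal.timeout(2_000), // fail fast, roughly what a customer would tolerate
    });
    if (!res.ok) {
      throw new Error(`golden path returned HTTP ${res.status}`); // no retry; let the alarm catch it
    }
  }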

Teams have many integration tests, and they’re often very expansive in their surface area (including things that aren’t critical to the service). Hooking in your integration tests is likely going to be slow and flaky. There’s a real operational risk of being over-alarmed due to the flakiness of the suite, rather than the flakiness of your service.

For these reasons, I prefer a purpose-built solution. As an example, I built a small synthetic test for a GraphQL service. It was around 150 lines of code, including observability instrumentation and a custom-written auth client. (A rough sketch of the general shape follows the file list below.)

https://github.corp.ebay.com/fgql/fgqlsynthetic

Key files:

  • index.js, houses the bootstrapping
  • monitor.js, houses the observability pieces and core http request setup.
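As a hedged illustration of the shape only (the endpoint, query, and metric names below are invented, not taken from that repo):

  // Invented sketch of a small GraphQL synthetic: run one golden path query,
  // time it, and report success or failure.
  const GRAPHQL_ENDPOINT = "https://graphql.example.com/graphql"; // placeholder
  const GOLDEN_PATH_QUERY = `query { viewer { id } }`;            // a query that must always work

  export async function runGraphqlGoldenPath(): Promise<void> {
    const start = Date.now();
    const res = await fetch(GRAPHQL_ENDPOINT, {
      method: "POST",
      headers: { "content-type": "application/json" /* plus whatever auth you need */ },
      body: JSON.stringify({ query: GOLDEN_PATH_QUERY }),
      signal: AbortSignal.timeout(2_000),
    });
    const body = await res.json();
    if (!res.ok || (body.errors && body.errors.length > 0)) {
      throw new Error(`golden path query failed with HTTP ${res.status}`);
    }
    console.log(`metric graphql_golden_path.duration_ms=${Date.now() - start}`); // stand-in metric
  }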

Do synthetic tests use the UI?

Unlikely. UI tests (like Selenium) are quite error-prone and slow. Instead, test things at the API call level.