Flip Rate
Flip rate measures how often a test changes its outcome between consecutive commits in a timeline. A “flip” is any transition where a test switches from passing to failing, or from failing to passing.
Flip rate is a well-established metric for quantifying test flakiness. Apple’s research paper “Modeling and Ranking Flaky Tests at Apple” (ICSE-SEIP 2020) uses flip rate as one of two core signals for ranking flaky tests, achieving near-perfect accuracy in identification and alignment with human interpretation of flakiness.
Formula
For a given test in a timeline:
Flip Rate = Flips / (Invocations - 1)
Where:
- Invocations is the number of commits where the test was executed
- Flips is the number of times the test’s status changed between consecutive commits
If a test has only 1 invocation, its flip rate is 0%. If a test has no invocations, the flip rate is not defined.
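The formula and its two edge cases can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual implementation: the function name `flip_rate`, the use of booleans for pass/fail, and returning `None` for the undefined case are all assumptions made for the example.

```python
from typing import Optional, Sequence


def flip_rate(outcomes: Sequence[bool]) -> Optional[float]:
    """Flip rate for one test's outcomes, ordered by commit.

    True means the test passed, False means it failed (an assumed encoding).
    Returns None when there are no invocations, since the rate is undefined.
    """
    invocations = len(outcomes)
    if invocations == 0:
        return None  # no invocations: flip rate is not defined
    if invocations == 1:
        return 0.0  # a single invocation has no transitions to flip
    # Count status changes between each pair of consecutive commits.
    flips = sum(1 for prev, curr in zip(outcomes, outcomes[1:]) if prev != curr)
    return flips / (invocations - 1)
```

For instance, `flip_rate([True, False, True, True, True])` counts 2 flips over 4 transitions and returns `0.5`.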
Why Order Matters
Unlike simple failure rate, flip rate captures the temporal pattern of test results. Two tests can have the exact same number of passes and failures but very different flip rates depending on the order of results.
Consider two tests, both with 3 passes and 2 failures across 5 commits:
Test A — failures are clustered together:
| Commit 1 | Commit 2 | Commit 3 | Commit 4 | Commit 5 |
|---|---|---|---|---|
| ✅ passed | ✅ passed | ✅ passed | ❌ failed | ❌ failed |
Transitions: ✅→✅, ✅→✅, ✅→❌, ❌→❌ — 1 flip out of 4 transitions = 25%
This pattern suggests a real regression rather than flakiness — the test was stable, then something broke it.
Test B — failures are scattered:
| Commit 1 | Commit 2 | Commit 3 | Commit 4 | Commit 5 |
|---|---|---|---|---|
| ✅ passed | ❌ failed | ✅ passed | ❌ failed | ✅ passed |
Transitions: ✅→❌, ❌→✅, ✅→❌, ❌→✅ — 4 flips out of 4 transitions = 100%
This pattern is a strong signal of a flaky test — the outcome keeps alternating regardless of code changes.
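The contrast between Test A and Test B can be reproduced with a short script that computes both metrics side by side. The helper names `failure_rate` and `flip_rate` and the boolean encoding of outcomes are assumptions for this sketch. It also assumes each history has at least two invocations.

```python
def failure_rate(outcomes):
    """Share of invocations that failed; ignores the order of results."""
    return outcomes.count(False) / len(outcomes)


def flip_rate(outcomes):
    """Share of consecutive-commit transitions where the status changed.

    Assumes at least two invocations (hypothetical helper for this example).
    """
    flips = sum(prev != curr for prev, curr in zip(outcomes, outcomes[1:]))
    return flips / (len(outcomes) - 1)


# True = passed, False = failed, in commit order.
test_a = [True, True, True, False, False]   # failures clustered
test_b = [True, False, True, False, True]   # failures scattered

for name, history in (("Test A", test_a), ("Test B", test_b)):
    print(f"{name}: failure rate {failure_rate(history):.0%}, "
          f"flip rate {flip_rate(history):.0%}")
# Test A: failure rate 40%, flip rate 25%
# Test B: failure rate 40%, flip rate 100%
```

Both tests share a 40% failure rate, so only the flip rate separates the likely regression from the likely flaky test.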
More Examples
| Commit history | Flips | Invocations | Flip Rate |
|---|---|---|---|
| ✅ ✅ ✅ ✅ ✅ | 0 | 5 | 0% |
| ❌ ❌ ❌ ❌ ❌ | 0 | 5 | 0% |
| ✅ ✅ ✅ ❌ ❌ | 1 | 5 | 25% |
| ✅ ❌ ✅ ✅ ✅ | 2 | 5 | 50% |
| ✅ ❌ ✅ ❌ ✅ | 4 | 5 | 100% |
A consistently passing or failing test has a flip rate of 0%. A test that alternates every commit has a flip rate of 100%.