Performance and Baseline Profiles for Android Engineers (2026)

In short

Android performance in 2026 starts with a baseline profile shipped in every release. Senior engineers reach for the AndroidX Macrobenchmark library to measure cold/warm/hot startup as p50/p95 distributions, the Compose Layout Inspector and Recomposition counts to find unstable composables, Perfetto + system tracing for production-perf debugging, and R8 full-mode minification with APK Analyzer for size budgets. The bar: you can write a Macrobenchmark test that fails CI on regression, read a Perfetto trace, fix a recomposition storm by stabilizing a parameter type, and ship a 120Hz frame budget without dropping frames on mid-tier devices.

Key takeaways

  • Baseline profiles are the 2026 default for senior+ Android — they ship a list of hot classes/methods that the runtime AOT-compiles at install, dropping cold startup by 20-40% and reducing jank during first interactions. The canonical reference is developer.android.com/topic/performance/baselineprofiles.
  • The AndroidX Benchmark library has two halves: Microbenchmark (in-process, JIT-warmed micro measurements) and Macrobenchmark (out-of-process, real-app cold/warm/hot startup, scrolling, and frame timing). Macrobenchmark is the senior tool because it measures what users actually feel.
  • Cold, warm, and hot startup are distinct measurements with distinct fixes. Cold startup creates the process, then runs ContentProvider initialization, Application.onCreate, and first Activity inflation, in that order; warm reuses the process; hot reuses the Activity. The Android Vitals launch-time docs (developer.android.com/topic/performance/vitals/launch-time) are canonical.
  • Compose recomposition is the dominant performance cost in modern Android UI. Senior engineers know the Stable / Immutable annotations, why a non-stable parameter forces children to recompose, and how to read the Layout Inspector recomposition counts to find the offender.
  • Perfetto is the canonical Android system-trace viewer in 2026 (perfetto.dev). It replaces Systrace and reads the same atrace events plus the new perfetto SDK. The trace shows the full picture: app threads, binder transactions, GPU work, choreographer frames, and binder-blocked main-thread waits.
  • R8 is the official Android shrinker, obfuscator, and optimizer (it replaced ProGuard in AGP 3.4+). R8 full-mode (android.enableR8.fullMode=true, default in AGP 8.0+) is the senior bar — it does aggressive optimization including class-merging and method-inlining across modules.
  • 120Hz refresh rate is now mainstream on Android flagships and increasingly common on mid-tier devices (Pixel 8 and later, Galaxy S series, OnePlus). The frame budget at 120Hz is 8.33ms, half the 16.67ms 60Hz budget. Senior engineers profile against the 120Hz budget on flagships and the 60Hz budget on the mid-tier P50 device.

Baseline profiles: the 2026 default for senior+ Android

A baseline profile is a list of classes and methods the Android runtime ahead-of-time (AOT) compiles at install. Without one, the runtime starts in JIT-only mode and AOT-compiles methods only after they've cleared the hotness threshold. With a baseline profile, the methods on the critical startup path are AOT-compiled before the user opens the app.
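To make "a list of classes and methods" concrete, here is a minimal sketch of a reader for the human-readable profile rules. It assumes the documented rule shape: optional H (hot), S (startup), and P (post-startup) flags followed by a class descriptor and a method signature, or a bare class descriptor for class rules. The type names (ProfileRule, MethodRule, ClassRule) and the sample rules are hypothetical, and wildcard rules are not handled.

```kotlin
// Hypothetical types for illustration; not part of any AndroidX API.
sealed interface ProfileRule

data class ClassRule(val descriptor: String) : ProfileRule

data class MethodRule(
    val hot: Boolean,
    val startup: Boolean,
    val postStartup: Boolean,
    val classDescriptor: String,
    val method: String,
) : ProfileRule

// Flags (one to three of H, S, P), then "Lpkg/Class;", then "->", then the method signature.
private val methodPattern = Regex("""([HSP]{1,3})(L[^;]+;)->(.+)""")

/** Parses one rule line; returns null for blank or unrecognized lines. */
fun parseRule(line: String): ProfileRule? {
    val text = line.trim()
    if (text.isEmpty()) return null
    val match = methodPattern.matchEntire(text)
    if (match != null) {
        val (flags, owner, method) = match.destructured
        return MethodRule('H' in flags, 'S' in flags, 'P' in flags, owner, method)
    }
    // A bare descriptor such as "Lcom/example/Foo;" marks a class to preload.
    return if (text.startsWith("L") && text.endsWith(";")) ClassRule(text) else null
}

fun main() {
    // Sample rules are made up; real ones come from the generated baseline-prof.txt.
    val sample = listOf(
        "HSPLcom/example/app/MainActivity;->onCreate(Landroid/os/Bundle;)V",
        "Lcom/example/app/MainActivity;",
    )
    sample.mapNotNull(::parseRule).forEach(::println)
}
```

Reading a generated profile this way is mostly useful as a sanity check: if the startup Activity and the first screen's classes are missing, the recorded journey probably did not exercise them.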

The measured impact is large. Google's internal data (published in the Android baseline profiles guide) shows 20-40% cold-startup wins and similar wins for first-frame jank. Compose-heavy apps see the largest improvements because Compose runtime is the dominant startup cost in modern UI.

The AGP 8 + AndroidX Baseline Profile Gradle Plugin pipeline is the standard 2026 setup. The plugin wires the app to a separate generator module (:baselineProfileGenerator in the snippets below) that runs a Macrobenchmark-based test, captures the methods executed during a representative journey, and writes a baseline-prof.txt bundled into the APK or AAB:

// app/build.gradle.kts
plugins {
    id("com.android.application")
    id("org.jetbrains.kotlin.android")
    id("androidx.baselineprofile")
}

android {
    defaultConfig {
        // Baseline profiles target the release build; pairing them with R8 is the usual setup.
        // The plugin writes the profile to src/release/generated/baselineProfiles.
    }
    buildTypes {
        release {
            isMinifyEnabled = true
            isShrinkResources = true
            // R8 full-mode is the senior bar (default in AGP 8.0+).
        }
    }
}

dependencies {
    "baselineProfile"(project(":baselineProfileGenerator"))
    implementation("androidx.profileinstaller:profileinstaller:1.4.0")
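}

// baselineProfileGenerator/src/main/java/com/example/BaselineProfileGenerator.kt
import androidx.benchmark.macro.junit4.BaselineProfileRule
import androidx.test.ext.junit.runners.AndroidJUnit4
import androidx.test.filters.LargeTest
import androidx.test.uiautomator.By
import androidx.test.uiautomator.Direction
import androidx.test.uiautomator.Until
import org.junit.Rule
import org.junit.Test
import org.junit.runner.RunWith

@RunWith(AndroidJUnit4::class)
@LargeTest
class BaselineProfileGenerator {
    @get:Rule val rule = BaselineProfileRule()

    @Test
    fun generate() = rule.collect(
        packageName = "com.example.app",
        includeInStartupProfile = true,
    ) {
        // Cold start the app and walk through the most-used screens.
        // The resource IDs below are placeholders for your own UI.
        startActivityAndWait()
        // Scroll the main feed (the typical first user action).
        device.findObject(By.res("main_feed")).fling(Direction.DOWN)
        device.waitForIdle()
        // Open the second-most-used surface (e.g., search).
        device.findObject(By.res("search_tab")).click()
        device.wait(Until.hasObject(By.res("search_input")), 5_000)
    }
}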

What this gets right: includeInStartupProfile = true makes the same journey feed both a baseline profile (which ART uses to AOT-compile the listed classes and methods on the device) and a startup profile (which the build uses for DEX layout, grouping startup classes into the primary DEX file); the generator walks an actual user journey; device.waitForIdle() lets the UI settle before the next step. The senior taste call is what to include: cold start, the most-used scroll, the second-most-used surface. Skip rare paths (settings, debug screens): they bloat the profile and add compiled code that most sessions never run. Re-run on every release where the startup path changed; stale profiles lose most of their win. The Chris Banes baseline-profile writeups are the canonical practitioner reference.

Compose performance: recomposition tracing + Layout Inspector

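Compose recomposition is the dominant performance cost in modern Android UI. The mental model: when state changes, Compose re-invokes any composable whose inputs are not provably equal to the previous invocation. "Provably equal" means stable: a stable type with equal inputs skips; an unstable type has historically forced the composable to recompose every time its parent does, even if nothing meaningful changed. With strong skipping mode (the default in recent Compose compiler releases), unstable parameters are instead compared by instance identity, so a freshly allocated but equal value still triggers recomposition.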

The senior bar in 2026 is fluency in three things: the Stable / Immutable annotation contract; the Layout Inspector recomposition counts (available since Android Studio Dolphin); and the Compose compiler's stability inference rules. The canonical fix is to mark a data class @Immutable when the compiler cannot infer it (typically because a property type is from another module):

import androidx.compose.runtime.Composable
import androidx.compose.runtime.Immutable
import kotlinx.collections.immutable.ImmutableList

// BEFORE: kotlin.List has mutable subtypes, so the compiler does NOT
// infer ViewItem as stable. Every parent recomposition forces
// ViewItemRow to recompose even when items are identical.
data class ViewItem(
    val id: String,
    val title: String,
    val tags: List<String>,
)

// AFTER (replaces the declaration above): @Immutable is a contract with
// the compiler: "this value never changes after construction."
// The compiler trusts the annotation.
@Immutable
data class ViewItem(
    val id: String,
    val title: String,
    val tags: ImmutableList<String>,  // kotlinx.collections.immutable
)

@Composable
fun ViewItemRow(item: ViewItem, onClick: (String) -> Unit) {
    // Skips recomposition when item equals the previous value and
    // onClick is the same lambda instance.
}

What this code gets right: @Immutable is a stronger contract than @Stable — Immutable promises the value never changes after construction; ImmutableList from kotlinx-collections-immutable is inferred as stable; the lambda parameter onClick stays stable as long as it's remembered or passed from a remembered source.
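Why a plain List blocks stability inference is easier to see outside Compose. The sketch below is ordinary Kotlin with a hypothetical TaggedItem class (not from the original article): a read-only List property can be backed by a MutableList, so the contents can change without any new object being created, which is exactly what a stability guarantee has to rule out.

```kotlin
// Hypothetical model class; mirrors the shape of the BEFORE declaration.
data class TaggedItem(val id: String, val tags: List<String>)

fun main() {
    val backing = mutableListOf("android")
    val item = TaggedItem(id = "1", tags = backing)    // read-only view over mutable storage
    val snapshot = item.copy(tags = item.tags.toList()) // independent copy of the current state

    println(item == snapshot)   // equal right now

    backing += "compose"        // mutate through the original reference; no new TaggedItem

    println(item.tags)          // the "read-only" list now has two entries
    println(item == snapshot)   // the same instance no longer equals its earlier state
}
```

An immutable collection type closes this hole because no reference anywhere can mutate the storage after construction.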

The diagnostic loop: run the app under Layout Inspector, enable "Show recomposition counts", and look for composables whose count climbs faster than meaningful state changes. A row that recomposes 50 times during a scroll where its data hasn't changed is the smoking gun: find the unstable parameter, then annotate it or replace it with a stable equivalent. The Compose performance guide covers the full pattern; Chris Banes's composable-metrics writeup is the canonical reference for the compiler's stability reports. Two more high-value patterns: derivedStateOf for state derived from other state (recomposes only when the derived value flips), and the key parameter of items() in lazy lists to give Compose a stable identity for items that move.
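For the compiler's stability reports, one way to turn them on is through the Compose compiler Gradle plugin. The fragment below is a sketch that assumes Kotlin 2.0 or later with the org.jetbrains.kotlin.plugin.compose plugin applied; the output directory name compose_compiler is an arbitrary choice, not a required value.

```kotlin
// app/build.gradle.kts (fragment; assumes the Kotlin 2.0+ Compose compiler Gradle plugin)
plugins {
    id("org.jetbrains.kotlin.plugin.compose")
}

composeCompiler {
    // Per-class and per-composable stability reports (text files).
    reportsDestination = layout.buildDirectory.dir("compose_compiler")
    // Aggregate metrics (counts of skippable and restartable composables).
    metricsDestination = layout.buildDirectory.dir("compose_compiler")
}
```

Generate the reports from a release build; debug builds can report different stability and skippability results. The files land under the module's build directory.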

Startup performance: cold/warm/hot + App Startup library

Android distinguishes three startup states, each with a different baseline cost and a different fix:

  • Cold start: the process does not exist. The system forks a new process from Zygote, runs every ContentProvider's onCreate() (including the App Startup library's provider, which runs its eager initializers), calls Application.onCreate(), inflates the first Activity, and draws the first frame. This is the slowest start and the one users notice; Play Console's Android vitals flags apps with excessive cold-start times.
  • Warm start: the process exists but the Activity does not. The system reuses the Application instance, recreates the Activity, and draws the first frame. Warm start is roughly 30-50% of cold start time on a well-tuned app.
  • Hot start: the process and Activity both exist (the user backgrounded then resumed). The system brings the Activity to the foreground and re-runs onResume plus the first frame. Hot start should be sub-200ms.

The Android Vitals launch-time thresholds (developer.android.com/topic/performance/vitals/launch-time) are the official bar: cold start under 5 seconds (Excessive at 5+), warm under 2, hot under 1.5. The senior bar is much tighter: cold under 1.5s on a P50 device (Pixel 6a-class), warm under 800ms, hot under 200ms.
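The two sets of numbers above can be encoded as a small lookup so that a report or CI step uses them consistently. This is only a sketch; the names StartupKind, Verdict, and classify are hypothetical, and the values are the ones quoted in this section.

```kotlin
/** Startup kinds with the Android vitals "excessive" threshold and the tighter target from this section. */
enum class StartupKind(val vitalsExcessiveMs: Long, val targetMs: Long) {
    COLD(vitalsExcessiveMs = 5_000, targetMs = 1_500),
    WARM(vitalsExcessiveMs = 2_000, targetMs = 800),
    HOT(vitalsExcessiveMs = 1_500, targetMs = 200),
}

enum class Verdict { ON_TARGET, OVER_TARGET, EXCESSIVE }

/** Classifies a measured startup duration against both bars. */
fun classify(kind: StartupKind, durationMs: Long): Verdict = when {
    durationMs >= kind.vitalsExcessiveMs -> Verdict.EXCESSIVE
    durationMs >= kind.targetMs -> Verdict.OVER_TARGET
    else -> Verdict.ON_TARGET
}

fun main() {
    // Example durations are made up for illustration.
    println(classify(StartupKind.COLD, 1_200))
    println(classify(StartupKind.COLD, 3_000))
    println(classify(StartupKind.HOT, 250))
}
```

Keeping both bars in one place makes it obvious when a build passes the Play Console threshold yet misses the team's own target.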

The canonical Macrobenchmark test for cold startup, written against AndroidX Macrobenchmark 1.3+:

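import androidx.benchmark.macro.BaselineProfileMode
import androidx.benchmark.macro.CompilationMode
import androidx.benchmark.macro.ExperimentalMetricApi
import androidx.benchmark.macro.FrameTimingMetric
import androidx.benchmark.macro.StartupMode
import androidx.benchmark.macro.StartupTimingMetric
import androidx.benchmark.macro.TraceSectionMetric
import androidx.benchmark.macro.junit4.MacrobenchmarkRule
import androidx.test.ext.junit.runners.AndroidJUnit4
import androidx.test.filters.LargeTest
import androidx.test.uiautomator.By
import androidx.test.uiautomator.Until
import org.junit.Rule
import org.junit.Test
import org.junit.runner.RunWith

@RunWith(AndroidJUnit4::class)
@LargeTest
class StartupBenchmark {
    @get:Rule val rule = MacrobenchmarkRule()

    // TraceSectionMetric is an experimental API, hence the opt-in.
    @OptIn(ExperimentalMetricApi::class)
    @Test
    fun coldStartupCompilationBaselineProfile() = rule.measureRepeated(
        packageName = "com.example.app",
        metrics = listOf(
            StartupTimingMetric(),
            FrameTimingMetric(),
            // Requires the app to emit a trace section with this exact name.
            TraceSectionMetric("Application.onCreate"),
        ),
        iterations = 10,
        startupMode = StartupMode.COLD,
        compilationMode = CompilationMode.Partial(
            baselineProfileMode = BaselineProfileMode.Require,
        ),
    ) {
        pressHome()
        startActivityAndWait()
        device.wait(Until.hasObject(By.res("main_feed")), 5_000)
    }
}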

What this test gets right: 10 iterations (one run is noise; Macrobenchmark summarizes startup timings as min/median/max across iterations and frame timings as percentiles such as P50/P90/P95/P99); explicit StartupMode.COLD (kills the process before each iteration); BaselineProfileMode.Require (fails the test if the profile is missing, which prevents silent regression); StartupTimingMetric (canonical metric, reports time-to-initial-display, plus time-to-fully-drawn when the app calls reportFullyDrawn()); TraceSectionMetric for a custom trace section (you instrument the code with Trace.beginSection / endSection and it surfaces in the trace and as a measured metric).
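Turning per-iteration timings into a CI gate is a separate step from running the benchmark. The sketch below assumes you have already extracted the per-iteration startup values (in milliseconds) from the benchmark output; the function names, the sample numbers, and the 5% tolerance are illustrative assumptions, not part of the Macrobenchmark API.

```kotlin
import kotlin.math.ceil

/** Nearest-rank percentile: the smallest sample with at least p percent of samples at or below it. */
fun percentile(samples: List<Double>, p: Double): Double {
    require(samples.isNotEmpty()) { "need at least one sample" }
    require(p > 0.0 && p <= 100.0) { "p must be in (0, 100]" }
    val sorted = samples.sorted()
    val rank = ceil(p / 100.0 * sorted.size).toInt()
    return sorted[rank - 1]
}

/** True when the candidate is slower than the baseline by more than the allowed tolerance. */
fun regressed(baselineMs: Double, candidateMs: Double, tolerance: Double = 0.05): Boolean =
    candidateMs > baselineMs * (1 + tolerance)

fun main() {
    // Made-up cold-start timings for ten iterations, in milliseconds.
    val runs = listOf(410.0, 395.0, 402.0, 420.0, 398.0, 405.0, 399.0, 512.0, 401.0, 407.0)
    val p50 = percentile(runs, 50.0)
    val p95 = percentile(runs, 95.0)
    println("p50=$p50 p95=$p95")
    // A hypothetical stored baseline of 400 ms for the median.
    println("median regressed: ${regressed(baselineMs = 400.0, candidateMs = p50)}")
}
```

Note that with only ten samples the nearest-rank p95 is simply the slowest run, so a tail-based gate needs more iterations (or several benchmark runs pooled together) before it says much.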

The companion API — App Startup library (androidx.startup) — is the modern replacement for the abused ContentProvider initialization pattern. Every third-party SDK that initialized via a stealth ContentProvider was paying the cold-start cost on every launch. App Startup consolidates them into one ContentProvider with a dependency graph; you control which initializers run eagerly and which run lazily. The senior pattern: defer everything that isn't needed for the first frame to a lazy initializer or a coroutine started from Application.onCreate, and instrument the eager path with Trace sections so you see it in Perfetto.
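The eager-versus-lazy split with a dependency graph can be modeled in plain Kotlin. To be clear, this is a toy model and not the androidx.startup API: Component and StartupGraph are hypothetical names, and the real library expresses the same idea through Initializer classes and manifest entries.

```kotlin
/** Hypothetical stand-in for an SDK initializer; not the androidx.startup API. */
class Component(
    val name: String,
    val dependsOn: List<String> = emptyList(),
    val eager: Boolean,
)

class StartupGraph(components: List<Component>) {
    private val byName = components.associateBy { it.name }
    private val order = mutableListOf<String>()

    /** Names in the order they were initialized, for inspection. */
    val initialized: List<String> get() = order

    /** Initializes a component after its dependencies; each one runs at most once. */
    fun initialize(name: String, visiting: Set<String> = emptySet()) {
        if (name in order) return
        check(name !in visiting) { "dependency cycle at $name" }
        val component = byName[name] ?: error("unknown component $name")
        component.dependsOn.forEach { initialize(it, visiting + name) }
        order += name
    }

    /** The cold-start path: eager components plus whatever they depend on. */
    fun runEager() = byName.values.filter { it.eager }.forEach { initialize(it.name) }
}

fun main() {
    val graph = StartupGraph(
        listOf(
            Component("logger", eager = false),
            Component("crashReporter", dependsOn = listOf("logger"), eager = true),
            Component("analytics", dependsOn = listOf("logger"), eager = false),
            Component("imageLoader", eager = false),
        ),
    )
    graph.runEager()                // what the first frame pays for
    println(graph.initialized)
    graph.initialize("analytics")   // deferred until a screen actually needs it
    println(graph.initialized)
}
```

The point of the model is the accounting: a component marked lazy still gets pulled onto the cold-start path if an eager component depends on it, so trimming startup means checking the dependency edges, not just the eager flags.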

Perfetto + Macrobenchmark for production-perf debugging

Perfetto is the canonical Android system-trace tool in 2026 (perfetto.dev). It replaced Systrace and reads atrace events, ftrace events, Perfetto SDK events, and Android Choreographer frame events. Macrobenchmark captures Perfetto traces automatically — every test run produces a .perfetto-trace file you can drag into ui.perfetto.dev.

The senior reading pattern:

  • Frame track — the Choreographer row shows every frame with its category (App, GPU, Display). A red frame is a missed deadline; the segments inside show where the time went.
  • Main thread — long slices (longer than the frame budget) are jank candidates. Common offenders: synchronous I/O, JSON parsing, bitmap decoding, view inflation that should have been async.
  • Binder transactions — synchronous binder calls on the main thread are silent jank multipliers; a 5ms IPC blocks the next frame.
  • GPU work — if the main thread is fine but the frame still misses deadline, look at the GPU track for over-draw or shader compilation stalls.

The mid-tier P50 device matters more than the flagship. Senior teams ship a Macrobenchmark device matrix with a Pixel 6a-class mid-tier and a Pixel 8/9 Pro flagship. A regression invisible on the flagship is often what users see — flagship CPU and GPU mask code that's 4x slower on the P50.

The 120Hz frame budget tightens this further. At 60Hz the budget is 16.67ms; at 90Hz, 11.11ms; at 120Hz, 8.33ms. The senior pattern: profile the 120Hz budget on flagships (an app can request a rate with Surface.setFrameRate, which the system treats as a hint when choosing the display mode); profile the 60Hz budget on the P50. A composable taking 10ms is fine at 60Hz but misses every 8.33ms deadline at 120Hz, so the screen effectively updates at 60fps.
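The budget arithmetic is simple enough to check in a few lines. The sketch below uses hypothetical helper names and a deliberately simplified model: it assumes every frame costs the same and ignores buffering and pipelining in the real rendering path.

```kotlin
import kotlin.math.ceil

/** Time available per frame at a given refresh rate, in milliseconds. */
fun frameBudgetMs(refreshRateHz: Int): Double = 1000.0 / refreshRateHz

/** How many refresh intervals a frame occupies if it takes workMs to produce. */
fun vsyncSlotsUsed(workMs: Double, refreshRateHz: Int): Int =
    ceil(workMs / frameBudgetMs(refreshRateHz)).toInt()

/** Effective update rate when every frame costs workMs (simplified steady-state model). */
fun effectiveFps(workMs: Double, refreshRateHz: Int): Double =
    refreshRateHz.toDouble() / vsyncSlotsUsed(workMs, refreshRateHz)

fun main() {
    for (hz in listOf(60, 90, 120)) {
        val budget = frameBudgetMs(hz)
        val slots = vsyncSlotsUsed(workMs = 10.0, refreshRateHz = hz)
        println("$hz Hz: budget=${"%.2f".format(budget)} ms, a 10 ms frame uses $slots slot(s)")
    }
}
```

Under this model a 10ms frame fits in one slot at 60Hz but needs two at both 90Hz and 120Hz, which is why work that looks comfortable on a 60Hz mid-tier device shows up as jank on a high-refresh flagship.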

R8 full-mode (android.enableR8.fullMode=true, default in AGP 8.0+) is the senior bar for release builds — cross-module method inlining, class merging, aggressive enum-to-int conversion. The pairing with baseline profiles compounds: R8 produces smaller, faster bytecode, and the baseline profile pre-compiles the hottest of it. APK Analyzer is the canonical verification — open the release APK, check dex method count and resource sizes, confirm shrinking and obfuscation. The Macrobenchmark overview and the Chris Banes blog are the references for the full loop.

Frequently asked questions

Do baseline profiles still matter if my app uses Compose?
Especially if your app uses Compose. Compose has a large runtime that gets exercised heavily during startup; without a baseline profile the JIT spends the first few seconds compiling Compose internals on the user's device. Google's internal data and Chris Banes's writeups both show Compose-heavy apps get the largest baseline-profile wins (often 30-40% on cold startup). If you ship Compose, you ship a baseline profile. Period.
What's the difference between a baseline profile and a startup profile?
Both come from the same kind of recorded journey, but they are consumed at different times. A baseline profile lists classes and methods that ART AOT-compiles on the device at install time. A startup profile is consumed at build time: it tells the build which classes the first-launch path needs so they can be placed together in the primary DEX file (DEX layout optimization). The Baseline Profile Gradle Plugin generates both when you set includeInStartupProfile = true. The Android baseline-profiles docs cover the distinction.
How do I know if my Compose composable is recomposing too much?
Run the app under Layout Inspector with 'Show recomposition counts' enabled (available since Android Studio Dolphin). Each composable shows its recomposition count and skip count. A composable whose count climbs faster than its meaningful state changes is the offender, usually because a parameter type is unstable. Run the Compose compiler's stability metrics report (kotlinc -P plugin:androidx.compose.compiler.plugins.kotlin:reportsDestination=...) to see which classes the compiler inferred as unstable.
Should I use Microbenchmark or Macrobenchmark?
Both, for different questions. Microbenchmark for in-process measurements of small code units — sorting algorithms, parser routines, hot loops. Macrobenchmark for end-to-end user journeys — startup, scroll jank, navigation. The senior bar is to use Macrobenchmark for anything user-visible and Microbenchmark for hot-path optimization where you need nanosecond-resolution. Microbenchmark warms the JIT before measuring; Macrobenchmark explicitly controls compilation mode.
What is R8 full-mode and should I enable it?
R8 full-mode is the aggressive optimization mode for the R8 shrinker: cross-module method inlining, class merging, more aggressive dead-code elimination. It has been the default since AGP 8.0, so on a current toolchain the job is to make sure nobody has switched it off with android.enableR8.fullMode=false in gradle.properties; on older AGP versions you opt in with android.enableR8.fullMode=true. The senior answer is yes, keep it on, and pair it with baseline profiles. The compounding win is significant: smaller, faster bytecode that's also AOT-compiled on the hot path.
How do I read a Perfetto trace if I've never seen one?
Open ui.perfetto.dev, drag your .perfetto-trace file in. Find your app's process (search by package name). Look at the Choreographer frame track for red frames (missed deadlines), then drop into the main UI thread to find the long slice that caused the miss. Hover over slices to see their wall time and CPU time. Binder transactions and GPU work get their own tracks. The perfetto.dev docs have an interactive tutorial.
What's the right startup time target?
Android Vitals threshold is under 5 seconds for cold startup (anything more is flagged as Excessive in Play Console). The senior bar in 2026 is much tighter: under 1.5 seconds on a P50 mid-tier device (Pixel 6a class), under 1 second on a flagship. Hot startup should be sub-200ms regardless of device. Measure with Macrobenchmark, report the p95 (not the average — the average hides the user-felt tail).
How does 120Hz refresh affect the frame budget?
It halves it. At 60Hz the frame budget is 16.67ms; at 90Hz it's 11.11ms; at 120Hz it's 8.33ms. A composable that takes 10ms to compose is fine at 60Hz and dropping every other frame at 120Hz. The senior pattern is to profile against the 120Hz budget on flagships and the 60Hz budget on the P50 device — both budgets matter and they pull the codebase in different directions.
When should I use the App Startup library vs Application.onCreate?
Use the App Startup library (androidx.startup) for SDK initialization that has a dependency graph and may be deferrable. Use Application.onCreate for code that must run before any Activity starts and has no graph. The senior pattern is to consolidate every third-party SDK initializer that was abusing ContentProvider initialization into App Startup, mark non-critical ones as lazy, and only run eager initializers that the first frame actually needs.

Sources

  1. Android Developers — Baseline Profiles. Canonical guide for AOT compilation hints and the Baseline Profile Gradle Plugin.
  2. Android Developers — Macrobenchmark overview. Canonical reference for cold/warm/hot startup measurement and frame-timing metrics.
  3. Perfetto — System tracing for Android. Canonical viewer for atrace, ftrace, and Perfetto SDK events.
  4. Android Developers — Compose performance. Canonical guide for stability, recomposition, and Layout Inspector recomposition counts.
  5. Chris Banes blog. Practitioner reference for baseline profiles, Compose stability metrics, and Android performance tooling.
  6. Android Developers — App startup time (Vitals). Canonical reference for cold/warm/hot startup definitions and Play Console thresholds.

About the author. Blake Crosley founded ResumeGeni and writes about Android engineering, hiring technology, and ATS optimization. More writing at blakecrosley.com.