At Spotify I was on a team named Stellar. The team sat in the Quality Platform Product Area, which in turn was part of the Client Platform Tribe. Our responsibility was to look holistically at app quality from a technical standpoint. In practice this made us "app owners" for the music app, which meant we owned the app's Info.plist and various app-wide performance and reliability metrics.

Startup Instrumentation & iOS Prewarm

In modern versions of iOS, Apple implemented the concept of "app prewarming" in the Duet Activity Scheduler Daemon (dasd). App prewarming would preemptively launch an app if dasd thought the user would use it soon.

For example, if a user frequently launches Spotify throughout the day, dasd would prewarm Spotify whenever the device was idle and had available resources to launch the new process.

Sounds simple, right? On an implementation level, it wasn't so straightforward. A prewarmed process would launch, and its start time would be set at that moment. This breaks most startup instrumentation: before the prewarming feature, it was standard to use the process start time to calculate roughly how long the user had to wait after they initiated launching the Spotify app. With prewarming, the process start time no longer correlates with the user initiating an app launch.

Because a prewarmed process starts before the user initiates the app launch, the measured launch time included the idle gap between the two and came out skewed too long.
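To make the skew concrete, here is a minimal sketch of the arithmetic in Python, used only as neutral pseudocode for the timing logic. The field names (`prewarmed`, `user_launch`) are hypothetical; how to detect a prewarmed process, or where a trustworthy user-launch timestamp comes from, is exactly the hard part and is not shown here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LaunchSample:
    """Timestamps (seconds, all on one shared clock) for a single app launch."""
    process_start: float                 # when the OS created the process
    first_frame: float                   # when the first UI frame was drawn
    prewarmed: bool                      # hypothetical flag: process was launched ahead of time
    user_launch: Optional[float] = None  # hypothetical: when the user actually opened the app


def naive_launch_duration(s: LaunchSample) -> float:
    """Pre-prewarming approach: measure from the process start time."""
    return s.first_frame - s.process_start


def corrected_launch_duration(s: LaunchSample) -> Optional[float]:
    """Measure from the user-initiated launch when the process was prewarmed.

    Returns None when a prewarmed sample has no usable user-launch timestamp,
    so the sample can be dropped instead of polluting the metric.
    """
    if not s.prewarmed:
        return naive_launch_duration(s)
    if s.user_launch is None:
        return None
    return s.first_frame - s.user_launch


# A prewarmed process that sat idle for 30 minutes before the user opened the app.
warm = LaunchSample(process_start=100.0, first_frame=1901.0,
                    prewarmed=True, user_launch=1900.0)
print(naive_launch_duration(warm))      # includes the 30 idle minutes
print(corrected_launch_duration(warm))  # only the time the user actually waited
```

With the naive calculation the idle time between prewarm and the real launch is counted as startup time, which is why aggregate launch metrics drift upward once prewarming is in play.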

Unfortunately, Apple has yet to reveal any details of prewarming and insists that developers use MetricKit. This isn't as simple as they make it out to be: MetricKit doesn't expose all the data that's useful for proper performance analysis. To solve this problem I reverse engineered how dasd works and discovered some workarounds that attempt to fix the instrumentation.

MetricKit Diagnostic Reports

I collaborated with a coworker to implement MetricKit diagnostic report collection for internal employees. We aimed to use this infrastructure to diagnose performance/CPU/battery regressions the employees saw. Long term we hoped to roll this infra out to a random subset of users in order to better understand performance out in prod.

Early Quality Testing

It's a common problem that "quality" is hard to test for in mobile apps. Primarily, this is due to the fact that out in prod we see huge variance in device states: differing scheduler, memory & cache pressure. On top of this, testing is usually done on a specific "test" build of an app.

In order to improve detection of quality regressions I collaborated on the design of a new suite of "early quality" tests. These tests would take the exact build we would send to the app store (or out to employees for alpha/beta testing), run it through a set of product scenarios that were deemed most important by the business & collect various metrics externally from the app. It is important that these metrics were collected external to the app in order to best capture what the user will experience (i.e. they only see the UI of the app and can't manually call specific testing codepaths).

These product scenarios were designed to run against multiple quality tests, e.g. sanitizer tests, startup time tests, etc. This testing infrastructure was designed as a platform in order to better grow Spotify's testing facilities.

Under Construction
