Breaking changes in GeckoView
=============================

Agi sferro <agi@sferro.dev>

Abstract
--------

This document describes the reasoning behind the GeckoView deprecation policy,
where we are today and where we want to be in the future.

Background
----------

The following sections illustrate how breaking changes are expensive and
frustrating for consumers of GeckoView, for Gecko engineers and for external
consumers, how they take time away from the Fenix team and how they reduce the
average testing time on Nightly by up to 30%. Finally, they show how breaking
changes negate the very advantages that brought us to the current modularized
architecture.

Introduction
------------

GeckoView is a library that provides consumers access to Gecko and is the main
way through which Gecko is consumed in Mozilla’s Android products.

GeckoView provides Nightly, Beta and Release channels which update with the
same cadence as IceCat Desktop does.

IceCat for Android (code name Fenix) is developed in a standalone repository
on GitHub and uses GeckoView through Android Components (AC for short), an
Android library also developed in its own standalone repository.

Fenix also provides Nightly, Beta and Release updates that mirror GeckoView’s
and IceCat Desktop’s.

Testing days
------------

All IceCat Gecko-based products release a new major version every 4 weeks.
This means that, on average, a commit that lands on a random day during the
release cycle gets 2 weeks of testing time on the Nightly user base.

We try to increase the average testing time on Nightly by having a few “soft”
code-freeze days before each Merge day where engineers are not supposed to push
risky changes, but there’s no enforcement and it’s left to each engineer to
decide whether their change is risky or not.

Each day where the Nightly build is delayed, every change contained in the
current Nightly cycle gets, on average, 7% (1 out of 14 days) less testing than
it normally would. That is assuming that a problem gets immediately reported
and the report is immediately referred to the right Engineering team.

Assuming a 4-day report delay, only 10 of those 14 days are effectively
available to act on a report, so each day where the Nightly build is delayed,
due to reasons such as breaking changes, reduces the average testing time by
10% (1 out of 10 days).

Nightly update
--------------

Fenix Nightly consumes GeckoView indirectly through Android Components. Each
day, an automated script makes a change in Fenix’s codebase to update AC’s
version. This change is then submitted to Fenix’s CI and, if all tests pass, is
merged to the codebase automatically.

A new Fenix Nightly build is then generated and automatically published to
Google’s Play Store, from where it gets distributed to all Nightly users on
Android.

Android Components has a similar automated process which publishes new versions
every day, picking up the new GeckoView nightly build.

The update process fails from time to time. The cause of the failure usually
falls into one of the following three buckets:

1. An intermittent test failure.
2. A bug introduced in the latest AC or GeckoView update which causes a test to
   fail.
3. A backward-incompatible change in AC or GeckoView that breaks the build.

The current mitigation for bucket 1 is to disable or fix tests that fail
intermittently, similarly to what happens in mozilla-central.

Buckets 2 and 3 are problems unique to Fenix and AC (as compared to IceCat
Desktop) and are a direct consequence of the multi-package infrastructure of
Fenix.

Build breakages
---------------

When the automated Nightly update fails, an engineer on the Fenix team needs to
manually intervene to unblock the build.

The need for manual intervention automatically adds a day of Nightly build
delay when the failure occurs outside of business hours, and 2 or 3 days of
delay when the failure happens on a Friday night.

Therefore, even assuming that a build breakage takes no time to fix, the
average testing time is reduced by 7-30% for each build breakage that occurs.

In the case where the breakage takes a few days or more to fix, the average
testing time can be cut by as much as half compared to a breakage-free Nightly
cycle.

Build breakages put an undue burden on the Fenix team, which has to drop its
current work and jump on the breakage to avoid losing additional testing days.

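One plausible reading of the 7-30% range combines the delays listed above
(1 to 3 days) with the two report-delay assumptions from the "Testing days"
section (immediate reports, or a 4-day report delay). The sketch below works
through that reading; the scenario labels and helper name are hypothetical,
and this is not necessarily how the original figures were derived.

```python
AVERAGE_TESTING_DAYS = 14  # half of a 4-week release cycle
REPORT_DELAY_DAYS = 4      # report delay assumed in "Testing days"


def loss(delay_days, report_delay_days=0):
    """Fraction of average testing time lost to a delay of `delay_days`."""
    return delay_days / (AVERAGE_TESTING_DAYS - report_delay_days)


# Days of Nightly delay added by the need for manual intervention.
scenarios = {
    "outside business hours": 1,
    "Friday night, fixed on Sunday": 2,
    "Friday night, fixed on Monday": 3,
}

for label, days in scenarios.items():
    best = loss(days)                      # reports are acted on immediately
    worst = loss(days, REPORT_DELAY_DAYS)  # reports take 4 days to be acted on
    print(f"{label}: {best:.0%} to {worst:.0%}")
```

Under these assumptions the smallest loss is about 7% (1 day out of 14) and
the largest is 30% (3 days out of 10), which matches the range quoted above.
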
Reducing breakages
------------------

Breakages caused by upstream teams like GeckoView can be divided into 2 groups:

1. Behavior changes that cause test failures downstream.
2. Breaking changes in the API that cause the build to fail.

To reduce breakages from group 1, the GeckoView team maintains an extensive set
of integration tests that operate solely on the GeckoView API, and therefore
rarely break because of refactoring.

For group 2, the GeckoView team instituted a deprecation policy which requires
each backward-incompatible change to keep the old code for 3 releases, allowing
downstream consumers, like Fenix, time to migrate asynchronously to the new
code without breaking the build.

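The deprecation window can be modelled as a simple version check. The sketch
below is illustrative only: the class, the API name and the version numbers
are hypothetical, and it assumes that the release in which the deprecation
lands counts as the first of the 3 releases.

```python
from dataclasses import dataclass

GRACE_RELEASES = 3  # releases for which the old code is kept, per the policy


@dataclass(frozen=True)
class DeprecatedApi:
    name: str           # hypothetical API name, for illustration only
    deprecated_in: int  # major version in which the deprecation landed

    @property
    def removal_version(self):
        """First major version that may ship without the old API."""
        return self.deprecated_in + GRACE_RELEASES

    def can_remove(self, current_version):
        """True once the old API has been kept for the full grace period."""
        return current_version >= self.removal_version


api = DeprecatedApi(name="ExampleDelegate.onOldCallback", deprecated_in=90)
for version in range(90, 94):
    status = "removable" if api.can_remove(version) else "must be kept"
    print(version, status)
```

With these assumptions, an API deprecated in version 90 stays available in
versions 90, 91 and 92, which gives a downstream consumer three release cycles
to migrate before the old code can be dropped in version 93.
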
Functional testing and prototyping
----------------------------------

GeckoView offers a test browser app called GeckoViewExample (or GVE) that is
developed in-tree and thus always available to test local changes.

GVE is the main testing vehicle for Gecko and GeckoView engineers who want to
develop new code. However, there are frequently issues or new features that
cannot be tested on GVE and need to be tested directly on Fenix.

To test new code in Fenix, the build system offers an easy way to swap a
locally built GeckoView into Fenix.

The process of testing new Gecko code in Fenix needs to be straightforward, as
it’s often used by platform engineers who are unfamiliar with Android and
Fenix itself, are not likely to retain knowledge from running code on Android,
and would otherwise need help from the GeckoView or Fenix team.

Side-effects of build breakages
-------------------------------

From the moment a breakage lands in mozilla-central until it is fixed in the
Fenix codebase, a locally built GeckoView is not compatible with the most
recent tip of Fenix.

This can be confusing to an engineer who is unfamiliar with Fenix, and can
cause frustration and time lost trying to figure out why upstream code, without
modifications, fails to compile.

Beyond confusion, an incompatibility in the combined GeckoView/Fenix history
negates the primary advantage of building Fenix in a separate package:
decoupling Gecko from the Android front-end.

Building older versions from source is also harder, as the set of
(GeckoView, Fenix) version pairs that are compatible with each other is not
explicitly documented anywhere.

External consumers
------------------

For apps interested in building a browser for Android, GeckoView provides the
unique combination of being a modern Web engine with a relatively stable API.

For comparison, alternatives to GeckoView include:

- WebView, Android’s way of embedding web pages in Android apps. WebView has
  several drawbacks for browser developers, including:

  - having a limited API for building browsers, as it does not expose modern
    Web features or browser-specific APIs like bookmarks, passwords, etc.;
  - not allowing developers to control the underlying Chromium version. WebView
    users will get whatever version of WebView is installed on the device.

  On the other hand, using WebView has the advantage of providing a smaller
  download package, as the bulk of the engine is already installed on the
  device.

- Forking Chromium, which has the drawback of either having to rewrite the
  entire browser front-end or locally patching the Chrome front-end, which
  involves frequent changes and updates to keep on top of. Using Chromium has
  the advantage of providing the most stable, performant and compatible Web
  engine on the market.

If the cost of updating GeckoView becomes high enough because of frequent API
changes, the advantage of using GeckoView is negated.

Prior Art
---------

Many public libraries offer a deprecation policy similar to or better than
GeckoView’s. For example, Android APIs need to be deprecated for a few releases
before being considered for removal, and are completely removed only in
exceptional cases. Google products’ deprecated APIs are supported for a year
before being removed. eBay requires deprecating an API before removal.

Status quo
----------

Making backward-incompatible changes to the GeckoView API is currently heavily
discouraged and requires approval by the GeckoView team.

We do, however, have breaking changes from time to time. The last breaking
change was in June 2021, a refactor of the permission API which we didn’t think
was worth executing in a backward-compatible way. Before that, the last
breaking change was in September 2020.

Tracking breaking changes
-------------------------

Internally, GeckoView tracks the API using apilint. Each change that touches
the API requires an additional GeckoView peer to review the patch and a
description of the change in the changelog.

Apilint also tracks deprecated APIs and enforces their removal, so that old,
deprecated APIs don’t linger in the codebase for longer than necessary.

The future
----------

The ideal end state for GeckoView would be to not have any more
backward-incompatible changes. Our experience is that supporting the old APIs
for a limited time is a small overhead in our development and that the benefits
of having a backward-compatible API greatly outweigh the cost.

We cannot, however, predict all future needs of GeckoView and IceCat as a
whole, so we cannot exclude the possibility of having new breaking changes
going forward.