# Chrome OS Update Process

[TOC]

System updates in more modern operating systems like Chrome OS and Android are
called A/B updates, over-the-air ([OTA]) updates, seamless updates, or simply
auto updates. More traditional system updates (as on Windows or macOS) boot the
system into a special mode that overwrites the system partitions with the newer
version, which may take several minutes or even hours. In contrast, A/B updates
have several advantages, including but not limited to:

*   Updates maintain a workable system that remains on the disk during and
    after an update. This reduces the likelihood of corrupting a device into a
    non-usable state, and reduces the need for flashing devices manually or at
    repair and warranty centers.
*   Updates can happen while the system is running (normally with minimal
    overhead) without interrupting the user. The only downside for users is a
    required reboot (or, in Chrome OS, a sign out, which automatically causes a
    reboot if an update was performed). The reboot takes about 10 seconds and
    is no different from a normal reboot.
*   The user does not need to request an update (although they can). Update
    checks happen periodically in the background.
*   If the update fails to apply, the user is not affected. The user continues
    on the old version of the system and the system attempts to apply the
    update again at a later time.
*   If the update applies correctly but fails to boot, the system rolls back
    to the old partition and the user can still use the system as usual.
*   The user does not need to reserve space for the update. The system has
    already reserved enough space in the form of two copies (A and B) of each
    partition. The system doesn't even need any cache space on the disk;
    everything happens seamlessly from network to memory to the inactive
    partitions.

## Life of an A/B Update

In A/B update capable systems, each partition, such as the kernel or root (or
other artifacts like [DLC]), has two copies. We call these two copies active (A)
and inactive (B). The system boots into the active partition (determined by
which copy has the higher priority at boot time), and when a new update is
available, it is written into the inactive partition. After a successful reboot,
the previously inactive partition becomes active and the old active partition
becomes inactive.

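The slot swap described above can be sketched as follows. This is a minimal
illustration only: the `Slot` class, `pick_boot_slot`, and `finish_update` are
hypothetical names chosen for this sketch, and the real bookkeeping is done by
the bootloader and the updater client rather than by Python objects.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    """Hypothetical model of one copy (A or B) of the system partitions."""
    name: str
    priority: int
    bootable: bool = True


def pick_boot_slot(slots):
    """Return the bootable slot with the highest priority."""
    candidates = [slot for slot in slots if slot.bootable]
    return max(candidates, key=lambda slot: slot.priority)


def finish_update(slots, updated):
    """Mark the freshly written slot bootable and give it the top priority."""
    updated.bootable = True
    updated.priority = max(slot.priority for slot in slots) + 1


slots = [Slot("A", priority=2), Slot("B", priority=1)]
active = pick_boot_slot(slots)                       # slot A is booted
inactive = next(s for s in slots if s is not active)
finish_update(slots, inactive)                       # update written to B
next_boot = pick_boot_slot(slots)                    # B wins on next boot
```

After the reboot the roles swap: the slot that was inactive is now the one the
system runs from, and the old active slot becomes the target of the next update.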
### Generation

Everything starts with generating OTA packages on (Google) servers for each new
system image. This is done by calling
[ota_from_target_files](https://cs.android.com/android/platform/superproject/+/master:build/make/tools/releasetools/ota_from_target_files.py)
with source and destination builds. This script requires `target_files.zip` to
work; image files alone are not sufficient.

### Distribution/Configuration

Once the OTA packages are generated, they are signed with specific keys and
stored in a location known to an update server (GOTA). GOTA then makes the OTA
package accessible via a public URL. Optionally, operators can choose to make
the OTA update available only to a specific subset of devices.

### Installation

When the device's updater client initiates an update (either periodically or
user initiated), it first consults different device policies to see if the
update check is allowed. For example, device policies can prevent an update
check during certain times of the day, or they can require the update check
time to be scattered randomly throughout the day.

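The two policy examples above could be modeled roughly like this. The function
names, the blocked-hours representation, and the one-hour scatter window are all
assumptions made for illustration; they are not the updater client's actual
policy API.

```python
import random


def update_check_allowed(hour, blocked_hours):
    """Hypothetical policy: refuse update checks during blocked hours."""
    return hour not in blocked_hours


def scattered_delay_seconds(max_scatter_seconds, rng=random):
    """Hypothetical scattering: wait a random time before the next check."""
    return rng.uniform(0, max_scatter_seconds)


# Block checks during an assumed 09:00-17:00 window and spread the remaining
# checks over an assumed one-hour scatter window.
blocked = range(9, 17)
delay = scattered_delay_seconds(3600, random.Random(0))
```

Scattering spreads the load on the update server, because devices that share a
policy do not all check at the same moment.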
Once policies allow the update check, the updater client sends a request to the
update server (all of this communication happens over HTTPS) and identifies
itself with parameters like its Application ID, hardware ID, version, board,
etc.

Some policies on the server might prevent the device from getting specific OTA
updates. These server-side policies are often set by operators. For example, the
operator might want to deliver a beta version of software to only a subset of
devices.

If the update server decides to serve an update payload, it responds with all
the parameters needed to perform an update, like the URLs to download the
payloads, the metadata signatures, the payload size and hash, etc. The updater
client continues communicating with the update server after different state
changes, for example reporting that it started to download the payload, that it
finished the update, or that the update failed with a specific error code.

The device then proceeds to actually install the OTA update. This consists of
roughly 3 steps.

#### Download & Install

Each payload consists of two main sections: metadata and extra data. The
metadata is basically a list of operations that should be performed for an
update. The extra data contains the data blobs needed by some or all of these
operations. The updater client first downloads the metadata and
cryptographically verifies it using the signatures provided in the update
server's response. Once the metadata is verified as valid, the rest of the
payload can easily be verified cryptographically (mostly through SHA256 hashes).

Next, the updater client marks the inactive partition as unbootable (because it
needs to write the new update into it). At this point the system can no longer
roll back to the inactive partition.

Then, the updater client performs the operations defined in the metadata (in the
order they appear in the metadata), and the rest of the payload is gradually
downloaded as these operations require their data. Once an operation is
finished, its data is discarded. This eliminates the need for caching the entire
payload before applying it. During this process the updater client periodically
checkpoints the last operation performed, so that in the event of a failure,
system shutdown, etc. it can continue from where it left off without redoing all
operations from the beginning.

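The checkpoint-and-resume behavior can be illustrated with a small sketch. It
assumes a dictionary standing in for the persisted preferences and a
hypothetical `next_operation` key, and it checkpoints after every operation for
simplicity, whereas the text above only says checkpoints happen periodically.

```python
def apply_operations(operations, apply_one, prefs):
    """Apply operations in order, resuming after the last checkpoint."""
    next_op = prefs.get("next_operation", 0)
    while next_op < len(operations):
        apply_one(operations[next_op])
        next_op += 1
        prefs["next_operation"] = next_op  # checkpoint the progress
    return next_op


# Simulate an interruption while applying the third operation.
applied = []
prefs = {}


def flaky_apply(op):
    if op == "op2" and not prefs.get("already_failed"):
        prefs["already_failed"] = True
        raise IOError("simulated power loss")
    applied.append(op)


ops = ["op0", "op1", "op2", "op3"]
try:
    apply_operations(ops, flaky_apply, prefs)
except IOError:
    pass  # the daemon restarts and tries again

apply_operations(ops, flaky_apply, prefs)  # resumes at op2, not at op0
```

The second call skips the operations that already completed, which is the
property that makes an interrupted update cheap to resume.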
During the download, the updater client hashes the downloaded bytes, and when
the download finishes, it checks the payload signature (located at the end of
the payload). If the signature cannot be verified, the update is rejected.

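Hashing while downloading, as described above, means the digest is updated chunk
by chunk instead of over a fully cached payload. The sketch below shows only
that incremental hashing with Python's `hashlib`; the chunk contents are made up
and the signature check itself is not shown.

```python
import hashlib


def hash_stream(chunks):
    """Hash chunks as they arrive; return (total_bytes, hex_digest)."""
    digest = hashlib.sha256()
    total = 0
    for chunk in chunks:
        digest.update(chunk)  # no need to keep earlier chunks around
        total += len(chunk)
    return total, digest.hexdigest()


# Made-up chunks standing in for pieces of a downloaded payload.
chunks = [b"CrAU", b"\x00" * 8, b"payload-data"]
size, streamed_digest = hash_stream(chunks)
```

The streamed digest is identical to the digest of the concatenated bytes, so
the final comparison does not require the payload to be stored on disk.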
#### Hash Verification & Verity Computation

After the inactive partition is updated, the updater client computes the
Forward Error Correction (also known as FEC or verity) code for each partition
and writes the computed verity data to the inactive partitions. In some updates,
verity data is included in the extra data, so this step is skipped.

Then, the entire partition is re-read, hashed, and compared to a hash value
passed in the metadata to make sure the update was successfully written into
the partition. The hash computed in this step includes the verity code written
in the previous step.

#### Postinstall

In the next step, the [Postinstall] scripts (if any) are called. From OTA's
perspective, these postinstall scripts are just black boxes. Usually postinstall
scripts optimize existing apps on the phone and run file system garbage
collection so that the device can boot fast after the OTA. These scripts are
managed by other teams.

#### Finishing Touches

Then the updater client goes into a state that indicates the update has
completed and the user needs to reboot the system. At this point, until the user
reboots (or signs out), the updater client will not do any more system updates
even if newer updates are available. However, it does continue to perform
periodic update checks so we can have statistics on the number of active devices
in the field.

After the update has proved successful, the inactive partition is marked to have
a higher priority (on boot, the partition with the higher priority is booted
first). Once the user reboots the system, it boots into the updated partition,
which is then marked as active. At this point, after the reboot, the
[update_verifier](https://cs.android.com/android/platform/superproject/+/master:bootable/recovery/update_verifier/)
program runs, reads all dm-verity devices to make sure the partitions aren't
corrupted, and then marks the update as successful.

A/B updates are considered complete at this point. Virtual A/B updates have an
additional step after this, called "merging". Merging usually takes a few
minutes, after which Virtual A/B updates are considered complete.

## Update Engine Daemon

The `update_engine` is a single-threaded daemon process that runs at all times.
This process is the heart of auto updates. It runs with lower priority in the
background and is one of the last processes to start after a system boot.
Different clients (like GMS Core or other services) can send requests for update
checks to the update engine. The details of how requests are passed to the
update engine are system dependent: in Chrome OS it is D-Bus (look at the
[D-Bus interface] for a list of all available methods), and on Android it is
binder.

There are many resiliency features embedded in the update engine that make auto
updates robust, including but not limited to:

*   If the update engine crashes, it restarts automatically.
*   During an active update it periodically checkpoints the state of the update,
    and if it fails to continue the update or crashes in the middle, it
    continues from the last checkpoint.
*   It retries failed network communication.
*   If it fails to apply a delta payload (due to bit changes on the active
    partition) a few times, it switches to a full payload.

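The last bullet, falling back from a delta to a full payload, can be sketched as
a simple retry loop. The limit of three attempts and the function names are
assumptions for illustration; the text above only says "a few times".

```python
def update_with_fallback(apply_delta, apply_full, max_delta_attempts=3):
    """Try the delta payload a few times, then switch to the full payload.

    `apply_delta` and `apply_full` are hypothetical callables that return
    True on success.
    """
    for attempt in range(1, max_delta_attempts + 1):
        if apply_delta():
            return ("delta", attempt)
    apply_full()
    return ("full", max_delta_attempts)


# A delta that never applies, e.g. because the source partition was modified.
calls = {"delta": 0, "full": 0}


def failing_delta():
    calls["delta"] += 1
    return False


def working_full():
    calls["full"] += 1
    return True


result = update_with_fallback(failing_delta, working_full)
```

A full payload does not read from the active partition, which is why it still
succeeds when the delta cannot be applied.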
The updater client writes its active preferences in
`/data/misc/update_engine/prefs`. These preferences help with tracking changes
during the lifetime of the updater client and allow the update process to
continue properly after failed attempts or crashes.

### Interactive vs Non-Interactive vs. Forced Updates

Non-interactive updates are updates that are scheduled periodically by the
update engine and happen in the background. Interactive updates, on the other
hand, happen when a user specifically requests an update check (e.g. by clicking
the "Check For Update" button in Chrome OS's About page). Depending on the
update server's policies, interactive updates have higher priority than
non-interactive updates (by carrying marker hints). The update servers may
decide not to provide an update if they are under heavy load, etc. There are
other internal differences between these two types of updates too. For example,
interactive updates try to install the update faster.

Forced updates are similar to interactive updates (initiated by some kind of
user action), but they can also be configured to act as non-interactive. Since
non-interactive updates happen periodically, a forced non-interactive update
causes a non-interactive update at the moment of the request, not at a later
time. We can trigger a forced non-interactive update with:

```bash
update_engine_client --interactive=false --check_for_update
```

### Network

The updater client can download payloads over Ethernet, WiFi, or Cellular
networks, depending on which one the device is connected to. Downloading over
Cellular networks prompts the user for permission, as it can consume a
considerable amount of data.

### Logs

In Chrome OS the `update_engine` logs are located in the
`/var/log/update_engine` directory. Whenever `update_engine` starts, it starts a
new log file with the current date and time in the log file's name
(`update_engine.log-DATE-TIME`). Many log files can be seen in
`/var/log/update_engine` after a few restarts of the update engine or after the
system reboots. The latest active log is symlinked to
`/var/log/update_engine.log`.

In Android the `update_engine` logs are located in
`/data/misc/update_engine_log`.

## Update Payload Generation

Update payload generation is the process of converting a set of
partitions/files into a format that is both understandable by the updater client
(especially if it's a much older version) and securely verifiable. This process
involves breaking the input partitions into smaller components and compressing
them in order to reduce the network bandwidth needed to download the payloads.

`delta_generator` is a tool with a wide range of options for generating
different types of update payloads. Its code is located in
`update_engine/payload_generator`. This directory contains all the source code
related to the mechanics of generating an update payload. None of the files in
this directory should be included or used in any library or executable other
than `delta_generator`, which means this directory does not get compiled into
the rest of the update engine tools.

However, using `delta_generator` directly is not recommended, as it has far too
many flags. Wrappers like
[ota_from_target_files](https://cs.android.com/android/platform/superproject/+/master:build/make/tools/releasetools/ota_from_target_files.py)
or [OTA Generator](https://github.com/google/ota-generator) should be used
instead.

### Update Payload File Specification

Each update payload file has a specific structure defined in the table below:

| Field                   | Size (bytes) | Type                                 | Description |
| ----------------------- | ------------ | ------------------------------------ | ----------- |
| Magic Number            | 4            | char[4]                              | Magic string "CrAU" identifying this as an update payload. |
| Major Version           | 8            | uint64                               | Payload major version number. |
| Manifest Size           | 8            | uint64                               | Manifest size in bytes. |
| Manifest Signature Size | 4            | uint32                               | Manifest signature blob size in bytes (only in major version 2). |
| Manifest                | Varies       | [DeltaArchiveManifest]               | The list of operations to be performed. |
| Manifest Signature      | Varies       | [Signatures]                         | The signature of the first five fields. There could be multiple signatures if the key has changed. |
| Payload Data            | Varies       | List of raw or compressed data blobs | The list of binary blobs used by operations in the metadata. |
| Payload Signature Size  | Varies       | uint64                               | The size of the payload signature. |
| Payload Signature       | Varies       | [Signatures]                         | The signature of the entire payload except the metadata signature. There could be multiple signatures if the key has changed. |

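The fixed-size fields at the start of the table can be read with Python's
`struct` module, as in the sketch below. The table does not state a byte order,
so big-endian is an assumption here, and `parse_header` is a hypothetical helper
rather than part of the `update_payload` scripts.

```python
import struct

MAGIC = b"CrAU"


def parse_header(data):
    """Parse the fixed-size header fields listed in the table above.

    Assumption: integers are stored big-endian.
    """
    if data[:4] != MAGIC:
        raise ValueError("not an update payload")
    major_version, manifest_size = struct.unpack(">QQ", data[4:20])
    header = {"major_version": major_version, "manifest_size": manifest_size}
    offset = 20
    if major_version == 2:
        # The manifest signature size field only exists in major version 2.
        (signature_size,) = struct.unpack(">I", data[20:24])
        header["manifest_signature_size"] = signature_size
        offset = 24
    header["manifest_offset"] = offset  # where the manifest bytes begin
    return header


# Build a synthetic header: major version 2, 1000-byte manifest,
# 256-byte manifest signature.
sample = MAGIC + struct.pack(">QQI", 2, 1000, 256)
header = parse_header(sample)
```

Checking the magic string first lets a reader reject files that are not update
payloads before interpreting any of the size fields.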
### Delta vs. Full Update Payloads

There are two types of payload: full and delta. A full payload is generated
solely from the target image (the image we want to update to) and has all the
data necessary to update the inactive partition. Hence, full payloads can be
quite large. A delta payload, on the other hand, is a differential update
generated by comparing the source image (the active partitions) with the target
image and producing the diffs between the two, similar to applications like
`diff` or `bsdiff`. Hence, updating the system using a delta payload requires
the system to read parts of the active partition in order to update the inactive
partition (or reconstruct the target partition). Delta payloads are
significantly smaller than full payloads. The structure of the payload is the
same for both types.

Payload generation is quite resource intensive, and its tools are implemented
with high parallelism.

#### Generating Full Payloads

A full payload is generated by breaking the partition into 2MiB (configurable)
chunks and either compressing each chunk using the bzip2 or XZ algorithm or
keeping it as raw data, depending on which produces the smallest result. Full
payloads are much larger than delta payloads and hence require a longer download
time if the network bandwidth is limited. On the other hand, full payloads are a
bit faster to apply because the system doesn't need to read data from the source
partition.

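The chunk-and-compress idea can be sketched with Python's standard `bz2` and
`lzma` modules. This is an illustration of the selection rule only, not the
`delta_generator` implementation; the operation names are borrowed from the
`REPLACE` family that this document lists for full payloads.

```python
import bz2
import lzma

CHUNK_SIZE = 2 * 1024 * 1024  # 2MiB, configurable


def best_encoding(chunk):
    """Return (operation_name, blob) for whichever encoding is smallest."""
    candidates = {
        "REPLACE": chunk,                    # raw data
        "REPLACE_BZ": bz2.compress(chunk),   # bzip2
        "REPLACE_XZ": lzma.compress(chunk),  # XZ container by default
    }
    name = min(candidates, key=lambda key: len(candidates[key]))
    return name, candidates[name]


def full_payload_operations(partition, chunk_size=CHUNK_SIZE):
    """Yield (offset, operation_name, blob) for each chunk of the partition."""
    for offset in range(0, len(partition), chunk_size):
        chunk = partition[offset:offset + chunk_size]
        name, blob = best_encoding(chunk)
        yield offset, name, blob


# Tiny demonstration with a 4KiB chunk size: a zero-filled chunk compresses
# well, while a two-byte tail is smallest when stored raw.
operations = list(full_payload_operations(bytes(4096) + b"ab", chunk_size=4096))
```

Very small or already-compressed chunks tend to stay raw because the
compression container overhead exceeds any savings.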
#### Generating Delta Payloads

Delta payloads are generated by looking at both the source and target image
data on a file and metadata basis (more precisely, at the file system level on
each appropriate partition). The reason we can generate delta payloads is that
Chrome OS partitions are read only, so with high certainty we can assume the
active partitions on the client's device are bit-by-bit equal to the original
partitions generated in the image generation/signing phase. The process for
generating a delta payload is roughly as follows:

1.  Find all the zero-filled blocks on the target partition and produce a `ZERO`
    operation for them. The `ZERO` operation basically discards the associated
    blocks (depending on the implementation).
2.  Find all the blocks that have not changed between the source and target
    partitions by directly comparing source and target blocks one-to-one, and
    produce a `SOURCE_COPY` operation for them.
3.  List all the files (and their associated blocks) in the source and target
    partitions and remove the blocks (and files) for which we have already
    generated operations in the last two steps. Treat the remaining metadata
    (inodes, etc.) of each partition as a file.
4.  If a file is new, generate a `REPLACE`, `REPLACE_XZ`, or `REPLACE_BZ`
    operation for its data blocks depending on which one generates a smaller
    data blob.
5.  For each other file, compare the source and target blocks and produce a
    `SOURCE_BSDIFF` or `PUFFDIFF` operation depending on which one generates a
    smaller data blob. These two operations produce binary diffs between a
    source and a target data blob. (Look at [bsdiff] and [puffin] for details
    of such binary differential programs.)
6.  Sort the operations based on their target partition's block offset.
7.  Optionally merge same or similar adjacent operations into larger operations
    for better efficiency and potentially smaller payloads.

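Steps 1 and 2 of the list above amount to a per-block comparison, sketched
below. The 4096-byte default block size and the `REMAINING` label (standing in
for blocks left to the file-based steps 3 to 5) are assumptions for this
illustration and are not names used by `delta_generator`.

```python
BLOCK_SIZE = 4096  # assumed block size for illustration


def classify_blocks(source, target, block_size=BLOCK_SIZE):
    """Label each target block as ZERO, SOURCE_COPY, or REMAINING."""
    zero_block = bytes(block_size)
    labels = []
    for offset in range(0, len(target), block_size):
        block = target[offset:offset + block_size]
        index = offset // block_size
        if block == zero_block[:len(block)]:
            labels.append(("ZERO", index))          # step 1
        elif block == source[offset:offset + block_size]:
            labels.append(("SOURCE_COPY", index))   # step 2
        else:
            labels.append(("REMAINING", index))     # left for steps 3-5
    return labels


# Tiny example with a 4-byte block size.
source = b"AAAABBBBCCCC"
target = b"\x00\x00\x00\x00BBBBDDDD"
labels = classify_blocks(source, target, block_size=4)
```

Only the blocks labeled `REMAINING` need the more expensive file-level
diffing, which is why the cheap block comparisons run first.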
Full payloads can only contain `REPLACE`, `REPLACE_BZ`, and `REPLACE_XZ`
operations. Delta payloads can contain any operations.

### Major and Minor versions

The major and minor versions specify the update payload file format and the
capability of the updater client to accept certain types of update payloads,
respectively. These numbers are [hard coded] in the updater client.

The major version is basically the update payload file version specified in the
[update payload file specification] above (second field). Each updater client
supports a range of major versions. Currently, there are only two major
versions: 1 and 2. Both Chrome OS and Android are on major version 2 (major
version 1 is being deprecated). Whenever there are new additions that cannot be
fitted into the [Manifest protobuf], we need to uprev the major version.
Upreving the major version should be done with utmost care because older clients
do not know how to handle the newer versions. Any major version uprev in Chrome
OS should be associated with a GoldenEye stepping stone.

The minor version defines the capability of the updater client to accept certain
operations or perform certain actions. Each updater client supports a range of
minor versions. For example, an updater client with minor version 4 (or less)
does not know how to handle a `PUFFDIFF` operation. So when generating a delta
payload for an image which has an updater client with minor version 4 (or less),
we cannot produce a `PUFFDIFF` operation for it. The payload generation process
looks at the source image's minor version to decide which types of operations it
supports and only generates a payload that conforms to those restrictions.
Similarly, if there is a bug in a client with a specific minor version, an uprev
of the minor version helps avoid generating payloads that cause that bug to
manifest. However, upreving minor versions is quite expensive in terms of
maintainability and can be error prone, so one should exercise caution when
making such a change.

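The minor-version gating can be sketched as a filter over candidate operations.
Only the `PUFFDIFF` threshold (not understood at minor version 4 or less) comes
from the text above; treating every other operation as always available is a
simplification, and both function names are hypothetical.

```python
def allowed_operations(minor_version):
    """Return the operations a client with this minor version accepts.

    Simplification: only the PUFFDIFF restriction is modeled.
    """
    operations = {
        "REPLACE", "REPLACE_BZ", "REPLACE_XZ",
        "ZERO", "SOURCE_COPY", "SOURCE_BSDIFF",
    }
    if minor_version >= 5:
        operations.add("PUFFDIFF")
    return operations


def pick_diff_operation(minor_version, blob_sizes):
    """Pick the smallest diff blob among operations the client understands."""
    permitted = {
        name: size
        for name, size in blob_sizes.items()
        if name in allowed_operations(minor_version)
    }
    return min(permitted, key=permitted.get)


# Made-up blob sizes in bytes for one file.
sizes = {"SOURCE_BSDIFF": 900, "PUFFDIFF": 400}
old_client_choice = pick_diff_operation(4, sizes)
new_client_choice = pick_diff_operation(5, sizes)
```

For the older client the generator has to accept the larger `SOURCE_BSDIFF`
blob, because a smaller payload that the client cannot apply is useless.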
Minor versions are irrelevant for full payloads. It should always be possible to
apply a full payload to very old clients. The reason is that updater clients may
not send their current version, so if we had different types of full payloads,
we would not know which version to serve to the client.

### Signed vs Unsigned Payloads

Update payloads can be signed (with private/public key pairs) for use in
production or be kept unsigned for use in testing. Tools like `delta_generator`
help with generating metadata and payload hashes or signing the payloads given
private keys.

## update_payload Scripts

[update_payload] contains a set of Python scripts used mostly to validate
payload generation and application. We normally test the update payloads using
an actual device (live tests). The [`brillo_update_payload`] script can be used
to generate a payload and test applying it on a host machine. These tests can be
viewed as dynamic tests without the need for an actual device. Other
`update_payload` scripts (like [`check_update_payload`]) can be used to
statically check that a payload is in the correct state and that its application
works correctly. These scripts actually apply the payload statically without
running the code in payload_consumer.

## Postinstall

[Postinstall] is a process called after the updater client writes the new image
artifacts to the inactive partitions. One of postinstall's main responsibilities
is to recreate the dm-verity tree hash at the end of the root partition. Among
other things, it installs new firmware updates and runs any board specific
processes. Postinstall runs in a separate chroot inside the newly installed
partition, so it is quite separated from the rest of the active running system.
Anything that needs to be done after an update and before the device is rebooted
should be implemented inside postinstall.

## Building Update Engine

You can build `update_engine` the same way as other platform applications:

### Setup

Run these commands at the top of the Android repository before building
anything. You only need to do this once per shell.

*   `source build/envsetup.sh`
*   `lunch aosp_cf_x86_64_only_phone-userdebug` (or replace
    `aosp_cf_x86_64_only_phone-userdebug` with your own target)

### Building

`m update_engine update_engine_client delta_generator`

## Running Unit Tests

[Running unit tests similar to other platforms]:

*   `atest update_engine_unittests`: You will need a device connected to your
    laptop and accessible via ADB to do this. Cuttlefish works as well.
*   `atest update_engine_host_unittests`: Runs a subset of the tests on the
    host; no device is required.

## Initiating a Configured Update

There are different methods to initiate an update:

*   Click on the "Check For Update" button in the Settings About page. There is
    no way to configure this kind of update check.
*   Use the [`scripts/update_device.py`] program and pass a path to your OTA zip
    file.

## Note to Developers and Maintainers

When changing the update engine source code, be extra careful about these
things:

### Do NOT Break Backward Compatibility

At each release cycle we should be able to generate full and delta payloads that
can correctly be applied to older devices that run older versions of the update
engine client. For example, removing or not passing arguments in the metadata
proto file might break older clients, and passing operations that older clients
do not understand will break them. Whenever changing anything in the payload
generation process, ask yourself: Would it work on older clients? If not, do I
need to control it with minor versions or any other means?

Especially regarding enterprise rollback, a newer updater client should be able
to accept an older update payload. Normally this happens using a full payload,
but care should be taken not to break this compatibility.

### Think About The Future

When creating a change in the update engine, think about five years from now:

*   How can the change be implemented so that older clients don't break five
    years from now?
*   How is it going to be maintained five years from now?
*   How can it make future changes easier without breaking older clients or
    incurring heavy maintenance costs?

| 446 | ### Prefer Not To Implement Your Feature In The Updater Client |
| 447 | If a feature can be implemented from server side, Do NOT implement it in the |
| 448 | client updater. Because the client updater can be fragile at points and small |
| 449 | mistakes can have catastrophic consequences. For example, if a bug is introduced |
| 450 | in the updater client that causes it to crash right before checking for update |
| 451 | and we can't quite catch this bug early in the release process, then the |
| 452 | production devices which have already moved to the new buggy system, may no |
| 453 | longer receive automatic updates anymore. So, always think if the feature is |
| 454 | being implemented can be done form the server side (with potentially minimal |
| 455 | changes to the client updater)? Or can the feature be moved to another service |
| 456 | with minimal interface to the updater client. Answering these questions will pay |
| 457 | off greatly in the future. |

### Be Respectful Of Other Code Bases

~~The current update engine code base is used in many projects like Android.~~

The Android and Chrome OS code bases have officially diverged.

We sync the code base among these two projects frequently. Try not to break
Android or other systems that share the update engine code. Whenever landing a
change, always think about whether Android needs that change:

* How will it affect Android?
* Can the change be moved behind an interface, with stub implementations
  provided so as not to affect Android?
* Can Chrome OS or Android specific code be guarded by macros?
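
One way to read the interface-and-stub suggestion is sketched below. The class
names, the `CreatePlatformHooks` factory, and the `EXAMPLE_PLATFORM_CHROMEOS`
macro are hypothetical and exist only for this illustration; they are not names
from the update engine source. The common code depends only on the interface,
and the platform-specific implementation is compiled in only when the macro is
defined:

```cpp
#include <memory>
#include <string>

// Hypothetical interface: the common code depends only on this class.
class PlatformHooksInterface {
 public:
  virtual ~PlatformHooksInterface() = default;
  virtual std::string PlatformName() const = 0;
  virtual bool NotifyUpdateApplied() = 0;
};

// Stub used on platforms that do not need the feature. It does nothing and
// reports success, so the common code behaves the same everywhere.
class PlatformHooksStub : public PlatformHooksInterface {
 public:
  std::string PlatformName() const override { return "stub"; }
  bool NotifyUpdateApplied() override { return true; }
};

#if defined(EXAMPLE_PLATFORM_CHROMEOS)  // Hypothetical macro.
// Platform-specific implementation, compiled only where it is needed.
class PlatformHooksChromeOS : public PlatformHooksInterface {
 public:
  std::string PlatformName() const override { return "chromeos"; }
  bool NotifyUpdateApplied() override {
    // Platform-specific work would go here.
    return true;
  }
};
#endif

// Factory: the only place that knows which implementation is built.
std::unique_ptr<PlatformHooksInterface> CreatePlatformHooks() {
#if defined(EXAMPLE_PLATFORM_CHROMEOS)
  return std::make_unique<PlatformHooksChromeOS>();
#else
  return std::make_unique<PlatformHooksStub>();
#endif
}
```

Keeping the macro check inside a single factory function means the rest of the
common code never mentions a platform by name.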

As a basic measure, if adding, removing, or renaming code, make sure to change
both `BUILD.gn` and `Android.bp`. Do not bring Chrome OS specific code (for
example, other libraries that live in `system_api` or `dlcservice`) into the
common code of update_engine. Try to separate these concerns using best software
engineering practices.

### Merging from Android (or other code bases)

Chrome OS tracks the Android code as an [upstream branch]. To merge the Android
code into Chrome OS (or vice versa), do a `git merge` of that branch into
Chrome OS, test it by whatever means are available, and upload a merge commit.

```bash
# Start a local branch for the merge.
repo start merge-aosp
# Merge the upstream branch, always creating a merge commit.
git merge --no-ff --strategy=recursive -X patience cros/upstream
# Upload the merge commit for review.
repo upload --cbr --no-verify .
```

[Postinstall]: #postinstall
[update payload file specification]: #update-payload-file-specification
[OTA]: https://source.android.com/devices/tech/ota
[DLC]: https://chromium.googlesource.com/chromiumos/platform2/+/master/dlcservice
[`chromeos-setgoodkernel`]: https://chromium.googlesource.com/chromiumos/platform2/+/master/installer/chromeos-setgoodkernel
[D-Bus interface]: /dbus_bindings/org.chromium.UpdateEngineInterface.dbus-xml
[this repository]: /
[UpdateManager]: /update_manager/update_manager.cc
[update_manager]: /update_manager/
[P2P update related code]: https://chromium.googlesource.com/chromiumos/platform2/+/master/p2p/
[`cros_generate_update_payloads`]: https://chromium.googlesource.com/chromiumos/chromite/+/master/scripts/cros_generate_update_payload.py
[`chromite/lib/paygen`]: https://chromium.googlesource.com/chromiumos/chromite/+/master/lib/paygen/
[DeltaArchiveManifest]: /update_metadata.proto#302
[Signatures]: /update_metadata.proto#122
[hard coded]: /update_engine.conf
[Manifest protobuf]: /update_metadata.proto
[update_payload]: /scripts/
[Postinstall]: https://chromium.googlesource.com/chromiumos/platform2/+/master/installer/chromeos-postinst
[`update_engine` protobufs]: https://chromium.googlesource.com/chromiumos/platform2/+/master/system_api/dbus/update_engine/
[Running unit tests similar to other platforms]: https://chromium.googlesource.com/chromiumos/docs/+/master/testing/running_unit_tests.md
[Nebraska]: https://chromium.googlesource.com/chromiumos/platform/dev-util/+/master/nebraska/
[upstream branch]: https://chromium.googlesource.com/aosp/platform/system/update_engine/+/upstream
[`cros flash`]: https://chromium.googlesource.com/chromiumos/docs/+/master/cros_flash.md
[bsdiff]: https://android.googlesource.com/platform/external/bsdiff/+/master
[puffin]: https://android.googlesource.com/platform/external/puffin/+/master
[`update_engine_client`]: /update_engine_client.cc
[`brillo_update_payload`]: /scripts/brillo_update_payload
[`check_update_payload`]: /scripts/paycheck.py
[Dev Server]: https://chromium.googlesource.com/chromiumos/chromite/+/master/docs/devserver.md