|  | Device Whitelist Controller | 
|  |  | 
|  | 1. Description: | 
|  |  | 
|  | Implement a cgroup to track and enforce open and mknod restrictions | 
|  | on device files.  A device cgroup associates a device access | 
|  | whitelist with each cgroup.  A whitelist entry has 4 fields. | 
|  | 'type' is a (all), c (char), or b (block).  'all' means it applies | 
|  | to all types and all major and minor numbers.  Major and minor are | 
|  | either an integer or * for all.  Access is a composition of r | 
|  | (read), w (write), and m (mknod). | 
|  |  | 
|  | The root device cgroup starts with rwm to 'all'.  A child device | 
|  | cgroup gets a copy of the parent.  Administrators can then remove | 
|  | devices from the whitelist or add new entries.  A child cgroup can | 
|  | never receive a device access which is denied by its parent. | 
|  |  | 
|  | 2. User Interface | 
|  |  | 
|  | An entry is added using devices.allow, and removed using | 
|  | devices.deny.  For instance | 
|  |  | 
|  | echo 'c 1:3 mr' > /sys/fs/cgroup/1/devices.allow | 
|  |  | 
|  | allows cgroup 1 to read and mknod the device usually known as | 
|  | /dev/null.  Doing | 
|  |  | 
|  | echo a > /sys/fs/cgroup/1/devices.deny | 
|  |  | 
|  | will remove the default 'a *:* rwm' entry. Doing | 
|  |  | 
|  | echo a > /sys/fs/cgroup/1/devices.allow | 
|  |  | 
|  | will add the 'a *:* rwm' entry to the whitelist. | 
|  |  | 
|  | 3. Security | 
|  |  | 
|  | Any task can move itself between cgroups.  This clearly won't | 
|  | suffice, but we can decide the best way to adequately restrict | 
|  | movement as people get some experience with this.  We may just want | 
|  | to require CAP_SYS_ADMIN, which at least is a separate bit from | 
|  | CAP_MKNOD.  We may want to just refuse moving to a cgroup which | 
|  | isn't a descendant of the current one.  Or we may want to use | 
|  | CAP_MAC_ADMIN, since we really are trying to lock down root. | 
|  |  | 
|  | CAP_SYS_ADMIN is needed to modify the whitelist or move another | 
|  | task to a new cgroup.  (Again we'll probably want to change that). | 
|  |  | 
|  | A cgroup may not be granted more permissions than the cgroup's | 
|  | parent has. | 
|  |  | 
|  | 4. Hierarchy | 
|  |  | 
|  | device cgroups maintain hierarchy by making sure a cgroup never has more | 
|  | access permissions than its parent.  Every time an entry is written to | 
|  | a cgroup's devices.deny file, all its children will have that entry removed | 
|  | from their whitelist and all the locally set whitelist entries will be | 
|  | re-evaluated.  In case one of the locally set whitelist entries would provide | 
|  | more access than the cgroup's parent, it'll be removed from the whitelist. | 
|  |  | 
|  | Example: | 
|  | A | 
|  | / \ | 
|  | B | 
|  |  | 
|  | group        behavior	exceptions | 
|  | A            allow		"b 8:* rwm", "c 116:1 rw" | 
|  | B            deny		"c 1:3 rwm", "c 116:2 rwm", "b 3:* rwm" | 
|  |  | 
|  | If a device is denied in group A: | 
|  | # echo "c 116:* r" > A/devices.deny | 
|  | it'll propagate down and after revalidating B's entries, the whitelist entry | 
|  | "c 116:2 rwm" will be removed: | 
|  |  | 
|  | group        whitelist entries                        denied devices | 
|  | A            all                                      "b 8:* rwm", "c 116:* rw" | 
|  | B            "c 1:3 rwm", "b 3:* rwm"                 all the rest | 
|  |  | 
|  | In case parent's exceptions change and local exceptions are not allowed | 
|  | anymore, they'll be deleted. | 
|  |  | 
|  | Notice that new whitelist entries will not be propagated: | 
|  | A | 
|  | / \ | 
|  | B | 
|  |  | 
|  | group        whitelist entries                        denied devices | 
|  | A            "c 1:3 rwm", "c 1:5 r"                   all the rest | 
|  | B            "c 1:3 rwm", "c 1:5 r"                   all the rest | 
|  |  | 
|  | when adding "c *:3 rwm": | 
|  | # echo "c *:3 rwm" >A/devices.allow | 
|  |  | 
|  | the result: | 
|  | group        whitelist entries                        denied devices | 
|  | A            "c *:3 rwm", "c 1:5 r"                   all the rest | 
|  | B            "c 1:3 rwm", "c 1:5 r"                   all the rest | 
|  |  | 
|  | but now it'll be possible to add new entries to B: | 
|  | # echo "c 2:3 rwm" >B/devices.allow | 
|  | # echo "c 50:3 r" >B/devices.allow | 
|  | or even | 
|  | # echo "c *:3 rwm" >B/devices.allow | 
|  |  | 
|  | Allowing or denying all by writing 'a' to devices.allow or devices.deny will | 
|  | not be possible once the device cgroups has children. | 
|  |  | 
|  | 4.1 Hierarchy (internal implementation) | 
|  |  | 
|  | device cgroups is implemented internally using a behavior (ALLOW, DENY) and a | 
|  | list of exceptions.  The internal state is controlled using the same user | 
|  | interface to preserve compatibility with the previous whitelist-only | 
|  | implementation.  Removal or addition of exceptions that will reduce the access | 
|  | to devices will be propagated down the hierarchy. | 
|  | For every propagated exception, the effective rules will be re-evaluated based | 
|  | on current parent's access rules. |