|  | CPU cooling APIs How To | 
|  | =================================== | 
|  |  | 
|  | Written by Amit Daniel Kachhap <[email protected]> | 
|  |  | 
|  | Updated: 6 Jan 2015 | 
|  |  | 
|  | Copyright (c)  2012 Samsung Electronics Co., Ltd(http://www.samsung.com) | 
|  |  | 
|  | 0. Introduction | 
|  |  | 
|  | The generic cpu cooling (freq clipping) provides registration/unregistration APIs | 
|  | to the caller. The binding of the cooling devices to the trip point is left for | 
|  | the user. The registration APIs return the cooling device pointer. | 
|  |  | 
|  | 1. cpu cooling APIs | 
|  |  | 
|  | 1.1 cpufreq registration/unregistration APIs | 
|  | 1.1.1 struct thermal_cooling_device *cpufreq_cooling_register( | 
|  | const struct cpumask *clip_cpus) | 
|  |  | 
|  | This interface function registers the cpufreq cooling device with the name | 
|  | "thermal-cpufreq-%x". This API supports multiple instances of cpufreq | 
|  | cooling devices. | 
|  |  | 
|  | clip_cpus: cpumask of the cpus to which the frequency constraints apply. | 
|  |  | 
|  | 1.1.2 struct thermal_cooling_device *of_cpufreq_cooling_register( | 
|  | struct device_node *np, const struct cpumask *clip_cpus) | 
|  |  | 
|  | This interface function registers the cpufreq cooling device with | 
|  | the name "thermal-cpufreq-%x" and links it to a device tree node so | 
|  | that it can be bound via the thermal DT code. This API supports multiple | 
|  | instances of cpufreq cooling devices. | 
|  |  | 
|  | np: pointer to the device tree node of the cooling device | 
|  | clip_cpus: cpumask of the cpus to which the frequency constraints apply. | 
|  |  | 
|  | 1.1.3 struct thermal_cooling_device *cpufreq_power_cooling_register( | 
|  | const struct cpumask *clip_cpus, u32 capacitance, | 
|  | get_static_t plat_static_func) | 
|  |  | 
|  | Similar to cpufreq_cooling_register, this function registers a cpufreq | 
|  | cooling device.  Using this function, the cooling device will | 
|  | implement the power extensions by using a simple cpu power model.  The | 
|  | cpus must have registered their OPPs using the OPP library. | 
|  |  | 
|  | The additional parameters are needed for the power model (See 2. Power | 
|  | models).  "capacitance" is the dynamic power coefficient (See 2.1 | 
|  | Dynamic power).  "plat_static_func" is a function to calculate the | 
|  | static power consumed by these cpus (See 2.2 Static power). | 
|  |  | 
|  | 1.1.4 struct thermal_cooling_device *of_cpufreq_power_cooling_register( | 
|  | struct device_node *np, const struct cpumask *clip_cpus, u32 capacitance, | 
|  | get_static_t plat_static_func) | 
|  |  | 
|  | Similar to cpufreq_power_cooling_register, this function registers a | 
|  | cpufreq cooling device with power extensions using the device tree | 
|  | information supplied by the np parameter. | 
|  |  | 
|  | 1.1.5 void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev) | 
|  |  | 
|  | This interface function unregisters the "thermal-cpufreq-%x" cooling device. | 
|  |  | 
|  | cdev: Cooling device pointer which has to be unregistered. | 
|  |  | 
|  | 2. Power models | 
|  |  | 
|  | The power API registration functions provide a simple power model for | 
|  | CPUs.  The current power is calculated as dynamic + (optionally) | 
|  | static power.  This power model requires that the operating points of | 
|  | the CPUs are registered using the kernel's OPP library and that the | 
|  | `cpufreq_frequency_table` is assigned to the `struct device` of the | 
|  | cpu.  If you are using CONFIG_CPUFREQ_DT then the | 
|  | `cpufreq_frequency_table` should already be assigned to the cpu | 
|  | device. | 
|  |  | 
|  | The `plat_static_func` parameter of `cpufreq_power_cooling_register()` | 
|  | and `of_cpufreq_power_cooling_register()` is optional.  If you don't | 
|  | provide it, only dynamic power will be considered. | 
|  |  | 
|  | 2.1 Dynamic power | 
|  |  | 
|  | The dynamic power consumption of a processor depends on many factors. | 
|  | For a given processor implementation the primary factors are: | 
|  |  | 
|  | - The time the processor spends running, consuming dynamic power, as | 
|  | compared to the time in idle states where dynamic consumption is | 
|  | negligible.  Herein we refer to this as 'utilisation'. | 
|  | - The voltage and frequency levels as a result of DVFS.  The DVFS | 
|  | level is a dominant factor governing power consumption. | 
|  | - While running, the 'execution' behaviour (instruction types, memory | 
|  | access patterns and so forth) causes, in most cases, a second order | 
|  | variation.  In pathological cases this variation can be significant, | 
|  | but typically it has much less impact than the factors above. | 
|  |  | 
|  | A high level dynamic power consumption model may then be represented as: | 
|  |  | 
|  | Pdyn = f(run) * Voltage^2 * Frequency * Utilisation | 
|  |  | 
|  | f(run) here represents the described execution behaviour and its | 
|  | result has units of Watts/Hz/Volt^2 (this is often expressed in | 
|  | mW/MHz/uVolt^2). | 
|  |  | 
|  | The detailed behaviour for f(run) could be modelled on-line.  However, | 
|  | in practice, such an on-line model depends on a number of | 
|  | implementation specific processor support and characterisation | 
|  | factors.  Therefore, in the initial implementation that contribution is | 
|  | represented as a constant coefficient.  This is a simplification | 
|  | consistent with the relative contribution to overall power variation. | 
|  |  | 
|  | In this simplified representation our model becomes: | 
|  |  | 
|  | Pdyn = Capacitance * Voltage^2 * Frequency * Utilisation | 
|  |  | 
|  | Where `capacitance` is a constant that represents an indicative | 
|  | running time dynamic power coefficient in fundamental units of | 
|  | mW/MHz/uVolt^2.  Typical values for mobile CPUs might lie in the range | 
|  | from 100 to 500.  For reference, the approximate values for the SoC in | 
|  | ARM's Juno Development Platform are 530 for the Cortex-A57 cluster and | 
|  | 140 for the Cortex-A53 cluster. | 
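
As a rough illustration of the simplified model, the C sketch below
evaluates Pdyn with integer arithmetic.  The function name
`dyn_power_mw`, the input units (MHz, mV, percent) and the scaling
divisor are assumptions of this sketch chosen to give a result in mW;
they are not part of the cpu cooling API.

```c
#include <stdint.h>

/*
 * Sketch of: Pdyn = Capacitance * Voltage^2 * Frequency * Utilisation
 *
 * Assumed units (not taken from the API): frequency in MHz, voltage in
 * mV, utilisation in percent.  The 10^9 divisor is an assumed scaling
 * that turns the product into mW; the extra 100 removes the percent.
 * Intermediate values are kept in 64 bits, which is ample for realistic
 * inputs but is not guarded against extreme ones.
 */
uint32_t dyn_power_mw(uint32_t capacitance, uint32_t freq_mhz,
		      uint32_t voltage_mv, uint32_t util_pct)
{
	uint64_t power;

	/* Utilisation is a fraction of running time, so cap it at 100%. */
	if (util_pct > 100)
		util_pct = 100;

	power = (uint64_t)capacitance * freq_mhz;
	power *= (uint64_t)voltage_mv * voltage_mv;
	power *= util_pct;

	return (uint32_t)(power / (1000000000ULL * 100ULL));
}
```

With the Juno Cortex-A57 coefficient of 530 and a hypothetical
operating point of 1100 MHz at 900 mV, this sketch returns 472 at full
utilisation and 236 at 50% utilisation.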
|  |  | 
|  |  | 
|  | 2.2 Static power | 
|  |  | 
|  | Static leakage power consumption depends on a number of factors.  For a | 
|  | given circuit implementation the primary factors are: | 
|  |  | 
|  | - Time the circuit spends in each 'power state' | 
|  | - Temperature | 
|  | - Operating voltage | 
|  | - Process grade | 
|  |  | 
|  | To first order, the 'power state' in which the circuit spends a given | 
|  | evaluation period is either OFF or ON.  However, 'retention' states | 
|  | that reduce power during inactive periods without loss of context can | 
|  | also be supported. | 
|  |  | 
|  | Note: The visibility of state entries to the OS can vary, according to | 
|  | platform specifics, and this can then impact the accuracy of a model | 
|  | based on OS state information alone.  It might be possible in some | 
|  | cases to extract more accurate information from system resources. | 
|  |  | 
|  | The temperature, operating voltage and process 'grade' (slow to fast) | 
|  | of the circuit are all significant factors in static leakage power | 
|  | consumption.  All of these have complex relationships to static power. | 
|  |  | 
|  | Circuit implementation specific factors include the chosen silicon | 
|  | process as well as the type, number and size of transistors in both | 
|  | the logic gates and any RAM elements included. | 
|  |  | 
|  | The static power consumption modelling must take into account the | 
|  | power managed regions that are implemented.  Taking the example of an | 
|  | ARM processor cluster, the modelling would take into account whether | 
|  | each CPU can be powered OFF separately or if only a single power | 
|  | region is implemented for the complete cluster. | 
|  |  | 
|  | In one view (there are others), a static power consumption model can | 
|  | then start from a set of reference values for each power managed | 
|  | region (e.g. CPU, Cluster/L2) in each state (e.g. ON, OFF) at an | 
|  | arbitrary process grade, voltage and temperature point.  These values | 
|  | are then scaled for all of the following: the time in each state, the | 
|  | process grade, the current temperature and the operating voltage. | 
|  | However, since both implementation specific and complex relationships | 
|  | dominate the estimate, the appropriate interface to the model from the | 
|  | cpu cooling device is to provide a function callback that calculates | 
|  | the static power on this platform.  When registering the cpu cooling | 
|  | device, pass a function pointer that follows the `get_static_t` | 
|  | prototype: | 
|  |  | 
|  | int plat_get_static(cpumask_t *cpumask, int interval, | 
|  | unsigned long voltage, u32 *power); | 
|  |  | 
|  | `cpumask` is the cpumask of the cpus involved in the calculation. | 
|  | `voltage` is the voltage at which they are operating.  The function | 
|  | should calculate the average static power for the last `interval` | 
|  | milliseconds.  It returns 0 on success, -E* on error.  If it | 
|  | succeeds, it should store the static power in `power`.  Reading the | 
|  | temperature of the cpus described by `cpumask` is left for | 
|  | plat_get_static() to do as the platform knows best which thermal | 
|  | sensor is closest to the cpu. | 
|  |  | 
|  | If `plat_static_func` is NULL, static power is considered to be | 
|  | negligible for this platform and only dynamic power is considered. | 
|  |  | 
|  | The platform specific callback can then use any combination of tables | 
|  | and/or equations to compute the estimated value.  Process grade | 
|  | information is not passed to the model since access to such data, from | 
|  | on-chip measurement capability or manufacture time data, is platform | 
|  | specific. | 
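
A table-based callback of this kind might look like the userspace
sketch below.  It takes `power` as a pointer, since C has no reference
parameters.  Everything prefixed `example_` or `EXAMPLE_` is
hypothetical: the leakage table values are placeholders rather than
measurements, the sensor read is a stub, the `cpumask_t` type is a
stand-in so the sketch builds outside the kernel, and the linear
voltage scaling (with voltage assumed to be in microvolts) is a
deliberate simplification.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t u32;

/* Userspace stand-in for the kernel's cpumask_t (assumption of this sketch). */
typedef struct {
	unsigned long bits;
} cpumask_t;

/* Placeholder per-cpu leakage in mW at the reference voltage, by band. */
struct example_leak_entry {
	int max_temp_mc;	/* upper bound of the band, millidegrees C */
	u32 leak_mw;
};

static const struct example_leak_entry example_leak_table[] = {
	{ 40000, 10 },
	{ 70000, 25 },
	{ 100000, 60 },
};

/* Reference voltage for the table above, assumed to be in microvolts. */
#define EXAMPLE_REF_VOLTAGE 900000UL

/* Stub sensor read; a real platform would query its closest sensor. */
static int example_read_temp_mc(const cpumask_t *cpumask, int *temp_mc)
{
	(void)cpumask;
	*temp_mc = 55000;	/* fixed placeholder reading */
	return 0;
}

/* Count the cpus set in the stand-in mask. */
static unsigned int example_count_cpus(const cpumask_t *cpumask)
{
	unsigned long bits = cpumask->bits;
	unsigned int n = 0;

	for (; bits; bits >>= 1)
		n += (unsigned int)(bits & 1UL);
	return n;
}

int plat_get_static(cpumask_t *cpumask, int interval,
		    unsigned long voltage, u32 *power)
{
	const size_t nr = sizeof(example_leak_table) /
			  sizeof(example_leak_table[0]);
	size_t i;
	int temp_mc, ret;

	if (!cpumask || !power || interval <= 0)
		return -EINVAL;

	/* The sketch treats the temperature as steady over the interval. */
	ret = example_read_temp_mc(cpumask, &temp_mc);
	if (ret)
		return ret;

	/* Use the first band that covers the temperature, else the hottest. */
	for (i = 0; i < nr - 1; i++)
		if (temp_mc <= example_leak_table[i].max_temp_mc)
			break;

	/* Scale the per-cpu table value by cpu count and by voltage. */
	*power = (u32)((uint64_t)example_leak_table[i].leak_mw *
		       example_count_cpus(cpumask) * voltage /
		       EXAMPLE_REF_VOLTAGE);
	return 0;
}
```

A real callback would replace the stub and the table with platform
data, and could fold in process grade information where the platform
makes it available.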
|  |  | 
|  | Note: the significance of static power for CPUs in comparison to | 
|  | dynamic power is highly dependent on implementation.  Given the | 
|  | potential complexity in implementation, the importance and accuracy of | 
|  | its inclusion when using cpu cooling devices should be assessed on a | 
|  | case by case basis. | 
|  |  |