Optimization for isZero to reduce processor instructions
Results showed an improvement of ~0.58% of app cpu-cycles, mostly within
RenderNodeDrawable::onDraw, which hits this path every frame during view
drawing traversal. At the assembly level, fabsf is a compiler intrinsic
that is replaced by a single instruction, which is why this code is
faster.
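
For reference, a minimal standalone sketch of the two forms and a check
that they agree, including on NaN and infinities. kEpsilon is an assumed
stand-in for NON_ZERO_EPSILON (the real value lives in MathUtils.h), and
formsAgree is a hypothetical helper, not part of hwui:

```cpp
#include <math.h>

#include <limits>

// Assumed stand-in for NON_ZERO_EPSILON; the real value is in MathUtils.h.
constexpr float kEpsilon = 0.001f;

// Old form: two comparisons against the epsilon band.
inline bool isZeroOld(float value) {
    return (value >= -kEpsilon) && (value <= kEpsilon);
}

// New form: one absolute value followed by one comparison.
inline bool isZeroNew(float value) {
    return fabsf(value) <= kEpsilon;
}

// Hypothetical helper: returns true if both forms give the same answer
// for every sample, including the band edges, infinities and NaN.
inline bool formsAgree() {
    const float inf = std::numeric_limits<float>::infinity();
    const float nan = std::numeric_limits<float>::quiet_NaN();
    const float samples[] = {0.0f, -0.0f, kEpsilon, -kEpsilon,
                             0.5f * kEpsilon, 2.0f * kEpsilon,
                             -2.0f * kEpsilon, 1.0f, -1.0f,
                             inf, -inf, nan};
    for (float v : samples) {
        if (isZeroOld(v) != isZeroNew(v)) return false;
    }
    return true;
}
```

Both forms return false for NaN, since every ordered comparison with NaN
is false, so the change should not alter behavior on the inputs above.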
Did further benchmarking with a binary that calls this function in a for
loop. The cpu-cycle results were:
Overhead  Shared Object              Symbol
16.87%    /data/local/tmp/edgartest  isZeroOld(float)
 9.66%    /data/local/tmp/edgartest  isZeroNew(float)
Here isZeroNew is the proposed function; it takes ~40% fewer cpu-cycles
than the old method (9.66% vs. 16.87% overhead).
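
The source of the test binary is not included in this change. A rough
sketch of what such a loop could look like follows. kEpsilon,
countZeros and runBenchmark are hypothetical names, and the noinline
attribute (GCC/Clang specific) is only there so that each variant shows
up as its own symbol in a sampling profiler:

```cpp
#include <math.h>

#include <cassert>
#include <cstddef>

// Assumed stand-in for NON_ZERO_EPSILON; the real value is in MathUtils.h.
constexpr float kEpsilon = 0.001f;

// noinline keeps each variant as a separate symbol in the profile.
__attribute__((noinline)) bool isZeroOld(float value) {
    return (value >= -kEpsilon) && (value <= kEpsilon);
}

__attribute__((noinline)) bool isZeroNew(float value) {
    return fabsf(value) <= kEpsilon;
}

// Calls fn once per iteration and returns how many inputs it reported
// as zero. Inputs alternate between 0.0f (inside the epsilon band) and
// 1.0f (outside it).
template <typename Fn>
std::size_t countZeros(Fn fn, std::size_t iterations) {
    std::size_t zeros = 0;
    for (std::size_t i = 0; i < iterations; ++i) {
        // volatile stops the compiler from folding the call away.
        volatile float input = (i % 2 == 0) ? 0.0f : 1.0f;
        if (fn(input)) ++zeros;
    }
    return zeros;
}

// Drives both variants with the same inputs and checks that they agree.
inline void runBenchmark(std::size_t iterations) {
    const std::size_t oldCount = countZeros(isZeroOld, iterations);
    const std::size_t newCount = countZeros(isZeroNew, iterations);
    assert(oldCount == newCount);
}
```

Calling runBenchmark with a large iteration count under a profiler would
then attribute cycles to isZeroOld(float) and isZeroNew(float)
separately.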
Test: Ran hwuimacro benchmarks and also did some benchmarking with my
own binary
Change-Id: I68b7db1bf501a3faa669ad5b7d3807ad9cb8798e
diff --git a/libs/hwui/utils/MathUtils.h b/libs/hwui/utils/MathUtils.h
index cc8d83f..62bf39c 100644
--- a/libs/hwui/utils/MathUtils.h
+++ b/libs/hwui/utils/MathUtils.h
@@ -31,7 +31,9 @@
      * Check for floats that are close enough to zero.
      */
     inline static bool isZero(float value) {
-        return (value >= -NON_ZERO_EPSILON) && (value <= NON_ZERO_EPSILON);
+        // Using fabsf is more performant as ARM computes
+        // fabsf in a single instruction.
+        return fabsf(value) <= NON_ZERO_EPSILON;
     }
 
     inline static bool isOne(float value) {