Update libjpeg_turbo to use clz for bitcounting on ARM

Cherry-picked r1220 from upstream:
Use clz/bsr instructions on ARM for bit counting rather than the lookup table (reduces memory footprint and can improve performance in some cases.)

Upstream review:
http://sourceforge.net/p/libjpeg-turbo/patches/57/

Original review:
https://codereview.appspot.com/77480045/

Removing the lookup table saves 64 KB of data in each process that uses JPEG encoding. Benchmarks on a few ARM devices show mixed encoding performance, ranging from a 3-4% slowdown on some devices to a 10-20% speedup on others. On average, performance improves.

x86 still uses the lookup table because the bsr instruction proved to be slower on some chips.
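
For illustration only, a minimal sketch of the idea, not the upstream patch itself. The function names are hypothetical, and the code assumes a 32-bit unsigned int and a GCC-compatible compiler for the __builtin_clz path. A table with one byte per 16-bit input caches the result of the loop below, which is where the 64 KB comes from; the clz version computes the same value directly.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference: number of bits needed to represent x
   (0 for x == 0). A 65536-entry byte table would cache these results. */
static int nbits_loop(uint32_t x)
{
  int n = 0;
  while (x) {
    n++;
    x >>= 1;
  }
  return n;
}

/* Hypothetical clz-based variant. Assumes unsigned int is 32 bits wide. */
static int nbits_clz(uint32_t x)
{
#if defined(__GNUC__) || defined(__clang__)
  /* __builtin_clz is undefined for 0, so handle that case explicitly. */
  return x ? 32 - __builtin_clz(x) : 0;
#else
  /* No known clz intrinsic: fall back to the loop. */
  return nbits_loop(x);
#endif
}

/* Checks that both variants agree over the full 16-bit input range. */
static void check_nbits_agree(void)
{
  uint32_t x;
  for (x = 0; x < 65536; x++)
    assert(nbits_clz(x) == nbits_loop(x));
}
```

Calling check_nbits_agree() exercises every input a 64 KB table would cover, so it is a quick way to confirm that the two approaches are interchangeable before comparing their speed on a given chip.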

BUG=
[email protected]

Review URL: https://codereview.appspot.com/97690043

git-svn-id: http://src.chromium.org/svn/trunk/deps/third_party/libjpeg_turbo@272637 4ff67af0-8c30-449e-8e8b-ad334ec8d88c
3 files changed