Allow BATCH_MATMUL with different input and output scales.

The CPU reference implementation already supports this, so the only
changes needed are to relax the validation and update the test.
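
For illustration only, here is a minimal sketch of what "different input and
output scales" means numerically. It is not the NNAPI CPU reference code. It
assumes signed 8-bit asymmetric quantization (real = scale * (q - zero_point)),
and the function name and parameters are hypothetical.

```python
def quantized_batch_matmul(lhs, rhs, lhs_scale, lhs_zp,
                           rhs_scale, rhs_zp, out_scale, out_zp):
    """Batched matmul over nested lists of quantized ints.

    Hypothetical helper: each operand and the output carry their own
    scale and zero point, none of which are required to match.
    """
    # Since real = scale * (q - zero_point), the integer accumulator is
    # mapped into the output's quantized domain by this one factor.
    multiplier = lhs_scale * rhs_scale / out_scale
    out = []
    for a, b in zip(lhs, rhs):  # iterate over the batch dimension
        rows, inner, cols = len(a), len(b), len(b[0])
        batch = []
        for i in range(rows):
            row = []
            for j in range(cols):
                acc = sum((a[i][k] - lhs_zp) * (b[k][j] - rhs_zp)
                          for k in range(inner))
                # Python's round() rounds halves to even; a fixed-point
                # kernel may round differently on exact ties.
                q = round(acc * multiplier) + out_zp
                row.append(max(-128, min(127, q)))  # saturate to int8
            batch.append(row)
        out.append(batch)
    return out
```

The rescaling factor is well defined for any positive scales, so nothing in
the arithmetic itself requires the input and output scales to be equal.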

Also update the sample shim driver prebuilts.

Cherry-picked from AOSP change I4513c2c73d6d920378e32ee8491bb642796a386d

Fixes: 204802210
Test: NNT_static
Test: VtsHalNeuralnetworksTargetTest
Change-Id: I4513c2c73d6d920378e32ee8491bb642796a386d
Merged-In: I4513c2c73d6d920378e32ee8491bb642796a386d
7 files changed