model output of unspecified shape as partition input should not force CPU fallback
When one partition executes successfully and writes a model output whose
shape is not fully specified, and that model output is also the input of
another partition, execution of the second partition fails. This results
in CPU fallback, unless an OEM or extension operation is involved, in
which case the execution failure is reported at the NDK level.
The second partition fails because the shape information computed by
executing the first partition is not propagated to the execution of the
second partition. When we try to execute the second partition, both the
model input operand and the request input argument have unspecified
shape, which causes a validation failure.
The fix: in ExecutionStep::mapInputsAndOutputs(), when we initialize the
second partition's input from the first partition's output, we also copy
the dimensions from the first partition's output to the second
partition's input.
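The failure and the fix described above can be illustrated with a minimal, self-contained sketch. It does not use the runtime's real types: Dimensions, ArgumentInfo, isFullySpecified(), validateInput(), and mapInput() are hypothetical stand-ins, and the validation rule is a simplification (assumed here: a dimension of 0 or an empty dimension list means "unspecified", and the request argument's dimensions take precedence when present).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins, not the runtime's real types.
// Assumption: a dimension of 0 means "unspecified" and an empty
// vector means that no shape was supplied at all.
using Dimensions = std::vector<uint32_t>;

struct ArgumentInfo {
    Dimensions dimensions;
};

// True only if a shape was supplied and every dimension is nonzero.
bool isFullySpecified(const Dimensions& dims) {
    if (dims.empty()) return false;
    for (uint32_t d : dims) {
        if (d == 0) return false;
    }
    return true;
}

// Simplified validation: the request argument's dimensions are used when
// present, otherwise the model operand's; the result must be fully specified.
bool validateInput(const Dimensions& operandDims, const ArgumentInfo& arg) {
    const Dimensions& effective = arg.dimensions.empty() ? operandDims : arg.dimensions;
    return isFullySpecified(effective);
}

// Mirrors the idea of the fix: copy the builder-side argument, then, if the
// producing partition computed dimensions, overwrite the copy's dimensions.
void mapInput(const ArgumentInfo& builderArg, ArgumentInfo* executorArg,
              const Dimensions* producedDims) {
    assert(executorArg != nullptr);
    *executorArg = builderArg;
    if (producedDims != nullptr) {
        executorArg->dimensions = *producedDims;
    }
}
```

In this sketch, mapping with a null producedDims leaves an operand of shape {2, 0} unvalidated, which corresponds to the failure before the fix. Passing the dimensions computed by the producing partition (for example {2, 3}) lets validation succeed.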
Test: NeuralNetworksTest_static (PartitioningTest, DynamicTemporariesTest, RandomPartitioningTest)
Bug: 168657259
Merged-In: I388f14feb2e55d1632958a07e2f2b5cecb9328d8
Change-Id: I388f14feb2e55d1632958a07e2f2b5cecb9328d8
(cherry picked from commit bf459f7b742bfbb3bc06948014d73d6fd41591ab)
diff --git a/runtime/ExecutionBuilder.cpp b/runtime/ExecutionBuilder.cpp
index 120ff99..8b6b817 100644
--- a/runtime/ExecutionBuilder.cpp
+++ b/runtime/ExecutionBuilder.cpp
@@ -544,7 +544,7 @@
// Get fallback executor.
std::shared_ptr<StepExecutor> executor;
- int n1 = plan.fallback(controller, &executor);
+ int n1 = plan.fallback(controller, &executor, nullptr, nullptr);
if (n1 != ANEURALNETWORKS_NO_ERROR) {
return {n1, {}, kNoTiming, nullptr};
}
@@ -579,8 +579,9 @@
// Get the current step of the execution.
std::shared_ptr<StepExecutor> executor;
std::shared_ptr<ExecutionBurstController> burstController;
- int n = doInsufficientSizeFallback ? plan.fallback(controller, &executor, &burstController)
- : plan.next(controller, &executor, &burstController);
+ int n = doInsufficientSizeFallback
+ ? plan.fallback(controller, &executor, &burstController, &outputShapes)
+ : plan.next(controller, &executor, &burstController, &outputShapes);
doInsufficientSizeFallback = false;
if (n != ANEURALNETWORKS_NO_ERROR) {
// During the interpreted execution of control flow, a loop timeout
@@ -779,7 +780,7 @@
// Get the current step of the execution.
std::shared_ptr<StepExecutor> executor;
- int n = plan.next(controller, &executor, nullptr, syncFence);
+ int n = plan.next(controller, &executor, nullptr, nullptr, syncFence);
if (n != ANEURALNETWORKS_NO_ERROR) {
// During the interpreted execution of control flow, a loop timeout
// might occur in ExecutionPlan::next().
@@ -1259,17 +1260,28 @@
}
void StepExecutor::mapInputOrOutput(const ModelArgumentInfo& builderInputOrOutput,
- ModelArgumentInfo* executorInputOrOutput) {
+ ModelArgumentInfo* executorInputOrOutput,
+ const hidl_vec<uint32_t>* builderDimensions) {
+ auto updateDimensions = [executorInputOrOutput, builderDimensions] {
+ if (!builderDimensions) {
+ return;
+ }
+ executorInputOrOutput->dimensions() = *builderDimensions;
+ };
+
*executorInputOrOutput = builderInputOrOutput;
switch (executorInputOrOutput->state()) {
default:
CHECK(false) << "unexpected ModelArgumentInfo::state";
break;
case ModelArgumentInfo::HAS_NO_VALUE:
- case ModelArgumentInfo::POINTER:
case ModelArgumentInfo::UNSPECIFIED:
break;
+ case ModelArgumentInfo::POINTER:
+ updateDimensions();
+ break;
case ModelArgumentInfo::MEMORY: {
+ updateDimensions();
const uint32_t builderPoolIndex = builderInputOrOutput.locationAndLength().poolIndex;
const Memory* memory = mExecutionBuilder->mMemories[builderPoolIndex];
const uint32_t executorPoolIndex = mMemories.add(memory);