Partial fix to allow partitions to have boundary temporaries of unknown size.

Previously, a model with such temporaries fell back to full-model CPU
execution at compilation time; now it gets ordinary partitioned
compilation and execution.

Limitations:
- Needs more testing, and more tests need to be written.
- The initial guess for the size of a boundary temporary is a single
  element.  Perhaps it would be useful to remember the actual size
  from a previous execution.
- Fenced execution punts to unfenced execution (at the NDK API level)
  when the plan contains subgraph outputs of unknown size.
- Operands of unknown size at control flow construct boundaries still
  fall back to full model CPU execution.
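
The fenced-to-unfenced fallback described above can be sketched roughly
as follows.  This is a simplified illustration, not the runtime code:
FenceState, Result, startWithFallback, waitOne, and startUnfenced are
hypothetical stand-ins, and the wait and start steps are injected as
callbacks so the sketch runs without a real sync fence.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stand-ins for the runtime's types; not the NNAPI definitions.
enum class FenceState { SIGNALED, ACTIVE, ERROR };
enum Result { RESULT_OK = 0, RESULT_OP_FAILED = 1 };

// Waits on every dependency fence, then hands off to the unfenced start
// routine.  `waitOne` and `startUnfenced` are injected (an assumption made
// for this sketch) so no real sync-fence implementation is needed.
Result startWithFallback(const std::vector<int>& waitForList,
                         const std::function<FenceState(int)>& waitOne,
                         const std::function<Result()>& startUnfenced) {
    assert(waitOne && startUnfenced);
    for (int fd : waitForList) {
        if (fd <= 0) continue;  // mirrors the patch's `syncFenceFd > 0` check
        if (waitOne(fd) != FenceState::SIGNALED) {
            return RESULT_OP_FAILED;  // a dependency failed; do not start
        }
    }
    // All dependencies have signaled, so unfenced execution is safe to start.
    return startUnfenced();
}
```

Waiting on every fence before starting keeps the ordering guarantee
that the fenced entry point promises, at the cost of blocking the
caller until all dependencies have signaled.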

Also adds some diagnostic logging.

Test: NeuralNetworksTest_static

Bug: 132458982

Merged-In: I52e7179ff9783d184fd6bfc1c9fefc55972e942a
Change-Id: I52e7179ff9783d184fd6bfc1c9fefc55972e942a
(cherry picked from commit d6183c8db7feb5e2bdf0d2907af01418e7da809e)
diff --git a/runtime/NeuralNetworks.cpp b/runtime/NeuralNetworks.cpp
index 5d3dae4..f5206c8 100644
--- a/runtime/NeuralNetworks.cpp
+++ b/runtime/NeuralNetworks.cpp
@@ -1543,6 +1543,26 @@
             waitForList.push_back(syncFenceFd);
         }
     }
+
+    if (r->getCompilation()->hasDynamicTemporaries()) {
+        // The current implementation of fenced execution does not support
+        // dynamic temporaries.  Fall back to non fenced execution.
+        LOG(INFO) << "ANeuralNetworksExecution_startComputeWithDependencies falling back"
+                  << " to ANeuralNetworksExecution_startCompute"
+                  << " because of boundary operands of unknown size";
+        for (int syncFenceFd : waitForList) {
+            if (syncFenceFd > 0) {
+                auto w = syncWait(syncFenceFd, -1);
+                if (w != FenceState::SIGNALED) {
+                    VLOG(EXECUTION) << "syncWait failed, fd: " << syncFenceFd;
+                    *event = nullptr;
+                    return ANEURALNETWORKS_OP_FAILED;
+                }
+            }
+        }
+        return ANeuralNetworksExecution_startCompute(execution, event);
+    }
+
     int syncFenceToSignal = -1;
     int n = r->computeFenced(waitForList, duration, &syncFenceToSignal);
     std::unique_ptr<SyncFenceEvent> e =