The Benchmark App is a tool designed to help developers measure the performance of PyTorch models on Apple devices using the ExecuTorch runtime. It provides a flexible framework for dynamically generating and running performance tests on your models, allowing you to assess metrics such as load times, inference speeds, memory usage, and more.
Prerequisites:
- Xcode with the command-line tools installed (xcode-select --install).
- CMake: download the .dmg installer and move the CMake app to the /Applications folder.
- The CMake command-line tools, installed with: sudo /Applications/CMake.app/Contents/bin/cmake-gui --install
- The increased-memory-limit entitlement if targeting iOS devices.

To get started, clone the ExecuTorch repository and cd into the source code directory:
git clone https://github.com/pytorch/executorch.git --depth 1 --recurse-submodules --shallow-submodules
cd executorch
This command performs a shallow clone to speed up the process.
The Benchmark App relies on prebuilt ExecuTorch frameworks. You have two options:
Run the provided script to download the prebuilt frameworks:
./extension/benchmark/apple/Benchmark/Frameworks/download_frameworks.sh
Alternatively, you can build the frameworks yourself by following the guide.
Once the frameworks are downloaded or built, verify that the Frameworks directory contains the necessary .xcframework files:
ls extension/benchmark/apple/Benchmark/Frameworks
You should see:
backend_coreml.xcframework backend_mps.xcframework backend_xnnpack.xcframework executorch.xcframework kernels_custom.xcframework kernels_optimized.xcframework kernels_portable.xcframework kernels_quantized.xcframework
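As a quick sanity check, the listing above can be compared against the expected names with a small shell function. This is only a minimal sketch: the function name missing_frameworks is hypothetical, and the list of names is copied from the expected listing above, so it may need updating if the set of frameworks changes.

```shell
# Print the name of every expected .xcframework that is missing from a directory.
# The list mirrors the expected `ls` output above; adjust it if your build differs.
missing_frameworks() {
  dir="$1"
  for name in backend_coreml backend_mps backend_xnnpack executorch \
              kernels_custom kernels_optimized kernels_portable kernels_quantized; do
    # An .xcframework is a directory bundle, so test with -d.
    [ -d "$dir/$name.xcframework" ] || echo "$name.xcframework"
  done
}

# Usage (from the repository root): no output means nothing is missing.
# missing_frameworks extension/benchmark/apple/Benchmark/Frameworks
```

If the function prints any names, rerun the download script or rebuild the frameworks before opening the project.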
Place your exported model files (.pte) and any other resources (e.g., tokenizer.bin) into the extension/benchmark/apple/Benchmark/Resources directory:
cp <path/to/my_model.pte> <path/to/llama3.pte> <path/to/tokenizer.bin> extension/benchmark/apple/Benchmark/Resources
Optionally, check that the files are there:
ls extension/benchmark/apple/Benchmark/Resources
For this example you should see:
llama3.pte my_model.pte tokenizer.bin
The app automatically bundles these resources and makes them available to the test suite.
Open the Benchmark Xcode project:
open extension/benchmark/apple/Benchmark/Benchmark.xcodeproj
Select the destination device or simulator and press Command+U, or click Product > Test in the menu to run the test suite.
If you plan to run the app on a physical device, you may need to set up code signing:
- Press Command+1 to open the Project navigator and click on the Benchmark root of the file tree.
- Set up signing for both the App and Tests targets.

After running the tests, you can view the results in Xcode:
- Press Command+9 to open the Report navigator.

Note: The tests use XCTMeasureOptions to run each test multiple times (usually five) to obtain average performance metrics.
The Benchmark App uses a dynamic test generation framework to create tests based on the resources you provide.
The key components are:
- DynamicTestCase: A subclass of XCTestCase that allows for the dynamic creation of test methods.
- ResourceTestCase: Builds upon DynamicTestCase to generate tests based on resources that match specified criteria.

Define Directories and Predicates: Override the directories and predicates methods to specify where to look for resources and how to match them.
Generate Resource Combinations: The framework searches the specified directories for files matching the predicates, generating all possible combinations.
Create Dynamic Tests: For each combination of resources, it calls dynamicTestsForResources, where you define the tests to run.
Test Naming: Test names are dynamically formed using the format:
test_<TestName>_<Resource1>_<Resource2>_..._<OS>_<Version>_<DeviceModel>
This ensures that each test is uniquely identifiable based on the resources and device.
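To make the combination and naming steps concrete, here is a rough shell sketch of the idea. It is not the framework's actual implementation. It assumes two resource keys, a model (any .pte file) and a tokenizer (any .bin file), pairs every model with every tokenizer, and joins the parts with underscores. The function name sketch_test_names, the stripping of file extensions, and the sample OS, version, and device values are all assumptions made for illustration; the real framework may normalize names differently.

```shell
# Emit one test name per (model, tokenizer) pair found in a directory.
# Arguments: test name, resource dir, OS, version, device model (all illustrative).
sketch_test_names() {
  test_name="$1"; dir="$2"; os="$3"; version="$4"; device="$5"
  for model in "$dir"/*.pte; do
    [ -e "$model" ] || continue          # the glob matched nothing
    for tokenizer in "$dir"/*.bin; do
      [ -e "$tokenizer" ] || continue
      # Assumption: resources appear in the name without their file extension.
      m="$(basename "$model" .pte)"
      t="$(basename "$tokenizer" .bin)"
      echo "test_${test_name}_${m}_${t}_${os}_${version}_${device}"
    done
  done
}

# Usage: sketch_test_names generate extension/benchmark/apple/Benchmark/Resources iOS 17 iPhone15
```

With two models and one tokenizer in the directory, the sketch prints two names, one per pairing, which is the sense in which every combination of matching resources gets its own test.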
Here's how you might create a test to measure model load and inference times:
@interface GenericTests : ResourceTestCase
@end

@implementation GenericTests

+ (NSArray<NSString *> *)directories {
  return @[@"Resources"];
}

+ (NSDictionary<NSString *, BOOL (^)(NSString *)> *)predicates {
  return @{
    @"model" : ^BOOL(NSString *filename) {
      return [filename hasSuffix:@".pte"];
    },
  };
}

+ (NSDictionary<NSString *, void (^)(XCTestCase *)> *)dynamicTestsForResources:(NSDictionary<NSString *, NSString *> *)resources {
  NSString *modelPath = resources[@"model"];
  return @{
    @"load" : ^(XCTestCase *testCase) {
      [testCase measureWithMetrics:@[[XCTClockMetric new], [XCTMemoryMetric new]] block:^{
        XCTAssertEqual(Module(modelPath.UTF8String).load_forward(), Error::Ok);
      }];
    },
    @"forward" : ^(XCTestCase *testCase) {
      // Set up and measure the forward pass...
    },
  };
}

@end
In this example:
- The test class looks for .pte files in the Resources directory.
- For each model found, it defines two tests: load and forward.

You can create custom tests by subclassing ResourceTestCase and overriding the necessary methods.
Subclass ResourceTestCase:
@interface MyCustomTests : ResourceTestCase
@end
Override directories and predicates:
Specify where to look for resources and how to match them.
+ (NSArray<NSString *> *)directories {
  return @[@"Resources"];
}

+ (NSDictionary<NSString *, BOOL (^)(NSString *)> *)predicates {
  return @{
    @"model" : ^BOOL(NSString *filename) {
      return [filename hasSuffix:@".pte"];
    },
    @"config" : ^BOOL(NSString *filename) {
      return [filename isEqualToString:@"config.json"];
    },
  };
}
Implement dynamicTestsForResources:
Define the tests to run for each combination of resources.
+ (NSDictionary<NSString *, void (^)(XCTestCase *)> *)dynamicTestsForResources:(NSDictionary<NSString *, NSString *> *)resources {
  NSString *modelPath = resources[@"model"];
  NSString *configPath = resources[@"config"];
  return @{
    @"customTest" : ^(XCTestCase *testCase) {
      // Implement your test logic here.
    },
  };
}
Add the Test Class to the Test Target:
Ensure your new test class is included in the test target in Xcode.
An example of a more advanced test is measuring the tokens per second during text generation with the LLaMA model.
@interface LLaMATests : ResourceTestCase
@end

@implementation LLaMATests

+ (NSArray<NSString *> *)directories {
  return @[@"Resources"];
}

+ (NSDictionary<NSString *, BOOL (^)(NSString *)> *)predicates {
  return @{
    @"model" : ^BOOL(NSString *filename) {
      return [filename hasSuffix:@".pte"] && [filename containsString:@"llama"];
    },
    @"tokenizer" : ^BOOL(NSString *filename) {
      return [filename isEqualToString:@"tokenizer.bin"];
    },
  };
}

+ (NSDictionary<NSString *, void (^)(XCTestCase *)> *)dynamicTestsForResources:(NSDictionary<NSString *, NSString *> *)resources {
  NSString *modelPath = resources[@"model"];
  NSString *tokenizerPath = resources[@"tokenizer"];
  return @{
    @"generate" : ^(XCTestCase *testCase) {
      // Implement the token generation test...
    },
  };
}

@end
In this test:
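- The predicates match .pte files whose names contain "llama" and a tokenizer file named tokenizer.bin.

The Benchmark App leverages Apple's performance testing APIs to measure metrics such as execution time and memory usage.
- Metrics conform to the XCTMetric protocol.
- XCTClockMetric: Measures wall-clock time.
- XCTMemoryMetric: Measures memory usage.
- Custom metrics can also be defined, such as TokensPerSecondMetric.

You can also run the tests using xcodebuild: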
# Run on an iOS Simulator
xcodebuild test -project extension/benchmark/apple/Benchmark/Benchmark.xcodeproj \
  -scheme Benchmark \
  -destination 'platform=iOS Simulator,name=<SimulatorName>' \
  -testPlan Tests

# Run on a physical iOS device
xcodebuild test -project extension/benchmark/apple/Benchmark/Benchmark.xcodeproj \
  -scheme Benchmark \
  -destination 'platform=iOS,name=<DeviceName>' \
  -testPlan Tests \
  -allowProvisioningUpdates DEVELOPMENT_TEAM=<YourTeamID>
Replace <SimulatorName>, <DeviceName>, and <YourTeamID> with your simulator/device name and Apple development team ID.
The app can also be built and run on macOS: add macOS as the destination platform.
To run it locally, set up app signing as well.
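The simulator, device, and macOS runs differ mainly in the -destination specifier. The sketch below builds that specifier and prints the resulting command instead of running it, so it can be reviewed first. The helper names benchmark_destination and print_benchmark_command are hypothetical; the project path, scheme, and test plan are taken from the commands above, and platform=macOS is assumed to be the appropriate specifier for a macOS run.

```shell
# Map a target kind (simulator, device, macos) and an optional name
# to an xcodebuild -destination specifier.
benchmark_destination() {
  case "$1" in
    simulator) echo "platform=iOS Simulator,name=${2:-}" ;;
    device)    echo "platform=iOS,name=${2:-}" ;;
    macos)     echo "platform=macOS" ;;
    *)         echo "unknown target kind: $1" >&2; return 1 ;;
  esac
}

# Print (not run) the test command for the chosen destination.
print_benchmark_command() {
  dest="$(benchmark_destination "$1" "${2:-}")" || return 1
  echo "xcodebuild test" \
    "-project extension/benchmark/apple/Benchmark/Benchmark.xcodeproj" \
    "-scheme Benchmark" \
    "-destination '$dest'" \
    "-testPlan Tests"
}

# Usage: print_benchmark_command simulator "<SimulatorName>"
```

The printed command leaves out the signing arguments; for a physical device, append -allowProvisioningUpdates DEVELOPMENT_TEAM=<YourTeamID> as in the device example above.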
The ExecuTorch Benchmark App provides a flexible and powerful framework for testing and measuring the performance of PyTorch models on Apple devices. By leveraging dynamic test generation, you can easily add your models and resources to assess their performance metrics. Whether you're optimizing existing models or developing new ones, this tool can help you gain valuable insights into their runtime behavior.