osm-edge aims to provide service mesh functionality with high performance and low resource usage, so that resource-constrained edge environments can enjoy the same service mesh capabilities used in the cloud.
In this test, we benchmarked osm-edge (v1.1.0) against osm (v1.1.0), focusing on service TPS and latency distribution under the two meshes, and monitoring the resource overhead of the data plane.
osm-edge uses Pipy as the data plane; osm uses Envoy as the data plane.
Testing Environment
The benchmark ran in a Kubernetes cluster on Tencent Cloud CVM, with 2 standard S5 nodes in the cluster. Both osm-edge and osm were configured with permissive traffic policy mode and mTLS enabled; all other settings were defaults.
- Kubernetes: k3s v1.20.14+k3s2
- OS: Ubuntu 20.04
- Nodes: 16c32g * 2
- Load Generator: 8c16g
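As a configuration sketch, installing a mesh with the settings described above could look like the following; the `bookinfo` namespace name is an assumption for illustration, and mTLS is enabled by default in both meshes:

```shell
# Install the mesh with permissive traffic policy enabled;
# mTLS is on by default, all other settings left at defaults.
osm install --set=osm.enablePermissiveTrafficPolicy=true

# Enroll the application namespace so sidecars get injected
# ("bookinfo" is an assumed namespace name).
osm namespace add bookinfo
```

The same `osm` CLI shape applies to both meshes; osm-edge mirrors the osm command set.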
The test application uses a common Spring Cloud microservices architecture, taken from flomesh-bookinfo-demo, a Spring Cloud implementation of the Bookinfo application. Not all of its services are used in the tests; only the API gateway and the Bookinfo Ratings service are selected.
External access to the services is provided through Ingress; the load generator is the widely used Apache JMeter 5.5.
For the tests, 2 links were chosen: one accesses the Bookinfo Ratings service directly via Ingress (hereafter ratings), and the other passes through the API Gateway in the middle (hereafter gateway-ratings). These two links were chosen to cover both single-sidecar and multiple-sidecar scenarios.
Procedure
We start with a non-mesh baseline (no sidecar injection, hereafter non-mesh), and then test with the osm-edge and osm meshes, respectively.
To simulate the resource-constrained edge scenario, the sidecar's CPU is limited when a mesh is used, and 1-core and 2-core scenarios are tested respectively.
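For reference, a 1-core limit on a sidecar container corresponds to a standard Kubernetes resource spec like the sketch below; how such limits are applied to injected sidecars is mesh-specific (typically via the mesh's configuration rather than the pod spec), and the request value here is an assumption:

```yaml
# Illustrative container resource constraint for the 1-core scenario;
# the request value is an assumption, only the 1-core limit is from the test.
resources:
  limits:
    cpu: "1"
  requests:
    cpu: "500m"
```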
Thus, there are 5 rounds of testing, with 2 different links in each round.
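The resulting test matrix can be enumerated as a quick sketch (the round labels are shorthand for the configurations described above):

```python
from itertools import product

# 5 rounds: non-mesh baseline, plus each mesh at 2 cores and at 1 core.
rounds = ["non-mesh", "osm-edge-2c", "osm-2c", "osm-edge-1c", "osm-1c"]
# 2 links per round.
links = ["ratings", "gateway-ratings"]

matrix = list(product(rounds, links))
print(len(matrix))  # 10 test runs in total
```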
Performance
JMeter uses 200 threads during the test and runs for 5 minutes. Each round of testing is preceded by a 2-minute warm-up.
For space reasons, only the results of the gateway-ratings link are shown below with different sidecar resource constraints.
Note: The sidecar in the table refers to the API Gateway sidecar.
mesh | sidecar CPU limit (cores) | TPS | 90th (ms) | 95th (ms) | 99th (ms) | sidecar CPU usage | sidecar memory usage (MiB) |
---|---|---|---|---|---|---|---|
non-mesh | NA | 3283 | 87 | 89 | 97 | NA | NA |
osm-edge | 2 | 3395 | 77 | 79 | 84 | 130% | 52 |
osm | 2 | 2189 | 102 | 104 | 109 | 200% | 108 |
osm-edge | 1 | 2839 | 76 | 77 | 79 | 100% | 34 |
osm | 1 | 1097 | 201 | 203 | 285 | 100% | 105 |
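To make the relative differences concrete, here is a small sketch computing each configuration's TPS change against the non-mesh baseline, using only the TPS figures from the table above:

```python
# Percent TPS change vs. the non-mesh baseline (gateway-ratings link),
# using the numbers from the table above.
BASELINE_TPS = 3283  # non-mesh

results = {
    ("osm-edge", 2): 3395,
    ("osm", 2): 2189,
    ("osm-edge", 1): 2839,
    ("osm", 1): 1097,
}

for (mesh, cores), tps in results.items():
    change = (tps - BASELINE_TPS) / BASELINE_TPS * 100
    print(f"{mesh} ({cores} core): {change:+.1f}%")
    # e.g. "osm (2 core): -33.3%"
```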
Sidecar 2 core
In the sidecar 2-core scenario, the osm-edge mesh delivers a small TPS improvement over running without a mesh, and latency improves as well. The sidecars of both the API Gateway and Bookinfo Ratings stay short of their 2-core limit (around 65% utilization), because the Bookinfo Ratings service itself reaches its performance limit first.
With the osm mesh, TPS dropped by about a third, and the API Gateway sidecar ran its CPU flat out, making it the bottleneck.
In terms of memory, the osm-edge and osm sidecars used roughly 52 MiB and 108 MiB respectively.
TPS
Latency distribution
API gateway sidecar CPU usage
API gateway sidecar memory footprint
Bookinfo Ratings sidecar CPU usage
Bookinfo Ratings sidecar memory usage
Sidecar 1 core
The difference is particularly noticeable when the sidecar is limited to 1 CPU core. At this point the API Gateway sidecar becomes the performance bottleneck, with both the osm-edge and osm sidecars maxing out their CPU.
In terms of TPS, osm-edge drops about 13% from the non-mesh baseline, while osm's TPS drops a staggering 67%.
TPS
Latency distribution
API gateway sidecar CPU usage
API gateway sidecar memory footprint
Bookinfo Ratings sidecar CPU usage
Bookinfo Ratings sidecar memory usage
Summary
In this benchmark, we compared the osm-edge and osm data planes under limited sidecar resources. The results show that osm-edge maintains high performance while using fewer resources, and uses them more efficiently. For resource-constrained edge scenarios, service mesh features previously only available in the cloud can be enjoyed at a lower resource overhead. This is made possible by Pipy's low-resource, high-performance design.
Of course, while osm-edge is designed for edge computing scenarios, it can be applied to the cloud as well; in particular, cloud environments running large-scale services benefit from the lower resource overhead for cost control.