osm-edge aims to provide service mesh functionality with high performance and low resource consumption, so that resource-constrained edge environments can also use the service mesh capabilities available in the cloud.

In this test, benchmarks were conducted for osm-edge (v1.1.0) and osm (v1.1.0). The main focus is on service TPS and latency distribution under the two meshes, as well as the resource overhead of the data plane.

osm-edge uses Pipy as the data plane; osm uses Envoy as the data plane.

Testing Environment

The benchmark was run in a Kubernetes cluster on Tencent Cloud CVM, with 2 standard S5 nodes. osm-edge and osm both use permissive traffic policy mode with mTLS enabled; all other settings are defaults.

  • Kubernetes: k3s v1.20.14+k3s2
  • OS: Ubuntu 20.04
  • Nodes: 16c32g * 2
  • Load Generator: 8c16g
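The mesh settings above can be reproduced with the osm CLI. The sketch below is an assumption based on the OSM CLI and its chart values (mTLS is on by default); verify the flag names against your version:

```shell
# Hedged sketch: install osm-edge with permissive traffic policy mode.
# The --set key is assumed from the OSM chart values and may differ per version.
osm install \
    --mesh-name osm-edge \
    --set=osm.enablePermissiveTrafficPolicy=true
```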

The test application uses a common Spring Cloud microservices architecture, taken from flomesh-bookinfo-demo, a Spring Cloud implementation of the Bookinfo application. Not all of its services are used in the tests; only the API Gateway and the Bookinfo Ratings service are selected.

External access to the service is provided through Ingress; the load generator is the common Apache JMeter 5.5.
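As a sketch, exposing the Ratings service through Ingress might look like the following; the namespace, service name, path, and port are hypothetical placeholders, not taken from the demo:

```yaml
# Hypothetical Ingress for direct access to the Ratings service.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ratings
  namespace: bookinfo        # placeholder namespace
spec:
  rules:
  - http:
      paths:
      - path: /ratings
        pathType: Prefix
        backend:
          service:
            name: ratings    # placeholder service name
            port:
              number: 8080   # placeholder port
```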

For the tests, 2 links were chosen: one accesses the Bookinfo Ratings service directly via Ingress (hereafter ratings), and the other passes through the API Gateway in the middle (hereafter gateway-ratings). These two links were chosen to cover both single-sidecar and multi-sidecar scenarios.

Procedure

We start with a non-mesh baseline (i.e., no sidecar injection, hereafter referred to as non-mesh), and then test with the osm-edge and osm meshes, respectively.

To simulate the resource-constrained edge scenario, the CPU available to the sidecar is limited when using a mesh, and the 1-core and 2-core scenarios are tested respectively.
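One way to apply such a limit is through the mesh's MeshConfig; this is a sketch under the assumption that the OSM-style `spec.sidecar.resources` field is available, with the default resource name and namespace, so verify the field path for your version:

```shell
# Hedged example: cap the sidecar at 1 CPU core via the MeshConfig.
# Resource name and namespace are the OSM defaults and may differ in osm-edge.
kubectl patch meshconfig osm-mesh-config -n osm-system --type=merge -p \
  '{"spec":{"sidecar":{"resources":{"limits":{"cpu":"1"}}}}}'
```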

Thus, there are 5 rounds of testing, with 2 different links in each round.

Performance

JMeter uses 200 threads during the test and runs each test for 5 minutes. Each round of testing is preceded by a 2-minute warm-up.
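A non-GUI JMeter run with these parameters might look like the sketch below. The test-plan file name and property names are hypothetical; `-J` properties only take effect if the plan references them via `${__P(...)}`, otherwise thread count and duration are set inside the .jmx plan itself:

```shell
# Hedged sketch: run the 5-minute, 200-thread test in non-GUI mode.
# gateway-ratings.jmx is a placeholder test plan.
jmeter -n -t gateway-ratings.jmx -l results.jtl \
    -Jthreads=200 -Jduration=300
```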

For space reasons, only the results of the gateway-ratings link are shown below with different sidecar resource constraints.

Note: The sidecar in the table refers to the API Gateway sidecar.

| mesh | sidecar CPU limit (cores) | TPS | 90th | 95th | 99th | sidecar CPU usage | sidecar memory usage (MiB) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| non-mesh | NA | 3283 | 87 | 89 | 97 | NA | NA |
| osm-edge | 2 | 3395 | 77 | 79 | 84 | 130% | 52 |
| osm | 2 | 2189 | 102 | 104 | 109 | 200% | 108 |
| osm-edge | 1 | 2839 | 76 | 77 | 79 | 100% | 34 |
| osm | 1 | 1097 | 201 | 203 | 285 | 100% | 105 |

Sidecar 2 cores

In the sidecar 2-core scenario, using the osm-edge mesh gives a small TPS improvement over not using a mesh, and also improves latency. The sidecars of both the API Gateway and Bookinfo Ratings still do not exhaust their 2 cores (only about 65% utilization), while the Bookinfo Ratings service itself is at its performance limit.

The TPS of the osm mesh dropped by about a third, with the API Gateway sidecar CPU running at its limit, making it the bottleneck.

In terms of memory, the osm-edge and osm sidecars use about 50 MiB and 105 MiB, respectively.

TPS

Latency distribution

API gateway sidecar CPU usage

API gateway sidecar memory footprint

Bookinfo Ratings sidecar CPU usage

Bookinfo Ratings sidecar memory usage

Sidecar 1 core

The difference is particularly noticeable in the tests that limit the sidecar to 1 CPU core. At this point, the API Gateway sidecar becomes the performance bottleneck, with both the osm-edge and osm sidecars saturating their CPU limit.

In terms of TPS, osm-edge drops about 13% from the non-mesh baseline, while osm's TPS drops a staggering 67%.
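Recomputing from the TPS values in the table above, the drops can be checked with a small helper (the function name is just an illustration):

```python
# Reproduce the TPS-drop percentages from the benchmark table.
def tps_drop(baseline, measured):
    """Percentage drop of `measured` relative to `baseline`."""
    return (baseline - measured) / baseline * 100

baseline = 3283          # non-mesh TPS
osm_edge_1core = 2839    # osm-edge, sidecar limited to 1 core
osm_1core = 1097         # osm, sidecar limited to 1 core

print(f"osm-edge drop: {tps_drop(baseline, osm_edge_1core):.1f}%")  # 13.5%
print(f"osm drop: {tps_drop(baseline, osm_1core):.1f}%")            # 66.6%
```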

TPS

Latency distribution

API gateway sidecar CPU usage

API gateway sidecar memory footprint

Bookinfo Ratings sidecar CPU usage

Bookinfo Ratings sidecar memory usage

Summary

In this round, we benchmarked the osm-edge and osm data planes with limited sidecar resources. The results show that osm-edge maintains high performance with low resource usage and makes more efficient use of resources. In resource-constrained edge scenarios, service mesh features that were previously only practical in the cloud can thus be enjoyed at a lower resource overhead. This is made possible by Pipy's low-resource, high-performance design.

Of course, while osm-edge is designed for edge computing scenarios, it can be applied to the cloud as well; in particular, its lower resource overhead helps control costs in cloud environments running services at large scale.