As containerized deployments have grown in popularity, more applications consist of hundreds of containers that are difficult to manage. As a result, the service mesh has become a popular tool for managing communication between services.
In this article you’ll learn how service mesh removes operational toil from your microservices, as well as when it’s worth using one—and when it’s not. We’ll also give an overview of the main service mesh implementations available today.
A service mesh is a tool at the infrastructure layer that transparently adds observability, reliability, load balancing, service discovery, authentication, and authorization support to applications.
The increasing sprawl of microservices makes it challenging to enforce things like service discovery, mutual TLS, circuit breaking, and observability. A service mesh implements all these cross-cutting concerns transparently for applications via a set of network proxies, known as sidecars, deployed alongside each application.
A service mesh helps drive business value by removing operational toil from microservices deployment. It allows DevOps teams to implement cross-cutting concerns for all the services an application employs. These features can be configured from a single, consistent interface within the service mesh called the control plane.
A service mesh offers many features that solve common problems among applications. Here’s a list of some of the most prominent features of service mesh:
Different implementations of service mesh offer different features, but the features discussed above are the most common mesh capabilities.
In the absence of a service mesh, each team has to implement these features at the application layer, duplicating effort across services. That's why adopting a service mesh translates into cost savings. We'll look at examples of this in a moment.
Let’s take a bird’s eye view of service mesh architecture. A service mesh is primarily made of two components: a Control Plane and a Data Plane (see diagram below).
As noted above, a service mesh implements all of its functionality using sidecar proxies. These proxies constitute the service mesh’s Data Plane and are responsible for collecting data for tracing and observability. They are also responsible for intercepting requests for other features, like retries, encryption, circuit breaker, and more.
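To make the data plane's role concrete, here is a minimal sketch (in Node.js, matching the libraries mentioned later in this article) of the retry-with-backoff logic a sidecar proxy applies transparently. The function and option names are illustrative, not any particular mesh's API; the point is that without a mesh, every service would embed something like this itself.

```javascript
// Sketch of the retry logic a data-plane proxy performs on behalf of a
// service. Names and defaults here are illustrative assumptions.
async function withRetries(fn, { attempts = 3, baseDelayMs = 100 } = {}) {
  let lastErr;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn(); // success: return immediately
    } catch (err) {
      lastErr = err;
      // exponential backoff before the next attempt
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** i));
    }
  }
  throw lastErr; // all attempts exhausted
}
```

With a sidecar in place, this logic moves out of every service and into the proxy, where it is configured centrally rather than re-implemented per codebase.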
The Control Plane, on the other hand, is a single, consistent interface used to configure the proxies. You can enable or disable any feature discussed so far using a provided mechanism such as a command-line interface (CLI), an SDK, or an API.
Having a single place to configure the entire service mesh is a powerful construct that brings down the operational overhead significantly.
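For example, in Istio (one of the meshes compared below), a single control-plane resource can enforce strict mutual TLS across the entire mesh without any application changes. This is a sketch of Istio's `PeerAuthentication` resource; the exact API version may differ depending on your Istio release:

```yaml
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system   # mesh-wide when applied to the root namespace
spec:
  mtls:
    mode: STRICT            # sidecars accept only mutual-TLS traffic
```

One small manifest, applied in one place, changes the security posture of every sidecar in the mesh.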

There are many service mesh products on the market. Below is a short list of some of the most popular products available today.
In the chart below, we’ll compare four of these products and their features to see how they stack up. The four products we’ve chosen—Istio, Linkerd, Consul, and Open Service Mesh—have the most features available and are among the most popular choices in the industry.

All of the meshes noted here provide a comparable feature set, with slight differences in where and how they can be deployed. The best choice for your company will depend on the ecosystem you have and the level of support you need. If you need a specific feature that only a few meshes offer, that will naturally narrow down the list.
As we saw in the previous sections, the market is flooded with service mesh options. That's why the Cloud Native Computing Foundation (CNCF) introduced the Service Mesh Interface (SMI), a standard specification that covers the most common service mesh capabilities.
The SMI applies only to meshes running on Kubernetes, and it's not a comprehensive specification of every service mesh feature. However, it does specify the following capabilities:
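As a sketch of what the SMI looks like in practice, here is a hypothetical `TrafficSplit` resource (the service names are invented for illustration, and the API version shown is one of the SMI releases) that shifts 10% of traffic to a new version of a service:

```yaml
apiVersion: split.smi-spec.io/v1alpha2
kind: TrafficSplit
metadata:
  name: checkout-rollout
spec:
  service: checkout        # the root service clients call
  backends:
  - service: checkout-v1   # stable version keeps 90% of traffic
    weight: 90
  - service: checkout-v2   # canary receives 10%
    weight: 10
```

Because this resource is part of a standard specification, the same manifest works across any SMI-conformant mesh, which is exactly the vendor neutrality the SMI aims for.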
The goal of SMI was to build a specification against which application developers can build their applications, without locking into any specific implementation.
So far, we’ve discussed the many benefits of using a service mesh. However, a service mesh does come with downsides and complexities. These are some of the most common issues:
With these limitations in mind, you should wait to implement a service mesh until you are operating microservices at considerable scale or have requirements that only a service mesh can meet.
You can avoid implementing a service mesh when you are just starting out with Kubernetes by implementing some mesh functionality in your application layer. A few open-source libraries that provide a subset of these features in Node.js applications include opossum, for circuit breaking; node-rate-limiter, for rate limiting; and the Jaeger client, for distributed tracing.
In this article, we covered the basics of what a service mesh is, including the features it provides. At a high level, there are two components that comprise a service mesh’s architecture. These include a Data Plane made up of sidecar proxies that are responsible for intercepting requests and providing the mesh features, and a Control Plane that allows the configuration of the sidecars from a central point.
We also looked at how service meshes can help eliminate toil from your microservices by elevating the overall resilience, observability, and security of any architecture. A service mesh can solve a wide array of problems that teams face when they start to scale out their microservices architecture, and they can ease the operational burden from infrastructure teams.
That said, teams that are just starting out with Kubernetes or microservices architecture, or that have only a few services, should avoid the complexity of a service mesh. Ultimately, teams should do their due diligence to make sure a service mesh is right for them before investing time and money in implementing one.