Resilient Microservice Design With Spring Boot — Rate Limiter Pattern

Vinoth Selvaraj
Oct 25, 2020 · 5 min read


In this article, let's talk about the Rate Limiter pattern for designing resilient microservices. This is the fifth article in the Resilient Design Patterns series. If you have not read the previous articles, I would suggest taking a look at them first.

Need For Resiliency:

Microservices are distributed in nature, with many components and moving parts. In a distributed architecture, dealing with unexpected failures is one of the biggest challenges to solve: it could be a hardware failure, a network failure, etc. The ability of the system to recover from failure and remain functional makes the system more resilient. It also avoids cascading failures.

Why Rate Limiter?

The rate limiter pattern helps us keep our services highly available by limiting the number of calls we can make in a specific window. In other words, it helps us control the throughput. To understand this behavior, let's assume there is a service responsible for finding the prices of 100 specific stocks. It needs to talk to a 3rd party service (or perform some other time-consuming operation) to do that. It takes a certain amount of time to find the prices and return the list for those 100 stocks. Stock prices are not going to be updated every millisecond; let's assume they are updated once every 5 seconds. In that case, when we get many concurrent requests for the stock prices, should we really make that many concurrent calls to the 3rd party service? Why not make just 1 call and cache the response to keep our service highly available? That is exactly what we are going to do with the rate limiter pattern in this article.

Sample Application:

Let's consider the above scenario and design our application accordingly. Assume we have a stock-service responsible for fetching stock prices from some other 3rd party service, which is time-consuming. It could also be a CPU-intensive task or an IO-intensive call. It does not really matter!

Our aim is to limit the calls from our service to the 3rd party service, as they are very time- and resource-consuming. So we are going to limit the calls as shown here: we can receive N requests, but we will make only one outgoing call every 5 seconds. The remaining calls will be rejected. It is completely up to us how to handle the rejected calls.

Stock Service W/O Rate Limiter:

  • First I create a simple Spring Boot application with the web dependency.
  • StockController
@RestController
public class StockController {

    @Autowired
    private StockService stockService;

    @GetMapping("/stocks")
    public List<StockPrice> getStockPrices() {
        return this.stockService.getStocks();
    }

}
  • StockPrice (DTO)
public class StockPrice {

    private String stock;
    private int price;

    // getters, setters, constructor
}
  • StockService → the getStocks method is a very time-consuming call. We simulate that with a hard-coded sleep.
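A minimal sketch of such a service might look like the following. The nested StockPrice record and the stock symbols are placeholders so the sketch is self-contained, and the slow 3rd-party call is simulated with a sleep:

```java
import java.util.List;
import java.util.concurrent.TimeUnit;

// Sketch of the slow service; the 3rd-party lookup is simulated with a sleep.
public class StockService {

    // Minimal placeholder DTO so the sketch compiles on its own.
    public record StockPrice(String stock, int price) {}

    public List<StockPrice> getStocks() {
        try {
            TimeUnit.SECONDS.sleep(3); // simulate the time-consuming 3rd-party call
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return List.of(new StockPrice("GOOG", 1500), new StockPrice("AMZN", 3200));
    }
}
```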

Running this application, I run a simple performance test with 10 concurrent users who make continuous calls every 300 ms for 30 seconds. The performance test results are shown below.

  • We were able to make 90 requests in total
  • Throughput is 2.9 request / sec
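The article's numbers come from a real load-testing tool. Purely as an illustration of what such a test measures, a scaled-down load generator can be sketched in plain Java; the thread count, per-call latency, and duration below are arbitrary stand-ins:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Toy load generator: N concurrent callers hammer a slow method until a
// deadline, and we count how many requests completed (i.e. throughput).
public class MiniLoadTest {

    static void slowCall() {
        try {
            Thread.sleep(100); // stand-in for the slow service call
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static int run(int users, long durationMs) {
        AtomicInteger completed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(users);
        long deadline = System.currentTimeMillis() + durationMs;
        for (int i = 0; i < users; i++) {
            pool.execute(() -> {
                while (System.currentTimeMillis() < deadline) {
                    slowCall();
                    completed.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(durationMs * 2, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }
}
```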

Stock Service With Rate Limiter:

  • I make the following changes to implement the rate limiter pattern using the Resilience4j library.

Application.yaml changes:

  • We provide the rate-limiter-specific configuration.
  • limitRefreshPeriod is the window interval; in our case, 5 seconds.
  • limitForPeriod is the number of requests allowed in each window. In our case, it should be just 1: we want to make only one call every 5 seconds, and all other requests within the 5-second window should be rejected.
  • timeoutDuration: additional requests within the 5-second window should time out immediately.
limitForPeriod: 1
limitRefreshPeriod: 5s
timeoutDuration: 0s
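For context, these properties live under Resilience4j's ratelimiter section of application.yaml. A sketch of the full structure is below; the instance name stock-service is our own choice here and must match the name passed to the @RateLimiter annotation:

```yaml
resilience4j:
  ratelimiter:
    instances:
      stock-service:
        limitForPeriod: 1
        limitRefreshPeriod: 5s
        timeoutDuration: 0s
```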

StockService changes:

  • The getStocks method is time-consuming, so we would like to limit the calls. We add @RateLimiter with the instance name we defined in application.yaml.
  • Any additional request arriving within the 5-second window will fail and invoke the fallback method.
  • In the fallback, we return the cached response. Every 5 seconds we update our cache.
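Resilience4j wires all of this up from the annotation and the yaml config. Just to make the mechanics concrete, here is a minimal hand-rolled sketch of the same idea using only the JDK (the class name and stock values are made up): one expensive upstream call per 5-second window, with every other caller served from the cache.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

// Hand-rolled illustration of "one upstream call per window, cache for the rest".
public class RateLimitedStockService {

    private static final long WINDOW_NANOS = 5_000_000_000L; // 5-second refresh period

    // Start one full window in the past so the very first call goes upstream.
    private final AtomicLong windowStart = new AtomicLong(System.nanoTime() - WINDOW_NANOS);
    private volatile List<String> cachedPrices = List.of();
    final AtomicInteger upstreamCalls = new AtomicInteger(); // visible for the demo

    public List<String> getStocks() {
        long now = System.nanoTime();
        long start = windowStart.get();
        // Only the one caller that wins the CAS makes the expensive call this window.
        if (now - start >= WINDOW_NANOS && windowStart.compareAndSet(start, now)) {
            cachedPrices = fetchFromThirdParty();
        }
        return cachedPrices; // everyone else gets the cached (fallback) response
    }

    private List<String> fetchFromThirdParty() {
        upstreamCalls.incrementAndGet();
        return List.of("GOOG:1500", "AMZN:3200"); // placeholder for the slow lookup
    }
}
```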

If I run the same performance test after implementing the rate limiter pattern, the results look great!

  • Throughput has increased from 2.9 requests/sec to 30.5 requests/sec
  • Average response time has dropped to 26 ms from 3 seconds.


The rate limiter pattern is very useful for controlling the throughput of a method call, so that server resources are utilized properly and cascading failures caused by over-utilization of resources are avoided. Rate limiting is also useful for popular applications like Google, Twitter, and Facebook to prevent malicious attacks.

There are other design patterns that could handle this even better in combination with the rate limiter pattern. Please take a look at these articles.

Happy learning 🙂



Vinoth Selvaraj

Principal Software Engineer — passionate about software architectural design, microservices.