In our previous post on reactive strategies, we explored how Polly helps applications respond gracefully to failures after they occur. But what if you could prevent many of those failures from happening in the first place? That’s where proactive resilience strategies come in.
Proactive strategies don’t wait for things to go wrong. Instead, they establish boundaries and constraints that prevent problems before they can cascade into failures. They’re the guardrails that keep your application running smoothly even under unexpected load or when dealing with misbehaving dependencies.
In this post, we’ll explore three essential proactive patterns: Timeout, Rate Limiter, and Concurrency Limiter (the latter two are two modes of the same Polly strategy). Each addresses a different aspect of resource management and helps you build systems that are both resilient and performant.
Understanding Proactive vs Reactive Strategies
The distinction between proactive and reactive strategies is fundamental to building robust resilience pipelines. Reactive strategies—like Retry, Circuit Breaker, Fallback, and Hedging—detect and respond to failures after they’ve occurred. They’re your safety net when things go wrong.
Proactive strategies, on the other hand, prevent failures by constraining how operations execute. They limit execution time (Timeout), control request rates (Rate Limiter), and manage concurrent operations (Concurrency Limiter). Think of them as preventive medicine for your application: they keep you healthy rather than treating illness after it occurs.
The Timeout Strategy
Imagine calling an external API that usually responds in 200ms. But occasionally, due to a bug or resource exhaustion on the server, a request hangs indefinitely. Without a timeout, that single slow request can tie up a thread in your application, and as more requests pile up, you quickly exhaust your thread pool. What started as a problem in a dependency has now become a problem in your own service.
The Timeout strategy is the simplest yet most critical proactive pattern. It ensures that operations don’t run forever, preventing resource exhaustion and keeping your application responsive. Every network call, database query, or external service invocation should have a reasonable timeout configured.
When to Use Timeout
Timeout is essential for:
- Any network request (HTTP calls, gRPC, database queries)
- Operations with external dependencies that could hang
- Protecting thread pools from exhaustion
- Ensuring predictable response times
Without timeouts, a single hanging operation blocks a thread indefinitely. As blocked threads accumulate, your entire application becomes unresponsive—a problem that’s especially critical in high-throughput services where every thread counts.
Basic Timeout Configuration
builder.Services.AddResiliencePipeline("timeout-pipeline", pipelineBuilder =>
{
pipelineBuilder.AddTimeout(new TimeoutStrategyOptions
{
Timeout = TimeSpan.FromSeconds(5)
});
});
This simple configuration ensures that any operation wrapped by this pipeline is canceled if it doesn’t complete within 5 seconds. The wrapped callback is canceled through its CancellationToken, and the caller receives a TimeoutRejectedException, allowing your code to handle the timeout appropriately.
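To make that concrete, here’s a minimal sketch of executing the pipeline and handling the timeout. It assumes the pipeline is resolved from the DI-registered ResiliencePipelineProvider<string> (pipelineProvider), and that httpClient and the "/orders" endpoint are placeholders for your own code:
// Resolve the pipeline registered above and execute a call through it
ResiliencePipeline pipeline = pipelineProvider.GetPipeline("timeout-pipeline");

try
{
    var response = await pipeline.ExecuteAsync(
        async token => await httpClient.GetAsync("/orders", token),
        cancellationToken);
}
catch (TimeoutRejectedException)
{
    // The call exceeded the 5-second limit and was canceled by the timeout strategy
}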
Dynamic Timeouts
Polly allows you to configure dynamic timeouts based on context. This can be useful when different operations or priorities require different timeout values:
pipelineBuilder.AddTimeout(new TimeoutStrategyOptions
{
TimeoutGenerator = args =>
{
// You can use dynamic timeouts based on context
var priority = args.Context.Properties.GetValue(
new ResiliencePropertyKey<string>("Priority"), "Normal");
return new ValueTask<TimeSpan>(
priority == "High"
? TimeSpan.FromSeconds(10)
: TimeSpan.FromSeconds(5)
);
}
});
Handling Timeout Events
You can monitor timeout occurrences to understand when and why operations are timing out:
pipelineBuilder.AddTimeout(new TimeoutStrategyOptions
{
Timeout = TimeSpan.FromSeconds(5),
OnTimeout = args =>
{
logger.LogWarning(
"Operation timed out after {Timeout}. Operation: {OperationKey}",
args.Timeout,
args.Context.OperationKey);
return ValueTask.CompletedTask;
}
});
Setting Appropriate Timeout Values
Choosing the right timeout value is more art than science. Set it too low and you’ll reject operations that would have succeeded given a bit more time. Set it too high and you won’t protect your application from truly hung operations.
Start by understanding your operation’s normal behavior. If an API call typically responds in 200ms, a timeout of 5 seconds gives plenty of headroom for slow responses while still protecting against hangs. For database queries, consider the 95th or 99th percentile response time and add a reasonable buffer.
Remember that timeouts should account for retry logic if you’re combining strategies. If you have a retry policy that attempts an operation 3 times with a 2-second delay between attempts, your outer timeout needs to accommodate that entire retry sequence.
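As a rough sketch of that interaction (the values are illustrative, and the retry strategy itself was covered in the previous post), an outer timeout added first budgets for the whole retry sequence while an inner timeout caps each individual attempt:
pipelineBuilder
    // Outermost: bounds the entire sequence, including retries and delays
    .AddTimeout(TimeSpan.FromSeconds(30))
    .AddRetry(new RetryStrategyOptions
    {
        MaxRetryAttempts = 3,
        Delay = TimeSpan.FromSeconds(2)
    })
    // Innermost: bounds each individual attempt
    .AddTimeout(TimeSpan.FromSeconds(5));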
The Rate Limiter Strategy
Picture a public API that receives a sudden surge of traffic—perhaps your service was mentioned on social media or is experiencing a DDoS attack. Without rate limiting, this flood of requests can overwhelm your infrastructure, causing high latency for all users or even complete service outages. The Rate Limiter strategy prevents this by controlling how many operations can execute.
What makes Polly’s Rate Limiter strategy particularly powerful is its dual nature. It can operate in two distinct modes: as a traditional rate limiter controlling requests over time, or as a concurrency limiter controlling simultaneous operations. Both modes protect resources, but in different ways.
Rate limiting is fundamentally about protecting resources. By constraining either the rate at which operations execute or the number that can run concurrently, you ensure that your service remains responsive and available even under heavy load. This protects not just your own service, but also downstream dependencies that you’re calling.
When to Use Rate Limiter
Rate Limiter is crucial for:
- Protecting APIs from being overwhelmed by traffic spikes
- Complying with third-party API rate limits
- Implementing fair usage policies across tenants or users
- Preventing resource exhaustion (database connections, thread pools)
- Managing costs for pay-per-request services
- Controlling the number of concurrent operations
Rate Limiting Modes
Polly V8’s Rate Limiter strategy is built on top of the System.Threading.RateLimiting package and supports multiple algorithms. The strategy can be used in two primary modes:
Time-Based Rate Limiting controls how many operations can execute within a time window. This is useful for complying with API rate limits (e.g., “100 requests per minute”) or protecting services from traffic spikes.
Concurrency Limiting controls how many operations can execute simultaneously. This is useful for protecting limited resources like database connection pools or preventing thread pool exhaustion.
The underlying System.Threading.RateLimiting package provides several rate limiting algorithms:
- Fixed Window - Divides time into fixed windows and allows a specific number of requests per window. Simple but can allow bursts at window boundaries.
- Sliding Window - Tracks requests in a rolling time window, providing smoother rate limiting without boundary bursts.
- Token Bucket - Allows burst traffic up to a token limit while enforcing an average rate over time (see the sketch just after this list).
- Concurrency - Limits the number of concurrent operations with optional queueing.
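Only the sliding window and concurrency algorithms appear in the examples below, so here is a hedged sketch of what a token bucket limiter might look like; the numbers are illustrative, allowing bursts of up to 20 requests while sustaining roughly 10 per second:
pipelineBuilder.AddRateLimiter(new TokenBucketRateLimiter(
    new TokenBucketRateLimiterOptions
    {
        TokenLimit = 20,                               // maximum burst size
        TokensPerPeriod = 10,                          // tokens restored each period
        ReplenishmentPeriod = TimeSpan.FromSeconds(1), // how often tokens are restored
        QueueLimit = 0                                 // reject immediately when no tokens remain
    }));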
Basic Rate Limiter Configuration
The Rate Limiter strategy provides two convenient extension methods depending on your use case:
Time-based rate limiting using AddRateLimiter:
builder.Services.AddResiliencePipeline("pipeline", pipelineBuilder =>
{
pipelineBuilder.AddRateLimiter(new SlidingWindowRateLimiter(
new SlidingWindowRateLimiterOptions
{
PermitLimit = 100,
Window = TimeSpan.FromMinutes(1),
SegmentsPerWindow = 6
}));
});
This configuration allows 100 requests per minute using a sliding window divided into 6 segments (10 seconds each). This provides smooth rate limiting without the burst problem of fixed windows.
Concurrency limiting using AddConcurrencyLimiter:
builder.Services.AddResiliencePipeline("pipeline", pipelineBuilder =>
{
pipelineBuilder.AddConcurrencyLimiter(100, 50);
});
This simpler syntax creates a concurrency limiter allowing a maximum of 100 concurrent executions with a queue of 50 waiting requests.
Partitioned Rate Limiting
One of the most powerful features of rate limiting is partitioning—applying different rate limits to different groups of requests. This enables scenarios like per-user rate limiting or per-tenant quotas in multi-tenant applications.
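The configuration below reads a "UserId" property from the resilience context, so the calling code has to put it there first. Here’s a minimal sketch of that side, where pipeline is the resolved resilience pipeline and currentUserId and CallApiAsync are hypothetical placeholders for your own code:
// Attach the user ID to the ResilienceContext before executing the pipeline
ResilienceContext context = ResilienceContextPool.Shared.Get(cancellationToken);
context.Properties.Set(new ResiliencePropertyKey<string>("UserId"), currentUserId);

try
{
    await pipeline.ExecuteAsync(
        async ctx => await CallApiAsync(ctx.CancellationToken),
        context);
}
finally
{
    ResilienceContextPool.Shared.Return(context);
}
With that property in place, the pipeline can partition its rate limits per user: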
// Create the partitioned limiter once and share it across executions
var partitionedLimiter = PartitionedRateLimiter.Create<ResilienceContext, string>(context =>
{
    // Get the user ID from the resilience context
    var userId = context.Properties.GetValue(
        new ResiliencePropertyKey<string>("UserId"), "anonymous");

    // Premium users get higher limits
    var isPremium = premiumUsers.Contains(userId);

    return RateLimitPartition.GetSlidingWindowLimiter(userId, _ =>
        new SlidingWindowRateLimiterOptions
        {
            PermitLimit = isPremium ? 1000 : 100,
            Window = TimeSpan.FromMinutes(1),
            SegmentsPerWindow = 6
        });
});

pipelineBuilder.AddRateLimiter(new RateLimiterStrategyOptions
{
    // Acquire a permit from the partition that matches the current user
    RateLimiter = args => partitionedLimiter.AcquireAsync(
        args.Context, 1, args.Context.CancellationToken)
});
Handling Rate Limit Rejections
When a request is rate limited, Polly throws a RateLimiterRejectedException. You can handle this exception and provide meaningful feedback to callers:
var rateLimiter = new SlidingWindowRateLimiter(
    new SlidingWindowRateLimiterOptions
    {
        PermitLimit = 100,
        Window = TimeSpan.FromMinutes(1),
        SegmentsPerWindow = 6
    });

pipelineBuilder.AddRateLimiter(new RateLimiterStrategyOptions
{
    RateLimiter = args => rateLimiter.AcquireAsync(1, args.Context.CancellationToken),
    OnRejected = args =>
    {
        // Retry-after metadata is only present when the underlying limiter provides it
        args.Lease.TryGetMetadata(MetadataName.RetryAfter, out var retryAfter);

        logger.LogWarning(
            "Request rate limited. Retry after: {RetryAfter}",
            retryAfter);
        return ValueTask.CompletedTask;
    }
});
The OnRejected callback is particularly useful when you have a rate limiter wrapped by other strategies like retry. If the retry strategy handles RateLimiterRejectedException, the exception might not propagate to your calling code, but OnRejected will still be called, allowing you to log or track rate limit events.
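When the exception does reach your code, you can surface the rejection to the caller. A minimal sketch, where pipeline is the resolved resilience pipeline and CallDownstreamAsync is a hypothetical placeholder for your own operation:
try
{
    await pipeline.ExecuteAsync(
        async token => await CallDownstreamAsync(token),
        cancellationToken);
}
catch (RateLimiterRejectedException ex)
{
    // RetryAfter is only populated when the underlying limiter supplies that metadata
    if (ex.RetryAfter is TimeSpan retryAfter)
    {
        logger.LogWarning("Rejected by the rate limiter. Retry after: {RetryAfter}", retryAfter);
    }
}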
Rate Limiting External API Calls
A common use case is respecting rate limits imposed by third-party APIs. Many services limit you to a certain number of requests per hour or day, and exceeding those limits can result in your API key being throttled or banned.
// GitHub API allows 5,000 requests per hour for authenticated requests
builder.Services.AddResiliencePipeline("github-api", pipelineBuilder =>
{
pipelineBuilder.AddRateLimiter(new SlidingWindowRateLimiter(
new SlidingWindowRateLimiterOptions
{
PermitLimit = 4500, // Leave some buffer
Window = TimeSpan.FromHours(1),
SegmentsPerWindow = 12 // 5-minute segments
}));
});
Concurrency Limiting for Resource Protection
While rate limiting controls operations over time, concurrency limiting controls simultaneous operations. This is crucial for managing resources like database connections, thread pools, or memory usage.
Consider a scenario where you have a pool of 100 database connections. If your application receives a burst of 10,000 concurrent requests, all trying to query the database, you’ll quickly exhaust your connection pool. Subsequent requests will fail or hang waiting for connections. Concurrency limiting prevents this by queuing requests and ensuring only a safe number execute concurrently.
// If you have 100 connections in your pool, limit concurrent DB operations
builder.Services.AddResiliencePipeline("database-operations", pipelineBuilder =>
{
    pipelineBuilder
        .AddTimeout(TimeSpan.FromSeconds(30))  // outermost, so it also covers time spent queued
        .AddConcurrencyLimiter(80, 200);       // 80 concurrent, queue up to 200
});
By limiting concurrency to 80 operations, you ensure connections remain available for critical operations and prevent connection pool exhaustion. The queue limit of 200 prevents unbounded memory growth: once the queue is full, additional requests are rejected immediately. Because the timeout is added first, it wraps the limiter, ensuring that even if the queue builds up, no operation waits indefinitely.
The AddConcurrencyLimiter extension method is shorthand for configuring the rate limiter with a concurrency limiter:
// These are equivalent:
pipelineBuilder.AddConcurrencyLimiter(100, 50);
pipelineBuilder.AddRateLimiter(new RateLimiterStrategyOptions
{
DefaultRateLimiterOptions = new ConcurrencyLimiterOptions
{
PermitLimit = 100,
QueueLimit = 50
}
});
Combining Rate and Concurrency Limiting
Rate limiting and concurrency limiting serve different purposes and are often used together. You might want to ensure you never exceed 1,000 requests per minute while also never having more than 50 operations executing simultaneously:
builder.Services.AddResiliencePipeline("comprehensive-limits", pipelineBuilder =>
{
pipelineBuilder
// Limit requests per minute (rate limiting)
.AddRateLimiter(new SlidingWindowRateLimiter(
new SlidingWindowRateLimiterOptions
{
PermitLimit = 1000,
Window = TimeSpan.FromMinutes(1),
SegmentsPerWindow = 6
}))
// Limit concurrent operations (concurrency limiting)
.AddConcurrencyLimiter(50, 100);
});
This pipeline ensures you never exceed 1,000 requests per minute (protecting the overall rate) while also ensuring no more than 50 operations execute simultaneously (protecting resources).
Best Practices
1. Always Set Timeouts
Every operation that touches external resources should have a timeout. There are no exceptions to this rule. Hanging operations are one of the most common causes of application failures in distributed systems.
2. Match Rate Limits to Reality
When calling third-party APIs, respect their documented rate limits. For your own APIs, set rate limits based on actual capacity testing, not guesses. Monitor your rate limiters to ensure they’re not too restrictive or too permissive.
3. Leave Headroom in Concurrency Limits
If you have 100 database connections, don’t limit concurrency to 100. Leave 20-30% headroom for other operations, health checks, and background jobs.
4. Monitor Limit Rejections
Track how often your rate limiter rejects requests (whether due to rate or concurrency limits). High rejection rates might indicate you need more capacity or that clients need to implement backoff strategies.
5. Provide Meaningful Feedback
When rejecting requests due to limits, use the OnRejected callback to log useful information. The RateLimiterRejectedException may include retry-after information that you can surface to clients.
6. Test Under Load
Your resilience strategies mean nothing if they haven’t been tested under realistic load. Use load testing tools to verify that your timeouts, rate limits, and concurrency limits behave as expected.
7. Use Dynamic Limits When Appropriate
Consider adjusting limits based on system health. If your database is struggling, temporarily reduce concurrency limits. If error rates are high, be more aggressive with rate limiting.
8. Choose the Right Rate Limiting Mode
Use time-based rate limiting (sliding window, fixed window, token bucket) when you need to control request rates over time. Use concurrency limiting when you need to protect finite resources like connection pools. Often, you’ll use both together for comprehensive protection.
Conclusion
Proactive resilience strategies prevent problems before they cascade into failures. The two proactive strategies in Polly V8 are:
- Timeout - ensures operations don’t hang indefinitely and exhaust resources
- Rate Limiter - controls both request rates over time and concurrent operations, preventing overload and protecting finite resources
The key is knowing when and how to configure them. Timeout values should account for normal operation plus a buffer. Rate limits should match capacity or third-party requirements. Concurrency limits should leave headroom for critical operations.
In the next post, we’ll explore how to combine proactive and reactive strategies to create comprehensive resilience pipelines that both prevent problems and handle them gracefully when prevention isn’t enough.


