Why My Synchronous API Scaled Better Than My Asynchronous API

Hi everyone, Jeremy Kruer here. Today I am doing an update on my last post about Asynchronous Programming in C#. In that post I created an API with two different endpoints. One endpoint downloaded several websites with traditional synchronous code, and the other downloaded the same websites with asynchronous code. The purpose of the post was to show how to convert synchronous code into asynchronous code.

The secondary purpose was to show how much better the asynchronous code scaled compared to the synchronous code. Unfortunately, at the end of the post, when I generated load against both endpoints, the synchronous endpoint was able to handle more load than the asynchronous endpoint. I wasn't sure why this was happening, and I asked you for help. I got lots of comments, suggestions, replies, and even pull requests, and I really appreciate all of the time that everyone spent. I finally figured out what the issue was and I want to share that with you today.

Common Feedback I Got

The most common piece of feedback I got was that I was downloading the multiple websites one at a time and that I should convert the asynchronous code to download the websites in parallel. You could accomplish this by modifying the CrawlBingSearchPageAsync method within the AsyncService. There are 3 things that you would need to do:

  1. Create a List to store the tasks: var tasks = new List<Task<HtmlDocument>>();
  2. Within the foreach loop, instead of awaiting each call to LoadDocumentAsync, add the task to the list: tasks.Add(LoadDocumentAsync(linkUrl));
  3. Wait for all of the tasks to complete: await Task.WhenAll(tasks);

The final code for the CrawlBingSearchPageAsync method would look like this:

        private async Task<List<string>> CrawlBingSearchPageAsync(string url)
        {
            var result = new List<string>();

            var doc = await LoadDocumentAsync(url);
            var searchResultsGrid = GetSearchResultsGrid(doc);
            if (searchResultsGrid == null) return result;

            var searchResultsLinks = GetSearchResultsLinks(searchResultsGrid);

            // Start every download without awaiting it, so they all run concurrently.
            var tasks = new List<Task<HtmlDocument>>();
            foreach (var linkUrl in searchResultsLinks)
            {
                tasks.Add(LoadDocumentAsync(linkUrl));
            }

            // Asynchronously wait until every download has finished.
            await Task.WhenAll(tasks);

            return result;
        }

While creating the original post, I thought about this and I almost did it. I completely agree that in a production application this is the correct approach to take. However, the performance boost gained this way comes from making the code parallel, not from making it asynchronous. I felt that this would give the asynchronous code an unfair advantage and be misleading.

In my demo application I downloaded a total of 4 websites to make each request take longer. The reason I feel this approach would be misleading is that if my demo only downloaded a single website, I would get no benefit from trying to make it parallel (since there is only one download) and I would still have the same issue that I originally had.
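
To make the distinction concrete, here is a minimal sketch that compares awaiting downloads one at a time with starting them all and awaiting Task.WhenAll. It is not code from the SyncVsAsync project: FakeDownloadAsync is a hypothetical stand-in for LoadDocumentAsync that simply waits 200 ms, and the class and method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class SequentialVsParallelDemo
{
    // Hypothetical stand-in for LoadDocumentAsync: waits 200 ms instead of downloading.
    private static async Task<string> FakeDownloadAsync(string url)
    {
        await Task.Delay(200);
        return $"<html>{url}</html>";
    }

    // Awaits each download before starting the next one (what my demo does).
    public static async Task<long> SequentialAsync(IEnumerable<string> urls)
    {
        var stopwatch = Stopwatch.StartNew();
        foreach (var url in urls)
        {
            await FakeDownloadAsync(url);
        }
        return stopwatch.ElapsedMilliseconds;
    }

    // Starts every download first, then awaits them all together.
    public static async Task<long> ParallelAsync(IEnumerable<string> urls)
    {
        var stopwatch = Stopwatch.StartNew();
        var tasks = urls.Select(FakeDownloadAsync).ToList();
        await Task.WhenAll(tasks);
        return stopwatch.ElapsedMilliseconds;
    }

    public static async Task Main()
    {
        var urls = new[] { "a", "b", "c", "d" };
        Console.WriteLine($"Sequential: {await SequentialAsync(urls)}ms");
        Console.WriteLine($"Parallel:   {await ParallelAsync(urls)}ms");
    }
}
```

With four simulated downloads, the sequential version should take roughly four times as long as the parallel one. Both versions are fully asynchronous, though, so the speedup says nothing about how well async code scales under load. With a single URL the two methods would take about the same time.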

The First Problem

The first problem I was having was due to the way that the .NET Framework handles outbound HTTP calls. This was pointed out to me by Christian Melendez on StackOverflow. By default you are limited to only 2 concurrent connections per host. I corrected this issue by setting System.Net.ServicePointManager.DefaultConnectionLimit to 1000 within the Startup class in my Web API.

        public Startup(IHostingEnvironment env)
        {
            // Raise the default cap of 2 concurrent outbound connections per host.
            System.Net.ServicePointManager.DefaultConnectionLimit = 1000;

            var builder = new ConfigurationBuilder()
                .AddJsonFile("appsettings.json", optional: true, reloadOnChange: true)
                .AddJsonFile($"appsettings.{env.EnvironmentName}.json", optional: true);
            Configuration = builder.Build();
        }

This improved the performance of my API in general but it did not allow my asynchronous controller to handle more load than my synchronous controller.

Other Changes I Made

Upping the DefaultConnectionLimit made such a big difference to the API that I needed to increase the load within my LoadGenerator program.

Within Program.cs I increased the number of concurrent connections from 50 to 200 on Line 21 by changing the call to Enumerable.Range.

        public static void Main(string[] args)
        {
            Console.WriteLine("Waiting for API to Startup");
            bool quit;
            do
            {
                Console.Write("Run Asyncronously (Y/N)?");
                var runAsyncronously = Console.ReadLine().ToLower() == "y";
                // Fire off 200 concurrent requests, then block until they have all completed.
                var tasks = Enumerable.Range(0, 200).Select(num => CallApi(num, runAsyncronously)).ToArray();
                Task.WaitAll(tasks);
                Console.Write("Finished. Quit (Y/N)?");
                quit = Console.ReadLine().ToLower() == "y";
            } while (!quit);
        }

Within the CallApi method, I wanted to lower the timeout from the default of 100 seconds to only 60 seconds. In order to do this I had to change from a WebClient to an HttpClient, which gave me more control.

        public static async Task CallApi(int number, bool runAsyncronously)
        {
            var stopWatch = new Stopwatch();
            stopWatch.Start();
            string result;
            try
            {
                //var url = runAsyncronously ? "http://localhost:60383/api/SearchBingAsync" : "http://localhost:60383/api/SearchBingSync";
                var url = runAsyncronously ? "http://localhost:5000/api/SearchBingAsync?random=" : "http://localhost:5000/api/SearchBingSync?random=";

                // HttpClient exposes a Timeout property; WebClient does not.
                using (var client = new HttpClient
                {
                    Timeout = TimeSpan.FromSeconds(60)
                })
                using (var response = await client.GetAsync(url + Guid.NewGuid()))
                using (var content = response.Content)
                {
                    var downloadedContent = await content.ReadAsStringAsync();
                    result = "Success!";
                }
            }
            catch (Exception)
            {
                result = "Failure!";
            }
            Console.WriteLine($"{number} - {result} - {stopWatch.ElapsedMilliseconds}ms");
        }

The Root Cause of the Problem

I'm a bit embarrassed to admit this, but the root cause of my problem was that I was running the code from within Visual Studio. Even though I was running it in "Release" mode, Visual Studio still must have been doing something to alter the way the code was actually executing. Once I published my code and ran it outside of Visual Studio everything performed as expected.
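
I never tracked down exactly what Visual Studio was doing differently. One thing that is cheap to rule out in a load test is an attached debugger, so here is a minimal sketch of a startup guard. The LoadTestGuard class and its warning message are hypothetical and not part of the SyncVsAsync project; Debugger.IsAttached is the standard System.Diagnostics property.

```csharp
using System;
using System.Diagnostics;

public static class LoadTestGuard
{
    // Returns a warning when a debugger is attached, otherwise null.
    public static string GetDebuggerWarning(bool debuggerAttached)
    {
        return debuggerAttached
            ? "Warning: a debugger is attached; timings may not reflect a deployed build."
            : null;
    }

    public static void Main()
    {
        var warning = GetDebuggerWarning(Debugger.IsAttached);
        if (warning != null) Console.WriteLine(warning);
    }
}
```

Treat this as a sanity check rather than an explanation of the slowdown I saw. Publishing the application and running it outside of Visual Studio is still the safer way to test.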

Publishing My Code

From within Visual Studio, go to Build > Publish SyncVsAsync.

Choose "Custom" for the publish target and name it "Local".

Choose "File System" for the Publish Method and choose a target folder.

Click "Publish" and it will publish a copy of your API to that folder.

At this point you can go ahead and close Visual Studio.

Running the Code

Navigate to the folder that you published your API to and double-click the SyncVsAsync.exe application. Your Web API is now running within Kestrel and listening on port 5000.

Now navigate to your /bin/Release folder for the LoadGenerator application. Double-click the LoadGenerator.exe application.


First, we want to run the program synchronously, so when the LoadGenerator program asks "Run Asynchronously?" choose N for No. It will immediately send 200 requests to the synchronous endpoint on the API.

Now pay attention to the output from the API. Even though all 200 requests were sent immediately, the API does not have enough threads available to process all of the incoming requests. Instead, the requests get queued up and processed as threads become available. You can tell because, for almost the entire run, you will see log lines that say "Request starting HTTP/1.1 GET ...". These keep appearing long after the requests were initially sent by the LoadGenerator.

This is the nature of synchronous code. While the thread is waiting for the websites to download, instead of doing something useful (like processing the queued-up requests), it sits and does nothing at all. The queued-up requests cannot be processed until a thread has finished the request that it is currently working on.

When you look at the output of the LoadGenerator application, you will see that it successfully processed the first several requests but it failed to process the majority of them because they timed out.


Now let's look at what happens when we run the asynchronous code. When the LoadGenerator program asks "Run Asynchronously?" choose Y for Yes. It will immediately send 200 requests to the asynchronous endpoint on the API.

Now pay attention to the output from the API. All 200 requests get received almost immediately. Within the first few seconds of the requests being sent you will see all of the requests coming in and getting processed before any responses get sent out.

With the synchronous code we saw the incoming requests get queued up and then slowly processed as the threads became available. However, with the asynchronous code all 200 requests started getting processed almost immediately. This is the nature of asynchronous code. Here is how a thread would work:

  1. Receive an incoming request
  2. Start to download the first website for the first request
  3. While that first website is downloading, start processing another incoming request
  4. Start to download the first website for the second request
  5. If the first download for the first request is finished, start downloading the second website for the first request
  6. If the first download for the first request is not finished, start processing another incoming request

This continues until all requests have been processed. The point is, the thread doesn't sit around twiddling its thumbs and doing nothing while it is waiting for downloads to complete. It stays busy processing other requests! This is why asynchronous APIs scale so much better than synchronous APIs!
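
The difference can be sketched with a small simulation. This is not the SyncVsAsync code: it assumes a hypothetical server with only 4 worker threads, 20 incoming requests, and a 100 ms "download" per request. The blocking handler holds its worker thread with Thread.Sleep, while the async handler hands back a Task from Task.Delay so the worker can pick up the next request right away.

```csharp
using System;
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThreadStarvationDemo
{
    private const int WorkerThreads = 4;  // hypothetical cap on request threads
    private const int Requests = 20;      // simulated incoming requests
    private const int DownloadMs = 100;   // simulated download time

    // Feeds all requests to a fixed set of worker threads and returns the
    // elapsed milliseconds until every request has finished.
    public static long Run(Func<Task> handler)
    {
        var queue = new BlockingCollection<int>();
        for (var i = 0; i < Requests; i++) queue.Add(i);
        queue.CompleteAdding();

        var inFlight = new ConcurrentBag<Task>();
        var stopwatch = Stopwatch.StartNew();

        var workers = Enumerable.Range(0, WorkerThreads)
            .Select(i => new Thread(() =>
            {
                // A worker can only take the next request once handler() returns.
                foreach (var request in queue.GetConsumingEnumerable())
                {
                    inFlight.Add(handler());
                }
            }))
            .ToList();

        workers.ForEach(w => w.Start());
        workers.ForEach(w => w.Join());
        Task.WaitAll(inFlight.ToArray());
        return stopwatch.ElapsedMilliseconds;
    }

    public static void Main()
    {
        // Blocking handler: the worker thread is stuck for the whole download.
        var blockingMs = Run(() => { Thread.Sleep(DownloadMs); return Task.CompletedTask; });
        // Async handler: returns immediately with a Task that completes later.
        var asyncMs = Run(() => Task.Delay(DownloadMs));

        Console.WriteLine($"Blocking handlers: {blockingMs}ms");
        Console.WriteLine($"Async handlers:    {asyncMs}ms");
    }
}
```

With these numbers the blocking run needs about five rounds of 100 ms, because only 4 requests can be in progress at once. The async run should finish in a little over 100 ms, because all 20 downloads are in flight together. The real thread pool in ASP.NET Core is more dynamic than a fixed set of 4 threads, so treat this as an illustration of the idea rather than a model of Kestrel.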

When we look at the output of the LoadGenerator application, you can see that the majority of the requests were able to be successfully processed while only a few of the requests timed out and failed.

What I Learned

  1. Asynchronous Code DOES Scale Better Than Synchronous Code
  2. When doing load/scale/performance tests, make sure you are running the deployed application, not debugging from within Visual Studio
  3. When creating an asynchronous demo, it would probably be better to use either a) disk I/O as an example or b) a server endpoint that you have control over, instead of creating a small-scale DoS attack on a public web server you don't control (sorry Bing and MSDN).

Requests I Have For You

  1. Subscribe so you can be notified of all my new content.
  2. What questions do you have about async/await?
  3. Have you converted any production synchronous web apps to asynchronous code? If so what was your experience like?
  4. What topics would you like to see me discuss next?
