The title may sound a bit hyperbolic, but I promise that 18,000% was the rough page load speed gain realized on a recent project that I was working. Before we dive into the meat of the problem and its resolution, let’s set the stage. I was given a new project to pull down and become familiar with. After getting over the initial hiccups of loading a new project, I clicked around the application to get a feel for how it functions from a top-level view. In the process of clicking, I landed on one page where there was a progress bar indicator that was running on load and a search form waiting to be filled. After waiting a couple of seconds with no data showing, I figured that it just showed the animation until a search was performed. With that thought in mind, I went to perform a search, but before I could start, the data populated. It had taken what felt like an eternity to load that information, but as anyone knows when investigating a performance problem feelings don’t matter; numbers do.
Where to Start
Figure 1: Network trip of SearchableCares
We have a server issue. If it were a client issue, then the action wouldn’t have taken that long to complete. Now, there’s an important piece of technology that’s being used to access the data in this application, and that’s Entity Framework (EF). When I realized that was the case, I was fairly certain at this point the problem was the so-called N+1 problem
. In short, EF lazy loads navigation properties which means if I load a collection of entities that have navigation properties, then iterate over that collection and access those navigation properties, I end up hitting the database on each for-loop for each navigation property. This practice kills
Figure 2: Materializing a list of Cares from the database
Figure 3: Care class has a number of Virtual properties which have NOT been materialized in Figure 2.
Looking at the PartialView, this application returned and the way the model was being loaded it was definitely triggering the N+1 problem. But I still wanted to get a peek at the actual queries themselves. Recently I’d learned about a tool called Glimpse that would let me inspect the state of the SearchableCares action and, more importantly, the SQL queries that were being issued against the database. I needed to install Glimpse
, Glimpse.EF6, and Glimpse.Mvc5 to make sure everything worked, and when I did I was able to get insight into what was happening against the database. And it was eye-opening! There were 393 queries being executed against the database to fetch the 147 records that were being displayed on the page!
Figure 4: Glimpse output showing the database connections
With the culprit identified, how do we go about solving the problem? Well, EF has a method of telling it that you also need to include these tables when returning my initial request. We can use this eager-loading strategy to prevent the N+1 problem from ever occurring. By forcing EF to materialize all of the data at once, we avoid extra database hits when we access a virtual property. A quick re-write of the code followed with a re-build and re-hitting the end point; we see a much better scenario. As Fig 6 shows, we’ve decreased our load time from 19.777 seconds to 1.851 seconds an improvement of roughly 10.7x. However, waiting almost 2 seconds to fetch 147 records is not a good place to be. Thankfully, there’s still room for improvement.
Figure 5: Telling EF to eagerly load the tables I need.
Figure 6: Much improved load time.
Figure 7: Glimpse output. Only one query.
Now, an important part of our database in this project is that the development database was spun up in Azure. So while the query itself is executed in only 42ms, we still have to transfer all of the data over, and then EF must hydrate our entities with that information. While all of this is happening, the connection to Azure is kept open. This situation is not great for performance, and as seen above in Fig 3, our entity is not exactly simple. It takes time to turn the flat SQL records into the complicated object hierarchy that we’re trying to represent. So now it’s time to tune the query itself.
The query in Fig 5 returns ~49 columns of information. Our view is only using ten columns of information. Ideally, we want to be able to tell EF to include only the information we need. Furthermore, we want to flatten our complex class into a simple ViewModel which will make the work EF has to accomplish much simpler. Using LINQ and a ViewModel, we can force EF to return only the information we need, and in a flat object that is much easier to materialize than our complex class Care.
Figure 8: ViewModel to simplify EF data returned.
So how does this change our query? Since we’re using LINQ, we can use the Select operator to project (or map) the complex object down to our simple object, and EF will use that Select to determine what columns to return.
Figure 9: Simplified query
You may be wondering, where did the Include() calls go? The answer is they’re no longer needed. By using Select to project what we want into an object, EF can determine exactly which columns we need. So we don’t need to tell EF to include certain tables anymore because they’re already going to be included by us accessing them in the Select() method. Now, let’s update the View and take the app for a test spin!
Figure 10: Final improved speed
As we can see from the network snapshot, we have finally achieved a fast load time! Now, most of that connection time is spent transferring the data from Azure still, but it’s kept open for a much shorter period due to the simplicity of the ViewModel that we’re using. And we finally got the performance gain that we wanted to see from the title. We’re now 184.8x faster than when we started, or when displayed as a percent that’s roughly 18,000% faster than when we began this blog post.
What We Learned – Tech Speak
So what can we learn from this? First, if you’re using EF, you absolutely must understand that it lazy loads virtual properties by default and this will come back to haunt you as soon as you iterate a collection of Entities that have them. Second, only return the data that you absolutely need for the view to trim-down load time. Obviously, there’s no need to transfer a bunch of data from the DB if we’re never even going to use it. Finally, it’s important to remember that when doing any performance tuning, you must measure before and after the “improvements” are made. While this was an easy example of low-hanging fruit, it’s not always obvious that the “enhancements” made will improve performance.
What We Learned – Business Speak
Oh and probably one of the most important reasons to optimize the speed of your app, website, or portal is the end-user! You don’t want them to bounce from your site when your page load speed is super slow and frustrating. And if your app load speed is slow, they may never use the app, thus your message, product, or service will never get to the end-user, buyer, or consumer.