- Patryk Gańczarek-Rał
- Read in 3 min.
Entity Framework
Entity Framework (EF) is one of the most popular ORM frameworks for .NET applications. It is supported by Microsoft and also open source. EF Core allows developers to work with data on a higher level of abstraction by using objects of domain-specific classes. According to Microsoft, “It eliminates the need for most of the data-access code that developers need to write”. Although Entity Framework Core is a great tool, it has its limits and drawbacks. When we are not careful enough, we can experience one of them – generating excessive base queries. It is one of the most common N+1 issues. It can have a massive impact on the performance of our system, and it shouldn’t be overlooked.
N+1 problem in the example
This example demonstrates the lazy loading approach. This means that Entity Framework will load our object only when needed. To display the EF database commands, we need to add the following code to our appsettings.json:
"Logging": { "LogLevel": { ... "Microsoft.EntityFrameworkCore.Database.Command": "Information" }
Let’s look at our data models:
public class Blog { [Key] [Required] public int BlogId { get; set; } [StringLength(50)] public string? BlogName { get; set; } public virtual ICollection<Post>? Posts { get; set; } } public class Post { [Key] [Required] public int PostId { get; set; } [StringLength(50)] public string? PostTitle { get; set; } public virtual Blog? Blog { get; set; } }
Now, let’s say we want to display all the titles of the posts in every blog which is in our database:
var blogs = _testDbContext.Blogs.ToList(); foreach (var blog in blogs) { foreach (var post in blog.Posts) { Console.WriteLine(post.PostTitle); } }
When we look at generated SQL commands by EF, we can see that actually we ran multiple selects:
... info: Microsoft.EntityFrameworkCore.Database.Command[20101] Executed DbCommand (25ms) [Parameters=[], CommandType='Text', CommandTimeout='30'] SELECT [b].[BlogId], [b].[BlogName], [b].[EmployeeId] FROM [Blogs] AS [b] info: Microsoft.EntityFrameworkCore.Database.Command[20101] Executed DbCommand (22ms) [Parameters=[@__p_0='?' (DbType = Int32)], CommandType='Text', CommandTimeout='30'] SELECT [p].[PostId], [p].[BlogId], [p].[PostTitle] FROM [Posts] AS [p] WHERE [p].[BlogId] = @__p_0 info: Microsoft.EntityFrameworkCore.Database.Command[20101] Executed DbCommand (0ms) [Parameters=[@__p_0='?' (DbType = Int32)], CommandType='Text', CommandTimeout='30'] SELECT [p].[PostId], [p].[BlogId], [p].[PostTitle] FROM [Posts] AS [p] WHERE [p].[BlogId] = @__p_0 ...
It isn’t a significant problem now because we don’t have many records in our Data Base, but if there were thousands of records, it will dramatically decrease the performance of our system.
How to solve our n+1 problem?
By changing our approach to eager loading, we can avoid excessive queries. By including “posts” in our blogs object, we can load all needed data at once:
var blogs = _testDbContext.Blogs.Include(b=>b.Posts).ToList(); foreach (var blog in blogs) { foreach (var post in blog.Posts) { Console.WriteLine(post.PostTitle); } }
Now when we look at generated queries we can see a massive improvement:
... info: Microsoft.EntityFrameworkCore.Database.Command[20101] Executed DbCommand (25ms) [Parameters=[], CommandType='Text', CommandTimeout='30'] SELECT [b].[BlogId], [b].[BlogName], [b].[EmployeeId], [p].[PostId], [p].[BlogId], [p].[PostTitle] FROM [Blogs] AS [b] LEFT JOIN [Posts] AS [p] ON [b].[BlogId] = [p].[BlogId] ORDER BY [b].[BlogId] ...
Summary
Entity Framework Core is a great tool, and it can help our development a lot. Unfortunately, that doesn’t mean that we can trust it 100%. As we can see in our examples, it is always worth knowing how frameworks work underneath and what code they are generating. By controlling generated SQL queries, we can make sure that our application is as fast as it can be.
Check out other articles in the technology bites category
Discover tips on how to make better use of technology in projects