Introduction to RCSI and Improved Entity Framework Locking Hint Interceptor
A few days ago, I inadvertently discovered that Google's NotebookLM had actually supported Traditional Chinese in its podcasts as early as April 30, 2025 (ironically, until January 2026, Gemini kept insisting to me that it wasn't supported). On a whim, I fed some of my technical notes into it, thinking I could convert them into a podcast to listen to during my commute.
Since Antigravity suggested that my notes were full of decision-making processes, I set the style to "Debate." After all, I don't consider my notes to be absolutely correct, so I also added a prompt asking the AI to check for any errors or outdated content.
The resulting content left me a bit stunned; it felt like many of the comments were criticizing just for the sake of criticizing:
- Debating for the sake of debating: In my notes, I mentioned that I am accustomed to using
DateTimeOffset, noting a pitfall a colleague encountered when usingDateTimewith UTC in Entity Framework (I identified the cause and provided a solution). The comment's conclusion was: I should have just told my colleague to switch toDateTimeOffsetinstead of wasting time struggling withDateTime? - Missing the point: The two AI hosts spent a large portion of the time attacking my "Introduction" and "Background" sections, while ignoring the technical details of the main content.
- Calling a deer a horse: I was clearly just demonstrating syntax or discussing certain mechanisms, and the text explicitly stated that I didn't recommend using them or highlighted potential issues. Yet, it claimed I was "promoting" these practices!
Well, I admit I'm a bit sensitive; I ended up deleting all those podcasts.
However, I can't say it was a total loss. I heard one key term: RCSI (Read Committed Snapshot Isolation).
I remember back around 2018, when interviewing for companies with high-traffic or high-concurrency systems, knowing how to use WITH (NOLOCK) in SQL Server was almost a mandatory question. It showed you knew how to prevent queries from being blocked by writes. But after checking recent documentation, I found that this practice is now considered an Anti-Pattern in certain scenarios.
Since I've been busy and a bit lazy lately, I'll just jot down the key points:
TIP
The complete executable sample for this article: CloudyWing/EfCoreSqlHintInterceptorSample.
What is RCSI?
The full name is Read Committed Snapshot Isolation. Simply put, it solves locking issues through the concept of "version control." When someone is writing data, read operations are not blocked, nor do they read "dirty" data that is mid-write. Instead, they read the "last version before the write (Snapshot)."
Is RCSI a global switch?
Yes, RCSI (READ_COMMITTED_SNAPSHOT) is a Database Level setting.
Once enabled (ALTER DATABASE ... SET READ_COMMITTED_SNAPSHOT ON), it changes the behavior of all queries using the Read Committed isolation level (the default) across the entire database.
- It cannot be enabled for "specific tables."
- It is globally effective; all queries that do not specify an isolation level will automatically become "Read Snapshot" mode.
TIP
If you want "selective" use of snapshot isolation, you should use Snapshot Isolation and explicitly specify IsolationLevel.Snapshot when creating a transaction in EF. However, such fine-grained control requires a high level of development discipline from the team. For projects with frequent transactions, honestly, even I am not confident that I could take the time to clarify and make the correct isolation level decisions for every single case.
WARNING
Please note that RCSI primarily solves locking issues related to queries (avoiding reads being blocked by writes). If you need to handle concurrency conflicts during data updates (e.g., two people modifying the same record simultaneously), you must still use the RowVersion (Optimistic Locking) mechanism to ensure data consistency. RCSI cannot replace RowVersion.
RCSI vs. NOLOCK vs. READPAST
WITH (NOLOCK): To avoid being locked, I am willing to read "Dirty Data (Dirty Read)."
WITH (READPAST): If a row is locked, just "skip" it. Suitable for queue processing, but not for general reports (because data will be missing).
RCSI: Read consistent "Snapshot Data." You can achieve non-blocking reads without adding any hints.
TIP
Whether it's a dirty read or a snapshot read, in high-concurrency scenarios, there is no guarantee of getting the "latest state" of the data, as the data is constantly changing. The difference is: a snapshot read only reads committed historical versions, whereas a dirty read might read an incomplete intermediate state, which destroys atomicity and thus carries a significantly higher level of risk.
Why wasn't it used before, and why is it mainstream now?
The cost of RCSI is that it uses a large amount of TempDB to store data versions. In the era of traditional hard disk drives (HDD), this would lead to severe I/O bottlenecks. But the mainstream environment has changed:
- SSD Adoption: I/O speeds have increased dramatically, so the burden on TempDB is no longer a fatal flaw.
- Cloud Defaults: For example, Azure SQL Database has RCSI enabled by default.
Therefore, if the hardware can handle it (space and I/O), enabling RCSI directly is a cleaner solution than adding NOLOCK everywhere (especially since using NOLOCK in Entity Framework usually requires an interceptor, which is cumbersome).
TagWith + Interceptor Improved Implementation
Of course, RCSI is the best solution for now, but it's not suitable for every project, and we still need to control locking levels.
I previously wrote a note on How to add WITH (NOLOCK) and handle Parameter Sniffing in Entity Framework. The solution at the time used DbCommandInterceptor combined with Regex to force NOLOCK on all SELECT statements.
To be honest, I was never satisfied with that solution. When running list reports, data accuracy might not need to be that high; but when you are retrieving data to perform updates, reading dirty data is very dangerous. At the time, I could only use "workarounds" like excluding TOP 2, but if it were batch data processing (which requires fetching list data), it would still cause issues.
The Discovery of TagWith
Back in 2024, a colleague found the usage of TagWith while dealing with Parameter Sniffing issues. I mistakenly thought it was just adding a static comment and then using an interceptor to search and replace strings. I only recently realized that TagWith is a built-in feature of EF Core that allows comments to be fully carried into the generated SQL, which gives us a perfect "switch."
The Improved Solution
This example chooses not to add WITH (NOLOCK) and OPTION by default, but rather uses TagWith for explicit declaration. The interceptor only intervenes when I explicitly state that this query requires NOLOCK.
1. Define Extension Methods
This makes the code look like a Fluent API, with clearer semantics.
public static class EfHintExtensions {
public const string TagNoLock = "SQL_HINT: NOLOCK";
public const string TagReadPast = "SQL_HINT: READPAST";
public const string TagOptimizeForUnknown = "SQL_OPTION: OPTIMIZE FOR UNKNOWN";
public const string TagRecompile = "SQL_OPTION: RECOMPILE";
public static IQueryable<T> WithNoLock<T>(this IQueryable<T> query) => query.TagWith(TagNoLock);
public static IQueryable<T> WithReadPast<T>(this IQueryable<T> query) => query.TagWith(TagReadPast);
public static IQueryable<T> WithOptimizeForUnknown<T>(this IQueryable<T> query) => query.TagWith(TagOptimizeForUnknown);
public static IQueryable<T> WithRecompile<T>(this IQueryable<T> query) => query.TagWith(TagRecompile);
}2. Implement the Interceptor
This version of the interceptor has two improvements:
- Schema Support: Improved the Regex to support formats with schemas like
[dbo].[Table](the old Regex would fail). - Option Merging: Automatically handles the merging of multiple Options to avoid SQL syntax errors.
public class SqlTaggingInterceptor : DbCommandInterceptor {
private static readonly RegexOptions regexOptions = RegexOptions.Multiline | RegexOptions.IgnoreCase;
// Improved Regex: Supports Schema (e.g., [dbo].[Table])
// Theoretically, table names shouldn't have spaces, but it's technically allowed, so we handle it here
private static readonly Regex tableAliasRegex = new(
@"(?<tableAlias>(?:FROM|JOIN)\s+(?:\[[^\]]+\]\.)?\[[^\]]+\]\s+AS\s+\[[^\]]+\])(?!\s+WITH\s*\()",
regexOptions
);
// Intercept Reader and Scalar (Count/Any must also be handled!)
public override InterceptionResult<DbDataReader> ReaderExecuting(
DbCommand command, CommandEventData eventData, InterceptionResult<DbDataReader> result
) {
FixCommand(command);
return base.ReaderExecuting(command, eventData, result);
}
// ... (Async and Scalar methods omitted, remember to call FixCommand in all of them)
private static void FixCommand(DbCommand command) {
if (string.IsNullOrWhiteSpace(command.CommandText)) {
return;
}
string text = command.CommandText;
bool isChanged = false;
// 1. Handle Table Hints (NOLOCK / READPAST)
string hintToApply = null;
if (text.Contains(EfHintExtensions.TagNoLock)) {
hintToApply = "WITH (NOLOCK)";
}
else if (text.Contains(EfHintExtensions.TagReadPast)) {
hintToApply = "WITH (READPAST)";
}
if (hintToApply != null) {
text = tableAliasRegex.Replace(text, $"${{tableAlias}} {hintToApply}");
isChanged = true;
}
// 2. Handle Query Options
List<string> options = new ();
if (text.Contains(EfHintExtensions.TagOptimizeForUnknown)) {
options.Add("OPTIMIZE FOR UNKNOWN");
}
if (text.Contains(EfHintExtensions.TagRecompile)) {
options.Add("RECOMPILE");
}
if (options.Count > 0) {
text = text.TrimEnd().TrimEnd(';');
text += $" OPTION ({string.Join(", ", options)});";
isChanged = true;
}
if (isChanged) {
command.CommandText = text;
}
}
}Actual Usage
Now, you can control the behavior of every query without worrying about breaking other SQL statements:
// Report: Allow dirty reads + solve parameter sniffing
List<Order> orders = context.Orders
.WithNoLock()
.WithOptimizeForUnknown()
.ToList();
// Queue: Skip locked rows
Job job = context.Jobs
.WithReadPast()
.FirstOrDefault();Conclusion
Whether it's WITH (NOLOCK) or Option for parameter sniffing, there are now more suitable ways to handle these from the database side, such as enabling RCSI or using Query Store. If hardware and operations can keep up (I/O performance and TempDB space monitoring), especially if you have a DBA, I recommend using them. In most read-heavy, write-light, query-oriented systems, RCSI is a reasonable default. It fundamentally solves read-write conflicts without needing to add hints everywhere in the code. If the goal is to solve Parameter Sniffing, Query Store, introduced since SQL Server 2016, can also provide better execution plan management and forcing mechanisms.
Of course, all of this assumes that hardware resources (TempDB space and I/O) are sufficient, and it is best to have a DBA assist with evaluation and monitoring. But in modern cloud environments or with SSD equipment, these costs are usually worth the investment.
Extended Discussion: Reflections on Global Interception and Invisible Mechanisms
In the process of discussing RCSI and TagWith, one controversy is inevitable: Should you use a framework to perform global processing?
Regarding the AI host's opinion that "you shouldn't start from a global level" or "you shouldn't use invisible black magic," I hold a different view.
Whether it's Entity Framework or ASP.NET Core, the framework itself does many similar things. When someone criticizes that "modifying SQL from an interceptor makes the generated SQL different from what the developer expects," do developers really think they can fully master every single SQL statement generated by EF?
In fact, more people might only understand a portion of the mechanism (myself included), and some might not even know what the framework is handling behind the scenes (e.g., the impact of Parameter Sniffing).
For example:
ASP.NET Core Case Conversion: In the early days of ASP.NET Framework, JSON property names matched C# DTOs by default (PascalCase), which felt awkward for front-end developers accustomed to camelCase. Therefore, ASP.NET Core changed the default to force conversion to camelCase. This is a classic example of framework-level global processing.
Model Validation: In the past, we had to manually check
ModelState.IsValidin Actions, but later, ASP.NET Core Web API even started intercepting validation-failed requests and returning 400 by default, without the developer writing a single line of code.
So, in many cases, performing global processing from the framework is actually the correct choice. The issue is not "whether it's global," but which approach better reduces the Cognitive Load of the development team.
The Trade-off Between Cognitive Load and Default Behavior
AsNoTracking() is also a great example. Data retrieved this way has better performance because it lacks some tracking information. Theoretically, we would advise teams to always add AsNoTracking() to "queries that don't require updates."
However, in practice, we easily see two chaotic situations:
- Should have added it but didn't: Leading to failure when trying to Update later (because it was tracked).
- Shouldn't have added it but did: Leading to an inability to update, resulting in people who don't understand the situation manually changing
EntityStateto force an update, creating even more chaotic code.
Instead of relying on the developer's "self-discipline" or "memory," it is better to make a split in the architecture: separate Query Service and Repository. Any data coming from the Query Service should have AsNoTracking() by default. This way, when the team uses the Query Service, they don't have to think about "whether to add NoTracking," because the architecture has already made the safest default decision for you.
This is the point I want to emphasize: Look at which scenario is more common, which is easier to distinguish, and which carries the lowest risk if a judgment error occurs.
We can evaluate "whether to do global processing" from three dimensions:
Which scenario is more common? This depends on the project characteristics. If it's a reporting system, 90% of queries might need
NOLOCK; if it's a transaction system, 90% might not.Which is easier to distinguish? Sometimes, having a global default enabled makes it easier to distinguish.
- Default off Developers have to decide for every query, "Can this query be dirty-read?" This means they must deeply understand the business logic, which has a higher judgment cost.
- Default on Developers only need to decide, "Does this query absolutely forbid dirty reads (e.g., deducting inventory)?" Relatively speaking, these high-risk operations are easier to identify, and the threshold for misjudgment is lower.
Which carries the lowest risk if a judgment error occurs?
- Default off: Usually the lowest risk (everyone agrees on this).
- Default on + Misjudgment: Data inconsistency, transaction errors (serious).
- Default off + Misjudgment: Queries become slow, potential deadlocks (usually tolerable).
Although from point 3, "default off" is the lowest risk and the most conservative choice in most cases, if you combine point 1 (business characteristics) and point 2 (difficulty of distinction), some projects might find that "default global processing" actually significantly reduces the team's development burden and mental cost.
Regarding whether to use "explicit declaration" or "implicit black magic," the point is not the technical means itself, but whether it can effectively reduce the team's cognitive load and align with the project's risk tolerance. Discussing Anti-Patterns outside of specific contexts is often just another form of dogmatism.
Change Log
- 2026-05-29
- Renamed the extension method
WithOptionUnknowntoWithOptimizeForUnknownto match the sample project. - Added the corresponding GitHub sample project link.
- Renamed the extension method
- 2026-02-05 Initial document creation.