A Brief Discussion on C#'s GetHashCode()
Last week, while a colleague was sharing information about Bloom Filters, it reminded me of key comparisons in Dictionary, as both rely on hash values to quickly determine if an element exists. Of course, while both use an array structure and hash values as indices, their specific implementations differ.
GetHashCode()
GetHashCode() is a method used to generate an integer hash value, which is primarily used to compare whether object values are equal. Implementing GetHashCode() must adhere to the following principles:
- If two objects are considered equal (
Equals()returnstrue),GetHashCode()must return the same value. - If two objects are not equal,
GetHashCode()does not necessarily need to return different values. GetHashCode()should not throw exceptions.
The GetHashCode() method is commonly used in the following scenarios:
IDictionary<TKey, TValue>andDictionaryimplementation classes: Such asDictionary<TKey, TValue>andHashTable, these classes useGetHashCode()for fast key comparison.ISet<T>implementation classes: Such asHashSet<T>, which usesGetHashCode()to determine element uniqueness.- LINQ's
Distinct()method: UsesGetHashCode()to remove duplicate elements.
If you override Equals() but do not override GetHashCode(), these classes may still consider objects unequal even if they are equal according to Equals(), leading to incorrect behavior. In fact, when you only override Equals(), Visual Studio will display the following warning.

Test using the following code:
Test test1 = new() {
Name = "王大明",
Age = 10
};
Test test2 = new() {
Name = "王大明",
Age = 10
};
Dictionary<Test, string> dic = new() {
[test1] = "測試"
};
Console.WriteLine(test1.Equals(test2));
Console.WriteLine(dic.ContainsKey(test2));
public class Test {
public string Name { get; set; }
public int Age { get; set; }
public override bool Equals(object obj) => obj is Test test && Name == test.Name && Age == test.Age;
}The results are as follows:
True
FalseImplementation of GetHashCode()
The simplest way to implement it is to use Visual Studio's refactoring feature. Take the following code as an example:
public class Test {
public string Name { get; set; }
public int Age { get; set; }
}Click on the Test class, press ALT + Enter to open the refactoring options, and select "Generate Equals and GetHashCode...".

Select the members to be used for Equals() evaluation. You can also choose to implement IEquatable<T> and operators (== and !=) simultaneously; we will leave these extra items unchecked for now:

The generated code is as follows:
public class Test {
public string Name { get; set; }
public int Age { get; set; }
public override bool Equals(object obj) => obj is Test test && Name == test.Name && Age == test.Age;
public override int GetHashCode() => HashCode.Combine(Name, Age);
}However, since HashCode.Combine() was added in .NET Core 2.1, it is not supported in .NET Framework. For .NET Framework versions, the generated code might look like this:
public class Test {
public string Name { get; set; }
public int Age { get; set; }
public override bool Equals(object obj) => obj is Test test && Name == test.Name && Age == test.Age;
public override int GetHashCode() {
int hashCode = -1360180430;
hashCode = hashCode * -1521134295 + EqualityComparer<string>.Default.GetHashCode(Name);
hashCode = hashCode * -1521134295 + Age.GetHashCode();
return hashCode;
}
}Code explanation:
-1360180430and-1521134295are constants used to initialize and combine hash codes to produce a more uniformly distributed hash value. These constants are usually large prime numbers, as prime numbers help reduce the probability of collisions in hash functions.- -1360180430: This is the initial hash code value. By using a non-zero starting value (usually a prime number), it ensures that the object's hash code differs from the default value and that it does not return zero as a hash code even if the object has no properties.
- -1521134295: This is the multiplier used to combine hash values. Using a prime multiplier to mix the hash values of individual fields increases the randomness of the result, thereby reducing the probability of hash collisions.
- The reason for using
EqualityComparer<T>.Defaultto calculate hash values for reference objects:- To avoid
nullvalues; when the value isnull, it returns0. - To ensure that the same hash calculation rules are used across objects of different classes.
- To avoid
Of course, this .NET Framework version is not easy to memorize. For .NET Framework versions that do not support HashCode.Combine(), some developers might use anonymous objects to obtain GetHashCode(). Naturally, this approach might slightly impact performance due to the creation of temporary anonymous objects.
public override int GetHashCode() => new { Name, Age }.GetHashCode();Implementation of ContainsKey(TKey key)
I will briefly explain this here. If you want to know the complete implementation, the .NET 8 source code is excerpted below for your own study.
The ContainsKey() method internally uses the FindValue() method to search for the existence of the specified TKey. If the corresponding value is found, it means the TKey exists in the Dictionary<TKey, TValue>. Dictionary has two main array fields:
_entries: TypeEntry[], whereEntrycontainsTKey,TValue, and the index of the next entry with the same bucket value._buckets: Typeint[], which stores the mapping table between each bucket value (the value calculated from theTKey's HashCode after modulo operations) and the corresponding entry index.
The FindValue() implementation process is as follows:
- Calculate the HashCode of the passed
keyparameter. This HashCode is used to quickly locate the corresponding entry index in the_bucketsarray. Through this index, the corresponding entry can be retrieved directly from the_entriesarray, avoiding sequential searching and improving search efficiency. - First, compare the
TKey's HashCode with the HashCode stored in the entry. This allows for quick filtering of non-matching items, reducing the number of calls to theEquals()method. In most cases, the computational cost ofEquals()is higher than that of Hash comparison, so this further improves performance. - Since different objects may have the same HashCode, and the index in
_bucketsis the result of a calculation, theentryindex obtained from_bucketsmight not necessarily point to the desired item. If a mismatch is found,entry.nextis used to point to the index of the next item with the same bucket, continuing the comparison until a match is found or all possible items have been checked.
The following is the excerpted implementation of FindValue() in .NET 8:
internal ref TValue FindValue(TKey key) {
if (key == null) {
ThrowHelper.ThrowArgumentNullException(ExceptionArgument.key);
}
ref Entry entry = ref Unsafe.NullRef<Entry>();
if (_buckets != null) {
Debug.Assert(_entries != null, "expected entries to be != null");
IEqualityComparer<TKey>? comparer = _comparer;
if (typeof(TKey).IsValueType && // comparer can only be null for value types; enable JIT to eliminate entire if block for ref types
comparer == null) {
uint hashCode = (uint)key.GetHashCode();
int i = GetBucket(hashCode);
Entry[]? entries = _entries;
uint collisionCount = 0;
// ValueType: Devirtualize with EqualityComparer<TKey>.Default intrinsic
i--; // Value in _buckets is 1-based; subtract 1 from i. We do it here so it fuses with the following conditional.
do {
// Should be a while loop https://github.com/dotnet/runtime/issues/9422
// Test in if to drop range check for following array access
if ((uint)i >= (uint)entries.Length) {
goto ReturnNotFound;
}
entry = ref entries[i];
if (entry.hashCode == hashCode && EqualityComparer<TKey>.Default.Equals(entry.key, key)) {
goto ReturnFound;
}
i = entry.next;
collisionCount++;
} while (collisionCount <= (uint)entries.Length);
// The chain of entries forms a loop; which means a concurrent update has happened.
// Break out of the loop and throw, rather than looping forever.
goto ConcurrentOperation;
} else {
Debug.Assert(comparer is not null);
uint hashCode = (uint)comparer.GetHashCode(key);
int i = GetBucket(hashCode);
Entry[]? entries = _entries;
uint collisionCount = 0;
i--; // Value in _buckets is 1-based; subtract 1 from i. We do it here so it fuses with the following conditional.
do {
// Should be a while loop https://github.com/dotnet/runtime/issues/9422
// Test in if to drop range check for following array access
if ((uint)i >= (uint)entries.Length) {
goto ReturnNotFound;
}
entry = ref entries[i];
if (entry.hashCode == hashCode && comparer.Equals(entry.key, key)) {
goto ReturnFound;
}
i = entry.next;
collisionCount++;
} while (collisionCount <= (uint)entries.Length);
// The chain of entries forms a loop; which means a concurrent update has happened.
// Break out of the loop and throw, rather than looping forever.
goto ConcurrentOperation;
}
}
goto ReturnNotFound;
ConcurrentOperation:
ThrowHelper.ThrowInvalidOperationException_ConcurrentOperationsNotSupported();
ReturnFound:
ref TValue value = ref entry.value;
Return:
return ref value;
ReturnNotFound:
value = ref Unsafe.NullRef<TValue>();
goto Return;
}Change Log
- 2024-09-01 Initial document creation.
