On this page

Skip to content

A Brief Introduction to C#'s GetHashCode()

TLDR

  • When overriding Equals(), you must also override GetHashCode(); otherwise, hash-based collections like Dictionary or HashSet will fail to correctly identify equal objects.
  • Core principle of GetHashCode(): Equal objects must return the same hash code; unequal objects are not strictly required to return different values; the method itself should not throw exceptions.
  • In .NET Core 2.1+ environments, it is recommended to use HashCode.Combine() to implement GetHashCode().
  • During Dictionary lookups, the HashCode is compared first, followed by Equals(). This mechanism effectively improves performance and reduces unnecessary object comparisons.

Implementation Principles and Importance of GetHashCode()

When do you encounter this issue: When a custom class is used as a Dictionary key or stored in a HashSet, overriding Equals() without also handling GetHashCode() will cause the collection to fail to correctly determine if objects are duplicates.

GetHashCode() is used to generate an integer hash value and is the foundation for methods like Dictionary<TKey, TValue>, HashSet<T>, and LINQ's Distinct(). When implementing it, you must adhere to the following principles:

  • If two objects are determined to be equal by Equals(), GetHashCode() must return the same value.
  • If two objects are not equal, GetHashCode() is not strictly required to return different values (hash collisions are allowed).
  • GetHashCode() should not throw exceptions.

If these principles are not followed, even logically equal objects will be treated as distinct entities within a collection. Below is an incorrect example:

csharp
public class Test {
    public string Name { get; set; }
    public int Age { get; set; }

    public override bool Equals(object obj) => obj is Test test && Name == test.Name && Age == test.Age;
}

// Test results
Test test1 = new() { Name = "王大明", Age = 10 };
Test test2 = new() { Name = "王大明", Age = 10 };
Dictionary<Test, string> dic = new() { [test1] = "測試" };

Console.WriteLine(test1.Equals(test2)); // True
Console.WriteLine(dic.ContainsKey(test2)); // False

Implementation Recommendations

When do you encounter this issue: During development, when you need to manually implement GetHashCode() to ensure the correctness of object comparisons.

Depending on the .NET version, the following approaches are recommended:

  • .NET Core 2.1 and later: Use HashCode.Combine(), which is the most concise and performant way.

    csharp
    public override int GetHashCode() => HashCode.Combine(Name, Age);
  • .NET Framework versions: Since HashCode.Combine() is not supported, it is recommended to manually combine field hash values and use prime numbers as multipliers to reduce collisions.

    csharp
    public override int GetHashCode() {
        int hashCode = -1360180430;
        hashCode = hashCode * -1521134295 + EqualityComparer<string>.Default.GetHashCode(Name);
        hashCode = hashCode * -1521134295 + Age.GetHashCode();
        return hashCode;
    }

TIP

Using EqualityComparer<T>.Default to calculate hash values properly handles null values and ensures consistent calculation rules across different classes.

Dictionary Search Mechanism and Performance Optimization

When do you encounter this issue: When you need to understand why Dictionary maintains high performance during comparisons even when processing large amounts of data.

Dictionary internally uses two array structures, _buckets and _entries, to store data. The workflow of its FindValue() search method is as follows:

  1. Calculate Hash Value: First, calculate the HashCode of the Key to locate the index in _buckets, allowing direct access to the corresponding _entries and avoiding a global search.
  2. Preliminary Filtering: Compare whether the HashCodes are identical. If the HashCodes differ, they are immediately determined to be unequal, avoiding the need to call the more costly Equals() method.
  3. Handle Collisions: If the HashCodes are the same but the object contents differ (hash collision), compare the objects in the chain one by one using the entry.next pointer until a match is found.

This mechanism ensures that in most cases, collection operations can be completed with a time complexity close to O(1).

Change Log

  • 2024-09-01 Initial documentation created.