On this page

Skip to content

A Brief Discussion on Handling Database String Types

TLDR

  • Common strategies for handling database strings include: distinguishing between NULL and empty strings, storing NULL exclusively, or setting columns to NOT NULL and storing empty strings.
  • Distinguishing between NULL and empty strings offers better semantic expression but lacks enforcement, requiring careful handling by developers.
  • Oracle databases automatically convert empty strings to NULL, ensuring mandatory consistency.
  • Using NOT NULL with empty strings avoids IS NULL syntax issues and reduces unexpected results caused by NULL in conditional operations.
  • NULL can lead to index performance issues or abnormal NOT IN query results; using empty strings mitigates these risks.
  • It is recommended to set string columns to NOT NULL and store empty strings by default, unless otherwise specified, while keeping in mind that Foreign Key columns must allow NULL.

Distinguishing Between NULL and Empty Strings

When is this encountered: When business logic requires a clear distinction between "data never entered" and "an empty value was entered."

This approach defines NULL as data that has never been entered, while an empty string represents a value that has been entered but is empty. The advantage is clear semantics, but the disadvantage is the lack of database-level enforcement; developers must strictly control the usage scenarios for both in their code.

Storing NULL Exclusively

When is this encountered: When using an Oracle database, or when you want to enforce a unified format at the database level.

Oracle databases automatically convert empty strings to NULL upon storage, ensuring data consistency. If this strategy is adopted in other database systems, it usually requires a CHECK constraint for enforcement.

Setting NOT NULL and Storing Empty Strings

When is this encountered: To simplify SQL query logic, avoid NULL-related calculation pitfalls, or optimize index performance.

The advantages of this strategy are as follows:

  • Enforcement: Setting a column to NOT NULL forces users to store only empty strings.
  • Index Performance: In some databases, index processing for empty strings is more efficient than for NULL. Note that NULL sometimes cannot effectively utilize indexes; for example, in SQL Server, filtered indexes may not function when using the IS NULL predicate.
  • Simplified Queries: Avoids using IS NULL checks in SQL, reducing the complexity of auto-generated SQL.
  • Avoiding Calculation Anomalies: Prevents NULL from producing unexpected results in NOT IN or AND/OR conditional operations.

The disadvantage is that when a Foreign Key is of a string type, it must store NULL if there is no value, leading to inconsistent handling logic compared to other string columns.

Recommended Practice

In the absence of specific framework or team guidelines, it is recommended to set string columns to NOT NULL and store empty strings to avoid the logical risks associated with NULL. If you encounter a string-type Foreign Key, you must compromise and use NULL, while trying to avoid using strings as Foreign Keys whenever possible.

Change Log

  • 2024-07-12 Initial version created.