A Brief Discussion on Handling Database String Types
TLDR
- Common strategies for handling database strings include: distinguishing between
NULLand empty strings, storingNULLexclusively, or setting columns toNOT NULLand storing empty strings. - Distinguishing between
NULLand empty strings offers better semantic expression but lacks enforcement, requiring careful handling by developers. - Oracle databases automatically convert empty strings to
NULL, ensuring mandatory consistency. - Using
NOT NULLwith empty strings avoidsIS NULLsyntax issues and reduces unexpected results caused byNULLin conditional operations. NULLcan lead to index performance issues or abnormalNOT INquery results; using empty strings mitigates these risks.- It is recommended to set string columns to
NOT NULLand store empty strings by default, unless otherwise specified, while keeping in mind that Foreign Key columns must allowNULL.
Distinguishing Between NULL and Empty Strings
When is this encountered: When business logic requires a clear distinction between "data never entered" and "an empty value was entered."
This approach defines NULL as data that has never been entered, while an empty string represents a value that has been entered but is empty. The advantage is clear semantics, but the disadvantage is the lack of database-level enforcement; developers must strictly control the usage scenarios for both in their code.
Storing NULL Exclusively
When is this encountered: When using an Oracle database, or when you want to enforce a unified format at the database level.
Oracle databases automatically convert empty strings to NULL upon storage, ensuring data consistency. If this strategy is adopted in other database systems, it usually requires a CHECK constraint for enforcement.
Setting NOT NULL and Storing Empty Strings
When is this encountered: To simplify SQL query logic, avoid NULL-related calculation pitfalls, or optimize index performance.
The advantages of this strategy are as follows:
- Enforcement: Setting a column to
NOT NULLforces users to store only empty strings. - Index Performance: In some databases, index processing for empty strings is more efficient than for
NULL. Note thatNULLsometimes cannot effectively utilize indexes; for example, in SQL Server, filtered indexes may not function when using theIS NULLpredicate. - Simplified Queries: Avoids using
IS NULLchecks in SQL, reducing the complexity of auto-generated SQL. - Avoiding Calculation Anomalies: Prevents
NULLfrom producing unexpected results inNOT INorAND/ORconditional operations.
The disadvantage is that when a Foreign Key is of a string type, it must store NULL if there is no value, leading to inconsistent handling logic compared to other string columns.
Recommended Practice
In the absence of specific framework or team guidelines, it is recommended to set string columns to NOT NULL and store empty strings to avoid the logical risks associated with NULL. If you encounter a string-type Foreign Key, you must compromise and use NULL, while trying to avoid using strings as Foreign Keys whenever possible.
Change Log
- 2024-07-12 Initial version created.
