Considerations for Elasticsearch Dynamic Field Mapping

I expect to be grappling with Elasticsearch until the end of this year, unless I get tired of it and find something simpler to write about. That said, I probably won't have much spare time for that, as my weight-loss progress in September wasn't as good as in the previous two months, so I might increase my exercise time.

Misconceptions About Dynamic Field Mapping

I have always had a stereotype that relational databases require a pre-defined schema, while NoSQL databases do not and can store data dynamically. As it turns out, I stumbled when I first encountered Elasticsearch.

First of all, although Elasticsearch supports Dynamic Mapping, it is actually not recommended to use it in a production environment. The reasons are as follows:

1. String Types Can Cause Storage Bloat

When using dynamic mapping, string types are stored by default as both a text type and a keyword type sub-field.

The text type performs tokenization and builds an inverted index, while the keyword type stores the full string for exact matching, sorting, and aggregation. This dual indexing significantly increases storage space usage.

Of course, if storage space is sufficient, using multi-fields to index the same field with multiple types to meet different query requirements is a common technique. This is a design choice that trades storage space for functional flexibility; for details, refer to Multi-fields.
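To make the dual indexing concrete, here is a sketch of the mapping that dynamic mapping produces for a new string field, next to an explicit keyword-only alternative. The field name `title` is just an illustrative example.

```python
# What dynamic mapping generates for a new string field: a `text` field
# plus a `keyword` sub-field, so the value is indexed twice.
dynamic_result = {
    "title": {
        "type": "text",
        "fields": {
            # `ignore_above: 256` is the default dynamic mapping applies:
            # strings longer than 256 characters are not indexed as keyword.
            "keyword": {"type": "keyword", "ignore_above": 256}
        }
    }
}

# An explicit alternative when the field is only used for exact matching,
# sorting, or aggregation: a single keyword field, no tokenized copy.
explicit_keyword_only = {
    "title": {"type": "keyword"}
}
```

If you instead only ever run full-text queries on the field, the reverse trim (a `text` field with no `keyword` sub-field) saves the same kind of space.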

2. Not All Features Support Dynamic Mapping

Not all types can be automatically handled via dynamic mapping. For example:

  • Geospatial fields: If you want to use geo_point or geo_shape related geospatial query APIs, you must define them in the mapping beforehand. Even if you store JSON data that matches a geographic structure (e.g., {"lat": 25.03, "lon": 121.56}), if it is not pre-defined as a geo_point type, Elasticsearch will treat it as a standard object, making it impossible to use geospatial query functions like geo_distance.

  • Nested objects: If you need to query objects within an array independently, you must use the nested type. Dynamic mapping will only create them as an object type, which flattens the object fields in the array and loses the association between fields of the same element, so a query can incorrectly match values drawn from different objects in the array.

  • Custom Analyzers: If you need specific text analysis methods (such as Chinese tokenization, synonym processing, etc.), you must explicitly specify the analyzer in the mapping; dynamic mapping will only use the default standard analyzer.

For related information, please refer to Field data types.
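The three cases above can only be expressed up front in an explicit mapping. Here is a sketch of one; the field names (`location`, `comments`, `description`) are illustrative, and `smartcn` stands in for whatever Chinese analyzer plugin you actually install.

```python
mappings = {
    "properties": {
        # Must be declared explicitly; otherwise {"lat": ..., "lon": ...}
        # is mapped as a plain object and geo_distance queries are unavailable.
        "location": {"type": "geo_point"},
        # `nested` keeps each array element as an independent hidden document,
        # so a query can require `author` and `body` to match within the
        # same element rather than across different ones.
        "comments": {
            "type": "nested",
            "properties": {
                "author": {"type": "keyword"},
                "body": {"type": "text"}
            }
        },
        # Dynamic mapping would assign the default standard analyzer here;
        # a custom analyzer must be named explicitly.
        "description": {"type": "text", "analyzer": "smartcn"}
    }
}
```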

3. The Risk of Mapping Explosion

The official Elasticsearch documentation specifically warns about the Mapping explosion issue. If you use dynamic mapping and the data source contains a large number of different field names (e.g., user-defined fields, dynamically generated keys), it may lead to:

  • An explosive growth in the number of fields in the index.
  • A significant increase in memory usage.

By default, an index can have at most 1000 fields (controlled by the index.mapping.total_fields.limit setting); once the limit is reached, any document that would add a new field is rejected.
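If you genuinely need more fields, the ceiling can be raised via index settings rather than silently relying on dynamic growth. A minimal sketch of the settings body (the value 2000 is just an example):

```python
# Index settings fragment raising the field-count ceiling.
# The default for index.mapping.total_fields.limit is 1000.
settings = {
    "index.mapping.total_fields.limit": 2000
}
```

Raising the limit treats the symptom, not the cause; restructuring user-defined keys into a single nested or flattened field is usually the better fix.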

4. Official Recommendation: Use Explicit Mapping

The official Elasticsearch documentation recommends using Explicit mapping to specify the data type for each field. This is the recommended practice for production environments because you can fully control how data is indexed to suit specific use cases. For related instructions, refer to Mapping.
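As a sketch, an explicit mapping for a hypothetical product index might look like the following; all field names and types here are illustrative choices, not requirements.

```python
create_body = {
    "mappings": {
        "properties": {
            "sku": {"type": "keyword"},          # exact match / aggregations
            "name": {"type": "text"},            # full-text search only
            # scaled_float stores the price as a long scaled by 100,
            # a common explicit choice for currency values.
            "price": {"type": "scaled_float", "scaling_factor": 100},
            "created_at": {"type": "date"}
        }
    }
}
```

Each field gets exactly one representation suited to how it is queried, which is the control dynamic mapping gives up.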

Dynamic Mapping Type Conversion Rules

The following table shows the type mapping rules for Elasticsearch under different dynamic settings. For detailed explanations, please refer to Dynamic field mapping:

| JSON data type | Elasticsearch type ("dynamic": "true") | Elasticsearch type ("dynamic": "runtime") |
| --- | --- | --- |
| null | No field added | No field added |
| true or false | boolean | boolean |
| double | float | double |
| long | long | long |
| object | object | No field added |
| array | Depends on the first non-null value in the array | Depends on the first non-null value in the array |
| string that passes date detection | date | date |
| string that passes numeric detection | float or long | double or long |
| string that passes neither date nor numeric detection | text with a .keyword sub-field | keyword |
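As a rough illustration, the "dynamic": "true" column can be approximated by a small Python function. This is a simplification for intuition only: date and numeric string detection are not modeled here.

```python
def dynamic_true_type(value):
    """Approximate the '"dynamic": "true"' column of the table above."""
    if value is None:
        return None  # no field added
    # bool must be checked before int: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "text (+ .keyword sub-field)"
    if isinstance(value, list):
        # The array's type is decided by its first non-null element.
        first = next((v for v in value if v is not None), None)
        return dynamic_true_type(first)
    if isinstance(value, dict):
        return "object"
    return None
```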

Dynamic Parameter Settings

The dynamic parameter controls whether new fields are added dynamically and accepts the following options:

true (Default)

New fields are automatically added to the mapping. Suitable for rapid testing during the development phase, but not recommended for production environments.

runtime

New fields are added to the mapping as runtime fields. These fields are not indexed but are loaded from _source and calculated on the fly during queries. The advantage is that they do not consume index space; the disadvantage is that query performance is lower, making them suitable for fields that are not queried often but are occasionally needed.

false

New fields are ignored: they are not indexed or searchable and are not added to the mapping, but they still appear in the _source field of returned documents. To make such a field searchable later, you must add it to the mapping explicitly. This setting prevents mapping explosion while preserving the original data intact.

strict

If a new field is detected, an exception is thrown and the document is rejected. New fields must be explicitly added to the mapping before they can be used. This is the strictest setting, suitable for scenarios that require strict control over data structure.

For more detailed explanations, please refer to Dynamic mapping.
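One detail worth knowing: the dynamic parameter can also be set per object, so different parts of a document can follow different policies. A sketch, with illustrative field names:

```python
mappings = {
    # Top level: unknown fields cause the whole document to be rejected.
    "dynamic": "strict",
    "properties": {
        "user_id": {"type": "keyword"},
        "labels": {
            "type": "object",
            # Override for this sub-object only: user-supplied labels
            # may still add new fields dynamically.
            "dynamic": "true"
        }
    }
}
```

Inner objects inherit the parent's dynamic setting unless they override it, so a strict top level with a small dynamic pocket is a common compromise.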

Conclusion

Although Elasticsearch's dynamic mapping feature seems convenient, it is recommended to plan your schema in advance and explicitly define the types for each field in production environments. This allows you to make the best trade-offs between storage space, query performance, and functional requirements, avoiding the trouble of re-indexing later.

Change Log

  • 2025-10-04 Initial document creation.