Solr MaxGram Changes Not Taking Effect (Managed-Schema Mode)

Overview

You may observe Solr search behavior where results ignore characters beyond the first ~18 characters, even after increasing maxGramSize in schema.xml and rebuilding the index. A common symptom is: "it only sees the first 18 characters. It doesn't care about anything else after that."

This occurs when Solr is running in managed-schema mode: direct edits to schema.xml are not applied because Solr loads schema definitions from a dynamically managed schema (for example, managed-schema.xml). The mitigation is to update the affected field type via the Solr Schema API, verify the updated maxGramSize, and then rebuild/reindex so the new analyzer settings are applied.

Solution

Issue / Symptom

Search results appear limited to matching only the first ~18 characters of a filename (or similar text), even after setting maxGramSize to a higher value (for example, 50) in schema.xml and rebuilding the Solr index.

Observed Symptom Text (Verbatim)

"it only sees the first 18 characters. It doesn't care about anything else after that."
"It's always 18 characters."

Root Cause

Solr runs in managed-schema mode by default. In this mode, edits to schema.xml are not applied, because Solr reads field type and analyzer definitions from a dynamically managed schema file (for example, managed-schema.xml). As a result, the active NGramTokenizerFactory configuration (including maxGramSize) remains unchanged when only schema.xml is edited.
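
One way to confirm which mode a core uses is to inspect the schemaFactory element in its solrconfig.xml. The sketch below was not part of the original troubleshooting. schema_mode is a hypothetical helper name and the file path is a placeholder. It assumes that a configuration without an explicit ClassicIndexSchemaFactory uses the managed schema, which is the default in recent Solr releases. It does a plain text match, so a commented-out element would be misreported.

```shell
# Hypothetical helper: report which schema mode a solrconfig.xml implies.
# Usage: schema_mode /path/to/conf/solrconfig.xml
schema_mode() {
  config_file="$1"
  if grep -q 'ClassicIndexSchemaFactory' "$config_file"; then
    # schema.xml is read directly; manual edits take effect after a reload.
    echo "classic"
  else
    # Explicit ManagedIndexSchemaFactory, or no schemaFactory element at all
    # (assumed to default to the managed schema).
    echo "managed"
  fi
}
```

If the helper prints "managed", edits to schema.xml will not be picked up and the Schema API procedure in this article applies.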

Resolution (Update MaxGram via Solr Schema API)

Use the Schema API to update the field type that controls the analyzer (example shown for field type text_en_ext), then reindex.

Step 1: Verify Current Field Type Settings

Replace placeholders for your Solr host/port and core/collection as needed.

curl "http://127.0.0.1:<solr_port>/solr/<core_or_collection>/schema/fieldtypes/text_en_ext"

Step 2: Back Up Existing Documents (as used during troubleshooting)

curl "http://127.0.0.1:<solr_port>/solr/<core_or_collection>/select?q=*:*"
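
The command above prints its response to the terminal, and because /select returns only 10 rows by default, it shows at most the first 10 documents. It also returns only stored fields. A minimal sketch of a fuller export follows. build_backup_url is a hypothetical helper name, and the row limit and the output file name solr_backup.json are assumptions to adapt to your environment.

```shell
# Hypothetical helper: build a /select URL that requests up to max_rows
# documents as JSON. base_url is the core/collection URL without a trailing slash.
build_backup_url() {
  base_url="$1"
  max_rows="$2"
  printf '%s/select?q=*:*&rows=%s&wt=json' "$base_url" "$max_rows"
}

# Example usage (not executed here; replace the placeholders first):
# curl -o solr_backup.json \
#   "$(build_backup_url "http://127.0.0.1:<solr_port>/solr/<core_or_collection>" 1000)"
```

Choose a row limit at least as large as the number of documents in the index, or the export will be incomplete.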

Step 3: Update `maxGramSize` Using the Schema API

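The following example replaces the text_en_ext field type definition and sets maxGramSize to 50. Because replace-field-type overwrites the entire definition, base the request body on the current settings returned in Step 1 and change only the values you intend to change: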

curl -X POST -H 'Content-type:application/json' \
"http://127.0.0.1:<solr_port>/solr/<core_or_collection>/schema" \
--data-binary '{
  "replace-field-type": {
    "name": "text_en_ext",
    "class": "solr.TextField",
    "positionIncrementGap": "100",
    "indexAnalyzer": {
      "charFilters": [{"class": "solr.ICUNormalizer2CharFilterFactory"}],
      "tokenizer": {"class": "solr.NGramTokenizerFactory", "minGramSize": "3", "maxGramSize": "50"},
      "filters": [
        {"class": "solr.ASCIIFoldingFilterFactory"},
        {"class": "solr.LowerCaseFilterFactory"}
      ]
    },
    "queryAnalyzer": {
      "tokenizer": {"class": "solr.KeywordTokenizerFactory"},
      "filters": [{"class": "solr.LowerCaseFilterFactory"}]
    }
  }
}'

Step 4: Verify the Change

curl "http://127.0.0.1:<solr_port>/solr/<core_or_collection>/schema/fieldtypes/text_en_ext"

Confirm the response shows "maxGramSize": "50" (or your intended value).
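
To avoid reading the full JSON response by eye, the value can be pulled out with standard text tools. This is a minimal sketch. extract_max_gram is a hypothetical helper name, and it assumes the response quotes the value as a string, as in the example request above.

```shell
# Hypothetical helper: read a Schema API JSON response on stdin and print
# each maxGramSize value it contains, one per line.
extract_max_gram() {
  # Isolate fragments such as "maxGramSize":"50", then keep only the digits.
  grep -o '"maxGramSize"[[:space:]]*:[[:space:]]*"[0-9]*"' | sed 's/[^0-9]//g'
}

# Example usage (not executed here; replace the placeholders first):
# curl -s "http://127.0.0.1:<solr_port>/solr/<core_or_collection>/schema/fieldtypes/text_en_ext" \
#   | extract_max_gram
```

An empty result means the field type's response contains no quoted maxGramSize value, which is worth checking before you reindex.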

Step 5: Rebuild / Reindex Search Data

After changing the schema/analyzer, rebuild/reindex the existing data so the new analyzer settings are applied to indexed content. (Without reindexing, existing indexed terms may still reflect the prior analyzer behavior.)

Validation

Repeat the same search scenario that previously failed (searching by a string longer than 18 characters).
Confirm that adding characters beyond the first 18 now changes the results appropriately (that is, it no longer returns matches based only on the initial 18-character prefix).
Re-confirm the field type configuration via the schema endpoint to ensure maxGramSize remains at the configured value.

Engineering / Development Notes

The behavior was reproduced in a test environment and escalated for development review. Engineering analysis confirmed the root cause was managed-schema behavior (direct schema.xml edits not being loaded), and the associated defect record was later cancelled after the engineering analysis/work completed.

Frequently Asked Questions

1. How can this issue be recognized quickly? Searches behave as if only the first ~18 characters matter; adding additional characters to a query does not narrow results. A typical symptom is: "it only sees the first 18 characters".
2. Why didn’t editing schema.xml work even after rebuilding the index? In managed-schema mode, Solr does not apply changes from schema.xml; it uses a dynamically managed schema (for example, managed-schema.xml). The analyzer configuration must be updated via the Solr Schema API.
3. What must be done after changing maxGramSize via the API? Rebuild/reindex existing content so the new analyzer settings are applied to indexed terms; then rerun the search scenario and verify the field type via /schema/fieldtypes/<fieldtype_name>.
4. Is this a product defect? The behavior was escalated for development review during investigation, but the root cause was identified as managed-schema behavior (schema.xml edits not being loaded). The associated defect record was later cancelled after engineering analysis/work completed.

Priyanka Bhotika
Posted
