Overview
You may observe Solr search behavior where results ignore characters beyond the first ~18 characters, even after increasing maxGramSize in schema.xml and rebuilding the index. A common symptom is: "it only sees the first 18 characters. It doesn't care about anything else after that."
This occurs when Solr is running in managed-schema mode: direct edits to schema.xml are not applied because Solr loads schema definitions from a dynamically managed schema (for example, managed-schema.xml). The mitigation is to update the affected field type via the Solr Schema API, verify the updated maxGramSize, and then rebuild/reindex so the new analyzer settings are applied.
Solution
Issue / Symptom
Search results appear limited to matching only the first ~18 characters of a filename (or similar text), even after setting maxGramSize to a higher value (for example, 50) in schema.xml and rebuilding the Solr index.
Observed Symptom Text (Verbatim)
"it only sees the first 18 characters. It doesn't care about anything else after that."
"It's always 18 characters."
Root Cause
Solr runs in managed-schema mode by default. In this mode, edits to schema.xml are not applied: Solr reads field type and analyzer definitions from the managed schema (for example, managed-schema.xml) instead. As a result, the active NGramTokenizerFactory configuration (including maxGramSize) remains unchanged when only schema.xml is edited.
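One quick way to see which mode a core uses is to inspect its solrconfig.xml. The sketch below assumes that a declaration of ClassicIndexSchemaFactory means schema.xml is read directly, and that anything else means the managed schema is in effect. The function name schema_mode and the file path are illustrative, and the check does not detect declarations that are commented out.

```shell
# Hypothetical helper: report the schema mode implied by a solrconfig.xml file.
# Assumption: ClassicIndexSchemaFactory appears only when classic mode is enabled.
schema_mode() {
  config=$1
  if grep -q 'ClassicIndexSchemaFactory' "$config"; then
    echo classic
  else
    echo managed
  fi
}

# Example (path is a placeholder):
#   schema_mode /path/to/<core_or_collection>/conf/solrconfig.xml
```

If the result is managed, edits to schema.xml are not picked up and the Schema API steps in this article apply.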
Resolution (Update MaxGram via Solr Schema API)
Use the Schema API to update the field type that controls the analyzer (example shown for field type text_en_ext), then reindex.
Step 1: Verify Current Field Type Settings
Replace placeholders for your Solr host/port and core/collection as needed.
curl "http://127.0.0.1:<solr_port>/solr/<core_or_collection>/schema/fieldtypes/text_en_ext"
Step 2: Backup Existing Documents (as used during troubleshooting)
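curl "http://127.0.0.1:<solr_port>/solr/<core_or_collection>/select?q=*:*"
Note: by default this query returns only the first 10 documents. To capture more, add a rows parameter (for example, &rows=<row_count>) and redirect the output to a file.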
Step 3: Update maxGramSize Using the Schema API
Example that replaces the field type definition and sets maxGramSize to 50:
curl -X POST -H 'Content-type:application/json' \
"http://127.0.0.1:<solr_port>/solr/<core_or_collection>/schema" \
--data-binary '{
"replace-field-type": {
"name": "text_en_ext",
"class": "solr.TextField",
"positionIncrementGap": "100",
"indexAnalyzer": {
"charFilters": [{"class": "solr.ICUNormalizer2CharFilterFactory"}],
"tokenizer": {"class": "solr.NGramTokenizerFactory", "minGramSize": "3", "maxGramSize": "50"},
"filters": [
{"class": "solr.ASCIIFoldingFilterFactory"},
{"class": "solr.LowerCaseFilterFactory"}
]
},
"queryAnalyzer": {
"tokenizer": {"class": "solr.KeywordTokenizerFactory"},
"filters": [{"class": "solr.LowerCaseFilterFactory"}]
}
}
}'
Step 4: Verify the Change
curl "http://127.0.0.1:<solr_port>/solr/<core_or_collection>/schema/fieldtypes/text_en_ext"
Confirm the response shows "maxGramSize": "50" (or your intended value).
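The check in Step 4 can also be done without reading the full JSON response by eye. The following is a minimal sketch that assumes the Schema API response contains maxGramSize as a key with a numeric value, as in the payload from Step 3. The function name max_gram_sizes is illustrative. It uses plain text matching rather than a JSON parser, so treat it as a convenience check only.

```shell
# Hypothetical helper: read a Schema API JSON response on stdin and print
# each maxGramSize value found, one per line.
max_gram_sizes() {
  grep -Eo '"maxGramSize"[[:space:]]*:[[:space:]]*"?[0-9]+"?' | grep -Eo '[0-9]+'
}

# Example (placeholders as in the steps above):
#   curl -s "http://127.0.0.1:<solr_port>/solr/<core_or_collection>/schema/fieldtypes/text_en_ext" | max_gram_sizes
```

If the printed value is not the one you configured, the update in Step 3 did not take effect on this core or collection.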
Step 5: Rebuild / Reindex Search Data
After changing the schema/analyzer, rebuild/reindex the existing data so the new analyzer settings are applied to indexed content. (Without reindexing, existing indexed terms may still reflect the prior analyzer behavior.)
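To see why existing documents need reindexing, it can help to look at the terms an n-gram tokenizer produces. The following is a simplified illustration, not Lucene's implementation. The hypothetical ngrams function prints every substring of the input whose length lies between a minimum and a maximum, and it ignores char filters and token filters.

```shell
# Illustrative only: print all substrings of $1 with length from $2 to $3.
ngrams() {
  text=$1
  min=$2
  max=$3
  len=${#text}
  start=0
  while [ "$start" -lt "$len" ]; do
    size=$min
    # Stop when the gram would exceed the maximum size or run past the end.
    while [ "$size" -le "$max" ] && [ $((start + size)) -le "$len" ]; do
      printf '%s\n' "$text" | cut -c "$((start + 1))-$((start + size))"
      size=$((size + 1))
    done
    start=$((start + 1))
  done
}

# Example:
#   ngrams abcd 3 4
```

In this sketch, no term longer than the maximum is ever emitted. Documents indexed under a smaller maximum therefore contain no longer grams until they are reindexed with the new setting.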
Validation
- Repeat the same search scenario that previously failed (searching by a string longer than 18 characters).
- Confirm that adding characters beyond the first 18 now changes the results appropriately (that is, it no longer returns matches based only on the initial 18-character prefix).
- Re-confirm the field type configuration via the schema endpoint to ensure maxGramSize remains at the configured value.
Engineering / Development Notes
The behavior was reproduced in a test environment and escalated for development review. Engineering analysis confirmed the root cause was managed-schema behavior (direct schema.xml edits not being loaded), and the associated defect record was later cancelled after the engineering analysis/work completed.
Frequently Asked Questions
- 1. How can this issue be recognized quickly?
- Searches behave as if only the first ~18 characters matter; adding additional characters to a query does not narrow results. A typical symptom is: "it only sees the first 18 characters".
- 2. Why didn’t editing schema.xml work even after rebuilding the index?
- In managed-schema mode, Solr does not apply changes from schema.xml; it uses a dynamically managed schema (for example, managed-schema.xml). The analyzer configuration must be updated via the Solr Schema API.
- 3. What must be done after changing maxGramSize via the API?
- Rebuild/reindex existing content so the new analyzer settings are applied to indexed terms; then rerun the search scenario and verify the field type via /schema/fieldtypes/<fieldtype_name>.
- 4. Is this a product defect?
- The behavior was escalated for development review during investigation, but the root cause was identified as managed-schema behavior (schema.xml edits not being loaded). The associated defect record was later cancelled after engineering analysis/work completed.
Priyanka Bhotika