Closed
Labels
i18n-tracker: Group bringing to attention of Internationalization, or tracked by i18n but not needing response.
Description
The Web Sustainability Guidelines (WSG) implicitly rely on robust text matching for core functionality, including user search (Guideline 2.8) and machine processing of metadata and code (Guidelines 3.5 and 3.11). Failing to define global text matching rules can lead to unpredictable search results, validation errors, and a poor user experience for non-English content. The relevant guidelines are:
- 2.8 Ensure that navigation and wayfinding are well-structured
- 3.5 Avoid redundancy and duplication in code
- 3.11 Structure metadata for machine readability
- 3.16 Use dependencies appropriately and ensure maintenance
It would be useful to strengthen the relevant sections (especially Section 3: Web Development) so that they direct implementers to consider the following implications of global text matching:
1. Unicode Normalization for Identifiers and Syntax
- Problem: Two strings or identifiers (e.g., database keys, CSS class names) can appear identical but be composed of different Unicode character sequences (e.g., precomposed 'é' vs. 'e' + combining accent). Without defined Normalization rules (NFC, NFD, or none), systems will treat these as different strings, breaking logic and efficiency.
- Recommendation: Implementers should be aware that matching syntactic content and identifiers must account for Unicode Normalization if canonical equivalence is desired.
- Reference: W3C I18N Best Practices, Section 6.3: Working with Unicode Normalization
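To illustrate the normalization problem described above, here is a minimal Python sketch using the standard library's `unicodedata` module. The helper name `canonically_equal` and the choice of NFC as the comparison form are assumptions made for this example, not something the WSG or the I18N Best Practices prescribe.

```python
import unicodedata


def canonically_equal(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC (hypothetical helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Two spellings that render identically but use different code point sequences.
precomposed = "caf\u00e9"   # 'é' as a single code point (U+00E9)
decomposed = "cafe\u0301"   # 'e' followed by U+0301 COMBINING ACUTE ACCENT

print(precomposed == decomposed)                   # False: raw code points differ
print(canonically_equal(precomposed, decomposed))  # True: equal after NFC
```

A system that uses such strings as keys or identifiers without normalizing them would store and look up these two values as distinct entries.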
2. Case Folding for Search and Input
- Problem: User searches must often be case-insensitive. Simple ASCII case folding is insufficient for global scripts (e.g., the Turkish dotted and dotless 'I').
- Recommendation: User-facing text matching (like search and sorting) should use Unicode Full Case Folding for case-insensitive matching, while syntactic content (like code identifiers) should generally be case-sensitive by default.
- Reference: W3C I18N Best Practices, Section 6.4: Case folding
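As a rough sketch of the case folding recommendation above, Python's built-in `str.casefold()` applies Unicode full case folding. The helper names `caseless_key` and `caseless_match` are hypothetical, and wrapping the fold in NFD normalization is one possible way to also account for canonical equivalence.

```python
import unicodedata


def caseless_key(text: str) -> str:
    """Build a comparison key: NFD, full case fold, then NFD again (hypothetical helper)."""
    folded = unicodedata.normalize("NFD", text).casefold()
    return unicodedata.normalize("NFD", folded)


def caseless_match(a: str, b: str) -> bool:
    """Case-insensitive comparison intended for user-facing text such as search."""
    return caseless_key(a) == caseless_key(b)


# Lowercasing alone keeps 'ß', so the two spellings do not match.
print("Straße".lower() == "STRASSE".lower())  # False
# Full case folding maps 'ß' to 'ss', so they do.
print(caseless_match("Straße", "STRASSE"))    # True
```

Note that `str.casefold()` is locale-independent, so it does not apply the Turkic-specific mappings for dotted and dotless 'I'; handling those would need language-aware logic beyond this sketch.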
Addressing these points is crucial for building performant, sustainable web products that function correctly in all languages.