Unicode normalization
E564765
Unicode normalization is a set of standardized processes, specified in Unicode Standard Annex #15, that convert equivalent Unicode text sequences into a single consistent form so that text can be compared, searched, and processed reliably across systems.
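As a minimal illustration of the problem normalization addresses, the sketch below uses Python's standard `unicodedata` module to compare the two encodings of é (U+00E9 versus 'e' + U+0301). The variable names are illustrative only and are not part of this entry.

```python
import unicodedata

precomposed = "\u00e9"   # é as one precomposed code point
decomposed = "e\u0301"   # 'e' followed by COMBINING ACUTE ACCENT

# The strings render identically but differ code point by code point.
raw_equal = precomposed == decomposed

# Normalizing both sides to the same form makes them compare equal.
nfc_equal = (unicodedata.normalize("NFC", precomposed)
             == unicodedata.normalize("NFC", decomposed))
nfd_equal = (unicodedata.normalize("NFD", precomposed)
             == unicodedata.normalize("NFD", decomposed))

print(raw_equal, nfc_equal, nfd_equal)  # False True True
```

Either form works for comparison as long as both operands are normalized to the same one.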
Observed surface forms (2)
| Surface form | Occurrences |
|---|---|
| Unicode Normalization Forms | 3 |
| NFKD | 1 |
Statements (51)
| Predicate | Object |
|---|---|
| instanceOf | Unicode standard feature; text processing standard |
| addresses | multiple representations of the same abstract text; precomposed and decomposed character sequences |
| alsoKnownAs | UAX #15 |
| appliesTo | Unicode text |
| definedBy | Unicode Consortium |
| definesForm | NFC; NFD; NFKC; NFKD |
| ensures | canonically equivalent strings have identical binary representation |
| example | é can be U+00E9 or 'e' + U+0301 |
| hasFullName | Normalization Form C; Normalization Form D; Normalization Form KC; Normalization Form KD |
| hasProperty | closed under normalization (idempotent); stable across Unicode versions for assigned characters |
| hasPurpose | to convert equivalent Unicode text sequences into a consistent form; to enable consistent text processing across systems; to enable reliable text comparison; to enable reliable text searching |
| hasSpecification | Unicode Standard Annex #15 |
| idempotent | applying the same normalization form twice yields the same result |
| isImportantFor | collation; search indexing; security-sensitive string comparison; string equality; text rendering consistency |
| NFC | canonical composition form |
| NFCPreferredFor | data interchange |
| NFD | canonical decomposition form |
| NFDPreferredFor | internal text processing in some systems |
| NFKC | compatibility composition form |
| NFKCPreferredFor | identifier comparison in some contexts |
| NFKD | compatibility decomposition form |
| NFKDPreferredFor | text analysis and searching in some contexts |
| partOf | Unicode Standard |
| reliesOn | Unicode character decomposition mappings; canonical combining class; composition exclusions |
| usedBy | databases; programming languages; search engines; text editors |
| usesConcept | canonical equivalence; compatibility equivalence |
| usesOperation | canonical composition; canonical decomposition; compatibility decomposition |
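The four forms and the idempotence property listed above can be observed with a short sketch using Python's standard `unicodedata` module. The sample string and the `code_points` helper are assumptions chosen for illustration, not part of this entry: the sample combines é (which has a canonical decomposition) with the ligature ﬁ, U+FB01 (which has only a compatibility decomposition).

```python
import unicodedata

FORMS = ("NFC", "NFD", "NFKC", "NFKD")

def code_points(text):
    """Hypothetical helper: render a string as U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)

sample = "\u00e9\ufb01"  # é followed by the ligature ﬁ

# Normalize the same input into each of the four forms.
results = {form: unicodedata.normalize(form, sample) for form in FORMS}
for form in FORMS:
    print(f"{form:5} {code_points(results[form])}")
# NFC   U+00E9 U+FB01
# NFD   U+0065 U+0301 U+FB01
# NFKC  U+00E9 U+0066 U+0069
# NFKD  U+0065 U+0301 U+0066 U+0069

# Idempotence: re-applying the same form leaves the text unchanged.
idempotent = all(
    unicodedata.normalize(form, results[form]) == results[form]
    for form in FORMS
)
print(idempotent)  # True
```

The canonical forms (NFC, NFD) leave the ligature intact, while the compatibility forms (NFKC, NFKD) replace it with plain "fi". Compatibility normalization therefore discards distinctions that the canonical forms keep, which fits its listed use for identifier comparison and searching rather than general data interchange.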
Referenced by (7)
Full triples — surface form annotated when it differs from this entity's canonical label.
this entity surface form: Unicode Normalization Forms
this entity surface form: Unicode Normalization Forms
this entity surface form: Unicode Normalization Forms
this entity surface form: NFKD