Unicode normalization

E564765

Unicode normalization is a set of standardized processes that convert equivalent Unicode text sequences into a single consistent form, so that text can be compared, searched, and processed reliably across systems.
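As a minimal sketch of the comparison problem described above, the following Python snippet uses the standard-library `unicodedata` module. The helper name `nfc_equal` is hypothetical and used only for illustration. The choice of NFC is an assumption; any of the four forms would work as long as both strings are normalized to the same one.

```python
import unicodedata

precomposed = "\u00e9"   # é as a single code point (U+00E9)
decomposed = "e\u0301"   # 'e' (U+0065) followed by COMBINING ACUTE ACCENT (U+0301)


def nfc_equal(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# The raw code point sequences differ, so a plain comparison fails.
raw_equal = precomposed == decomposed

# After normalization both strings become the same sequence.
normalized_equal = nfc_equal(precomposed, decomposed)
```

Here `raw_equal` is `False` and `normalized_equal` is `True`, even though both strings render as the same character.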


Observed surface forms (2)

Surface form Occurrences
Unicode Normalization Forms 3
NFKD 1

Statements (51)

Predicate Object
instanceOf Unicode standard feature
text processing standard
addresses multiple representations of the same abstract text
precomposed and decomposed character sequences
alsoKnownAs UAX #15
appliesTo Unicode text
definedBy Unicode Consortium
definesForm NFC
NFD
NFKC
NFKD
ensures canonically equivalent strings have identical binary representation after normalization to the same form
example é can be encoded as U+00E9 or as 'e' (U+0065) + U+0301 (combining acute accent)
hasFullName Normalization Form C
Normalization Form D
Normalization Form KC
Normalization Form KD
hasProperty closed under normalization (idempotent)
stable across Unicode versions for assigned characters
hasPurpose to convert equivalent Unicode text sequences into a consistent form
to enable consistent text processing across systems
to enable reliable text comparison
to enable reliable text searching
hasSpecification Unicode Standard Annex #15
idempotent applying the same normalization form twice yields the same result
isImportantFor collation
search indexing
security-sensitive string comparison
string equality
text rendering consistency
NFC canonical composition form
NFCPreferredFor data interchange
NFD canonical decomposition form
NFDPreferredFor internal text processing in some systems
NFKC compatibility composition form
NFKCPreferredFor identifier comparison in some contexts
NFKD compatibility decomposition form
NFKDPreferredFor text analysis and searching in some contexts
partOf Unicode Standard
reliesOn Unicode character decomposition mappings
canonical combining class
composition exclusions
usedBy databases
programming languages
search engines
text editors
usesConcept canonical equivalence
compatibility equivalence
usesOperation canonical composition
canonical decomposition
compatibility decomposition
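
The four forms and the idempotence property listed in the statements above can be sketched in Python with the standard-library `unicodedata` module. The names `FORMS`, `normalize_all`, and `sample` are hypothetical and chosen only for this illustration. The sample string is an assumption picked to show both kinds of equivalence: it contains a compatibility character (the 'ﬁ' ligature, U+FB01) and a precomposed character (é, U+00E9).

```python
import unicodedata

FORMS = ("NFC", "NFD", "NFKC", "NFKD")


def normalize_all(text: str) -> dict:
    """Hypothetical helper: return the text normalized to each of the four forms."""
    return {form: unicodedata.normalize(form, text) for form in FORMS}


sample = "\ufb01\u00e9"  # 'ﬁ' ligature (U+FB01) followed by é (U+00E9)
results = normalize_all(sample)

# Canonical forms keep the ligature; only é is composed or decomposed.
#   NFC  -> U+FB01 U+00E9
#   NFD  -> U+FB01 U+0065 U+0301
# Compatibility forms additionally replace the ligature with 'f' + 'i'.
#   NFKC -> U+0066 U+0069 U+00E9
#   NFKD -> U+0066 U+0069 U+0065 U+0301

# Idempotence: normalizing an already normalized string changes nothing.
idempotent = all(
    unicodedata.normalize(form, results[form]) == results[form]
    for form in FORMS
)
```

Because the compatibility forms discard distinctions such as the ligature, they are lossy, which is one reason the statements above hedge NFKC and NFKD as preferred only "in some contexts".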

Referenced by (7)

Full triples — surface form annotated when it differs from this entity's canonical label.

Mark Davis contributedTo Unicode normalization
Unicode Standard Annexes includesTopic Unicode normalization
Unicode Technical Committee standardDeveloped Unicode normalization
this entity surface form: Unicode Normalization Forms
East_Asian_Width relatedTo Unicode normalization
Normalization_Quick_Check relatedTo Unicode normalization
this entity surface form: Unicode Normalization Forms
Unicode 4.1 refinesNormalization Unicode normalization
this entity surface form: Unicode Normalization Forms
Vunisea Airport ICAOcode Unicode normalization
this entity surface form: NFKD