UTF-32

E23921

Unicode transformation format; character encoding; fixed-length encoding

UTF-32 is a fixed-length Unicode character encoding that represents each code point as a single 32-bit code unit, allowing constant-time indexing by code point at the cost of higher memory usage than UTF-8 or UTF-16.
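As a minimal sketch of what fixed-length code units allow, the snippet below looks up the i-th code point of big-endian UTF-32 data by computing a byte offset directly. The helper name `utf32_index` is hypothetical and used for illustration only; the encoding itself comes from Python's built-in `utf-32-be` codec.

```python
def utf32_index(data: bytes, i: int) -> str:
    """Return the i-th code point of UTF-32BE data without scanning earlier units."""
    offset = 4 * i  # every code point occupies exactly 4 bytes
    if i < 0 or offset + 4 > len(data):
        raise IndexError("code point index out of range")
    return chr(int.from_bytes(data[offset:offset + 4], "big"))


text = "a€😀"
encoded = text.encode("utf-32-be")   # no byte order mark with the -be codec

print(len(encoded))                  # 3 code points * 4 bytes
print(len(text.encode("utf-8")))     # same text, variable-length encoding
print(utf32_index(encoded, 2))       # direct lookup of the third code point
```

The trade-off is visible in the two lengths: the same three characters take 12 bytes in UTF-32 and 8 bytes in UTF-8.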

All labels observed (2)

Label	Occurrences
UTF-32 (canonical)	4
UTF-32 (conceptually, as 32-bit form)	1

Statements (47)

Predicate	Object
instanceOf	Unicode transformation format; character encoding; fixed-length encoding
BOMCodeUnit	0x0000FEFF
codeUnitSize	4 bytes
doesNotEncode	surrogate code points (U+D800 to U+DFFF); values above U+10FFFF
encodes	Unicode code points
hasDisadvantage	increased bandwidth usage; larger cache footprint
hasEndianness	big-endian; little-endian
hasProperty	direct mapping between code point and code unit; fixed-length code units; high memory usage per character; no surrogate pairs; simple indexing by code point
hasVariant	UTF-32BE; UTF-32LE
introducedTo	provide simple mapping from code point index to memory offset
isAlternativeTo	UTF-1; UTF-16; UTF-7; UTF-8
isCommonlyUsedIn	some programming language runtimes; some text processing libraries
isCompatibleWith	Unicode scalar values
isDefinedBy	Unicode (surface form: Unicode Standard)
isLessEfficientThan	UTF-16 for storage; UTF-8 for storage
isPartOf	Unicode Standard Annexes (surface form: Unicode Standard encodings)
isRarelyUsedFor	file storage; web content
isRelatedTo	UCS-4
isStandardizedBy	ISO/IEC 10646; Unicode Consortium
isUsedFor	APIs requiring constant-time indexing; internal string representation
mayUse	byte order mark
supportsCodeSpace	U+0000 to U+10FFFF
supportsPlane	Basic Multilingual Plane; Supplementary Ideographic Plane; Supplementary Multilingual Plane; Supplementary Private Use Area-A; Supplementary Private Use Area-B; Supplementary Special-purpose Plane
usesBitWidth	32 bits per code point
wasPreviouslyCalled	UCS-4
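The byte order mark, endianness, and code-space statements above can be combined in a short decoding sketch. This is an illustration under stated assumptions, not a reference implementation: the function name `decode_utf32` and its `default_order` parameter are hypothetical, and the fallback to big-endian when no byte order mark is present follows the usual convention for unmarked UTF-32.

```python
BOM_BE = b"\x00\x00\xfe\xff"  # 0x0000FEFF serialized big-endian
BOM_LE = b"\xff\xfe\x00\x00"  # 0x0000FEFF serialized little-endian


def decode_utf32(data: bytes, default_order: str = "big") -> str:
    """Decode UTF-32 bytes, honoring an optional leading byte order mark."""
    if len(data) % 4 != 0:
        raise ValueError("UTF-32 data must be a multiple of 4 bytes")

    order, start = default_order, 0
    if data[:4] == BOM_BE:
        order, start = "big", 4
    elif data[:4] == BOM_LE:
        order, start = "little", 4

    chars = []
    for offset in range(start, len(data), 4):
        cp = int.from_bytes(data[offset:offset + 4], order)
        # Only Unicode scalar values are valid: reject surrogates
        # and anything beyond the end of the code space.
        if cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
            raise ValueError(f"invalid UTF-32 code unit 0x{cp:08X}")
        chars.append(chr(cp))
    return "".join(chars)


# Python's generic "utf-32" codec prepends a BOM in native byte order.
print(decode_utf32("hi😀".encode("utf-32")))
# Unmarked little-endian data needs the byte order supplied by the caller.
print(decode_utf32("A".encode("utf-32-le"), default_order="little"))
```

Because each code unit maps directly to one code point, validation reduces to a range check per unit, with no surrogate-pair or continuation-byte handling.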

Referenced by (5)

Full triples — surface form annotated when it differs from this entity's canonical label.

Unicode → hasEncodingForm → UTF-32

Unicode Scalar Values → usedBy → UTF-32

ISO/IEC 10646 → usesEncodingForm → UTF-32

UTF-7 → replacedBy → UTF-32

Unicode 3.0 → definesEncodingForm → UTF-32

this entity surface form: UTF-32 (conceptually, as 32-bit form)