tokenizerType

P21075
predicate

Indicates the specific tokenization method or algorithm used to split text into tokens.

All labels observed (2)

Label Occurrences
tokenizerType (canonical) 7
tokenizationMethod 1

Sample triples (8)

Subject Object
GPT-2 Byte Pair Encoding
GPT-Neo byte pair encoding
RoBERTa byte-level BPE
DistilBERT WordPiece
Bloom SentencePiece
Bloom subword tokenizer
XLM-R SentencePiece (via predicate surface "tokenizationMethod")
GPT-1 Byte Pair Encoding
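
The relationship between the two observed labels and the sample triples can be sketched in a few lines of Python. This is only an illustration, not how the dataset is actually stored: the names `CANONICAL`, `ALIASES`, `TRIPLES`, and `normalize` are hypothetical, and the sketch assumes that every triple without an explicit surface note was recorded under the canonical label, which matches the occurrence counts of 7 and 1.

```python
from collections import Counter

# Canonical predicate label and the surface labels that map to it.
# The mapping itself is an assumption based on the label table above.
CANONICAL = "tokenizerType"
ALIASES = {
    "tokenizerType": CANONICAL,
    "tokenizationMethod": CANONICAL,
}

# (subject, surface label, object), copied from the sample triples.
# Assumption: triples with no surface note used the canonical label.
TRIPLES = [
    ("GPT-2", "tokenizerType", "Byte Pair Encoding"),
    ("GPT-Neo", "tokenizerType", "byte pair encoding"),
    ("RoBERTa", "tokenizerType", "byte-level BPE"),
    ("DistilBERT", "tokenizerType", "WordPiece"),
    ("Bloom", "tokenizerType", "SentencePiece"),
    ("Bloom", "tokenizerType", "subword tokenizer"),
    ("XLM-R", "tokenizationMethod", "SentencePiece"),
    ("GPT-1", "tokenizerType", "Byte Pair Encoding"),
]


def normalize(triples):
    """Rewrite each surface label to its canonical predicate label."""
    return [(subj, ALIASES[label], obj) for subj, label, obj in triples]


# Count how often each surface label occurs before normalization.
label_counts = Counter(label for _, label, _ in TRIPLES)

# After normalization every triple carries the canonical label.
normalized = normalize(TRIPLES)
```

Object values are left exactly as observed, so variants such as "Byte Pair Encoding" and "byte pair encoding" remain distinct strings; merging them would be a separate normalization step.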