ollama

mirror of https://github.com/ollama/ollama.git synced 2026-05-13 14:27:00 +00:00

Daniel Hiltgen	ec9b4e9e47	2026-04-27 14:14:27 -07:00
tokenizer: fix multi-regex BPE offset handling (#15844)

Use the current fragment offset when emitting unmatched spans during multi-regex BPE splitting. This avoids duplicating earlier prompt text and inflating token counts for multi-stage BPE tokenizers.

testdata
bytepairencoding.go	tokenizer: fix multi-regex BPE offset handling (#15844)	2026-04-27 14:14:27 -07:00
bytepairencoding_test.go	tokenizer: fix multi-regex BPE offset handling (#15844)	2026-04-27 14:14:27 -07:00
sentencepiece.go
sentencepiece_test.go
tokenizer.go
vocabulary.go
vocabulary_test.go
wordpiece.go
wordpiece_test.go