unname coder's blog

Posts

Showing posts with the label unicode

Slice a string containing Unicode chars

- September 20, 2018

I have a piece of text with characters of different byte lengths:

    let text = "Hello привет";

I need to take a slice of the string given start (included) and end (excluded) character indices. I tried this:

    let slice = &text[start..end];

and got the following error:

    thread 'main' panicked at 'byte index 7 is not a char boundary; it is inside 'п' (bytes 6..8) of `Hello привет`'

I suppose it happens because Cyrillic letters are multi-byte and the [..] notation slices by byte indices, not character indices. What can I use if I want to slice using character indices, like I do in Python?

    slice = text[start:end]

I know I can use the chars() iterator and manually walk through the desired substring, but is there a more concise way?

Possible solutions to codepoint slicing

If you know the exact byte indices, you can slice a string: let text = "Hello...
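When only character indices are known, one option is to translate them into byte offsets with char_indices() and then slice as usual. The sketch below is a minimal illustration, not necessarily the approach the full post settles on: the helper name slice_chars is hypothetical, "character" here means a Unicode scalar value (a Rust char), not a user-perceived grapheme cluster, and out-of-range indices are assumed to clamp to the end of the string, much like Python slices.

```rust
/// Returns the substring covering characters `start..end` (end excluded),
/// where indices count Unicode scalar values (`char`s) rather than bytes.
/// `slice_chars` is a hypothetical helper name, not a standard library API.
fn slice_chars(text: &str, start: usize, end: usize) -> &str {
    // An empty or inverted range yields an empty slice (assumed behavior).
    if start >= end {
        return "";
    }

    // Byte offset of every char, followed by the total byte length so that
    // an `end` equal to the char count maps to the end of the string.
    let mut offsets = text
        .char_indices()
        .map(|(byte_idx, _)| byte_idx)
        .chain(std::iter::once(text.len()));

    // Byte offset of the `start`-th char; clamp to the end if out of range.
    let from = offsets.nth(start).unwrap_or(text.len());

    // `nth` consumed `start + 1` items, so skip the remaining
    // `end - start - 1` items to reach the offset of the `end`-th char.
    let to = offsets.nth(end - start - 1).unwrap_or(text.len());

    // Both offsets come from `char_indices` or `text.len()`, so they are
    // valid char boundaries and this byte slice cannot panic.
    &text[from..to]
}
```

With the text from the question, slice_chars("Hello привет", 6, 9) returns "при", and slice_chars("Hello привет", 0, 5) returns "Hello". The function borrows from the input instead of allocating; if an owned String is acceptable, text.chars().skip(start).take(end - start).collect::<String>() is a shorter alternative. Either way the cost is linear in the string length, because UTF-8 does not allow constant-time access by character index.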