String.next_codepoint

Specs

next_codepoint(t()) :: {codepoint(), t()} | nil

Returns the next code point in a string.

The result is a tuple with the code point and the remainder of the string, or nil if the string has reached its end.

As with other functions in the String module, next_codepoint/1 works with binaries that are invalid UTF-8. If the string starts with a sequence of bytes that is not valid in UTF-8 encoding, the first element of the returned tuple is a binary with the first byte.

Examples

iex> String.next_codepoint("olá")
{"o", "lá"}

iex> invalid = "\x80\x80OK" # first two bytes are invalid in UTF-8
iex> {_, rest} = String.next_codepoint(invalid)
{<<128>>, <<128, 79, 75>>}
iex> String.next_codepoint(rest)
{<<128>>, "OK"}
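
Because each call returns the remainder of the string, next_codepoint/1 can be applied repeatedly until it returns nil. The following is a minimal sketch of that loop; the module CodepointWalk and its to_list/1 function are hypothetical names used only for illustration and are not part of the String module.

```elixir
defmodule CodepointWalk do
  # Hypothetical helper for illustration; not part of the String module.
  # Collects every code point (or stray invalid byte) of a binary in order.
  def to_list(string) when is_binary(string), do: walk(string, [])

  defp walk(string, acc) do
    case String.next_codepoint(string) do
      # A code point (or a single invalid byte) was consumed: keep going.
      {codepoint, rest} -> walk(rest, [codepoint | acc])
      # The string reached its end: restore the original order.
      nil -> Enum.reverse(acc)
    end
  end
end

CodepointWalk.to_list("olá")
CodepointWalk.to_list("\x80\x80OK")
```

With the inputs from the examples above, the first call should yield ["o", "l", "á"], and the second a list whose first two elements are the single-byte binaries <<128>>. In practice String.codepoints/1 already covers this use case, so the sketch is only meant to show how the nil return value terminates the traversal.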

Comparison with binary pattern matching

Binary pattern matching provides a similar way to decompose a string:

iex> <<codepoint::utf8, rest::binary>> = "Elixir"
"Elixir"
iex> codepoint
69
iex> rest
"lixir"

The two approaches are not entirely equivalent, though: with pattern matching, codepoint comes as an integer, and the pattern won't match invalid UTF-8.

Binary pattern matching, however, is simpler and more efficient, so pick the option that better suits your use case.
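
To make the two differences concrete, here is a sketch of how pattern matching could be extended to return a binary instead of an integer and to fall back to a single byte on invalid UTF-8. The module CodepointMatch and its next/1 function are hypothetical names chosen for this illustration; the sketch is not presented as the actual implementation of next_codepoint/1.

```elixir
defmodule CodepointMatch do
  # Hypothetical helper for illustration; not part of the String module.

  # Valid UTF-8 prefix: re-encode the integer code point as a binary.
  def next(<<codepoint::utf8, rest::binary>>), do: {<<codepoint::utf8>>, rest}

  # The utf8 pattern did not match: return the first byte on its own.
  def next(<<byte, rest::binary>>), do: {<<byte>>, rest}

  # Empty binary: nothing left to consume.
  def next(<<>>), do: nil
end

CodepointMatch.next("Elixir")
CodepointMatch.next("\x80\x80OK")
CodepointMatch.next("")
```

The first clause is the same pattern shown above, with <<codepoint::utf8>> converting the integer back into a binary. The extra clauses are what plain pattern matching lacks; if the input is known to be valid UTF-8 and an integer code point is acceptable, the single pattern alone is sufficient.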