Handling Strings in Golang

December 04, 2024

Background

While studying Golang, I ran into a few surprises when working with strings. They behave differently from strings in JavaScript, and the for ... range loop in Go did not work the way I expected. In this post I will walk through some examples.

string in Go

In Go, a string is a read-only sequence of bytes. There is no char type. String literals in Go source code are encoded as UTF-8, so the bytes of a string normally hold UTF-8 encoded text.

On the other hand, there is the rune type. A rune is an alias for int32 and holds a single Unicode code point. When encoded as UTF-8, one code point takes between 1 and 4 bytes. A Hangul syllable in Korean, my first language, takes 3 bytes.
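To make the difference between bytes and runes concrete, here is a minimal sketch that counts both for a few strings. The helper name byteAndRuneCount is made up for this illustration; len and utf8.RuneCountInString come from the standard library.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteAndRuneCount is a hypothetical helper for this post. It returns the
// number of bytes in s and the number of runes (Unicode code points) in s.
func byteAndRuneCount(s string) (nBytes, nRunes int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	for _, s := range []string{"a", "한", "go 언어"} {
		b, r := byteAndRuneCount(s)
		fmt.Printf("%q: %d bytes, %d runes\n", s, b, r)
	}
	// "a":      1 byte,  1 rune
	// "한":     3 bytes, 1 rune
	// "go 언어": 9 bytes, 5 runes
}
```

Note that len on a string always reports bytes, never characters.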

For ... Range

You usually use the for ... range syntax to iterate over all elements of an array or slice. Of course, it can be applied to a string as well.

k_str := "이 세상에 온 것을 환영해"

for _, c := range k_str {
    fmt.Printf("%c(%d) ", c, c)
}
// result: 이(51060)  (32) 세(49464) 상(49345) 에(50640)  (32) 온(50728)  (32) 것(44163) 을(51012)  (32) 환(54872) 영(50689) 해(54644)

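Note that c in each iteration is a rune: for ... range decodes the UTF-8 bytes of the string one code point at a time. That is why %d prints the Unicode code point of each character.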
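One way to check this is to print the type of the loop variable with %T. The sketch below uses a made-up helper, rangeElemType, only for illustration. Since rune is an alias for int32, the type is reported as int32.

```go
package main

import "fmt"

// rangeElemType is a hypothetical helper for this post. It returns the Go
// type name of the value that for ... range yields when iterating over s,
// or an empty string if s is empty.
func rangeElemType(s string) string {
	for _, c := range s {
		return fmt.Sprintf("%T", c)
	}
	return ""
}

func main() {
	fmt.Println(rangeElemType("이")) // prints: int32
}
```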

Now I will print the index instead of the code point.

k_str := "이 세상에 온 것을 환영해"

for i, c := range k_str {
    fmt.Printf("%c(%d) ", c, i)
}
// result: 이(0)  (3) 세(4) 상(7) 에(10)  (13) 온(14)  (17) 것(18) 을(21)  (24) 환(25) 영(28) 해(31)

As you can see, the indices are not consecutive. They do not count characters.

This is related to UTF-8 encoding.

Each Hangul syllable occupies 3 bytes in UTF-8 and a space occupies 1 byte, so each index is the byte offset at which the character starts.
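The same byte offsets can be reproduced by decoding the string by hand. The following is a rough sketch, not how the compiler implements range: runeStarts is a made-up helper that uses utf8.DecodeRuneInString to read one rune at a time and advances by that rune's width in bytes.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeStarts is a hypothetical helper for this post. It returns the byte
// offset at which each rune of s begins, which is the same index that
// for ... range reports.
func runeStarts(s string) []int {
	var starts []int
	for i := 0; i < len(s); {
		// size is the number of bytes used by the rune starting at offset i.
		_, size := utf8.DecodeRuneInString(s[i:])
		starts = append(starts, i)
		i += size
	}
	return starts
}

func main() {
	fmt.Println(runeStarts("이 세상")) // prints: [0 3 4 7]
}
```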

Slice of string

fmt.Printf("%c", k_str[0])
// result: ì

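So if you just index into a string like above, you get a single byte, not a character. k_str[0] is 0xEC, the first of the three bytes that encode 이, and %c prints that byte as ì (U+00EC). Slicing a string at arbitrary byte offsets has the same problem.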
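Slicing still works when the slice ends on a character boundary. As a small sketch, the made-up helper firstChar below asks utf8.DecodeRuneInString how many bytes the first rune takes and slices exactly that many. utf8.ValidString then shows what happens when a slice cuts a character in half.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// firstChar is a hypothetical helper for this post. It returns the first
// character of s as a string by slicing exactly as many bytes as the
// first rune occupies. For an empty string it returns "".
func firstChar(s string) string {
	_, size := utf8.DecodeRuneInString(s)
	return s[:size]
}

func main() {
	s := "이 세상"
	fmt.Println(firstChar(s))            // prints: 이
	fmt.Println(utf8.ValidString(s[:3])) // prints: true (ends on a rune boundary)
	fmt.Println(utf8.ValidString(s[:2])) // prints: false (cuts 이 in half)
}
```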

Solution

k_str_in_rune := []rune(k_str)
for i, c := range k_str_in_rune {
    fmt.Printf("%c(%d) ", c, i)
}
// result: 이(0)  (1) 세(2) 상(3) 에(4)  (5) 온(6)  (7) 것(8) 을(9)  (10) 환(11) 영(12) 해(13)

You can convert a string into a slice of runes.

In this case, each index matches the position of the character in the string.

fmt.Printf("%c", k_str_in_rune[0])
// result: 이

Indexing the slice of runes gives the correct character.

This also makes it easier to compare two strings character by character.
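As an illustration of such a comparison, here is a minimal sketch. firstDiff is a made-up helper that converts both strings to []rune and returns the position of the first character that differs. It assumes both strings use the same Unicode normalization form, since one visible character can otherwise be made of several runes.

```go
package main

import "fmt"

// firstDiff is a hypothetical helper for this post. It returns the
// character (rune) position at which a and b first differ, or -1 if the
// two strings are identical.
func firstDiff(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	n := len(ra)
	if len(rb) < n {
		n = len(rb)
	}
	for i := 0; i < n; i++ {
		if ra[i] != rb[i] {
			return i
		}
	}
	if len(ra) != len(rb) {
		// One string is a prefix of the other; they differ where the shorter ends.
		return n
	}
	return -1
}

func main() {
	fmt.Println(firstDiff("환영해", "환영합니다")) // prints: 2
	fmt.Println(firstDiff("세상", "세상"))      // prints: -1
}
```

The result is a character position, not a byte offset, so it can be used directly as an index into the []rune form of either string.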
