Go Arrays and Slices

I found myself recently trying to recall specific details of how slices work when needing to do something that meant I wanted to not mutate the underlying array data structure of the slice I was working with.

Now the reason for why I wanted to do that isn’t important. What’s motivating this write-up is my want for a good reference document (not saying the official go blog isn’t a good reference, but I have my own things I like to focus in on in these situations).

Composite Types

Let’s recap on what we’re talking about when we say “array” and “slice”:

  • Array: fixed-length data structure, zero-indexed, contains same type
  • Slice: variable-length sequence, contains same type

Arrays

An array uses subscript notation to access elements, and is zero-indexed. Here is an example of both those concepts:

a := [3]string{"a", "b", "c"}

fmt.Println(a[1]) // "b"

If you define an array without using the ‘literal’ syntax (as demonstrated in the above example program), then the values will be initialized with the zero value of the given type, for example:

var a [3]string

fmt.Printf("%#v", a) // [3]string{"", "", ""}

The size of the array is actually part of its ‘type’ definition, for example the following two arrays have unique types:

a := [3]string{"a", "b", "c"}
b := [2]string{"a", "b"}

fmt.Printf("a: %T, b: %T", a, b) // a: [3]string, b: [2]string

If you don’t want to have to count the number of elements you’re defining, then you can use an ellipsis ... instead:

a := [...]int{1, 2, 3}

fmt.Printf("%T %#v", a, a) // [3]int [3]int{1, 2, 3}

Notice how the output of the above program inserts the calculated length of the array.

As arrays are fixed-length, it means they cannot be resized once full.

Arrays are also ‘copied’ when being passed into a function. This means if a function modifies a given array, it’s actually only modifying a copy of the array and not the original (i.e. go doesnt use ‘pass-by-reference’ semantics like some other languages).

If dealing we an array data structure, remember you’ll need to pass a pointer to it if you require a function to be able to modify the original array in memory.

Slices

A slice is a lightweight data structure that provides access to a subsequence (or ‘window’ view) of an underlying array.

The slice data structure consists of the following fields:

  • ptr: pointer to array.
  • len: length of slice (number of elements it contains).
  • cap: capacity of slice (number of elements in array, starting from the first element in the slice).

Note: a slice cannot grow larger than its capacity, nor can you reslice a slice to attempt to access earlier elements in the array.

Here is an example program that creates an array and then makes a slice of a subsequence of the original array:

a := [3]int{1, 2, 3}
fmt.Printf("array:  %T\n\t%#v\n\n", a, a)

s := a[1:]
fmt.Printf("slice:  %T\n\t%#v\n", s, s)

The output from the above code would be as follows (notice how the slice provides a narrower ‘view’ of the original array):

array: [3]int
       [3]int{1, 2, 3}

slice: []int
       []int{2, 3}

Notice that a slice ‘type’ looks the same as an array’s, but just omits a length (e.g. slice: []int, array: [3]int).

Slices, much like arrays, cannot dynamically grow larger at runtime. When a slice is full we must create a new slice, which requires the use of go’s builtin functions.

When modifying a slice you are infact modifying the underlying array, as demonstrated below:

a := [3]int{1, 2, 3}
s := a[1:]

s[0] = 4

fmt.Printf("array:  %T\n\t%#v\n\n", a, a)
fmt.Printf("slice:  %T\n\t%#v\n", s, s)

The above code results in the following output:

array: [3]int
       [3]int{1, 4, 3}

slice: []int
       []int{4, 3}

Notice both the slice and the underlying array have been updated.

Although a lot of people refer to this as “updating a slice” when talking about their code, I personally think it’s best not to think of these as two distinct pieces of data. The array holds the data and the slice is just a language abstraction that enables us to control how we view that data.

Additionally, it’s important to realize that because a slice contains a pointer to an underlying array, it means multiple slices can point to the same array in memory (as demonstrated below).

a := [3]int{1, 2, 3}
s1 := a[:2]
s2 := a[1:]

fmt.Printf("array:  %T\n\t%#v\n\n", a, a)
fmt.Printf("slice1: %T\n\t%#v\n\n", s1, s1)
fmt.Printf("slice2: %T\n\t%#v\n\n", s2, s2)

// make modification via the first slice
s1[1] = 4

fmt.Print("---\n\n")
fmt.Printf("array:  %T\n\t%#v\n\n", a, a)
fmt.Printf("slice1: %T\n\t%#v\n\n", s1, s1)
fmt.Printf("slice2: %T\n\t%#v\n\n", s2, s2)

The output of the above program is shown below. Notice how the two slices, s1 and s2 both point at the same underlying array and so although we make a modification via the first slice we can see that both slices will highlight the changed value:

array:  [3]int
	[3]int{1, 2, 3}

slice1: []int
	[]int{1, 2}

slice2: []int
	[]int{2, 3}

---

array:  [3]int
	[3]int{1, 4, 3}

slice1: []int
	[]int{1, 4}

slice2: []int
	[]int{4, 3}

Append

You’ll find in a lot of situations code that needs to append data to a slice. This results in code that uses the builtin append function, but interestingly will nearly always reassign the returned value back to the slice variable itself:

a := [...]int{1, 2, 3, 4, 5}
s := a[1:]

fmt.Printf("array:  %T\n\t%#v\n\tlen: %d\n\tcap: %d\n\n", a, a, len(a), cap(a))
fmt.Printf("slice:  %T\n\t%#v\n\tlen: %d\n\tcap: %d\n\n", s, s, len(s), cap(s))

s = append(s, 6)

fmt.Printf("array:  %T\n\t%#v\n\tlen: %d\n\tcap: %d\n\n", a, a, len(a), cap(a))
fmt.Printf("slice:  %T\n\t%#v\n\tlen: %d\n\tcap: %d\n\n", s, s, len(s), cap(s))

In the above program we have an array that contains five elements. We take a slice of it which is the last four elements. We then attempt to append a new value (6) to the slice (which should mean appending it to the underlying array). The output of that program is as follows:

array:  [5]int
	[5]int{1, 2, 3, 4, 5}
	len: 5
	cap: 5

slice:  []int
	[]int{2, 3, 4, 5}
	len: 4
	cap: 4

array:  [5]int
	[5]int{1, 2, 3, 4, 5}
	len: 5
	cap: 5

slice:  []int
	[]int{2, 3, 4, 5, 6}
	len: 5
	cap: 8

Note: the append function always returns a new slice.

Notice how s is showing an updated ‘view’ (e.g. 2, 3, 4, 5, 6). What’s interesting here is the new slice must have also resulted in a new array being allocated (and subsequently the new slice is pointing to it) because the original array (a) isn’t showing the appended value (6).

Now we can check this with some overly complicated code that ‘reflects’ into the internal go code (this will enable us to locate the slice’s pointer and to dereference that pointer to access the underlying array):

s := []int{1, 2, 3, 4}

hdr := (*reflect.SliceHeader)(unsafe.Pointer(&s))
data := *(*[4]int)(unsafe.Pointer(hdr.Data))

fmt.Printf("slice:  %T\n\t%#v\n\tlen: %d\n\tcap: %d\n", s, s, len(s), cap(s))
fmt.Printf("hdr: %#v\n", hdr)   // &reflect.SliceHeader{Data:0x40e020, Len:4, Cap:4}
fmt.Printf("data: %#v\n", data) // [4]int{1, 2, 3, 4}

s = append(s, 5)

hdr2 := (*reflect.SliceHeader)(unsafe.Pointer(&s))
data2 := *(*[8]int)(unsafe.Pointer(hdr.Data))

fmt.Printf("slice:  %T\n\t%#v\n\tlen: %d\n\tcap: %d\n", s, s, len(s), cap(s))
fmt.Printf("hdr2: %#v\n", hdr2)   // &reflect.SliceHeader{Data:0x45e020, Len:5, Cap:8}
fmt.Printf("data2: %#v\n", data2) // [8]int{1, 2, 3, 4, 5, 0, 0, 0}

Note: s := []int{1, 2, 3, 4} causes an underlying array to be created and then referened by the slice we defined.

The output of the above program is as follows:

slice:  []int
	[]int{1, 2, 3, 4}
	len: 4
	cap: 4

hdr: &reflect.SliceHeader{Data:0x40e020, Len:4, Cap:4}
data: [4]int{1, 2, 3, 4}

slice:  []int
	[]int{1, 2, 3, 4, 5}
	len: 5
	cap: 8

hdr: &reflect.SliceHeader{Data:0x45e020, Len:5, Cap:8}
data: [8]int{1, 2, 3, 4, 5, 0, 0, 0}

We can see the Data field on the SliceHeader is pointing to a different memory address!

We can also see that the capacity of the slice (cap) has increased to double! This demonstrates what’s happening ‘behind-the-scenes’, and is similar to how resizing an array is done in other languages (i.e. you create a new array at double the size of the old array, then you append the new values until that new array is full and the process repeats).

So what has happened is that append has returned a new slice (which is expected), but that the slice is now pointing to a new underlying array. This is the primary reason why when appending a value to a slice you’ll see the slice variable is updated to the return value of append: because we don’t know (unless we do some inspection of the returned slice) whether the underlying array has been copied to a new array in memory.

Caveat of Appending

If the underlying array had enough capacity for the appended value(s), then append would have still returned a new slice (because remember a slice cannot grow beyond its defined capacity) but the underlying array would still be the same array in memory. Let’s see an example of that below:

a := [6]int{1, 2, 3, 4, 5, 6}
s := a[1:4]
x := append(s, 0)

fmt.Printf("array:    %T\n\t  %#v\n\t  len: %d\n\t  cap: %d\n\n", a, a, len(a), cap(a))
fmt.Printf("slice1 s: %T\n\t  %#v\n\t  len: %d\n\t  cap: %d\n\n", s, s, len(s), cap(s))
fmt.Printf("slice2 x: %T\n\t  %#v\n\t  len: %d\n\t  cap: %d\n\n", x, x, len(x), cap(x))

The output of that program is as follows:

array:    [6]int
	  [6]int{1, 2, 3, 4, 0, 6}
	  len: 6
	  cap: 6

slice1 s: []int
	  []int{2, 3, 4}
	  len: 3
	  cap: 5

slice2 x: []int
	  []int{2, 3, 4, 0}
	  len: 4
	  cap: 5

Notice how the first slice s has a length of 3 and a capacity of 5. That is because when we created the slice we specified explicitly we wanted only three elements (1:4) and that the slice’s capacity is derived from the number of elements in the underlying array starting from the first element referenced within the slice).

Meaning the first element in the slice is 2 and counting from there within the underlying array would result in a capacity of 5 (e.g. 2, 3, 4, 5, 6 is five elements).

Now when we call append on s we know that append will always return a new slice and so we assign that new slice to a different variable x. We can see for x the length is one element longer because of the append (e.g. we appended the value 0) but the capacity is still 5 meaning there is more room in the underlying array and so we don’t create a new array nor update the slice pointer to that new array.

What’s really interesting here is that the appending of the value 0 to the first slice has caused the underlying array to be modified in a potentially unexpected way!

By that I mean, we can see the underlying array has now replaced the element value 5 with 0 instead of having 0 appended to the end of it. You might have expected a new array to be created (because really the underlying array has a length of 6 and so an array with elements 1, 2, 3, 4, 5, 6, 0 would be a length of seven, thus not enough capacity in the array).

But because of how slices work, and its capacity is determined by the number of elements in the underlying array starting from the first element referenced by the slice, it means when we append to the underlying array, we are doing so from the perspective of the underlying array ending at the element value 4.

Yes, this is confusing.

See the go tour for more examples.

Note: there’s also a gotcha which is worth being aware of, and is related to the fact that slices point to the same underlying array, which occurs when the slice modifications don’t change the array’s capacity. See https://yourbasic.org/golang/gotcha-append/ for details.