At some point, many Go programmers have had the need to unmarshal a data structure like this:

[
  { "type": "dog", "bark_loudness": 7, "name": "Fido" },
  { "type": "train", "cars": [ "passenger", "passenger", "caboose" ] },
  { "type": "computer", "ram": 16, "architecture": "x85" }
]

An array of objects, where each object’s type has no relation to the others.

While this is (arguably) a bad data structure, sometimes you have to play the cards you have been dealt.

The “easy” way is to just unmarshal items into a map[string]any:

package main

import (
	"encoding/json"
	"fmt"
)

type item map[string]any

func main() {
	j := []byte(
		`[
  { "type": "dog", "bark_loudness": 7, "name": "Fido" },
  { "type": "train", "fast": true, "cars": [ "passenger", "passenger", "caboose" ] },
  { "type": "computer", "ram": 16, "architecture": "x85" }
]`)

	d := []item{}
	err := json.Unmarshal(j, &d)
	if err != nil {
		panic(err)
	}

	for i := range d {
		fmt.Printf("%#v\n", d[i])
	}
}

This works, but it is not very good. The only type we can be sure of at compile time is that of the object keys (string). Accessing the ‘cars’ of the ‘train’ item requires shenanigans like:

if item["type"] == "train" {
  cars := item["cars"].([]any)
  for _, car := range cars {
    fmt.Println(car)
  }
}

As well as being awful to maintain, this has no run-time safety: an unchecked type assertion panics if the value is missing or has a different type. We could add checks, but that means even more awkward type-assertion code. It also relies on fmt.Println using reflection to work out how to print the underlying string.

We’d have to write a bunch more code to turn this into a []string slice instead of an []any slice, so we could (for example) use strings.Join to output them.
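To give a feel for that extra code, here is a minimal sketch of converting the decoded ‘cars’ value into a []string with every type assertion checked. The helper name carsOf is an assumption made up for this illustration; it is not part of the example above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// carsOf (a hypothetical helper) extracts the "cars" value from a decoded
// object as a []string, checking each type assertion instead of letting a
// bad value panic.
func carsOf(obj map[string]any) ([]string, error) {
	raw, ok := obj["cars"].([]any)
	if !ok {
		return nil, fmt.Errorf("cars: expected an array, got %T", obj["cars"])
	}

	cars := make([]string, 0, len(raw))
	for i, v := range raw {
		s, ok := v.(string)
		if !ok {
			return nil, fmt.Errorf("cars[%d]: expected a string, got %T", i, v)
		}
		cars = append(cars, s)
	}
	return cars, nil
}

func main() {
	src := []byte(`{ "type": "train", "cars": [ "passenger", "passenger", "caboose" ] }`)

	var obj map[string]any
	if err := json.Unmarshal(src, &obj); err != nil {
		panic(err)
	}

	cars, err := carsOf(obj)
	if err != nil {
		panic(err)
	}
	fmt.Println(strings.Join(cars, ", "))
}
```

That is a fair amount of ceremony for one field of one type, and every other field would need the same treatment.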

So, let’s do this properly. First, define our types:

type items []any

type dog struct {
	BarkLoudness int    `json:"bark_loudness"`
	Name         string `json:"name"`
}

type train struct {
	Fast bool     `json:"fast"`
	Cars []string `json:"cars"`
}

type computer struct {
	Ram          int    `json:"ram"`
	Architecture string `json:"architecture"`
}

There’s no getting away from using the any type for the actual top level array - we are going to be storing different types here.

The three other types here don’t contain any surprises. We are ignoring the "type" key in the JSON, since the Go type of each value already tells us which kind of item it is. You might want to store it anyway.

If your structs do have some other commonality (like a timestamp or similar) this is a good chance to break that out into a separate data type and compose it into all of the specific types.
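As a rough sketch of that composition, assuming a hypothetical shared "created_at" key (the JSON in this post has no such field), the common part can be embedded in each specific type. The names Common, dogWithCommon and decodeDog are invented for this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Common holds the fields shared by every item. The created_at key is an
// assumption for illustration only.
type Common struct {
	CreatedAt time.Time `json:"created_at"`
}

// dogWithCommon embeds Common, so encoding/json treats CreatedAt as if it
// were declared directly on this struct.
type dogWithCommon struct {
	Common
	BarkLoudness int    `json:"bark_loudness"`
	Name         string `json:"name"`
}

// decodeDog unmarshals a single JSON object into a dogWithCommon.
func decodeDog(data []byte) (dogWithCommon, error) {
	var d dogWithCommon
	err := json.Unmarshal(data, &d)
	return d, err
}

func main() {
	src := []byte(`{ "type": "dog", "bark_loudness": 7, "name": "Fido", "created_at": "2024-01-01T12:00:00Z" }`)

	d, err := decodeDog(src)
	if err != nil {
		panic(err)
	}

	// The embedded field is promoted, so d.CreatedAt works directly.
	fmt.Printf("%s was recorded in %d\n", d.Name, d.CreatedAt.Year())
}
```

The same Common struct could be embedded in the train and computer types too, so the shared fields are only declared once.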

We specify the JSON struct tags as usual, so that they can be unmarshaled correctly.

Now let’s look at how we unmarshal such a structure into the top level items type. We need to create a custom unmarshal function:

func (i *items) UnmarshalJSON(data []byte) error {
	var rawSlice []map[string]any
	if err := json.Unmarshal(data, &rawSlice); err != nil {
		return err
	}

	for _, raw := range rawSlice {
		switch raw["type"] {

		case "dog":
			instance := dog{}
			rawToTyped(raw, &instance)
			*i = append(*i, instance)
		case "train":
			instance := train{}
			rawToTyped(raw, &instance)
			*i = append(*i, instance)
		case "computer":
			instance := computer{}
			rawToTyped(raw, &instance)
			*i = append(*i, instance)

		default:
			return fmt.Errorf("not sure how to deal with type '%v'", raw["type"])
		}
	}

	return nil
}

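// rawToTyped re-encodes raw as JSON and decodes the result into out,
// which must be a pointer to the target type.
func rawToTyped(raw any, out any) {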
	outType := reflect.TypeOf(out)
	if outType.Kind() != reflect.Ptr {
		panic("out must be a pointer to a type")
	}

	back, err := json.Marshal(raw)
	if err != nil {
		panic("error marshaling raw data")
	}

	if err := json.Unmarshal(back, out); err != nil {
		panic(fmt.Sprintf("error unmarshaling JSON into out for a %#v\nerror: %s", out, err))
	}
}

What’s going on here?

First we unmarshal the entire incoming array into a []map[string]any slice. We allow the JSON library to parse the values into underlying concrete types according to its rules.

Then we look at the value for the key “type” - since that is our way of knowing what type we have to put the data into. Depending on your data structure, you might need to use a different technique to differentiate between types.

Depending on which type it is, we create an instance of that struct and call the function rawToTyped.

This takes the ‘raw’ data (the map[string]any that the JSON package gave us above), turns it back into a JSON byte slice, and then re-unmarshals it into the correctly typed value.

This is a little “double handling”, but for data like ours the rules that the JSON marshal/unmarshal applies to interpret values as types (for instance true or false indicates a bool type) are reversible, getting us back JSON equivalent to the original, which can be unmarshaled into the correct type. One caveat: numbers pass through float64 on the way, so integers larger than 2^53 can lose precision in the round trip.

Putting this all together, we can now unmarshal into a slice with the proper concrete types for each value:

d := items{}
json.Unmarshal(j, &d)
fmt.Printf("%#v", d)

(error handling omitted for brevity)

Because *items implements the json.Unmarshaler interface, the json.Unmarshal function will automatically call our custom function, resulting in (line breaks added for readability):

main.items{
  main.dog{BarkLoudness:7, Name:"Fido"},
  main.train{Fast:true, Cars:[]string{"passenger", "passenger", "caboose"}},
  main.computer{Ram:16, Architecture:"x85"}
}

Now we have our individual items with proper typing!

A typical way to process these:

for _, item := range d {
  switch anItem := item.(type) {
  case dog:
    fmt.Printf("%s woofs at level %d\n", anItem.Name, anItem.BarkLoudness)

  case computer:
    fmt.Printf("The %s computer has %dGb of RAM\n", anItem.Architecture, anItem.Ram)

  case train:
    fmt.Printf("Train (fast: %t) has cars: %s\n", anItem.Fast, strings.Join(anItem.Cars, ", "))
  }
}

Result:

Fido woofs at level 7
Train (fast: true) has cars: passenger, passenger, caboose
The x85 computer has 16Gb of RAM

The entire self-contained example is below, or you can play with it on the playground:

package main

import (
	"encoding/json"
	"fmt"
	"reflect"
	"strings"
)

type items []any

type dog struct {
	BarkLoudness int    `json:"bark_loudness"`
	Name         string `json:"name"`
}

type train struct {
	Fast bool     `json:"fast"`
	Cars []string `json:"cars"`
}

type computer struct {
	Ram          int    `json:"ram"`
	Architecture string `json:"architecture"`
}

func (i *items) UnmarshalJSON(data []byte) error {
	var rawSlice []map[string]any
	if err := json.Unmarshal(data, &rawSlice); err != nil {
		return err
	}

	for _, raw := range rawSlice {
		switch raw["type"] {

		case "dog":
			instance := dog{}
			rawToTyped(raw, &instance)
			*i = append(*i, instance)
		case "train":
			instance := train{}
			rawToTyped(raw, &instance)
			*i = append(*i, instance)
		case "computer":
			instance := computer{}
			rawToTyped(raw, &instance)
			*i = append(*i, instance)

		default:
			return fmt.Errorf("not sure how to deal with type '%v'", raw["type"])
		}
	}

	return nil
}

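// rawToTyped re-encodes raw as JSON and decodes the result into out,
// which must be a pointer to the target type.
func rawToTyped(raw any, out any) {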
	outType := reflect.TypeOf(out)
	if outType.Kind() != reflect.Ptr {
		panic("out must be a pointer to a type")
	}

	back, err := json.Marshal(raw)
	if err != nil {
		panic("error marshaling raw data")
	}

	if err := json.Unmarshal(back, out); err != nil {
		panic(fmt.Sprintf("error unmarshaling JSON into out for a %#v\nerror: %s", out, err))
	}
}

func main() {
	j := []byte(
		`[
  { "type": "dog", "bark_loudness": 7, "name": "Fido" },
  { "type": "train", "fast": true, "cars": [ "passenger", "passenger", "caboose" ] },
  { "type": "computer", "ram": 16, "architecture": "x85" }
]`)

	d := items{}
	err := json.Unmarshal(j, &d)
	if err != nil {
		panic(err)
	}

	for _, item := range d {
		switch anItem := item.(type) {
		case dog:
			fmt.Printf("%s woofs at level %d\n", anItem.Name, anItem.BarkLoudness)

		case computer:
			fmt.Printf("The %s computer has %dGb of RAM\n", anItem.Architecture, anItem.Ram)

		case train:
			fmt.Printf("Train (fast: %t) has cars: %s\n", anItem.Fast, strings.Join(anItem.Cars, ", "))
		}
	}
}
