Blog

Creating an Immutable MultiMap in Scala

02 Dec, 2015

This post shows a couple of neat Scala tricks by implementing an immutable MultiMap.
A MultiMap is a special kind of Map: one that allows a collection of elements to be associated with each key, not just a single value. For this example, we will start by using Lists to contain the values, hence creating a ListMultiMap

Adding elements to a Map[A, List[B]]

Before writing any code we should ask ourselves: do we really need a ListMultiMap at all? Why wouldn’t a normal Map[A, List[B]] be sufficient?
Of course a Map[A, List[B]] can hold our data just fine, but adding a value can be cumbersome: when the key does not exist yet, a new List must be created – otherwise, our value needs to be added to the existing one. Let’s start by putting this code in a helper function:
[code language=”scala”]
def addBinding[A,B](map: Map[A, List[B]], key: A, value: B): Map[A, List[B]] =
map + (key -> { value :: map.getOrElse(key, Nil) })
[/code]

Type aliases

If you start using the pattern Map[A, List[B]] a lot, things might become a little more easy to ready by defining a type alias for it:
[code language=”scala”]
type ListMultiMap[A,B] = Map[A,List[B]]
def addBinding[A,B](map: ListMultiMap[A, B], key: A, value: B): ListMultiMap[A, B] =
map + (key -> { value :: map.getOrElse(key, Nil) })
[/code]
A type and its alias can be used pretty much interchangeably, there is no ‘hierarchy’: you can use a ListMultiMap[A,B] where a Map[A, List[B]] is expected and vice-versa.

Pimp My Library

While this helper function makes adding a single value to a MultiMap less painful, when doing multiple operations it does not look very pretty:
[code language=”scala”]
val original = Map(1 -> List(2, 3))
val intermediate = addBinding(original, 2, 5)
val result = addBinding(step1, 2, 6)
[/code]
To make this a bit easier to use, we can use the pimp my library pattern:
[code language=”scala”]
implicit class ListMultiMap[A,B](map: Map[A, List[B]]) {
def addBinding(key: A, value: B): ListMultiMap[A, B] =
map + (key -> { value :: map.getOrElse(key, Nil) })
}
val result = Map(1 -> List(2, 3))
[/code]
Because here the ListMultiMap[A,B] is defined as an implicit class, whenever you have a Map[A,List[B]] in your code and try to invoke addBinding, the Scala compiler will know how to perform the conversion from Map[A, List[B]] to ListMultiMap[A,B].

Optimization

Unfortunately, this means there is an overhead associated with using addBinding: every time it is used on a Map[A,List[B]] a new ListMultiMap[A,B] wrapper object is constructed. Luckily, we can remove this overhead by extending AnyVal, turning it into a value class:
[code language=”scala”]
implicit class ListMultiMap[A,B](val map: Map[A, List[B]]) extends AnyVal {
def addBinding(key: A, value: B): Map[A, List[B]] =
map + (key -> { value :: map.getOrElse(key, Nil) })
}
val result = Map(1 -> List(2, 3))
[/code]
This tells the compiler that the ListMultiMap class does not have any state of its own, and it can optimize out the overhead of creating a wrapper object.

Removing elements

Of course a map is not complete without a way to remove bindings, so:
[code language=”scala”]
implicit class ListMultiMap[A,B](val map: Map[A, List[B]]) extends AnyVal {
def addBinding(key: A, value: B): Map[A, List[B]] =
map + (key -> { value :: map.getOrElse(key, Nil) })
def removeBinding(key: A, value: B): ListMultiMap[A, B] = map.get(key) match {
case None => map
case Some(List(`value`)) => map – key
case Some(list) => map + (key -> list.diff(List(value)))
}
}
[/code]

Control when needed

Another advantage of the ‘pimp my library’ pattern over simply inheriting from Map is that the user can have control over the performance characteristics of the result by choosing a particular Map implementation.
This control can be extended further by allowing the user to also choose his List implementation. In fact, we can easily make our MultiMap generic over any immutable sequence:
[code language=”scala”]
implicit class MultiMap[A,B](val map: Map[A, Seq[B]]) extends AnyVal {
def addBinding(key: A, value: B, emptySeq: Seq[B] = Nil): Map[A, Seq[B]] =
map + (key -> { value +: map.getOrElse(key, emptySeq) })
}
[/code]

Being more generic

There’s still a lot of opportunity for making this implementation even more generic, for example by using the collection.MapLike trait and more precisely specifying the variance of the generic type parameters. Those, however, are topics that deserves a blogpost of their own.

In short

Using Scala features like type aliases and the ‘pimp my library’ pattern allows you to write supporting code that can remove distracting boilerplate from your actual program logic.

Inline Feedbacks
Andrew Phillips
6 years ago

Thanks for the example, Arnout! Just to add: the “extends AnyVal” bit create a ‘value class’, defined in SIP-15:
http://docs.scala-lang.org/overviews/core/value-classes.html
> Of course a Map[A, List[B]] can hold our data just fine, but adding a value can be cumbersome: when the key does not exist yet, a new List must be created – otherwise, our value needs to be added to the existing one.
On this point: why not simply use withDefault or withDefaultValue?

```type MultiMap[A, B] = Map[A, List[B]]
var lmm: MultiMap[Int, String] = Map().withDefaultValue(Nil)
lmm += 1 -> ("one" :: lmm(1)) // add 1 -> "one"
lmm += 1 -> ("een" :: lmm(1)) // add 1 -> "een"
scala> println(lmm(1))
List(een, one)
```

We’d probably still want the add a method to avoid the :: lmm(key) bit, but the getOrElse call should not be necessary.

If you want to check for equality to `List(value)`, instead of matching any single-element value, use back-ticks around `value` (hope formatting works out below…):
`case Some(List(`value`)) => map - key`