Creating an Immutable MultiMap in Scala
This post shows a couple of neat Scala tricks by implementing an immutable MultiMap.
A MultiMap is a special kind of Map: one that allows a collection of elements to be associated with each key, not just a single value. For this example, we will start by using Lists to contain the values, hence creating a ListMultiMap
Adding elements to a Map[A, List[B]]
Before writing any code we should ask ourselves: do we really need a ListMultiMap at all? Why wouldn’t a normal Map[A, List[B]] be sufficient?
Of course a Map[A, List[B]] can hold our data just fine, but adding a value can be cumbersome: when the key does not exist yet, a new List must be created – otherwise, our value needs to be added to the existing one. Let’s start by putting this code in a helper function:
[code language=”scala”]
def addBinding[A,B](map: Map[A, List[B]], key: A, value: B): Map[A, List[B]] =
map + (key -> { value :: map.getOrElse(key, Nil) })
[/code]
Type aliases
If you start using the pattern Map[A, List[B]] a lot, things might become a little more easy to ready by defining a type alias for it:
[code language=”scala”]
type ListMultiMap[A,B] = Map[A,List[B]]
def addBinding[A,B](map: ListMultiMap[A, B], key: A, value: B): ListMultiMap[A, B] =
map + (key -> { value :: map.getOrElse(key, Nil) })
[/code]
A type and its alias can be used pretty much interchangeably, there is no ‘hierarchy’: you can use a ListMultiMap[A,B] where a Map[A, List[B]] is expected and vice-versa.
Pimp My Library
While this helper function makes adding a single value to a MultiMap less painful, when doing multiple operations it does not look very pretty:
[code language=”scala”]
val original = Map(1 -> List(2, 3))
val intermediate = addBinding(original, 2, 5)
val result = addBinding(step1, 2, 6)
[/code]
To make this a bit easier to use, we can use the pimp my library pattern:
[code language=”scala”]
implicit class ListMultiMap[A,B](map: Map[A, List[B]]) {
def addBinding(key: A, value: B): ListMultiMap[A, B] =
map + (key -> { value :: map.getOrElse(key, Nil) })
}
val result = Map(1 -> List(2, 3))
.addBinding(2, 5)
.addBinding(2, 6)
[/code]
Because here the ListMultiMap[A,B] is defined as an implicit class, whenever you have a Map[A,List[B]] in your code and try to invoke addBinding, the Scala compiler will know how to perform the conversion from Map[A, List[B]] to ListMultiMap[A,B].
Optimization
Unfortunately, this means there is an overhead associated with using addBinding: every time it is used on a Map[A,List[B]] a new ListMultiMap[A,B] wrapper object is constructed. Luckily, we can remove this overhead by extending AnyVal, turning it into a value class:
[code language=”scala”]
implicit class ListMultiMap[A,B](val map: Map[A, List[B]]) extends AnyVal {
def addBinding(key: A, value: B): Map[A, List[B]] =
map + (key -> { value :: map.getOrElse(key, Nil) })
}
val result = Map(1 -> List(2, 3))
.addBinding(2, 5)
.addBinding(2, 6)
[/code]
This tells the compiler that the ListMultiMap class does not have any state of its own, and it can optimize out the overhead of creating a wrapper object.
Removing elements
Of course a map is not complete without a way to remove bindings, so:
[code language=”scala”]
implicit class ListMultiMap[A,B](val map: Map[A, List[B]]) extends AnyVal {
def addBinding(key: A, value: B): Map[A, List[B]] =
map + (key -> { value :: map.getOrElse(key, Nil) })
def removeBinding(key: A, value: B): ListMultiMap[A, B] = map.get(key) match {
case None => map
case Some(List(`value`)) => map – key
case Some(list) => map + (key -> list.diff(List(value)))
}
}
[/code]
Control when needed
Another advantage of the ‘pimp my library’ pattern over simply inheriting from Map is that the user can have control over the performance characteristics of the result by choosing a particular Map implementation.
This control can be extended further by allowing the user to also choose his List implementation. In fact, we can easily make our MultiMap generic over any immutable sequence:
[code language=”scala”]
implicit class MultiMap[A,B](val map: Map[A, Seq[B]]) extends AnyVal {
def addBinding(key: A, value: B, emptySeq: Seq[B] = Nil): Map[A, Seq[B]] =
map + (key -> { value +: map.getOrElse(key, emptySeq) })
}
[/code]
Being more generic
There’s still a lot of opportunity for making this implementation even more generic, for example by using the collection.MapLike trait and more precisely specifying the variance of the generic type parameters. Those, however, are topics that deserves a blogpost of their own.
In short
Using Scala features like type aliases and the ‘pimp my library’ pattern allows you to write supporting code that can remove distracting boilerplate from your actual program logic.
Look at our consultancy services, training offers and careers below or contact us at info@xebia.com
Comments
Thanks for the example, Arnout! Just to add: the “extends AnyVal” bit create a ‘value class’, defined in SIP-15:
http://docs.scala-lang.org/overviews/core/value-classes.html
> Of course a Map[A, List[B]] can hold our data just fine, but adding a value can be cumbersome: when the key does not exist yet, a new List must be created – otherwise, our value needs to be added to the existing one.
On this point: why not simply use withDefault or withDefaultValue?
We’d probably still want the add a method to avoid the :: lmm(key) bit, but the getOrElse call should not be necessary.
Hi Andrew,
Thanks for the pointer to the Value Classes overview, added it to the article!
Using a map with default values indeed makes adding value to the MultiMap much nicer. Tricky thing is, we then must make sure on only use that function with MultiMaps’ that do have a default value. We don’t always have control over the construction of the MultiMap, and can’t really have the type system guarantee this without constraining other valid uses. That’s why I chose to explicitly handle this at bind time in this case. As always, what you want depends on context :).
Thanks for the addition!
If you want to check for equality to
List(value)
, instead of matching any single-element value, use back-ticks aroundvalue
(hope formatting works out below…):case Some(List(`value`)) => map - key
(thanks IntelliJ for the warning)
Hi Stefan, thanks for your comment! That’s indeed a nice improvement, I’m updating the article.