Blog

Real world functional programming in Scala – comparing Java, Clojure and Scala

04 Jul, 2009

I recently started reading Stuart Halloway’s book ‘Programming in Clojure’. I don’t think I will be writing much enterprise applications in that language in the near future, but it never hurts to broaden the mind, and it’s a very good read. In his book, he demonstrates some of the advantages of functional programming by taking an example from the Apache commons library: StringUtils.indexOfAny. He has also written a blog about it.
In this blog post, we’ll compare the original function in Java, the Clojure version and a Scala implementation.

To start: here’s the original implementation in Java:
[java]
// From Apache Commons Lang, http://commons.apache.org/lang/
public static int indexOfAny(String str, char[] searchChars) {
if (isEmpty(str) || ArrayUtils.isEmpty(searchChars)) {
return -1;
}
for (int i = 0; i < str.length(); i++) {
char ch = str.charAt(i);
for (int j = 0; j < searchChars.length; j++) {
if (searchChars[j] == ch) {
return i;
}
}
}
return -1;
}
[/java]
Sample results of this function are:
[java]
StringUtils.indexOfAny(null, *) = -1
StringUtils.indexOfAny("" , *) = -1
StringUtils.indexOfAny(*, null) = -1
StringUtils.indexOfAny(*, []) = -1
StringUtils.indexOfAny("zzabyycdxx" ,[‘z’ ,’a’ ]) = 0
StringUtils.indexOfAny("zzabyycdxx" ,[‘b’ ,’y’ ]) = 3
StringUtils.indexOfAny("aba" , [‘z’ ]) = -1
[/java]
And now the Clojure code. I’ve taken the version from the book, (free to download here). The book version defines three functions, instead of the version shown in the blog post.


(defn indexed [s] (map vector (iterate inc 0) s))
(defn index-filter [pred coll]
  (when pred
    (for [[idx elt] (indexed coll) :when (pred elt)] idx)))
(defn index-of-any [pred coll]
  (first (index-filter pred coll)))

The three functions defined are:

  1. indexed returns a sequence of pairs, the first element being the index, the sequence the element.
  2. index-filter returns a sequence of the indexes of matching predicates in the collection. For example, the function call (index-filter #{\a \b} “abcdbbb”) yields (0 1 4 5 6)
  3. index-of-any is the StringUtils function. This is now a simple function, just returning the first result of the index-filter function.

As Stuart has mentioned in his book, the Clojure version is shorter, less complex, and also more general than the Java version. No special case handling, such as null or empty search character checking. This all just works naturally in the functional version. It can also be easily extended for other functions than just indexOfAny
Since I’ve been implementing all kinds of nice, but useless little functions in Scala recently, I thought it might be a good idea to try out a more useful example. So let’s see how well we can do with Scala in implementing this in a similar way as the Clojure version.
Almost immediately, we hit on the dreaded null problem. Scala is not a pure functional language, and you have to beware that parameters that are passed on can be null. You could specify the input arguments as Options, of course, but than the caller of the function still needs to construct either the Some or None value, and to do that the caller would have to check for it being null or not.
So, after spending some time grumbling about this issue, I decided to admit defeat and do the null check. 1-0 for Clojure.
Without further ado, here are the three functions implemented in Scala (N.B. I’ve used the trunk here, in case you might want to try this out yourself, I haven’t tried whether this also works version 2.7.5):
[scala]
def indexed(coll:Seq[Any]):Seq[Pair[Int,Any]] = {
if (coll==null) List()
(0 until coll.length).zip(coll)
//below an alternative, but here we have to traverse the list another time to reverse the elements…
//coll.zipWithIndex.map(p => p.swap)
}
def indexedFilter(pred:Seq[Any], coll:Seq[Any]):Seq[Int] = {
if (pred==null)List()
val idxList = indexed(coll)
for (p <- pred; idxPair <- idxList; if (p == idxPair._2)) yield (idxPair._1)
}
def indexOfAny(pred:Seq[Any], coll:Seq[Any]):Option[Int] = {
val res=indexedFilter(pred, coll)
if (res.isEmpty) None
else Some(res.first)
}
[/scala]
Following the definition of the indexed function is a no-brainer: Scala’s zip function on the List class does the job here. There’s also zipWithIndex which does almost what we want, this returns a List of tuples, but in reverse order as the Clojure sample: element first, index later.
The indexedFilter uses a for-comprehension just as the Clojure example (Clojure’s for is not a traditional loop, but a sequence comprehension). For each element in our predicate sequence it checks whether it is present in the indexed tuples that is the result of the indexed function. If it is, it is added to the resulting list using yield. indexOfAny then becomes a trivial function that just checks whether the resulting sequence is empty or not. Instead of returning -1 if nothing is found, I’ve used Scala’s Option class in this case to make the result more expressive.
The function at least works as expected:

scala> indexOfAny("ab", "cdef")
res3: Option[Int] = None
scala> indexOfAny("ab", "bcdefbbb")
res3: Option[Int] = None
scala> indexOfAny("fb", "bcdefbbb")
res4: Option[Int] = Some(4)

Conclusion
All in all, the scala version is not that bad. Its main drawback over the Clojure version is the two null checks that we need to do, and also we need to check the resulting list being empty or not in the indexOfAny function. Clojure wins in this respect. However, the Scala version still is pretty neat and generic and improves (in my opinion at least) the original Java version.
Update: handling infinite collections
Stuart asked the question in his comment how the Scala version would handle infinite collections. The version presented above does not handle this very well. For instance, you’ll get a OEM if you try to pass on Stream.from(0) (which constructs a lazy infinite list of all natural numbers) to the indexed function. Also the indexedFilter function evaluates the entire collection, which is strictly not necessary for the indexOfAny function. To handle an infinite collection, we need to modify and optimize the functions somewhat, to resemble a bit more the functions that Stuart has used in his blog post, instead of the book versions. The Clojure functions used in the blog post look like this:


(defn indexed [s] (map vector (iterate inc 0) s))
(defn index-of-any [s chars]
  (some (fn [[idx char]] (if (get chars char) idx))
          (indexed s)))

The Scala version that I came up with is as follows:
[scala]
def lazyIndexed(coll:Seq[Any]):Seq[Pair[Int,Any]] = {
if (coll==null) Stream.empty
Stream.from(0).zip(coll)
}
def lazyIndexOfAny(pred:Seq[Any], coll:Seq[Any]):Option[Int] = {
if (pred==null) None
lazyIndexed(coll).dropWhile(ip => !pred.exists(p => (p == ip._2))).headOption.map(ip => ip._1)
}
[/scala]
As said, Scala’s Stream implements a lazy list, in which the elements are only evaluated when needed. In this case, we keep on drop elements from it list that do not appear in the predicate sequence. The head of the remaining list is the first matching result.
Obvious drawback here is that this indeed does works on infinite collections, but only if a match can be found, else we’re still in trouble.

guest
17 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Mohammed Berdai
12 years ago

Yep, I came to the same conclusion 🙂

Stuart Halloway
12 years ago

Glad you enjoyed the book. Is there any chance you can reformat the Clojure example to indent correctly? Also, the book includes JavaScript you can use to syntax-color Clojure code if you want.

Alexander Azarov
12 years ago

It is possible to simplify Scala version a little bit using one helper function:
def opt[T](x: Seq[T]): Seq[T] = if (x == null) List() else x
def indexed[T](coll: Seq[T]): Seq[(Int, T)] = {
val c = opt(coll)
List.range(0,c.length).zip(c.toList)
}
def indexedFilter[T](pred:Seq[T], coll:Seq[T]): Seq[Int] = for (p <- opt(pred); idxPair <- indexed(coll); if p == idxPair._2) yield idxPair._1
def indexOfAny[T](pred: Seq[T], coll:Seq[T]): Option[Int] = indexedFilter(pred, coll).firstOption
This compiles and runs under Scala 2.7.5. Scala 2.8 will hopefully provide object method Option.apply(x) which will test x (it’s marked as @experimental right now)

Asd
Asd
12 years ago

Clojure surely can get nulls from Java methods. Are they represented as nil? Is nil handled automatically when working with collections in clojure?
Personally I would omit the null checks in Scala. If somebody is passing nulls around it is their own fault if the get a NPE. In fact, they should get a NPE from any function that is not documented to handle nulls.

Asd
Asd
12 years ago

The Scala could definitely be shorter. I am omitting the null checks (you didn’t put any type checks in the clojure did you?):
def indexedFilter(pred:Seq[Any], coll:Seq[Any]) =
coll.toList.zipWithIndex.filter(p => pred.contains(p._1)).map(p => p._2)
def indexOfAny(pred:Seq[Any], coll:Seq[Any]) = indexedFilter(pred,coll) match {
case x::xs => Some(x)
case _ => None
}

Asd
Asd
12 years ago

D’oh, my Scala could be even shorter:
def indexOfAny(pred:Seq[Any], coll:Seq[Any]) = indexedFilter(pred,coll).firstOption

Cay Horstmann
12 years ago

This is an interesting exercise, but it might scare away Java refugees who think that everything in Scala must be rethought with a mess of predicates and zipped lists. This particular algorithm translates easily into something that a Java refugee can grasp pretty quickly:
def indexOfAny(str : String, searchChars : Array[Char]) =
if (str == null || searchChars == null) List() else
for { i <- 0 until str.length if (searchChars.contains(str(i))) } yield i
It collects all matching indices, not just the first one.

Eastsun
12 years ago

my scala solution(just a direct translation from the java version)
scala> def indexOfAny(str :String, seq :Array[Char]):Option[Int] = {
| if(str == null || seq == null) None
| else (0 until str.length) find {i => seq.contains(str(i)) }
| }
indexOfAny: (String,Array[Char])Option[Int]

Stuart Halloway
12 years ago

Cay, Eastsun,
The direct translations are concise, but they lack the generality of the Clojure version. How would you code a Scala version to handle an infinite collection?

Alexander Azarov
12 years ago

I guess you have a mistake in the update. When you don’t write “else”, “if” returns Unit.

Anders
Anders
12 years ago

I have read tutorial how to write code more functional, i like both Scala and Clojure, but to get developer hocked-on functional development it could be easy way to show how to do it, with the language they all ready use.
IBM have a good introduction.
Functional programming in the Java language
http://www.ibm.com/developerworks/java/library/j-fp.html
mvh Anders

Alexey Tarasevich
12 years ago

I think, while you are not using domain-specific macros your Clojure will not look remarkably better then Scala 🙂

Eddie Stanley
10 years ago

There is visibly a lot to identify about this. I feel you made some good points in features also.

trackback

[…] who follows programming languages will probably have noticed Scala cropping up a lot […]

Explore related posts