Blog

Google Code Jam 2009 – Qualification

04 Sep, 2009
Xebia Background Header Wave

Aliens sending messages, Water flowing over a map and finding the hidden Welcome message in a String… Yes, Google Code Jam has returned for the 2009 edition! I participated in the Qualification Round and managed to solve all but 1 input set….

Sounds like fun? It sure was! As I have been busy learning some Scala, I thought I’d try that and see how it worked out during the contest. I think it saved me at least some lines of code.
For the first problem, Google has intercepted some alien communications but the signal has weakened so much that some of the words are unclear. They now need our help to devise a clever algorithm that can tell them how many different words the signal can represent. For this they constructed the full dictionary of the alien language. As a sample they provided us with the following alien language which consists of 5 3-letter words, and 4 received signals.
3 5 4
abc
bca
dac
dbc
cba
(ab)(bc)(ca)
abc
(abc)(abc)(abc)
(zyx)bc

For my solution I decided to read the dictionary of all the words, and subsequently convert each of the signals to a regular expression by doing
[scala]
val pattern = "(ab)(bc)(ca)".replaceAll("\(", "[").replaceAll("\)", "]").r
[/scala]
This lead to the following code:
[scala]
object AlienLanguage {
def main(args: Array[String]) = {
if (args.length == 0) {
throw new IllegalArgumentException("Could not find input file in arguments")
}
val input = args(0)
val outputFile = new File((input substring(0, input lastIndexOf("."))) + ".out")
val reader = Source fromFile(input) getLines
val ar = reader nextIntArray
val (tokens, nrWords, nrProblems) = (ar(0), ar(1), ar(2))
val results = new Array [String] ( nrProblems )
val words = (1 to nrWords).map(x => reader.trimmedLine).toList
for (val i <- 1 to nrProblems) {
results(i – 1) = "Case #" + i + ": " + solveProblem(reader, words)
}
outputFile write results
}
def solveProblem(reader : Iterator[String], words : List[String]) : String = {
val pattern = reader.trimmedLine
val regex = pattern.replaceAll("\(", "[").replaceAll("\)", "]").r
matchWords(regex, words.head, words.tail).toString
}
def matchWords(regex : scala.util.matching.Regex, word: String, words : List[String]) : Int = {
words.isEmpty match {
case true => matchWord(regex, word)
case false => matchWord(regex, word) + matchWords(regex, words.head, words.tail)
}
}
def matchWord(regex : scala.util.matching.Regex, word: String) : Int = {
regex.findFirstIn(word) match {
case Some(x) => 1
case None => 0
}
}
}
[/scala]
The code uses two helper classes which use implicits to add methods to a File and to an Iterator[String]. These can be found in my GitHub repository (see bottom). This code was able to solve the large input set (max. 5000 15-letter words and 500 signals) within 2,5 seconds.
For the second problem we were given an elevation map for which we had to calculate drainage basins, based on a a set of rules:

  • From each cell water flows to at most one neighbour cell
  • Water always flows to the lowest neighbour, in case of a tie, water prefers North, West, East, South in this order
  • If no neighbour has a lower altitude, water stays in this cell, we call this a sink
  • drainage systems are identified by a single unique lowercase letter, such that when all rows are concatenated, the string is lexographically smallest.
    A sample input with 1 5×5 map
    1
    5 5
    1 2 3 4 5
    2 9 3 9 6
    3 3 0 8 7
    4 9 8 9 8
    5 6 7 8 9

    My solution consists of first building the map and then converting that to a "flow map", in which every cell points to the index of the cell to which the cell drains. This flow map can then easily be transformed to a "basin map" by starting in the top-left corner and going from left-to-right and top-to-bottom through the map, assigning a unique letter to the drainage system. The code is:
    [scala]
    object Watersheds extends CodeJam {
    def solveProblem(reader : Iterator[String]) : String = {
    val ar = reader.nextIntArray
    val (h, w) = (ar(0), ar(1))
    val theMap = (1 to h).map(x => reader.nextIntArray).toArray
    val shed = new Shed(h, w, theMap)
    val basins : Array[String] = shed.solve
    printRecursive(basins, w)
    }
    def printRecursive(basins : Array[String], length : Int) : String = {
    basins.size match {
    case 0 => ""
    case => "\n" + basins.take(length).reduceLeft(+" "+) + printRecursive(basins.drop(length), length)
    }
    }
    }
    class Shed(val h : Int, val w : Int, val theMap : Array[Array[Int]]) {
    val flowMap = (for (hPos <- 0 until h; wPos <- 0 until w) yield determineFlowTo(hPos, wPos)).toArray
    val basins : Array[Int] = Array.make(h * w, -1)
    def determineFlowTo(hPos : Int, wPos : Int) : Int = {
    val north = determineAltitude(hPos – 1, wPos)
    val west = determineAltitude(hPos, wPos – 1)
    val east = determineAltitude(hPos, wPos + 1)
    val south = determineAltitude(hPos + 1, wPos)
    val me = determineAltitude(hPos, wPos)
    val minAltitude = List(me, north, west, east, south).sort(
    <_).head
    if (me > north && north == minAltitude) {
    convertToIndex(hPos – 1, wPos)
    } else if (me > west && west == minAltitude) {
    convertToIndex(hPos, wPos – 1)
    } else if (me > east && east == minAltitude) {
    convertToIndex(hPos, wPos + 1)
    } else if (me > south && south == minAltitude) {
    convertToIndex(hPos + 1, wPos)
    } else {
    convertToIndex(hPos, wPos)
    }
    }
    def determineAltitude(hPos : Int, wPos : Int) : Int = {
    if (hPos == h || hPos < 0) {
    10001 // more than max altitude in large map because out of range
    } else if (wPos == w || wPos < 0) {
    10001 // more than max altitude in large map because out of range
    } else {
    theMap(hPos)(wPos)
    }
    }
    def convertToIndex(hPos : Int, wPos : Int) = {
    hPos w + wPos
    }
    /
  • Solve methods
    /
    def solve() : Array[String] = {
    val basinString = "abcdefghijklmnopqrstuvwxyz".split("")
    var currentHighestBasin = -1
    (0 until h
    w).foreach(i => if (basins(i) == -1) currentHighestBasin = fillFlow(i, currentHighestBasin))
    basins.map(i => basinString(i + 1)).toArray
    }
    def fillFlow(index : Int, currentHighestBasin : Int) : Int = {
    val flowTo = flowMap(index)
    if (basins(flowTo) != -1) {
    basins(index) = basins(flowTo)
    currentHighestBasin
    } else {
    val basinValue = findBasinValue(index)
    if (basinValue != -1) {
    fillTheFlow(index, basinValue)
    currentHighestBasin
    } else {
    fillTheFlow(index, currentHighestBasin + 1)
    currentHighestBasin + 1
    }
    }
    }
    def findBasinValue(index : Int) : Int = {
    flowMap(index) match {
    case MARKDOWN_HASH6a992d5529f459a44fee58c733255e86MARKDOWN<em>HASH => basins(index)
    case => findBasinValue(flowMap(index))
    }
    }
    def fillTheFlow(index : Int, basin : Int) {
    basins(index) = basin
    val flowTo = flowMap(index)
    if (flowTo != index && basins(flowTo) == -1) fillTheFlow(flowTo, basin)
    }
    }
    [/scala]
    Note that the CodeJam trait which is extended provides a main method, much alike the one from the first problem. The output that is generated by the code looks for the sample input given above as
    Case #1:
    a a a a a
    a a b b a
    a b b b a
    a b b b a
    a a a a a

    The program was able to solve both the large and small datasets in under 1 second.
    The last problem was (at least for me) a hard nut to crack, though the small input was easy to solve, my program didn’t finish the large input within the scheduled 8 minutes. The problem statement was as follows:
    Given a text string, you are to determine how many times the string "welcome to code jam" appears as a sub-sequence of that string. In other words, find a sequence s of increasing indices into the input string such that the concatenation of input[s[0]], input[s[1]], …, input[s[18]] is the string "welcome to code jam".
    My approach consisted of the following steps:

      • first filter out all characters from the string that do not belong to the “welcome to code jam” message
      • do a simple kind of run-length-encoding trick to decrease the length of the string even further (while maintaining the information)
      • loop over the string and calculate the number of occurrences
          [scala]

     

        • object WelcomeToCodeJam extends CodeJam {

     

        • def solveProblem(reader : Iterator[String]) : String = {

     

        • val expr = “welcome to code jam”

     

        • val line = reader.trimmedLine

     

        • // first filter all characters not occurring in pattern

     

        • val filteredLine = makeHisto(line.toList.filter(x => expr.contains(x)))

     

        • val exprList = expr.toList

     

        • (solve(filteredLine, exprList) % 10000) formatted (“%04d”)

     

        • }

     

        • def makeHisto(line : List[Char]) : List[(Char, Long)] = {

     

        • if (line.isEmpty) {

     

        • List[(Char, Long)]()

     

        • } else {

     

        • val h = histoLetter(line.tail, line.head)

     

        • val newLine = line.drop(h._2.toInt)

     

        • newLine.isEmpty match {

     

        • case true => List(h)

     

        • case false => List(h) ++ makeHisto(newLine)

     

        • }

     

        • }

     

        • }

     

        • def histoLetter(line : List[Char], c : Char) : (Char, Long) = {

     

        • (c, (line takeWhile (_ == c)).size + 1)

     

        • }

     

        • def solve(line : List[(Char, Long)], expr : List[Char]) : Long = {

     

        • var count : Long = 0L

     

        • if (!expr.isEmpty) {

     

        • var restLine = findNext(line, expr)

     

        • while (!restLine.isEmpty) {

     

        • count += (restLine.head._2 * solve(restLine.tail, expr.tail))

     

        • restLine = findNext(restLine.tail, expr)

     

        • }

     

        • } else {

     

        • count = 1

     

        • }

     

        • count

     

        • }

     

        • def findNext(line : List[(Char, Long)], expr : List[Char]) = {

     

        • line.dropWhile(_._1 != expr.head)

     

        • }

     

        • }

     

        • [/scala]

     

        • Now, given the fact that some people actually managed to do this in a very small amount of time, my guess is that I completely missed something which is why it couldn’t finish the large input set (lines of 500 characters). The small input set was correctly generated however. I’ve been busy this last day refactoring the third program to improve its speed, the latest version is the one shown above, who can tell me what I missed, or how it can be done faster?

     

        • For the code of the 2009 edition, and some of the problems of the 2008 edition, you can also visit my GitHub repository at:

    https://github.com/hierynomus

      • . Who else competed, or who has a better solution in Scala?
Questions?

Get in touch with us to learn more about the subject and related solutions

Explore related posts