Better Shell Scripting with Scala-CLI

12 Feb, 2024

The need for "glue" code is a fact of life for programmers. Sooner or later, that app you’ve crafted, that shining monument to software engineering, will need to be sullied by actually running it on a real system.

Perhaps you need a boot script, or you need to build a container. Maybe it’s a tool that needs to be invoked as part of some larger orchestration process.

Whatever the reason, you’ll need some glue code to make the magic happen. That probably means a shell script, and for most people, that probably means using Bash (or something like it).

Bash, coupled with the usual suite of Linux command line tools, is fantastically powerful and adaptable. It is also rather fiddly and esoteric.

In this post, we’ll look at an alternative approach using Scala and Scala-CLI.

A Note on Style

Although I enjoy functional programming as much as the next Scala dev, I’m not worrying about it too much for the purposes of this post. FP scripting is perfectly possible, but it’s not the objective of this write-up.

When trying to suggest that Scala is a viable alternative to Bash, we have to ask why Bash is still so popular.

We can’t do much to compete with Bash’s ubiquity and general availability because we will need to install Scala / Scala-CLI wherever we need it.

I believe Bash’s main appeal is perceived development speed. It feels like it’s quick to – forgive me – bash out a Bash script, and get it up and running. Nonetheless, here I think we can compete.

True, Scala will always carry a compile-time cost and a small start-up cost (unless you export to a native GraalVM image!), but with the right choice of libraries, I posit that you can write a correct script more quickly in Scala with Scala-CLI than you can in Bash, particularly as the size and complexity of the script increase.

For this reason, I’ve opted to use ‘The Singaporean Stack’ / the Haoyi Li ecosystem, which includes my all-time favorite Scala library: OS-lib. These tools give us an interface comparable in its direct simplicity to command line tools, but with significantly more power.

Follow-Along Setup

You will need to have Scala-CLI installed to follow along with the Scala sections of this post. In the Bash section, we’re using curl and jq.

Motivating Scenario

Let’s take a look at a simple fictional use case.

We need a script that will grab some data from the Star Wars API and save it.

Our input will be a file called planets.txt containing one URL per line pointing to data on planets in the Star Wars universe. Here is our sample data:

https://swapi.dev/api/planets/1/
https://swapi.dev/api/planets/2/
https://swapi.dev/api/planets/3/

Our script will:

  1. Ensure a suitable output directory is available to use
  2. Read the input file
  3. For each URL in the file:
    1. Download the JSON data
    2. Extract the name of the planet
    3. Save a JSON file using the planet’s name for the filename and the raw JSON data as the contents.

Bashing Something Together

Below is a first attempt, coded up in 28 lines of pretty concise yet readable-looking Bash, liberally commented and spaced out.

#!/bin/bash

set -e

# Create, or if it exists already, clear and recreate the output dir
DESTINATION="./output"

if [ -d "$DESTINATION" ]; then
  rm -rf "$DESTINATION"
fi

mkdir "$DESTINATION"

# Read the file
while read -r line
do
  # Request the data
  PLANET=$(curl -s "$line") # -s tells curl to work silently

  # Extract the name
  NAME=$(echo "$PLANET" | jq -r .name) # -r discards the quotes, so you get Yavin, not "Yavin".

  # Output the data to a json file named after the planet in the output directory
  echo "$PLANET" > "$DESTINATION/$NAME.json"

  # Print the name for some user feedback
  echo "$NAME"
done < "planets.txt"

To run this script, save it to a file called demo.sh in the same directory as your planets.txt file, and run it from your command line with: bash demo.sh

At a glance, this looks pretty good. Ok, I had to google Bash’s if statement syntax for the millionth time in my career, but the end result is actually quite readable.

There are a couple of problems, though:

  1. There’s a scary rm -rf in there that I always have to think about very carefully, to avoid accidentally wiping all my data.
  2. I’m not going to have any confidence that this works at all until I run it.
  3. There is no error handling other than the set -e flag, and adding error mechanisms will increase the script’s complexity quite a bit.

Scala for Scripting

When we think of Scala, we tend to think of Big Data, backend services, and slow JVM start-up times. In fact, Scala can do basically anything, from native applications to web apps to (yes) CLI tools and scripts.

Let’s try porting our Bash script to Scala and running it via Scala-CLI.

//> using scala 3.3.1

//> using dep com.lihaoyi::os-lib:0.9.3
//> using dep com.lihaoyi::requests:0.8.0
//> using dep com.lihaoyi::upickle:3.1.4

// Create, or if it exists already, clear and recreate the output dir
val dest = os.pwd / "output"

if os.exists(dest) then os.remove.all(dest)

os.makeDir(dest)

// Read the file
os.read
  .lines(os.pwd / "planets.txt")
  .foreach: url =>
    // Request the data
    val planet = requests.get(url).text()

    // Extract the name
    val name = ujson.read(planet)("name").str

    // Output the data to a json file named after the planet in the output directory
    os.write(dest / s"$name.json", planet)

    // Print the name for some user feedback
    println(name)

To run this script, save it to a file called demo.sc in the same directory as your planets.txt file, and run it from your command line with: scala-cli demo.sc.

This is a simple Scala-CLI compatible script. Since the file is an .sc file, all the statements are allowed to sit at the top level. If I’d used a .scala file instead, I’d have needed to provide a main method, an object extending App, or similar.

At the top of the script, you can see some statements beginning with //>. These are called ‘directives’ and are used to give information about your build to Scala-CLI. Here, they specify the Scala version we’d like, along with our library dependencies. If you have a multi-file project, keeping all your directives in a single file is considered good practice.
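To make the .sc versus .scala distinction concrete, here is a minimal sketch of how similar code might look in a .scala file. The file name demo.scala and the helper outputName are made up for illustration. The only point is that executable statements move inside an entry point, in this case a Scala 3 @main method, while the directives stay the same.

```scala
//> using scala 3.3.1

// Hypothetical helper, not part of the original script:
// builds the output filename for a planet.
def outputName(planet: String): String = s"$planet.json"

// In a .scala file, statements cannot sit at the top level,
// so they go inside an entry point such as an @main method.
@main def demo(): Unit =
  println(outputName("Tatooine"))
```

Saved as demo.scala, this should run with: scala-cli demo.scala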

Comparing the Solutions

The Scala script is more or less a direct translation of the Bash version. It’s at least equally readable, the comments and spacing are the same, and it even has the same number of lines of code as the original!

Did we reap any benefits from this? Well, I’m happy to report three low-hanging improvements already:

  1. The big scary rm is gone!
  2. The program has type-checked, so I can’t have made many obvious blunders.
  3. My editor was able to help me with API information and docs during development.

Replacing Bash

We’re already doing better than Bash in some regards, but it isn’t a decisive victory for Scala just yet.

What can we do to convince you to make the change?

There are a host of benefits to switching from Bash to Scala and Scala-CLI, not least of which are solid solutions for traditionally tricky issues in Bash, like handling special/escaped characters and excellent support for unit testing.
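As a rough illustration of the special-characters point, here is a sketch using OS-lib's os.proc. It assumes a Unix-like system with echo on the PATH, and the awkward filename is made up. Because each argument is handed to the process directly rather than through a shell, there is nothing to quote or escape.

```scala
//> using scala 3.3.1
//> using dep com.lihaoyi::os-lib:0.9.3

// A made-up filename containing spaces, quotes, and a dollar sign.
val awkward = """my "planet" $HOME file.txt"""

// Each argument is passed to the process as-is: no shell is involved,
// so there is no word splitting, globbing, or variable expansion.
val echoed = os.proc("echo", awkward).call().out.text().trim

println(echoed)
```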

That said, the two points that I consider to have the most immediate impact are:

  1. Error handling
  2. Expressiveness and scalability

Error handling

Let’s look at another version of that script:

//> using scala 3.3.1

//> using dep com.lihaoyi::os-lib:0.9.3
//> using dep com.lihaoyi::requests:0.8.0
//> using dep com.lihaoyi::upickle:3.1.4

// Create, or if it exists already, clear and recreate the output dir
val dest = os.pwd / "output"

if os.exists(dest) then os.remove.all(dest)

os.makeDir(dest)

val filename = "planets.txt"

if os.exists(os.pwd / filename) then
  // Read the file
  os.read.lines
    .stream(os.pwd / filename) // Stream in case it's a lot of data.
    .foreach: url =>
      // Request the data (check = false returns non-2xx responses instead of throwing)
      val response = requests.get(url, check = false)
      val planet   = response.text()

      // Handle failed requests
      if response.statusCode != 200 then println(s"Error getting planet at url '$url', got '${response.statusCode}'.")
      else
        // Extract the name and handle a missing or non-string field
        ujson.read(planet).obj.get("name").flatMap(_.strOpt) match
          case None =>
            println(s"Unnamed planet at url '$url'!")

          case Some(name) =>
            // Output the data to a json file named after the planet in the output directory
            os.write.over(
              dest / s"$name.json",
              planet
            ) // Write over existing files, instead of erroring.

            // Print the name for some user feedback
            println(name)
else println(s"Could not find the expected file '$filename', in the working directory.")

This version is a little bit longer, but notice that we have a number of simple error handling improvements that would have been fiddly to deal with in Bash.

Streaming data

By switching from:

os.read.lines(os.pwd / filename)

To:

os.read.lines.stream(os.pwd / filename)

We are now streaming the data one line at a time, so we would not hold the whole file in memory if the data set were very large.
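Here is a small sketch of the difference, using a made-up temporary file rather than planets.txt. The eager call returns every line at once, while the streamed version produces lines on demand, so take(3) can stop after the third line.

```scala
//> using scala 3.3.1
//> using dep com.lihaoyi::os-lib:0.9.3

// Made-up input: a temporary file with 1,000 numbered lines.
val file = os.temp.dir() / "lines.txt"
os.write(file, (1 to 1000).map(i => s"line $i").mkString("\n"))

// Eager: every line is read into memory before we can use any of them.
val all = os.read.lines(file)

// Streaming: lines are produced on demand, and take(3) stops early.
val firstThree = os.read.lines.stream(file).take(3).toList

println(all.length)
println(firstThree)
```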

Write over or error

By switching from:

os.write(dest / s"$name.json", planet)

To:

os.write.over(dest / s"$name.json", planet)

We are now explicitly saying that it’s ok to overwrite existing files. In the former version, we’d get an exception if a file of the same name was encountered.

This means we are replicating Bash’s behavior, but we’ve had to be specific in making that design decision.
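The difference is easy to see in isolation. In this sketch the target file lives in a throwaway temporary directory and the contents are placeholders. The second plain os.write is wrapped in Try so that the failure can be inspected instead of stopping the script.

```scala
//> using scala 3.3.1
//> using dep com.lihaoyi::os-lib:0.9.3

import scala.util.Try

// Made-up target file in a fresh temporary directory.
val target = os.temp.dir() / "Tatooine.json"

os.write(target, "first")                    // creates the file
val second = Try(os.write(target, "second")) // fails: the file already exists
os.write.over(target, "third")               // explicitly replaces the contents

println(second.isFailure)
println(os.read(target))
```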

Handle response status codes

Our code checks that we get a 200 back from the server before continuing with the processing and reports if there is a problem.
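One detail worth knowing: by default, requests throws an exception for non-2xx responses, and passing check = false hands the response back so the status code can be inspected. The sketch below wraps that in a hypothetical fetch helper (not part of the script above) that also catches failures such as a malformed URL or an unreachable host, and reports everything as an Either.

```scala
//> using scala 3.3.1
//> using dep com.lihaoyi::requests:0.8.0

import scala.util.{Failure, Success, Try}

// Hypothetical helper: returns the body on success,
// or a message describing what went wrong.
def fetch(url: String): Either[String, String] =
  // check = false asks requests to return non-2xx responses instead of throwing.
  Try(requests.get(url, check = false)) match
    case Failure(e) =>
      Left(s"Request to '$url' failed: ${e.getMessage}")
    case Success(response) if response.statusCode != 200 =>
      Left(s"Error getting planet at url '$url', got '${response.statusCode}'.")
    case Success(response) =>
      Right(response.text())
```

A caller could then match on the result of fetch(url) and only carry on to the JSON handling in the Right case.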

Missing field handling

My personal favorite: What happens in the Bash version if the name field is missing? In fact, you get a null as a String, and the script carries on regardless!

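In the Scala version, we can ensure that doesn’t happen with everyone’s favorite null-aware type, Option. Looking the key up with obj.get covers a missing field, and strOpt covers a value that isn’t a string:

ujson.read(planet).obj.get("name").flatMap(_.strOpt) match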
  case None       => ???
  case Some(name) => ???
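To see how the cases play out, here is a small sketch with made-up JSON payloads. The planetName helper is hypothetical, and it assumes we want both an absent key and a non-string value to come back as None.

```scala
//> using scala 3.3.1
//> using dep com.lihaoyi::upickle:3.1.4

// Hypothetical helper: Some(name) only when "name" exists and is a string.
def planetName(json: String): Option[String] =
  ujson.read(json).obj.get("name").flatMap(_.strOpt)

val present = planetName("""{"name": "Tatooine"}""")  // name is a string
val isNull  = planetName("""{"name": null}""")        // name is present but null
val missing = planetName("""{"diameter": "10465"}""") // name is absent

println(present) // Some(Tatooine)
println(isNull)  // None
println(missing) // None
```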

Expressiveness and Scalability

There is a long history of people using other programming languages to replace shell scripts; Perl is a great example. The main motivation is to improve the programmer’s ability to express what they want to achieve and to use familiar tools to do so.

Scala is the "scale-able" language, and this is rarely more evident than in the context of scripting.

If your requirements are simple and direct, you can quickly write a few lines of Bash-like procedural, side-effecting, mutable, low-abstraction code to get the job done.

As your script’s needs grow, Scala grows with you as you introduce simple abstractions, functions, and unit tests, giving Python-like results built on Scala’s rich, robust, and relatively straightforward library/package ecosystem.

For more grand, complicated, and nuanced challenges, Scala’s immutability by default, effect systems, and powerful functional constructs will ensure that you are always working at the most productive level of abstraction for the problem at hand, all without ever leaving Scala-CLI.

Bash can’t do that. Not many languages can, really.
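As one possible illustration of that middle step, here is a sketch of the name-extraction logic pulled out into a function. Everything in it is hypothetical: the planetNames function, the stub, and the shortened URLs are made up. The download step is passed in as a parameter, so the logic can be checked without touching the network.

```scala
//> using scala 3.3.1
//> using dep com.lihaoyi::upickle:3.1.4

// Hypothetical refactoring: the download step is a parameter, so the
// name-extraction logic can be checked without touching the network.
def planetNames(urls: Seq[String], fetch: String => String): Seq[Either[String, String]] =
  urls.map: url =>
    val name = ujson.read(fetch(url)).obj.get("name").flatMap(_.strOpt)
    name.toRight(s"Unnamed planet at url '$url'!")

// A stub standing in for a real HTTP call; URLs and payloads are made up.
def stub(url: String): String =
  if url.endsWith("/1/") then """{"name": "Tatooine"}"""
  else """{"diameter": "10465"}"""

val results = planetNames(Seq("planets/1/", "planets/2/"), stub)

// A lightweight check; a test framework could replace this as the script grows.
assert(results == Seq(Right("Tatooine"), Left("Unnamed planet at url 'planets/2/'!")))
```

In the real script, the fetch argument could simply be url => requests.get(url).text().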

Conclusion

In this post, I’ve used Scala-CLI to allow me to utilize a number of libraries from the Haoyi Li ecosystem, which, with their no-nonsense APIs and low dependencies, I believe are a great fit for the scripting use case.

While my library choices are a matter of personal taste, I hope I’ve managed to convince you that Scala and Scala-CLI are more than worthy replacements for your usual go-to scripting language.

We haven’t come close to scratching the surface of what Scala-CLI can do in this write-up, but I hope you’re feeling inspired to give it a go!

Happy Scala Scripting!

Dave Smith
Dave accidentally became a backend Scala developer in 2012 and has been trying to return to the frontend ever since. By insisting on dragging Scala and Scala.js with him, success levels in this endeavour have been dubious at best. He's best known as the maintainer of Indigo, a Scala.js game engine, and Tyrian, an Elm-inspired Scala.js web framework.