The need for "glue" code is a fact of life for programmers. Sooner or later, that app you’ve crafted, that shining monument to software engineering, will need to be sullied by actually running it on a real system.
Perhaps you need a boot script, or you need to build a container. Maybe it’s a tool that needs to be invoked as part of some larger orchestration process.
Whatever the reason, you’ll need some glue code to make the magic happen. That probably means a shell script, and for most people, that probably means using Bash (or something like it).
Bash, coupled with the usual suite of Linux command line tools, is fantastically powerful and adaptable. It is also rather fiddly and esoteric.
In this post, we’ll look at an alternative approach using Scala and Scala-CLI.
A Note on Style
Although I enjoy functional programming as much as the next Scala dev, I’m not worrying about it too much for the purposes of this post. FP scripting is perfectly possible, but it’s not the objective of this write-up.
When trying to suggest that Scala is a viable alternative to Bash, we have to ask why Bash is still so popular.
We can’t do much to compete with Bash’s ubiquity and general availability because we will need to install Scala / Scala-CLI wherever we need it.
I believe Bash’s main appeal is perceived development speed. It feels like it’s quick to – forgive me – bash out a Bash script, and get it up and running. Nonetheless, here I think we can compete.
True, Scala is always going to have a compile-time cost and a small start-up cost (unless you export to a native GraalVM image!), but with the right choice of libraries, I posit that you can write a correct script more quickly in Scala with Scala-CLI than you can in Bash, particularly as the size and complexity of the script increase.
For this reason, I’ve opted to use ‘The Singaporean Stack’ / the Haoyi Li ecosystem, which includes my all-time favorite Scala library: OS-lib. These tools give us interfaces comparable in their direct simplicity to command line tools, but with significantly more power.
Follow along setup
You will need to have Scala-CLI installed to follow along with the Scala sections of this post. In the Bash section, we’re using curl and jq.
Motivating Scenario
Let’s take a look at a simple fictional use case.
We need a script that will grab some data from the Star Wars API and save it.
Our input will be a file called planets.txt containing one URL per line pointing to data on planets in the Star Wars universe. Here is our sample data:
https://swapi.dev/api/planets/1/
https://swapi.dev/api/planets/2/
https://swapi.dev/api/planets/3/
Our script will:
- Ensure a suitable output directory is available to use
- Read the input file
- For each URL in the file:
  - Download the JSON data
  - Extract the name of the planet
  - Save a JSON file using the planet’s name for the filename and the raw JSON data as the contents.
Bashing Something Together
Below is a first attempt, coded up in 28 lines of pretty concise yet readable-looking Bash, liberally commented and spaced out.
#!/bin/bash
set -e
# Create, or if it exists already, clear and recreate the output dir
DESTINATION="./output"
if [ -d "$DESTINATION" ]; then
  rm -rf "$DESTINATION"
fi
mkdir "$DESTINATION"
# Read the file
while read -r line
do
  # Request the data
  PLANET=$(curl -s "$line") # -s tells curl to work silently
  # Extract the name
  NAME=$(echo "$PLANET" | jq -r .name) # -r discards the quotes, so you get Yavin, not "Yavin".
  # Output the data to a json file named after the planet in the output directory
  echo "$PLANET" > "$DESTINATION/$NAME.json"
  # Print the name for some user feedback
  echo "$NAME"
done < "planets.txt"
To run this script, save it to a file called demo.sh in the same directory as your planets.txt file, and run it from your command line with: bash demo.sh
At a glance, this looks pretty good. Ok, I had to google Bash’s if statement syntax for the millionth time in my career, but the end result is actually quite readable.
There are a couple of problems, though:
- There’s a scary rm -rf in there that I always have to think about very carefully, to avoid accidentally wiping all my data.
- I’m not going to have any confidence that this works at all until I run it.
- There is no error handling other than the set -e flag, and adding error mechanisms will increase the script’s complexity quite a bit.
Scala for Scripting
When we think of Scala, we tend to think of Big Data, backend services, and slow JVM start-up times. In fact, Scala can do basically anything, from native applications to web apps to (yes) CLI tools and scripts.
Let’s try porting our Bash script to Scala and running it via Scala-CLI.
//> using scala 3.3.1
//> using dep com.lihaoyi::os-lib:0.9.3
//> using dep com.lihaoyi::requests:0.8.0
//> using dep com.lihaoyi::upickle:3.1.4
// Create, or if it exists already, clear and recreate the output dir
val dest = os.pwd / "output"
if os.exists(dest) then os.remove.all(dest)
os.makeDir(dest)
// Read the file
os.read
  .lines(os.pwd / "planets.txt")
  .foreach: url =>
    // Request the data
    val planet = requests.get(url).text()
    // Extract the name
    val name = ujson.read(planet)("name").str
    // Output the data to a json file named after the planet in the output directory
    os.write(dest / s"$name.json", planet)
    // Print the name for some user feedback
    println(name)
To run this script, save it to a file called demo.sc in the same directory as your planets.txt file, and run it from your command line with: scala-cli demo.sc
This is a simple Scala-CLI compatible script. Since the file is an .sc file, all the statements are allowed to sit at the top level. If I’d used a .scala file instead, I’d have needed to provide a main method or had an object extending App or similar.
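For comparison, here is a minimal sketch of what the .scala flavour might look like. The file name Demo.scala, the outputDir helper, and the demo entry point are all made up for illustration; the point is only that the side-effecting statements have to live inside an explicit @main method.

```scala
//> using scala 3.3.1
//> using dep com.lihaoyi::os-lib:0.9.3

// Hypothetical file: Demo.scala
// Definitions may sit at the top level of a .scala file in Scala 3,
// but statements such as println(...) must live inside an entry point.
def outputDir(base: os.Path): os.Path = base / "output"

@main def demo(): Unit =
  val dest = outputDir(os.pwd)
  println(s"Output would be written to: $dest")
```

You would run this with scala-cli Demo.scala, the same way as the .sc version.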
At the top of the script, you can see some statements beginning with //>. These are called ‘directives’ and are used to give information about your build to Scala-CLI. Here, they specify the Scala version we’d like, along with our library dependencies. If you have a multi-file project, keeping all your directives in a single file is considered good practice.
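As a rough sketch of that multi-file layout, the directives from this post could live on their own in one file that sits next to your other sources. The name project.scala is an assumption on my part (it is a common choice, but any single source file would do).

```scala
//> using scala 3.3.1
//> using dep com.lihaoyi::os-lib:0.9.3
//> using dep com.lihaoyi::requests:0.8.0
//> using dep com.lihaoyi::upickle:3.1.4

// Hypothetical file: project.scala
// It contains no code at all, only the build directives above;
// the other files in the same directory then need no directives of their own.
```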
Comparing the Solutions
The Scala script is more or less a direct translation of the Bash version. It’s at least equally readable, the comments and spacing are the same, and it even has the same number of lines of code as the original!
Did we reap any benefits from this? Well, I’m happy to report three low-hanging improvements already:
- The big scary rm is gone!
- The program has type-checked, so I can’t have made many obvious blunders.
- My editor was able to help me with API information and docs during development.
Replacing Bash
We’re already doing better than Bash in some regards, but it isn’t a decisive victory for Scala just yet.
What can we do to convince you to make the change?
There are a host of benefits to switching from Bash to Scala and Scala-CLI, not least of which are solid solutions for traditionally tricky issues in Bash, such as handling special/escaped characters, and excellent support for unit testing.
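To give a flavour of the special-characters point, here is a small sketch. It assumes a Unix-like system with an echo binary on the PATH, and the awkward string is invented for the example. OS-lib hands each argument straight to the process without going through a shell, so there is nothing to quote or escape.

```scala
//> using scala 3.3.1
//> using dep com.lihaoyi::os-lib:0.9.3

// A deliberately awkward value: spaces, quotes, a glob, and a dollar sign.
val awkward = """Yavin IV's "moon" * $HOME"""

// Each argument is passed to the process as-is. No shell is involved,
// so the value is not split on spaces, glob-expanded, or variable-expanded.
val echoed = os.proc("echo", awkward).call().out.trim()

println(echoed)
```

In Bash, getting that same value through unharmed depends on remembering to quote it correctly at every step.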
That said, the two points that I consider to have the most immediate impact are:
- Error handling
- Expressiveness and scalability
Error handling
Let’s look at another version of that script:
//> using scala 3.3.1
//> using dep com.lihaoyi::os-lib:0.9.3
//> using dep com.lihaoyi::requests:0.8.0
//> using dep com.lihaoyi::upickle:3.1.4
// Create, or if it exists already, clear and recreate the output dir
val dest = os.pwd / "output"
if os.exists(dest) then os.remove.all(dest)
os.makeDir(dest)
val filename = "planets.txt"
if os.exists(os.pwd / filename) then
  // Read the file
  os.read.lines
    .stream(os.pwd / filename) // Stream in case it's a lot of data.
    .foreach: url =>
      // Request the data
      // check = false: return non-2xx responses instead of throwing an exception
      val response = requests.get(url, check = false)
      val planet = response.text()
      // Handle failed requests
      if response.statusCode != 200 then println(s"Error getting planet at url '$url', got '${response.statusCode}'.")
      else
        // Extract the name and handle the missing field
        // (.obj.get returns None for an absent key, where ("name") would throw)
        ujson.read(planet).obj.get("name").flatMap(_.strOpt) match
          case None =>
            println(s"Unnamed planet at url '$url'!")
          case Some(name) =>
            // Output the data to a json file named after the planet in the output directory
            os.write.over(
              dest / s"$name.json",
              planet
            ) // Write over existing files, instead of erroring.
            // Print the name for some user feedback
            println(name)
else println(s"Could not find the expected file '$filename', in the working directory.")
This version is a little bit longer, but notice that we have a number of simple error handling improvements that would have been fiddly to deal with in Bash.
Streaming data
By switching from:
os.read.lines(os.pwd / filename)
To:
os.read.lines.stream(os.pwd / filename)
We are now streaming the data one line at a time, so memory use stays modest even if the data set is very large.
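A quick way to see the streaming behaviour is sketched below. The temp file and its generated contents are stand-ins for a large planets.txt, not part of the original script.

```scala
//> using scala 3.3.1
//> using dep com.lihaoyi::os-lib:0.9.3

// Hypothetical input: a temp file with many lines, standing in for a large planets.txt.
val input = os.temp(
  contents = (1 to 10000).map(i => s"https://swapi.dev/api/planets/$i/").mkString("\n")
)

// The stream hands lines over on demand, so take(3) stops after three lines
// instead of loading the whole file into memory first.
val firstThree = os.read.lines.stream(input).take(3).toList

firstThree.foreach(println)
```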
Write over or error
By switching from:
os.write(dest / s"$name.json", planet)
To:
os.write.over(dest / s"$name.json", planet)
We are now explicitly saying that it’s ok to overwrite existing files. In the former version, we’d get an exception if a file of the same name was encountered.
This means we are replicating Bash’s behavior, but we’ve had to be specific in making that design decision.
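Here is a small sketch of the difference, run against a throwaway temp directory so that no real data is touched. The file name and contents are invented for the example.

```scala
//> using scala 3.3.1
//> using dep com.lihaoyi::os-lib:0.9.3

import scala.util.Try

// Hypothetical scratch directory, so the example does not touch real data.
val dir = os.temp.dir()
val target = dir / "Tatooine.json"

os.write(target, "first")                          // creates the file
val secondWrite = Try(os.write(target, "second"))  // fails: the file already exists
os.write.over(target, "third")                     // explicitly replaces the contents

println(secondWrite.isFailure)
println(os.read(target))
```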
Handle response status codes
Our code checks that we get a 200 back from the server before continuing with the processing and reports if there is a problem.
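One way to package that check up is sketched below. The bodyOrError and fetch helpers are hypothetical names of mine. One thing to be aware of: as far as I know, the requests library throws on non-2xx responses by default, so the sketch passes check = false to get the response back and inspect the status code itself. Connection failures still throw, hence the Try.

```scala
//> using scala 3.3.1
//> using dep com.lihaoyi::requests:0.8.0

import scala.util.{Failure, Success, Try}

// Hypothetical helper: keep the body on a 200, otherwise describe the problem.
def bodyOrError(url: String, statusCode: Int, body: String): Either[String, String] =
  if statusCode == 200 then Right(body)
  else Left(s"Error getting planet at url '$url', got '$statusCode'.")

// Hypothetical helper: perform the request without letting bad statuses
// or connection problems escape as exceptions.
def fetch(url: String): Either[String, String] =
  Try(requests.get(url, check = false)) match
    case Failure(e) => Left(s"Request to '$url' failed: ${e.getMessage}")
    case Success(r) => bodyOrError(url, r.statusCode, r.text())
```

Returning an Either keeps the reporting decision (print, log, or abort) with the caller, which is handy once the script grows beyond a single loop.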
Missing field handling
My personal favorite: What happens in the Bash version if the name field is missing? In fact, you get a null as a String, and the script carries on regardless!
In the Scala version, we can ensure that doesn’t happen with everyone’s favorite null-aware type, Option. Looking the key up with .obj.get gives us None when the field is absent, and strOpt gives us None when the value isn’t a string (a JSON null, for example):
ujson.read(planet).obj.get("name").flatMap(_.strOpt) match
  case None => ???
  case Some(name) => ???
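To see how an Option-based lookup behaves, here is a minimal sketch covering three cases: a proper name, a JSON null, and no name field at all. The planetName helper and the sample JSON are made up for illustration. The helper goes through .obj.get because, as far as I can tell, indexing a ujson object with ("name") throws when the key is absent rather than handing back something you can call strOpt on.

```scala
//> using scala 3.3.1
//> using dep com.lihaoyi::upickle:3.1.4

// Hypothetical helper: the planet's name, if the field is present and is a string.
def planetName(json: String): Option[String] =
  ujson.read(json).obj.get("name").flatMap(_.strOpt)

println(planetName("""{"name": "Tatooine"}"""))  // a usable name
println(planetName("""{"name": null}"""))        // field present, but not a string
println(planetName("""{"climate": "arid"}"""))   // field missing entirely
```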
Expressiveness and Scalability
There is a long history of people using other programming languages to replace shell scripts; Perl is a great example. The main motivation is to improve the programmer’s ability to express what they want to achieve and to use familiar tools to do so.
Scala is the "scale-able" language, and this is rarely more evident than in the context of scripting.
If your requirements are simple and direct, you can quickly write a few lines of Bash-like procedural, side-effecting, mutable, low-abstraction code to get the job done.
As your script’s needs grow, Scala grows with you as you introduce simple abstractions, functions, and unit tests, giving Python-like results built on Scala’s rich, robust, and relatively straightforward library/package ecosystem.
For more grand, complicated, and nuanced challenges, Scala’s immutability by default, effect systems, and powerful functional constructs will ensure that you are always working at the most productive level of abstraction for the problem at hand, all without ever leaving Scala-CLI.
Bash can’t do that. Not many languages can, really.
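As a tiny illustration of that middle step (pulling logic out into a function and pinning it down with a test), here is a sketch. The safeFileName helper and its rules are my own invention, not something from the script above, and plain assertions stand in for a real test framework.

```scala
//> using scala 3.3.1

// Hypothetical helper: make a planet name safe to use as a file name by
// replacing anything outside letters, digits, '.', '_' and '-' with '_'.
def safeFileName(name: String): String =
  name.trim.replaceAll("""[^A-Za-z0-9._-]+""", "_") + ".json"

// Plain assertions stand in for a proper test suite here.
assert(safeFileName("Yavin IV") == "Yavin_IV.json")
assert(safeFileName("../etc/passwd") == ".._etc_passwd.json")

println("all checks passed")
```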
Conclusion
In this post, I’ve used Scala-CLI to allow me to utilize a number of libraries from the Haoyi Li ecosystem, which, with their no-nonsense APIs and low dependencies, I believe are a great fit for the scripting use case.
While my library choices are a matter of personal taste, I hope I’ve managed to convince you that Scala and Scala-CLI are more than worthy replacements for your usual go-to scripting language.
We haven’t come close to scratching the surface of what Scala-CLI can do in this write-up, but I hope you’re feeling inspired to give it a go!
Happy Scala Scripting!