
On the mysteriously fast Spray-can web-server

02 Aug, 2013

I am addicted to a problem: handling unknown peak load on the net. Part of the solution I have in mind involves, of course, a fast web-server. One of the fastest around is Spray-can (see https://github.com/spray/spray-can) and I really like the thing for several reasons I won’t explain here. Anyway, I’m sure you can guess my very first question by now:
How fast is Spray-can really?

So I’m after assessing the speed of Spray-can. I can believe the guys from spray.io who tweeted (my co-worker Raymond Roestenburg pointed this out to me, thanks!):
[Screenshot of the tweet, 2013-08-02 at 1.52.25 PM]
Needless to say just believing these guys is absolutely no fun. Here is what I did to figure it out myself.
The Server
I wrote this bit of Scala to obtain a response that is easy to count:

   case HttpRequest(GET, "/dispatcher", _, _, _) =>
      // Count the request and return the running total as the response body
      counter += 1
      sender ! HttpResponse(entity = counter.toString)

Let’s see if this works. It does! The first request (http://localhost:8080/dispatcher) gives ‘1’, the next gives ‘2’, then ‘3’. Cool!
The Client
Over the last months I have used several techniques for the client. I started with JMeter, and I blew up JMeter, not Spray-can. Then I wrote a really mean low-level client in Java, used thousands of threads and got results that I still do not understand. I might get into that in a later blog post. Last Tuesday I told my co-worker Joris de Winne about it, and he asked: why don’t you just use the ‘wget’ Unix command? So we hacked up this experiment the same day.
First experiment: one Mac
On my Mac there is no ‘wget’, so we used ‘curl’, but that’s a detail. We used two little shell scripts. The first one (“testit.sh”) just calls the server with curl. It looks like this:

#!/bin/sh
curl http://localhost:8080/dispatcher > /dev/null 2>&1

And a second one to make our lives easier. It puts the first script in an endless loop and starts that loop 30 times in the background. Like this:

#!/bin/sh
while true ; do ./testit.sh > /dev/null 2>&1; done &
while true ; do ./testit.sh > /dev/null 2>&1; done &
while true ; do ./testit.sh > /dev/null 2>&1; done &
<snip>
while true ; do ./testit.sh > /dev/null 2>&1; done &
while true ; do ./testit.sh > /dev/null 2>&1; done &
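The 30 copy-pasted lines can also be generated by a loop. Here is a minimal sketch of that idea, with one deliberate change: each loop stops after a fixed number of calls instead of running forever, so the total number of requests is known up front. The name `run_workers` and its parameters are hypothetical and not part of the scripts we actually ran.

```shell
#!/bin/sh
# run_workers WORKERS PER_WORKER CMD [ARGS...]
# Start WORKERS background loops that each run CMD PER_WORKER times,
# discarding its output, then wait until all loops have finished.
run_workers() {
    workers=$1
    per_worker=$2
    shift 2
    w=0
    while [ "$w" -lt "$workers" ]; do
        (
            i=0
            while [ "$i" -lt "$per_worker" ]; do
                "$@" > /dev/null 2>&1
                i=$((i + 1))
            done
        ) &
        w=$((w + 1))
    done
    wait
}

# Example (not executed here): 30 loops of 1000 calls each
# run_workers 30 1000 ./testit.sh
```

Wrapping a call like the commented example in `time` would give the elapsed seconds for a known 30000 requests, which avoids reading the counter from a browser afterwards.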

The mysterious result
Note that I’m running Spray-can as well as my test scripts on the same machine (2 GHz Intel Core i7).
This is what I see: a repeating pattern that shows a CPU-bound process that frequently drops to almost no CPU usage at all… Green is CPU power used by user processes, red is CPU power used by the system.
[Screenshot of the CPU usage graph, 2013-08-02 at 10.57.45 AM]
But why this pattern???? I hooked up JConsole to see if this was garbage collection firing. Nope! I could really use your help here. Are my Akka Actors collapsing and being put back in the air by supervising Actors???? And it gets more mysterious in a bit when we run our second experiment. Hang in there.
Throughput
I let my experiment run for 5 minutes and then used a browser to see how many requests were handled. There were 161447 requests. So that is a throughput of about 538 req/sec.
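The arithmetic behind that figure, as a small shell sketch. The function name `throughput` is made up for this illustration; the numbers are the ones from the run above.

```shell
#!/bin/sh
# throughput REQUESTS SECONDS: print requests per second with one decimal.
throughput() {
    awk -v n="$1" -v s="$2" 'BEGIN { printf "%.1f\n", n / s }'
}

# First experiment: 161447 requests in 5 minutes.
throughput 161447 $((5 * 60))   # prints 538.2
```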
Second experiment: two Macs
Today I used my wife’s Mac to run the clients on. It is a somewhat older machine and not as powerful (2.53 GHz Intel Core 2 Duo) as my own. As we now have two machines there is obviously a network in between, and I got myself some UTP cables to make the communication as fast as possible. That turned out to be unnecessary! Just using the wifi I saw this:
Mac running the clients:
[Screenshot of the CPU usage graph, 2013-08-02 at 13.04.24]
Mac running Spray-can:
[Screenshot of the CPU usage graph, 2013-08-02 at 1.04.14 PM]
The machine running the clients is clearly CPU bound like we saw before. And the machine running Spray-can behaves as expected. But where is my repeating pattern now? I haven’t the foggiest….
Throughput
I saw 44697 requests in 5 min which is about 150 req/sec.
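A rough sanity check on that number. I am assuming here that all 30 client loops ran for the full five minutes, which the run above does not actually verify. Dividing the total loop time by the request count gives the average time each loop spent per curl call. `ms_per_call` is a hypothetical helper name.

```shell
#!/bin/sh
# ms_per_call REQUESTS SECONDS LOOPS: average wall-clock milliseconds each
# loop spent per request, assuming LOOPS loops were busy the whole time.
ms_per_call() {
    awk -v n="$1" -v s="$2" -v l="$3" 'BEGIN { printf "%.0f\n", s * 1000 * l / n }'
}

# Second experiment: 44697 requests in 5 minutes, 30 loops (assumed).
ms_per_call 44697 300 30   # prints 201
```

Under that assumption each loop needed roughly 200 ms per call. This is a per-client figure and by itself says little about the server.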
So now what?
I plan to organize a “Please Break My System” session with all my co-workers. I’ll allow any technique used for the clients except that they have to use their laptops as using servers in the cloud is no fun. Watch this space.

11 Comments
Jeroen Leenarts
8 years ago

If I were you I’d suspect I/O constraints. Especially given the way you are running the shell scripts, I think there’s a ton of internal housekeeping going on. To really dig into this, I’d look into using DTrace and/or Instruments on a Mac. That will give you deeper insight.
My guess with the bumping CPU graph you’re seeing is that the shell scripts are amassing a lot of requests, which are at some point handled by your server. Apparently Spray-can does this with way lower CPU overhead compared to curl.
Remember that looping that curl call creates a child process on every iteration. So the CPU is busier creating and cleaning up processes than doing anything else. Maybe some java.nio would do the trick.

Jeroen Leenarts
8 years ago

Also look into Apache Benchmark, installed by default on a Mac.
Command-line:
ab -c 10 -n 1000 http://localhost:8080/dispatcher
-c is the number of concurrent requests.
-n is the total number of requests to fire.
Much lower overhead compared to that curl thing you did. 😉

Age Mooij
Editor
8 years ago

Jeroen, AB would normally be a very good suggestion but unfortunately AB is severely broken on OSX and this will lead to very strange behaviour. Even if you install a newer version via homebrew it will not work correctly if you don’t use persistent connections (`-k` switch).
Try one of the other well known load generators, like ‘weighttp’, ‘httperf’ or ‘wrk’ (which are all available via homebrew).

Jeroen Leenarts
8 years ago
Reply to  Age Mooij

Works on my machine… (which is pre-release 10.9)
Should I look for specific behavior to detect this broken-ness?

Age Mooij
Editor
8 years ago

Wilco, you are linking to an ancient and deprecated version of spray-can. The latest, much faster version is part of the main spray project at http://github.com/spray/spray
Did you also run your tests against the old version of spray-can?

Age Mooij
Editor
8 years ago

Have a look at the standard server-benchmark example that comes with spray-can: https://github.com/spray/spray/tree/master/examples/spray-can/server-benchmark
Try the following weighttp command:
weighttp -n 100000 -c 100 -t 4 -k "http://localhost:8080/"
When I run this against the server-benchmark project on my Retina MBP without any special configuration, I get around 45k requests/second. If I run it against “http://localhost:8080/json” I get around 71k requests/second.

Tomasz N.
8 years ago

Spawning curl/wget or any blocking tool like JMeter might consume most of the computer’s resources (CPU, I/O, RAM, etc.). Consider http://gatling-tool.org/ for HTTP benchmarking. Interestingly, just like Spray, it’s Akka-based and non-blocking.

wilco koorn
8 years ago

@all: thanks for the hints about other benchmarking tools. I had a quick look at ‘ab’ yesterday and I can confirm Age’s remarks. I get very spurious results and the thing seems buggy to me too. I’ll certainly have a look at ‘weighttp’ and ‘gatling’. Also ‘The Grinder’ is on my list (http://grinder.sourceforge.net/).
@Age: I ran ‘spray-can/1.1-M7’ during my experiments.
But for me the most important question still stands: what is going on in my first experiment that repeatedly shows poor performance?

Age Mooij
Editor
8 years ago
Reply to  wilco koorn

The M8 release of Spray is significantly faster since it is based on the new actor-based Akka IO core.
That being said, M7 was already pretty damn fast and should also easily do many tens of thousands of requests per second so IMHO your extremely low results are still mostly caused by inefficient load generation.
I think your focus on explaining a weird CPU usage pattern that you observed just once on a heavily overloaded system is… interesting 🙂

wilco koorn
8 years ago
Reply to  Age Mooij

@Age: I promise to re-run the tests soonish. And yes, interesting, isn’t it 😉
And I’ve seen a throughput of 12679 req/sec using JMeter (while JMeter was the bottleneck) so yes, I already knew it’s fast 😉

Marcin
1 year ago

Use wiremock instead. A Bash script is a little lame.
