On October 31 Kirk Pepperdine – an authority in the field of Java performance tuning – will join us to give a four-day workshop. Five Xebia internal attendees have registered and the remaining seven seats are open for external attendees. For course details go here.
I interviewed Kirk Pepperdine. He is CTO of javaperformancetuning.com and an editor at TheServerSide. Kirk talks about the course he teaches with us, his work as a consultant, and how he tackles performance problems on-site. So if you are thinking about attending the course, or are simply looking for Java performance tips and experiences, this interview is for you.
Jeroen: Can you tell us a bit about yourself and what you do?
Kirk: First and foremost, I am a developer and I enjoy developing applications and tinkering with different things. My foray into performance tuning started while I was working in a group that used Cray supercomputers. It was a bit strange in that I could be coding in Cray Assembler in the morning and then Smalltalk in the afternoon, with a bit of C sprinkled in for lunch. What I do now is help teams solve their performance problems. I also edit, write and teach, but most of all I enjoy writing code.
I first started performance tuning Java in late ’97. Jack Shirazi and I were asked to performance tune a fairly large Java application; I believe it was the largest Java project in Europe at that time. As you can imagine, the problems were not so simple, and the tooling and documentation were almost non-existent. I started writing my own memory profiler, as there were no commercial options available when we started. This work was the inspiration for Jack to write the performance tuning book. He then started a companion website and we began collaborating on generating materials for the site. All of that work has resulted in both of us being noticed by Sun, and we are now both members of the Java Champions program. It is quite an overwhelming experience to be in discussions about the future of Java with all of the truly talented people in that group.
Jeroen: You provide your Speeding up Java applications, a.k.a. Java Performance Tuning, training with us on October 31 and November 1, 2 and 3. What is this training about and who should attend?
The computing industry is quite a paradoxical one. On the one hand, computers equal performance, and of course every computer programmer knows how to write the fastest code around. However, I keep running into projects in trouble that are suffering from different forms of the same problem. A few years back, when Jack and I first started offering this course, we gave the attendees a fairly routine exercise. What we discovered is that not one single person could complete the tuning task without intervention. It wasn’t just this one group; it was every single group that we ran into. Finally one person did manage to solve the problem. However, to our great surprise, he wasn’t a developer! The guy who identified the real problem was a tester without much coding experience! We also have another exercise where not a single person has been able to identify the real underlying performance bottleneck.
It isn’t that any of these problems are very difficult; it just seems that many developers are ill-equipped to deal with performance problems. I can’t say that I know why, but I do know that once developers have been through the course, they won’t look at a performance problem with the same eyes ever again. In fact, in our last offering I had one comment that, loosely quoted, was: “you’ve given me a whole new way of looking at this problem”. It is very satisfying to hear these types of comments, because that is our goal.
Jeroen: I’m going to take the course myself, and I’m curious how it will be. Do the students need to have a Java developer background, or is it suitable for testers with some basic Java knowledge as well?
When we first wrote the course we were originally targeting developers, and since developers like developing code, many of the exercises were designed to make them interesting for developers. However, we found that many testers also wanted to join in on the fun. Some of them were quite capable of coping with the coding tasks, and we gave a little boost to the ones that were having difficulties so they could still benefit from the intended lesson. We’ve also adapted exercises on the fly to meet the special needs of a group. You never know what you are going to face when you walk into one of these sessions. This is why we’ve insisted that anyone else who presents the material have hands-on performance tuning experience. Performance is a dynamic topic. The solutions to the exercises are not set in stone. We’ve had students crack problems in surprising ways. This is one aspect of the course that we believe sets it apart from courses that have been “canned” so they can be delivered by any trainer.
Jeroen: I find javaperformancetuning.com to be an amazing performance resource. Can you tell us about the site and your company?
As I said earlier, javaperformancetuning.com was to be a companion site to Jack’s book. I believe that Tim O’Reilly encouraged him to start it. I came on board shortly afterwards to help add some content, but the most useful things in my opinion are the tips section and the tool reports. We’ve been so busy that we’ve let some of the tool reports slip, but I think you’ll see that changing in the next few months. I’ve got some interesting tool reports in the pipeline.
What we offer is pretty much anything to do with performance. That includes white papers, benchmarking, performance reviews and tuning, and architectural advice.
Jeroen: You have recently become active as an editor of TheServerSide. What does this involve?
My primary task is to acquire content and make sure it is suitable for publication on the site. Secondary to that is my involvement in planning the TSS symposiums. I only do this on a part-time basis, and most of the work falls on some very dedicated and talented TechTarget staff. It isn’t easy trying to understand what people want, line up the speakers, organize abstracts and all of the other stuff. It is a lot of work, but the rewards are more than worth it.
Jeroen: Can you sketch a typical or striking customer situation for us, from when you come in as a performance reviewer or troubleshooter?
Gosh, a typical customer… I’m not sure that there is a typical customer, except that they are all stressed out and possibly at wits’ end. Stress is the killer problem and it needs to be addressed before anything else. You need a calm environment or it is almost impossible to get anything done. The first thing I look for is a pressure relief valve. It is going to look different in every situation, but you have to find it, even if it means taking some very unconventional steps. I like to call these strategic hacks: something that is sometimes so hideous that it is difficult to look at. I once wrote a cron job that queried the database every few minutes to find the oldest session and then just killed it. You’d have to agree that this is a pretty hideous hack, but the effect was dramatic. The phones just suddenly stopped ringing and you could feel a wave of relief sweep over the room.
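The shape of that pressure-relief hack can be sketched in plain Java. This is purely illustrative (Kirk's real version was a cron job issuing SQL against the database; all names here are hypothetical): a periodic task finds the longest-lived session and kills it.

```java
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the "kill the oldest session" strategic hack.
// The original queried the database; here the sessions live in a map.
public class SessionReaper {
    private final Map<String, Instant> sessions = new ConcurrentHashMap<>();

    public void register(String sessionId, Instant startedAt) {
        sessions.put(sessionId, startedAt);
    }

    // Find the session that has been alive the longest and kill it.
    public String killOldest() {
        return sessions.entrySet().stream()
                .min(Map.Entry.comparingByValue())
                .map(e -> { sessions.remove(e.getKey()); return e.getKey(); })
                .orElse(null);
    }

    public int liveSessions() {
        return sessions.size();
    }

    // Run the reaper every few minutes, like the original cron job.
    public ScheduledExecutorService start() {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(this::killOldest, 5, 5, TimeUnit.MINUTES);
        return scheduler;
    }
}
```

Hideous, as Kirk says, but the point of a strategic hack is relieving pressure now, not elegance.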
I should also mention the skeptical client: the client that has already had one, two, or maybe even three consultants in to take a look at the problem. In fact, I was just helping a partner write a proposal that he’d already presented to a client with a significant performance problem. Although each of the previous consultants this client had engaged had declared success, the problem still persisted. I’m not sure how one can declare victory unless one is able to characterize the problem, apply some solution, and then re-measure for effect.
You can imagine that the client is very skeptical of us, since we neither work for nor seek endorsement from any product company. Yet what most clients may or may not realize is that a big product company will send in the next available consultant with minimal regard for that person’s ability to resolve the situation. So it is not surprising that we run into the skeptical customer, nor is it surprising that we’re able to solve problems that big product companies’ own consultants sometimes miss.
Jeroen: What are the most typical and most occurring performance problems you encounter in the field?
Database interactions and memory management.
Jeroen: What is your solution to these problems? How do you achieve the solution?
Good question, and I wish I knew a real answer to it. The first thing to do is understand the source of the problem. Once you see the source, the solutions are typically self-explanatory. For example, I was asked to investigate an application that had a query that was taking about 40 minutes to complete. I had a quick chat with the DBA and she produced a query plan that showed that a full table scan was being triggered. The DBA immediately realized that she should create an index, which of course resulted in sub-second response times. And no, I’m not making that one up. You may ask how someone could miss something so simple. My answer is that it is pretty common for people to miss these types of opportunities because of stress and the complexity of the environments that they are working in. If you have a number of problems, then little things like a 40-minute query get overlooked in all the noise. It happens; we are all just human and can only handle so much complexity, and even less of it when we are under stress.
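The difference that index made can be illustrated with plain Java collections. This is an analogy, not Kirk's actual case: a full table scan is a linear walk over every row, while an index is essentially a one-time keyed lookup structure.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustration: full table scan vs. indexed lookup, modeled with collections.
public class IndexDemo {
    record Row(long id, String name) {}

    // "Full table scan": examine every row until a match is found. O(n) per query.
    static Row scanForId(List<Row> table, long id) {
        for (Row row : table) {
            if (row.id() == id) return row;
        }
        return null;
    }

    // Build an "index" once: a map from key to row. Each lookup is then O(1).
    static Map<Long, Row> buildIndex(List<Row> table) {
        Map<Long, Row> index = new HashMap<>();
        for (Row row : table) index.put(row.id(), row);
        return index;
    }

    public static void main(String[] args) {
        List<Row> table = new ArrayList<>();
        for (long i = 0; i < 1_000_000; i++) table.add(new Row(i, "row" + i));

        long t0 = System.nanoTime();
        Row viaScan = scanForId(table, 999_999);      // touches every row
        long scanNs = System.nanoTime() - t0;

        Map<Long, Row> index = buildIndex(table);     // one-time build cost
        long t1 = System.nanoTime();
        Row viaIndex = index.get(999_999L);           // single hash lookup
        long lookupNs = System.nanoTime() - t1;

        System.out.printf("scan: %d ns, indexed lookup: %d ns (%s)%n",
                scanNs, lookupNs, viaScan.equals(viaIndex) ? "same row" : "mismatch");
    }
}
```

A database index is of course a B-tree rather than a hash map, but the cost story is the same: pay once to build the index, then stop touching every row on every query.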
Jeroen: Which change have you seen applied in a project that gained the largest performance improvement?
Well, aside from adding an all-important index, the largest improvements have all come from fundamental changes in architecture or design. Yeah, I know, that typically means a complete rewrite. However, if you use the first version as a requirements document for the second, you can often realize some amazing reductions. I mean, if you look deeply at the first implementation you can often see what the real requirements were, and use those observations as a basis for the requirements of the second. I like to say that less code runs faster, and the one thing that I’ve noticed is that rewrites almost always end up leaving you with much less code.
Jeroen: You defined some performance anti-patterns. What are they?
There are so many to choose from. I’ll just mention the biggest and baddest one of them all: not having definitive evidence that pinpoints the performance problem before going off and doing something. And by evidence I mean something that you could take home to mom and have her say: “Oh yeah, I see!”. We call this Guess, Don’t Measure, then call us. It shouldn’t be any surprise that the refactored solution is called Measure, Don’t Guess. Needless to say, my wife prefers that you all follow the anti-pattern.
Jeroen: Often database access and O/R mapping are responsible for bad performance. What is your take on this? How do we prevent this?
I like to quote Doug Clarke, who is the product manager for TopLink. He is often asked how fast TopLink is. The answer he gives is: “how fast is your bicycle?”. This answer may be annoying, but it is pretty much dead on. O/R mapping tools introduce some drag, but in the grand scheme of things it is nothing compared to a good old hit on the network. The problem is that it is both an important and a silly question at the same time. It is just about impossible to get a meaningful number out of a benchmark. Response time is going to be a weighted average of the response times of all the component parts. This includes the schema and many dynamic aspects of a system, which means that all of the weights will be different for every single application. What you are asking Doug and Gavin is: what is my average response time, and by the way, you have to guess at all of the weights in the equation. There are just way too many variables to control.
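The weighted-average point can be made concrete with a toy calculation. All the numbers below are hypothetical: the same per-component costs produce very different end-to-end response times once each application's usage weights are applied, which is why a single benchmark number is so hard to interpret.

```java
// Toy model of the argument: end-to-end response time is a weighted sum of
// component response times, and the weights differ for every application.
public class ResponseTimeModel {

    // componentMs: cost of one hit on each component, in milliseconds.
    // weights: average number of hits per request on each component.
    static double weightedResponseMs(double[] componentMs, double[] weights) {
        double total = 0;
        for (int i = 0; i < componentMs.length; i++) {
            total += componentMs[i] * weights[i];
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical components: O/R mapping overhead, network hop, SQL execution.
        double[] componentMs = {0.5, 2.0, 15.0};
        double[] appA = {10, 3, 1};   // chatty mapping layer, one big query
        double[] appB = {2, 10, 4};   // lean mapping layer, many round trips
        System.out.println("app A: " + weightedResponseMs(componentMs, appA) + " ms");
        System.out.println("app B: " + weightedResponseMs(componentMs, appB) + " ms");
    }
}
```

Same components, same per-hit costs, yet the two hypothetical applications see 26 ms and 81 ms per request: the O/R tool's own drag is dwarfed by how often the network and database are hit.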
Gavin King has some interesting benchmarking numbers that he refuses to publish because, in his words, they didn’t make sense and people would believe that he made them up. There is just so much that goes into getting a meaningful result, one that is going to help you understand what you should be expecting. The most recent Microsoft benchmark is yet another example of a useless benchmark. They claimed that their implementation of the WebSphere reference application runs faster in .NET than in J2EE. I asked them why and they didn’t have an answer. The guess was (remember our mantra: measure, don’t guess) that tight coupling between .NET and Access was responsible for the marginal performance benefit. So if I used that database access method then I’m golden; otherwise the J2EE stack would appear to be a better choice. But I’m guessing here, as the benchmark failed to answer the fundamental questions architects need to have answered.
Benchmarking is a much more complex task than it sounds. Anyone who thinks they are going to get numbers in a day or so hasn’t tried this before, will be reporting bogus numbers, or will need lots of luck.
How do we prevent bad database performance? Ask your local DBA. A good DBA will do wonders to ensure that your database is up to snuff and you are executing sound SQL. The next step is to get the O/R mapping tool to generate the desired SQL. That can be tricky.
Of course, the best database call is the one you didn’t make, and you can avoid making them if you use caching. Caching is there to protect your application from over-utilizing a slower underlying technology. Data close by is better than data far away, and a cache keeps data close.
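A cache in this sense can be as small as a memoizing map in front of the slow call. A minimal sketch, assuming an unbounded, never-invalidated cache (real applications would need eviction and invalidation, which are deliberately left out here):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Minimal illustration: the best database call is the one you didn't make.
// Repeat lookups are answered locally instead of hitting the slow store.
public class LookupCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> slowLookup;  // e.g. a database query
    private int misses = 0;

    public LookupCache(Function<K, V> slowLookup) {
        this.slowLookup = slowLookup;
    }

    public V get(K key) {
        return cache.computeIfAbsent(key, k -> {
            misses++;                          // only pay the slow call on a miss
            return slowLookup.apply(k);
        });
    }

    public int misses() {
        return misses;
    }
}
```

Production caches also need a bounded size and a story for stale data when the underlying rows change; the protection-from-a-slow-layer idea is the same either way.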
Jeroen: What is your experience with new technologies like AJAX, SOA, ESB?
AJAX is new, so the reports from the field are mixed. The problem with AJAX is that it pretty much breaks the model of a client using a connection briefly and infrequently, so it can’t help but put more stress on the server. However, my gauge of performance is the user experience. I don’t care how hard the hardware is working; that is what it is there for. Hardware utilization isn’t as important as making sure that the user is being serviced in a reasonable amount of time. Reasonable is a fuzzy word and I use it intentionally. If AJAX can keep users busy so that they don’t notice “un-reasonable” response times, then it’s done its job. You can get more hardware, but getting users is typically a more challenging effort. I should add that there are reports that moving to AJAX has improved server performance, so as I said, results are mixed.
I’m surprised at how SOA is being hyped as a new technology when it really is a new label on some pretty old stuff. Still, I’m happy for SOA. SOA works to organize your code in a very useful way, and organized code is always better for performance tuning than unorganized code. Moreover, you can focus on what matters, response time, with a simple measurement at the gateway.
ESB is totally reliant on your network, which is arguably the slowest piece of your computing infrastructure. Most ESBs use XML, which is about the fattest protocol that one can use. So what we have is an application that is using the fattest protocol over the slowest piece of infrastructure, and you want to talk about performance. Obviously a tightly coupled client/server pair using a lightweight protocol is going to run circles around an ESB-architected solution. I wouldn’t suggest that people use ESB if performance is going to be an issue. However, ESB offers many interesting architectural and integration possibilities, which may override concerns for lightning-fast response times.
This leads into a number of different questions that we address during the course. We are not about performance at any cost; instead we like to talk about how to achieve performance that is “good enough” and about making sound architectural choices that meet both your performance requirements and business needs. It is clear that system integration can offer some critical business advantages, yet we may not have the luxury of being able to rewrite applications so they can work together. This is one place where ESB offers a tremendous advantage. Of course, if it isn’t fast enough, then we as an industry will figure out how to make it go faster. After all, this is what this industry is all about. If the XML parsing is too heavy, then one can now get an XML parsing SPD: problem solved!
Jeroen: What is “an XML parsing SPD” and how does it speed up XML parsing?
An XML parsing Special Purpose Device is just a piece of hardware dedicated to parsing XML. All it does is offload the task of parsing XML from your CPU to another device that, in theory, should be able to handle the task more efficiently. A similar thing is done with SSL: you can get a piece of hardware that will handle all the encryption.
This brings up another interesting point. There are now a few companies, such as Azul Systems and the Sun Microsystems Niagara project, that are selling hardware specifically targeted at making Java run faster.
Jeroen: What do you mean exactly with “couple the client to the server”, like tightly coupled instead of loosely coupled? or like co-located?
I mean coupled at the code or data structure level, where changes in one demand changes in the other. ESB may be coupled at the service level, in that your application may need to interact with a credit card service. However, the dependency at the code level would be limited to some XML file.
Jeroen: How much is the performance of a J2EE application typically impacted by using MVC frameworks like Struts, Tapestry, MyFaces instead of plain servlets?
If you are asking whether plain Servlets will always outperform applications that use more abstract frameworks, then there is really no clear answer. We use frameworks because they provide us with functionality that we need. Each of these frameworks brings value to the table. However, using them doesn’t come for free. If you decide to roll your own, then your developers may or may not give you something that performs better. That aspect of the equation is a gamble. What isn’t a gamble is that you’ll need more time to develop the functionality, and you will not find people who are familiar with your homegrown approach.
Generally the economic benefits of using these frameworks outweigh concerns for performance, and in my humble opinion this is the way it should be. Used properly in the proper scenarios, each of these frameworks will offer good enough performance. Used poorly and in improper scenarios, all bets are off. Here is another argument for using off-the-shelf packages: Gavin King removed weak references from Hibernate and it resulted in a nice improvement in performance. If you are using Hibernate, then simply by upgrading to the new version you should see a nice improvement in performance. As an aside, this is but one of the reasons that we emphasise components and good design over concerns for performance. Performance at all costs should not be an option.
Jeroen: Performance is often ignored in projects, until there are costly performance problems in production. Do you have an explanation for this?
Yes: the belief that our code will run as fast as the wind, and that if it doesn’t we can just buy more hardware to paper over the problems. Seriously though, the biggest problem is time, or the perceived lack of it. Even if the schedule includes time for performance testing, it is the task at the end of the line and the one that will get cut short. After all, there is no point in tuning what isn’t working.
Jeroen: Should there be explicit performance activities in a development process? How would you fit these in a process?
We talk about this topic in the course. In every stage of development there are useful things that one can do to ensure you will meet your performance targets. For example, in the requirements phase you should be collecting and setting performance targets.
Jeroen: Is just adding new hardware a solution for performance and scalability problems?
It really depends upon the nature of your performance problem and how your application has been designed, architected, and implemented, and I do stress implemented. I’ve given the recommendation that the most cost-effective solution is to purchase more hardware. I’ve also refused to provide that recommendation when I knew that it would make no difference whatsoever. The first thing you need to know is what resource constraint is responsible for the performance problem. From there you should be able to do some analysis to determine the best solution. The mantra is: measure, don’t guess.
Jeroen: What performance tools are in your toolbox and which ones are your favorites? What makes them your favorites?
I tend to rely on what some would regard as very primitive tools. If I could equate favorite with most useful, then I would have to pick vmstat. Now, you may think it funny that a tool that spits out kstat values would be so useful in performance tuning Java; however, you can use it to eliminate entire classes of problems in just a few minutes. If you are looking for a needle in a haystack, the best thing to do is to get rid of as much hay as you can up front.
Jeroen: You mean you can quickly figure out things like excessive memory usage, swapping, excessive disk I/O and so on?
Exactly. If you look at all the classes of performance problems, the ones you need to eliminate first are those based in hardware. If your hardware is coping yet users are complaining about response times, then you are looking at a totally different class of performance problem. If I need to get to that needle in a hurry, then I want something that is going to clear hay very quickly, and vmstat or perfmon will do that for me.
Jeroen: I think your mantra: Measure, Don’t Guess, is indeed crucial and in my experience often not well applied in practice. How is this covered in your course?
Measure, don’t guess is like any other catchphrase: it tries to capture the spirit of what, in this case, needs to be done. However, it only says what you should do without really saying how. What the course teaches is the what and how of measuring. I offer this friendly warning: the course mirrors real life. Should you decide to ignore the mantra in class, you’ll just feel silly, whereas in real life the consequences could be quite serious.
Jeroen: This has been a great interview, Kirk! Thank you very much for sharing this knowledge and these experiences with us. I’m looking forward to attending your course.