Blog

Sorting and pagination with Hibernate Criteria – How it can go wrong with joins

11 Dec, 2008

Lately I ran into an annoying problem with Hibernate. I tried to do pagination on a query result which was doing an SQL-JOIN under the hood. The query before paging returned about 100 results. When I turned on paging (with 20 results per page) all the pages had less than 20 results!
The reason for this is that with a JOIN there can be duplicate results and those results are filtered out after pagination is done. In this blog I will explain how to solve those problems and it also a cleaner way to build your Criteria queries.

Let me explain this with some examples.
Our data model for today consists of doors and colors. A door can have multiple colors.
The data set to start with:
Door A:
-id: 1
-size: large
-colors: red and green
Door B:
-id: 2
-size: small
-color: blue and green
Now lets select all red, green or blue doors, that will be two doors right?

Criteria crit = session.createCriteria(Door.class);
crit.createAlias("colors", "c");
crit.add(Restrictions.in("c.name", new Object[]{ "red" , "green" , "blue" }));

Because of the colors alias an INNER JOIN is used and the result might not be what you expect.
The size of the list is 4, 4 Door objects are returned! That is very strange. When you have a look at the SQL you will see the following query:
select this_.id as id2_1_, this_.size as size2_1_, colors3_.DOOR_ID as DOOR1_, c1_.id as COLOR2_, c1_.id as id3_0_, c1_.name as name3_0_
from doors this_ inner join DOORS_COLORS colors3_ on this_.id=colors3_.DOOR_ID inner join colors c1_ on colors3_.COLOR_ID=c1_.id
where c1_.name in (?, ?, ?)

Translated to readable SQL:
SELECT * FROM doors d
INNER JOIN doors_colors dc on d.id=dc.door_id
INNER JOIN colors c ON c.id=dc.color_id
WHERE c.name in (?, ?, ?)

This query will indeed return 4 results, but we’re selecting Door objects, not SQL-rows. Looking at the java code I expected Door object, not query result rows wrapped in a Door object
To solve this problem you can add a ResultTransformer to the crit-object:

crit.setResultTransformer(CriteriaSpecification.DISTINCT_ROOT_ENTITY);

Now only two doors are selected, problem solved.
Well, not so fast, let’s see what happens with many doors. Our database now has 16 doors. When we select the red, green and blue doors again, 11 results appears. Suppose we want to add paging to the results, because it’s just too many doors to handle at a time. 5 doors per page is enough.
We can achieve it by adding the following two lines:

crit.setFirstResult(0);
crit.setMaxResults(5);

11 doors found, 5 results per page and display the results starting from the first result (which is 0-indexed).
How many results do you expect on the first page?
Not the 3 results returned probably. The ResultTransformer is still active by the way. What happens is that the paging is applied before the duplicate entities are filtered out.
When you start thinking in SQL you might come up with the idea to use a subquery. This is exactly what we’re are going to do in Hibernate.
A subquery in pseudocode:
select * from doors d where d.id in (select id from doors, colors where colors.name in (‘red’,’green’,’blue’))
As you can see the subquery only selects the id column and in my previous example a complete Door object was selected.
To select other columns than the columns in the Entity object you can use Projections. With multiple columns use the ProjectionList object. For now we need an id-projection, which has is own method:

crit.setProjection(Projections.id());

Now a list with int’s is selected. This result is the input for the next query. You can create Criteria objects without a Hibernate Session for later use. These objects are called DetachedCriteria.
When we rename the crit to dc and create it as a DetachedCriteria you will have the following lines of code:

DetachedCriteria dc = DetachedCriteria.forClass(Door.class);
dc.createAlias( "colors" , "c");
dc.add(Restrictions.in( "c.name" , new Object[]{ "red" , "green" , "blue" }));
dc.setResultTransformer(CriteriaSpecification.DISTINCT_ROOT_ENTITY);
dc.setProjection(Projections.id());

The next step is creating an outer query around the previous query.

Criteria outer = session.createCriteria(Door.class);
outer.add(Subqueries.propertyIn("id", dc));
outer.setFirstResult(0);
outer.setMaxResults(5);

With Subqueries.propertyIn you can add the previous query. When you add paging and sorting to the new outer query object all your problems are solved. You can even get rid of the ResultTransformer and get the same result. When you’re tuning performance it might be a nice test to see whether it makes any difference to do the DISTINCT before you pass the results to the outer query.
How you can solve DISTINCT problemns is explained reasonably in the Hibernate FAQ:
Hibernate does not return distinct results for a query with outer join fetching enabled for a collection (even if I use the distinct keyword)?
The lesson I learned with Hibernate is that you still have to know SQL and don’t trust the results blindly. If you don’t know why you have to use DISTINCT and how a JOIN presents its results you’ll get lost very soon.
With the DetachedCriteria object you can split your queries in a very readable way and when using projections, the SQL under the hood only selects the needed columns, which improves performance.

sources

http://floledermann.blogspot.com/2007/10/solving-hibernate-criterias-distinct.html
http://www.hibernate.org/117.html#A12

guest
20 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
Alex
Alex
13 years ago

I think it’s kind of sad that you need to compromise performance (the joins will be much faster on for instance mysql) because Hibernate won’t translate the SQL resultset to objects correctly when collections are involved.
Hibernate should really have a configuration parameter which automatically does this for you.

Maarten Winkels
13 years ago

@Alex: It is a bit strange to blame Hibernate for the fact that MySQL cannot optimize the subselect in the same way as it does the inner join: The DB logic is in both cases exactly the same. Furthermore, it has nothing to do with how Hibernate ‘translates the SQL resultset to objects’ or how this is incorrect. The fact is that you cannot write an SQL statement which involves an (inner) join on a table that might have multiple entries for the base table with a LIMIT (for MySQL) to return a predictable number of unique rows from the base table. You cannot request Hibernate to solve this for you.
What you could do is look into the query.scroll() method and do pagination yourself. This will be a bit trickier, because you will have to do the UNIQUE and the pagination yourself, but you’ll be able to improve performance.

Muzaffer
Muzaffer
13 years ago

Hi Jeroen van Wilgenburg,
Thanks for your explanation. I am using Criteria obj as follows
1. Set the order by [column1]
2. Get the resultset
3. Here I want to change order by to [column2]. If I simply addOrder [column2] to the Criteria obj it will give me result in the order [column1],[column2]. Here I want result to be sorted only based on [column2]. Thanks.

Bhim Bomma
Bhim Bomma
13 years ago

Thanks Jeroen. This helped me to solve the pagination problem in my application.

AGO
AGO
13 years ago

Thank you very much for such nice article!
It is short, with good example and without unnecessary fireworks – simply great.
Regards, Albert

trackback

[…] did not realize that multiple joins are not possible using Hibernate’s criteria api.  This article is not related but solves a pagination problem I’ve had in the […]

murali
murali
12 years ago

Good Article , Just one note : this approach would not work if the entities has EAGER Join on Collections

ptb
ptb
12 years ago

I tried this solution but I couldn’t get it to work in cases where you need to add multiple properties to the projection list. Apparently Sql doesn’t allow multiple select items in a subselect.

Filme Noi Cinema
12 years ago

Actually SQL allows multiple select items in subselect (or at least Oracle SQL does) like this:
select * from person where (name, age) in (select name, age from ……. )
But I guess that Hibernate hasn’t implemented this yet.

Markos Fragkakis
Markos Fragkakis
11 years ago

Thanks for this, very well explained.

Tim Dennison
Tim Dennison
10 years ago

Great explanation of this problem!
I keep thinking about an alternative solution and wanted feedback. I created a custom ResultTransformer that delegates to the DistinctRootEntityResultTransformer for transformTuple() and transformList() operations. However, the transformList() operation also uses a couple of instance variables (number of objects desired, page index) to produce a “paged” List of objects to return. The effect is that paging is accomplished via “entities” instead of rows.

trackback

[…] This seems to be a well known problem for years as can be read here: http://blog.xebia.com/2008/12/11/sorting-and-pagination-with-hibernate-criteria-how-it-can-go-wrong-… […]

Cyril Mathew
Cyril Mathew
9 years ago

This is a great article which solved my pagination problem

Breton
Breton
9 years ago

That may save some time. I ll follow your explanations. Thks

umarani
umarani
9 years ago

I want to display 10 records

umarani
umarani
9 years ago

I want to display ten records per page using critria query how to set pagenum,pagesize using that….thanks in advance

fdm
fdm
8 years ago

nice,
it does not solve the problem for sorting.
how would it be done if yoiu want to sort the doors by their color?
Subqueries do not support sorting.

fdm
fdm
8 years ago

Sorry… the ‘nice’ do not sounds quite correct on my previus message.
It is in complete … it was meant to say: ‘Nice article, but it does not solve the problem for sorting’
thanks for the article.

Ruba
Ruba
8 years ago

Thanks a lot! you rescued me! thanks for the great tips.

Aaron
Aaron
8 years ago

Thank you! Ran into this exact issue and your blog perfectly described the problem and workaround, hope others find this as helpful as I did.
-Aaron

Explore related posts