As Josh Bloch writes, for business objects, "overriding the equals method [is] necessary to satisfy programmer expectations." (Effective Java, item 7). Apart from the benefits he mentions – conformance to expectations, correct use in maps and sets etc. – I’ve found that implementing equals (and hashCode) really makes you consider what the classes represent. Certainly for business objects, i.e. objects in your domain model, trying to define a business-level identity for your classes is a good way of validating that you’ve correctly captured a business-relevant concept.
An equals definition can also serve as a useful piece of documentation describing your domain. Here, we’ll consider an approach that tries to do this as cleanly and conveniently as possible…declaratively!
The problem with equals
The equals method is essentially self-documenting. Well, kinda. But how easy is it actually to determine from the following (Eclipse autogenerated) implementation that a PhoneNumber is determined by the country code, area code and the number?
    public class PhoneNumber {
        private int countryCode;
        private int areaCode;
        private int number;

        @Override
        public boolean equals(Object obj) {
            if (this == obj)
                return true;
            if (obj == null)
                return false;
            if (getClass() != obj.getClass())
                return false;
            PhoneNumber other = (PhoneNumber) obj;
            if (areaCode != other.areaCode)
                return false;
            if (countryCode != other.countryCode)
                return false;
            if (number != other.number)
                return false;
            return true;
        }

        // hashCode() ...
    }
The following "standard" implementation may be a little more readable
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof PhoneNumber)) {
            return false;
        }
        PhoneNumber other = (PhoneNumber) obj;
        return ((countryCode == other.countryCode)
                && (areaCode == other.areaCode)
                && (number == other.number));
    }
or, using Apache Commons’ EqualsBuilder,
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof PhoneNumber)) {
            return false;
        }
        PhoneNumber other = (PhoneNumber) obj;
        return new EqualsBuilder()
                .append(countryCode, other.countryCode)
                .append(areaCode, other.areaCode)
                .append(number, other.number)
                .isEquals();
    }
but that’s still a lot of boilerplate code, when all we really want to say is "use these three properties!"
A different approach
How about something like this, then?
    @BusinessObject
    public class PhoneNumber {
        @BusinessField private int countryCode;
        @BusinessField private int areaCode;
        @BusinessField private int number;

        @Override
        public boolean equals(Object obj) {
            return BusinessObjectUtils.equals(this, obj);
        }

        @Override
        public int hashCode() {
            return BusinessObjectUtils.hashCode(this);
        }

        // getters and setters etc...
    }
Once we’re here, why not also add
    @BusinessObject
    public class PhoneNumber {
        // ...

        @Override
        public String toString() {
            return BusinessObjectUtils.toString(this);
        }
    }
which can be very useful for debugging purposes?
Um…why?
What’s so nice about this? Well, the business key for the object is visibly defined, on the fields that it comprises, as opposed to being referred to indirectly in some method. With runtime retention of the annotation, the definition is also accessible to any other code that cares to look. And boilerplate code is almost down to zero, even more so after a bit of refactoring:
    @BusinessObject
    public abstract class AbstractBusinessObject {
        @Override
        public boolean equals(Object obj) {
            return BusinessObjectUtils.equals(this, obj);
        }

        @Override
        public int hashCode() {
            return BusinessObjectUtils.hashCode(this);
        }

        @Override
        public String toString() {
            return BusinessObjectUtils.toString(this);
        }
    }

    public class PhoneNumber extends AbstractBusinessObject {
        @BusinessField private int countryCode;
        @BusinessField private int areaCode;
        @BusinessField private int number;

        // getters and setters etc...
    }
Of course, there’s nothing new under the sun, and I’m sure something similar is out there in the codeosphere (there was this Hibernate forum post, but it seems to have disappeared. If you know of another example, please add a comment).
I came across @BusinessObject in a project developed by Vincenzo Vitale, a former colleague. I just reimplemented some of the internals to improve performance and slightly modified and extended the semantics.
Whoa…explain, please!
So how does this all work? According to BusinessObjectUtils.equals, two objects are equal only under the following circumstances:
- if one of the objects is null, both must be null
- otherwise, both objects must be business objects (i.e. annotated with @BusinessObject, or extending such a class) and
- they must have the same business fields, i.e. the set of names of fields annotated with @BusinessField (whether defined in the class or inherited) must be the same for both classes and
- the values of business fields with the same name must all be equal
Note that these conditions are reflexive, symmetric and transitive, i.e. a valid implementation of equals: a clearly has the same, equal set of business fields as a, and if a and b have the same, equal business fields, and b and c do too, then clearly so do a and c.
So
    BusinessObjectUtils.equals(null, null) == true
    BusinessObjectUtils.equals(new PhoneNumber(), null) == false
    BusinessObjectUtils.equals(new PhoneNumber(), "31 35 538 1921") == false
    BusinessObjectUtils.equals("31 35 538 1921", "31 35 538 1921") == false (!)
    BusinessObjectUtils.equals(new PhoneNumber(), new PhoneNumber()) == true
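To make the rules above a bit more concrete, here is a minimal sketch of how such a comparison could be written with plain reflection. It is not the actual BusinessObjectUtils code: the annotations and classes (SketchBusinessObject, SketchBusinessField, SketchEquals, SketchPhoneNumber) are hypothetical stand-ins, and the sketch simply collects annotated fields from the whole class hierarchy without any of the refinements discussed later.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-ins for @BusinessObject and @BusinessField.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface SketchBusinessObject {}

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface SketchBusinessField {}

final class SketchEquals {
    private SketchEquals() {}

    // True if the class itself, or any superclass, carries the marker annotation.
    static boolean isBusinessObject(Class<?> type) {
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            if (c.isAnnotationPresent(SketchBusinessObject.class)) {
                return true;
            }
        }
        return false;
    }

    // Maps field name -> value for every annotated field, declared or inherited.
    static Map<String, Object> businessFields(Object target) {
        Map<String, Object> values = new TreeMap<>();
        for (Class<?> c = target.getClass(); c != null; c = c.getSuperclass()) {
            for (Field field : c.getDeclaredFields()) {
                if (!field.isAnnotationPresent(SketchBusinessField.class)) {
                    continue;
                }
                field.setAccessible(true);
                try {
                    values.put(field.getName(), field.get(target));
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
        return values;
    }

    static boolean businessEquals(Object a, Object b) {
        if (a == null || b == null) {
            return a == b; // null is only equal to null
        }
        if (!isBusinessObject(a.getClass()) || !isBusinessObject(b.getClass())) {
            return false; // two equal Strings are still "not equal" here
        }
        // Same field names with equal values; the classes are not compared.
        return businessFields(a).equals(businessFields(b));
    }
}

@SketchBusinessObject
class SketchPhoneNumber {
    @SketchBusinessField private final int countryCode;
    @SketchBusinessField private final int areaCode;
    @SketchBusinessField private final int number;

    SketchPhoneNumber(int countryCode, int areaCode, int number) {
        this.countryCode = countryCode;
        this.areaCode = areaCode;
        this.number = number;
    }

    @Override
    public boolean equals(Object obj) {
        return SketchEquals.businessEquals(this, obj);
    }

    @Override
    public int hashCode() {
        return SketchEquals.businessFields(this).hashCode();
    }
}
```

Comparing the name-to-value maps covers both "same set of business field names" and "equal values" in one step, and deriving hashCode from the same map keeps the two methods consistent.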
Note that, so far, there is no mention of the type of the business object! Indeed, the following is possible:
    class CountryAreaCodeNumberFragment extends AbstractBusinessObject {
        @BusinessField protected int countryCode;
        @BusinessField protected int areaCode;
    }

    @BusinessObject
    public class LocalNumber extends CountryAreaCodeNumberFragment {
        @BusinessField private int number;
    }

    PhoneNumber xebiaClosedDialingPlan = new PhoneNumber();
    xebiaClosedDialingPlan.countryCode = 31;
    xebiaClosedDialingPlan.areaCode = 35;
    xebiaClosedDialingPlan.number = 5381921;

    LocalNumber xebiaOpenDialingPlan = new LocalNumber();
    xebiaOpenDialingPlan.countryCode = 31;
    xebiaOpenDialingPlan.areaCode = 35;
    xebiaOpenDialingPlan.number = 5381921;

    BusinessObjectUtils.equals(xebiaClosedDialingPlan, xebiaOpenDialingPlan) == true
"Closed dialing plan"? "Open dialing plan"? What is he on about?!? See Dr. Wiki, but I digress…
If this last example feels "wrong" to you, welcome to the club. But it is consistent with the idea that, if you’re basing your definition of equality on field values, that’s all you should consider unless otherwise specified.
Of course, there are plenty of situations in which this is not at all appropriate.
    @BusinessObject
    public class Ferrari {
        @BusinessField private String licence;
    }

    @BusinessObject
    public class Windows95 {
        @BusinessField private String licence;
    }

    Ferrari fxx = new Ferrari();
    fxx.licence = "HL-34-F3";

    Windows95 preinstalledOs = new Windows95();
    preinstalledOs.licence = "HL-34-F3";

    BusinessObjectUtils.equals(fxx, preinstalledOs) == true (?!?)
OK, the example is incredibly artificial, but you get the point: often, it is necessary to limit the types your business object may be equal to.
Who art thou, Object stranger?
An equality definition based on field values only would indeed appear to be a rather strange case. After all, all the "canonical" equals implementations include some kind of instanceof check, restricting equality to (sub)classes of the class being compared to [1].
As such, a mustBeInstanceOf attribute [2] on the @BusinessObject annotation would seem like an obvious choice. On closer examination, however, it raises some interesting questions.
mustBeInstanceOf, yes. But instance of what? The class being compared to? Or the class on which the @BusinessObject annotation is defined (@BusinessObject, although not an @Inherited annotation by design, has transitive semantics [3])?
Using the (runtime) class of the object being compared is a definite no-no, because it very quickly leads to symmetry violations. If PhoneNumber is annotated with @BusinessObject and MobilePhoneNumber extends PhoneNumber, then
- new PhoneNumber().equals(new MobilePhoneNumber()) == true but
- new MobilePhoneNumber().equals(new PhoneNumber()) == false
because
- MobilePhoneNumber instanceof PhoneNumber == true but
- PhoneNumber instanceof MobilePhoneNumber == false
which is a well-known problem.
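The asymmetry is easy to reproduce in a few lines. The sketch below uses hypothetical classes (NaivePhoneNumber, NaiveMobilePhoneNumber) and a hand-written equals that checks against the receiver's runtime class; anchoredEquals shows the alternative of checking both sides against one fixed class, which here stands in for "the class carrying the annotation".

```java
class NaivePhoneNumber {
    final int number;

    NaivePhoneNumber(int number) {
        this.number = number;
    }

    // Type check against the receiver's runtime class: the problematic variant.
    @Override
    public boolean equals(Object obj) {
        return getClass().isInstance(obj) && number == ((NaivePhoneNumber) obj).number;
    }

    @Override
    public int hashCode() {
        return Integer.hashCode(number);
    }

    // Type check against one fixed class, the same for both arguments.
    static boolean anchoredEquals(Object a, Object b) {
        return a instanceof NaivePhoneNumber
                && b instanceof NaivePhoneNumber
                && ((NaivePhoneNumber) a).number == ((NaivePhoneNumber) b).number;
    }
}

class NaiveMobilePhoneNumber extends NaivePhoneNumber {
    NaiveMobilePhoneNumber(int number) {
        super(number);
    }
}

final class SymmetryDemo {
    public static void main(String[] args) {
        NaivePhoneNumber phone = new NaivePhoneNumber(5381921);
        NaivePhoneNumber mobile = new NaiveMobilePhoneNumber(5381921);

        System.out.println(phone.equals(mobile)); // true
        System.out.println(mobile.equals(phone)); // false: symmetry is broken

        System.out.println(NaivePhoneNumber.anchoredEquals(phone, mobile)); // true
        System.out.println(NaivePhoneNumber.anchoredEquals(mobile, phone)); // true
    }
}
```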
So we’re left with the class annotated with @BusinessObject. This solves the above problem (PhoneNumber and MobilePhoneNumber are obviously both PhoneNumber instances), but requires some thought in designing your domain model – if everything just extends AbstractBusinessObject then they’re all comparable to each other, so you’re back where you were before we started the whole mustBeInstanceOf discussion.
Picking your own equals
Whichever way you turn it, a mustBeInstanceOf-type attribute limits you to the class hierarchy of your domain objects. This may well be regarded as a good thing (let’s defer the domain modelling discussion for the moment), but it’s also a bit of wasted opportunity.
After all, if you’re going to the trouble of introducing an extra "marker" for your domain objects, shouldn’t you be able to declare comparable sets of domain classes from among all classes with this marker?
To cut a long story short, @BusinessObject does indeed offer this, in the form of an equivalentClasses attribute. If a class is annotated as [4]
    @BusinessObject(equivalentClasses = { A.class, B.class, C.class })
    public class A { ... }
then instances of A will be comparable to instances of B (b instanceof B) and C. B and C do need to be business objects, though.
As Josh outlines in some detail, correctly implementing equals in a subclass hierarchy can be very tricky. In this case, in order to avoid transitivity violations you need to ensure the sets of equivalentClasses are “transitively closed”, i.e. if B, C and (automatically) A are in A's equivalentClasses, then B's equivalentClasses must be A, C and (automatically) B, and C's likewise A, B and (automatically) C.
Why? Assume that, for classes A, B and C (none of which are subclasses of each other), A's equivalentClasses are only A and B whereas C's equivalentClasses are A, B and C. Then, if a, b and c are instances of A, B and C respectively with equal business fields, we have
- a.equals(b) == b.equals(a) == true (B is an equivalent class for A and vice versa) and
- b.equals(c) == c.equals(b) == true (C is an equivalent class for B and vice versa).
But a.equals(c) == c.equals(a) == false, because C is not an equivalent class for A!
Note that there is no violation of symmetry here, only of transitivity.
Before you decide to stay well away from this particular can of worms, though, it’s worth bearing in mind that the most common use cases are perfectly safe:
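If you want to guard against this kind of mistake, the "transitively closed" condition can be checked mechanically. The following is only a sketch under a few assumptions: the declared equivalentClasses are modelled as a plain map rather than read from annotations, two classes count as comparable only if each lists the other, every class involved has an entry in the map, and B's set is taken to be A, B and C (the example above does not spell it out). EquivalenceCheck, PoolA, PoolB and PoolC are made-up names.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class PoolA {}
class PoolB {}
class PoolC {}

final class EquivalenceCheck {
    private EquivalenceCheck() {}

    static Set<Class<?>> setOf(Class<?>... types) {
        return new HashSet<>(Arrays.asList(types));
    }

    // The declared set for a class, plus the class itself (added automatically).
    static Set<Class<?>> withSelf(Map<Class<?>, Set<Class<?>>> declared, Class<?> type) {
        Set<Class<?>> result = new HashSet<>();
        Set<Class<?>> own = declared.get(type);
        if (own != null) {
            result.addAll(own);
        }
        result.add(type);
        return result;
    }

    // Symmetric by construction: each class must list the other.
    static boolean comparable(Map<Class<?>, Set<Class<?>>> declared, Class<?> x, Class<?> y) {
        return withSelf(declared, x).contains(y) && withSelf(declared, y).contains(x);
    }

    // Closed if every member of a class's set declares exactly the same set.
    static boolean isTransitivelyClosed(Map<Class<?>, Set<Class<?>>> declared) {
        for (Class<?> type : declared.keySet()) {
            Set<Class<?>> own = withSelf(declared, type);
            for (Class<?> other : own) {
                if (!withSelf(declared, other).equals(own)) {
                    return false;
                }
            }
        }
        return true;
    }

    // The problematic configuration: A lists only A and B.
    static Map<Class<?>, Set<Class<?>>> brokenExample() {
        Map<Class<?>, Set<Class<?>>> declared = new HashMap<>();
        declared.put(PoolA.class, setOf(PoolA.class, PoolB.class));
        declared.put(PoolB.class, setOf(PoolA.class, PoolB.class, PoolC.class));
        declared.put(PoolC.class, setOf(PoolA.class, PoolB.class, PoolC.class));
        return declared;
    }
}
```

With brokenExample(), A and B are comparable, B and C are comparable, but A and C are not, and isTransitivelyClosed reports false. Giving all three classes the same set makes the check pass.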
“Shared parent class”
    // no equivalent classes would mean "comparable to all classes"
    @BusinessObject(equivalentClasses = { Parent.class })
    class Parent {
        // business fields
    }

    // @BusinessObject annotation carried over from Parent
    class Child1 extends Parent { }

    // @BusinessObject annotation carried over from Parent
    class Child2 extends Parent { }
In this scenario, instances of Parent, Child1 and Child2 are safely comparable to each other, i.e. will be equal if the values of the business fields are equal. This holds whether or not Parent is an abstract class.
“Comparable class pool”
    @BusinessObject(equivalentClasses = { Foo.class, Bar.class, Baz.class })
    class Foo {
        // business fields
    }

    @BusinessObject(equivalentClasses = { Foo.class, Bar.class, Baz.class })
    class Bar {
        // business fields
    }

    @BusinessObject(equivalentClasses = { Foo.class, Bar.class, Baz.class })
    class Baz {
        // business fields
    }

    // @BusinessObject annotation carried over from Baz
    class BazChild extends Baz { }
Here, instances of Foo, Bar, Baz and BazChild are also safely comparable to each other. Of course, maintaining the equivalentClasses attributes of the “class pool” can be a bit of a pain.
As a general rule of thumb, if you define a set of mutually comparable classes and ensure that the equivalentClasses of the members of this set are the same, you should be OK. If some of these classes inherit from a common parent, you only need to correctly annotate that parent, in fact.
Moral: With great power comes great responsibility...er, a plethora of ways to screw things up if you’re not careful!
The price of freedom...er, lunch
So I declare my business fields, and their values are retrieved when the objects are compared…hm, how do they get at the values, I wonder?…let’s have a look… (at this point you fire up your favourite IDE) oh no, they’re using reflec…well, I’d be curious to see how that can possibly perform!
I’ll save you the trouble of finding out: performance is bad. Orders of magnitude bad [5].
As might be expected, it’s the reflection (which happens in PropertyUtils.getSimpleProperty) that’s hurting.
So you might want to think twice before using this in a performance-critical context.
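One commonly used way to soften the blow, offered here only as a sketch and not as a description of what the actual code does, is to do the expensive field discovery once per class and reuse the result. The example below uses the JDK's ClassValue for the per-class cache; CachedBusinessField, CachedFieldLookup and CachedPoint are hypothetical names. Reading the values still goes through Field.get, so this removes the repeated lookup cost but not the reflective access itself.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in for @BusinessField.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface CachedBusinessField {}

final class CachedFieldLookup {
    private CachedFieldLookup() {}

    // ClassValue runs computeValue on first use for a class and caches the result.
    private static final ClassValue<List<Field>> FIELDS = new ClassValue<List<Field>>() {
        @Override
        protected List<Field> computeValue(Class<?> type) {
            List<Field> found = new ArrayList<>();
            for (Class<?> c = type; c != null; c = c.getSuperclass()) {
                for (Field field : c.getDeclaredFields()) {
                    if (field.isAnnotationPresent(CachedBusinessField.class)) {
                        field.setAccessible(true);
                        found.add(field);
                    }
                }
            }
            return Collections.unmodifiableList(found);
        }
    };

    static List<Field> fieldsOf(Class<?> type) {
        return FIELDS.get(type);
    }

    // Compares the annotated fields of two non-null objects using the cached lists.
    static boolean sameValues(Object a, Object b) {
        List<Field> fields = fieldsOf(a.getClass());
        if (!fields.equals(fieldsOf(b.getClass()))) {
            return false;
        }
        try {
            for (Field field : fields) {
                if (!Objects.equals(field.get(a), field.get(b))) {
                    return false;
                }
            }
            return true;
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }
}

class CachedPoint {
    @CachedBusinessField private final int x;
    @CachedBusinessField private final int y;
    private final String label; // not annotated, so ignored by the comparison

    CachedPoint(int x, int y, String label) {
        this.x = x;
        this.y = y;
        this.label = label;
    }
}
```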
Sources
The (zipped) Maven project can be obtained from here.
Vincenzo’s original code, since refined and renamed "simplestuff", is hosted at Google Code.
[1] Forcing classes to be equal (as opposed to assignableFrom) is a dead duck right from the start, because it could not sensibly deal with the proxy subclasses created by so many of the popular frameworks.
[2] Why not even with default = true? Well, using the annotations is effectively an opt-in approach; the idea being that if you are not annotated as a @BusinessField you are not relevant. In this scenario, it would seem rather inconsistent to “magically” include the class attribute by default.
[3] The reason for not passing the @BusinessObject annotation on to children via @Inherited is that the set of business fields relevant for a domain object is defined to be precisely those @BusinessFields defined in the object’s class and all parent classes up to and including the first one containing the @BusinessObject annotation. This means that domain objects that should inherit business fields from their parents must not contain the annotation!
[4] In fact, A.class is not required – unless equivalentClasses is left unspecified, the class annotated with @BusinessObject is automatically added to the compatible classes (otherwise, subclasses of the class would not be equal to themselves, violating reflexivity!). Still, for the sake of clarity it’s a good idea to explicitly include the class.
[5] equals timings were obtained by averaging the comparison of 50000 pairs of objects which were equal (not identical) with the given likelihood. hashCode timings were measured by calculating the hash code for the pairs.