Java Cookbook:
Porting C++ to Java




Well-Mannered Objects

The following sections describe particular issues that are
common to almost all classes, but are often tricky to get right.


*Bullet-proofing

The most important language feature missing from Java is const. The absence of this feature significantly compromises the robustness of your code.

In Java, you can't determine on an object-by-object basis whether someone can change an object; you can only do it on the class level. This significantly complicates your life, if you want to provide the same level of robustness against mistaken modifications as you have in C++.

With the Java paradigm, the only way to have constant objects is to write an Immutable class. There are certainly advantages to this approach:

  • Immutables do not need to be cloned.
  • Multiple variables and other objects can safely refer to Immutables without fear that some object will modify them behind their backs.
  • Immutables are also thread-safe, without taking special provisions.

The downside is that in order to make an object Immutable, you often have to write two classes: one Mutable and one Immutable, with a fast conversion between them. We see that with String and StringBuffer. StringBuffer provides a mechanism to modify strings, while String provides the Immutable counterpart. Behind the scenes, they are designed to share a single character buffer--where possible--so that conversions are not too onerous.

Short of taking this approach, it is difficult to maintain the advantages of const when porting your code. Suppose that you are returning a const pointer from a getter. Without const the integrity of your object can be compromised if someone mistakenly alters the object returned from the getter. Suppose that you are passing your object in as a parameter. Without const you have no indication when your object is just an input parameter, and when it could be modified (perhaps mistakenly) behind your back.

Our recommended approach with the current Java language definition is to write an Immutable interface, one that provides API for just the "const" methods, such as getters. If you then return (or pass) objects of type Immutable, you get the same degree of safety as in C++. (Note that, just as in C++, the "constness" can be cast away, so it doesn't prevent malicious coders!)

Replacing const

C++

// definition
class Foo {
 public:
  int getSize() const;
  void setSize(int newSize);
 private:
  int size;
};

// in another class's definition
const Foo* method1() {...

void method2(
  const Foo& input,
  Foo& output) {...

// usage
const Foo* y = x.method1();
z = y->getSize();
y->setSize(3); // compilation error
((Foo*)y)->setSize(3); // cast

Java

// definition
class Foo implements ConstFoo {

 public int getSize() {...
 public void setSize(int newSize) {...

 private int size;
}

// in another class's definition
ConstFoo method1() {...

void method2(
  ConstFoo input,
  Foo output) {...

// usage
ConstFoo y = x.method1();
z = y.getSize();
y.setSize(3); // compilation error
((Foo)y).setSize(3); // cast

// additional interface
interface ConstFoo {
 int getSize();
}

The other safe alternatives are:

  • Clone the pointer before returning. This has the advantage of requiring a small amount of programming effort, but may be a performance hit. See Doppelgänger and Don't try this at home.
  • Write an Immutable cover class, one that delegates all of its calls to the original class. This is a real class, not just an interface, and cannot be cast away. This approach provides complete safety, but at the cost of considerably more work and some performance (an extra method call for every delegated method, plus the cost of constructing another object).

Really Safe

C++

// definition
class Foo {
 public:
  int getSize() const;
  void setSize(int newSize);
 private:
  int size;
};

// in another class's definition
const Foo* method1() {...

void method2(
  const Foo& input,
  Foo& output) {...

// usage
const Foo* y = x.method1();
z = y->getSize();
y->setSize(3); // compilation error
((Foo*)y)->setSize(3); // cast

Java

// definition
class Foo {

 public int getSize() {...
 public void setSize(int newSize) {...

 private int size;
}

// in another class's definition
SafeFoo method1() {...

void method2(
  SafeFoo input,
  Foo output) {...

// usage
SafeFoo y = x.method1();
z = y.getSize();
y.setSize(3); // compilation error
((Foo)y).setSize(3); // comp. error

// additional class
final class SafeFoo {
 public SafeFoo(Foo value) {
  foo = value;
 }
 public int getSize() {
  return foo.getSize();
 }
 private Foo foo;
}
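
The two approaches above can be compared side by side in a small runnable sketch. It reuses the names Foo, ConstFoo, and SafeFoo from the tables; the int parameter on setSize, the constructor on Foo, and the ConstDemo class are assumptions made only for illustration.

```java
// Read-only view: declares only the "const" methods.
interface ConstFoo {
    int getSize();
}

// The mutable class implements the read-only interface.
class Foo implements ConstFoo {
    private int size;

    Foo(int size) {
        this.size = size;
    }

    public int getSize() {
        return size;
    }

    public void setSize(int newSize) {
        size = newSize;
    }
}

// Cover class: delegates reads, offers no setter, and is unrelated to Foo,
// so no cast can reach the underlying object.
final class SafeFoo {
    private final Foo foo;

    SafeFoo(Foo value) {
        foo = value;
    }

    public int getSize() {
        return foo.getSize();
    }
}

class ConstDemo {
    public static void main(String[] args) {
        Foo original = new Foo(1);

        ConstFoo view = original;     // what a getter would hand out
        // view.setSize(3);           // would not compile: no such method on ConstFoo
        ((Foo) view).setSize(3);      // the cast still gets through
        System.out.println(original.getSize()); // prints 3

        SafeFoo safe = new SafeFoo(original);
        // ((Foo) safe).setSize(5);   // would not compile: inconvertible types
        System.out.println(safe.getSize());     // prints 3
    }
}
```

Note that SafeFoo in this sketch is a read-only view rather than a snapshot: changes made through the original Foo still show through it. If callers need a frozen value, the constructor would have to clone its argument.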

* Be careful of static final data fields; unless they are Immutable, they are not safe. You have to use the same techniques as shown above to make them so.

Unexpected Damage

// declaration
import java.awt.Point;
class Foo {
 public static final Point ORIGIN = new Point(0,0);
}

// usage
Point y = Foo.ORIGIN;
y.translate(3,5);

// Danger, Will Robinson!
// ORIGIN has been changed to be <3,5> at this point!
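
One way to close this hole, sketched here under the assumption that callers can live with receiving a copy, is to keep the field private and hand out a fresh Point on each request. The class name Geometry and the method getOrigin are hypothetical.

```java
import java.awt.Point;

class Geometry {
    // Kept private so that no caller can reach the shared instance.
    private static final Point ORIGIN = new Point(0, 0);

    // Each caller receives its own copy of the constant.
    static Point getOrigin() {
        return new Point(ORIGIN);
    }
}

class OriginDemo {
    public static void main(String[] args) {
        Point y = Geometry.getOrigin();
        y.translate(3, 5); // changes only the caller's copy

        Point again = Geometry.getOrigin();
        System.out.println(again.x + "," + again.y); // prints 0,0
    }
}
```

The cost is one allocation per call, which is the same trade-off as the clone alternative described above.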


*On pins and needles

Thread-safety is a new concept for many C++ programmers. The C++ language provides no standard assistance for multithreaded programs, so all of the C++ synchronization (if any) is dependent on external libraries. Because that synchronization appears explicitly in the code, you should be able to translate it, according to the semantics of that library, into explicit Java synchronization calls. To do so, you will need to understand how both the particular C++ library's synchronization and Java's synchronization work.

Java offers powerful, built-in support for threads, but you will need to design your classes for thread-safety to ensure that they work properly. In general, your classes will fall under three cases.

No thread-safety
If your class will only ever be used in a single thread, you don't need to do anything.
 
Minimal thread-safety
Minimal thread-safety allows you to use different instances in different threads, but not references from two threads to the same object. To make your class minimally thread-safe, determine which fields have class data (a.k.a. static data) that can be altered. Synchronize all methods that access or change that static data. (This actually overstates it a bit; you need only synchronize the actual code that accesses that data, not the entire routine. However, it may be simpler in porting to just add the synchronized keyword to these methods in your first pass.)
 
If you don't make your classes minimally thread-safe, you can get into trouble. Imagine what happens if in thread A, object1 is trying to access static data, while in thread B, a completely different object2 (but of the same class, or a subclass) is modifying the same static data!
 
Full thread-safety
With fully thread-safe objects, you don't have to worry how you use them at all. Full thread-safety allows two different threads to have variables referring to the same object, with either one able to make changes to that object without causing problems. As well as making the changes for minimal thread-safety, you have to synchronize all methods that either change instance data (a.k.a. object data), or access instance data that could be changed after the construction of the object.
 
There is a price for full thread-safety: access to your object is always slower, even if the object is not being used in a multithreaded environment. Full thread-safety is not generally necessary for all objects.
 
Immutables don't need to be synchronized in order to be fully thread-safe, except for those methods that change hidden caches. For example, Locale is Immutable, but there is a hashCode method that changes a hidden data field. That method then has to be synchronized.

Even if you follow the above guidelines, you need to make sure that the objects are left in a consistent state whenever any method returns. Unless additional synchronization mechanisms are set up, client code of your class can't do any transaction-like operations that span multiple calls. For example, if two threads are both iterating through a Vector and reversing the order of the elements at the same time, even if all of the methods are synchronized the results can be undefined. Complete guidelines to thread-safety are beyond the scope of this article.

If an object has only minimal thread-safety, callers have to do their own synchronization for that object if it can be referenced by multiple threads; e.g., by protecting all the code that accesses that object.
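
The levels described above can be sketched in one small class. The names Counter, CounterDemo, and hammer are hypothetical; the sketch synchronizes whole methods, as suggested for a first porting pass, rather than only the code that touches the data.

```java
class Counter {
    // Class (static) data: shared by every instance, so it needs a class-level lock.
    private static int created = 0;

    // Instance data that can change after construction.
    private int value = 0;

    Counter() {
        noteCreated();
    }

    // Minimal thread-safety: a static synchronized method locks on Counter.class.
    private static synchronized void noteCreated() {
        ++created;
    }

    static synchronized int getCreated() {
        return created;
    }

    // Full thread-safety: these methods lock on this instance.
    synchronized void increment() {
        ++value;
    }

    synchronized int getValue() {
        return value;
    }
}

class CounterDemo {
    // Runs two threads that each call increment() perThread times on one object.
    static void hammer(final Counter shared, final int perThread) {
        Runnable work = new Runnable() {
            public void run() {
                for (int i = 0; i < perThread; ++i) {
                    shared.increment();
                }
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException("interrupted while waiting");
        }
    }

    public static void main(String[] args) {
        Counter shared = new Counter();
        hammer(shared, 10000);
        System.out.println(shared.getValue()); // prints 20000

        // A check-then-act sequence spans two calls, so the caller holds the lock.
        synchronized (shared) {
            if (shared.getValue() == 20000) {
                shared.increment();
            }
        }
        System.out.println(shared.getValue()); // prints 20001
    }
}
```

The synchronized block in main is the client-side locking mentioned above: each method is safe on its own, but only the caller knows that the test and the increment belong together.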


*Liberté, Égalité, Fraternité

The way Java is set up, classes should implement hashCode and equals[1]. However, it is easy to get these wrong, and the failures may be difficult to debug. Although Java memory management saves some complications, there are other problems similar to those of C++. Unless you are aware of these problems, you will get non-robust (fragile) code. So here is a fairly complete example of how to write equals.

As discussed under Basics, there is quite a difference between == and equals(). The operator == represents pointer identity, while equals represents value or semantic equality. To correctly define equals, you must make sure that the following principles are observed.

Semantic Equality
If you use the same steps to create x as you do to create y, then x.equals(y).
 
Symmetry
If x.equals(y), then y.equals(x).
 
Transitivity
If x.equals(y), and y.equals(z), then x.equals(z)

If you don't maintain these invariants, then users of your code (a.k.a. clients) will become rather annoyed when your class doesn't work as expected, or--worse yet--data structures can become corrupt (see Making a hash of it).

Note that if you depend on inheriting the default implementation of equals from Object, you will get the wrong answer! The default implementation, as we see below with StringBuffer, compares pointers and so does not preserve semantic equality.

Bad equals in Action

// use same steps to create x and y
StringBuffer x = new StringBuffer("abc");
StringBuffer y = new StringBuffer("abc");

// failing code
if (x.equals(y)) {
 System.out.println("Correct!"); // never reached
}

// work-around, for this case
if (x.toString().equals(y.toString())) {
 System.out.println("Correct!");
}

// Second example
// Goal: try to avoid relayout when size remains the same
Dimension mySize = size();
if (!mySize.equals(oldSize)) { 
// ALWAYS TRUE since Dimension fails to override!
 oldSize = mySize;
 relayout();
}

// Work-around: do it yourself
Dimension mySize = size();
if (mySize.width != oldSize.width || mySize.height != oldSize.height) {
 oldSize = mySize;
 relayout();
}

Here is an example of how to correctly implement equals, with the different cases that you may be faced with annotated.

Implementing equals

 1  public boolean equals(Object obj) {
 2    // if top of hierarchy, use code:
 3      if (this == obj) return true;
 4      if (obj == null || getClass() != obj.getClass()) return false;
 5    // if NOT top of hierarchy, use code:
 6      if (!super.equals(obj)) return false; // super checks class
 7    Sample other = (Sample)obj;
 8    if (myPrimitive != other.myPrimitive) return false;
 9    if (!myObject.equals(other.myObject)) return false;
10    if (myPossNull == null) {
11      if (other.myPossNull != null) return false;
12    } else if (!myPossNull.equals(other.myPossNull)) return false;
13    // if (!myTransient.equals(other.myTransient)) return false;
14    if (myBad.getSize() != other.myBad.getSize()
15       || !myBad.getColor().equals(other.myBad.getColor())) return false;
16    return true;
17  }

Notes

Line

Comment

2-6.
  • Use lines (3,4) if this class is the top of your hierarchy.
  • Use line (6) if this class is not the top of your hierarchy.

*Never call super.equals at the top of your hierarchy;
Object.equals will give you the wrong result!

This way, each subclass depends on its superclasses to check their fields; the top class is the only one that needs to check that the classes are the same.

Example:

class A {
 public boolean equals(Object obj) {
  if (this == obj) return true;
  if (obj == null || getClass() != obj.getClass()) return false;
 ...
class B extends A {
 public boolean equals(Object obj) {
  if (!super.equals(obj)) return false; // super checks class
 ...
class C extends B {
 public boolean equals(Object obj) {
  if (!super.equals(obj)) return false; // super checks class
 ...

If you have a special hierarchy (such as Number) where you want equality checks to work across different classes, then you will need to use special code. You can do it, but be forewarned that such cases get very tricky unless you have a closed set of classes, with no outside subclassing!

4. So why don't we write the following in each class?

if (!(obj instanceof Sample)) return false;

Here is why. Suppose A is a superclass of B, and we are comparing two objects of those classes, a and b.

  • In the code for a.equals(b), (b instanceof A) is true.
  • But in the code for b.equals(a), (a instanceof B) is false!

Using (getClass() != obj.getClass()) instead solves this problem, and the check needs to be made only once in your hierarchy.

10. You need this more complicated code if a field could be null.
13. Transient fields, such as caches, are irrelevant to the equality of the object, and must be ignored.
14. If one of your fields does not implement equals correctly, then you have to do your own comparison.

(I have seen some people use toString() to work around bad equals. Don't do it except with StringBuffer. The toString method is relatively expensive and not guaranteed to contain the complete state of the object. In practice, if objects can be reasonably converted to a string, toString is used for the name of that method. If objects cannot be, then toString spews whatever debugging information the class designer thought worthwhile.)
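
To make the symmetry argument concrete, here is a minimal sketch that places the getClass() pattern next to the instanceof variant. All four class names (Base, Derived, LooseBase, LooseDerived) are hypothetical, as is EqualsDemo.

```java
// Recommended pattern: the class check is made once, at the top of the hierarchy.
class Base {
    protected int a;

    Base(int a) { this.a = a; }

    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (obj == null || getClass() != obj.getClass()) return false;
        return a == ((Base) obj).a;
    }

    public int hashCode() { return a; }
}

class Derived extends Base {
    private String b; // may be null

    Derived(int a, String b) { super(a); this.b = b; }

    public boolean equals(Object obj) {
        if (!super.equals(obj)) return false; // super checks class
        Derived other = (Derived) obj;
        if (b == null) return other.b == null;
        return b.equals(other.b);
    }

    public int hashCode() {
        return 37 * super.hashCode() + (b != null ? b.hashCode() : 0);
    }
}

// The instanceof variant, shown only for contrast.
class LooseBase {
    protected int a;

    LooseBase(int a) { this.a = a; }

    public boolean equals(Object obj) {
        if (!(obj instanceof LooseBase)) return false;
        return a == ((LooseBase) obj).a;
    }

    public int hashCode() { return a; }
}

class LooseDerived extends LooseBase {
    private int b;

    LooseDerived(int a, int b) { super(a); this.b = b; }

    public boolean equals(Object obj) {
        if (!(obj instanceof LooseDerived)) return false;
        LooseDerived other = (LooseDerived) obj;
        return a == other.a && b == other.b;
    }
}

class EqualsDemo {
    public static void main(String[] args) {
        LooseBase x = new LooseBase(1);
        LooseDerived y = new LooseDerived(1, 2);
        System.out.println(x.equals(y)); // prints true
        System.out.println(y.equals(x)); // prints false: symmetry is broken

        Base p = new Base(1);
        Derived q = new Derived(1, "two");
        System.out.println(p.equals(q)); // prints false
        System.out.println(q.equals(p)); // prints false: symmetric
    }
}
```

The loose pair answers differently depending on which object is asked, which is exactly the failure described in note 4.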


*Making a hash of it

The way Java is set up, most classes should implement hashCode and equals. However, it is easy to get these wrong, and the failures may be difficult to debug. So here is a fairly complete example of how to write hashCode.

Writing hashCode is much simpler than writing equals. The only strict principle that you absolutely must follow is:

Agreement with Equality
If x.equals(y), then x.hashCode() == y.hashCode().

If you don't maintain this invariant, then Hashtable data structures become corrupt! Here is an example of how to correctly implement hashCode, with the different cases that you may be faced with. You will see that this corresponds closely with the code for equals.

Unlike equals, hashCode does not need to use all the nontransient fields of an object; just enough of them to get a reasonable distribution across the range of int values.

Implementing HashCode

 1  public int hashCode() {
 2    int result = 0;
 3    // result ^= super.hashCode();
 4    result = 37*result + myNumericalPrimitive;
 5    result = 37*result + (myBoolean ? 1 : 0);
 6    result = 37*result + myObject.hashCode();
 7    result = 37*result +
 8     (myPossNull != null ? myPossNull.hashCode() : 0);
 9    // result = 37*result + myTransient.hashCode();
10    return result;
11  }

Notes

Line

Comment

... Why 37, you might ask? Actually, any reasonably sized prime number works pretty well.
 3. * Uncomment this line if and only if the immediate superclass is not Object; otherwise you will get the wrong result!
 7. You need this slightly more complicated code if a field could be null.
 9. Transient fields, such as caches, are irrelevant to the equality of the object, and should be ignored. You must not include any fields in your hashCode that are not included in your equals code.

* If your keys in a Hashtable are not Immutable, be careful; if you change the value of the key you must first remove the key-value pair from the table, and then re-enter the pair after you change the value of the key. Otherwise your Hashtable becomes corrupt!
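
The following sketch shows the failure and the remove-change-re-enter sequence described above. MutableKey, KeyDemo, and rekey are hypothetical names used only for illustration.

```java
import java.util.Hashtable;

class MutableKey {
    private int id;

    MutableKey(int id) { this.id = id; }

    void setId(int newId) { id = newId; }

    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (obj == null || getClass() != obj.getClass()) return false;
        return id == ((MutableKey) obj).id;
    }

    public int hashCode() { return id; }
}

class KeyDemo {
    // Remove the pair, change the key, then re-enter the pair.
    static void rekey(Hashtable<MutableKey, String> table, MutableKey key, int newId) {
        String value = table.remove(key);
        key.setId(newId);
        if (value != null) {
            table.put(key, value);
        }
    }

    public static void main(String[] args) {
        Hashtable<MutableKey, String> table = new Hashtable<MutableKey, String>();

        MutableKey careless = new MutableKey(1);
        table.put(careless, "one");
        careless.setId(2); // changed while still in the table
        System.out.println(table.get(careless)); // prints null: the entry is stranded

        MutableKey careful = new MutableKey(10);
        table.put(careful, "ten");
        rekey(table, careful, 20);
        System.out.println(table.get(careful)); // prints ten
    }
}
```

The stranded entry is still counted by size() and still occupies memory, but no lookup by key can reach it, because it was filed under the old hash code.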


*Doppelgänger

Implementing clone allows other programmers to use your objects as fields and to safely implement getters, setters, and clone themselves. You should provide a clone operator for all of your classes.

However, suppose you are feeling lazy, and want to get away with the absolute minimum. You do not need to provide a clone method if your superclass does not implement a public clone method, and your object falls under one of the following cases:

  • It is Immutable, or
  • It would never be a field in another object that itself will need to implement clone, or
  • It is final, and can be duplicated with public getters and setters. (That is, your object can be duplicated by getting all of the state of your object with public getters, then creating a new object with the identical state.)

The only strict principles that you must follow for clone are:

Clone Equality
If y == clone(x), then x.equals(y).
 
Clone Independence
If y == clone(x), then no setter on y can cause the value of x to be modified.

This is what is known as a deep clone. There are cases where it may make sense to provide a shallow clone, especially with collection classes. Such a shallow clone only clones the top-level structure of the object, not the lower levels. A shallow clone is useful in many circumstances so long as programmers can somehow still implement a deep clone on top of those objects. Ideally, the class would implement both, with a separate method called shallowClone.

Here is an example of how to correctly implement clone, with the different cases that you may be faced with.

Implementing Clone

 1  protected Object clone() throws CloneNotSupportedException {
 2   Sample result = (Sample) super.clone();
 3   result.myGood = (Good) myGood.clone();
 4   result.myTransient = null;
 5   result.myVector = (Vector) myVector.clone();
 6   for (int i = 0; i < myVector.size(); ++i) {
 7    result.myVector.setElementAt(
 8      ((Cloneable) myVector.elementAt(i)).clone(), i);
 9   }
10   result.myBad = new Bad(myBad.getSize(), myBad.getColor());
11   result.myBad.setActiveStatus(Bad.INACTIVE);
12   return result;
13  }

Notes

Line

Comment

 2. This copies the superclass's fields, and makes bitwise copies of your fields. You do not have to copy any primitives or Immutables (such as String) in the rest of your code.
 4. You should set your transient fields to an invalid state, to signal that they need to be rebuilt. Do this if the field is Mutable and not shared between objects.
 6. If the members on the Vector are Immutable, then you don't have to clone them, as in lines 6-9. Use the same style for arrays: for example, you can just call
foo = (int[])other.foo.clone();
 8. * Unfortunately, this method of deep-cloning a Vector (or array, or Dictionary) actually will not work, because of an annoying flaw in the Cloneable interface; surprisingly, it does not have clone() as a method! (And Object's clone is protected, not public.) This is despite the statement in JPL (page 68) that "The clone method in the Cloneable interface is declared public..."

The result is, you cannot polymorphically implement clone in many cases; you have to have preknowledge of the precise type (or an overall superclass) of the objects in the collection, and cast them to that type to call their clone operator.

Keep your fingers crossed that this flaw is fixed in JDK 1.1!

  10. If the author of the Bad class was a bit lazy, and did not supply you with a clone operator, you will do it yourself with a constructor and setters as necessary. If the object is of a subclass of Bad which you are not aware of, then despite your best efforts the object will be sliced, and data will be lost.
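
Given the limitation described in note 8, a deep clone of a Vector has to know the element type in advance. The sketch below assumes every element is a Good with a public clone method; the add and get helpers on Sample and the size field on Good are hypothetical.

```java
import java.util.Vector;

class Good implements Cloneable {
    private int size;

    Good(int size) { this.size = size; }

    int getSize() { return size; }

    void setSize(int newSize) { size = newSize; }

    // Widened to public so that other classes can call it.
    public Object clone() {
        try {
            return super.clone();
        } catch (CloneNotSupportedException e) {
            throw new InternalError(e.toString()); // cannot happen: we are Cloneable
        }
    }
}

class Sample implements Cloneable {
    private Vector<Good> myVector = new Vector<Good>();

    void add(Good g) { myVector.addElement(g); }

    Good get(int i) { return myVector.elementAt(i); }

    public Object clone() {
        try {
            Sample result = (Sample) super.clone();
            // Build a new Vector and clone each element through its known type.
            Vector<Good> copy = new Vector<Good>(myVector.size());
            for (int i = 0; i < myVector.size(); ++i) {
                copy.addElement((Good) myVector.elementAt(i).clone());
            }
            result.myVector = copy;
            return result;
        } catch (CloneNotSupportedException e) {
            throw new InternalError(e.toString());
        }
    }
}

class CloneDemo {
    public static void main(String[] args) {
        Sample x = new Sample();
        x.add(new Good(1));

        Sample y = (Sample) x.clone();
        y.get(0).setSize(9); // must not leak back into x

        System.out.println(x.get(0).getSize()); // prints 1
    }
}
```

This satisfies Clone Independence only as long as every element really is a Good; elements of an unknown subclass with extra mutable state would need their own clone override.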

In implementing clone, getters, setters, and thread-safety, Immutable would actually be a very useful Java interface. You could then test objects at runtime with instanceof to see if you need to take special action with them. Although it is unfortunately not in the standard Java class libraries, you may find it useful to define it as an interface in your own code for just this purpose.

Even if your class is Immutable you may wish to provide a clone operator; that way it may be simpler for users of your class. If you do so, then your implementation is very simple. Don't even bother to call super.clone--just return this. For complex classes, this will be much faster. Since your class is Immutable, it satisfies both the clone principles of Independence and Equality.

* Be careful though; if a class is not final, any subclass could fail to uphold its immutability (since there is no language support for it). So if you want to be safe, don't depend on immutability for features such as this unless the class is final. For example:

class SupposedlyImmutable implements Cloneable {
 SupposedlyImmutable(int x) {
  value = x;
 }
 int getValue() {
  return value;
 }
 public Object clone() {
  return this; // shouldn't do since class isn't final
 }
 private int value;
}

class BreaksImmutable extends SupposedlyImmutable {
 BreaksImmutable(int x, int y) {
  super(x);
  value2 = y;
 }
 void setExtra(int y) {
  value2 = y;
 }
 int getExtra() {
  return value2;
 }
 public Object clone() {
  return super.clone(); // fails Independence
 }
 private int value2;
}


*Don't try this at home

Getters and setters seem trivial, but incorrect construction can leave your object open to pernicious bugs (some of which will also show up in other methods besides getters and setters). For example, look at the following:

Dangerous Getter/Setters

 1  // definition
 2  public Foo[] getFooArray() {
 3   return fooArray;
 4  }
 5
 6  public void setFooArray(Foo[] newValue) {
 7   fooArray = newValue;
 8  }
 9
10  private Foo[] fooArray;
11  ...
12  // usage
13  Foo[] y = x.getFooArray();
14  y[3].changeSomething();
15
16  x.setFooArray(z);
17  z[3].changeSomething();

With these setters and getters, lines 14 and 17 change the state of your object behind your back. If you had other state in your object that needed to be in sync with fooArray, you are now in an inconsistent state. Moreover, even if you didn't have such state, if any of your potential subclasses had such state, they would now be corrupted. You might just as well have made fooArray public!

If your field is Immutable or a primitive, then you can just use the simple code with perfect safety. If not, then you need to consider the use of your field. Your choices are:

  • For complete safety, clone the field in getters and setters of Mutables. The downside of this approach is that you take a certain performance hit, sometimes an unacceptable one.
  • For pretty good safety, use a read-only interface on your getter, as in Bullet-proofing. This prevents most accidents from happening. For full safety, you still would need to clone incoming Mutable parameters in your setter.
  • Bite the bullet, document what changes the caller may make to objects passed in or returned, and depend on your callers not to make a mistake!
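
The first choice, cloning in both the getter and the setter, might look like the following sketch. It assumes that Foo has a public clone method; the Holder class, the deepCopy helper, and the state field are hypothetical.

```java
class Foo implements Cloneable {
    private int state = 0;

    void changeSomething() { ++state; }

    int getState() { return state; }

    public Object clone() {
        try {
            return super.clone();
        } catch (CloneNotSupportedException e) {
            throw new InternalError(e.toString()); // cannot happen: we are Cloneable
        }
    }
}

class Holder {
    private Foo[] fooArray = new Foo[0];

    // Callers get a private copy of the array and of every element.
    public Foo[] getFooArray() {
        return deepCopy(fooArray);
    }

    // The incoming array is copied, so later changes by the caller stay outside.
    public void setFooArray(Foo[] newValue) {
        fooArray = deepCopy(newValue);
    }

    private static Foo[] deepCopy(Foo[] source) {
        Foo[] copy = new Foo[source.length];
        for (int i = 0; i < source.length; ++i) {
            copy[i] = (source[i] == null) ? null : (Foo) source[i].clone();
        }
        return copy;
    }
}

class GetterDemo {
    public static void main(String[] args) {
        Holder x = new Holder();
        Foo[] z = { new Foo() };

        x.setFooArray(z);
        z[0].changeSomething();            // affects only the caller's array

        Foo[] y = x.getFooArray();
        y[0].changeSomething();            // affects only the returned copy

        System.out.println(x.getFooArray()[0].getState()); // prints 0
    }
}
```

Every call now allocates a new array and new elements, which is the performance hit mentioned above; for large arrays or hot paths, the read-only interface may be the better compromise.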


*Allegro ma non troppo

There is a technique for speeding up equals and hashCode. It is worth implementing under the following circumstances:

  • Your objects don't change often.
  • You are doing a lot of equality comparisons or hashCode calls.

Note that if your objects are put into Hashtables or other collections, these comparisons or hashCode calls will be made behind your back.

While it speeds up hashing and comparison dramatically, if your objects are not compared or hashed very often, don't bother using this technique.

This technique provides some very fast checks for equality by adding a version count and a hash cache. To use it, add the following code to your class definition. Then, in any method that alters a nontransient field of the object, call changeValue.

Fast equals & hashCode

public int hashCode() {
 if (hashCache == -1) {
  hashCache = <old hashCode computation code here>
  if (hashCache == -1) {
   hashCache = 1; // -1 is reserved to mean "not yet computed"
  }
 }
 return hashCache;
}

public boolean equals(Object other) {
 if (other == this) return true;
 if (other == null || getClass() != other.getClass()) return false;
 MyType x = (MyType) other;
 if (versionCount == x.versionCount) return true;
 // compare through hashCode() so that an unfilled cache is computed first
 if (hashCode() != x.hashCode()) return false;
 <rest of old field comparison code here>
 // the objects are equal: give both the later version count
 if (versionCount < x.versionCount) {
  versionCount = x.versionCount;
 } else {
  x.versionCount = versionCount;
 }
 return true;
}

public void setFoo(ConstFoo newValue) {
 foo = newValue;
 changeValue();
}

// ============= privates =============

private static long masterVersionCount = 0;
// start with a unique count, so that two new objects are not taken as equal
private long versionCount = ++masterVersionCount;
private int hashCache = -1;

private final void changeValue() {
    hashCache = -1;
    versionCount = ++masterVersionCount;
}

Theoretically, you could have a problem with the versionCount wrapping back to zero. However, even if you altered your objects once every nanosecond, it would still take over 100 years for a wrap to occur. However, if you really want to be safe, instead of incrementing versionCount you can use the clever trick of allocating a new Object each time. This will be airtight even in the days of Terahertz processors.
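
The Object-allocation variant could be sketched as follows. The class name Tracked, its single int field, and the particular hash formula are assumptions for illustration; like the counter version, this sketch makes no attempt at thread-safety.

```java
class Tracked {
    private int foo;

    // A fresh token per change: two tokens are identical only if no change intervened.
    private Object version = new Object();
    private int hashCache = -1;

    Tracked(int foo) { this.foo = foo; }

    void setFoo(int newValue) {
        foo = newValue;
        changeValue();
    }

    private void changeValue() {
        hashCache = -1;
        version = new Object(); // can never collide with an earlier token
    }

    public int hashCode() {
        if (hashCache == -1) {
            hashCache = 37 * 17 + foo;
            if (hashCache == -1) {
                hashCache = 1; // -1 is reserved to mean "not yet computed"
            }
        }
        return hashCache;
    }

    public boolean equals(Object other) {
        if (other == this) return true;
        if (other == null || getClass() != other.getClass()) return false;
        Tracked x = (Tracked) other;
        if (version == x.version) return true;      // known equal since the last change
        if (hashCode() != x.hashCode()) return false;
        if (foo != x.foo) return false;
        x.version = version;                         // remember that the two are equal
        return true;
    }
}

class TrackedDemo {
    public static void main(String[] args) {
        Tracked a = new Tracked(5);
        Tracked b = new Tracked(5);
        System.out.println(a.equals(b)); // prints true (full comparison, then tokens shared)
        System.out.println(a.equals(b)); // prints true (fast path on the shared token)

        b.setFoo(6);
        System.out.println(a.equals(b)); // prints false
    }
}
```

The trade-off is one small allocation per change instead of an increment, in exchange for never having to reason about wraparound.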


*Pitfalls

  •  Suppose that you want to remove characters from a StringBuffer. Unfortunately there is no method to do so; you have to resort to the following code to delete from start to end.
a = new StringBuffer(a.toString().substring(0,start))
        .append(a.toString().substring(end,a.length()));
  • StringBuffer doesn't implement equals correctly, as discussed in Liberté, Égalité, Fraternité.
     
  • There is no constructor to make a String from a char, so use:
String foo = new String(ch + "");

Similarly, the following code doesn't do what you expect; since there is no explicit constructor for a char, StringBuffer casts up to an int and allocates a buffer of length 0x61!

StringBuffer result = new StringBuffer('a');

  • In String, the version of indexOf and lastIndexOf that searches for characters has the char typed as an int. This makes it easy to make a mistake, as illustrated in the following code, which searches for (char)start in myString, starting at offset (int)myChar!
position = myString.indexOf(start, myChar);
  • Unfortunately, many objects (StringBuffer, Vector, Dictionary...) do not implement a clone, or implement only a shallow clone. This causes a number of problems: see Doppelgänger.
     
  • The methods DataInput.readLine and PrintStream.println only handle '\n' delimited strings properly. If you are reading and writing platform-specific text files (which is the vast majority of the cases!), you will have to work around that. Luckily, you can get the line delimiter from
static String eol = System.getProperties()
 .getProperty("line.separator");

    So to handle output, just replace println(x) with print(x+Globals.eol). Input is quite a bit more annoying; you will have to write your own input routine that recognizes eol instead of just \n.
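
Several of the pitfalls above can be observed directly. In this sketch the class PitfallDemo and the sample string are hypothetical; System.getProperty is used as a shorter route to the same line.separator property.

```java
class PitfallDemo {
    // Platform line delimiter, read once.
    static final String EOL = System.getProperty("line.separator");

    public static void main(String[] args) {
        // The char is widened to int and taken as a capacity, not as content.
        StringBuffer wrong = new StringBuffer('a');
        System.out.println(wrong.length());   // prints 0
        System.out.println(wrong.capacity()); // prints 97 (0x61)

        // Appending the char gives the intended one-character buffer.
        StringBuffer right = new StringBuffer().append('a');
        System.out.println(right.length());   // prints 1

        // indexOf(int ch, int fromIndex): swapping the arguments still compiles.
        String myString = "banana";
        char myChar = 'n';
        int start = 3;
        System.out.println(myString.indexOf(myChar, start)); // prints 4
        System.out.println(myString.indexOf(start, myChar)); // prints -1

        // Output with an explicit platform delimiter instead of println.
        System.out.print("done" + EOL);
    }
}
```

Because a char converts silently to an int, the compiler accepts both indexOf calls; only the argument order distinguishes the character from the offset.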






JavaTM is a trademark of Sun Microsystems, Inc.

Other companies, products, and service names may be trademarks or service marks of others.
