| [back to bqbase index] | ||||||||||||||||||||||||||||||||||||||||||||
Comparing Impact of Type Erasure in Scala and Java |
||||||||||||||||||||||||||||||||||||||||||||
The implementation of Java Generics in JDK1.5 and following uses an erasure scheme in order to be able to compile generic types to non-generic types understood by the JVM. This is in contrast to C#2.0, which is compiled to a .NET CLR that understands generics. Scala, being designed to interact smoothly with Java, uses erasure, too. There is some discussion on the net concerning erasure in Java -- enough motivation to compare ideas and limitations of the erasure approach as implemented in Scala and Java.
Generics PapersThe path from the theoretical F-bounded polymorphism to its application in object-oriented programming [0] and finally JDK1.5 and C# 2.0 has seen some considerable research. I want to list a non-exhaustive list of papers here that touch on generics, erasure and variance. Java generics are based on the Pizza design by Odersky and Wadler [1], where erasure is called "homogeneous translation". A key property of this design was that it did not modify the JVM. It was possible to translate away generics source-to-source. Sun Microsystems was interested, Bracha, Odersky, Stoutamire and Wadler reworked the design for Java, yielding GJ [2]. People had something to discuss. The GJ compiler (with generics turned off) became JDK1.3. Igarashi, Pierce and Wadler published another formalization in their Featherweight Java calculus [3]. Thorup and Torgersen introduced use-site variance, applying virtual types [4]. Use-site variance was proven sound for GJ quite a bit later by Igarashi and Viroli [6]. While Sun hesitated on the generics issue, Kennedy and Syme provided a different design for .NET [5] -- one that does not use erasure. Finally, Torgersen et al. adapted use-site variance by introducing wildcards [8,9]. When JDK1.5 came out, all new facilities were introduced in a way that prioritized backwards-compatibility. The JVM was not affected by generics, and moreover erasure of the new collections libraries was equivalent with the old, non-generic ones. In contrast, C# 2.0 offered generics but currently does not tell a story on variance (a definition-site variance scheme for C# is proposed in [10]). The intermediary language (the virtual machine) changed to accomodate generics, so consequently the non-generic collection libraries were complemented by new, generic ones. Scala is compiled to the JVM language and aims for interoperability, so the straightforward choice was to use erasure for generics. We should mention that Scala supported generic types from the beginning, which predates the JDK1.5 release by a couple of years. Due to our compiler's architecture, our upcoming .NET backend will use erasure as well, even though interoperability on .NET would probably mandate preserving types. Erasure MantrapsJava Generics are often used with Java's use-site variance mechanism, wildcard types. Hence a lot of complaints regarding wildcards and erasure are really problems related to variance. We begin with two examples that show the value of variance in Scala. Then we turn to some anomalies regarding arrays. Variance ISafalra gives examples of the problems in his entry titled What's Wrong With Java: Type Erasure. We shall follow through his examples and see how Scala behaves on them.
In Java, an array of type
Since in Scala, Salafra is right when he points out that erasure makes it
impossible to add the needed runtime check (which is in case of
arrays). But it would be bad design to burden developers with
runtime checks. It was bad design to make arrays covariant in the
first place. In Scala (and a C# proposal [10]), class signature
with a variant parameter are restricted in a way that avoids
runtime checks. You simply cannot have methods behave in a way
that would require runtime checks. Concretely, you cannot add an
element to a Variance IIOn, Eric Gunnerson's blog Scott Wisniewski has problems in figuring out the right bounds for the generic types. The motivation seems to have a signature
The problem in Java is that although
There is pending proposal that allows to use wildcard types in the
Scala language. Their semantics is defined as an existential type
whose bounds are determined by the context. For covariant types, they
are simply replaced by the upper bound. The proposal would also allow
ArraysJava Generics do not allow the creation of arrays of a type parameter. Again, Scala does not know such a restriction, although it uses erasure.
In Java, the problem can be circumvented by using
Behind the scenes, Scala uses an array wrapper that is backed by an object array. It also contains the additional methods and a decent toString implementation that a Scala array offers with respect to a Java array. But since Scala aims for interoperability, arrays are turned into Java arrays when needed. Thus, the following code behaves the same.
We should mention that Java does not have a type that
corresponds to the CastingWhen casting, all bets are off. Generic types are no exception, and both Java and Scala behave the same.
Bridge methodsBridge methods are additional methods that get inserted in order to "connect" implementations of methods in instances of generic interfaces. For instance assume the following definitions
interface Comparable<T> { public int compareTo(T b) }
class Foo implements Comparable<Foo>{
public int compareTo(Foo b){
return 0;
}
}
Then consider a method that expects a The trouble is, what if there is a
Head Types, Raw Types and instanceOf
We include this subsection only for completeness, because Scala's
In Scala, a head type is not really considered a type -- it is some
artefact that occurs after typechecking. The Java designers however
chose to include things like Pattern MatchingFinally, we should mention facilities that exist in Scala, but not in Java. Scala hasmatch expressions, which however may not contain generic type tests.
x match {
case List[String](_*) => // syntax error
}
In fact, just like tests of the kind of isInstanceOf[List[String]], pattern matching will
erroneously match values of constructed types, because it compares only the head type List:
(List(1,2,3):List[Any]) match {
case _:List[String]=> Console.println("oops") // reached :-/
}
The above-mentioned proposal to introduce wildcard types also outlaws
type tests of constructed types in casts and Scala's Compound TypesSince Scala's type system is more expressive, type erasure affects Scala also in programs that cannot be expressed in Java. Compound types have been suggested for Java, too [11]. Compound types express combinations of several types (e.g. when defining a variable or in a pattern match). The type T1 with ... with Tn { R } talks about a type made of component types Ti and refinement R. That is, a value of a type conforms to each Ti. Combining several traits is possible using a form of multiple inheritance, and there are rules avoiding the known problems of diamond inheritance and accidental name clashes (but we cannot discuss everything here, eh? :-) ). The type erasure of such a compound type is simply T1 (the first), so the others are forgotten. Compound types are not allowed as type tests in pattern matching, for a reason. The following simple program looks meaningful but demonstrates the limits of type erasure when compiling Scala code.
object main { // don't do this at home
trait Impl
trait SizeImpl extends Impl { def size = 42 }
trait ColorImpl extends Impl { def color = "red" }
type Both = SizeImpl with ColorImpl
def info(x:Impl) = x match {
case x:Both => "size "+x.size+" color "+x.color // you wish
case x:SizeImpl => "size "+x.size
case x:ColorImpl => "color "+x.color
case _ => "n.a."
}
def main(args:Array[String]): Unit = {
// make up some class that has a size
class MyNode extends SizeImpl
Console.println("hello " + info(new MyNode))
}
}
Easy, right? If something has both, a size and a color, we want to give both. Update: Feature Request bug#789 was implemented and so this program works now. The subtlety here is in performing two tests
ConclusionErasure seems useful not just for backwards-compatibility, but because complete runtime-type information as promoted by dynamically typed languages comes at a cost. The design of the .NET CLR Generics tackles this cost by code specialization. The above cases should have made clear when it is erasure and when it is the language that is to be blamed for a particular shortcoming. I claim that a lot of erasure-bashing is really due to a lack of definitions-site variance in Java's type system. While it's true that the existing use-site variance is more flexible, it seems cumbersome to use. In case you are not a hardcore user already, I also hope that you got curious about Scala. Sorry for the vagueness regarding the various Scala improvement proposals floating around. These are not written down in a publishable form. When such a write-up becomes available I shall include proper links here. References
|
published:2006-10-16 last change:2006-11-03
changelog: