Notice: This material is excerpted from Special Edition Using Java, ISBN: 0-7897-0604-0. The electronic version of this material has not been through the final proof reading stage that the book goes through before being published in printed form. Some errors may exist here that are corrected before the book is published. This material is provided "as is" without any warranty of any kind.
by Alexander Newman
In this chapter, we'll talk a little bit about what Java is, in general terms. Java is in many ways similar to programming languages that have gone before it, but there are a few key differences. Fortunately, the fact that there's a standard vocabulary used to describe computer languages makes our job a little easier.
Throughout the upcoming chapter, we'll see that Java's features are really all interrelated. You can't talk about Java security without discussing the fact that Java is interpreted by the browser. You can't discuss the fact that Java is interpreted as well as compiled without touching on architecture independence-and architecture independence is the cornerstone of Java's portability.
In this chapter you'll learn:
You can make up whatever coffee metaphors you like, but there's no getting around the fact that Java is the most exciting thing to hit the Internet since the World Wide Web. That's because it fills a need-several needs, in fact. With Java, programmers are able to deliver what everyone who uses the Web has been clamoring for-true interactivity.
First of all, Java is a programming language. Languages are used to compose a set of instructions which a computer follows. These groups of instructions are called programs, applications, executables, or, in the case of Java, applets. Java can also be used to build stand-alone programs, called applications, just like any other programming language. It's the applets that are the most innovative thing about Java.
Java applets, as discussed in the introduction, add life to the World Wide Web. By "life" I mean that with Java you can add animation, local data searches, and a wide variety of other functions and features that just weren't possible without the Java environment. It's not surprising that companies from Netscape to America Online have jumped on-board the Java bandwagon to design the next generation of browsers.
This brings us to browsers. In order to look at the Web, you need a browser. A browser is an application which runs over the Internet and interprets HTML code. There are graphical and non-graphical browsers, but we're only interested in the graphical ones (the ones that can display pictures on your screen). There are a lot of browsers available. Currently there are only two which support Java: Netscape 2.0 and HotJava.
Netscape 2.0 can be downloaded from
Currently HotJava is available for Windows 95, Windows NT, and Sun's Solaris 2.x platforms. Other systems, including Mac OS 7.5, are in the works and should be available soon. HotJava can be downloaded from
Browsers aren't the only way to view applets. Sun has a utility called Appletviewer. Unfortunately, Appletviewer, which is currently available for Sun's Solaris operating system, Windows, and the Macintosh, doesn't interpret HTML code. It only shows applets.
For example, the Web page called Example1.html contains the following code:
<title>Jumping Box</title>
<hr>
<H2>The Amazing Jumping Box!</H2>
<P>
<applet code=MouseTrack.class width=300 height=300>
</applet>
<P>
<H3>Stunning and amazing -- a game of skill for children of all ages!</H3>
<hr>
<a href="MouseTrack.java">The source.</a>
Figure 1.1 shows that file, Example1.html, in Sun's Applet Viewer.
Fig. 1.1 Example1.html viewed with Applet Viewer
Looking at this Web page through the Applet Viewer, we lose all of the text. On the other hand, if we look at it through a Java-powered browser (in this case, Netscape Navigator 2.0 for the Macintosh), we can see not only the applet, but the text ("Stunning and amazing-a game of skill for children of all ages!") as well.
Fig. 1.2 Example1.html viewed with Netscape Navigator
Java is a complete programming environment and comes with its own set of tools, including a compiler (javac) and a debugger (jdb). We'll talk about both the compiler and the debugger in depth later on. For now, it's important to get a basic grasp of how the Java compiler acts upon what you write to produce bytecode.
Byte code format is halfway between the code you write (source code) and the code the computer reads (which is a machine language specific to the architecture of each type of computer). In other words, the Java compiler takes your relatively short Java source code and breaks it down (lengthens it, really) into a longer set of instructions that are ready to be run on a computer-once the specifics of that computer are known.
Bytecode consists of a lot of the material produced by a normal compiler, but stops just before conforming the code to the particulars of a specific architecture or operating system. The bytecode is passed along to the runtime environment. The runtime environment interprets and checks the integrity and security of bytecode, then dynamically applies specifics based on the parameters found in the HTML code, system configurations, and environmental variables.
Now that we've established the difference between a browser (HotJava, Netscape) and a programming language (Java), it's time to get down to what Java is.
The folks who designed Java hoped to solve some of the problems they saw in modern programming. As we said, Java's core principles developed out of a desire to build software for consumer electronics. Like those devices, the language needed to be compact, reliable, portable, distributed, real-time, and embedded.
Like most modern products, Java can be neatly defined in a set of buzzwords. Sun Microsystems' official definition of Java is:
Java: A simple, object-oriented, distributed, interpreted, robust, secure, architecture neutral, portable, high-performance, multithreaded, and dynamic language.
Even though the Java developers decided that C++ was unsuitable for their purposes, they designed Java to be as close to C++ as possible. This was done in order to make the system more familiar, more comprehensible, and to shorten the time necessary to learn the new language. One of Java's greatest appeals is that, if you're a programmer, you already know how to use it. Ninety percent of the programmers working these days use C, and almost all object-oriented programming is done in C++.
Another goal of the designers was to eliminate support for multiple class inheritance, operator overloading, and extensive automatic coercion of data types, several of the poorly understood, confusing, and rarely used features of C++. Also omitted from Java were header files, the pre-processor, pointer arithmetic, structures, unions, and multi-dimensional arrays. The designers were selective, and retained features that would ease development, implementation, and maintenance of software, while omitting things that would slow a developer down. For example, even though operator overloading (which exists when operators have more than one semantic interpretation) was eliminated, they kept method overloading.
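To make method overloading a little more concrete, here's a minimal sketch. The class name OverloadDemo and its describe methods are made up for illustration; they aren't part of any Java library.

```java
// Hypothetical class for illustration only.
class OverloadDemo {
    // Three methods share one name; the compiler picks the version
    // whose parameter list matches the arguments at the call site.
    static String describe(int value) {
        return "int: " + value;
    }

    static String describe(String value) {
        return "String: " + value;
    }

    static String describe(int first, int second) {
        return "two ints: " + first + ", " + second;
    }

    public static void main(String[] args) {
        System.out.println(describe(7));
        System.out.println(describe("seven"));
        System.out.println(describe(3, 4));
    }
}
```

Notice that nothing here redefines what an operator such as + means; only the method name is reused.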
Here's the now-famous "Hello World" program, rendered as a Java applet:
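import java.awt.*;

public class applet extends java.applet.Applet {
    // Called by the browser or viewer whenever the applet must be drawn.
    public void paint(Graphics g) {
        // Draw the greeting starting 25 pixels from the left,
        // with its baseline 25 pixels from the top.
        g.drawString("Hello world!", 25, 25);
    }
}
Figure 1.3 shows the "Hello World" applet and HTML code for the page "example1.html" in Sun's Applet Viewer. Not very exciting, but we've got to start small.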
Fig. 1.3 HelloWorld applet, as seen with Applet Viewer
Another problem faced by C and C++ programmers is storage management, which is the allocation and freeing of memory. Ordinarily, a C programmer needs to keep a careful eye on how much memory their program is using. When a chunk of memory is no longer being utilized, the programmer needs to make sure the program frees it up so it can be re-used. This is even harder than it sounds, especially in large programs, and is the main cause for memory leaks and bugs.
When programming in Java, you don't need to worry about those problems.
The Java system has built-in automatic garbage collection. The garbage collector simplifies Java programming, but at the expense of making the system more complicated. Because it has automatic garbage collection, Java not only makes programming easier, it also dramatically cuts down on the number of bugs and eliminates the hassle of memory management.
One of the features of Java which Sun neglected to mention in its definition was Java's size-or lack of it. As a side effect of being simple, Java is very small. Remember that one of the original goals of Java was to facilitate the construction of software that ran stand-alone in small machines. The original *7 module that Java was developed for only had 3Mb of main memory. Java can happily run on personal computers with at least 4Mb of RAM or even VCRs, telephones, or doorknobs.
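As a rough sketch of what that means in practice, the hypothetical method below allocates a fresh array on every pass through a loop and never frees any of them. The names GarbageDemo and fillAndDiscard are ours, invented for this example. Exactly when the memory gets reclaimed is up to the runtime, so the code makes no assumptions about timing.

```java
// Hypothetical class for illustration only.
class GarbageDemo {
    // Allocates many short-lived buffers without ever freeing one explicitly.
    static long fillAndDiscard(int rounds) {
        long total = 0;
        for (int i = 0; i < rounds; i++) {
            int[] buffer = new int[1024]; // fresh allocation on each pass
            buffer[0] = i;
            total += buffer[0];
            // No free() or delete here. After this pass nothing refers to
            // the array any more, so the garbage collector may reclaim it.
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(fillAndDiscard(100000));
    }
}
```

The equivalent C code would need a matching free() for every allocation, and forgetting one is exactly the kind of leak described above.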
The size of the basic interpreter and class support is about 40K of RAM; adding the basic standard libraries and thread support (essentially a self-contained microkernel) adds an additional 175K. The combined total of approximately 215K is significantly smaller than comparable programming languages and environments.
Java is an object-oriented language. That means that it's part of a family of languages that focuses on defining data as objects and the methods that may be applied to those objects. As we've said, Java and C++ share many of the same underlying principles; they just differ in style and structure. Simply put, object-oriented (OO, for short) languages describe interactions among data objects. To make an analogy with medicine, an "object-oriented" doctor would be interested in holistic medicine, examining the body (or object) as a whole first, and the vaccines, diets, and medicine (the tools) used to make your health improve after that. A "non-object-oriented" doctor would think primarily of his tools.
Many OO languages support multiple inheritance, which can sometimes lead to confusion or unnecessary complications. Java doesn't; as part of its "less-is-more" philosophy, Java only supports single inheritance. That means that each class can only inherit from one other class at any given time. This avoids the problem of a class inheriting from classes whose behaviors are contradictory or mutually exclusive.
Having said that, we should point out that, while Java does not support multiple inheritance per se, it does support interfaces, which behave like completely abstract classes, and a single class can implement several of them. Interfaces allow programmers to define which methods must exist and worry about how the methods will be implemented later. This bypasses a lot of the problems inherent in actual multiple inheritance while still retaining most of the advantages.
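Here's a small sketch of one class picking up behavior from two interfaces at once. The names Edible, Storable, and Cheese are invented for this example.

```java
// Hypothetical interfaces: each one only says which methods must exist.
interface Edible {
    String taste();
}

interface Storable {
    int shelfLifeDays();
}

// One class can implement both, supplying the actual method bodies.
class Cheese implements Edible, Storable {
    public String taste() {
        return "sharp";
    }

    public int shelfLifeDays() {
        return 30;
    }
}

class InterfaceDemo {
    public static void main(String[] args) {
        Cheese cheddar = new Cheese();
        Edible snack = cheddar;      // viewed as something Edible
        Storable stock = cheddar;    // the same object, viewed as Storable
        System.out.println(snack.taste());
        System.out.println(stock.shelfLifeDays());
    }
}
```

Because neither interface carries any implementation of its own, there's nothing contradictory for Cheese to inherit.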
Each class, whether abstract or not, defines the behavior of an object through a set of methods. All the code used for Java is divided into classes. Behaviors can be inherited from one class to the next, and at the head of the class hierarchy is the class called "Object". This is illustrated in figure 1.1, which shows a class called "Meat" that inherits from the class "Food", which, as all classes ultimately do, inherits from the class called "Object".
Objects can also implement any number of interfaces (the completely abstract classes we just mentioned, remember?). Java interfaces are a lot like Interface Definition Language (IDL) interfaces. That similarity means that it's easy to build a compiler from IDL to Java.
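The Food and Meat hierarchy can be sketched in a few lines. The method names and the HierarchyDemo helper are our own additions for illustration; only the class names come from the text.

```java
// Food has no "extends" clause, so it inherits directly from Object.
class Food {
    String describe() {
        return "something edible";
    }
}

// Meat inherits describe() from Food without redefining it.
class Meat extends Food {
}

class HierarchyDemo {
    // Walks up the hierarchy from a class to the top, collecting names.
    static String superclassChain(Class<?> type) {
        StringBuilder chain = new StringBuilder(type.getSimpleName());
        for (Class<?> c = type.getSuperclass(); c != null; c = c.getSuperclass()) {
            chain.append(" -> ").append(c.getSimpleName());
        }
        return chain.toString();
    }

    public static void main(String[] args) {
        System.out.println(new Meat().describe());
        System.out.println(superclassChain(Meat.class));
    }
}
```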
That compiler could be used in the CORBA (Common Object Request Broker Architecture) system of objects to build distributed object systems. Is this good? Yes. Both IDL interfaces and the CORBA system are used in a wide variety of computer systems and this facilitates Java's platform independence, which we'll talk more about later.
As part of the effort to keep Java simple, not everything in this object-oriented language is an object. Booleans, numbers, and other simple types are not objects, but Java does have wrapper objects for all simple types. Wrapper objects allow values of the simple types to be treated as though they were objects.
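A quick sketch shows the difference between a simple type and its wrapper. WrapperDemo and typeNameOf are hypothetical names; Integer and Boolean are the standard wrapper classes in java.lang.

```java
// Hypothetical class for illustration only.
class WrapperDemo {
    // Accepts any object at all, which a bare int or boolean is not.
    static String typeNameOf(Object value) {
        return value.getClass().getName();
    }

    public static void main(String[] args) {
        int count = 42;                          // simple type, not an object
        Integer boxed = Integer.valueOf(count);  // wrapper object holding 42

        System.out.println(typeNameOf(boxed));
        System.out.println(boxed.intValue() + 1); // unwrap to do arithmetic
    }
}
```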
Object-oriented design is also the mechanism which allows modules to "plug and play." The object-oriented facilities of Java are essentially those of C++, with extensions from Objective C for more dynamic method resolution.
An essential characteristic of client/server environments like Java's is the ability to share both information and the data processing workload. The term "distributed" describes the relationship between system objects, whether these objects are on remote or local systems. One of the great things about Java applets and applications is that they can open and access objects across the Web via URLs as easily as they can access a local file system.
Another bonus for Java programmers is the extensive library of routines built into the language. This allows Java applications and applets to cope easily with TCP/IP protocols like HTTP and FTP. Currently, some of the other protocols which are in common use on the Web (protocols like gopher, mailto, or news) haven't been implemented in Java, but they will be in future releases.
The beauty of the distributed system is that multiple designers, at multiple remote locations, can collaborate on a single project. For example, by using Java, a distributed OO steam engine builder application that supports collaboration from other engine builders (at either remote or local sites) could be built. Using the OO steam engine builder, collaborators can work together to develop a better (faster, more economical, etc.) machine.
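To show how little code URL access takes, here's a sketch that reads the first line of whatever a URL points to. So that it runs without a network connection, it points the URL at a temporary local file; an http URL would go through the same openStream call. The names UrlReadDemo, firstLine, and roundTrip are ours.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical class for illustration only.
class UrlReadDemo {
    // Reads the first line of the resource a URL refers to.
    static String firstLine(URL location) throws IOException {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(location.openStream(), StandardCharsets.UTF_8))) {
            return reader.readLine();
        }
    }

    // Writes text to a temporary file, then reads it back through a file: URL.
    static String roundTrip(String text) {
        try {
            Path temp = Files.createTempFile("greeting", ".txt");
            try {
                Files.write(temp, text.getBytes(StandardCharsets.UTF_8));
                return firstLine(temp.toUri().toURL());
            } finally {
                Files.delete(temp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("Hello from a URL"));
    }
}
```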
Strictly speaking, that's true, although in reality Java is both interpreted and compiled. In fact, only about twenty percent of the Java code is interpreted by the browser - but this is a crucial twenty percent. Both Java's security and its ability to run on multiple platforms stem from the fact that the final steps of compilation are handled locally.
A programmer first compiles Java source into byte code, using the Java compiler. This byte code is binary and architecture-neutral (we'll also use the term "platform-independent"-they mean the same thing). This byte code isn't complete until it's interpreted by a Java run-time environment, usually a browser. Since each Java run-time environment is for a specific platform, the final product is going to work on that specific platform.
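One small way to see that byte code is the same binary format everywhere is to peek at the first four bytes of a compiled class file. The class-file format begins every file with the same marker, 0xCAFEBABE, whatever machine produced it. The sketch below assumes the runtime lets a program read its own system class files as resources, which standard desktop runtimes do; MagicNumberDemo and magicOf are hypothetical names.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical class for illustration only.
class MagicNumberDemo {
    // Returns the first four bytes of a class's compiled form, in hex.
    static String magicOf(Class<?> type) {
        String resource = "/" + type.getName().replace('.', '/') + ".class";
        try (InputStream in = type.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalStateException("No class file found for " + type);
            }
            int magic = new DataInputStream(in).readInt();
            return Integer.toHexString(magic).toUpperCase();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(magicOf(Object.class));
    }
}
```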
This is good news for developers. It means that Java code is Java code is Java code, no matter what platform you're developing for or on. That means you could write and compile a Java applet on your UNIX system and install the applet on your Web page. Three different people on three different machines-each with its own environment-can take a peek at your new applet. Provided each of those three people was running a Java-capable browser, it wouldn't matter whether they were on an IBM, an H-P, or a Macintosh. Using Java means that only one source of Java code needs to be maintained for the byte code to run on a variety of platforms. One pass through a compiler for multiple platforms is good news for programmers.
Be aware that, because Java byte code is interpreted, Web pages with applets frequently take much longer to load than those without. That's due, in part, to the fact that the byte code that will become the applets you see contains more compile-time data than is typically used in non-interpreted situations. The byte code is downloaded to your system, much as the HTML code or images that make up a Web page are. Then a series of run-time procedures check its security and robustness.
Why is this combination of compilation and interpretation a positive feature? First, it facilitates security and stability. The Java environment contains an element called the linker which checks data coming into your machine to make sure that it contains neither deliberately harmful files (security) nor files that could disrupt the functioning of your computer (robustness).
Despite the assurances of Sun and the Java team, you're not completely safe when using Java. There are still too many variables. An example of this can be seen at the Missile Commando site
See chapter 7 for more information on Java security.
Second, and more importantly, this combination of compilation and interpretation alleviates concerns about version mismatches.
The fact that the final portion of compilation is being done by a platform-specific device which is maintained by the end-user relieves the developer of the responsibility of maintaining multiple sources for multiple platforms. Interpretation also allows data to be incorporated at runtime, which is the foundation of Java's dynamic behavior.
"Robust" is simply computerspeak for how reliable a language is. The more robust a language is, the less likely that programs written in this language will crash and the more likely it is that they will be bug-free. Strongly typed languages (like Java or C++) allow extensive compile-time checking, meaning many bugs can be found early. "Strongly typed" means that most of the data type checking is performed during compilation rather than at runtime.
Simply put, Java programs can't crash your system. That's obviously a feature because crashes are a bad thing. At a minimum, they're inconvenient. At most, they can cost you many hours of work, or worse. A Java program can't cause a crash because it doesn't have permission to access all of your computer's memory. Programs written in other languages can traditionally access, and therefore change, any part of your system's memory, but Java has a built-in limitation. Java programs can only access a restricted area of memory, so they can't change a value that is not meant to change.
Unlike C++, which has a lot of loopholes in compile-time checking (particularly method/procedure declarations), Java requires declarations and doesn't support C-style implicit declarations. Implicit declarations are bad; C++ supports them mostly to be compatible with the older C language. The problem is that if a method is implicitly declared, its type information isn't available, which defeats the type-checking process. In short, with implicit declarations a C++ programmer can (deliberately or through oversight) do an end-run around the safety built into strong type-checking. A Java programmer can't do that.
As a final safety check, Java has the linker. The linker is part of the run-time environment. It understands the type system and during runtime repeats many of the type checks done by the compiler to guard against version mismatch problems.
Robustness is in many ways Java's most important feature. Despite the release of the 'gold' version of Java, it is still very much an experimental language and is still in the development stages. Java's stability (its robustness) encourages people to use it as a development platform. It allows programmers to focus on programming, rather than chasing bugs or memory leaks. And the more people work with Java, the closer it grows to becoming the standard that Sun hopes it can be.
Java's security is still something which remains to be proven. The concept of allowing another site to upload executables sight unseen to your machine is something which just doesn't sit right with many computer and Internet professionals. The simple truth is, viruses are out there, and no one wants to infect their machine by downloading a binary from the Net.
Some of the other features, such as robustness and the fact that Java is both interpreted and compiled, are aids to security. For example, the fact that Java programs can't access arbitrary memory means they can be safely executed. Mostly, Java is secure because it was designed to be.
As we've said, Java programs are first compiled into byte-code instructions, proto-programs, which are verified. Byte-code instructions, unlike other instruction sets, are not platform specific and contain extra type information. This type information can be used to verify the program's legality and to check for potential security violations.
Java uses a new and unique approach to calling functions. Traditionally, PC programs call functions by a numeric address. Since this address is just a numeric sequence and that sequence can be constructed any way the programmer likes, any number can be used to tell the program to execute a function. This provides a level of anonymity which makes it impossible to tell which functions will actually be used when the program is run.
Java, on the other hand, uses names. Methods and variables can only be accessed by name, which means that determining which methods and functions are actually used is easy. This verification process is used to ensure the byte code has not been contaminated or altered and that it conforms to Java language constraints.
Even if a "bad" applet somehow managed to slip through the verification process, the amount of damage it could do is extremely limited. That's because Java applets are executed in a restricted environment. Within this environment, the Java applet is not permitted to execute certain dangerous functions unless the end user allows it to.
The team that designed Java was well aware that if Java was to support applications on networks it would have to support a variety of systems with a variety of CPU and operating system architectures. A Java application can execute anywhere on the network, or on the Internet, because the compiler generates an architecture neutral object file format-the compiled code is executable on many processors, provided that processor is running a Java-savvy browser.
Once a Java program is written, it stays written. It doesn't need to be recompiled for each different platform. The Java language is the same on every computer. There's no "Java for Windows" or "Java for Macintosh". This isn't a new concept for the Web; HTML works the same way. However, HTML is only good for one thing: generating Web pages. Java, on the other hand, is a full-fledged programming language. A platform-independent language is not only useful for networks but also for single system software distribution. Currently, application writers developing software for the present computer market have to produce multiple versions of their application, one for each platform they want their software to run on. Current trends in the personal computer market, like Apple moving off the 68000 processor, make developing software that runs on all platforms almost impossible. With Java, the same version of the application runs on all platforms.
This is possible because, as we discussed earlier, the Java compiler generates bytecode instructions and not binaries. The bytecode instructions are not oriented towards a particular platform, yet they are easy to interpret on any machine and easily translated into native machine code at runtime.
Java's architecture neutrality makes the concept of porting from one platform to another a little redundant. But even beyond platform independence, Java's use of standards for data types eliminates the "implementation dependent" aspects of the specification that are found in C and C++.
For example, in Java the sizes of the primitive datatypes are specified, as is the behavior of arithmetic on them. Two examples that Sun uses are that "int" always means a signed two's complement, 32-bit integer, and "float" always means a 32-bit IEEE 754 floating point number. Computer technology has advanced to the point where making a decision like this is possible. Almost all interesting CPUs share these characteristics. Further, the Java libraries define portable interfaces for the three most common platforms: Unix, Windows and the Macintosh.
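You can check these guarantees for yourself. The sketch below uses the SIZE constants from the standard Integer and Float wrapper classes; the class name PrimitiveSizes and its methods are hypothetical.

```java
// Hypothetical class for illustration only.
class PrimitiveSizes {
    static int intBits() {
        return Integer.SIZE;   // number of bits in an int
    }

    static int floatBits() {
        return Float.SIZE;     // number of bits in a float
    }

    // Two's complement arithmetic: one past the largest int wraps around
    // to the smallest int instead of behaving unpredictably.
    static int wrapAround() {
        return Integer.MAX_VALUE + 1;
    }

    public static void main(String[] args) {
        System.out.println("int bits:   " + intBits());
        System.out.println("float bits: " + floatBits());
        System.out.println(wrapAround() == Integer.MIN_VALUE);
    }
}
```

The same program gives the same answers on every platform, which is precisely the point; in C, the size of an int depends on the compiler and the machine.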
But even beyond that, the Java system itself was built to be portable. Javac, the Java compiler, is written in Java and the runtime environment is written in ANSI C with a portability boundary which is essentially POSIX.
Interpreted
The Java interpreter can execute Java bytecodes directly on any machine to which the interpreter has been ported. Using Java speeds the development process and that speed allows developers to explore directions they might not have had time for using a system which didn't support linking. Linking, since it's a lighter-weight and more incremental process, only enhances the speed of the development process.
The linker also facilitates the debugging process. Since more compile-time information is carried over (and therefore available) at runtime as a part of the bytecode stream, the linker is able to check types. Dynamic type-checking further enhances debugging.
Ease of portability will go a long way towards establishing Java as a commonly used development platform. Remember that most of the software business is just that, a business. Businesses are interested in making money and one of the best ways to make money is to capture marketshare. Before Java, if you wanted to develop a software product that ran on the Macintosh and under Windows and on UNIX, there was a lot of redundancy on your development team; in fact, there were usually multiple development teams, one for each platform. This drastically increases overhead. With Java, you're essentially developing for several platforms simultaneously. The necessity of porting is nearly eliminated.
Java's not high performance in the sense of "faster than a comparable C++ routine". In fact, it's almost the same speed. In benchmark tests run by Sun Microsystems on one of their SPARCStation 10 machines, the performance of byte codes converted to machine code is almost indistinguishable from native C or C++. Of course, that doesn't take into account the time of runtime compilation.
Although interpreted byte code performance is fine for most usual tasks, there are certainly situations which call for higher performance. In those instances, interpreted byte codes may not be the way to go, especially when you consider that the byte code format was designed with generating machine code in mind. The process of generating machine code from byte code is pretty simple and produces reasonably good code.
The human brain is multithreaded. It can easily handle thousands of tasks simultaneously. You can be talking on the telephone while listening to the radio while pouring yourself a glass of orange juice while pinning the telephone to your ear with your shoulder while thinking about your plans for the weekend while noticing that the dishes need to be done while...
Computers aren't that good and probably won't ever be. However, a powerful computer application can deal with many simultaneous actions. Multithreading is how they accomplish this. Unfortunately, C and C++ lend themselves towards single-threaded programs, which hampers your ability to develop multithreaded programs. Java couldn't be restricted to a single-threaded construction.
What multithreading means to Java users is that they don't have to wait for the application to finish one task before beginning another. For example, if you were playing a dungeon adventure game, one thread could be handling the mathematics of the combat you were in, while another was taking care of the graphics. Multithreading eliminates a lot of lag time when compared to single-threading. On a personal computer, it means you see a lot less of the hourglass or wristwatch icon which we all know is the computer's way of saying "Don't rush me".
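Here's a bare-bones sketch of the dungeon game idea: two threads, each doing its own unrelated job at the same time. The "combat" and "graphics" work is stand-in arithmetic, and the class and method names are invented for this example. The lambda syntax used to define each thread's job is a convenience of recent Java versions.

```java
// Hypothetical class for illustration only.
class TwoThreadsDemo {
    // Runs two independent jobs on separate threads and returns both results.
    static long[] runBoth() {
        final long[] results = new long[2];

        // Stand-in for the combat mathematics: add up 1 through 1000.
        Thread combat = new Thread(() -> {
            long sum = 0;
            for (int i = 1; i <= 1000; i++) {
                sum += i;
            }
            results[0] = sum;
        });

        // Stand-in for the graphics work: multiply 1 through 10.
        Thread graphics = new Thread(() -> {
            long product = 1;
            for (int i = 1; i <= 10; i++) {
                product *= i;
            }
            results[1] = product;
        });

        combat.start();     // both threads are now running side by side
        graphics.start();
        try {
            combat.join();  // wait until each one has finished
            graphics.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting", e);
        }
        return results;
    }

    public static void main(String[] args) {
        long[] results = runBoth();
        System.out.println(results[0] + " and " + results[1]);
    }
}
```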
Java's sophisticated set of synchronization primitives is integrated into the language, making them more robust and much easier to use. That's not original to Java. A lot of the style of integration is based on Xerox's Cedar/Mesa system. But if it's not broken, don't fix it. Java's primitives are based on C. A. R. Hoare's twenty-year-old monitor and condition variable paradigm.
What this built-in multithreading means to programmers is that constructing multithreaded programs is a lot easier in Java. The synchronization features eliminate many of the difficulties of dealing with a multithreaded environment's unpredictable nature. Things don't always happen in the same order.
Remember that the benefits of multithreading (better interactive responsiveness, real-time behavior) depend a lot on the underlying platform. Java applications are MP-hot (they run better on a multi-processor machine) and stand-alone Java runtime environments have good real-time behavior. Things will slow down if you're running a Java application on top of the Windows, Macintosh, Windows NT, or UNIX operating systems. Homogeneity is the best way to get performance out of any operating environment, and Java is no exception.
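The synchronized keyword is the most visible of these built-in features. In the sketch below, several threads bump the same counter; marking the methods synchronized means only one thread at a time can be inside them, so no updates are lost. SynchronizedCounter and countWithThreads are hypothetical names.

```java
// Hypothetical class for illustration only.
class SynchronizedCounter {
    private int count = 0;

    // Only one thread at a time may run a synchronized method on this object.
    synchronized void increment() {
        count++;
    }

    synchronized int value() {
        return count;
    }

    // Starts several threads that all increment one shared counter.
    static int countWithThreads(int threads, int perThread) {
        SynchronizedCounter counter = new SynchronizedCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting", e);
        }
        return counter.value();
    }

    public static void main(String[] args) {
        System.out.println(countWithThreads(4, 10000));
    }
}
```

Take the synchronized keyword off increment and the final total can come up short, because two threads may read the same old value before either writes back the new one.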
Java is a more dynamic language than C or C++. Unlike either of those languages, Java was designed to adapt to an evolving environment.
One of the major problems in development which uses C++ is that you can unwittingly or unwillingly become dependent on someone else. This is due to class libraries, a collection of plug and play components. Since C++ code has to be implemented in class libraries, if you license and use a third party's class library in your software, and that company subsequently alters or upgrades that library, you're more than likely going to have to respond. This response could be almost anything up to and including recompiling and redistributing your own software.
Now bearing that in mind, imagine the problems that can result if your customer is in an environment where he gets your (old) software and the third party's updated libraries independently. What happens if the library producer distributes an upgrade to their libraries? To your customers, it looks like all of the software you've built them (using those libraries, but the customer may not know that) breaks. It's possible, but extremely difficult, to write programs in C++ and still avoid this problem, but the resulting programs are hardly worth it. Essentially it means not using any of the language's OO features directly.
Java, on the other hand, makes the interconnections between modules later. This is both a more effective and easier to follow use of the object-oriented paradigm, and also neatly avoids the library problem. Because Java delays the binding of modules, libraries can be upgraded and changed at will. Methods and instance variables can be added or deleted without any negative impact on the library clients.
Having said that, you should be aware that it is possible to substitute different versions of a class at run-time with disastrous consequences. If you remove a method from one class without deleting all the *.class files and recompiling the entire program, the applet or application will throw a very vague exception.
Like Objective C, Java understands interfaces. Similar to classes, interfaces are simply any specification of a set of methods that an object responds to. Unlike classes, interfaces support multiple-inheritance, plus they can be used more flexibly than the usual class inheritance structure, which is quite rigid.
In a C or C++ program, if you have a pointer to an object of unknown type, there is no way to find out what that type is. The compiler makes the assumption that you're not doing anything incorrectly. With Java, on the other hand, casts are checked at both compile-time and runtime, so determining the type of an object based on the runtime type information is easy. Because of this runtime checking, Java casts are more trustworthy than those in C++.
Further, it's possible to look up the definition of a class given a string containing its name. This is possible because class definitions have a runtime representation: each one is held in an instance of the class named Class. This allows you to compute a data type name and easily have that name dynamically linked into a running system.
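Here's a short sketch of both halves of that idea: asking an object what it is, and watching the runtime refuse a bad cast. CastDemo and its methods are hypothetical names; instanceof and ClassCastException are standard Java.

```java
// Hypothetical class for illustration only.
class CastDemo {
    // Uses runtime type information to find out what an object really is.
    static String inspect(Object unknown) {
        if (unknown instanceof String) {
            String text = (String) unknown;   // safe: we just checked
            return "String of length " + text.length();
        }
        return "Not a String: " + unknown.getClass().getName();
    }

    // Tries to treat an Integer as a String; the runtime catches the mistake.
    static boolean badCastIsCaught() {
        Object value = Integer.valueOf(5);
        try {
            String text = (String) value;     // compiles, but fails when run
            return text == null;              // never reached
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(inspect("hello"));
        System.out.println(inspect(Integer.valueOf(5)));
        System.out.println(badCastIsCaught());
    }
}
```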
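Looking up a class by name takes only a couple of calls on the standard Class class. In this sketch, ForNameDemo and create are hypothetical names, and the string could just as easily come from a configuration file or be computed while the program runs.

```java
// Hypothetical class for illustration only.
class ForNameDemo {
    // Finds a class from its name alone and builds a new instance of it.
    static Object create(String className) {
        try {
            Class<?> type = Class.forName(className);          // look up by name
            return type.getDeclaredConstructor().newInstance(); // no-argument constructor
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot create " + className, e);
        }
    }

    public static void main(String[] args) {
        Object made = create("java.util.ArrayList");
        System.out.println(made.getClass().getName());
    }
}
```

Nothing in the source code mentions ArrayList as a type; the connection is made only when the program runs.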
Lisp, TCL and Smalltalk are often used for prototyping because they are extremely dynamic. Dynamic languages are good for prototyping because of their flexibility. You can put off a lot of decisions. Like Java, the above languages are known for their robustness and programmers don't have to worry about memory getting corrupted. This quality is one of Java's many good points.
One reason that dynamic languages are good for prototyping is that they don't require you to pin down decisions early on. Although it's dynamic, Java requires programmers to make choices explicitly, but there are safeguards built into the Java environment which keep a poor choice from crashing your program. For example, if you write a method invocation and get it wrong, you are alerted to the error when you try to compile the source. Method invocation error is covered the same way.
Throughout this chapter we've discussed some of the features of the Java language in terms of their differences or similarities to the C++ programming language. There are a few issues that didn't fit well into Sun's feature list, but are worth pointing out anyway.
Certainly the biggest difference between Java and C/C++ is that Java has true arrays rather than arrays built on pointer arithmetic. Java's pointer model eliminates the possibility of overwriting memory and corrupting data, and allows subscript checking to be performed. C++, on the other hand, has pointer arithmetic instead of true arrays. In addition, in Java it is not possible to turn an arbitrary integer into a pointer by casting.
There's also the issue of speed. With all the talking we've done about how efficient Java is, you'd expect it to be significantly faster than C++, right? Not so. C++ is almost twenty times as fast as interpreted Java is. This is still 'fast enough', and developers are working on code generators to enable Java programs to run nearly as quickly as those written in C++.
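Subscript checking is easy to see in action. The sketch below deliberately writes one slot past the end of a three-element array; instead of silently overwriting whatever sits next in memory, the runtime stops the write and reports it. BoundsDemo and writePastEnd are hypothetical names.

```java
// Hypothetical class for illustration only.
class BoundsDemo {
    // Attempts an out-of-range write and reports whether Java stopped it.
    static boolean writePastEnd() {
        int[] cells = new int[3];      // valid subscripts are 0, 1, and 2
        try {
            cells[3] = 99;             // one past the end
            return false;              // never reached
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;               // the bad write never touched memory
        }
    }

    public static void main(String[] args) {
        System.out.println(writePastEnd());
    }
}
```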
For technical support for our books and software contact support@mcp.com
Copyright ©1996, Que Corporation