 <?xml version="1.0"?>
 <!--
     * Licensed to the Apache Software Foundation (ASF) under one
     * or more contributor license agreements.  See the NOTICE file
     * distributed with this work for additional information
     * regarding copyright ownership.  The ASF licenses this file
     * to you under the Apache License, Version 2.0 (the
     * "License"); you may not use this file except in compliance
     * with the License.  You may obtain a copy of the License at
     *
     *   http://www.apache.org/licenses/LICENSE-2.0
     *
     * Unless required by applicable law or agreed to in writing,
     * software distributed under the License is distributed on an
     * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
     * KIND, either express or implied.  See the License for the
     * specific language governing permissions and limitations
     * under the License.
 -->
 <document>
   <properties>
     <title>The Java Virtual Machine</title>
   </properties>

   <body>
     <section name="The Java Virtual Machine">
       <p>
         Readers already familiar with the Java Virtual Machine and the
         Java class file format may want to skip this section and proceed
         with <a href="bcel-api.html">section 3</a>.
       </p>

       <p>
         Programs written in the Java language are compiled into a portable
         binary format called <em>byte code</em>. Every class is
         represented by a single class file containing class related data
         and byte code instructions. These files are loaded dynamically
         into an interpreter (<a
               href="http://docs.oracle.com/javase/specs/">Java
         Virtual Machine</a>, or JVM for short) and executed.
       </p>

       <p>
         <a href="#Figure 1">Figure 1</a> illustrates the procedure of
         compiling and executing a Java class: The source file
         (<tt>HelloWorld.java</tt>) is compiled into a Java class file
         (<tt>HelloWorld.class</tt>), loaded by the byte code interpreter
         and executed. To implement additional features,
         researchers may want to transform class files (drawn with bold
         lines) before they are actually executed. This application area
         is one of the main topics of this article.
       </p>

       <p align="center">
         <a name="Figure 1">
           <img src="../images/jvm.gif"/>
           <br/>
           Figure 1: Compilation and execution of Java classes</a>
       </p>

       <p>
         Note that the general term "Java" in fact has two meanings: on
         the one hand, Java as a programming language; on the other hand,
         the Java Virtual Machine, which is not targeted exclusively by
         the Java language but may be used by <a
               href="http://www.robert-tolksdorf.de/vmlanguages.html">other
         languages</a> as well. We assume that the reader is familiar with
         the Java language and has a general understanding of the
         Virtual Machine.
       </p>

     <subsection name="Java class file format">
       <p>
         Giving a full overview of the design issues of the Java class file
         format and the associated byte code instructions is beyond the
         scope of this paper. We will just give a brief introduction
         covering the details that are necessary for understanding the rest
         of this paper. The format of class files and the byte code
         instruction set are described in more detail in the <a
               href="http://docs.oracle.com/javase/specs/">Java
         Virtual Machine Specification</a>. In particular, we will not deal
         with the security constraints that the Java Virtual Machine has to
         check at run-time, i.e., the byte code verifier.
       </p>

       <p>
         <a href="#Figure 2">Figure 2</a> shows a simplified example of the
         contents of a Java class file: It starts with a header containing
         a "magic number" (<tt>0xCAFEBABE</tt>) and the version number,
         followed by the <em>constant pool</em>, which can be roughly
         thought of as the text segment of an executable, the <em>access
         rights</em> of the class encoded by a bit mask, a list of
         interfaces implemented by the class, lists containing the fields
         and methods of the class, and finally the <em>class
         attributes</em>, e.g.,  the <tt>SourceFile</tt> attribute telling
         the name of the source file. Attributes are a way of putting
         additional, user-defined information into class file data
         structures. For example, a custom class loader may evaluate such
         attribute data in order to perform its transformations. The JVM
         specification requires that unknown (i.e., user-defined) attributes
         be ignored by any Virtual Machine implementation.
       </p>

       <p align="center">
         <a name="Figure 2">
           <img src="../images/classfile.gif"/>
           <br/>
           Figure 2: Java class file format</a>
       </p>
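
      <p>
        As a minimal sketch of the header layout described above, the
        following Java code checks the magic number and reads the version
        fields from the first eight bytes of a class file. The class name
        <tt>ClassFileHeader</tt>, the method <tt>readVersion</tt>, and the
        hand-written header bytes are our own illustrative assumptions;
        they are not part of BCEL.
      </p>

      <source>
import java.nio.ByteBuffer;

public class ClassFileHeader {
    /** Returns {minor, major}; throws if the magic number is missing. */
    static int[] readVersion(byte[] classBytes) {
        // ByteBuffer is big-endian by default, which matches the class file format
        ByteBuffer buf = ByteBuffer.wrap(classBytes);
        if (buf.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file");
        }
        // In the file, the minor version precedes the major version
        int minor = Short.toUnsignedInt(buf.getShort());
        int major = Short.toUnsignedInt(buf.getShort());
        return new int[] { minor, major };
    }

    public static void main(String[] args) {
        // Hand-written header: magic number, minor version 0, major version 52
        byte[] header = { (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 52 };
        int[] version = readVersion(header);
        System.out.println("major=" + version[1] + ", minor=" + version[0]);
    }
}
      </source>

      <p>
        To inspect a real class, one would pass the bytes of a compiled
        <tt>.class</tt> file instead of the hand-written array.
      </p>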

       <p>
         Because all of the information needed to dynamically resolve the
         symbolic references to classes, fields and methods at run-time is
         coded with string constants, the constant pool in fact makes up
         the largest portion of an average class file, approximately
         60%. This makes the constant pool an easy target for code
         manipulation. The byte code instructions themselves make up
         only about 12%.
       </p>

       <p>
         The upper right box shows a "zoomed" excerpt of the constant pool,
         while the rounded box below depicts some instructions that are
         contained within a method of the example class. These
         instructions represent the straightforward translation of the
         well-known statement:
       </p>

       <p align="center">
         <source>System.out.println("Hello, world");</source>
       </p>

       <p>
         The first instruction loads the contents of the field <tt>out</tt>
         of class <tt>java.lang.System</tt> onto the operand stack. This is
         an instance of the class <tt>java.io.PrintStream</tt>. The
         <tt>ldc</tt> ("load constant") instruction pushes a reference to
         the string "Hello, world" onto the stack. The next instruction
         invokes the instance method <tt>println</tt>, which takes both
         values as parameters (instance methods always implicitly take an
         instance reference as their first argument).
       </p>

       <p>
         Instructions, other data structures within the class file and
         constants themselves may refer to constants in the constant pool.
         Such references are implemented via fixed indexes encoded directly
         into the instructions. This is illustrated for some items of the
         figure emphasized with a surrounding box.
       </p>

       <p>
         For example, the <tt>invokevirtual</tt> instruction refers to a
         <tt>Methodref</tt> constant that contains information about the
         name of the called method, the signature (i.e., the encoded
         argument and return types), and the class to which the method belongs.
         In fact, as emphasized by the boxed value, the <tt>Methodref</tt>
         constant itself just refers to other entries holding the real
         data, e.g., it refers to a <tt>ConstantClass</tt> entry containing
         a symbolic reference to the class <tt>java.io.PrintStream</tt>.
         To keep the class file compact, such constants are typically
         shared by different instructions and other constant pool
         entries. Similarly, a field is represented by a <tt>Fieldref</tt>
         constant that includes information about the name, the type and
         the containing class of the field.
       </p>

       <p>
         The constant pool basically holds the following types of
         constants: References to methods, fields and classes, strings,
         integers, floats, longs, and doubles.
       </p>

     </subsection>

     <subsection name="Byte code instruction set">
       <p>
         The JVM is a stack-oriented interpreter that creates a local stack
         frame of fixed size for every method invocation. The size of the
         local stack has to be computed by the compiler. Values may also be
         stored intermediately in a frame area containing <em>local
         variables</em> which can be used like a set of registers. These
         local variables are numbered from 0 to 65535, i.e., a method can
         have at most 65536 local variables. The stack frames of the
         caller and the callee overlap, i.e., the caller pushes arguments
         onto the operand stack and the called method receives them in
         local variables.
       </p>

       <p>
         The byte code instruction set currently consists of 212
         instructions; 44 opcodes are marked as reserved and may be used
         for future extensions or intermediate optimizations within the
         Virtual Machine. The instruction set can be roughly grouped as
         follows:
       </p>

       <p>
         <b>Stack operations:</b> Constants can be pushed onto the stack
         either by loading them from the constant pool with the
         <tt>ldc</tt> instruction or with special "short-cut"
         instructions where the operand is encoded into the instructions,
         e.g.,  <tt>iconst_0</tt> or <tt>bipush</tt> (push byte value).
       </p>

       <p>
         <b>Arithmetic operations:</b> The instruction set of the Java
         Virtual Machine distinguishes its operand types using different
         instructions to operate on values of specific type. An arithmetic
         instruction starting with <tt>i</tt>, for example, denotes an
         integer operation: <tt>iadd</tt> adds two integers
         and pushes the result back onto the stack. The Java types
         <tt>boolean</tt>, <tt>byte</tt>, <tt>short</tt>, and
         <tt>char</tt> are handled as integers by the JVM.
       </p>
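
      <p>
        To make this concrete, here is a small sketch with the
        instructions that <tt>javac</tt> typically emits noted in the
        comments. The class <tt>AddExample</tt> and its methods are
        hypothetical names chosen for illustration, and the exact byte
        code may vary between compilers.
      </p>

      <source>
public class AddExample {
    // Typically compiled to: iload_0, iload_1, iadd, ireturn
    static int add(int a, int b) {
        return a + b;
    }

    // byte values are loaded with iload and added with iadd like any int;
    // the cast back to byte typically becomes an i2b instruction
    static byte addBytes(byte a, byte b) {
        return (byte) (a + b);
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3));
        System.out.println(addBytes((byte) 100, (byte) 100));
    }
}
      </source>

      <p>
        Running <tt>javap -c AddExample</tt> on the compiled class shows
        the instructions actually produced by a given compiler.
      </p>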

       <p>
         <b>Control flow:</b> There are branch instructions like
         <tt>goto</tt> and <tt>if_icmpeq</tt>, which compares two integers
         for equality. There is also a <tt>jsr</tt> (jump to sub-routine)
         and <tt>ret</tt> pair of instructions that is used to implement
         the <tt>finally</tt> clause of <tt>try-catch</tt> blocks.
         Exceptions may be thrown with the <tt>athrow</tt> instruction.
         Branch targets are coded as offsets from the current byte code
         position, i.e., with an integer number.
       </p>

       <p>
         <b>Load and store operations:</b> Local variables are accessed
         with instructions like <tt>iload</tt> and <tt>istore</tt>. There
         are also array operations like <tt>iastore</tt>, which stores an
         integer value into an array.
       </p>

       <p>
         <b>Field access:</b> The value of an instance field may be
         retrieved with <tt>getfield</tt> and written with
         <tt>putfield</tt>. For static fields, there are
         <tt>getstatic</tt> and <tt>putstatic</tt> counterparts.
       </p>

       <p>
         <b>Method invocation:</b> Static methods are called via
         <tt>invokestatic</tt>, whereas instance methods are bound
         virtually with the <tt>invokevirtual</tt> instruction. Super
         class methods and private methods are invoked with
         <tt>invokespecial</tt>. Interface methods are a special case and
         are invoked with <tt>invokeinterface</tt>.
       </p>
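
      <p>
        The following sketch annotates a few calls with the invocation
        instruction a Java compiler typically selects for them. All
        class and method names (<tt>InvokeKinds</tt>, <tt>Greeter</tt>,
        <tt>Base</tt>, <tt>Derived</tt>, <tt>describe</tt>) are
        hypothetical and serve only as an illustration.
      </p>

      <source>
public class InvokeKinds {
    interface Greeter {
        String greet();
    }

    static class Base {
        String name() {
            return "base";
        }
    }

    static class Derived extends Base implements Greeter {
        String name() {
            return "derived of " + super.name(); // super call: invokespecial
        }

        public String greet() {
            return "hello from " + name();       // instance method: invokevirtual
        }
    }

    static String describe() {
        Greeter g = new Derived();               // constructor call: invokespecial
        return g.greet();                        // call through interface type: invokeinterface
    }

    public static void main(String[] args) {
        // describe() is static: invokestatic; println is an instance method: invokevirtual
        System.out.println(describe());
    }
}
      </source>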

       <p>
         <b>Object allocation:</b> Class instances are allocated with the
         <tt>new</tt> instruction, arrays of basic type like
         <tt>int[]</tt> with <tt>newarray</tt>, arrays of references like
         <tt>String[][]</tt> with <tt>anewarray</tt> or
         <tt>multianewarray</tt>.
       </p>

       <p>
         <b>Conversion and type checking:</b> For stack operands of basic
         type there are conversion operations like <tt>f2i</tt>, which
         converts a float value into an integer. The validity of a type
         cast may be checked with <tt>checkcast</tt>, and the
         <tt>instanceof</tt> operator maps directly to the
         instruction of the same name.
       </p>

       <p>
         Most instructions have a fixed length, but there are also some
         variable-length instructions, in particular the
         <tt>lookupswitch</tt> and <tt>tableswitch</tt> instructions, which
         are used to implement <tt>switch()</tt> statements. Since the
         number of <tt>case</tt> clauses may vary, these instructions
         contain a variable number of branch targets.
       </p>

       <p>
         We will not list all byte code instructions here, since these are
         explained in detail in the <a
               href="http://docs.oracle.com/javase/specs/">JVM
         specification</a>. The opcode names are mostly self-explanatory,
         so understanding the following code examples should be fairly
         intuitive.
       </p>

     </subsection>

     <subsection name="Method code">
       <p>
         Non-abstract (and non-native) methods contain an attribute
         "<tt>Code</tt>" that holds the following data: The maximum size of
         the method's stack frame, the number of local variables and an
         array of byte code instructions. Optionally, it may also contain
         information about the names of local variables and source file
         line numbers that can be used by a debugger.
       </p>

       <p>
         Whenever an exception is raised during execution, the JVM performs
         exception handling by looking into a table of exception
         handlers. The table marks handlers, i.e., code chunks, to be
         responsible for exceptions of certain types that are raised within
         a given area of the byte code. When there is no appropriate
         handler the exception is propagated back to the caller of the
         method. The handler information is itself stored in an attribute
         contained within the <tt>Code</tt> attribute.
       </p>
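
      <p>
        The lookup described above can be sketched as a linear search
        over the table. This is a simplified model under our own
        assumptions, not the JVM's actual implementation: the class
        <tt>Entry</tt>, the method <tt>findHandler</tt>, and the table
        values are hypothetical, and the return value -1 stands for
        propagating the exception to the caller.
      </p>

      <source><![CDATA[
import java.io.IOException;

public class ExceptionTableLookup {
    /** One row of an exception table; a hypothetical model class, not part of BCEL. */
    static final class Entry {
        final int from;     // start of the protected range (inclusive)
        final int to;       // end of the protected range (exclusive)
        final int handler;  // offset of the handler code
        final Class<? extends Throwable> type;

        Entry(int from, int to, int handler, Class<? extends Throwable> type) {
            this.from = from;
            this.to = to;
            this.handler = handler;
            this.type = type;
        }
    }

    /** Returns the handler offset for an exception raised at pc, or -1 if none applies. */
    static int findHandler(Entry[] table, int pc, Throwable thrown) {
        for (Entry e : table) { // entries are examined in table order
            if (pc >= e.from && pc < e.to && e.type.isInstance(thrown)) {
                return e.handler;
            }
        }
        return -1; // no matching handler: the exception goes back to the caller
    }

    public static void main(String[] args) {
        Entry[] table = {
            new Entry(4, 22, 25, IOException.class),
            new Entry(4, 22, 36, NumberFormatException.class),
        };
        System.out.println(findHandler(table, 18, new NumberFormatException()));
        System.out.println(findHandler(table, 2, new IOException()));
    }
}
]]></source>

      <p>
        The first call falls inside the protected range and matches the
        second entry; the second call lies outside every range, so no
        handler is found.
      </p>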

     </subsection>

     <subsection name="Byte code offsets">
       <p>
         Targets of branch instructions like <tt>goto</tt> are encoded as
         relative offsets in the array of byte codes. Exception handlers
         and local variables refer to absolute addresses within the byte
         code. The former contain references to the start and the end of
         the <tt>try</tt> block, and to the handler code. The latter
         mark the range in which a local variable is valid, i.e.,
         its scope. This makes it difficult to insert or delete code areas
         on this level of abstraction, since one has to recompute the
         offsets every time and update the referring objects. We will see
         in <a href="bcel-api.html#ClassGen">section 3.3</a> how <font
               face="helvetica,arial">BCEL</font> remedies this restriction.
       </p>
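
      <p>
        To illustrate the bookkeeping involved, the following sketch
        recomputes the relative offset of a single branch after some
        bytes have been inserted into the code. The class
        <tt>BranchOffsets</tt> and the method <tt>relocatedOffset</tt>
        are hypothetical helpers, and we assume that everything at or
        after the insertion point shifts and that the branch keeps
        pointing at its original target instruction.
      </p>

      <source><![CDATA[
public class BranchOffsets {
    /**
     * Returns the new relative offset of a branch located at pc with the given
     * relative offset, after `inserted` bytes are placed at position insertAt.
     */
    static int relocatedOffset(int pc, int offset, int insertAt, int inserted) {
        int target = pc + offset; // absolute address of the branch target
        int newPc = (pc >= insertAt) ? pc + inserted : pc;
        int newTarget = (target >= insertAt) ? target + inserted : target;
        return newTarget - newPc;
    }

    public static void main(String[] args) {
        // A branch at offset 1 that targets offset 8 has the relative offset 7.
        System.out.println(relocatedOffset(1, 7, 4, 3)); // insertion between branch and target
        System.out.println(relocatedOffset(1, 7, 9, 3)); // insertion behind the target
    }
}
]]></source>

      <p>
        Only the first insertion changes the encoded offset (from 7 to
        10). A real transformation would have to apply this to every
        branch, exception handler, and local variable range in the
        method.
      </p>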

     </subsection>

     <subsection name="Type information">
       <p>
         Java is a type-safe language and the information about the types
         of fields, local variables, and methods is stored in so-called
         <em>signatures</em>. These are strings stored in the constant pool
         and encoded in a special format. For example, the argument and
         return types of the <tt>main</tt> method
       </p>

       <p align="center">
         <source>public static void main(String[] argv)</source>
       </p>

       <p>
         are represented by the signature
       </p>

       <p align="center">
         <source>([Ljava/lang/String;)V</source>
       </p>

       <p>
         Classes are internally represented by strings like
         <tt>"java/lang/String"</tt>, basic types like <tt>float</tt> by an
         integer number. Within signatures, basic types are represented by
         single characters, e.g., <tt>I</tt> for integer, and class types
         by an <tt>L</tt>, the class name, and a closing semicolon, e.g.,
         <tt>Ljava/lang/String;</tt>. Arrays are denoted with a
         <tt>[</tt> at the start of the signature.
       </p>
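
      <p>
        Such signature strings can also be produced from Java itself. As
        a small sketch, the standard class
        <tt>java.lang.invoke.MethodType</tt> renders the signature of a
        method from its return and parameter types; the class name
        <tt>DescriptorExample</tt> and the method
        <tt>mainDescriptor</tt> are our own illustrative names.
      </p>

      <source>
import java.lang.invoke.MethodType;

public class DescriptorExample {
    /** Signature of: public static void main(String[] argv) */
    static String mainDescriptor() {
        return MethodType.methodType(void.class, String[].class).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(mainDescriptor());
        // Signature of a method taking an int and returning an int
        System.out.println(MethodType.methodType(int.class, int.class).toMethodDescriptorString());
    }
}
      </source>

      <p>
        The program prints <tt>([Ljava/lang/String;)V</tt> followed by
        <tt>(I)I</tt>.
      </p>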

     </subsection>

     <subsection name="Code example">
       <p>
         The following example program prompts for a number and prints its
         factorial. The <tt>readLine()</tt> method, which reads from
         standard input, may raise an <tt>IOException</tt>, and
         <tt>parseInt()</tt> throws a <tt>NumberFormatException</tt> if a
         malformed number is passed to it. Thus, the critical area of code
         must be enclosed in a <tt>try-catch</tt> block.
       </p>

       <source>
 import java.io.*;

 public class Factorial {
     private static BufferedReader in = new BufferedReader(new InputStreamReader(System.in));

     public static int fac(int n) {
         return (n == 0) ? 1 : n * fac(n - 1);
     }

     public static int readInt() {
         int n = 4711;
         try {
             System.out.print("Please enter a number&gt; ");
             n = Integer.parseInt(in.readLine());
         } catch (IOException e1) {
             System.err.println(e1);
         } catch (NumberFormatException e2) {
             System.err.println(e2);
         }
         return n;
     }

     public static void main(String[] argv) {
         int n = readInt();
         System.out.println("Factorial of " + n + " is " + fac(n));
     }
 }
       </source>

       <p>
         This code example typically compiles to the following chunks of
         byte code:
       </p>

       <source>
         0:  iload_0
         1:  ifne            #8
         4:  iconst_1
         5:  goto            #16
         8:  iload_0
         9:  iload_0
         10: iconst_1
         11: isub
         12: invokestatic    Factorial.fac (I)I (12)
         15: imul
         16: ireturn

         LocalVariable(start_pc = 0, length = 16, index = 0:int n)
       </source>

       <p><b>fac():</b>
         The method <tt>fac</tt> has only one local variable, the argument
         <tt>n</tt>, stored at index 0. This variable's scope ranges from
         the start of the byte code sequence to the very end.  If the value
         of <tt>n</tt> (the value fetched with <tt>iload_0</tt>) is not
         equal to 0, the <tt>ifne</tt> instruction branches to the byte
         code at offset 8, otherwise a 1 is pushed onto the operand stack
         and the control flow branches to the final return.  For ease of
         reading, the offsets of the branch instructions, which are
         actually relative, are displayed as absolute addresses in these
         examples.
       </p>

       <p>
         If recursion has to continue, the arguments for the multiplication
         (<tt>n</tt> and <tt>fac(n - 1)</tt>) are evaluated and the results
         pushed onto the operand stack.  After the multiplication operation
         has been performed the function returns the computed value from
         the top of the stack.
       </p>

       <source>
         0:  sipush        4711
         3:  istore_0
         4:  getstatic     java.lang.System.out Ljava/io/PrintStream;
         7:  ldc           "Please enter a number&gt; "
         9:  invokevirtual java.io.PrintStream.print (Ljava/lang/String;)V
         12: getstatic     Factorial.in Ljava/io/BufferedReader;
         15: invokevirtual java.io.BufferedReader.readLine ()Ljava/lang/String;
         18: invokestatic  java.lang.Integer.parseInt (Ljava/lang/String;)I
         21: istore_0
         22: goto          #44
         25: astore_1
         26: getstatic     java.lang.System.err Ljava/io/PrintStream;
         29: aload_1
         30: invokevirtual java.io.PrintStream.println (Ljava/lang/Object;)V
         33: goto          #44
         36: astore_1
         37: getstatic     java.lang.System.err Ljava/io/PrintStream;
         40: aload_1
         41: invokevirtual java.io.PrintStream.println (Ljava/lang/Object;)V
         44: iload_0
         45: ireturn

         Exception handler(s) =
         From    To      Handler Type
         4       22      25      java.io.IOException(6)
         4       22      36      NumberFormatException(10)
       </source>

       <p><b>readInt():</b> First the local variable <tt>n</tt> (at index 0)
         is initialized to the value 4711.  The next instruction,
         <tt>getstatic</tt>, loads the reference held by the static
         <tt>System.out</tt> field onto the stack. Then a string is loaded
         and printed, and a number is read from the standard input and
         assigned to <tt>n</tt>.
       </p>

       <p>
         If one of the called methods (<tt>readLine()</tt> and
         <tt>parseInt()</tt>) throws an exception, the Java Virtual Machine
         calls one of the declared exception handlers, depending on the
         type of the exception.  The <tt>try</tt>-clause itself does not
         produce any code; it merely defines the range in which the
         subsequent handlers are active. In the example, the specified
         source code area maps to a byte code area ranging from offset 4
         (inclusive) to 22 (exclusive).  If no exception has occurred
         ("normal" execution flow), the <tt>goto</tt> instruction at
         offset 22 branches past the handler code. There the value of
         <tt>n</tt> is loaded and returned.
       </p>

       <p>
         The handler for <tt>java.io.IOException</tt> starts at
         offset 25. It simply prints the error and branches back to the
         normal execution flow, i.e., as if no exception had occurred.
       </p>

     </subsection>
     </section>
   </body>

 </document>
	<?xml version="1.0"?>
	<!--
	* Licensed to the Apache Software Foundation (ASF) under one
	* or more contributor license agreements. See the NOTICE file
	* distributed with this work for additional information
	* regarding copyright ownership. The ASF licenses this file
	* to you under the Apache License, Version 2.0 (the
	* "License"); you may not use this file except in compliance
	* with the License. You may obtain a copy of the License at
	*
	* http://www.apache.org/licenses/LICENSE-2.0
	*
	* Unless required by applicable law or agreed to in writing,
	* software distributed under the License is distributed on an
	* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
	* KIND, either express or implied. See the License for the
	* specific language governing permissions and limitations
	* under the License.
	-->
	<document>
	<properties>
	<title>The Java Virtual Machine</title>
	</properties>

	<body>
	<section name="The Java Virtual Machine">
	<p>
	Readers already familiar with the Java Virtual Machine and the
	Java class file format may want to skip this section and proceed
	with <a href="bcel-api.html">section 3</a>.
	</p>

	<p>
	Programs written in the Java language are compiled into a portable
	binary format called <em>byte code</em>. Every class is
	represented by a single class file containing class related data
	and byte code instructions. These files are loaded dynamically
	into an interpreter (<a
	href="http://docs.oracle.com/javase/specs/">Java
	Virtual Machine</a>, aka. JVM) and executed.
	</p>

	<p>
	<a href="#Figure 1">Figure 1</a> illustrates the procedure of
	compiling and executing a Java class: The source file
	(<tt>HelloWorld.java</tt>) is compiled into a Java class file
	(<tt>HelloWorld.class</tt>), loaded by the byte code interpreter
	and executed. In order to implement additional features,
	researchers may want to transform class files (drawn with bold
	lines) before they get actually executed. This application area
	is one of the main issues of this article.
	</p>

	<p align="center">
	<a name="Figure 1">
	<img src="../images/jvm.gif"/>
	<br/>
	Figure 1: Compilation and execution of Java classes</a>
	</p>

	<p>
	Note that the use of the general term "Java" implies in fact two
	meanings: on the one hand, Java as a programming language, on the
	other hand, the Java Virtual Machine, which is not necessarily
	targeted by the Java language exclusively, but may be used by <a
	href="http://www.robert-tolksdorf.de/vmlanguages.html">other
	languages</a> as well. We assume the reader to be familiar with
	the Java language and to have a general understanding of the
	Virtual Machine.
	</p>

	<subsection name="Java class file format">
	<p>
	Giving a full overview of the design issues of the Java class file
	format and the associated byte code instructions is beyond the
	scope of this paper. We will just give a brief introduction
	covering the details that are necessary for understanding the rest
	of this paper. The format of class files and the byte code
	instruction set are described in more detail in the <a
	href="http://docs.oracle.com/javase/specs/">Java
	Virtual Machine Specification</a>. Especially, we will not deal
	with the security constraints that the Java Virtual Machine has to
	check at run-time, i.e. the byte code verifier.
	</p>

	<p>
	<a href="#Figure 2">Figure 2</a> shows a simplified example of the
	contents of a Java class file: It starts with a header containing
	a "magic number" (<tt>0xCAFEBABE</tt>) and the version number,
	followed by the <em>constant pool</em>, which can be roughly
	thought of as the text segment of an executable, the <em>access
	rights</em> of the class encoded by a bit mask, a list of
	interfaces implemented by the class, lists containing the fields
	and methods of the class, and finally the <em>class
	attributes</em>, e.g., the <tt>SourceFile</tt> attribute telling
	the name of the source file. Attributes are a way of putting
	additional, user-defined information into class file data
	structures. For example, a custom class loader may evaluate such
	attribute data in order to perform its transformations. The JVM
	specification declares that unknown, i.e., user-defined attributes
	must be ignored by any Virtual Machine implementation.
	</p>

	<p align="center">
	<a name="Figure 2">
	<img src="../images/classfile.gif"/>
	<br/>
	Figure 2: Java class file format</a>
	</p>

	<p>
	Because all of the information needed to dynamically resolve the
	symbolic references to classes, fields and methods at run-time is
	coded with string constants, the constant pool contains in fact
	the largest portion of an average class file, approximately
	60%. In fact, this makes the constant pool an easy target for code
	manipulation issues. The byte code instructions themselves just
	make up 12%.
	</p>

	<p>
	The right upper box shows a "zoomed" excerpt of the constant pool,
	while the rounded box below depicts some instructions that are
	contained within a method of the example class. These
	instructions represent the straightforward translation of the
	well-known statement:
	</p>

	<p align="center">
	<source>System.out.println("Hello, world");</source>
	</p>

	<p>
	The first instruction loads the contents of the field <tt>out</tt>
	of class <tt>java.lang.System</tt> onto the operand stack. This is
	an instance of the class <tt>java.io.PrintStream</tt>. The
	<tt>ldc</tt> ("Load constant") pushes a reference to the string
	"Hello world" on the stack. The next instruction invokes the
	instance method <tt>println</tt> which takes both values as
	parameters (instance methods always implicitly take an instance
	reference as their first argument).
	</p>

	<p>
	Instructions, other data structures within the class file and
	constants themselves may refer to constants in the constant pool.
	Such references are implemented via fixed indexes encoded directly
	into the instructions. This is illustrated for some items of the
	figure emphasized with a surrounding box.
	</p>

	<p>
	For example, the <tt>invokevirtual</tt> instruction refers to a
	<tt>MethodRef</tt> constant that contains information about the
	name of the called method, the signature (i.e., the encoded
	argument and return types), and to which class the method belongs.
	In fact, as emphasized by the boxed value, the <tt>MethodRef</tt>
	constant itself just refers to other entries holding the real
	data, e.g., it refers to a <tt>ConstantClass</tt> entry containing
	a symbolic reference to the class <tt>java.io.PrintStream</tt>.
	To keep the class file compact, such constants are typically
	shared by different instructions and other constant pool
	entries. Similarly, a field is represented by a <tt>Fieldref</tt>
	constant that includes information about the name, the type and
	the containing class of the field.
	</p>

	<p>
	The constant pool basically holds the following types of
	constants: References to methods, fields and classes, strings,
	integers, floats, longs, and doubles.
	</p>

	</subsection>

	<subsection name="Byte code instruction set">
	<p>
	The JVM is a stack-oriented interpreter that creates a local stack
	frame of fixed size for every method invocation. The size of the
	local stack has to be computed by the compiler. Values may also be
	stored intermediately in a frame area containing <em>local
	variables</em> which can be used like a set of registers. These
	local variables are numbered from 0 to 65535, i.e., you have a
	maximum of 65536 of local variables per method. The stack frames
	of caller and callee method are overlapping, i.e., the caller
	pushes arguments onto the operand stack and the called method
	receives them in local variables.
	</p>

	<p>
	The byte code instruction set currently consists of 212
	instructions, 44 opcodes are marked as reserved and may be used
	for future extensions or intermediate optimizations within the
	Virtual Machine. The instruction set can be roughly grouped as
	follows:
	</p>

	<p>
	<b>Stack operations:</b> Constants can be pushed onto the stack
	either by loading them from the constant pool with the
	<tt>ldc</tt> instruction or with special "short-cut"
	instructions where the operand is encoded into the instructions,
	e.g., <tt>iconst_0</tt> or <tt>bipush</tt> (push byte value).
	</p>

	<p>
	<b>Arithmetic operations:</b> The instruction set of the Java
	Virtual Machine distinguishes its operand types using different
	instructions to operate on values of specific type. Arithmetic
	operations starting with <tt>i</tt>, for example, denote an
	integer operation. E.g., <tt>iadd</tt> that adds two integers
	and pushes the result back on the stack. The Java types
	<tt>boolean</tt>, <tt>byte</tt>, <tt>short</tt>, and
	<tt>char</tt> are handled as integers by the JVM.
	</p>

	<p>
	<b>Control flow:</b> There are branch instructions like
	<tt>goto</tt>, and <tt>if_icmpeq</tt>, which compares two integers
	for equality. There is also a <tt>jsr</tt> (jump to sub-routine)
	and <tt>ret</tt> pair of instructions that is used to implement
	the <tt>finally</tt> clause of <tt>try-catch</tt> blocks.
	Exceptions may be thrown with the <tt>athrow</tt> instruction.
	Branch targets are coded as offsets from the current byte code
	position, i.e., with an integer number.
	</p>

	<p>
	<b>Load and store operations</b> for local variables like
	<tt>iload</tt> and <tt>istore</tt>. There are also array
	operations like <tt>iastore</tt> which stores an integer value
	into an array.
	</p>

	<p>
	<b>Field access:</b> The value of an instance field may be
	retrieved with <tt>getfield</tt> and written with
	<tt>putfield</tt>. For static fields, there are
	<tt>getstatic</tt> and <tt>putstatic</tt> counterparts.
	</p>

	<p>
	<b>Method invocation:</b> Static Methods may either be called via
	<tt>invokestatic</tt> or be bound virtually with the
	<tt>invokevirtual</tt> instruction. Super class methods and
	private methods are invoked with <tt>invokespecial</tt>. A
	special case are interface methods which are invoked with
	<tt>invokeinterface</tt>.
	</p>

	<p>
	<b>Object allocation:</b> Class instances are allocated with the
	<tt>new</tt> instruction, arrays of basic type like
	<tt>int[]</tt> with <tt>newarray</tt>, arrays of references like
	<tt>String[][]</tt> with <tt>anewarray</tt> or
	<tt>multianewarray</tt>.
	</p>

	<p>
	<b>Conversion and type checking:</b> For stack operands of basic
	type there exist casting operations like <tt>f2i</tt> which
	converts a float value into an integer. The validity of a type
	cast may be checked with <tt>checkcast</tt> and the
	<tt>instanceof</tt> operator can be directly mapped to the
	equally named instruction.
	</p>
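
	<p>
	The following sketch shows the source-level counterparts of these
	instructions. The class and method names are hypothetical; the
	comments describe what a compiler typically emits, not a
	guaranteed translation.
	</p>

```java
class Conversions {
    // The cast corresponds to f2i: it truncates toward zero, maps NaN
    // to 0, and saturates at the int limits for out-of-range values.
    static int floatToInt(float f) {
        return (int) f;
    }

    // The test corresponds to the instanceof instruction, the cast
    // that follows it to checkcast.
    static int lengthIfString(Object o) {
        if (o instanceof String) {
            return ((String) o).length();
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(floatToInt(3.9f));
        System.out.println(floatToInt(Float.NaN));
        System.out.println(lengthIfString("byte code"));
        Object notAString = Integer.valueOf(42);
        try {
            String s = (String) notAString; // the cast fails at run time
            System.out.println(s);
        } catch (ClassCastException e) {
            System.out.println("cast rejected");
        }
    }
}
```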

	<p>
	Most instructions have a fixed length, but there are also some
	variable-length instructions, in particular the
	<tt>lookupswitch</tt> and <tt>tableswitch</tt> instructions, which
	are used to implement <tt>switch()</tt> statements. Since the
	number of <tt>case</tt> clauses may vary, these instructions
	contain a variable number of branch targets.
	</p>
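
	<p>
	To make the variable length concrete, the sketch below computes
	the encoded size of both switch instructions from the layout given
	in the JVM specification: the opcode, zero to three padding bytes
	that align the following operands to a four-byte boundary, and
	then 32-bit operands. The class and method names are invented for
	this example.
	</p>

```java
class SwitchLength {
    // Padding bytes after an opcode located at offset pc.
    static int padding(int pc) {
        return (4 - (pc + 1) % 4) % 4;
    }

    // opcode + padding + default, low, high + one offset per value in [low, high]
    static int tableswitchLength(int pc, int low, int high) {
        return 1 + padding(pc) + 12 + 4 * (high - low + 1);
    }

    // opcode + padding + default, npairs + one (match, offset) pair per case
    static int lookupswitchLength(int pc, int npairs) {
        return 1 + padding(pc) + 8 + 8 * npairs;
    }

    public static void main(String[] args) {
        System.out.println(tableswitchLength(0, 1, 3)); // 28
        System.out.println(lookupswitchLength(3, 2));   // 25
    }
}
```

	<p>
	Because the padding depends on the position of the instruction,
	even moving a switch instruction without changing its cases can
	alter its length.
	</p>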

	<p>
	We will not list all byte code instructions here, since these are
	explained in detail in the <a
	href="http://docs.oracle.com/javase/specs/">JVM
	specification</a>. The opcode names are mostly self-explanatory,
	so understanding the following code examples should be fairly
	intuitive.
	</p>

	</subsection>

	<subsection name="Method code">
	<p>
	Non-abstract (and non-native) methods contain an attribute
	"<tt>Code</tt>" that holds the following data: the maximum depth of
	the method's operand stack, the number of local variables, and an
	array of byte code instructions. Optionally, it may also contain
	information about the names of local variables and source file
	line numbers that can be used by a debugger.
	</p>

	<p>
	Whenever an exception is raised during execution, the JVM performs
	exception handling by looking into a table of exception
	handlers. The table designates handlers, i.e., chunks of code, as
	responsible for exceptions of certain types that are raised within
	a given area of the byte code. When there is no appropriate
	handler, the exception is propagated back to the caller of the
	method. The handler information is itself stored in a table
	contained within the <tt>Code</tt> attribute.
	</p>

	</subsection>

	<subsection name="Byte code offsets">
	<p>
	Targets of branch instructions like <tt>goto</tt> are encoded as
	relative offsets in the array of byte code. Exception handlers
	and local variables refer to absolute addresses within the byte
	code. The former contain references to the start and the end of
	the <tt>try</tt> block, and to the start of the handler code. The
	latter mark the range in which a local variable is valid, i.e.,
	its scope. This makes it difficult to insert or delete code areas
	on this level of abstraction, since one has to recompute the
	offsets every time and update the referring objects. We will see
	in <a href="bcel-api.html#ClassGen">section 3.3</a> how <font
	face="helvetica,arial">BCEL</font> remedies this restriction.
	</p>
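
	<p>
	A minimal sketch of how a relative branch offset is resolved: the
	byte array below is a hand-assembled version of the <tt>fac</tt>
	method shown in the code example of this section, using opcode
	values from the JVM specification; the class and method names are
	assumptions made for this illustration. A two-byte signed offset
	follows the branch opcode and is added to the position of the
	branch instruction itself.
	</p>

```java
import java.nio.ByteBuffer;

class BranchOffsets {
    // iload_0, ifne +7, iconst_1, goto +11, iload_0, iload_0,
    // iconst_1, isub, invokestatic #12, imul, ireturn
    static final byte[] FAC_CODE = {
        0x1a, (byte) 0x9a, 0x00, 0x07, 0x04, (byte) 0xa7, 0x00, 0x0b,
        0x1a, 0x1a, 0x04, 0x64, (byte) 0xb8, 0x00, 0x0c, 0x68, (byte) 0xac
    };

    // Absolute target of the branch instruction whose opcode is at pc.
    static int branchTarget(byte[] code, int pc) {
        // ByteBuffer reads big-endian by default, like the class file format.
        short offset = ByteBuffer.wrap(code).getShort(pc + 1);
        return pc + offset;
    }

    public static void main(String[] args) {
        System.out.println(branchTarget(FAC_CODE, 1)); // ifne at 1 -> 8
        System.out.println(branchTarget(FAC_CODE, 5)); // goto at 5 -> 16
    }
}
```

	<p>
	Inserting a single byte between offsets 5 and 16 would require
	patching the operand of the <tt>goto</tt>, which is exactly the
	bookkeeping described above.
	</p>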

	</subsection>

	<subsection name="Type information">
	<p>
	Java is a type-safe language and the information about the types
	of fields, local variables, and methods is stored in so-called
	<em>signatures</em>. These are strings stored in the constant pool
	and encoded in a special format. For example, the argument and
	return types of the <tt>main</tt> method
	</p>

	<p align="center">
	<source>public static void main(String[] argv)</source>
	</p>

	<p>
	are represented by the signature
	</p>

	<p align="center">
	<source>([Ljava/lang/String;)V</source>
	</p>

	<p>
	Classes are internally represented by strings like
	<tt>"java/lang/String"</tt>, basic types like <tt>float</tt> by an
	integer number. Within signatures, basic types are represented by
	single characters, e.g., <tt>I</tt> for <tt>int</tt>, and class
	types are written as <tt>L</tt><em>classname</em><tt>;</tt>.
	Array types are denoted by a leading <tt>[</tt> for each dimension.
	</p>
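
	<p>
	The encoding rules can be sketched as a small helper that turns
	Java type names into signature strings. This is an illustrative
	sketch only: the class and its methods are hypothetical, and it
	expects fully qualified names such as
	<tt>java.lang.String[]</tt>.
	</p>

```java
class Signatures {
    // Encodes one Java type name, e.g. "int" or "java.lang.String[]".
    static String typeSignature(String javaType) {
        StringBuilder sb = new StringBuilder();
        String base = javaType;
        while (base.endsWith("[]")) { // one '[' per array dimension
            sb.append('[');
            base = base.substring(0, base.length() - 2);
        }
        switch (base) {
            case "boolean": sb.append('Z'); break;
            case "byte":    sb.append('B'); break;
            case "char":    sb.append('C'); break;
            case "short":   sb.append('S'); break;
            case "int":     sb.append('I'); break;
            case "long":    sb.append('J'); break;
            case "float":   sb.append('F'); break;
            case "double":  sb.append('D'); break;
            case "void":    sb.append('V'); break;
            default: // class type: L, name with slashes, semicolon
                sb.append('L').append(base.replace('.', '/')).append(';');
        }
        return sb.toString();
    }

    // Encodes a method signature: parameter types in parentheses, then the return type.
    static String methodSignature(String returnType, String... paramTypes) {
        StringBuilder sb = new StringBuilder("(");
        for (String p : paramTypes) {
            sb.append(typeSignature(p));
        }
        return sb.append(')').append(typeSignature(returnType)).toString();
    }

    public static void main(String[] args) {
        System.out.println(methodSignature("void", "java.lang.String[]"));
        System.out.println(methodSignature("int", "int"));
    }
}
```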

	</subsection>

	<subsection name="Code example">
	<p>
	The following example program prompts for a number and prints its
	factorial. The <tt>readLine()</tt> method, which reads from the
	standard input, may raise an <tt>IOException</tt>, and
	<tt>parseInt()</tt> throws a <tt>NumberFormatException</tt> if it
	is passed a malformed number. Thus, the critical area of code
	must be encapsulated in a <tt>try-catch</tt> block.
	</p>

	<source>
	import java.io.*;

	public class Factorial {
	    private static BufferedReader in = new BufferedReader(new InputStreamReader(System.in));

	    public static int fac(int n) {
	        return (n == 0) ? 1 : n * fac(n - 1);
	    }

	    public static int readInt() {
	        int n = 4711;
	        try {
	            System.out.print("Please enter a number> ");
	            n = Integer.parseInt(in.readLine());
	        } catch (IOException e1) {
	            System.err.println(e1);
	        } catch (NumberFormatException e2) {
	            System.err.println(e2);
	        }
	        return n;
	    }

	    public static void main(String[] argv) {
	        int n = readInt();
	        System.out.println("Factorial of " + n + " is " + fac(n));
	    }
	}
	</source>

	<p>
	This code example typically compiles to the following chunks of
	byte code:
	</p>

	<source>
	0: iload_0
	1: ifne #8
	4: iconst_1
	5: goto #16
	8: iload_0
	9: iload_0
	10: iconst_1
	11: isub
	12: invokestatic Factorial.fac (I)I (12)
	15: imul
	16: ireturn

	LocalVariable(start_pc = 0, length = 16, index = 0:int n)
	</source>

	<p><b>fac():</b>
	The method <tt>fac</tt> has only one local variable, the argument
	<tt>n</tt>, stored at index 0. This variable's scope ranges from
	the start of the byte code sequence to the very end. If the value
	of <tt>n</tt> (the value fetched with <tt>iload_0</tt>) is not
	equal to 0, the <tt>ifne</tt> instruction branches to the byte
	code at offset 8, otherwise a 1 is pushed onto the operand stack
	and the control flow branches to the final return. For ease of
	reading, the offsets of the branch instructions, which are
	actually relative, are displayed as absolute addresses in these
	examples.
	</p>

	<p>
	If recursion has to continue, the arguments for the multiplication
	(<tt>n</tt> and <tt>fac(n - 1)</tt>) are evaluated and the results
	pushed onto the operand stack. After the multiplication operation
	has been performed the function returns the computed value from
	the top of the stack.
	</p>
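
	<p>
	To follow the operand stack step by step, the listing can be
	traced with a tiny hand-written interpreter. The sketch below is
	not how a JVM is implemented; it hard-codes the eleven
	instructions of <tt>fac</tt> under their byte code offsets, and
	the class name is made up for this example.
	</p>

```java
class FacInterpreter {
    // Executes the fac() listing for argument n. The case labels are
    // the byte code offsets shown in the listing.
    static int run(int n) {
        int[] stack = new int[3]; // fac never needs more than 3 slots
        int sp = 0;               // index of the next free stack slot
        int local0 = n;           // local variable 0 holds the argument
        int pc = 0;
        while (true) {
            switch (pc) {
                case 0:  stack[sp++] = local0; pc = 1; break;    // iload_0
                case 1:  pc = (stack[--sp] != 0) ? 8 : 4; break; // ifne #8
                case 4:  stack[sp++] = 1; pc = 5; break;         // iconst_1
                case 5:  pc = 16; break;                         // goto #16
                case 8:  stack[sp++] = local0; pc = 9; break;    // iload_0
                case 9:  stack[sp++] = local0; pc = 10; break;   // iload_0
                case 10: stack[sp++] = 1; pc = 11; break;        // iconst_1
                case 11: {                                       // isub
                    int b = stack[--sp];
                    int a = stack[--sp];
                    stack[sp++] = a - b;
                    pc = 12;
                    break;
                }
                case 12: // invokestatic fac: pop the argument, push the result
                    stack[sp - 1] = run(stack[sp - 1]);
                    pc = 15;
                    break;
                case 15: {                                       // imul
                    int b = stack[--sp];
                    int a = stack[--sp];
                    stack[sp++] = a * b;
                    pc = 16;
                    break;
                }
                case 16: return stack[--sp];                     // ireturn
                default: throw new IllegalStateException("bad pc " + pc);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(run(5)); // 120
    }
}
```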

	<source>
	0: sipush 4711
	3: istore_0
	4: getstatic java.lang.System.out Ljava/io/PrintStream;
	7: ldc "Please enter a number> "
	9: invokevirtual java.io.PrintStream.print (Ljava/lang/String;)V
	12: getstatic Factorial.in Ljava/io/BufferedReader;
	15: invokevirtual java.io.BufferedReader.readLine ()Ljava/lang/String;
	18: invokestatic java.lang.Integer.parseInt (Ljava/lang/String;)I
	21: istore_0
	22: goto #44
	25: astore_1
	26: getstatic java.lang.System.err Ljava/io/PrintStream;
	29: aload_1
	30: invokevirtual java.io.PrintStream.println (Ljava/lang/Object;)V
	33: goto #44
	36: astore_1
	37: getstatic java.lang.System.err Ljava/io/PrintStream;
	40: aload_1
	41: invokevirtual java.io.PrintStream.println (Ljava/lang/Object;)V
	44: iload_0
	45: ireturn

	Exception handler(s) =
	From    To      Handler Type
	4       22      25      java.io.IOException(6)
	4       22      36      NumberFormatException(10)
	</source>

	<p><b>readInt():</b> First the local variable <tt>n</tt> (at index 0)
	is initialized to the value 4711. The next instruction,
	<tt>getstatic</tt>, loads the reference held by the static
	<tt>System.out</tt> field onto the stack. Then a string is loaded
	and printed, and a number is read from the standard input and
	assigned to <tt>n</tt>.
	</p>

	<p>
	If one of the called methods (<tt>readLine()</tt> and
	<tt>parseInt()</tt>) throws an exception, the Java Virtual Machine
	calls one of the declared exception handlers, depending on the
	type of the exception. The <tt>try</tt>-clause itself does not
	produce any code; it merely defines the range in which the
	subsequent handlers are active. In the example, the specified
	source code area maps to a byte code area ranging from offset 4
	(inclusive) to 22 (exclusive). If no exception has occurred
	("normal" execution flow), the <tt>goto</tt> instruction at
	offset 22 branches past the handler code to offset 44. There the
	value of <tt>n</tt> is loaded and returned.
	</p>

	<p>
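	The handler for <tt>java.io.IOException</tt> starts at
	offset 25. It simply prints the error and branches back to the
	normal execution flow, i.e., as if no exception had occurred.
	The handler for <tt>NumberFormatException</tt> at offset 36 does
	the same and falls through to offset 44.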
	</p>
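
	<p>
	The handler selection described above can be sketched as a linear
	search over the exception table of <tt>readInt()</tt>. This is a
	simplified model rather than actual JVM code: the
	<tt>Handler</tt> class and the use of <tt>Class.forName</tt> to
	test the exception type are assumptions made for this
	illustration, and a result of -1 stands for propagating the
	exception to the caller.
	</p>

```java
import java.io.FileNotFoundException;
import java.io.IOException;

class HandlerLookup {
    static final class Handler {
        final int startPc, endPc, handlerPc;
        final String catchType;

        Handler(int startPc, int endPc, int handlerPc, String catchType) {
            this.startPc = startPc;
            this.endPc = endPc;
            this.handlerPc = handlerPc;
            this.catchType = catchType;
        }
    }

    // The two table entries from the readInt() listing.
    static final Handler[] READ_INT_TABLE = {
        new Handler(4, 22, 25, "java.io.IOException"),
        new Handler(4, 22, 36, "java.lang.NumberFormatException"),
    };

    static boolean matches(String catchType, Throwable thrown) {
        try {
            return Class.forName(catchType).isInstance(thrown);
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Returns the offset of the first matching handler, or -1 if none applies.
    static int findHandler(Handler[] table, int pc, Throwable thrown) {
        for (Handler h : table) {
            if (h.startPc > pc || pc >= h.endPc) {
                continue; // pc lies outside [startPc, endPc)
            }
            if (matches(h.catchType, thrown)) {
                return h.handlerPc;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(findHandler(READ_INT_TABLE, 15, new IOException()));
        System.out.println(findHandler(READ_INT_TABLE, 18, new NumberFormatException()));
        System.out.println(findHandler(READ_INT_TABLE, 15, new FileNotFoundException()));
        System.out.println(findHandler(READ_INT_TABLE, 30, new IOException()));
    }
}
```

	<p>
	Entries are tried in table order, and subclasses of the declared
	type match as well, so a <tt>FileNotFoundException</tt> raised at
	offset 15 would also be routed to the handler at offset 25.
	</p>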

	</subsection>
	</section>
	</body>

	</document>