Class files
Java classes are compiled into a binary representation stored in .class files. The contents of .class files can be viewed with the javap tool shipped with the JDK.
Here’s the simple Java class from the previous chapter:
Hello.java
package test;

public class Hello {
    public static void main(String[] args) {
        System.out.println("Hello world!");
    }
}
When compiled with javac:
javac -d . Hello.java
it creates a test/Hello.class file in the current directory. Opening this file with javap shows a brief summary of the test.Hello class:
Compiled from "Hello.java"
public class test.Hello {
public test.Hello();
public static void main(java.lang.String[]);
}
The .class file contains all the information required to load the class and execute its methods. The full contents of the file can be displayed using the -verbose option:
javap -verbose test/Hello.class
This shows much more detail:
Classfile /home/lee/src/ruby/lkitching.github.io/_includes/code/basic_java/test/Hello.class
Last modified 28 Aug 2023; size 421 bytes
SHA-256 checksum 8cd926c0834329d073d06102f9f38f77e249b134d9b59e024c80453ca80fa28e
Compiled from "Hello.java"
public class test.Hello
minor version: 0
major version: 58
flags: (0x0021) ACC_PUBLIC, ACC_SUPER
this_class: #21 // test/Hello
super_class: #2 // java/lang/Object
interfaces: 0, fields: 0, methods: 2, attributes: 1
Constant pool:
#1 = Methodref #2.#3 // java/lang/Object."<init>":()V
#2 = Class #4 // java/lang/Object
#3 = NameAndType #5:#6 // "<init>":()V
#4 = Utf8 java/lang/Object
#5 = Utf8 <init>
#6 = Utf8 ()V
#7 = Fieldref #8.#9 // java/lang/System.out:Ljava/io/PrintStream;
#8 = Class #10 // java/lang/System
#9 = NameAndType #11:#12 // out:Ljava/io/PrintStream;
#10 = Utf8 java/lang/System
#11 = Utf8 out
#12 = Utf8 Ljava/io/PrintStream;
#13 = String #14 // Hello world!
#14 = Utf8 Hello world!
#15 = Methodref #16.#17 // java/io/PrintStream.println:(Ljava/lang/String;)V
#16 = Class #18 // java/io/PrintStream
#17 = NameAndType #19:#20 // println:(Ljava/lang/String;)V
#18 = Utf8 java/io/PrintStream
#19 = Utf8 println
#20 = Utf8 (Ljava/lang/String;)V
#21 = Class #22 // test/Hello
#22 = Utf8 test/Hello
#23 = Utf8 Code
#24 = Utf8 LineNumberTable
#25 = Utf8 main
#26 = Utf8 ([Ljava/lang/String;)V
#27 = Utf8 SourceFile
#28 = Utf8 Hello.java
{
public test.Hello();
descriptor: ()V
flags: (0x0001) ACC_PUBLIC
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
LineNumberTable:
line 3: 0
public static void main(java.lang.String[]);
descriptor: ([Ljava/lang/String;)V
flags: (0x0009) ACC_PUBLIC, ACC_STATIC
Code:
stack=2, locals=1, args_size=1
0: getstatic #7 // Field java/lang/System.out:Ljava/io/PrintStream;
3: ldc #13 // String Hello world!
5: invokevirtual #15 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
8: return
LineNumberTable:
line 5: 0
line 6: 8
}
SourceFile: "Hello.java"
The format of .class files is described in full in the JVM specification.
They contain a sequence of sections:
- Magic number
- Version number
- Constant pool
- Class properties
- Interfaces
- Fields
- Methods
- Attributes
Since the test.Hello class contains no fields and implements no interfaces, those two sections are not described here.
Magic number
All .class files begin with the 4-byte constant 0xCAFEBABE. This is not displayed by javap but can be seen by viewing the raw bytes.
Note that multi-byte values in .class files are stored in big-endian order, which may differ from the native byte order of your system.
od --endian=big -x test/Hello.class
0000000 cafe babe 0000 003a 001d 0a00 0200 0307
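The same check can be sketched in Java. The class and method names below are hypothetical, chosen only for illustration. The sketch relies on java.nio.ByteBuffer reading multi-byte values in big-endian order by default, which matches the class file layout.

```java
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Paths;

public class MagicCheck {
    // The 4-byte constant that starts every class file.
    public static final int MAGIC = 0xCAFEBABE;

    // Returns true if the given bytes start with the class file magic number.
    public static boolean hasClassMagic(byte[] bytes) {
        // ByteBuffer is big-endian by default, so no byte swapping is needed.
        return bytes.length >= 4 && ByteBuffer.wrap(bytes).getInt() == MAGIC;
    }

    public static void main(String[] args) throws Exception {
        // The path is supplied by the caller, e.g. test/Hello.class.
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(hasClassMagic(bytes) ? "class file" : "not a class file");
    }
}
```

Running it as java MagicCheck test/Hello.class should report a class file for any output of javac.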
Version number
The first information displayed by javap is the major and minor version of the class file format this file uses. This class file uses version 58.0, which means it is only supported by version 14 or higher of the JVM. Attempting to load this class on an older version of the JVM will result in an error, for example:
Error: A JNI error has occurred, please check your installation and try again
Exception in thread "main" java.lang.UnsupportedClassVersionError: test/Hello has been compiled by a more recent version of the Java Runtime (class file version 58.0), this version of the Java Runtime only recognizes class file versions up to 52.0
at java.lang.ClassLoader.defineClass1(Native Method)
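The version fields sit directly after the magic number: two bytes of minor version followed by two bytes of major version. As a rough sketch (the class and method names are hypothetical), they can be read from the raw bytes and mapped to a Java release. The mapping used here, release = major version - 44, holds for major versions 49 (Java 5) and above. Older versions are rejected rather than guessed at.

```java
import java.nio.ByteBuffer;

public class ClassVersion {
    // Bytes 4-5 hold the minor version (after the 4-byte magic number).
    public static int minorVersion(byte[] classBytes) {
        return ByteBuffer.wrap(classBytes).getShort(4) & 0xFFFF;
    }

    // Bytes 6-7 hold the major version.
    public static int majorVersion(byte[] classBytes) {
        return ByteBuffer.wrap(classBytes).getShort(6) & 0xFFFF;
    }

    // Maps a class file major version to the Java release that introduced it.
    public static int javaRelease(int major) {
        if (major < 49) {
            throw new IllegalArgumentException("mapping only covers major versions >= 49");
        }
        return major - 44;
    }

    public static void main(String[] args) {
        // First eight bytes of test/Hello.class as shown by od above.
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 0x3A};
        int major = majorVersion(header);
        System.out.println(major + "." + minorVersion(header) + " -> Java " + javaRelease(major));
    }
}
```

With the header bytes shown earlier, the major version is 0x3A (58), which corresponds to Java 14. The 52.0 limit in the error message corresponds to Java 8.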
Class properties
The flags, this_class and super_class entries define the access properties and names of the defined class and its direct superclass. The Hello class does not explicitly declare a superclass, so it is implicitly java.lang.Object.
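The flags value is a bit mask. The following sketch decodes a class-level flags value into the names javap prints. The class name ClassFlags is hypothetical, and the mask values are the class access flags listed in the JVM specification. Method and field flags reuse some of the same bits with different meanings, so this table should not be applied to them.

```java
import java.util.ArrayList;
import java.util.List;

public class ClassFlags {
    // Class-level access flag masks and their names, in ascending bit order.
    private static final int[] MASKS = {
        0x0001, 0x0010, 0x0020, 0x0200, 0x0400, 0x1000, 0x2000, 0x4000, 0x8000
    };
    private static final String[] NAMES = {
        "ACC_PUBLIC", "ACC_FINAL", "ACC_SUPER", "ACC_INTERFACE", "ACC_ABSTRACT",
        "ACC_SYNTHETIC", "ACC_ANNOTATION", "ACC_ENUM", "ACC_MODULE"
    };

    // Returns the names of all flags set in the given class access_flags value.
    public static List<String> decode(int flags) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < MASKS.length; i++) {
            if ((flags & MASKS[i]) != 0) {
                names.add(NAMES[i]);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // 0x0021 is the flags value javap reports for test.Hello.
        System.out.println(decode(0x0021));
    }
}
```

For the value 0x0021 shown in the javap output, the bits 0x0001 and 0x0020 are set, giving ACC_PUBLIC and ACC_SUPER.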
Constant pool
The test.Hello class makes various symbolic references to other code elements, such as classes and their methods. Some of these references are not explicit in the source code. These names and their types are recorded in the constant pool.
Binary names
The format of binary names within .class files differs slightly from those in .java files. The . separator used by Java is replaced by / within .class files.
For example, the class java.lang.Object will be referred to as java/lang/Object within class files.
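As a small illustration, the conversion is a plain character replacement on the name returned by Class.getName(). The helper name toInternalForm is hypothetical.

```java
public class BinaryNames {
    // Converts a source-style class name to the form used inside class files.
    public static String toInternalForm(String className) {
        return className.replace('.', '/');
    }

    public static void main(String[] args) {
        // Class.getName() returns the dotted name, e.g. java.lang.Object.
        System.out.println(toInternalForm(Object.class.getName()));
        System.out.println(toInternalForm("test.Hello"));
    }
}
```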
Descriptors
Within class files, descriptors are used to define the types of fields and methods.
Field descriptors
JVM types are either one of the primitive types, an object type or an array type. The grammar for field descriptors denotes one of these three possibilities.
The primitive types are byte, char, double, float, int, long, short and boolean, and these are denoted as B, C, D, F, I, J, S and Z respectively within the field descriptor grammar. For example, a descriptor of J indicates a field of the primitive long type.
Object types are defined with the literal L followed by the binary name of the class type followed by the literal ‘;’. For example, a field of type java.lang.Thread has a descriptor of Ljava/lang/Thread;.
Array types are defined with the literal [ followed by the descriptor of the element type. So an array of java.lang.Thread references has a descriptor of [Ljava/lang/Thread; and an array of primitive int values has a descriptor of [I.
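The three cases of the grammar translate directly into a short recursive function. This is a sketch for illustration, with hypothetical class and method names. It works on java.lang.Class objects rather than parsing source text.

```java
public class FieldDescriptors {
    // Builds the field descriptor for a type by following the three grammar cases.
    public static String descriptor(Class<?> type) {
        if (type == void.class) {
            throw new IllegalArgumentException("void is not a field type");
        }
        // Array type: '[' followed by the descriptor of the element type.
        if (type.isArray()) {
            return "[" + descriptor(type.getComponentType());
        }
        // Primitive types: a single letter each.
        if (type == byte.class) return "B";
        if (type == char.class) return "C";
        if (type == double.class) return "D";
        if (type == float.class) return "F";
        if (type == int.class) return "I";
        if (type == long.class) return "J";
        if (type == short.class) return "S";
        if (type == boolean.class) return "Z";
        // Object type: 'L', the binary name with '/' separators, then ';'.
        return "L" + type.getName().replace('.', '/') + ";";
    }

    public static void main(String[] args) {
        System.out.println(descriptor(long.class));
        System.out.println(descriptor(Thread.class));
        System.out.println(descriptor(Thread[].class));
        System.out.println(descriptor(int[].class));
    }
}
```

On Java 12 and later, Class.descriptorString() returns the same strings for these types and can be used as a cross-check.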
Method descriptors
Methods take a (possibly empty) sequence of arguments and optionally return a value. Methods which do not return a value are given a return type of void (denoted by V) within method descriptors.
Methods which return a value use the grammar for field descriptors to denote the return type.
Method descriptors consist of a literal (, the corresponding field descriptor for each parameter type in sequence, a literal ), and finally the return type descriptor. This is either V for void methods, or a field descriptor for the return type.
For example, the method public String test(int i, boolean b, Object o) has a method descriptor of (IZLjava/lang/Object;)Ljava/lang/String;.
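Method descriptors can be produced from within Java using java.lang.invoke.MethodType, whose toMethodDescriptorString() method returns the descriptor for a given return type and parameter list. The wrapper class and the describe helper below are hypothetical names used only for this sketch.

```java
import java.lang.invoke.MethodType;

public class MethodDescriptors {
    // Returns the JVM method descriptor for the given return and parameter types.
    public static String describe(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // public String test(int i, boolean b, Object o)
        System.out.println(describe(String.class, int.class, boolean.class, Object.class));
        // public static void main(String[] args)
        System.out.println(describe(void.class, String[].class));
    }
}
```

The first call yields (IZLjava/lang/Object;)Ljava/lang/String; and the second yields ([Ljava/lang/String;)V, the descriptor javap reports for main.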
Constant pool contents
Here are the references made by the test.Hello class:
Index | Name | Type | Description |
---|---|---|---|
#2 | java/lang/Object | Class | This is the (implicit) superclass of the test.Hello class. The super_class property (see above) contains the index of this class reference within the constant pool. |
#1 | java/lang/Object::<init> | Method | Object constructors are given the special name <init> within .class files. This is invoked by the test.Hello constructor (see below). |
#8 | java/lang/System | Class | Class which defines the static out field |
#7 | java/lang/System::out | Field | Reference to the out field of the java.lang.System class |
#12 | Ljava/io/PrintStream; | Type reference | The descriptor of the declared type of the System.out field |
#13 | | String | The string constant “Hello world!” |
#16 | java/io/PrintStream | Class | Type defining the println method |
#15 | java/io/PrintStream::println | Method | The println method used to write to the console |
#21 | test/Hello | Class | The class defined by this class file. The this_class property contains this index into the constant pool |
#25 | main | utf8 | Name of the main method |
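To see how these entries are laid out in the file, here is a minimal sketch of a constant pool reader. It is an illustration, not a complete class file parser. The class name is hypothetical, the tag numbers and entry sizes follow the JVM specification, and Utf8 entries are decoded as standard UTF-8, which is an approximation of the modified UTF-8 encoding class files actually use.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class ConstantPoolDump {
    // Reads an unsigned 16-bit value (ByteBuffer is big-endian by default).
    private static int u2(ByteBuffer in) {
        return in.getShort() & 0xFFFF;
    }

    // Returns one line per constant pool entry, loosely following javap's layout.
    public static List<String> dump(byte[] classBytes) {
        ByteBuffer in = ByteBuffer.wrap(classBytes);
        if (in.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file");
        }
        u2(in); // minor version
        u2(in); // major version
        int count = u2(in); // one more than the number of entries
        List<String> lines = new ArrayList<>();
        for (int i = 1; i < count; i++) {
            int tag = in.get() & 0xFF;
            String text;
            switch (tag) {
                case 1: { // length-prefixed string data
                    byte[] raw = new byte[u2(in)];
                    in.get(raw);
                    // Assumption: plain UTF-8 is close enough for this sketch.
                    text = "Utf8 " + new String(raw, StandardCharsets.UTF_8);
                    break;
                }
                case 3: text = "Integer " + in.getInt(); break;
                case 4: text = "Float " + in.getFloat(); break;
                case 5: text = "Long " + in.getLong(); break;
                case 6: text = "Double " + in.getDouble(); break;
                case 7: text = "Class #" + u2(in); break;
                case 8: text = "String #" + u2(in); break;
                case 9: text = "Fieldref #" + u2(in) + ".#" + u2(in); break;
                case 10: text = "Methodref #" + u2(in) + ".#" + u2(in); break;
                case 11: text = "InterfaceMethodref #" + u2(in) + ".#" + u2(in); break;
                case 12: text = "NameAndType #" + u2(in) + ":#" + u2(in); break;
                case 15: text = "MethodHandle " + (in.get() & 0xFF) + ":#" + u2(in); break;
                case 16: text = "MethodType #" + u2(in); break;
                case 17: text = "Dynamic #" + u2(in) + ":#" + u2(in); break;
                case 18: text = "InvokeDynamic #" + u2(in) + ":#" + u2(in); break;
                case 19: text = "Module #" + u2(in); break;
                case 20: text = "Package #" + u2(in); break;
                default: throw new IllegalArgumentException("unknown tag " + tag);
            }
            lines.add("#" + i + " = " + text);
            if (tag == 5 || tag == 6) {
                i++; // long and double entries occupy two pool slots
            }
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        // The path is supplied by the caller, e.g. test/Hello.class.
        dump(Files.readAllBytes(Paths.get(args[0]))).forEach(System.out::println);
    }
}
```

Run against test/Hello.class, the output should list the same 28 entries as the Constant pool section of the javap -verbose output, without the resolved comments.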
Methods
Entries in the method table define the name, attributes (access modifiers etc.) and JVM opcodes for the methods defined by the class.
Instruction format
JVM instructions are variable-length and consist of an opcode which defines the operation, and a (possibly empty) sequence of operands. javap displays each instruction in the following format:
<index> <opcode> [ <operand1>, ..., <operandN> ] [ <comment> ]
index is the index of the start of the instruction in the code array for the method.
opcode is a mnemonic for the operation.
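The variable-length encoding explains why the indexes printed by javap are not consecutive. The sketch below decodes a code array covering only the six opcodes used by test.Hello. The class name is hypothetical, and the opcode byte values are taken from the JVM specification. The sample bytes in main are reconstructed from the javap listing for the main method, not read from the file.

```java
import java.util.ArrayList;
import java.util.List;

public class MiniDisassembler {
    // Reads a two-byte big-endian operand starting at the given offset.
    private static int u2(byte[] code, int at) {
        return ((code[at] & 0xFF) << 8) | (code[at + 1] & 0xFF);
    }

    // Decodes a code array, returning one "<index>: <opcode> [operand]" line per instruction.
    public static List<String> disassemble(byte[] code) {
        List<String> out = new ArrayList<>();
        int pc = 0;
        while (pc < code.length) {
            int opcode = code[pc] & 0xFF;
            switch (opcode) {
                case 0x2A: out.add(pc + ": aload_0"); pc += 1; break;
                case 0xB1: out.add(pc + ": return"); pc += 1; break;
                // ldc takes a one-byte constant pool index.
                case 0x12: out.add(pc + ": ldc #" + (code[pc + 1] & 0xFF)); pc += 2; break;
                // The remaining instructions take a two-byte constant pool index.
                case 0xB2: out.add(pc + ": getstatic #" + u2(code, pc + 1)); pc += 3; break;
                case 0xB6: out.add(pc + ": invokevirtual #" + u2(code, pc + 1)); pc += 3; break;
                case 0xB7: out.add(pc + ": invokespecial #" + u2(code, pc + 1)); pc += 3; break;
                default:
                    throw new IllegalArgumentException("unsupported opcode 0x" + Integer.toHexString(opcode));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Code array of test.Hello.main, reconstructed from the javap listing.
        byte[] code = {(byte) 0xB2, 0, 7, 0x12, 13, (byte) 0xB6, 0, 15, (byte) 0xB1};
        disassemble(code).forEach(System.out::println);
    }
}
```

The decoded indexes are 0, 3, 5 and 8: getstatic and invokevirtual each occupy three bytes, while ldc occupies two.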
Method execution
Each JVM thread defines its own private stack consisting of frames for each method invocation. As a method is invoked, a new stack frame is allocated to store space for local variables and temporary results during execution. On method exit, the frame is popped from the stack and execution returns to the calling method.
Each method also makes use of an operand stack which is manipulated by individual JVM instructions. Some instructions push values onto the operand stack, some store values from the top of the stack into local variables, and some consume multiple items from the top of the stack and push a result.
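To make the operand stack concrete, the following sketch simulates by hand the four instructions javac emits for a method static int add(int a, int b) { return a + b; }. It is a toy model with a hypothetical class name: a real JVM frame is not a java.util.Deque, and the simulation only mirrors the order of pushes and pops.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class OperandStackDemo {
    // Simulates the bytecode: iload_0, iload_1, iadd, ireturn.
    public static int add(int a, int b) {
        int[] locals = {a, b}; // local variable slots 0 and 1 hold the arguments
        Deque<Integer> stack = new ArrayDeque<>();

        stack.push(locals[0]); // iload_0: push local 0
        stack.push(locals[1]); // iload_1: push local 1

        // iadd: pop two values and push their sum
        int right = stack.pop();
        int left = stack.pop();
        stack.push(left + right);

        return stack.pop(); // ireturn: pop the result and return it to the caller
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3));
    }
}
```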
Arguments
In order to invoke a method, its arguments are pushed in order onto the operand stack, and the method is invoked via a method reference in the class constant pool.
Instance methods are invoked via the invokevirtual instruction, and have a first argument which is a reference to the object the method is being invoked on (the receiver).
Static methods are invoked with invokestatic and only the explicit arguments are pushed prior to invocation.
Within a method, the arguments are held in local variable slots and are loaded onto the operand stack with a family of load instructions (aload, iload etc.).
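The difference between the two invocation styles can be seen by compiling a small class and inspecting it with javap -c. The class below is a hypothetical example, and the comments describe the instructions javac is expected to emit for each call.

```java
public class Invocations {
    private final int base;

    public Invocations(int base) {
        this.base = base;
    }

    // Static method: local slot 0 holds x, loaded with iload_0.
    public static int twice(int x) {
        return x * 2;
    }

    // Instance method: slot 0 holds the receiver (aload_0), slot 1 holds x (iload_1).
    public int plus(int x) {
        return base + x;
    }

    public static int demo() {
        Invocations inv = new Invocations(10); // constructor call: invokespecial
        int a = twice(4);    // invokestatic: only the argument 4 is pushed
        int b = inv.plus(4); // invokevirtual: the receiver inv is pushed first, then 4
        return a + b;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```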
Hello class methods
The test.Hello class defines two methods - an implicit constructor and the static main method.
Constructor
public test.Hello();
descriptor: ()V
flags: (0x0001) ACC_PUBLIC
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
LineNumberTable:
line 3: 0
The constructor declares no formal parameters and returns void, as shown by the descriptor. It does require a single argument however - the reference to the new object being initialised. This is loaded onto the operand stack with the aload_0 instruction. The java.lang.Object constructor is then invoked with the invokespecial instruction, passing the reference just loaded as the receiver. The invokespecial instruction is similar to invokevirtual and invokestatic but is required to invoke constructors and superclass methods. The operand to the invokespecial instruction is an index into the constant pool of the test.Hello class. The element at index #1 is a method reference to the java/lang/Object.<init> method, as indicated by the comment.
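Written out in source form, the constructor that javac generates is equivalent to the following. The class name ExplicitHello is hypothetical and is used to avoid clashing with the original class.

```java
public class ExplicitHello {
    // Equivalent to the default constructor javac adds when none is declared.
    public ExplicitHello() {
        // Compiles to aload_0 followed by invokespecial java/lang/Object."<init>":()V
        super();
    }
}
```

Compiling this class and running javap -verbose on it should show the same three instructions in the constructor as in test.Hello.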
main
public static void main(java.lang.String[]);
descriptor: ([Ljava/lang/String;)V
flags: (0x0009) ACC_PUBLIC, ACC_STATIC
Code:
stack=2, locals=1, args_size=1
0: getstatic #7 // Field java/lang/System.out:Ljava/io/PrintStream;
3: ldc #13 // String Hello world!
5: invokevirtual #15 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
8: return
LineNumberTable:
line 5: 0
line 6: 8
As indicated by the descriptor, the main method defines a single String[] parameter and has a return type of void.
The flags also indicate the method is static and public.
The getstatic instruction loads the value of a static field onto the operand stack. The operand of #7 is the index of the reference to the System.out field within the constant pool of the test.Hello class.
The literal string “Hello world!” is pushed onto the operand stack with the ldc instruction. The operand of #13 is the index of the string within the class constant pool.
At this point the operand stack contains the arguments to the java.io.PrintStream.println method - the receiver (the value of the System.out field), and the string to write. The method is invoked with invokevirtual using the method reference at index #15 in the class constant pool.