Internals

Characteristics

  • Object Oriented
  • Platform independent
  • General purpose
  • Compiled + interpreted

JRE

Environment required to run Java applications.

Even if you're not developing and just want to run a Java application, you still need the JRE.

JRE = JVM + Library Classes

JVM

Java Virtual Machine

The JVM acts as an interpreter: it converts the program from bytecode to machine code and executes it within the JRE. It is the JVM that varies from system to system, while the bytecode stays the same, which keeps Java platform independent.

There can also be different implementations of the JVM for different purposes, each providing different optimizations.

Class Loader

  • Not a type of memory
  • Subsystem of the JVM
  • Used to load class files
  • Responsible for three activities
    • Loading
    • Linking
    • Initialization
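The difference between loading and initialization can be observed from plain Java. The sketch below is only an illustration: the class names `ClassLoadingDemo` and `Lazy` are made up, and it relies on `Class.forName(name, false, loader)` to load a class without initializing it.

```java
class ClassLoadingDemo {
    // Flipped by Lazy's static initializer so we can observe when initialization happens.
    static boolean lazyInitialized = false;

    static class Lazy {
        static { lazyInitialized = true; }
        static int touch() { return 42; }
    }

    /** Returns {initialized after loading only, initialized after first use}. */
    static boolean[] observe() {
        try {
            // initialize=false: load the class but do not run its static initializers yet.
            Class.forName(ClassLoadingDemo.class.getName() + "$Lazy", false,
                    ClassLoadingDemo.class.getClassLoader());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        boolean afterLoad = lazyInitialized;
        Lazy.touch(); // first active use triggers initialization
        return new boolean[] { afterLoad, lazyInitialized };
    }

    public static void main(String[] args) {
        // Core classes come from the bootstrap loader, which is reported as null.
        System.out.println("String loaded by: " + String.class.getClassLoader());
        System.out.println("Demo loaded by:   " + ClassLoadingDemo.class.getClassLoader());
        boolean[] seen = observe();
        System.out.println("Initialized after loading:   " + seen[0]);
        System.out.println("Initialized after first use: " + seen[1]);
    }
}
```

On a first call, the static initializer of `Lazy` should not have run after loading alone, and should have run after `touch()` is invoked.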

Garbage Collector

JVM component responsible for garbage collection (automatic memory management) of objects.

Java offers different garbage collectors for different latency, throughput and hardware requirements. Most of these garbage collectors are "generational": they rely on the generational hypothesis (most objects die young) and use it to separate objects by age and optimize garbage collection.

Generational Garbage Collection

Objects are collected according to their age, i.e. the number of garbage collection cycles they have survived.

Heap Memory Division

A generational garbage collector divides the heap memory into different generations based on the lifespan of objects.

Heap is divided into:

  • Young Generation
    • Eden space
    • Survivor spaces
      • S0
      • S1
  • Old Generation
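To see how a running JVM actually names these areas, the standard `java.lang.management` API can list its memory pools. This is a minimal sketch; the class name `HeapPoolsDemo` is made up, and the pool names printed depend on the collector in use (they are not guaranteed to contain the words Eden, Survivor or Old).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

class HeapPoolsDemo {
    /** Returns the names of the heap memory pools reported by the running JVM. */
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Prints one line per heap pool; names vary between collectors.
        heapPoolNames().forEach(System.out::println);
    }
}
```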

Garbage collection life cycle

  • Objects are stored in Eden space when they are first created.

  • 1st cycle - When Eden space gets full, a minor GC runs

    • Surviving objects are moved to one survivor space (say S0)
  • Subsequent cycles - Eden space gets full again

    • Surviving objects are moved to the other survivor space (S1) from both Eden space and S0
  • Surviving objects keep moving between the two survivor spaces on each cycle

  • Objects older than the max tenuring threshold (a threshold on the number of cycles survived) are moved to the old generation space. The max tenuring threshold can be configured.

  • When the old generation space is about to be full

    • A major GC runs → time consuming and might pause the application
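The aging steps above can be sketched as a toy model. This is not how any real collector is implemented: the class `TenuringSketch`, the threshold value of 3 and the assumption that every object survives every cycle are all invented for illustration, and a real JVM may compare ages against the threshold slightly differently.

```java
import java.util.ArrayList;
import java.util.List;

class TenuringSketch {
    static final int MAX_TENURING_THRESHOLD = 3; // hypothetical value for illustration

    // Each list holds the ages (minor GCs survived) of the objects in that space.
    final List<Integer> eden = new ArrayList<>();
    List<Integer> survivorFrom = new ArrayList<>();
    List<Integer> survivorTo = new ArrayList<>();
    final List<Integer> old = new ArrayList<>();

    /** New objects always start in Eden with age 0. */
    void allocate() {
        eden.add(0);
    }

    /** One minor GC, assuming (for simplicity) that every object is still live. */
    void minorGc() {
        for (List<Integer> space : List.of(eden, survivorFrom)) {
            for (int age : space) {
                int newAge = age + 1;
                if (newAge > MAX_TENURING_THRESHOLD) {
                    old.add(newAge);        // promoted to the old generation
                } else {
                    survivorTo.add(newAge); // copied to the empty survivor space
                }
            }
            space.clear();
        }
        // The two survivor spaces swap roles for the next cycle.
        List<Integer> tmp = survivorFrom;
        survivorFrom = survivorTo;
        survivorTo = tmp;
    }

    public static void main(String[] args) {
        TenuringSketch heap = new TenuringSketch();
        heap.allocate();
        for (int cycle = 1; cycle <= 4; cycle++) {
            heap.minorGc();
            System.out.println("cycle " + cycle + ": survivor=" + heap.survivorFrom + " old=" + heap.old);
        }
    }
}
```

With a threshold of 3, the single object stays in the survivor spaces for three cycles and is promoted on the fourth.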

Concurrent Mark and Sweep (CMS)

Deprecated in JDK 9 and removed in JDK 14.

  • Operation: Runs a GC thread alongside the application.
  • Used when:
    • More memory available
    • High number of CPUs or cores present
    • App demands short pauses

(Historically a common choice for latency-sensitive applications, e.g. fintech apps)

Garbage First (G1)

Available since JDK 7 and the default collector since JDK 9; designed as the replacement for Concurrent Mark and Sweep (CMS)

  • Operation
    • Young and old generations are not fixed, contiguous areas; each is a changing set of regions
    • Divides the heap into many equally sized regions
    • Collects first from the regions which have the most garbage → Garbage first (G1)
  • Advantages
    • GC pauses can be tuned
    • Small pauses
    • Parallelism and concurrency together
    • Better heap utilization

G1 does follow the generational hypothesis, but it does so in a more flexible and dynamic way.
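The "garbage first" idea can be sketched as a simple selection problem. This is a deliberately simplified illustration, not G1's actual algorithm (which also predicts pause times): the `Region` class, its fields and the `maxRegions` budget are assumptions made up for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class GarbageFirstSketch {
    /** A heap region; the id and garbage figure are made up for this illustration. */
    static final class Region {
        final int id;
        final int garbageBytes;

        Region(int id, int garbageBytes) {
            this.id = id;
            this.garbageBytes = garbageBytes;
        }
    }

    /** Returns the ids of up to maxRegions regions, those with the most garbage first. */
    static List<Integer> pickRegions(List<Region> regions, int maxRegions) {
        List<Region> byGarbage = new ArrayList<>(regions);
        byGarbage.sort(Comparator.comparingInt((Region r) -> r.garbageBytes).reversed());
        List<Integer> picked = new ArrayList<>();
        for (Region r : byGarbage.subList(0, Math.min(maxRegions, byGarbage.size()))) {
            picked.add(r.id);
        }
        return picked;
    }

    public static void main(String[] args) {
        List<Region> regions = List.of(new Region(1, 10), new Region(2, 90), new Region(3, 50));
        System.out.println(pickRegions(regions, 2)); // [2, 3]
    }
}
```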

Serial Garbage Collector

  • Operation: Stops the application and runs a single thread for garbage collection.
  • Used when: The application is small.

Throughput Collector

Configurations

JVM offers several configuration options which can be tuned via command-line options, environment variables or Docker settings.

Setting Configurations

  • Setting configuration options in a startup script, e.g. when starting the application via the java -jar command:

    java -XX:+UseG1GC -XX:MaxTenuringThreshold=10 -Xms512m -Xmx2g -jar my-spring-app.jar
    

    Similar commands can be included in bash scripts.

  • For Dockerized apps

    • Dockerfile
      FROM openjdk:17-jdk-alpine
      ARG JAR_FILE=target/my-spring-app.jar
      COPY ${JAR_FILE} app.jar
      ENTRYPOINT ["java", "-XX:+UseG1GC", "-XX:MaxTenuringThreshold=10", "-Xms512m", "-Xmx2g", "-jar", "/app.jar"]
      
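    • Setting environment variable in Docker compose:
      version: '3'
      services:
        my-spring-app:
          image: my-spring-app
          environment:
            - JAVA_OPTS=-XX:+UseG1GC -XX:MaxTenuringThreshold=10 -Xms512m -Xmx2g
          # Run through a shell so JAVA_OPTS is expanded inside the container;
          # "$$" stops Compose from interpolating the variable itself.
          # Assumes the image does not define its own ENTRYPOINT.
          command: ["sh", "-c", "exec java $$JAVA_OPTS -jar /app.jar"]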
      
    • Kubernetes
          env:
          - name: JAVA_OPTS
            value: "-Xms512m -Xmx2g -XX:+UseG1GC"
      
  • Using JAVA_TOOL_OPTIONS environment variable

    • Some hosting environments or CI/CD platforms support setting JVM options via the JAVA_TOOL_OPTIONS environment variable, which the JVM reads on startup.
    	export JAVA_TOOL_OPTIONS="-XX:+UseG1GC -XX:MaxTenuringThreshold=10 -Xms512m -Xmx2g"
    	java -jar my-spring-app.jar
    
  • You can also put the options in a variable in a .conf file and refer to it in your deployment environment

    Example: app.conf in some Spring application

    	JAVA_OPTS="-XX:+UseG1GC -Xmx2g -Xms512m"
    

Configuration Options

Memory Management

  • Heap Size
    • Initial heap size -Xms<size>
    • Max heap size -Xmx<size>
  • Stack Size: -Xss<size>

Garbage Collection

  • GC Algorithms
      -XX:+UseG1GC
      -XX:+UseParallelGC
      -XX:+UseConcMarkSweepGC  # Deprecated in JDK 9, removed in JDK 14
      -XX:+UseZGC
      -XX:+UseShenandoahGC
    
  • Tenuring threshold
    • Initial -XX:InitialTenuringThreshold=<value>
    • Max -XX:MaxTenuringThreshold=<value>
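To check which options a running JVM actually received, the standard management API can be queried from inside the application. This is a small sketch; the class name `JvmOptionsDemo` is made up, and the output depends entirely on how the JVM was started.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

class JvmOptionsDemo {
    /** JVM arguments as passed on the command line (excluding the main class and its arguments). */
    static List<String> inputArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    /** Maximum amount of memory the JVM will attempt to use for the heap, in bytes. */
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        System.out.println("JVM arguments:    " + inputArguments());
        System.out.println("Max heap (bytes): " + maxHeapBytes());
        // The collector names reveal which GC algorithm is in effect.
        ManagementFactory.getGarbageCollectorMXBeans()
                .forEach(gc -> System.out.println("Collector: " + gc.getName()));
    }
}
```

Running it with, for example, `java -XX:+UseG1GC -Xmx2g JvmOptionsDemo.java` should list those flags among the JVM arguments.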

System Environment

This environment is a system-dependent mapping from names to values which is passed from parent to child processes. It is primarily the set of variables that define or control certain aspects of process execution.

System Properties

Java maintains a set of system properties for its operations. Each Java system property is a key-value (String-String) pair.

System Properties vs Environment Variables

System properties and environment variables are both conceptually mappings between names and values. Both mechanisms can be used to pass user-defined information to a Java process. Environment variables have a more global effect, because they are visible to all descendants of the process which defines them, not just the immediate Java subprocess. They can have subtly different semantics, such as case insensitivity, on different operating systems. For these reasons, environment variables are more likely to have unintended side effects. It is best to use system properties where possible. Environment variables should be used when a global effect is desired, or when an external system interface requires an environment variable (such as PATH).
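Both mechanisms are read through `java.lang.System`. The sketch below prefers a system property and falls back to an environment variable; the helper `setting` and the names `app.mode` and `APP_MODE` are hypothetical and exist only for this example.

```java
class PropsVsEnvDemo {
    /** Reads a system property first, then an environment variable, then a default. */
    static String setting(String propertyKey, String envKey, String defaultValue) {
        String fromProperty = System.getProperty(propertyKey);
        if (fromProperty != null) {
            return fromProperty;
        }
        String fromEnv = System.getenv(envKey);
        return fromEnv != null ? fromEnv : defaultValue;
    }

    public static void main(String[] args) {
        // A property the JVM always defines, and an environment variable from the parent process.
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println("PATH = " + System.getenv("PATH"));
        // Set with -Dapp.mode=... or APP_MODE=...; both names are made up for this sketch.
        System.out.println("mode = " + setting("app.mode", "APP_MODE", "dev"));
    }
}
```

System properties can be supplied per process with `-Dkey=value` on the `java` command line, which keeps them scoped to that one JVM.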

Entities in Memory

Array

  • Single-dimensional arrays: A contiguous space is allocated in heap and a reference is returned (similar to new object).
  • Two-dimensional arrays
    • They don't really exist.
    • 2D arrays are just arrays of arrays.
    • Multi-dimensional arrays go by the same rule.
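Because a 2D array is an array of references to other arrays, its rows can have different lengths or even be shared. A minimal sketch (the class name `ArrayOfArraysDemo` is made up):

```java
class ArrayOfArraysDemo {
    /** Builds a "2D" array whose rows have different lengths. */
    static int[][] jagged() {
        int[][] rows = new int[3][]; // outer array of 3 references, all null for now
        rows[0] = new int[1];
        rows[1] = new int[4];
        rows[2] = rows[1];           // two rows may refer to the same inner array
        return rows;
    }

    public static void main(String[] args) {
        int[][] rows = jagged();
        rows[1][0] = 7;
        System.out.println(rows[0].length + " " + rows[1].length); // 1 4
        System.out.println(rows[2][0]);                            // 7, same inner array as rows[1]
    }
}
```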

Class

  • The Class Loader loads the class file (the output of the build process).
  • The constants, static components, method code etc. are loaded into the method area (class area), which is separate from the stack.

Methods

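  • Each thread has its own private JVM stack in stack memory
  • On each method invocation, a new frame is created and pushed onto that stack
  • The frame holds the method's local variables and operand stack (see Memory Management: Stack Frame)
  • The frame is destroyed when the method invocation completes.

A new frame is created each time the method is invoked again.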
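The one-frame-per-invocation behaviour can be observed by counting stack frames at different recursion depths. This is a sketch only; `StackFrameDemo` and its methods are made-up names, and the absolute depth depends on how the program is launched, so only the difference is meaningful.

```java
class StackFrameDemo {
    /** Number of frames currently on this thread's stack. */
    static int currentDepth() {
        return Thread.currentThread().getStackTrace().length;
    }

    /** Recurses `levels` times, then reports the stack depth at the deepest point. */
    static int depthAfter(int levels) {
        if (levels == 0) {
            return currentDepth();
        }
        return depthAfter(levels - 1); // each call pushes a new frame with its own `levels`
    }

    public static void main(String[] args) {
        int base = depthAfter(0);
        int deeper = depthAfter(5);
        System.out.println("Extra frames for 5 nested calls: " + (deeper - base)); // 5
    }
}
```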

How are arguments passed in Java?

  • Why is Java always 'pass by value'?

    Whenever a method is invoked in Java, it is allotted its own stack space. Regardless of the original variable type, each time a method is invoked, a copy for each argument is created in the stack memory and the copy version is passed to the method. Thus, always pass by value.

  • Passing Primitive Arguments

    Consider two variables, x and y, of primitive types and thus stored inside the stack memory. When calling a function, two copies are created inside the stack memory (let's say w and z) and are then passed to the method. Hence, the original variables are not sent to the method, and any modification inside the method affects only the copies.

  • Passing Wrapper/String Arguments

    Wrappers are stored inside the heap memory with a corresponding reference inside the stack memory. When calling a function, a copy of each reference is created inside the stack memory, and the copies are passed to the method. Any change to the reference inside the method changes only the copy, not the original reference.

    If you change the value of a wrapper object inside the method like this: x += 2, the change is not reflected outside the method, since wrapper objects are immutable. A new instance is created each time their state is modified. String objects work similarly to wrappers, so the above rules also apply to strings.

  • Passing Collection/Object Arguments

    When defining any collection or object in Java, a reference is created inside the stack that points to the object (and, through it, to its contents) inside the heap memory. When calling a function, a copy of the reference is created and passed to the method. The actual object data is then referenced by two references, and any change made through one reference is visible through the other.
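The three cases above can be checked with a short program. This is a minimal sketch; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class PassByValueDemo {
    static void bumpPrimitive(int n) { n += 2; }   // changes only the local copy
    static void bumpWrapper(Integer n) { n += 2; } // rebinds the local copy to a new Integer
    static void append(List<String> items) { items.add("added"); }          // mutates the shared object
    static void reassign(List<String> items) { items = new ArrayList<>(); } // rebinds the local copy only

    public static void main(String[] args) {
        int x = 1;
        bumpPrimitive(x);

        Integer boxed = 1;
        bumpWrapper(boxed);

        List<String> list = new ArrayList<>();
        append(list);
        reassign(list);

        System.out.println(x + " " + boxed + " " + list); // 1 1 [added]
    }
}
```

Only `append` has a visible effect for the caller, because it changes the object that both references point to rather than the reference itself.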

Objects

  • The actual data/structure that is stored on the heap starts with what's commonly called object header.
  • The header contains a (compressed) class pointer.
  • The class pointer points to an internal data structure that defines the layout of the class.
  • That class layout is stored in a separate memory area called Metaspace (or Compressed Class Space if compressed class pointers are enabled, which goes hand in hand with Compressed OOPs).
  • The pointer can be 4 or 8 bytes, depending on the architecture. Even on 64-bit systems it's usually 4 bytes, thanks to the compressed class pointers optimization.

Root Objects

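An object is a root object if it is referenced by:

  • A local variable or parameter of a method on an active thread's stack
  • A static field of a loaded class
  • A JNI (native code) reference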

Reachable Objects

An object is called reachable if it can be reached from a root object, either directly or indirectly through other objects.

For example,

  • P -> O
  • P -> Q -> O

O is reachable from P in both cases.
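Reachability is a graph traversal starting from the roots, which is essentially what the mark phase of a collector computes. The sketch below models objects as strings and references as a map; it is a toy model under those assumptions, not JVM code.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ReachabilitySketch {
    /** Returns every object reachable from the roots by following references. */
    static Set<String> reachable(Set<String> roots, Map<String, List<String>> references) {
        Set<String> marked = new HashSet<>(roots);
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            for (String target : references.getOrDefault(current, List.of())) {
                if (marked.add(target)) { // true only the first time we see this object
                    pending.push(target);
                }
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        // P -> Q -> O, plus X -> O where X is not referenced by any root.
        Map<String, List<String>> refs = Map.of(
                "P", List.of("Q"),
                "Q", List.of("O"),
                "X", List.of("O"));
        System.out.println(reachable(Set.of("P"), refs)); // contains P, Q and O but not X
    }
}
```

Anything left unmarked (here X) is unreachable and therefore eligible for collection.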

Lifecycle

Box smallBox;

Unlike C++, the above statement won't create an object. It only creates a reference variable, which must then be assigned an object.

Box smallBox = new Box();

Now smallBox holds the address of an object of the Box class; that is, it points to the object. The object itself, however, has no name.

  • Memory is allocated on the heap, and a reference to that object is returned and stored on the stack.
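Since a variable only holds a reference, assigning it to another variable copies the reference, not the object. A short sketch, where the `Box` class and its `size` field are assumptions made up for the example:

```java
class ReferenceDemo {
    static class Box {
        int size;
    }

    /** Writes through a second reference and reads the value back through the first. */
    static int sizeSeenThroughOriginal() {
        Box smallBox;         // only a reference variable; no Box object exists yet
        smallBox = new Box(); // object allocated on the heap, reference stored in smallBox
        Box alias = smallBox; // copies the reference, not the object
        alias.size = 5;
        return smallBox.size;
    }

    public static void main(String[] args) {
        System.out.println(sizeSeenThroughOriginal()); // 5
    }
}
```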

Static Members

Static variables are not garbage collected as long as their class is loaded in memory.

Static variables are referenced by Class objects (❗instances of java.lang.Class, not ordinary instances of the class), which are in turn referenced by ClassLoaders. So static variables can't become eligible for garbage collection while the class is loaded. They can be collected when the respective class loader is itself collected.

String

String Constant Pool

Special memory area in which string literals are stored

String Initialization using literals

Creating a String literal → the JVM checks the String Constant Pool.

  • Exists => Points to the same 'literal'
  • Doesn't Exist => New instance created

String Initialization using new

String empty = new String(); // an empty string "", not null
String str = new String("Kya challa?");

New object created irrespective of whether the literal already exists or not
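The difference between the two forms of initialization shows up when comparing references with ==. A minimal sketch (the class and method names are made up):

```java
class StringPoolDemo {
    /** Two identical literals refer to the same pooled instance. */
    static boolean literalsShareInstance() {
        String a = "hello";
        String b = "hello";
        return a == b;
    }

    /** new String(...) creates a distinct object with equal contents. */
    static boolean newCreatesDistinctObject() {
        String a = "hello";
        String b = new String("hello");
        return a != b && a.equals(b);
    }

    /** intern() returns the pooled instance for the same contents. */
    static boolean internReturnsPooledInstance() {
        return new String("hello").intern() == "hello";
    }

    public static void main(String[] args) {
        System.out.println(literalsShareInstance());       // true
        System.out.println(newCreatesDistinctObject());    // true
        System.out.println(internReturnsPooledInstance()); // true
    }
}
```

This is also why string contents should be compared with `equals` rather than `==`.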

Pointers

Why doesn't Java have pointers?

See Program Components: Case against pointers.

Considering the points made there:

  • Java has a robust security model and disallows pointer arithmetic for this reason. It would be impossible for the JVM to ensure that code containing pointer arithmetic is safe without expensive runtime checks.

  • Java instead provides very good automatic garbage collection which takes care of memory management for you.

    For many people who had previously been forced to deal with Memory Management#Manual Memory Management in Pascal/C/C++ this was one of the biggest advantages of Java when it launched.

  • Complexity

    • Array access via pointer offsets: Java does this via indexed array access, so you don't need pointers. A big advantage of Java's indexed array access is that it detects and disallows out-of-bounds array access, which can be a major source of bugs. This is generally worth paying the price of a tiny bit of runtime overhead.

    • References to objects: Java has this, it just doesn't call them pointers. Any normal object reference works as one of these. When you do String s="Hello"; you get what is effectively a pointer to a string object.

    • Passing argument by reference, i.e. passing a reference which allows you to change the value of a variable in the caller's scope: Java doesn't have this, but it's a pretty rare use case and can easily be done in other ways. This is in general equivalent to changing a field in an object scope that both the caller and callee can see.
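Two of these points can be sketched in code: bounds-checked array access, and emulating pass-by-reference by mutating a field of an object that both caller and callee can see. The `IntHolder` class is a hypothetical helper written for this example, not a standard library type.

```java
class NoPointersDemo {
    /** A tiny mutable wrapper, visible to both caller and callee. */
    static final class IntHolder {
        int value;

        IntHolder(int value) {
            this.value = value;
        }
    }

    /** Changes the caller-visible value by mutating the shared holder. */
    static void increment(IntHolder holder) {
        holder.value++;
    }

    /** Returns true if writing past the end of an array is rejected at run time. */
    static boolean outOfBoundsIsRejected() {
        int[] data = new int[3];
        try {
            data[3] = 1; // valid indices are 0..2
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        IntHolder counter = new IntHolder(41);
        increment(counter);
        System.out.println(counter.value);           // 42
        System.out.println(outOfBoundsIsRejected()); // true
    }
}
```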

From the Sun white paper The Java Language Environment:

Most studies agree that pointers are one of the primary features that enable programmers to inject bugs into their code. Given that structures are gone, and arrays and strings are objects, the need for pointers to these constructs goes away. Thus, Java has no pointer data types. Any task that would require arrays, structures, and pointers in C can be more easily and reliably performed by declaring objects and arrays of objects. Instead of complex pointer manipulation on array pointers, you access arrays by their arithmetic indices. The Java run-time system checks all array indexing to ensure indices are within the bounds of the array. You no longer have dangling pointers and trashing of memory because of incorrect pointers, because there are no pointers in Java.
