The History of Garbage Collectors in the JVM: Managing Memory Efficiently

Published in DevOps.dev, Jan 9, 2023

Software developers need a deep understanding of how memory is managed.

Dynamic allocation is one of the processes that must be managed carefully to avoid memory leaks. Static allocation places data on the stack and requires its size to be known at compile time, whereas dynamic allocation uses heap memory, whose size can be determined at run time.

In C or C++, developers can dynamically allocate memory from the heap, with the size decided at run time. They must MANUALLY de-allocate that memory when it is no longer in use. Manual de-allocation invites human errors, including:

causing memory leaks
accessing memory that has already been released
de-allocating memory that has already been released

A Garbage Collector (a.k.a. GC) avoids these human errors by automatically reclaiming unused memory. Languages such as Java and JavaScript ship with garbage collectors that run internally, so developers do not need to de-allocate memory manually. Before taking advantage of a GC, we need to understand how it works and what problems it can introduce.

Mark And Sweep

The Mark and Sweep algorithm checks whether each object can be reached by following references from the root space. If an object cannot be reached from the root space, it is eliminated. The algorithm runs in two steps. First, reachable objects are marked. Second, unmarked objects are swept.
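To make the two steps concrete, here is a minimal sketch in Java. It is a toy model, not how the JVM implements it: the `Node` class, the `heap` list, and the `collect` helper are all names assumed for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

class MarkSweepDemo {
    // Toy heap object (hypothetical): a name, outgoing references, and a mark flag.
    static class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        boolean marked = false;

        Node(String name) { this.name = name; }
    }

    // Step 1 (mark): flag every object reachable from the roots.
    static void mark(List<Node> roots) {
        Deque<Node> stack = new ArrayDeque<>(roots);
        while (!stack.isEmpty()) {
            Node n = stack.pop();
            if (n.marked) continue;          // already visited, also handles cycles
            n.marked = true;
            for (Node r : n.refs) stack.push(r);
        }
    }

    // Step 2 (sweep): remove unmarked objects and reset the mark on survivors.
    static List<String> sweep(List<Node> heap) {
        List<String> freed = new ArrayList<>();
        for (Iterator<Node> it = heap.iterator(); it.hasNext(); ) {
            Node n = it.next();
            if (n.marked) {
                n.marked = false;            // ready for the next cycle
            } else {
                freed.add(n.name);
                it.remove();
            }
        }
        return freed;
    }

    static List<String> collect(List<Node> heap, List<Node> roots) {
        mark(roots);
        return sweep(heap);
    }

    public static void main(String[] args) {
        Node a = new Node("A"), b = new Node("B"), c = new Node("C"), d = new Node("D");
        a.refs.add(b);                       // A -> B, reachable from the root
        c.refs.add(d);                       // C <-> D reference each other,
        d.refs.add(c);                       // but nothing reachable points to them
        List<Node> heap = new ArrayList<>(List.of(a, b, c, d));
        System.out.println("freed = " + collect(heap, List.of(a))); // prints freed = [C, D]
    }
}
```

Note that C and D are swept even though they reference each other. Reachability from the roots is what counts, not whether something still points at the object.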

https://lambda.uta.edu/cse5317/notes/node47.html

JVM GC

In the JVM (Java Virtual Machine), the heap is divided into two parts: the young generation and the old generation.

The GC executed in the young generation is called Minor GC, and in the old generation, it is called Major GC.

The young generation

The young generation is further divided into three spaces: Eden, Survivor 0, and Survivor 1.

First, newly created objects are allocated in the Eden space. When the Eden space is full, a minor GC is executed.
Second, the Survivor spaces hold the objects that survive a minor GC. They follow one special rule: at any time, either Survivor 0 or Survivor 1 must be empty.
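If you want to see these spaces on a running JVM, the standard `java.lang.management` API exposes the heap memory pools. The sketch below simply lists them. The class and method names (`HeapPoolsDemo`, `heapPoolNames`) are my own, and the pool names printed depend on the collector and JVM version, so treat any specific name as an assumption. HotSpot also typically reports a single Survivor pool rather than two.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

class HeapPoolsDemo {
    // Collects the names of all memory pools that belong to the heap.
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) continue;
            MemoryUsage usage = pool.getUsage();   // may be null if the pool is not valid
            long used = (usage == null) ? -1 : usage.getUsed();
            System.out.println(pool.getName() + ": used=" + used + " bytes");
        }
    }
}
```

With a generational collector you would typically expect to see an Eden pool, a Survivor pool, and an old generation pool in the output.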

The Garbage Collecting on JVM

Before we introduce the GCs, you should know the concept of “Stop the World” (STW). An STW pause means that the JVM stops the running application in order to perform a GC.

Let’s check out the process of GC.

First, new objects are allocated to the Eden space.

When the Eden space is full, the first minor GC is executed. Objects regarded as reachable are moved to Survivor 0, and their age-bit is incremented.

When the Eden space fills up again, another minor GC is executed. This time the reachable objects in Eden and Survivor 0 are moved to Survivor 1, and their age-bit is incremented again. The Survivor spaces are filled alternately in this way until an object's age-bit reaches a specific threshold.

When the age-bit of an object reaches the threshold, the JVM GC treats the object as old and moves it to the old generation area. This procedure is called promotion. In Java 8’s Parallel GC, the default maximum threshold is 15.
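The flow so far (allocate in Eden, copy survivors back and forth, promote by age) can be sketched as a small simulation. This is a simplified model under my own assumptions: the class `GenerationalDemo`, the `Obj` type, and a threshold of 3 are hypothetical, and an object is promoted as soon as its age reaches the threshold. The real JVM copies memory regions rather than moving list entries.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class GenerationalDemo {
    // Toy object with an age counter (stands in for the age-bit).
    static class Obj {
        final String name;
        int age = 0;
        Obj(String name) { this.name = name; }
    }

    static final int THRESHOLD = 3;            // hypothetical, kept small for the demo

    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>();        // the Survivor space currently in use
    List<Obj> to = new ArrayList<>();          // the Survivor space kept empty
    final List<Obj> old = new ArrayList<>();

    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o);                           // new objects always start in Eden
        return o;
    }

    // Minor GC: copy reachable objects out of Eden and the occupied Survivor space.
    void minorGc(Predicate<Obj> isReachable) {
        List<Obj> candidates = new ArrayList<>(eden);
        candidates.addAll(from);
        for (Obj o : candidates) {
            if (!isReachable.test(o)) continue;    // unreachable: simply not copied
            o.age++;
            if (o.age >= THRESHOLD) old.add(o);    // promotion
            else to.add(o);
        }
        eden.clear();
        from.clear();
        List<Obj> tmp = from;                  // swap roles so one Survivor stays empty
        from = to;
        to = tmp;
    }

    public static void main(String[] args) {
        GenerationalDemo heap = new GenerationalDemo();
        Obj longLived = heap.allocate("cache");
        for (int cycle = 1; cycle <= 3; cycle++) {
            heap.allocate("temp" + cycle);         // short-lived garbage
            heap.minorGc(o -> o == longLived);     // only "cache" is reachable
            System.out.println("cycle " + cycle + ": survivor=" + heap.from.size()
                    + " old=" + heap.old.size());
        }
    }
}
```

In this run the long-lived object sits in a Survivor space for two cycles and lands in the old generation on the third, while every temporary object disappears at the first minor GC after its allocation.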

Over time, the old generation also fills up. When that happens, a Major GC runs and reclaims unused memory using the Mark and Sweep algorithm. A Major GC takes more time than a Minor GC.

So why is the heap divided into two areas, the young generation and the old generation? The reason is that GC designers realized that the majority of objects have very short lifetimes. Because most objects die young, collecting the young generation separately lets the GC run earlier and faster.

Several GCs can be used by the JVM:

1. Serial GC

Serial GC performs garbage collection using a single thread. Because all the work happens on that one thread, Stop the World pauses tend to be long. Serial GC is suited to single-threaded environments with a fairly small heap.

While the single Serial GC thread runs, the application is paused, which shows up as long latency. For this reason, later collectors focus on minimizing pause time.

2. Parallel GC

Parallel GC performs garbage collection with multiple threads and is the default collector in Java 8. Thanks to the multiple threads, Stop the World time became shorter. Parallel GC can be used in a multi-core environment to improve collection speed.

Multiple threads lowered the STW pause time, but there was still room for improvement.

3. CMS GC

The Concurrent Mark Sweep (CMS) collector is designed for applications that prefer shorter garbage collection pauses and that can afford to share processor resources with the garbage collector while the application is running. Similar to the other collectors, both minor and major collections occur in the CMS collector.

During each major collection cycle, the CMS collector pauses all the application threads for a brief period at the beginning of the collection and again toward the middle of the collection. The second pause tends to be the longer of the two pauses. Multiple threads are used to do the collection work during both pauses.

Minor collections can interleave with an ongoing major cycle, and are done in a manner similar to the parallel collector; the application threads are stopped during minor collections.

4. G1 GC

The G1 garbage collector is targeted at multiprocessor machines with large amounts of memory. It aims for high performance while reducing the overall pause time.

G1 GC has been the default since Java 9. It attempts to meet GC pause-time goals with high probability while providing high throughput. For instance, whole-heap operations like global marking are performed concurrently with application threads. This prevents interruptions that are proportional to the heap or live-data size.
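To check which collector your JVM actually picked, you can ask the standard `GarbageCollectorMXBean`s. The sketch below is only a quick probe. The class name `ActiveCollectors` and the helper `collectorNames` are my own, and the bean names printed vary by collector and JVM version.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Try running with different collectors selected, for example:
//   java -XX:+UseSerialGC   ActiveCollectors.java
//   java -XX:+UseParallelGC ActiveCollectors.java
//   java -XX:+UseG1GC       ActiveCollectors.java
class ActiveCollectors {
    // Names of the collectors registered in this JVM.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // getCollectionCount / getCollectionTime return -1 when undefined.
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", time=" + gc.getCollectionTime() + " ms");
        }
    }
}
```

Generational collectors usually register one bean for young collections and one for old collections, which lines up with the Minor GC and Major GC split described earlier.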

5. ZGC

ZGC intends to support large heap sizes with low application pause times. To reach this goal, it uses techniques such as colored 64-bit references, load barriers, relocation, and remapping.

colored 64-bit references: A reference represents the position of a byte in virtual memory, but not all 64 bits are needed for the address. ZGC uses a few of the remaining bits to store properties (the "color") of the reference.
load barriers: A load barrier is a piece of code that runs when a thread loads a reference from the heap.
relocation: To avoid long pause times, ZGC does most of the relocation work concurrently with the application.
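The three ideas above can be tied together in a toy sketch. Everything here is an assumption for illustration: the bit layout (42 address bits, three color bits), the `forwarding` map, and the `loadBarrier` method are hypothetical, and real ZGC barriers are machine code emitted by the JIT compiler with a layout that differs between JDK versions.

```java
import java.util.HashMap;
import java.util.Map;

class ColoredPointerDemo {
    // Hypothetical layout: low 42 bits hold the address, the next bits hold the color.
    static final int ADDRESS_BITS = 42;
    static final long ADDRESS_MASK = (1L << ADDRESS_BITS) - 1;
    static final long MARKED0  = 1L << 42;
    static final long MARKED1  = 1L << 43;
    static final long REMAPPED = 1L << 44;
    static final long COLOR_MASK = MARKED0 | MARKED1 | REMAPPED;

    // The color the current GC phase considers up to date.
    static long goodColor = REMAPPED;

    // Old address -> new address for objects that were relocated.
    static final Map<Long, Long> forwarding = new HashMap<>();

    static long address(long ref) { return ref & ADDRESS_MASK; }
    static long color(long ref)   { return ref & COLOR_MASK; }

    // Runs whenever a reference is loaded from a (toy) heap field.
    static long loadBarrier(long[] fields, int index) {
        long ref = fields[index];
        if (color(ref) == goodColor) {
            return ref;                               // fast path: nothing to do
        }
        // Slow path: follow the forwarding entry if the object has moved.
        long addr = forwarding.getOrDefault(address(ref), address(ref));
        long healed = addr | goodColor;
        fields[index] = healed;                       // fix the field for later loads
        return healed;
    }

    public static void main(String[] args) {
        long[] fields = { 0x1000L | MARKED0 };        // stale color, old address
        forwarding.put(0x1000L, 0x8000L);             // object was relocated
        long ref = loadBarrier(fields, 0);
        System.out.printf("address=0x%x remapped=%b%n",
                address(ref), color(ref) == REMAPPED); // prints address=0x8000 remapped=true
    }
}
```

The point of the design is that the application thread repairs a stale reference at the moment it loads it, so relocation does not need one long pause to update every reference.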

In the following posts, let’s explore in depth how G1 GC and ZGC work.