▸ JDK 25 · Simple Language · Deep Concepts

JVM Definitions — Plain English

Every card: plain English first → then how it actually works under the hood → flip for the interview answer
01 System Startup & Bootstrapping · 3 terms
JNI_CreateJavaVM() · Startup
🟡 What it is — plain English Think of it as the power-on button for the JVM. When you type java MyApp, the OS doesn't know what Java is — it just runs a native program. That native program calls JNI_CreateJavaVM() which does everything needed to bring the JVM to life: checking your flags, picking a GC, setting up memory.
⚙️ Under the hood It's a C++ function inside libjvm.so. It runs before any Java code ever executes. Steps: validate JVM args → profile CPU/RAM → select GC algorithm → allocate Metaspace → initialize thread system → hand control to the Bootstrap ClassLoader to load java.lang.*.
🔗 Real analogy Like turning on a PC. The BIOS (JNI_CreateJavaVM) runs first, sets everything up — screen, disk, memory — then hands off to the OS (your Java program).
Say it in an interview · Startup
✅ Strong answer "JNI_CreateJavaVM is the native C++ entry point that bootstraps the JVM. When you run java MyApp, the OS calls this function, which validates JVM flags, profiles hardware to auto-select a GC, allocates Metaspace, and initialises the threading system — all before a single line of Java runs."
⚠️ Follow-up they might ask "What happens between typing java MyApp and main() being called?"
OS → JNI_CreateJavaVM → Bootstrap Loader loads java.lang.* → your main class is loaded (3-phase pipeline) → <clinit> runs → main() is called.
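The tail of that pipeline is easy to demonstrate: a static initializer block always completes before main() runs, because the compiler folds it into <clinit>. A minimal sketch (class and field names are illustrative):

```java
public class StartupOrder {
    static final StringBuilder LOG = new StringBuilder();

    // Collected into <clinit> by the compiler; runs during class
    // initialization, before main() is ever invoked.
    static { LOG.append("clinit;"); }

    public static void main(String[] args) {
        LOG.append("main;");
        System.out.println(LOG); // prints "clinit;main;"
    }
}
```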
Metaspace · Startup
🟡 What it is — plain English Metaspace is where the JVM stores blueprints of your classes — not the objects themselves, just the class definitions (field names, method signatures, bytecode). It lives completely outside the Java heap, in native OS memory.
⚙️ Under the hood Split into two sub-regions: Compressed Class Space (contiguous, 32-bit shifted pointers, stores Klass structs only) and Non-Class Metaspace (fragmented okay, full 64-bit pointers, stores bytecode + Constant Pool + annotations). Replaced PermGen in JDK 8 — auto-grows instead of being fixed size. Cap with -XX:MaxMetaspaceSize.
🔗 Real analogy Java objects = houses. Metaspace = the city planning office that stores the architectural blueprints. You can build a million houses (objects on heap) but there's only one blueprint per house type (one Klass per class in Metaspace).
Say it in an interview · Startup
✅ Strong answer "Metaspace is native off-heap memory that stores class metadata — Klass structs, bytecode, Constant Pools, annotations. It replaced PermGen in JDK 8. The key difference: PermGen had a fixed max size, so you'd get OutOfMemoryError: PermGen space in Spring/Hibernate-heavy apps. Metaspace auto-expands, but you should still cap it with -XX:MaxMetaspaceSize to prevent unbounded native memory growth."
⚠️ Watch out Heavy use of dynamic proxies, reflection, or runtime code generation (Spring AOP, Hibernate, bytecode weavers) constantly loads new generated classes → Metaspace grows. This is still a real OOM risk even without PermGen.
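The mechanism behind that growth can be seen in miniature with java.lang.reflect.Proxy: each distinct (ClassLoader, interface-set) pair materialises a brand-new class whose metadata lives in Metaspace. A small sketch (Greeter and newProxy are made-up names):

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

public class ProxyDemo {
    interface Greeter { String greet(String name); }

    static Greeter newProxy() {
        InvocationHandler h = (proxy, method, methodArgs) -> "hi " + methodArgs[0];
        // Generates (or reuses) a proxy class such as $Proxy0 whose
        // metadata lives in Metaspace, not on the heap.
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] { Greeter.class },
                h);
    }

    public static void main(String[] args) {
        System.out.println(newProxy().greet("JVM")); // hi JVM
    }
}
```

Frameworks that generate proxies against fresh ClassLoaders per deployment are the ones that leak: the old proxy classes stay pinned until their loader is collected.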
GC Auto-Selection · Startup
🟡 What it is — plain English The JVM looks at your machine specs (how many CPU cores, how big is the heap) and automatically picks the best garbage collector for you. You don't have to choose — but you can override it.
⚙️ Under the hood Default is G1GC since JDK 9 for most machines. The JVM uses ergonomics — a set of heuristics — to pick: heap size hints from the OS, available processors, and whether you're in a container (cgroups). ZGC is never auto-selected; you opt in explicitly with -XX:+UseZGC when you need sub-millisecond pauses.
🔗 Override options -XX:+UseG1GC (default, balanced) · -XX:+UseZGC (sub-ms pauses, JDK 21+) · -XX:+UseShenandoahGC (RedHat GC, similar to ZGC) · -XX:+UseSerialGC (tiny heap, CLI tools)
Say it in an interview · Startup
✅ Strong answer "The JVM uses ergonomics to auto-select a GC at startup based on hardware. G1GC is the default since JDK 9 — it's a good balance of throughput and pause time for most server workloads. For latency-critical applications (trading, gaming), I'd override with ZGC which targets sub-millisecond pauses regardless of heap size."
⚠️ Gotcha in containers Without JVM container-awareness flags, the JVM reads the host's CPU/RAM, not the container's cgroup limits. JDK 10+ fixes this automatically. On older JDKs, the JVM might think it has 64 cores when the container only has 2 → over-threads GC → performance disaster.
02 Class Loading Pipeline · 7 terms
Parent Delegation Model · Class Loading
🟡 What it is — plain English When the JVM needs to load a class, it always asks the parent loader first before trying itself. It's like a kid asking dad before doing it himself. This chain goes: your App Loader → Platform Loader → Bootstrap Loader. Only if everyone above fails does the child try.
⚙️ Under the hood This is a security mechanism. Without it, a malicious java/lang/String.class on your classpath could shadow the real one. Bootstrap Loader (native, no Java parent) always wins for core JDK classes — it's the only one that can load them. Two classes are equal only if their bytecode AND their ClassLoader instance are identical. Same class loaded by two different loaders = two completely different types.
🔗 Where it breaks down OSGi, hot-reload frameworks, and servlet containers deliberately break delegation (child-first loading) so each module/webapp gets its own isolated class version. This is also why you get ClassCastException with "same" class names across different ClassLoaders.
Say it in an interview · Class Loading
✅ Strong answer "The delegation model means every ClassLoader asks its parent first — App → Platform → Bootstrap. This prevents user code from hijacking core JDK classes. Crucially, class identity in the JVM = bytecode + ClassLoader, not just the class name. Same .class file loaded by two different loaders produces two incompatible types."
⚠️ Classic interview trap "What does String.class.getClassLoader() return?"
null — because String is loaded by the Bootstrap ClassLoader which is native (C++) and has no Java representation. null == Bootstrap Loader.
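Both facts are checkable in a few lines: walking the parent chain bottoms out at null, which is the Java-side stand-in for the native Bootstrap loader (the class name LoaderChain is illustrative):

```java
public class LoaderChain {
    public static void main(String[] args) {
        // Walk the delegation chain upwards; a null parent means Bootstrap.
        ClassLoader l = LoaderChain.class.getClassLoader();
        while (l != null) {
            System.out.println(l.getName());
            l = l.getParent();
        }
        System.out.println("null -> Bootstrap ClassLoader (native)");
        System.out.println(String.class.getClassLoader()); // null
    }
}
```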
Klass (C++ Runtime Blueprint) · Class Loading
🟡 What it is — plain English A Klass is the JVM's internal C++ representation of your Java class, stored in Metaspace. Every object on the heap has a tiny pointer back to its Klass so the JVM knows: what type is this? what methods does it have? what fields does it contain?
⚙️ Under the hood The Klass contains: vtable (array of method pointers for virtual dispatch), itable (interface method pointers), field layout offsets, references to bytecode in Non-Class Metaspace. In JDK 25 (JEP 519, compact object headers, enabled with -XX:+UseCompactObjectHeaders), each heap object's compressed Klass pointer is packed into the 64-bit Mark Word rather than stored as a separate field.
🔗 Analogy If Java objects are cars, the Klass is the car model blueprint at the factory. Every Toyota Corolla (object) shares one Corolla blueprint (Klass) in the factory (Metaspace). You can build a million Corollas; there's still only one blueprint.
Say it in an interview · Class Loading
✅ Strong answer "A Klass is the JVM's internal C++ class descriptor stored in Metaspace. It contains the vtable for virtual method dispatch, field layout offsets, and pointers to bytecode. Every heap object has a compressed pointer back to its Klass — this is how instanceof, reflection, and polymorphic method calls work at the JVM level."
⚠️ JDK 25 change JEP 519 makes compact object headers a production feature: with -XX:+UseCompactObjectHeaders, the Klass pointer is packed into the Mark Word instead of occupying a separate header field. This saves 4 bytes per object — massive at scale (100M objects = 400 MB saved).
Bytecode Verification · Class Loading
🟡 What it is — plain English Before the JVM runs any class, it checks the bytecode is safe — like airport security for code. It ensures the bytecode won't try to read random memory, corrupt the stack, or type-cast illegally. Done once at load time, never again.
⚙️ Under the hood The verifier performs data-flow analysis on each method's bytecode using stack map frames (metadata embedded in .class files since JDK 6). It checks: operand stack never underflows/overflows, every branch target is a valid instruction start, type of every value matches what the opcode expects, this is definitely initialised before use in constructors.
🔗 Why Java is memory-safe C/C++ has no verifier — a bug can read random memory. Java's verifier mathematically proves bytecode is safe before execution. This is the technical foundation of Java's "write once, run safely anywhere" claim.
Say it in an interview · Class Loading
✅ Strong answer "Bytecode verification is Java's safety guarantee. Before executing any class, the JVM verifier proves the bytecode is type-safe using data-flow analysis on stack map frames. It catches corrupt or malicious bytecode before a single instruction runs. This is why Java doesn't get buffer overflows — the verifier makes certain classes of memory bugs structurally impossible."
⚠️ Can be disabled -Xverify:none skips verification (deprecated since JDK 13) — historically used only in startup-speed-critical builds where you 100% trust the bytecode source. Never in production.
Preparation Phase · Class Loading
🟡 What it is — plain English The JVM allocates memory for all static fields and gives them safe default zero values — before any of your code runs. So int → 0, boolean → false, Object → null. Your actual values (like static int MAX = 100) come later.
⚙️ Under the hood This is the second sub-step of Linking. The JVM needs to guarantee that no field ever contains garbage memory — so it zeroes everything out first. Constants (static final int X = 5 where 5 is a compile-time literal) are the one exception — they're assigned during Preparation because the compiler already inlined them.
🔗 Common interview trap static int count = 100;
After Preparation: count = 0 ← safe default
After Initialization: count = 100 ← your value

This is why accessing static fields during class loading (e.g., circular static initialisation) can give you 0 unexpectedly.
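That circular-initialisation surprise can be reproduced directly. Assuming two mutually referencing classes (A and B are made-up names), whichever class is touched second reads the other's Preparation-time default of 0:

```java
public class CircularInit {
    // A's <clinit> triggers B's <clinit>, which reads A.value while A is
    // still mid-initialisation — so B sees the Preparation default, 0.
    static class A { static int value = B.value + 1; }
    static class B { static int value = A.value + 1; }

    public static void main(String[] args) {
        // Touching A first: B.value becomes 0 + 1 = 1, then A.value = 2.
        System.out.println(A.value + " " + B.value);
    }
}
```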
Say it in an interview · Class Loading
✅ Strong answer "Preparation allocates memory for static fields and assigns type-default zero values — not the programmer's values. That's Initialization's job. This two-phase design guarantees every field has a defined value before any code references it, preventing undefined behaviour from uninitialized memory reads."
⚠️ The one exception static final int CONSTANT = 42; where 42 is a compile-time constant literal — assigned during Preparation, not Initialization. The compiler inlines it everywhere it's used anyway, so this is mostly an academic detail.
Constant Pool Resolution · Class Loading
🟡 What it is — plain English A .class file doesn't store actual memory addresses — it stores symbolic names (like a phone book: "call ArrayList.add"). Resolution is the step where the JVM looks up those names and replaces them with real memory addresses so method calls become fast pointer jumps.
⚙️ Under the hood The Constant Pool is a table in the .class file holding strings like "java/util/ArrayList", "add:(Ljava/lang/Object;)Z". Resolution walks this table and swaps each string with a direct C++ pointer to the Klass or method. Lazy by default — a symbolic ref is resolved the first time that specific instruction executes, not all upfront. This avoids loading classes that are declared but never used.
🔗 Ripple effect Resolving a reference to ArrayList triggers loading ArrayList.class if it isn't loaded yet — which triggers its own Verification → Preparation → Resolution chain. Class loading is recursive and lazy.
Say it in an interview · Class Loading
✅ Strong answer "Resolution converts symbolic references in the Constant Pool — plain strings like method names — into direct runtime memory pointers. It's lazy: each reference is resolved the first time that bytecode instruction executes, not all at startup. This is why Java doesn't fail at startup for classes that are declared but never used."
⚠️ NoClassDefFoundError vs ClassNotFoundException ClassNotFoundException = class not found during loading. NoClassDefFoundError = class was found and compiled against, but missing at runtime during Resolution. The second is sneakier — it often means a dependency is on the compile classpath but not the runtime classpath.
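The laziness is observable: a class referenced only on an untaken branch is never initialised. A small sketch (Heavy and maybeUse are illustrative names):

```java
public class LazyResolution {
    static boolean heavyInitialised = false;

    static class Heavy {
        static { heavyInitialised = true; } // runs only on first real use
        static int ping() { return 42; }
    }

    // The symbolic reference to Heavy.ping is in the bytecode either way,
    // but it's only resolved (and Heavy only initialised) if the branch
    // actually executes.
    static int maybeUse(boolean use) {
        return use ? Heavy.ping() : 0;
    }
}
```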
<clinit> — Class Initializer · Class Loading
🟡 What it is — plain English <clinit> is a hidden method the compiler generates to run all your static blocks and set your static field values. It runs exactly once — the very first time anyone uses the class — and the JVM guarantees only one thread can run it.
⚙️ Under the hood The compiler collects all static { } blocks and static field = value assignments in textual order and puts them into <clinit>. The JVM guarantees: thread-safe execution (only one thread runs it; others block), exactly-once (idempotent), happens-before (all threads see the initialised state after it completes). This is the basis of the safe publication guarantee for static fields.
🔗 Design pattern it enables Initialization-on-Demand Holder singleton pattern works entirely because <clinit> is thread-safe and lazy:
private static class Holder { static final S INSTANCE = new S(); }
public static S get() { return Holder.INSTANCE; }
The inner class Holder is only initialised (and <clinit> only runs) on the first call to get().
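Fleshed out into a complete, runnable version (Singleton is a stand-in name):

```java
public final class Singleton {
    private Singleton() { }

    // Holder's <clinit> runs only on the first get() call, and the JVM's
    // per-class initialisation lock makes that single construction
    // thread-safe with no synchronized keyword anywhere.
    private static class Holder {
        static final Singleton INSTANCE = new Singleton();
    }

    public static Singleton get() { return Holder.INSTANCE; }
}
```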
Say it in an interview · Class Loading
✅ Strong answer "<clinit> is the compiler-generated static initializer. It runs all static blocks and assignments in textual order, exactly once per class, with JVM-level thread safety. This is the mechanism behind safe lazy initialisation — the Initialization-on-Demand Holder pattern exploits it to get a thread-safe singleton with zero synchronisation overhead after first use."
⚠️ Deadlock risk If two classes' <clinit> methods reference each other (circular static init), the JVM can deadlock — thread A holds the init lock for class X waiting for Y, thread B holds the lock for Y waiting for X.
Bootstrap ClassLoader · Class Loading
🟡 What it is — plain English The root loader — the one that started everything. Written in native C++, not Java. It loads the foundational JDK classes (String, Object, Integer…) that everything else depends on. It has no parent — it IS the top of the chain.
⚙️ Under the hood Because it's native, it has no Java object representation. That's why String.class.getClassLoader() returns null — null is the Java API's signal for "Bootstrap". It loads from the JVM's internal boot module path (the core java.* modules). You cannot subclass or replace it from Java code.
🔗 Quick test you can run System.out.println(String.class.getClassLoader()); // null = Bootstrap
System.out.println(MyApp.class.getClassLoader()); // jdk.internal.loader.ClassLoaders$AppClassLoader
Say it in an interview · Class Loading
✅ Strong answer "Bootstrap ClassLoader is the root of the hierarchy, implemented in native C++. It loads core JDK classes. Calling getClassLoader() on a Bootstrap-loaded class returns null — that's the Java API convention for 'this loader is Bootstrap'. It cannot be accessed or replaced from Java."
⚠️ Security implication This is why you can't inject a fake java.lang.String — the Bootstrap loader loads it from the JVM's internal trusted location, not your classpath. The delegation model ensures this happens before your App loader even gets a chance.
03 Memory Architecture · 4 terms
Compressed Class Space · Memory
🟡 What it is — plain English A contiguous block of memory in Metaspace that holds only Klass definitions. Every object on the heap has a tiny pointer into this space. "Compressed" means these pointers are 32-bit (half the size of normal 64-bit pointers) to save memory.
⚙️ Under the hood The compression trick: store a 32-bit value X, then compute the real 64-bit address as base + (X << 3). This only works if all Klasses sit in one contiguous block (so the base address is fixed). A 32-bit offset with a 3-bit shift could address 2^32 × 8 B = 32 GB, but HotSpot caps this space at 3 GB in practice.
🔗 Real saving Without compression, each object would need an 8-byte Klass pointer. With compression, it's 4 bytes (packed into the Mark Word in JDK 25). 100 million objects = 400 MB saved just from pointer compression.
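The decode arithmetic itself is one line. A toy model (the base address and offsets are illustrative numbers, not values read from a live JVM):

```java
public class CompressedPtr {
    // Real 64-bit Klass address = fixed base + (32-bit offset << 3).
    static long decode(long base, long narrowOffset, int shift) {
        return base + (narrowOffset << shift);
    }

    public static void main(String[] args) {
        long base = 0x8_0000_0000L; // pretend start of Compressed Class Space
        System.out.println(Long.toHexString(decode(base, 0x10, 3)));
    }
}
```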
Say it in an interview · Memory
✅ Strong answer "Compressed Class Space is the contiguous Metaspace sub-region holding Klass structs. It uses 32-bit shifted pointers instead of 64-bit, which saves 4 bytes per heap object. The contiguity requirement is what makes the arithmetic possible — a fixed base address plus a shifted 32-bit offset equals the full 64-bit Klass address."
⚠️ When it's disabled If Metaspace needs to grow beyond 3 GB (very unusual), UseCompressedClassPointers is automatically disabled. You pay the full 8-byte pointer cost per object. Avoid this by profiling Metaspace growth.
G1GC Heap Region · Memory
🟡 What it is — plain English G1GC divides the entire Java heap into equal-sized chunks called regions (1–32 MB each). At any moment, each region plays a role: Eden (new objects), Survivor (objects that survived GC), Old (long-lived objects), or Humongous (giant objects). Regions can change roles between GC cycles.
⚙️ Under the hood JVM targets ~2048 total regions. With a 16 GB heap: 16 384 MB / 2048 ≈ 8 MB per region. The flexibility of role-changing is G1's superpower: it can decide at each collection which regions are worth collecting based on how much garbage they contain — this is the "Garbage First" in its name.
🔗 Why this beats old generational GC Old GC (CMS, ParNew): fixed Young/Old boundaries → must scan all of Young or all of Old. G1: fluid boundaries → collect just the top 10 most garbage-dense regions → done. Predictable pause times.
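The region-sizing arithmetic can be sketched as a heuristic. This is a simplification of the idea (power-of-two size, 1–32 MB, aiming at ~2048 regions), not HotSpot's actual selection code:

```java
public class G1RegionMath {
    // Pick the smallest power-of-two region size (capped at 32 MB)
    // that yields roughly 2048 regions for the given heap.
    static int regionSizeMB(long heapMB) {
        long target = heapMB / 2048;
        int size = 1;
        while (size < target && size < 32) {
            size <<= 1;
        }
        return size;
    }

    public static void main(String[] args) {
        System.out.println(regionSizeMB(16 * 1024)); // 16 GB heap -> 8 MB regions
    }
}
```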
Say it in an interview · Memory
✅ Strong answer "G1 splits the heap into equal-sized regions (1–32 MB) that can dynamically change their role — Eden today, Old tomorrow. This flexibility means G1 can choose exactly which regions to collect in each cycle, always picking the most garbage-dense ones first, which is how it honours the MaxGCPauseMillis budget."
⚠️ Tuning tip With the 16 GB heap above, -XX:G1HeapRegionSize=16m doubles the region size from 8 MB to 16 MB, which raises the Humongous allocation threshold (50% of a region) from 4 MB to 8 MB. Useful if your app allocates medium-large objects, to stop them going straight to Old Gen.
Code Cache · Memory
🟡 What it is — plain English A special memory region that stores JIT-compiled native machine code. Once the JVM has compiled a hot method to assembly, it stores it here. Next time that method is called, the JVM executes the cached assembly directly — no interpretation, near-CPU-native speed.
⚙️ Under the hood Three sub-heaps inside the Code Cache: non-method (JVM internal stubs), profiled (C1 Level 3 code with profiling instrumentation), non-profiled (C2 Level 4 optimised code). When the Code Cache fills up, the JVM emits a warning and stops JIT-compiling — all new code runs interpreted. This is a performance cliff that's hard to recover from without restarting.
🔗 Sizing rule of thumb For most Spring Boot apps: 256–512 MB is adequate. Large microservice meshes or JVM apps running many lambdas/inner classes may need up to 1 GB. Monitor with jcmd <pid> Compiler.codecache.
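You can also watch the Code Cache from inside the process via the standard MemoryPoolMXBean API. On HotSpot with the segmented code cache the pools are named like CodeHeap 'profiled nmethods' (exact names vary by JDK build, so this filters loosely):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;

public class CodeCacheProbe {
    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            // Matches the three segmented heaps, or the single pre-JDK-9 pool.
            if (name.contains("CodeHeap") || name.contains("Code Cache")) {
                System.out.printf("%s: %d KB used%n",
                        name, pool.getUsage().getUsed() / 1024);
            }
        }
    }
}
```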
Say it in an interview · Memory
✅ Strong answer "Code Cache stores JIT-compiled native assembly. Hot methods compiled by C2 execute directly from here — no interpreter overhead. It's separate from both heap and Metaspace. A full Code Cache silently kills JIT compilation; the JVM falls back to interpretation with no obvious error, causing a gradual performance regression that's hard to diagnose without GC/JIT logs."
⚠️ Invisible failure mode The JVM logs a warning when the Code Cache is full, but doesn't throw an exception. If you don't have JVM logging enabled, you'll see mysterious throughput degradation and wonder why. Always set -XX:ReservedCodeCacheSize=512m proactively.
Non-Class Metaspace · Memory
🟡 What it is — plain English The larger, more flexible part of Metaspace. If Compressed Class Space stores the class blueprint (Klass), Non-Class Metaspace stores everything else that describes a class: the actual bytecode of methods, the Constant Pool table, annotations, and JIT profiling data.
⚙️ Under the hood Uses full 64-bit pointers (no compression needed). Does NOT need to be contiguous — the OS can give it memory scattered across different address ranges. This is why Metaspace can grow by mapping new pages almost anywhere. In a typical Spring Boot app, Non-Class Metaspace is 5–10× larger than Compressed Class Space because bytecode and constant pools dwarf the Klass structs.
🔗 Monitor separately jcmd <pid> VM.metaspace shows both spaces separately. If Non-Class Metaspace grows unboundedly while Class Space stays flat, it means existing classes are adding metadata (e.g., new JIT profiling data, annotations being parsed repeatedly).
Say it in an interview · Memory
✅ Strong answer "Non-Class Metaspace holds bytecode, Constant Pools, and annotations — everything about a class except the Klass struct itself. It uses full 64-bit pointers and can be fragmented, unlike Compressed Class Space. In practice it's much larger than Class Space and is the first thing to bloat in annotation-heavy or reflection-heavy frameworks."
⚠️ Framework gotcha Spring, Hibernate, and Jackson scan and parse annotations on every class they process. Each parse materialises annotation data in Non-Class Metaspace. In a large app with thousands of annotated classes, this can be 50–100 MB of Non-Class Metaspace just for annotation metadata.
04 Thread Stack & Frames · 5 terms
JVM Stack · Stack
🟡 What it is — plain English Every thread gets its own private call stack. It's a pile of "what am I currently doing" cards — every method call adds a card on top, every method return removes the top card. Threads cannot see each other's stacks at all.
⚙️ Under the hood The JVM Stack lives in native OS memory (not the heap). Each thread's stack is a fixed contiguous block set by -Xss (default 512 KB–1 MB). Stack frames are pushed/popped in O(1). No GC ever touches the stack — local variables live and die with their frame. This is why local variables are inherently thread-safe: they exist only in one thread's private address range.
🔗 StackOverflowError Infinite recursion keeps pushing frames. The stack hits its -Xss limit → JVM throws StackOverflowError. Note: it's an Error not an Exception — the JVM considers this unrecoverable.
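Triggering it deliberately takes three lines, and counting frames gives a rough feel for how deep the default -Xss lets you go (the counter approach is illustrative; the exact depth varies with frame size and stack settings):

```java
public class StackDepth {
    static int depth = 0;

    static void recurse() {
        depth++;       // one frame pushed per call, never popped
        recurse();
    }

    public static void main(String[] args) {
        try {
            recurse();
        } catch (StackOverflowError e) {
            System.out.println("overflowed after ~" + depth + " frames");
        }
    }
}
```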
Say it in an interview · Stack
✅ Strong answer "Each thread has a private JVM Stack in native memory. Frames are pushed on method call and popped on return — O(1), no GC needed. Thread isolation means local variables in one thread's stack are physically inaccessible to other threads. This is why local variables are always thread-safe without synchronisation."
⚠️ Thread pool sizing With 1000 threads each at 1 MB stack: 1 GB of native memory consumed just for stacks — before any heap allocation. Virtual Threads (JDK 21+) solve this: they have tiny, growable stacks starting at ~1 KB and are mounted on carrier threads only when running.
Stack Frame (Activation Record) · Stack
🟡 What it is — plain English A Stack Frame is all the information needed to execute one method call: the local variables, the workspace for doing calculations, and the notes for returning to the caller. One frame exists per active method call.
⚙️ Under the hood Three parts: Local Variable Array (fixed-size indexed storage), Operand Stack (LIFO workspace for bytecode computation), Frame Data (Constant Pool pointer, return address, exception table). The frame's exact size — max operand stack depth + local variable count — is computed at compile time and embedded in the .class Code attribute. No runtime sizing needed.
🔗 The moment of method call result = add(5, 3);
1. New frame created for add, pushed onto stack
2. add runs inside its frame
3. Frame popped, result returned
4. Caller's frame resumes from its return address in Frame Data
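The four steps above, in plain Java — running javap -c on the compiled class shows the invokestatic instruction that triggers the frame push:

```java
public class Frames {
    static int add(int x, int y) {
        return x + y;               // executes inside add's own frame
    }

    public static void main(String[] args) {
        int result = add(5, 3);     // frame pushed here, popped on return
        System.out.println(result); // 8
    }
}
```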
Say it in an interview · Stack
✅ Strong answer "A Stack Frame is created per method invocation and contains three things: the Local Variable Array for storing method params and locals, the Operand Stack as the computation scratch space, and Frame Data for return address and exception dispatch. The frame's size is determined at compile time — the JVM knows exactly how much stack to allocate before the method even starts."
⚠️ Key insight for thread safety If a reference to an object is only in a Stack Frame's LVA and that object never gets assigned to a field or passed to another thread, C2's escape analysis can allocate that object on the stack itself — zero heap allocation, zero GC.
Local Variable Array (LVA) · Stack
🟡 What it is — plain English A numbered list of slots inside a Stack Frame that stores all the method's local variables and parameters. Slot 0 is always this (for instance methods). After that: parameters in order, then local variables declared in the method body.
⚙️ Under the hood Each slot is 4 bytes wide. long and double are 8 bytes → they occupy two consecutive slots. Bytecode instructions reference slots by index: iload_1 = push LVA[1] (an int) onto the Operand Stack. astore_3 = pop a reference from Operand Stack into LVA[3]. Indices are fixed at compile time — the JVM doesn't search by name.
🔗 Example slot layout Instance method: int sum(int a, long b)
LVA[0] = this
LVA[1] = a (int, 1 slot)
LVA[2–3] = b (long, 2 slots)
LVA[4+] = any local vars declared in body
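That layout corresponds to real code you can inspect: compile the class below with debug info (javac -g) and run javap -v to see the LocalVariableTable with those slot indices (method and variable names here are illustrative):

```java
public class SlotLayout {
    // Slot 0 = this, slot 1 = a, slots 2-3 = b (long takes two slots),
    // slots 4-5 = total. Inspect with: javap -v SlotLayout
    int sum(int a, long b) {
        long total = a + b;
        return (int) total;
    }
}
```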
Say it in an interview · Stack
✅ Strong answer "LVA is the fixed-size array of slots holding all locals and parameters. Index 0 is always this for instance methods — static methods start params at index 0. long/double take two consecutive slots. Bytecode uses numeric slot indices, not names — variable names are debug metadata only, stripped in optimised builds."
⚠️ Slot reuse The compiler can reuse LVA slots for variables in different scopes of the same method. If variable x is out of scope before variable y is declared, they might share the same slot. This is an optimisation to minimise frame size.
Operand Stack · Stack
🟡 What it is — plain English Think of the Operand Stack as a scratch paper for calculations. Bytecode instructions push values onto it and pop them off to do arithmetic. Every computation — adding two numbers, calling a method, creating an object — flows through this scratch pad.
⚙️ Under the hood The JVM is a stack-based virtual machine (unlike register-based VMs like Dalvik/ART or the actual CPU). Operations always work on the top of the stack. For a + b: push a → push b → iadd pops both and pushes result. The max depth of this stack is fixed at compile time (stored in the Code attribute). JIT compilation maps these stack slots to actual CPU registers for performance.
🔗 Bytecode trace for return a + b iload_1 ← push a (LVA[1]) onto OS
iload_2 ← push b (LVA[2]) onto OS
iadd ← pop a and b, push (a+b)
ireturn ← pop result, return to caller
Say it in an interview · Stack
✅ Strong answer "The Operand Stack is the JVM's computation workspace — a LIFO stack where bytecode pushes operands, executes operations, and reads results. Java bytecode is stack-based (not register-based), making it portable across CPU architectures. JIT compilation's job is to map this abstract stack model to actual CPU registers for real performance."
⚠️ JVM vs Android Android's Dalvik/ART VM uses a register-based bytecode format — operations specify source and destination registers directly. This reduces the number of bytecode instructions needed but makes the format CPU-specific. Java chose stack-based for portability; Android chose register-based for mobile CPU efficiency.
Frame Data (FD) · Stack
🟡 What it is — plain English Frame Data is the bookkeeping section of a Stack Frame — the notes that tell the JVM "where do I go when this method finishes?" and "what do I do if an exception is thrown?". It connects the current frame back to the rest of the JVM.
⚙️ Under the hood Three things: (1) Constant Pool reference — pointer to this class's runtime CP, needed to resolve any symbolic references during execution. (2) Return address — the exact bytecode offset in the calling method's frame to resume at after this method returns. (3) Exception dispatch table — a per-method table mapping bytecode-offset-ranges to catch-handler start addresses.
🔗 How exceptions find their handler An exception is thrown at bytecode offset 42. JVM looks up the exception table in Frame Data: does offset 42 fall in the range [10, 50) for a catch (IOException e) handler? If yes → jump to handler. If no → pop this frame, check the caller's exception table. Repeat until caught or stack is empty → unhandled exception.
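Stack unwinding in action: the two inner frames below have no matching exception-table entry, so they are popped until handle's frame catches (method names are illustrative):

```java
public class ExceptionDispatch {
    static String handle() {
        try {
            deep();                      // throw happens two frames down
            return "no exception";
        } catch (IllegalStateException e) {
            return "caught in handle";   // matched via this frame's table
        }
    }

    static void deep()   { deeper(); }   // no matching entry: frame popped
    static void deeper() { throw new IllegalStateException("boom"); }
}
```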
Say it in an interview · Stack
✅ Strong answer "Frame Data is the metadata part of a Stack Frame. It holds the Constant Pool reference for runtime symbol resolution, the return address for resuming the caller, and the exception dispatch table for try-catch handling. Exception propagation literally walks up the stack, checking each frame's exception table, until a handler is found or the stack is exhausted."
⚠️ Performance note Throwing exceptions is expensive precisely because of this table lookup + stack unwinding. Exceptions should model exceptional conditions — not control flow. Using exceptions as a loop exit condition is a common Java anti-pattern with measurable overhead.
05 JIT Compilation · 5 terms
Tiered Compilation · JIT
🟡 What it is — plain English The JVM doesn't compile everything at startup (too slow). Instead, it watches which methods get called a lot, then progressively compiles those hot methods into faster and faster native code over time. Like promoting a chef: start as dishwasher, prove yourself, get promoted to sous-chef, then head chef.
⚙️ Under the hood 5 levels. Methods start at Level 0 (interpreter). Two counters trigger promotion: invocation counter (i) and backedge counter (b, counts loop iterations). At ~200 invocations → Level 3 (C1 + full profiling). At ~5000+ → Level 4 (C2 aggressive). The profiling data collected at Level 3 is what gives C2 the information it needs to make speculative optimisations. Methods can also be deoptimised and re-queued if assumptions break.
🔗 Why Java "warms up" A brand-new JVM runs everything interpreted. After a few minutes of traffic, hot paths reach Level 4. Performance improves over time. This is why Java benchmarks must always include a warmup phase — measuring cold startup is measuring the interpreter, not the JIT.
Say it in an interview · JIT
✅ Strong answer "Tiered compilation is the JVM's progressive compilation strategy. Methods start interpreted, get C1-compiled at ~200 invocations (with profiling instrumentation), and reach C2 at ~5000+ invocations. The key insight: C1's profiling data tells C2 what actually happens at runtime — types, branches, call sites — enabling speculative optimisations impossible with static compilers."
⚠️ Benchmarking gotcha If you benchmark without warmup, you're measuring the interpreter (Level 0) or early C1. Use JMH (Java Microbenchmark Harness) which handles warmup automatically. Never use System.currentTimeMillis() in a loop to benchmark JVM code.
click to flip back
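The warmup pattern above can be sketched in plain Java. This is a hand-rolled illustration, not a real benchmark — the thresholds in the comments are HotSpot defaults that may vary by build, and JMH remains the right tool:

```java
// Sketch: why warmup matters. Calling compute() thousands of times first
// lets the JIT promote it through the tiers before we measure anything.
// Thresholds (~200 for C1, ~5000+ for C2) are defaults and may vary by build.
public class WarmupDemo {
    // A hot method: enough work to be worth compiling, deterministic result.
    static long compute(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) sum += (long) i * i;
        return sum;
    }

    public static void main(String[] args) {
        // Warmup phase: drive the invocation counter past the C2 threshold.
        for (int i = 0; i < 20_000; i++) compute(1_000);

        // Only now measure. (Real benchmarks should still use JMH, which also
        // defeats dead-code elimination and handles forking.)
        long start = System.nanoTime();
        long result = compute(1_000);
        long elapsed = System.nanoTime() - start;
        System.out.println(result);       // 333833500
        System.out.println(elapsed >= 0); // true
    }
}
```
Run with -XX:+PrintCompilation to watch compute() move through the tiers during the warmup loop.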
C1 Compiler (Client Compiler)JIT
🟡 What it is — plain English C1 is the fast-but-not-perfect compiler. It quickly turns bytecode into native code that runs much faster than the interpreter. At its highest level (Level 3), it also secretly watches what's happening — recording which types show up, which branches are taken — and reports this back so C2 can later do an even better job.
⚙️ Under the hood C1 operates at 3 sub-levels: Level 1 (no profiling — trivial methods like getters), Level 2 (light profiling — C2 queue is backed up, pressure relief valve), Level 3 (full profiling — the standard path). Level 3 code is instrumented with trap instructions and counters that log type feedback into MethodData objects stored in Metaspace. C2 reads this MethodData to make its speculative bets.
🔗 Why Level 2 exists C2 is expensive and has a compilation queue. During a traffic spike, the queue might back up. Rather than leaving methods at slow Level 3, the JVM promotes some to Level 2 (light profile) as a compromise — faster than Level 3, less queue pressure on C2.
click to flip → say it in an interview
Say it in an interviewJIT
✅ Strong answer "C1 is the quick-compile stage. It produces decent native code fast, and at Level 3 it instruments that code with profiling probes — type counters, branch statistics — stored as MethodData in Metaspace. C2 reads this MethodData to make speculative optimisations like monomorphic inlining and branch prediction tuning."
⚠️ Level 1 methods stay forever Trivial methods (pure getters with no branches) go to Level 1 and never progress to Level 4 — there's nothing to profile and C2 won't do anything better anyway. They're already as fast as they can be.
click to flip back
C2 Compiler (Server Compiler)JIT
🟡 What it is — plain English C2 is the turbo engine. After a method has been called thousands of times, C2 takes the profiling data, studies the method deeply, and produces brutally optimised native machine code. It can make assumptions a static compiler never could — because it knows exactly what's happened at runtime so far.
⚙️ Under the hood C2 uses profiling data to make speculative optimisations: if profiling shows list.add() is always called on ArrayList, C2 inlines the ArrayList implementation directly (no virtual dispatch overhead) with a guard check. Other tricks: loop unrolling (flatten small loops into sequential instructions), null check elimination, dead code removal, scalar replacement (decompose objects into primitive fields on stack/registers). Takes longer to compile but the result is near-hand-written-C performance.
🔗 How Java can beat C++ A C++ compiler optimises based on static types. C2 optimises based on runtime behaviour. If at runtime 99% of calls go through one subclass, C2 inlines that subclass directly. The C++ compiler can't know this without profile-guided optimisation (PGO), which requires a separate profiling build step.
click to flip → say it in an interview
Say it in an interviewJIT
✅ Strong answer "C2 is the heavy-duty optimising JIT, active after ~5000+ invocations. It uses runtime profile data for speculative optimisations: monomorphic inlining, loop unrolling, escape analysis-based stack allocation, null-check elimination. The result can outperform equivalent C++ because C2 optimises for actual runtime behaviour, not just static type information."
⚠️ Profiling stops at Level 4 Once C2 compiles a method, profiling stops. C2 bets on its compiled code being correct. If a new subtype appears or branches change distribution, those assumptions may be wrong → deoptimisation kicks in.
click to flip back
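A minimal sketch of the monomorphic call site C2 loves. The class names are illustrative, not from HotSpot — the point is that profiling only ever sees one receiver type here, so C2 can inline Circle.area() behind a cheap guard:

```java
// Sketch: a monomorphic call site. If profiling only ever sees Circle here,
// C2 can inline Circle.area() directly behind a type-guard check,
// skipping virtual dispatch entirely.
interface Shape { double area(); }

final class Circle implements Shape {
    final double r;
    Circle(double r) { this.r = r; }
    public double area() { return Math.PI * r * r; }
}

public class Devirtualize {
    // Hot loop: every s.area() call resolves to Circle at runtime,
    // so the call site stays monomorphic — a prime inlining target.
    static double totalArea(Shape[] shapes) {
        double sum = 0;
        for (Shape s : shapes) sum += s.area();
        return sum;
    }

    public static void main(String[] args) {
        Shape[] shapes = new Shape[10_000];
        for (int i = 0; i < shapes.length; i++) shapes[i] = new Circle(1.0);
        System.out.println(Math.round(totalArea(shapes))); // 31416 (10000 * pi)
    }
}
```
If a second Shape implementation ever flowed through totalArea, the same call site would become bimorphic and C2 would recompile with an inline cache.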
Escape AnalysisJIT
🟡 What it is — plain English C2 tries to answer: "does this object ever leave this method?" If you create an object, use it only locally, and never pass it to another method or store it in a field — C2 knows the object "doesn't escape". It can then skip the heap allocation entirely and put it on the stack or even in CPU registers.
⚙️ Under the hood Three outcomes of escape analysis: (1) No escape → stack allocate (object lives and dies in the frame, zero GC cost), (2) No escape → scalar replace (decompose the object into its individual fields as local variables — no object header overhead at all), (3) Thread-local escape → object doesn't escape to other threads, so no need for lock contention checks. This eliminates GC pressure for many short-lived objects like iterators, StringBuilder in string concatenation, and temporary DTOs.
🔗 Concrete example String result = "Hello " + name + "!";
Before JDK 9, javac lowered this to a StringBuilder chain; since JDK 9 (JEP 280) it compiles to an invokedynamic call into StringConcatFactory. Either way, C2 escape analysis proves the intermediate buffer doesn't escape → scalar replaces it → zero heap allocation for the intermediate. Only the final String goes to heap.
click to flip → say it in an interview
Say it in an interviewJIT
✅ Strong answer "Escape analysis lets C2 prove an object's lifetime is bounded within a method. If proven non-escaping, C2 can stack-allocate it (instant cleanup on return) or scalar-replace it (decompose into CPU registers — no object at all). This eliminates GC pressure for many short-lived allocations like iterators and temporary builders that look like heap objects in source code but never actually hit the heap."
⚠️ How to break it accidentally Passing a "local" object to a logging framework, putting it in a thread-local, or using reflection on it — any of these cause C2 to conclude the object might escape and bail out on the optimisation. Keep hot-path objects truly local.
click to flip back
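A minimal sketch of a non-escaping allocation. The Point record is illustrative; whether C2 actually scalar-replaces it depends on inlining decisions, so treat this as a candidate, not a guarantee:

```java
// Sketch: a non-escaping allocation. Each Point lives and dies inside
// distance() - never stored in a field, returned, or passed out - so C2's
// escape analysis can scalar-replace it into plain local doubles,
// eliminating the heap allocation entirely.
public class EscapeDemo {
    record Point(double x, double y) {}   // small value-like carrier

    static double distance(double x1, double y1, double x2, double y2) {
        Point a = new Point(x1, y1);      // candidates for scalar replacement
        Point b = new Point(x2, y2);
        double dx = a.x() - b.x();
        double dy = a.y() - b.y();
        return Math.sqrt(dx * dx + dy * dy);
    }

    public static void main(String[] args) {
        double total = 0;
        for (int i = 0; i < 100_000; i++) total += distance(0, 0, 3, 4);
        System.out.println(total);        // 500000.0
    }
}
```
Storing either Point in a static field, or passing it to a logger, would make it escape and force a real heap allocation.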
DeoptimizationJIT
🟡 What it is — plain English C2 makes bets when it optimises ("I bet this method will always be called with an ArrayList"). If that bet turns out to be wrong at runtime, the JVM undoes the optimisation, falls back to the interpreter, re-profiles, and eventually recompiles a more cautious version. The safety net that makes speculative optimisation safe.
⚙️ Under the hood C2 inserts uncommon trap instructions at every speculation point. When the trap fires (assumption violated), the JVM executes a deoptimisation sequence: reconstruct the interpreter state from the compiled frame (using the debug info C2 preserved), return to the interpreter at the right bytecode offset, increment a counter. If the same method is deoptimised repeatedly, it may get marked "not entrant" and recompiled from scratch with the new reality.
🔗 Real scenario Production JVM runs for hours, processShape() always gets Circle → C2 inlines Circle's logic directly. Then a Square request arrives → uncommon trap fires → deoptimise → re-profile → recompile with a polymorphic inline cache handling both types. The temporary slowdown is a few milliseconds, not a crash.
click to flip → say it in an interview
Say it in an interviewJIT
✅ Strong answer "Deoptimisation is the JVM's safety net for speculative compilation. When a C2 assumption — like a call site being monomorphic — is violated, an uncommon trap fires, the compiled frame is reconstructed as interpreter state, and execution continues safely in interpreted mode. The method is then re-profiled and recompiled with a more accurate model."
⚠️ Detect excessive deopt -XX:+TraceDeoptimization logs every event. Frequent deoptimisation is a performance red flag — common cause is loading a new class at runtime that polymorphises a previously monomorphic call site (e.g., plugin systems, late-bound services). Also triggered by ClassCastException catches in tight loops.
click to flip back
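The processShape() scenario above can be sketched in plain Java. Class names are illustrative; the observable behaviour is unchanged — only the compiled-code lifecycle (speculate, trap, recompile) differs behind the scenes:

```java
// Sketch of the deopt scenario: the call site sees only Circle2 for the
// whole warmup, so C2 can speculate it is monomorphic. The first Square2
// violates that bet - conceptually, the uncommon trap fires, the method
// is deoptimised, re-profiled, and recompiled handling both types.
interface Shape2 { double area(); }
final class Circle2 implements Shape2 { public double area() { return Math.PI; } }
final class Square2 implements Shape2 { public double area() { return 1.0; } }

public class DeoptDemo {
    static double process(Shape2 s) { return s.area(); } // speculated monomorphic

    public static void main(String[] args) {
        double sum = 0;
        // Warmup: thousands of calls, all Circle2 -> C2 bets on monomorphic.
        for (int i = 0; i < 20_000; i++) sum += process(new Circle2());
        // New type arrives -> assumption violated -> deoptimisation.
        sum += process(new Square2());
        System.out.println(Math.abs(sum - (20_000 * Math.PI + 1.0)) < 1e-5); // true
    }
}
```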
06 Object Allocation & Heap 5 terms
TLAB — Thread-Local Allocation BufferHeap
🟡 What it is — plain English Instead of every thread fighting over a shared memory pool to create objects, the JVM gives each thread its own private chunk of Eden. Each thread allocates objects in its own chunk without asking anyone. When the chunk fills up, it quietly gets a fresh one.
⚙️ Under the hood Allocation inside a TLAB = pointer bump: just move a pointer forward by the object's size. One machine instruction. No locks, no CAS (Compare-And-Swap). Without TLAB, every new Object() would need a CAS to safely advance the shared Eden top pointer, which under high thread contention serialises all allocation. TLAB eliminates this completely. When a TLAB is exhausted, the thread requests a fresh one from Eden (this does require synchronisation, but happens rarely).
🔗 Allocation speed reality check Java object allocation (with TLAB) is often faster than malloc() in C — because malloc must manage free lists and handle fragmentation. TLAB is just a pointer increment into pre-zeroed memory.
click to flip → say it in an interview
Say it in an interviewHeap
✅ Strong answer "TLAB gives each thread a private Eden slice, making object allocation a single pointer bump — no locks, no CAS. It's why Java object creation is extremely fast for small objects. TLAB exhaustion is rare and handled by requesting a fresh chunk, which is the only point where synchronisation is needed."
⚠️ TLAB waste When a TLAB is exhausted but the remaining space is too small for the next object, the leftover space is wasted (filled with a dummy object to keep the heap walkable). Large objects can cause excessive TLAB waste. Objects larger than the TLAB's refill-waste limit (TLAB size / -XX:TLABRefillWasteFraction, default 64) bypass the TLAB and allocate directly from Eden's shared bump pointer.
click to flip back
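A toy model of the pointer-bump idea — plain Java, not JVM internals, with illustrative sizes and field names:

```java
// Toy model of TLAB allocation (not real JVM code): the fast path is just
// advancing a top pointer; a refill from shared Eden happens only when the
// buffer is exhausted, which is the rare synchronised path.
public class TlabModel {
    final long size;          // TLAB size handed out by Eden
    long top;                 // current bump pointer within the TLAB
    long end;                 // TLAB limit
    int refills;              // how often we went back to shared Eden

    TlabModel(long size) { this.size = size; this.top = 0; this.end = size; }

    long allocate(long bytes) {
        if (top + bytes > end) {   // exhausted -> slow path: fresh TLAB
            refills++;             // (real JVM: synchronised bump of Eden's top)
            top = 0;
            end = size;
        }
        long addr = top;           // fast path: one pointer bump
        top += bytes;
        return addr;
    }

    public static void main(String[] args) {
        TlabModel tlab = new TlabModel(1024);
        for (int i = 0; i < 100; i++) tlab.allocate(32); // 3200 bytes total
        System.out.println(tlab.refills); // 3 (3200 bytes through 1024-byte buffers)
    }
}
```
100 allocations hit the slow path only 3 times — the ratio that makes TLAB allocation effectively lock-free in practice.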
Humongous ObjectHeap
🟡 What it is — plain English Any object 50% or more of a G1GC Region's size is called "Humongous" and treated very differently. It skips Eden entirely, goes straight to Old Gen, may need multiple physically adjacent memory regions, and wastes any leftover space in its last region.
⚙️ Under the hood Humongous objects span one or more contiguous regions (the first is "starts-humongous", subsequent are "continues-humongous"). Because they must be contiguous and go straight to Old Gen, they: (1) bypass TLAB, (2) skip Young GC's efficient evacuation, (3) can trigger Full GC if the heap is fragmented and no contiguous block exists. One special case: primitive arrays with zero incoming references are eagerly reclaimed during Young GC without waiting for a Mixed GC.
🔗 Common production mistake Reading a large file or HTTP body into a byte[] on every request. With an 8 MB region size, any byte[] over 4 MB becomes Humongous. In a service handling 100 req/s, this rapidly fills Old Gen with Humongous objects → frequent Mixed GC → potential Full GC.
click to flip → say it in an interview
Say it in an interviewHeap
✅ Strong answer "Humongous objects (≥50% of region size) bypass Eden and go directly to Old Gen, requiring physically contiguous regions. This causes internal fragmentation, bypasses TLAB's fast allocation path, and can trigger Full GC when contiguous space isn't available. The fix: pool large buffers with ByteBuffer.allocateDirect(), or increase region size with -XX:G1HeapRegionSize=32m to raise the threshold."
⚠️ Detection GC log will show Humongous regions: N. If this number grows, you have a Humongous allocation problem. Use async profilers (e.g., async-profiler with -e alloc) to find the allocation site.
click to flip back
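The threshold arithmetic from the production-mistake example is easy to check — humongous means at least half a region:

```java
// Sketch: the humongous threshold is half the G1 region size. With 8 MB
// regions (as in the example above), any byte[] over 4 MB goes straight
// to Old Gen as a humongous allocation.
public class HumongousThreshold {
    static long humongousThresholdBytes(long regionSizeBytes) {
        return regionSizeBytes / 2;   // >= 50% of a region is humongous
    }

    public static void main(String[] args) {
        long region = 8L * 1024 * 1024;                       // -XX:G1HeapRegionSize=8m
        long threshold = humongousThresholdBytes(region);
        System.out.println(threshold);                        // 4194304 (4 MB)
        byte[] requestBody = new byte[5 * 1024 * 1024];       // a 5 MB buffer
        System.out.println(requestBody.length >= threshold);  // true -> humongous
    }
}
```
Raising the region size to 32 MB moves the threshold to 16 MB, which is why -XX:G1HeapRegionSize=32m is the standard mitigation.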
Mark Word (Object Header)Heap
🟡 What it is — plain English Every Java object starts with a 64-bit (8-byte) header called the Mark Word. The JVM packs multiple pieces of information into this one field: what type of object it is, its hash code, how many GC cycles it survived, and whether it's currently locked.
⚙️ Under the hood — JDK 25 (JEP 519) In JDK 25, the Mark Word encodes: compressed Klass pointer (replaces the old separate 32-bit field), 31-bit identity hash code (lazily computed only on first hashCode() call), 4-bit GC age (0–15), tag bits (encode lock state: unlocked / lightweight-locked / inflated), 4 reserved bits (for future Project Valhalla value types). The JVM reinterprets the same bits differently depending on the lock state tag.
🔗 Lock state transitions Unlocked → tag = 01, Klass + hash + age in the word.
Lightweight locked (synchronized, uncontended) → tag = 00, pointer to lock record on the locking thread's stack.
Inflated (contended) → tag = 10, pointer to a native OS mutex.
The JVM transitions between these states automatically based on contention.
click to flip → say it in an interview
Say it in an interviewHeap
✅ Strong answer "The Mark Word is the 64-bit header field every Java object carries. It multiplexes several uses: Klass pointer (type info), identity hash code (lazy), GC age (4 bits, 0–15), and lock state (tag bits encoding unlocked, lightweight-locked, or inflated). JDK 25's JEP 519 packed the Klass pointer into the Mark Word, eliminating the separate 32-bit Klass field and saving 4 bytes per object."
⚠️ Why hash code is lazy Computing and storing the hash code consumes 31 bits of the Mark Word. If stored eagerly, there's no room for the Klass pointer or GC age without expanding the header. Lazy storage means most objects (that never have hashCode() called) pay zero cost.
click to flip back
GC Age BitsHeap
🟡 What it is — plain English A counter embedded in every object's header that tracks how many garbage collections the object has survived. Every time it survives a Young GC without being collected, the counter ticks up. When it hits 15, the JVM says "okay, you've clearly been around for a while" and moves the object permanently to Old Gen.
⚙️ Under the hood 4 bits → values 0–15. Each Young GC "evacuation" (copying surviving objects to To-Survivor) increments the age. The threshold (default 15) is adaptive — the JVM may lower it dynamically if Survivor regions are filling up (to avoid Survivor overflow causing premature mass promotion). -XX:MaxTenuringThreshold sets the hard cap but the JVM may promote earlier.
🔗 Generational hypothesis The entire design assumes: most objects die young. Short-lived objects (iterators, temp strings, request-scoped objects) die in Eden before even needing an age. Objects that survive several GCs are probably long-lived (caches, connection pools, singletons) → they belong in Old Gen.
click to flip → say it in an interview
Say it in an interviewHeap
✅ Strong answer "GC Age is a 4-bit counter in the object header that increments with each Young GC survived. At the tenuring threshold (default 15, adaptively lowered under Survivor pressure), the object promotes to Old Gen. This implements the generational hypothesis: objects that survive many GC cycles are likely long-lived and belong in Old Gen where they're collected less frequently."
⚠️ Premature promotion warning sign If you see many objects age 1–2 being promoted, your Survivor regions are too small. The JVM is force-promoting because it can't fit all survivors. Fix: increase heap size, or tune -XX:SurvivorRatio to give more space to Survivor regions.
click to flip back
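The aging-and-promotion mechanics can be sketched as a toy model — not JVM code, all numbers illustrative — including the Survivor-overflow case the follow-up card warns about:

```java
// Toy model (not JVM code) of promotion during a Young GC: objects promote
// when their age reaches the tenuring threshold, or en masse when Survivor
// space can't hold all survivors ("premature promotion").
import java.util.List;

public class PromotionModel {
    // How many of the given survivor ages end up promoted to Old Gen.
    static int promoted(List<Integer> ages, int threshold, int survivorCapacity) {
        int byAge = 0, needSurvivor = 0;
        for (int age : ages) {
            if (age + 1 >= threshold) byAge++;   // aged out -> promote
            else needSurvivor++;                 // wants a Survivor slot
        }
        int overflow = Math.max(0, needSurvivor - survivorCapacity);
        return byAge + overflow;                 // overflow promotes early
    }

    public static void main(String[] args) {
        List<Integer> ages = List.of(1, 2, 3, 14, 15, 15);
        System.out.println(promoted(ages, 15, 10)); // 3: only the aged-out objects
        System.out.println(promoted(ages, 15, 1));  // 5: overflow promotes 2 young ones early
    }
}
```
The second call models undersized Survivor regions: objects aged 1–2 land in Old Gen long before the threshold.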
Compact Object Headers (JEP 519)Heap
🟡 What it is — plain English A JDK 25 change that shrinks every Java object's overhead from 12–16 bytes down to 8 bytes. By being cleverer about packing information into the header, every single object in the JVM just got a bit smaller — which means your app can fit more objects in the same heap.
⚙️ Under the hood Legacy header: 64-bit Mark Word + 32-bit Klass pointer = 96 bits (12 bytes) minimum, padded to 128 bits (16 bytes) for 8-byte alignment. JEP 519: packs the Klass pointer directly into the 64-bit Mark Word — one field instead of two. This required carefully rearranging the bit layout to fit Klass ptr + hash + age + lock state + 4 reserved bits all into 64 bits. The 4 reserved bits are explicitly saved for Project Valhalla's Value Types.
🔗 Scale impact 100 million objects × 4 bytes saved = 400 MB less heap. Better CPU cache utilisation because objects are denser. Reduced GC pressure because more objects fit per cache line, fewer cache misses during GC tracing.
click to flip → say it in an interview
Say it in an interviewHeap
✅ Strong answer "JEP 519 in JDK 25 compresses every object header from 96/128 bits to 64 bits by subsuming the Klass pointer into the Mark Word. The savings compound at scale — 400 MB+ on an app with 100 million objects. Better memory density also improves CPU cache efficiency during GC tracing. The 4 reserved bits in the new layout are forward-looking capacity for Project Valhalla."
⚠️ Observability impact Tools that parse raw object headers (JVM TI agents, some profilers, memory analysers) need updating for JDK 25's new header layout. If your JVM agent was built against JDK 21 header assumptions, it may misread headers silently.
click to flip back
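The scale-impact arithmetic above is simple to verify. The 4-byte figure is the minimum per-object saving (12 → 8 bytes); with 8-byte alignment many small objects save a full 8:

```java
// Back-of-envelope for the JEP 519 numbers: header shrinks from 12 bytes
// (often padded to 16) down to 8, so the per-object saving is at least 4.
public class HeaderSavings {
    static long savedBytes(long objectCount, int bytesSavedPerObject) {
        return objectCount * bytesSavedPerObject;
    }

    public static void main(String[] args) {
        long saved = savedBytes(100_000_000L, 4);   // the 100M-object example
        System.out.println(saved);                  // 400000000 bytes (~400 MB decimal)
    }
}
```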
07–08 GC & Cross-Region Tracking 10 terms
Write BarrierGC
🟡 What it is — plain English Every time you write a reference (e.g., oldObj.field = youngObj), the JVM quietly runs a tiny piece of hidden code around that assignment. This hidden code tells the GC about the change so GC bookkeeping stays accurate. You never see it — the JIT injects it automatically.
⚙️ Under the hood G1GC uses two write barriers: pre-write (SATB) — before overwriting a reference, log the old value to an SATB queue (so concurrent marking doesn't miss objects that were referenced before the change). Post-write (Card Table) — after the write, mark the 512-byte "card" covering the old object's address as dirty in the Card Table (so the RSet can be updated asynchronously).
🔗 The hidden cost of mutation Write-heavy code (updating object graphs in tight loops) pays ~3–5 extra instructions per reference write. Immutable objects pay zero write barrier cost after construction. This is a non-trivial reason functional/immutable styles can outperform mutable ones in GC-heavy workloads.
click to flip → say it in an interview
Say it in an interviewGC
✅ Strong answer "Write barriers are JIT-injected code that runs around every reference field write. G1 uses two: a pre-write SATB barrier (logs the old reference value to preserve marking correctness during concurrent GC), and a post-write Card Table barrier (marks the region's card dirty for async RSet update). They're the mechanism that allows concurrent GC to run while your application mutates the heap."
⚠️ ZGC's approach ZGC uses load barriers instead of write barriers — code runs when a reference is read. This enables concurrent relocation (moving objects while the app runs) but adds a tiny overhead to every reference read. Different trade-off from G1's write-barrier approach.
click to flip back
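A conceptual sketch of what the two barriers do around a reference store — plain Java standing in for JIT-emitted code. The queue, card values, and addresses are simplified (HotSpot actually uses 0 for a dirty card); every name here is illustrative:

```java
// Conceptual sketch (plain Java, not JIT-emitted code) of G1's two barriers
// around a reference store `holder.field = newRef`. Details simplified.
import java.util.ArrayDeque;
import java.util.Deque;

public class BarrierSketch {
    static final int CARD_SHIFT = 9;                    // 512-byte cards
    static final byte[] cardTable = new byte[1 << 16];  // toy card table
    static final Deque<Object> satbQueue = new ArrayDeque<>();
    static boolean concurrentMarkingActive = true;

    static class Holder { Object field; long address; }

    static void writeRef(Holder holder, Object newRef) {
        // Pre-write (SATB): log the value being overwritten so concurrent
        // marking never loses an object reachable at snapshot time.
        if (concurrentMarkingActive && holder.field != null) {
            satbQueue.add(holder.field);
        }
        holder.field = newRef;                          // the actual store
        // Post-write: dirty the 512-byte card covering the holder so
        // refinement threads can update the target region's RSet later.
        cardTable[(int) (holder.address >>> CARD_SHIFT)] = 1;
    }

    public static void main(String[] args) {
        Holder h = new Holder();
        h.address = 4096;                               // pretend heap address
        writeRef(h, "first");                           // old value null -> nothing logged
        writeRef(h, "second");                          // logs "first" to the SATB queue
        System.out.println(satbQueue.peek());           // first
        System.out.println(cardTable[4096 >>> CARD_SHIFT]); // 1 (card 8 is dirty)
    }
}
```
Note the asymmetry: the SATB log protects marking correctness, while the card mark protects RSet accuracy — two separate bookkeeping problems, two barriers.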
Card TableGC
🟡 What it is — plain English Imagine a map of the entire Java heap where each square on the map represents 512 bytes of actual heap. When an Old Gen object's reference field changes, the JVM marks that square on the map as "dirty". Background threads later come and process all dirty squares to update the proper cross-region tracking table.
⚙️ Under the hood Physically it's a byte array — one byte per 512-byte "card" of heap. For a 16 GB heap: 16 GB / 512 = 32 million cards = 32 MB of Card Table (very space-efficient). Marking a card dirty = writing 0 to its byte (one store instruction). Refinement threads scan the card table in chunks, find dirty (0) bytes, resolve the exact pointer inside that 512-byte card, and update the target region's RSet. Cards are then marked clean again.
🔗 Two-level design Card Table is imprecise (512-byte granularity) — a dirty card means "some reference in this 512-byte area changed". Refinement threads then do precise work to find the exact pointer. This two-level design lets write barriers be incredibly cheap (one store) while keeping RSet updates accurate.
click to flip → say it in an interview
Say it in an interviewGC
✅ Strong answer "The Card Table is a coarse-grained index of the heap at 512-byte granularity. Post-write barriers mark cards dirty with a single store. Background refinement threads process dirty cards to precisely update RSets. This two-level indirection — cheap coarse marking, expensive precise updating done asynchronously — keeps write barrier overhead minimal while maintaining GC correctness."
⚠️ Card Table thrashing If the same card is dirtied repeatedly (a frequently-mutated Old Gen object), refinement threads have extra work. In extreme cases, concurrent refinement can't keep up → STW pause to drain dirty cards. Monitor with -Xlog:gc+remset=trace.
click to flip back
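The 16 GB arithmetic above falls out of the one-byte-per-512-byte-card mapping:

```java
// The arithmetic behind the card table example: one byte per 512-byte card,
// so card index = address >> 9 and table size = heap / 512.
public class CardTableMath {
    static long cardCount(long heapBytes) {
        return heapBytes >>> 9;                 // heap / 512
    }

    public static void main(String[] args) {
        long heap = 16L * 1024 * 1024 * 1024;   // 16 GB heap
        long cards = cardCount(heap);
        System.out.println(cards);              // 33554432 (~32 million cards)
        System.out.println(cards / (1024 * 1024) + " MB of card table"); // 32 MB
    }
}
```
A 0.2% memory overhead to make every write barrier a single store — the trade the two-level design is built on.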
Remembered Set (RSet)GC
🟡 What it is — plain English Every G1GC region keeps a notebook listing which other regions have references pointing into it. When it's time to collect a region, instead of scanning the entire 100 GB heap to find all references pointing in, the GC just reads this small notebook.
⚙️ Under the hood Physically: a per-region hash table. Key = the address of the external region containing the reference. Value = the Card Table index (which 512-byte card within that region). At Young GC time, GC scans the RSets of Young regions to find all incoming Old→Young pointers. These become additional GC roots — objects pointed to from Old Gen are treated as live. Without RSets, the GC would have to scan all of Old Gen to find these pointers, making Young GC proportional to total heap size instead of just Young Gen size.
🔗 The O(1) miracle With RSets: Young GC scans ~few KB of RSet data regardless of Old Gen size.
Without RSets: Young GC scans the entire Old Gen (could be 50+ GB).
This is the core reason G1GC can have predictable Young GC pause times independent of heap size.
click to flip → say it in an interview
Say it in an interviewGC
✅ Strong answer "RSets are per-region hash maps recording incoming cross-region references. At Young GC time, GC scans only the RSets of Young regions — not all of Old Gen. This makes Young GC pause time proportional to Young Gen size, not total heap size. It's the key engineering insight that lets G1GC work at multi-hundred-GB heap sizes."
⚠️ RSet overhead Every region has an RSet. Old Gen regions with many incoming references (e.g., a shared cache referenced by many other objects) build large RSets. Large RSets = more refinement thread work + more memory. Monitor RSet behaviour with -Xlog:gc+remset=debug (unified logging replaced the old -XX:+PrintGCDetails output in JDK 9+).
click to flip back
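The notebook idea can be sketched as a toy data structure — a far simpler stand-in for HotSpot's actual sparse/fine/coarse per-region tables, with illustrative names throughout:

```java
// Toy remembered set (not HotSpot's real PRT): per target region, a map
// from referencing region to the card indices holding inbound pointers.
// Collecting a region then scans only its RSet, never the whole heap.
import java.util.*;

public class RSetModel {
    // rsets.get(targetRegion) -> { sourceRegion -> cards with inbound refs }
    static final Map<Integer, Map<Integer, Set<Integer>>> rsets = new HashMap<>();

    // Refinement records: "a reference in (sourceRegion, card) points into target".
    static void record(int targetRegion, int sourceRegion, int card) {
        rsets.computeIfAbsent(targetRegion, r -> new HashMap<>())
             .computeIfAbsent(sourceRegion, r -> new HashSet<>())
             .add(card);
    }

    // At Young GC: cards to scan for extra roots - independent of heap size.
    static int cardsToScan(int targetRegion) {
        return rsets.getOrDefault(targetRegion, Map.of())
                    .values().stream().mapToInt(Set::size).sum();
    }

    public static void main(String[] args) {
        record(7, 100, 42);   // Old region 100, card 42 -> Young region 7
        record(7, 100, 43);
        record(7, 215, 9);
        System.out.println(cardsToScan(7)); // 3 cards to scan, not all of Old Gen
    }
}
```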
Young GC (Minor Collection)GC
🟡 What it is — plain English When Eden fills up, the JVM pauses everything and cleans up Eden. It checks which objects in Eden are still needed (reachable), copies them to a Survivor region, and simply discards everything else by marking the entire Eden region as empty. Fast because most objects are garbage.
⚙️ Under the hood Stop-The-World — all application threads pause. GC roots (stack frames, static fields) are traced. RSets provide additional roots (Old→Young pointers). Live objects are evacuated (copied) to To-Survivor regions, incrementing their age. Objects age 15+ or Survivor overflow → promoted to Old Region. Eden + From-Survivor regions are atomically reclaimed. The entire Young Gen is compacted as a side effect of copying. Typical duration: 5–50 ms.
🔗 Why it's so fast The generational hypothesis: 90–98% of Eden objects are already dead at collection time. The GC only copies the ~2–10% survivors. Compare to Full GC which must scan 100% of heap.
click to flip → say it in an interview
Say it in an interviewGC
✅ Strong answer "Young GC is a STW evacuation — it copies live Young Gen objects to Survivor regions and promotes age-threshold objects to Old Gen. It's fast because the generational hypothesis holds: 90%+ of Eden objects are dead. RSets guide GC to Old→Young pointers without scanning all of Old Gen. Compaction happens automatically as a side effect of copying."
⚠️ Long Young GC pause Young GC should be fast (5–50 ms). If it's taking hundreds of ms, suspect: very large Young Gen (more to scan), huge survivor space overflow (mass promotion), or RSets that are too large (many dirty cards to process). Check -Xlog:gc* for breakdown.
click to flip back
IHOP — Initiating Heap Occupancy %GC
🟡 What it is — plain English IHOP is the "start marking now" trigger. G1GC watches Old Gen filling up and at some point says "if I don't start scanning the heap for live objects now, Old Gen will fill up completely before I finish — and that means Full GC". IHOP is the threshold (as a percentage of heap) that triggers this early warning scan.
⚙️ Under the hood Default ~45% of heap. But G1 uses Adaptive IHOP — it measures actual allocation rate and marking throughput, then calculates: "given how fast we're allocating and how long marking takes, we need to start marking when Old Gen is at X%". If allocation rate suddenly spikes, IHOP moves earlier. This is a prediction engine inside the JVM, continuously recalibrating.
🔗 What happens if IHOP fires too late Old Gen fills up before marking finishes → G1 can't reclaim space fast enough → Mixed GC can't keep up → Full GC kicks in. This is the cascade you're always trying to prevent.
click to flip → say it in an interview
Say it in an interviewGC
✅ Strong answer "IHOP is G1's trigger for concurrent marking. Adaptive IHOP continuously models allocation rate and marking speed to calculate the right moment to start — so marking finishes before Old Gen exhausts. If IHOP fires too late, the GC races against allocation and loses → Full GC. Tuning IHOP is about giving the GC enough runway to finish marking safely."
⚠️ Disabling adaptive IHOP -XX:InitiatingHeapOccupancyPercent=35 disables adaptive IHOP and fixes the threshold. Useful for workloads with extremely spiky allocation where adaptive IHOP's predictions are inaccurate. Lower threshold = more frequent marking cycles but more headroom.
click to flip back
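The adaptive calculation can be sketched as a runway formula — heavily simplified from G1's actual model, with every number illustrative:

```java
// Sketch of an adaptive-IHOP style calculation: start marking early enough
// that it finishes before Old Gen fills. Simplified from G1's real model.
public class AdaptiveIhop {
    // Latest safe occupancy (bytes) at which to start concurrent marking.
    static long markingStartThreshold(long heapBytes,
                                      long allocRateBytesPerSec,
                                      double markingSeconds,
                                      double safetyHeadroom) {
        long runwayNeeded = (long) (allocRateBytesPerSec * markingSeconds * safetyHeadroom);
        return Math.max(0, heapBytes - runwayNeeded);
    }

    public static void main(String[] args) {
        long heap = 8L << 30;                       // 8 GB heap
        long allocRate = 200L << 20;                // 200 MB/s allocation
        double markTime = 4.0;                      // marking takes ~4 s
        long threshold = markingStartThreshold(heap, allocRate, markTime, 1.5);
        System.out.println(100 * threshold / heap + "% of heap"); // 85% of heap
        // If the allocation rate doubles, the trigger moves earlier:
        long spiky = markingStartThreshold(heap, 2 * allocRate, markTime, 1.5);
        System.out.println(spiky < threshold);      // true
    }
}
```
This is the prediction the card describes: faster allocation or slower marking both shrink the threshold, firing the marking cycle sooner.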
SATB — Snapshot-At-The-BeginningGC
🟡 What it is — plain English G1GC marks live objects while your app keeps running. This creates a problem: your app is constantly changing references while GC is trying to trace them. SATB solves this by saying: "I'll mark everything that was reachable at the moment I started — even if your app later throws those references away". It takes a logical snapshot and stays loyal to it.
⚙️ Under the hood The pre-write barrier logs the old reference value before every reference field is overwritten during concurrent marking. These old values go into per-thread SATB queues. GC threads drain these queues and treat any logged reference as live (even if the app no longer holds it). This guarantees: no live object at snapshot-time is ever missed. Cost: objects that die after the snapshot are treated as live until next cycle — "float garbage".
🔗 Float garbage explained simply At marking start: object A is reachable. Your app deletes the reference to A mid-marking. SATB logged A as "was referenced" → GC keeps A alive. A gets collected in the next cycle. This one-cycle delay is called float garbage. Normal, expected, not a leak.
click to flip → say it in an interview
Say it in an interviewGC
✅ Strong answer "SATB is G1's invariant for concurrent marking correctness. By logging the pre-write value of every reference change, G1 guarantees it never loses a live object mid-marking. The trade-off: objects that become garbage after the marking snapshot aren't collected until the next cycle — this is float garbage. It's a deliberate design choice: prefer missing some garbage (float) over ever collecting a live object."
⚠️ Float garbage + heap pressure In a workload with high churn (lots of objects dying during concurrent marking), float garbage can temporarily inflate live set size. If your heap is tight, this extra uncollected garbage can push Old Gen past IHOP and trigger an unexpected marking cycle. Size the heap with this float in mind.
click to flip back
Mixed GCGC
🟡 What it is — plain English After concurrent marking finishes, G1 knows which Old Gen regions are mostly garbage. Mixed GC is like a targeted clean-up: it collects Young Gen as usual, plus the most garbage-filled Old Gen regions. It's incremental — it chips away at Old Gen over multiple cycles rather than doing it all at once.
⚙️ Under the hood G1 builds a sorted list of Old Gen regions by garbage ratio (most garbage first — hence "Garbage First"). It picks regions to collect until the estimated collection time would breach -XX:MaxGCPauseMillis, then stops. Remaining dirty Old regions are collected in subsequent Mixed GC cycles (up to -XX:G1MixedGCCountTarget = 8 by default). This is the key insight: G1 never tries to collect all of Old Gen at once — it takes bite-sized chunks within its pause budget.
🔗 Why not just Full GC? Full GC = collect all of Old Gen at once, serially, taking seconds. Mixed GC = collect the top 10% most garbage-dense Old regions, STW for 50–100 ms. Repeat 8× over the next few seconds to chip away at Old Gen. Same total work, but broken into tiny pauses that fit within the latency budget.
click to flip → say it in an interview
Say it in an interviewGC
✅ Strong answer "Mixed GC is G1's incremental Old Gen reclamation strategy. It collects all Young regions plus a pause-budget-constrained subset of Old regions — always prioritising the highest-garbage ones. It spreads Old Gen collection over multiple cycles, keeping each pause within MaxGCPauseMillis. This is the 'Garbage First' in G1GC — maximise garbage collected per millisecond of STW pause."
⚠️ If Mixed GC can't keep up Allocation rate exceeds Mixed GC's reclamation rate → Old Gen fills up → Full GC. Root causes: allocation rate too high, heap too small, too many long-lived objects being created (memory leak pattern), or Humongous objects overwhelming Old Gen.
click to flip back
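The collection-set selection can be sketched as a greedy loop — the cost model and numbers are illustrative, not G1's real pause-prediction machinery:

```java
// Sketch of Mixed GC collection-set selection: sort Old regions by garbage
// ratio ("garbage first"), greedily add them until the predicted pause
// would exceed the budget. Leftovers wait for the next Mixed GC cycle.
import java.util.*;

public class MixedGcSelection {
    record Region(String name, double garbageRatio, double predictedMs) {}

    static List<Region> selectCollectionSet(List<Region> oldRegions, double pauseBudgetMs) {
        List<Region> sorted = new ArrayList<>(oldRegions);
        sorted.sort(Comparator.comparingDouble(Region::garbageRatio).reversed());
        List<Region> chosen = new ArrayList<>();
        double spent = 0;
        for (Region r : sorted) {
            if (spent + r.predictedMs() > pauseBudgetMs) break; // budget exhausted
            chosen.add(r);
            spent += r.predictedMs();
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Region> old = List.of(
            new Region("A", 0.95, 20), new Region("B", 0.80, 30),
            new Region("C", 0.60, 40), new Region("D", 0.20, 25));
        List<Region> cset = selectCollectionSet(old, 60);  // -XX:MaxGCPauseMillis=60
        cset.forEach(r -> System.out.println(r.name()));   // A then B; C and D deferred
    }
}
```
Maximum garbage reclaimed per millisecond of pause, never more than the budget allows — the "Garbage First" policy in miniature.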
Remark Phase (STW)GC
🟡 What it is — plain English After concurrent marking finishes its main scan (while your app ran), there's a backlog of reference changes logged in SATB queues that haven't been processed yet. Remark is a short pause where the app stops so GC can process that backlog cleanly and finalise which objects are live.
⚙️ Under the hood Stop-The-World. GC drains all per-thread SATB queues (the pre-write barrier logs accumulated during concurrent marking). Then processes weak references (SoftReference, WeakReference, PhantomReference) — deciding which to keep and which to clear based on heap pressure. After Remark, the live object set is precisely known. Typically very short: 1–10 ms, because concurrent marking did 95%+ of the work already.
🔗 SoftReference clearing Remark is when SoftReference.get() can start returning null. The JVM decides to clear SoftReferences during Remark if heap pressure is high (it uses a heuristic based on time since last GC and free heap ratio). SoftReferences are meant to be cleared before OOM, not on a schedule.
click to flip → say it in an interview
Say it in an interviewGC
✅ Strong answer "Remark is a short STW phase that closes out concurrent marking. It drains the SATB queues that accumulated during concurrent marking, finalising the live object set. It also processes all reference types — soft, weak, phantom. Short because concurrent marking did the heavy lifting. After Remark, G1 knows exactly what's alive and can calculate Mixed GC collection sets."
⚠️ Long Remark pauses If Remark takes longer than expected, suspect: large SATB queue backlog (application is mutating references very rapidly during marking, outpacing queue draining), or many WeakReferences/finalizers to process. Tune with -XX:G1SATBBufferSize.
click to flip back
Object PromotionGC
🟡 What it is — plain English When an object survives enough Young GC cycles (default: 15), the JVM decides it's probably going to stick around for a long time and "promotes" it to Old Gen — the part of the heap that gets collected much less often. It's like graduating from intern to permanent staff.
⚙️ Under the hood Promotion happens during Young GC evacuation. If GC Age ≥ MaxTenuringThreshold (adaptive, default 15) → copy object to an Old Region instead of Survivor. If Survivor regions are full (can't fit all live Eden objects) → mass promotion of all surviving objects regardless of age. This Survivor overflow is called premature promotion and is the main cause of Old Gen growing faster than expected.
🔗 The right objects to promote Good promotions: connection pools, caches, singletons — they're genuinely long-lived.
Bad promotions: HTTP session objects that die after a few seconds, but survived 15 GCs before being released. Old Gen then fills with these "should be short-lived" objects → more Mixed GC → worse latency.
click to flip → say it in an interview
Say it in an interviewGC
✅ Strong answer "Promotion moves objects to Old Gen when they exceed the tenuring threshold or when Survivor space overflows. Premature promotion — caused by undersized Survivor regions — is a common performance problem: medium-lived objects flood Old Gen, triggering more frequent Mixed GC cycles. Diagnosing it requires the survivor age histogram: -XX:+PrintTenuringDistribution in legacy logs, or -Xlog:gc+age=trace with unified logging (JDK 9+)."
⚠️ Survivor sizing -XX:SurvivorRatio=8 means Eden : S0 : S1 = 8 : 1 : 1, so each Survivor region is 1/(8+2) = 10% of the Young Gen (not of Eden). If your app has lots of medium-lived objects (ages 3–10), you may need a smaller SurvivorRatio (larger Survivor regions) to avoid premature promotion.
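The ratio arithmetic above as a tiny helper — `survivorFraction` is an illustrative name, not a JVM API:

```java
public class SurvivorMath {
    // With -XX:SurvivorRatio=N, Young Gen splits as Eden:S0:S1 = N:1:1,
    // so each Survivor space is 1/(N+2) of the Young Gen.
    static double survivorFraction(int survivorRatio) {
        return 1.0 / (survivorRatio + 2);
    }

    public static void main(String[] args) {
        System.out.println(survivorFraction(8)); // 0.1 -> 10% of Young Gen
        System.out.println(survivorFraction(6)); // 0.125 -> larger Survivors
    }
}
```

Lowering the ratio from 8 to 6 grows each Survivor from 10% to 12.5% of Young Gen, giving medium-lived objects more room to age in place.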
click to flip back
Full GC — The Last ResortGC
🟡 What it is — plain English Full GC is the JVM's emergency mode. Everything stops. The JVM scans and compacts the entire heap — Young Gen, Survivor, Old Gen, Humongous — all in one go. It can take seconds. It means Mixed GC failed to keep up. In production, Full GC is a fire alarm, not normal operation.
⚙️ Under the hood G1's Full GC runs a mark-compact algorithm (single-threaded before JDK 10; parallel since JEP 307 in JDK 10). It marks all live objects starting from the GC roots, computes new addresses (the compaction plan), updates all references to point at those new addresses, then physically moves objects into their compacted positions. The heap is fully compacted and fragmentation-free after a Full GC — but you paid seconds of pause time to get there.
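The four steps can be sketched on a toy array "heap" where each object holds at most one outgoing reference (a heap index). This is only the shape of the algorithm, not G1's implementation — class and method names are made up:

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

public class MiniMarkCompact {
    // Toy object: a name plus one outgoing reference (heap index, -1 = none).
    record Obj(String name, int ref) {}

    static Obj[] compact(Obj[] heap, int root) {
        boolean[] live = new boolean[heap.length];

        // 1. Mark: trace everything reachable from the root.
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            int i = stack.pop();
            if (live[i]) continue;
            live[i] = true;
            if (heap[i].ref() >= 0) stack.push(heap[i].ref());
        }

        // 2. Compute new addresses: live objects slide left in order.
        int[] newAddr = new int[heap.length];
        int next = 0;
        for (int i = 0; i < heap.length; i++) if (live[i]) newAddr[i] = next++;

        // 3 + 4. Update references, then move objects into place.
        Obj[] compacted = new Obj[heap.length];
        for (int i = 0; i < heap.length; i++) {
            if (!live[i]) continue;
            int r = heap[i].ref();
            compacted[newAddr[i]] = new Obj(heap[i].name(), r >= 0 ? newAddr[r] : -1);
        }
        return compacted;
    }

    public static void main(String[] args) {
        Obj[] heap = new Obj[6];
        heap[0] = new Obj("root", 3);     // reachable
        heap[1] = new Obj("garbage", -1); // unreachable -> reclaimed
        heap[3] = new Obj("leaf", -1);    // reachable via root
        // After compaction: root and leaf sit in slots 0 and 1,
        // root's reference is rewritten from index 3 to index 1.
        System.out.println(Arrays.toString(compact(heap, 0)));
    }
}
```

Step 3 is why a moving collector must stop the world here: every reference in the heap has to be rewritten consistently before mutators run again.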
🔗 Most common root causes 1. Memory leak (live set grows unboundedly) — objects accumulate in caches or collections that never release entries.
2. Heap too small for workload — steady-state live set is too close to heap limit.
3. Humongous object explosion — large byte[] allocations exhaust contiguous memory.
4. Allocation rate far exceeds GC throughput — need a bigger or faster GC.
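Root cause #1 in miniature: a cache that only grows versus one with an eviction policy. The class and constants below are illustrative; `LinkedHashMap.removeEldestEntry` is the standard JDK hook for a simple bounded LRU cache.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class LeakVsBounded {
    // Leak shape: entries are added but never removed, so the live set
    // grows until Old Gen fills and a Full GC fires (then OOM).
    static final Map<String, byte[]> LEAKY = new HashMap<>();

    // Bounded alternative: evict the eldest entry past MAX_ENTRIES.
    static final int MAX_ENTRIES = 1000;
    static final Map<String, byte[]> BOUNDED =
        new LinkedHashMap<String, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, byte[]> e) {
                return size() > MAX_ENTRIES;
            }
        };

    public static void main(String[] args) {
        for (int i = 0; i < 10_000; i++) {
            LEAKY.put("req-" + i, new byte[1024]);   // live set: keeps growing
            BOUNDED.put("req-" + i, new byte[1024]); // live set: capped at 1000
        }
        System.out.println("leaky=" + LEAKY.size() + " bounded=" + BOUNDED.size());
    }
}
```

In a heap dump, the leaky variant shows up as one map retaining an ever-growing slice of the heap — exactly what `jmap` analysis is looking for.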
click to flip → say it in an interview
Say it in an interviewGC
✅ Strong answer "Full GC is G1's last resort — a STW, stop-everything compaction of the entire heap. It means Mixed GC failed: allocation rate exceeded reclamation rate, or contiguous memory wasn't available for a Humongous object. In production it should be essentially never. The right response isn't tuning GC flags — it's finding and fixing the root cause: a memory leak, undersized heap, or Humongous allocation anti-pattern."
⚠️ Detecting it In unified GC logs (JDK 9+): Pause Young (Mixed) (G1 Evacuation Pause) → Mixed GC; Pause Full (G1 Compaction Pause) — or [Full GC (Allocation Failure) in legacy logs — → emergency Full GC. Alert on any Full GC in production. Investigate with heap dumps (jmap -dump:live,format=b,file=heap.hprof &lt;pid&gt;) to find what's holding the live set large.
click to flip back