January 13, 2026

Mastering Advanced SQL Interview Questions

A Practical Guide for Freshers & Experienced Engineers

SQL interviews are not about memorizing syntax — they test data thinking, edge-case handling, and real-world querying skills.

This blog covers frequently asked SQL interview problems, explained step-by-step with:

  • clear intent

  • correct SQL

  • beginner-friendly explanations

  • advanced variations for experienced candidates

You can revisit this blog anytime — it’s written for long-term learning.


1️⃣ Delete Duplicate Records While Keeping One

📌 Problem

A table contains duplicate rows. You need to delete duplicates but keep one record per group.

Assume a table:

Employees(id, email)

✅ Solution Using ROW_NUMBER() (Best Practice)

DELETE FROM Employees
WHERE id IN (
    SELECT id
    FROM (
        SELECT id,
               ROW_NUMBER() OVER (PARTITION BY email ORDER BY id) AS rn
        FROM Employees
    ) t
    WHERE rn > 1
);

🧠 Explanation

  • PARTITION BY email groups duplicates

  • ROW_NUMBER() assigns 1, 2, 3…

  • Keep rn = 1, delete the rest

💡 Interview Tip

Always preview with SELECT before DELETE.


2️⃣ Find Employees Who Worked on All Projects

Tables:

Employees(emp_id)
EmployeeProjects(emp_id, project_id)
Projects(project_id)

✅ Solution Using GROUP BY + HAVING

SELECT emp_id
FROM EmployeeProjects
GROUP BY emp_id
HAVING COUNT(DISTINCT project_id) =
       (SELECT COUNT(*) FROM Projects);

🧠 Explanation

  • Count projects per employee

  • Compare with total project count

💡 Interview Tip

This “worked on all” / “matched all” pattern comes up constantly → remember it.


3️⃣ Customers With Most Orders but Lowest Total Spend (Last Month)

Table:

Orders(order_id, customer_id, amount, order_date)

✅ Step 1: Aggregate Last Month Data

WITH last_month_orders AS (
    SELECT customer_id,
           COUNT(*) AS order_count,
           SUM(amount) AS total_amount
    FROM Orders
    WHERE order_date >= DATE_TRUNC('month', CURRENT_DATE - INTERVAL '1 month')
      AND order_date < DATE_TRUNC('month', CURRENT_DATE)
    GROUP BY customer_id
)
SELECT *
FROM last_month_orders
ORDER BY order_count DESC, total_amount ASC;

🧠 Explanation

  • Highest orders → order_count DESC

  • Lowest spend → total_amount ASC

💡 Interview Tip

Ordering by multiple business conditions is very common.


4️⃣ Products With Sales Higher Than Average Monthly Sales

Tables:

Sales(product_id, sale_amount, sale_date)
Products(product_id, product_name)

✅ Using Subquery + JOIN

SELECT p.product_id, p.product_name
FROM Products p
JOIN (
    SELECT product_id,
           AVG(sale_amount) AS avg_sales
    FROM Sales
    GROUP BY product_id
) ps ON p.product_id = ps.product_id
WHERE ps.avg_sales >
      (SELECT AVG(sale_amount) FROM Sales);

🧠 Explanation

  • Inner query → avg per product

  • Subquery → global avg

  • Compare both

💡 Interview Tip

Interviewers love compare against average questions.


5️⃣ Students in Top 10% of Their Class (Window Functions)

Table:

Students(student_id, class_id, marks)

✅ Using PERCENT_RANK()

SELECT student_id, class_id, marks
FROM (
    SELECT student_id,
           class_id,
           marks,
           PERCENT_RANK() OVER (PARTITION BY class_id ORDER BY marks DESC) AS pr
    FROM Students
) t
WHERE pr <= 0.10;

🧠 Explanation

  • PERCENT_RANK() gives percentile

  • <= 0.10 → top 10%

💡 Interview Tip

For rank-based questions, always think window functions.


6️⃣ Suppliers With Products Cheaper Than Category Average

(Correlated Subquery + JOIN)

Tables:

Suppliers(supplier_id, name)
Products(product_id, supplier_id, category_id, price)

✅ Solution

SELECT DISTINCT s.supplier_id, s.name
FROM Suppliers s
JOIN Products p ON s.supplier_id = p.supplier_id
WHERE p.price <
      (
        SELECT AVG(p2.price)
        FROM Products p2
        WHERE p2.category_id = p.category_id
      );

🧠 Explanation

  • Subquery recalculates avg per category

  • Compares product price with category avg

💡 Interview Tip

This is a classic correlated subquery example.


7️⃣ Customers and Their Total Order Amount

(Include Customers With No Orders)

Tables:

Orders(order_id, customer_id, amount)
Customers(customer_id, name)

✅ Correct Solution Using LEFT JOIN

SELECT c.customer_id,
       c.name,
       COALESCE(SUM(o.amount), 0) AS total_order_amount
FROM Customers c
LEFT JOIN Orders o
       ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.name;

🧠 Explanation

  • LEFT JOIN keeps all customers

  • COALESCE converts NULL to 0

💡 Interview Tip

If they say include even if no records exist, think LEFT JOIN.


🧠 Key SQL Patterns to Remember

Problem Type → Pattern

  • Delete duplicates → ROW_NUMBER()

  • Worked on all items → GROUP BY + HAVING

  • Top N per group → RANK() / DENSE_RANK()

  • Compare with average → Subquery

  • Percentile / top % → PERCENT_RANK()

  • Include missing data → LEFT JOIN

🎯 What Interviewers Really Look For

✔ Correct joins
✔ Proper grouping
✔ No missing edge cases
✔ Business logic clarity
✔ Clean, readable SQL

Not just syntax.


📌 Final Advice for Long-Term SQL Mastery

  • Always start with SELECT, then DELETE/UPDATE

  • Think in sets, not rows

  • Ask clarifying questions (> vs >=, date ranges)

  • Practice explaining why, not just how


Identifying the Top 10 Sellers for Featured Promotions (E-commerce Platform)

E-commerce platforms frequently rank sellers to decide featured placements, promotions, and incentives.

This problem simulates a real-world backend/data scenario where seller performance is distributed across multiple months and must be evaluated using business rules.


📌 Problem Statement

An e-commerce platform wants to identify the Top 10 sellers for featured promotions based on their performance over the last 3 months.


📂 Data Sources

  • jan_orders.csv

  • feb_orders.csv

  • mar_orders.csv

Each Record

(sellerId, orderValue, customerRating)

Each record represents one completed order.


✅ Selection Criteria

A seller is considered eligible only if all the following conditions are met:

  1. At least 100 total orders across all months

  2. Average customer rating ≥ 4.2

  3. At least 20 orders in EACH month

Ranking Rule

Rank eligible sellers by average order value (highest first) and take the top 10.


🎯 Expected Output

List<String> → sellerIds sorted by average order value (highest first)

🧠 Key Clarifications (Interview Gold)

Before coding, clarify these assumptions:

  • Customer rating is per order (not per seller)

  • Average rating = totalRatings / totalOrders

  • Average order value = totalOrderValue / totalOrders

  • A seller missing even one month is not eligible

  • ≥ 4.2 means 4.2 is allowed

These clarifications show strong analytical thinking.


🏗 High-Level Approach

Step 1: Aggregate seller data

Use a map:

sellerId → SellerSummary

Track:

  • order count per month

  • total order value

  • total rating sum


Step 2: Compute derived metrics

For each seller:

  • totalOrders

  • averageRating

  • averageOrderValue


Step 3: Apply eligibility rules

Filter sellers who fail any condition:

  • totalOrders < 100

  • any month < 20 orders

  • averageRating < 4.2


Step 4: Rank & select

  • Sort by average order value (DESC)

  • Pick top 10


🧩 Java Data Model (Interview-Ready)

class SellerSummary {
    String sellerId;

    int ordersJan, ordersFeb, ordersMar;
    double totalOrderValue;
    double totalRating;

    int totalOrders() {
        return ordersJan + ordersFeb + ordersMar;
    }

    double averageRating() {
        return totalOrders() == 0 ? 0 : totalRating / totalOrders();
    }

    double averageOrderValue() {
        return totalOrders() == 0 ? 0 : totalOrderValue / totalOrders();
    }

    boolean isEligible() {
        return totalOrders() >= 100
            && ordersJan >= 20
            && ordersFeb >= 20
            && ordersMar >= 20
            && averageRating() >= 4.2;
    }
}
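
Before the pipeline below can run, the per-seller summaries have to be aggregated into a map. Here is a minimal aggregation sketch; OrderRecord, its field names, and the in-memory monthly lists are assumptions for illustration, while SellerSummary is the class above.

import java.util.*;

// Hypothetical input record parsed from the monthly CSVs.
class OrderRecord {
    String sellerId;
    double orderValue;
    double customerRating;

    OrderRecord(String sellerId, double orderValue, double customerRating) {
        this.sellerId = sellerId;
        this.orderValue = orderValue;
        this.customerRating = customerRating;
    }
}

class SellerAggregator {
    // monthIndex: 0 = Jan, 1 = Feb, 2 = Mar (mirrors the SellerSummary fields above)
    static void addMonth(Map<String, SellerSummary> sellers, List<OrderRecord> orders, int monthIndex) {
        for (OrderRecord o : orders) {
            SellerSummary s = sellers.computeIfAbsent(o.sellerId, id -> {
                SellerSummary created = new SellerSummary();
                created.sellerId = id;
                return created;
            });
            if (monthIndex == 0) s.ordersJan++;
            else if (monthIndex == 1) s.ordersFeb++;
            else s.ordersMar++;
            s.totalOrderValue += o.orderValue;   // running totals drive the averages
            s.totalRating += o.customerRating;
        }
    }
}

Calling addMonth(sellers, janOrders, 0) and the same for February and March produces the sellers map that the pipeline consumes.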

🧠 Stream Pipeline (Conceptual)

sellers.values().stream()
    .filter(SellerSummary::isEligible)
    .sorted(Comparator.comparingDouble(
        SellerSummary::averageOrderValue).reversed())
    .limit(10)
    .map(s -> s.sellerId)
    .toList();

⏱ Time & Space Complexity

Let:

  • N = total number of orders

  • S = number of sellers

Operation → Complexity

  • Aggregation → O(N)

  • Filtering → O(S)

  • Sorting → O(S log S)

  • Space → O(S)

This is optimal and production-ready.


❌ Common Mistakes to Avoid

  1. Averaging averages

    • Always calculate from totals

  2. Integer division

    • Use double for all averages

  3. Ignoring monthly constraints

    • Each month must meet minimum orders

  4. Using > instead of ≥

    • Ratings are inclusive here

  5. Ranking before filtering

    • Eligibility must come first


🚀 How to Extend This in Real Systems


🧪 Interview Follow-Up Questions

  • How would you handle seller fraud or fake ratings?

  • What if promotions require category-wise top sellers?

  • How would you optimize this for millions of orders/day?

  • Would you precompute rankings or calculate on demand?


✅ Final Takeaway

This problem tests more than coding:

✔ data modeling
✔ aggregation logic
✔ business rule enforcement
✔ ranking correctness
✔ scalability thinking

If you can clearly explain this flow, you demonstrate strong backend and system-level thinking—exactly what interviewers want.


Rewarding the Top 5 Sales Executives Across 3 Quarters (Java + Map + Streams)

This post shows an interview-ready way to solve a common “aggregate + filter + rank” problem in Java. We’ll go from the problem statement to a clean final solution, and then cover tips, pitfalls, and improvements that interviewers typically look for.


✅ Question

A company wants to reward the Top 5 sales executives based on performance over the last 3 quarters.

Data Files

  • q1_sales.csv

  • q2_sales.csv

  • q3_sales.csv

Each Record

(employeeId, dealsClosed, revenue)

Selection Criteria

An employee is eligible only if all conditions are met:

  1. Total deals closed ≥ 30 across all quarters

  2. Average revenue per deal > $5,000

  3. Deals closed ≥ 5 in each quarter

  4. From eligible employees, select Top 5 by total revenue (descending)

Output

Return a list of employeeIds sorted by total revenue DESC.


🧠 Approach

This is an “aggregation + eligibility + ranking” problem. The clean approach is:

  1. Aggregate quarter data into one summary per employee.

    • Track: deals and revenue per quarter, and totals.

  2. Apply filters:

    • totalDeals ≥ 30

    • dealsQ1 ≥ 5, dealsQ2 ≥ 5, dealsQ3 ≥ 5

    • averageRevenuePerDeal > 5000

  3. Sort by total revenue descending

  4. Take top 5

  5. Return list of employeeIds

Why Map?

A HashMap<String, EmployeeSalesSummary> gives you O(1) updates per record and keeps the code straightforward.


✅ Final Answer (Clean Java Solution)

This is a polished version of the code you started, with small improvements:

  • Uses constants for rule thresholds

  • Fixes the placeholder return (now returns top list)

  • Ensures filters match the problem statement

  • Adds a tie-breaker for consistent sorting (optional but good practice)

  • Removes unused imports

import java.util.*;
import java.util.stream.Collectors;

public class TopSalesExecutivesFinder {

    // ---------------- SAMPLE INPUT (Hardcoded) ----------------
    static List<SalesRecord> q1Sales = Arrays.asList(
            new SalesRecord("E1", 10, 60000),
            new SalesRecord("E2", 8, 30000),
            new SalesRecord("E3", 12, 80000)
    );

    static List<SalesRecord> q2Sales = Arrays.asList(
            new SalesRecord("E1", 9, 55000),
            new SalesRecord("E2", 12, 70000),
            new SalesRecord("E3", 10, 60000)
    );

    static List<SalesRecord> q3Sales = Arrays.asList(
            new SalesRecord("E1", 11, 65000),
            new SalesRecord("E2", 10, 50000),
            new SalesRecord("E3", 9, 55000)
    );

    public static List<String> findTopSalesExecutives() {

        // ---------------- RULE THRESHOLDS ----------------
        final int MIN_TOTAL_DEALS = 30;
        final int MIN_DEALS_EACH_QUARTER = 5;
        final double MIN_AVG_REV_PER_DEAL = 5000.0; // must be strictly greater
        final int TOP_K = 5;

        // ---------------- AGGREGATION ----------------
        Map<String, EmployeeSalesSummary> summaryMap = new HashMap<>();

        q1Sales.forEach(r ->
                summaryMap.computeIfAbsent(r.employeeId, EmployeeSalesSummary::new)
                          .addQ1(r.dealsClosed, r.revenue)
        );

        q2Sales.forEach(r ->
                summaryMap.computeIfAbsent(r.employeeId, EmployeeSalesSummary::new)
                          .addQ2(r.dealsClosed, r.revenue)
        );

        q3Sales.forEach(r ->
                summaryMap.computeIfAbsent(r.employeeId, EmployeeSalesSummary::new)
                          .addQ3(r.dealsClosed, r.revenue)
        );

        // ---------------- FILTER + SORT + PICK ----------------
        return summaryMap.values().stream()
                // 1) total deals >= 30
                .filter(s -> s.totalDeals() >= MIN_TOTAL_DEALS)
                // 2) deals >= 5 in each quarter
                .filter(s -> s.dealsEachQuarterAtLeast(MIN_DEALS_EACH_QUARTER))
                // 3) avg revenue per deal > 5000
                .filter(s -> s.averageRevenuePerDeal() > MIN_AVG_REV_PER_DEAL)
                // 4) sort by total revenue desc (tie-breaker optional)
                .sorted(Comparator.comparingDouble(EmployeeSalesSummary::totalRevenue).reversed()
                        .thenComparing(EmployeeSalesSummary::totalDeals, Comparator.reverseOrder())
                        .thenComparing(s -> s.employeeId))
                // 5) top 5
                .limit(TOP_K)
                // return employeeIds
                .map(s -> s.employeeId)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> result = findTopSalesExecutives();
        System.out.println("Top Sales Executives: " + result);
    }
}

// ---------------- SUPPORT CLASSES ----------------

class SalesRecord {
    String employeeId;
    int dealsClosed;
    double revenue;

    SalesRecord(String employeeId, int dealsClosed, double revenue) {
        this.employeeId = employeeId;
        this.dealsClosed = dealsClosed;
        this.revenue = revenue;
    }
}

class EmployeeSalesSummary {
    String employeeId;

    int dealsQ1, dealsQ2, dealsQ3;
    double revenueQ1, revenueQ2, revenueQ3;

    EmployeeSalesSummary(String employeeId) {
        this.employeeId = employeeId;
    }

    void addQ1(int deals, double revenue) {
        dealsQ1 += deals;
        revenueQ1 += revenue;
    }

    void addQ2(int deals, double revenue) {
        dealsQ2 += deals;
        revenueQ2 += revenue;
    }

    void addQ3(int deals, double revenue) {
        dealsQ3 += deals;
        revenueQ3 += revenue;
    }

    int totalDeals() {
        return dealsQ1 + dealsQ2 + dealsQ3;
    }

    double totalRevenue() {
        return revenueQ1 + revenueQ2 + revenueQ3;
    }

    double averageRevenuePerDeal() {
        int totalDeals = totalDeals();
        if (totalDeals == 0) return 0.0;
        return totalRevenue() / totalDeals; // safe double division
    }

    boolean dealsEachQuarterAtLeast(int minDeals) {
        return dealsQ1 >= minDeals && dealsQ2 >= minDeals && dealsQ3 >= minDeals;
    }

    @Override
    public String toString() {
        return "EmployeeSalesSummary{" +
                "employeeId='" + employeeId + '\'' +
                ", dealsQ1=" + dealsQ1 +
                ", dealsQ2=" + dealsQ2 +
                ", dealsQ3=" + dealsQ3 +
                ", totalDeals=" + totalDeals() +
                ", totalRevenue=" + totalRevenue() +
                ", avgRevPerDeal=" + averageRevenuePerDeal() +
                '}';
    }
}

🧾 What’s the Output for Your Sample Data?

With your sample input:

  • E1 total deals = 10 + 9 + 11 = 30
    avg revenue per deal = (60000+55000+65000)/30 = 180000/30 = 6000
    each quarter deals ≥ 5 ✅
    total revenue = 180000

  • E2 total deals = 8 + 12 + 10 = 30
    avg revenue per deal = (30000+70000+50000)/30 = 150000/30 = 5000 ❌ (must be > 5000)

  • E3 total deals = 12 + 10 + 9 = 31
    avg revenue per deal = (80000+60000+55000)/31 ≈ 6290
    each quarter deals ≥ 5 ✅
    total revenue = 195000

✅ Result:

Top Sales Executives: [E3, E1]

✅ Tips & Interview Notes

1) Don’t mix up “average revenue per deal”

The rule is:

average revenue per deal across ALL quarters

So compute:

totalRevenue / totalDeals

Not quarter averages.

2) Beware >= vs >

  • The problem says average revenue per deal > 5000

  • That means 5000 exactly should FAIL (like E2 above)

3) Avoid integer division

Here we’re safe because totalRevenue is double.
If revenue was int, you’d need:

(double) totalRevenue / totalDeals

4) Sorting tie-breakers (bonus)

If total revenue ties, you can stabilize output:

  • higher total deals wins

  • then alphabetical employeeId

Interviewers like deterministic sorting.

5) Make rules configurable

Putting thresholds in constants makes the function reusable:

  • change to Top 10

  • change min deals from 30 to 20

  • etc.


⏱ Complexity

Let N be total number of sales records across q1, q2, q3, and P employees.

  • Aggregation: O(N)

  • Sorting: O(P log P)

  • Memory: O(P)

This is optimal for the problem.


Optional Enhancements (If Asked)

  • Read from actual CSVs using BufferedReader (see the sketch after this list)

  • Support any number of quarters (use arrays instead of dealsQ1/Q2/Q3)

  • Add unit tests:

    • avg exactly 5000

    • missing quarter data

    • fewer than 5 eligible employees
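
For the CSV-reading enhancement flagged above, a minimal sketch, assuming a header-less, comma-separated format of employeeId,dealsClosed,revenue per line:

import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class SalesCsvReader {

    // Parses one quarterly file (e.g., "q1_sales.csv") into SalesRecord objects.
    static List<SalesRecord> read(String path) throws IOException {
        List<SalesRecord> records = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(Paths.get(path))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] parts = line.split(",");
                if (parts.length != 3) continue; // skip rows with the wrong number of fields
                records.add(new SalesRecord(
                        parts[0].trim(),
                        Integer.parseInt(parts[1].trim()),
                        Double.parseDouble(parts[2].trim())));
            }
        }
        return records;
    }
}

Whether malformed rows should be skipped, logged, or rejected outright is exactly the kind of clarifying question interviewers like to hear.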


Selecting Top Players Across 3 Months in Java (Map Aggregation + Streams)

When match data is split across multiple months, the cleanest way to compute “best players” is to:

  1. Aggregate all records into a per-player summary (matches + total score, month-wise counts)

  2. Filter by eligibility rules

  3. Sort by average score

  4. Pick the top K

This post walks through a simple, interview-friendly Java solution using a HashMap + a Streams pipeline.


Problem

You’re given match performance data for the last 3 months:

  • month1.csv

  • month2.csv

  • month3.csv

Each record is:

(playerId, score)

Each record represents one match played by that player.


Eligibility Rules (as used in the code below)

A player is eligible if:

  1. Played at least 4 matches overall (across all months)

  2. Played at least 1 match in each month

  3. Their average score across all matches is greater than 50

  4. From eligible players, return the TOP 3 by average score (descending)

Want the original stricter rules (like 20 overall and 5 per month)?
This solution is parameterized, so you only need to change the thresholds.


Example Input

month1:
(P1, 60), (P2, 45), (P3, 70)

month2:
(P1, 55), (P1, 56), (P2, 50), (P3, 75)

month3:
(P1, 65), (P2, 55), (P3, 68), (P3, 69)

Expected Output

["P3", "P1"]

Why?

  • P1 scores: 60, 55, 56, 65 → average = 59.0 ✅ (played all months + 4 matches)

  • P2 scores: 45, 50, 55 → average = 50.0 ❌ (must be > 50 and also only 3 matches)

  • P3 scores: 70, 75, 68, 69 → average = 70.5 ✅

Sorted by average score: P3, then P1


Approach

Step 1: Aggregate into a per-player summary

We build a map:

Map<String, PlayerScoreSummary> summaryByPlayer

Each player summary tracks:

  • matchesByMonth[m]

  • scoreByMonth[m]

This makes it easy to check:

  • “played each month”

  • “total matches”

  • “average score”

Step 2: Filter + sort + limit

After aggregation:

  • filter players that fail rules

  • sort by average score descending

  • take top K


Final Java Solution (Single File, Runnable)

import java.util.*;
import java.util.stream.Collectors;

public class TopPlayersSelectorBlog {

    // ---------------- SAMPLE INPUT ----------------
    static List<PlayerScore> month1 = Arrays.asList(
            new PlayerScore("P1", 60),
            new PlayerScore("P2", 45),
            new PlayerScore("P3", 70)
    );

    static List<PlayerScore> month2 = Arrays.asList(
            new PlayerScore("P1", 55),
            new PlayerScore("P1", 56),
            new PlayerScore("P2", 50),
            new PlayerScore("P3", 75)
    );

    static List<PlayerScore> month3 = Arrays.asList(
            new PlayerScore("P1", 65),
            new PlayerScore("P2", 55),
            new PlayerScore("P3", 68),
            new PlayerScore("P3", 69)
    );

    public static void main(String[] args) {
        List<List<PlayerScore>> months = List.of(month1, month2, month3);

        // Rules (easy to change)
        int minTotalMatches = 4;
        int minMatchesEachMonth = 1;
        double minAvgScore = 50.0;   // must be strictly greater than 50
        int topK = 3;

        List<String> selected = selectTopPlayers(
                months,
                minTotalMatches,
                minMatchesEachMonth,
                minAvgScore,
                topK
        );

        System.out.println("Selected Players: " + selected);
        // Expected: [P3, P1]
    }

    /**
     * Returns topK playerIds by average score (descending),
     * after applying eligibility rules.
     */
    static List<String> selectTopPlayers(
            List<List<PlayerScore>> months,
            int minTotalMatches,
            int minMatchesEachMonth,
            double minAvgScoreExclusive,
            int topK
    ) {
        int monthCount = months.size();
        Map<String, PlayerScoreSummary> summaryByPlayer = new HashMap<>();

        // 1) Aggregate (single pass over each month)
        for (int m = 0; m < monthCount; m++) {
            for (PlayerScore ps : months.get(m)) {
                summaryByPlayer
                        .computeIfAbsent(ps.playerId, id -> new PlayerScoreSummary(id, monthCount))
                        .addMatch(m, ps.score);
            }
        }

        // 2) Filter + sort + limit + map to playerId
        return summaryByPlayer.values().stream()
                .filter(s -> s.totalMatches() >= minTotalMatches)
                .filter(s -> s.matchesEachMonthAtLeast(minMatchesEachMonth))
                .filter(s -> s.averageScore() > minAvgScoreExclusive)
                .sorted(
                        Comparator.comparingDouble(PlayerScoreSummary::averageScore).reversed()
                                // optional tie-breaker for stability
                                .thenComparing(PlayerScoreSummary::totalMatches, Comparator.reverseOrder())
                                .thenComparing(s -> s.playerId)
                )
                .limit(topK)
                .map(s -> s.playerId)
                .collect(Collectors.toList());
    }

    // ---------------- DATA MODELS ----------------

    static class PlayerScore {
        final String playerId;
        final int score;

        PlayerScore(String playerId, int score) {
            this.playerId = playerId;
            this.score = score;
        }
    }

    static class PlayerScoreSummary {
        final String playerId;
        final int[] matchesByMonth;
        final int[] scoreByMonth;

        PlayerScoreSummary(String playerId, int monthCount) {
            this.playerId = playerId;
            this.matchesByMonth = new int[monthCount];
            this.scoreByMonth = new int[monthCount];
        }

        void addMatch(int monthIndex, int score) {
            matchesByMonth[monthIndex]++;
            scoreByMonth[monthIndex] += score;
        }

        int totalMatches() {
            int sum = 0;
            for (int c : matchesByMonth) sum += c;
            return sum;
        }

        int totalScore() {
            int sum = 0;
            for (int s : scoreByMonth) sum += s;
            return sum;
        }

        double averageScore() {
            int matches = totalMatches();
            if (matches == 0) return 0.0;
            return (double) totalScore() / matches;  // avoid integer division
        }

        boolean matchesEachMonthAtLeast(int minMatchesEachMonth) {
            for (int c : matchesByMonth) {
                if (c < minMatchesEachMonth) return false;
            }
            return true;
        }
    }
}

Complexity

Let N be the total number of match records across all months.

  • Aggregation: O(N)

  • Sorting eligible players: O(P log P) where P is number of players

  • Space: O(P) for the per-player summaries

In interviews, this is typically considered optimal and clean.


Common Pitfalls (and how this code avoids them)

1) Integer division bugs

If you do totalScore / totalMatches with integers, you truncate.
We use:

(double) totalScore / totalMatches

2) Missing months

If a player never appears in a month, their matchesByMonth[m] stays 0, and they fail the per-month filter.

3) Hardcoding month fields (scoreA/scoreB/scoreC)

Instead of scoreA, scoreB, scoreC, we use arrays so the code is:

  • less repetitive

  • easy to scale from 3 months to N months


How to Switch to the Original Stricter Rules

Just change the parameters:

int minTotalMatches = 20;
int minMatchesEachMonth = 5;
double minAvgScore = 50.0;
int topK = 3;

Everything else stays the same.


Interview Add-ons (If You Want to Impress)

  • Read CSVs (BufferedReader) → convert to List<PlayerScore>

  • Handle malformed rows safely

  • If the dataset is massive:

    • stream file line-by-line

    • aggregate in a map without storing all rows

  • Add unit tests for:

    • “exactly 50 average should fail”

    • “missing one month should fail”

    • “fewer than 3 eligible players”
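
For the first test idea above ("exactly 50 average should fail"), a minimal sketch, assuming JUnit 5 is on the classpath and the test class lives in the same package as TopPlayersSelectorBlog:

import static org.junit.jupiter.api.Assertions.assertTrue;

import java.util.List;
import org.junit.jupiter.api.Test;

class TopPlayersSelectorBlogTest {

    @Test
    void averageOfExactlyFiftyShouldFail() {
        // One player, one match per month, every score exactly 50 -> average is 50.0
        List<List<TopPlayersSelectorBlog.PlayerScore>> months = List.of(
                List.of(new TopPlayersSelectorBlog.PlayerScore("P9", 50)),
                List.of(new TopPlayersSelectorBlog.PlayerScore("P9", 50)),
                List.of(new TopPlayersSelectorBlog.PlayerScore("P9", 50)));

        // Thresholds: 3 total matches, 1 per month, average must be strictly greater than 50
        List<String> selected = TopPlayersSelectorBlog.selectTopPlayers(months, 3, 1, 50.0, 3);

        assertTrue(selected.isEmpty(), "an average of exactly 50 must not pass a strict > 50 rule");
    }
}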


Java Interview Deep‑Dive Blog (Pragmatic Play)

(DSA + HashMap/ConcurrentHashMap Internals + Polymorphism + Best Answers + Code + Learning Roadmap)

This blog is written as a complete interview-prep note based on the questions you described (DSA “move zeros in-place”, HashMap / ConcurrentHashMap internals, equals/hashCode, and polymorphism + overload resolution). It’s designed so freshers and experienced devs can both use it as a “ready-to-speak” guide.


1) Typical Interview Flow (What they were testing)

1. Self introduction (“Tell me about yourself”)

What they check:

  • Can you communicate clearly?

  • Can you summarize your work/learning?

  • Do you understand your own projects?

Expected answer (simple):

  • Name + current role/experience

  • Tech stack

  • 1–2 projects + impact

  • Why this role/company

Good answer:

  • Adds measurable impact (“reduced API latency by 30%”, “handled 10k RPM”, etc.)

  • Mentions 1 challenge + how you solved it

Best answer (interviewer loves this):

  • 60–90 seconds

  • Clear story: “I’m a Java backend developer… I built X… I enjoy Y… I’m now looking for Z…”

  • Ends with a bridge: “That’s why this role matches me: Java + concurrency + problem solving.”


2) DSA Problem: Move Zeros In‑Place (Order may / may not be required)

You gave this input:

{1,0,2,0,5,0,0,3,0,-2,-1}

And one possible output shown was:

{1,2,5,3,-2,-1,0,0,0,0,0}

That output preserves the relative order of non‑zero elements (stable).

But you later clarified an important twist:

“It should be in-place… and sequence should not be maintained.”

That means the interviewer may have been okay with an unstable solution (faster in practice, fewer writes).

The best move in interviews

Always ask one clarifying question:

  • “Should I preserve the order of non-zero elements, or any order is fine?”

That instantly shows maturity.


2.1 Expected Answer (Stable, preserves order of non‑zeros)

Approach: write all non-zeros forward, then fill remaining with zeros.

✅ Pros: stable, simple
✅ Time: O(n)
✅ Space: O(1)
⚠️ Writes: can be ~n (still fine)

Java Code (Stable)

import java.util.Arrays;

public class MoveZerosStable {
    public static void moveZerosStable(int[] arr) {
        int write = 0;

        // Move non-zeros forward (keeps order)
        for (int x : arr) {
            if (x != 0) arr[write++] = x;
        }

        // Fill rest with zeros
        while (write < arr.length) {
            arr[write++] = 0;
        }
    }

    public static void main(String[] args) {
        int[] arr = {1,0,2,0,5,0,0,3,0,-2,-1};
        moveZerosStable(arr);
        System.out.println(Arrays.toString(arr));
        // [1, 2, 5, 3, -2, -1, 0, 0, 0, 0, 0]
    }
}

Expected explanation in interview:

  • “I compact non-zeros using a write pointer.”

  • “Then I fill remaining positions with zeros.”

  • “O(n) time, O(1) space.”


2.2 Best Answer (Unstable, order NOT required, fewer writes)

If order does not matter, you can do a two‑pointer swap:

  • i from start

  • j from end

  • If arr[i] == 0, swap with arr[j] and decrement j

  • Otherwise increment i

  • Also skip trailing zeros at j

✅ Pros: in-place, single pass feel, fewer writes
✅ Time: O(n)
✅ Space: O(1)
⚠️ Order changes (which is allowed in this variant)

Java Code (Unstable)

import java.util.Arrays;

public class MoveZerosUnstable {
    public static void moveZerosUnstable(int[] arr) {
        int i = 0, j = arr.length - 1;

        while (i < j) {
            // Move j left past zeros
            while (i < j && arr[j] == 0) j--;

            if (arr[i] == 0) {
                // swap arr[i] and arr[j]
                int temp = arr[i];
                arr[i] = arr[j];
                arr[j] = temp;
                j--;
            } else {
                i++;
            }
        }
    }

    public static void main(String[] args) {
        int[] arr = {1,0,2,0,5,0,0,3,0,-2,-1};
        moveZerosUnstable(arr);
        System.out.println(Arrays.toString(arr));
        // One valid output example (order may differ):
        // [1, -1, 2, -2, 5, 3, 0, 0, 0, 0, 0]
    }
}

Best explanation:

  • “If order isn’t required, I’ll swap zeros with the last non-zero from the end.”

  • “Still O(n), but fewer writes and simpler than shifting.”


2.3 Common Follow‑Up Variant (If they bring “negative numbers to other side”)

Sometimes interviewers extend it into a 3‑way partition:

  • negatives on one side

  • zeros in the middle

  • positives on the other side

This is basically the Dutch National Flag pattern.

Java Code (Negatives | Zeros | Positives) — Unstable

import java.util.Arrays;

public class ThreeWayPartition {
    public static void partitionNegZeroPos(int[] arr) {
        int low = 0, mid = 0, high = arr.length - 1;

        while (mid <= high) {
            if (arr[mid] < 0) {
                swap(arr, low++, mid++);
            } else if (arr[mid] == 0) {
                mid++;
            } else { // > 0
                swap(arr, mid, high--);
            }
        }
    }

    private static void swap(int[] arr, int i, int j) {
        int t = arr[i];
        arr[i] = arr[j];
        arr[j] = t;
    }

    public static void main(String[] args) {
        int[] arr = {1,0,2,0,5,0,0,3,0,-2,-1};
        partitionNegZeroPos(arr);
        System.out.println(Arrays.toString(arr));
        // Output for this input: [-1, -2, 0, 0, 0, 0, 0, 3, 5, 2, 1]
    }
}

3) HashMap Internals (What interviewers expect you to know)

3.1 What HashMap is (best speaking answer)

“HashMap is a hash-table based Map implementation. It stores entries in an internal bucket array, uses hashCode() + equality checks to locate keys, allows null key/value, is not synchronized, and does not guarantee ordering.” (Oracle Docs)

Oracle explicitly notes:


3.2 equals(), hashCode(), hash value — how to explain clearly

The “contract” (must say)

  • If a.equals(b) is true → a.hashCode() == b.hashCode() must be true.

  • Same hashCode does not mean equals is true (collisions exist).

How HashMap uses them:

  1. Compute hashCode()

  2. Map it to a bucket index (implementation detail)

  3. If multiple keys land in same bucket, compare:

    • hash (quick check)

    • then equals() (final check)

  4. If equals matches → update value; else add new node

Best warning example (interviewer loves it):

  • “If you override equals() but not hashCode(), HashMap lookups can fail or behave incorrectly.”


3.3 Collisions: linked list → tree (Java 8+)

When collisions become frequent, HashMap can convert a bucket from a linked structure to a balanced tree to improve performance. That design was introduced to handle frequent collisions more efficiently. (OpenJDK)

Interview line:

“Worst-case lookup becomes closer to O(log n) in heavily-collided buckets due to tree bins.” (OpenJDK)


3.4 Resizing (rehash) — what to mention

  • HashMap resizes when size > capacity * loadFactor

  • Default load factor is 0.75

  • Resizing is expensive, so if you know approximate size, pass initial capacity (Oracle Docs)
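
A tiny sketch of the pre-sizing advice (the 0.75 load factor is the documented default; the capacity math is the usual expectedEntries / loadFactor rule of thumb):

import java.util.HashMap;
import java.util.Map;

public class PreSizedMapDemo {
    public static void main(String[] args) {
        int expectedEntries = 10_000;
        // Capacity must exceed expectedEntries / 0.75 so the resize threshold is never crossed.
        int initialCapacity = (int) (expectedEntries / 0.75f) + 1;

        Map<String, Integer> map = new HashMap<>(initialCapacity);
        for (int i = 0; i < expectedEntries; i++) {
            map.put("key-" + i, i); // no rehash on the way to 10,000 entries
        }
        System.out.println(map.size());
    }
}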


3.5 Fail-fast iterators

HashMap iterators are fail-fast (they can throw ConcurrentModificationException if structurally modified during iteration). This is documented behavior. (Oracle Docs)


4) ConcurrentHashMap (Need + internal behavior + best explanation)

4.1 Why ConcurrentHashMap is needed

HashMap is not thread-safe. In concurrent updates you can get:

  • Lost updates

  • Data races

  • Corrupted internal structure

So ConcurrentHashMap provides a thread-safe Map with high throughput.


4.2 The most important line to say (from Java docs)

“Retrieval operations do not entail locking.” (Oracle Docs)

That means:

  • get() is designed to be fast under concurrency

  • reads can overlap with writes safely

Also:

  • It does not support locking the entire table to block all access (no global “stop the world” table lock). (Oracle Docs)


4.3 Null handling difference

ConcurrentHashMap does not allow null keys or values (unlike HashMap). (Oracle Docs)

Best explanation:

  • In CHM, null is useful as a special meaning: “no mapping present”, so allowing null would break atomic methods and concurrent semantics.


4.4 Segment locking vs modern Java

You told them “segments + segment locking” — that was correct historically (older CHM designs), but interviewers today usually expect the Java 8+ design (more fine-grained, not segment-based).

Best safe way to answer in interview:

“Earlier versions used segmented locking. Modern implementations focus on lock-free reads and fine-grained coordination for updates, avoiding a single global lock.” (Oracle Docs)

(That phrasing is accurate and doesn’t trap you into version-specific internals.)


4.5 How CHM gives good reads + good writes

Reads (get):

  • Retrieval operations do not entail locking, so get() stays fast even while other threads update the map

  • Reads reflect the results of the most recently completed updates at the time the read begins

Writes (put/update):

  • Thread-safe updates

  • Avoids locking entire map

  • Supports atomic compound ops: computeIfAbsent, putIfAbsent, etc.

Java docs even highlight scalable frequency-map usage with LongAdder and computeIfAbsent: (Oracle Docs)

Example:

freqs.computeIfAbsent(key, k -> new LongAdder()).increment();

(That exact idea is from the docs.) (Oracle Docs)
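
A slightly fuller, runnable sketch of that pattern (the sample words are made up for illustration):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.stream.IntStream;

public class ConcurrentFrequencyDemo {
    public static void main(String[] args) {
        Map<String, LongAdder> freqs = new ConcurrentHashMap<>();
        String[] words = {"alpha", "beta", "alpha", "gamma", "alpha", "beta"};

        // Parallel updates are safe: computeIfAbsent is atomic, and LongAdder absorbs contention on hot keys.
        IntStream.range(0, words.length).parallel()
                 .forEach(i -> freqs.computeIfAbsent(words[i], k -> new LongAdder()).increment());

        freqs.forEach((word, count) -> System.out.println(word + " -> " + count.sum()));
    }
}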


5) Polymorphism Deep Dive (what they were really asking)

Polymorphism = “same call, different behavior”.

5.1 Types of polymorphism you should say

  1. Compile-time polymorphism → Method overloading

  2. Runtime polymorphism → Method overriding

  3. Parametric polymorphismGenerics (often a bonus mention)


6) The int/float/double overload question (compiler decision)

They asked something like:

“If you have methods with int, float, double… which gets called?”

This is compile-time method overload resolution.

6.1 Key rule

Overloading is resolved by the compiler using the most specific applicable method rule. (Oracle Docs)

6.2 Practical rules (in the order compiler prefers)

  1. Exact match

  2. Widening primitive conversion (int → long → float → double)

  3. Boxing (int → Integer)

  4. Varargs (last resort)

6.3 Example you can speak in interview

void f(int x) {}
void f(float x) {}
void f(double x) {}

f(10);     // calls f(int) - exact
f(10.0f);  // calls f(float) - exact
f(10.0);   // calls f(double) - exact

Best explanation line:

“The compiler picks the most specific method available; it prefers exact match, then widening, then boxing, then varargs.” (Oracle Docs)
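
To make rules 2-4 concrete, a small sketch with hypothetical overloads where no exact int match exists:

public class OverloadResolutionDemo {

    static void g(long x)    { System.out.println("long (widening)"); }
    static void g(Integer x) { System.out.println("Integer (boxing)"); }
    static void g(int... x)  { System.out.println("varargs"); }

    public static void main(String[] args) {
        int i = 10;
        g(i); // prints "long (widening)": widening beats boxing, and boxing beats varargs
    }
}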


7) “Expected vs Good vs Best” Answers (Interview-ready scripts)

Q1: “Explain HashMap internal working”

Expected:

  • Buckets + hashing + collisions

Good:

  • Mentions load factor + resizing, not thread-safe, allows null

Best:

  • Adds collision-to-tree optimization and fail-fast iterators, and explains equals/hashCode contract
    (and quotes: not synchronized + null allowed + ordering not guaranteed) (Oracle Docs)


Q2: “equals vs hashCode”

Expected:

  • If equals true then hashCode same

Good:

  • Explains collision possibility

Best:

  • Gives bug scenario: override equals only → HashMap lookups fail


Q3: “Why ConcurrentHashMap? How it works?”

Expected:

  • Thread-safe map, better than synchronized HashMap

Good:

  • Mentions better concurrency than Hashtable, no global lock

Best:

  • Says: “Reads don’t lock” (Oracle Docs)

  • Says: “No null keys/values” (Oracle Docs)

  • Mentions atomic operations like computeIfAbsent (and the LongAdder pattern) (Oracle Docs)


Q4: “Explain polymorphism”

Expected:

  • Overloading + overriding

Good:

  • Compile-time vs runtime

Best:

  • Adds: “Overloading is compile-time (compiler picks most specific). Overriding is runtime dispatch based on actual object.” (Oracle Docs)


8) Mini “Source Code Pack” You Can Keep in Your Notes

A) Move zeros (stable)

(Already given above)

B) Move zeros (unstable, best when order not required)

(Already given above)

C) equals/hashCode demo (HashMap bug)

import java.util.*;

class User {
    int id;
    User(int id) { this.id = id; }

    @Override public boolean equals(Object o) {
        if (!(o instanceof User)) return false;
        return this.id == ((User)o).id;
    }

    // Uncomment this to fix it
    // @Override public int hashCode() { return Integer.hashCode(id); }
}

public class EqualsHashCodeBug {
    public static void main(String[] args) {
        Map<User, String> map = new HashMap<>();
        map.put(new User(7), "Alice");

        System.out.println(map.get(new User(7))); // null if hashCode not overridden
    }
}

9) What to study next (Roadmap + Courses)

Track A (Freshers)

  1. Java basics + OOP + Collections

  2. DSA patterns (arrays, two pointers, hashing, sliding window)

  3. Core concurrency basics

  4. SQL + REST basics

  5. Mini backend project

Track B (Experienced)

  1. Deep collections internals (HashMap/CHM)

  2. Java Memory Model basics (visibility, happens-before)

  3. Threading + executors + locks

  4. System design basics (API design, caching, DB indexing)

  5. Performance thinking (latency vs throughput)


Recommended Courses (Practical + Recognized)

Java Foundation

  • Java Programming and Software Engineering Fundamentals (Duke / Coursera) (Coursera)

DSA in Java

  • Algorithms, Part I (Princeton / Coursera) (Coursera)
    (Strong for fundamentals like sorting/searching/data structures with performance analysis.)

Coding Interview Patterns

  • Grokking the Coding Interview Patterns (Educative) (Educative)

Java Multithreading / Concurrency

  • Java Multithreading, Concurrency & Performance Optimization (Udemy) (Udemy)
    (Very aligned with “volatile / concurrency / parallelism / performance” discussions.)


10) Final “Best Interview Tip” for This Exact Set of Questions

When they ask something broad like HashMap or polymorphism:

✅ Start with a clean high-level explanation (10–15 seconds)
✅ Then go 1 level deeper (internals / rules / trade-offs)
✅ Then give one example (tiny code or scenario)
✅ End with why it matters (performance, correctness, concurrency)

That’s how you sound “senior” even as a fresher.


From “Top‑K Frequent” to JVM Internals: A Full Interview Debrief with Clear Q&A + Follow‑ups (Univest)

This blog captures the same arc your interview followed: a frequency-counting coding problem → reflection & Spring Boot internals → singleton + reflection edge-cases → database choice & concurrency (Postgres/MySQL/Oracle/Mongo) → distributed locks → “heap is 99% full” production triage.

I’ll write every answer in a way a fresher can understand and include the deeper details that experienced folks are expected to know.


1) Coding Problem: “Print Top K Frequent Strings”

What you wrote (core idea)

You counted frequencies using a HashMap, then put unique keys into a max-heap (PriorityQueue) ordered by frequency.

That’s a standard and acceptable solution. The improvements are in complexity explanation, correctness edge cases, and scaling.


Interview Question 1

Q: Given a list of strings, return/print the top K most frequent strings.

Simple answer (fresher-friendly)

  1. Count how many times each string appears using a map.

  2. Put the strings in a heap ordered by frequency.

  3. Pop K times.

This works because the heap always gives you the most frequent remaining element.

Better answer (what interviewers expect)

Let:

  • n = total number of items

  • u = number of unique strings

  • k = number of top elements to output

Your approach is:

  • Counting: O(n) time, O(u) space

  • Building heap with all unique keys: addAll(u items)O(u log u) time

  • Polling k times: O(k log u) time (poll is not O(1) for a heap)

So overall: O(n + u log u + k log u) time, O(u) space.

That “n log n” vs “u log u” mix-up is a very common interview slip—fixing it makes your explanation look sharp.

What can be improved in your code (concrete fixes)

  1. Comparator overflow risk
    map.get(b) - map.get(a) can overflow if counts are large. Use Integer.compare(...) or Long.compare(...).

  2. Tie-breaking
    If two keys have the same frequency, the heap order is unstable. Add a tie-breaker if the interviewer asks for deterministic order (e.g., lexicographic).

  3. k bounds / null safety
    If k > u, poll() can return null. Guard it.

  4. Use long for counts
    Your comment mentions “1 billion”; counts can exceed Integer.MAX_VALUE in other variants. Use long.

  5. Better heap strategy when k is small
    Don’t heap all u keys if you only need top k.


A stronger “interview-ready” version (Top‑K with Min‑Heap of size K)

This is the version that usually impresses, because it’s more optimal when k << u:

  • Count: O(n)

  • Maintain a min-heap of size k over (freq, word):

    • Push each unique entry

    • If heap grows beyond k, pop the smallest

  • Complexity: O(n + u log k) time, O(u + k) space

import java.util.*;

public class TopKFrequentStrings {

    public static List<String> topKFrequent(String[] words, int k) {
        if (words == null || k <= 0) return List.of();

        Map<String, Long> freq = new HashMap<>();
        for (String w : words) {
            freq.merge(w, 1L, Long::sum);
        }

        // Min-heap by frequency; tie-break for deterministic output (optional)
        PriorityQueue<Map.Entry<String, Long>> minHeap =
                new PriorityQueue<>(
                        Comparator.<Map.Entry<String, Long>>comparingLong(Map.Entry::getValue)
                                .thenComparing(Map.Entry::getKey)
                );

        for (Map.Entry<String, Long> e : freq.entrySet()) {
            minHeap.offer(e);
            if (minHeap.size() > k) minHeap.poll();
        }

        List<String> result = new ArrayList<>(minHeap.size());
        while (!minHeap.isEmpty()) result.add(minHeap.poll().getKey());

        // heap gives smallest first; reverse to get most frequent first
        Collections.reverse(result);
        return result;
    }

    public static void main(String[] args) {
        String[] list = {"def","abc","def","abc","def","ghi"};
        System.out.println(topKFrequent(list, 3));
    }
}

When to “go deeper” (big input: “1 billion strings”)

If the interviewer really means 1B records, you usually can’t store them all in memory as a Java array.

Good directions to mention:

  • Streaming counts: Read input as a stream and update a map incrementally (see the sketch after this list).

  • External sort / MapReduce: Shard the input, count per shard, merge counts.

  • Approximate heavy hitters (if memory-bounded): Misra–Gries / Count–Min Sketch (great follow-up topic if the interviewer is senior).
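
The streaming-counts direction mentioned above, as a minimal sketch (the file name is hypothetical; one string per line is assumed):

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Stream;

public class StreamingFrequencyCount {
    public static void main(String[] args) throws IOException {
        Map<String, Long> freq = new HashMap<>();

        // Files.lines streams the file lazily, so all records never sit in memory at once;
        // only the map of unique strings (u entries) is held.
        try (Stream<String> lines = Files.lines(Paths.get("words.txt"))) {
            lines.forEach(w -> freq.merge(w, 1L, Long::sum));
        }

        System.out.println("unique strings: " + freq.size());
    }
}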


Related LeetCode questions (very close to this)

  • 347. Top K Frequent Elements

  • 692. Top K Frequent Words (adds lexicographic tie-breaking)

  • 451. Sort Characters By Frequency

  • 895. Maximum Frequency Stack (design + frequency tracking)


Follow-up questions I’d ask (as interviewer)

  1. If two strings have the same frequency, how do you order them (lexicographically)? (LeetCode 692 style)

  2. Optimize from O(u log u) to O(u log k)—why is it better?

  3. What if input doesn’t fit in memory? (streaming / distributed / approximate)

  4. How would you do it concurrently? (ConcurrentHashMap + LongAdder pattern)

  5. Can you do it in O(n) time? (Bucket sort approach; memory tradeoff)


2) Java Reflection: What, Why, Pain Points, and How It Works Internally

Interview Question 2

Q: What is reflection, and why do we need it?

Simple answer

Reflection lets Java code inspect and use classes at runtime:

  • List methods/fields/annotations

  • Create objects dynamically

  • Call methods dynamically

Frameworks use it to build “plug-and-play” behavior (e.g., dependency injection, serialization).
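
A minimal sketch of those three abilities (inspect, construct, invoke) using java.lang.String as the target:

import java.lang.reflect.Method;

public class ReflectionDemo {
    public static void main(String[] args) throws Exception {
        Class<?> clazz = Class.forName("java.lang.String");

        // 1) Inspect: list a declared method (just the first one, for brevity)
        Method first = clazz.getDeclaredMethods()[0];
        System.out.println("A declared method: " + first.getName());

        // 2) Construct dynamically via the public String(String) constructor
        Object s = clazz.getConstructor(String.class).newInstance("hello");

        // 3) Invoke a method dynamically
        Method toUpper = clazz.getMethod("toUpperCase");
        System.out.println(toUpper.invoke(s)); // HELLO
    }
}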

Deeper answer

Reflection is mainly used for:

  • Frameworks (DI containers, ORMs, serializers)

  • Libraries that work with unknown classes at compile time

  • Runtime tooling (debuggers, profilers, test frameworks)

But it comes with tradeoffs: performance overhead, weaker type safety, and potential security/encapsulation bypass.


Interview Question 3

Q: What’s the pain point of reflection?

Simple answer

It’s slower and less safe than normal method calls, and it can break encapsulation.

Deeper answer (real pain points)

  1. Performance overhead

    • Extra access checks (unless suppressed)

    • Boxing/unboxing + varargs arrays for Method.invoke

    • Harder for JVM to optimize vs direct calls (though modern JVMs got better)

  2. Security / encapsulation concerns
    Reflection can suppress access checks using setAccessible(true) (with restrictions). Oracle’s AccessibleObject docs describe that it can “suppress checks for Java language access control.”

  3. Maintainability
    Renaming a method/field breaks reflective code at runtime, not compile time.

  4. AOT / Native Image friendliness
    Reflection-heavy apps are harder to compile ahead-of-time because the compiler can’t see all runtime access paths. Spring’s AOT docs explicitly frame AOT as helping startup by using generated initialization code rather than doing everything dynamically at runtime.
    Spring Boot’s Native Image section explains key differences of native images (AOT compiled executables).


Interview Question 4

Q: How does reflection work internally, and where does the JIT come in?

Simple answer

Reflection uses class metadata stored in the JVM. When you call something like method.invoke(obj, args), the JVM uses that metadata to locate the method and execute it. The JIT can still optimize the underlying code paths once they become “hot.”

Deeper answer (what impresses)

  • Modern Java reflection internals evolved significantly.

  • OpenJDK’s JEP 416 reimplemented core reflection (Method, Constructor, Field) on top of method handles, aiming to reduce duplicated mechanisms and maintenance cost.

  • Once reflection uses method handles internally, HotSpot can JIT-compile those invocation paths as they get hot (just like other Java code).

Nice interview line:
“Reflection is dynamic, but it’s not ‘interpreted forever.’ Hot paths can still get optimized; however, it’s harder to get as clean as direct calls and can still carry extra overhead like argument adaptation.”


Follow-up questions I’d ask

  1. What is setAccessible(true) and why is it dangerous?

  2. Reflection vs dynamic proxies vs bytecode generation—when to use which?

  3. What breaks when moving to Java modules (Java 9+)?

  4. How does Spring reduce reflection overhead in newer versions? (AOT)


3) Spring Boot: Why Use It, Drawbacks, and Alternatives

Interview Question 5

Q: Why do we use Spring Boot? What problem does it solve?

Simple answer

Spring Boot makes building Spring applications faster by:

  • Auto-configuring common components

  • Providing “starter” dependencies

  • Running with an embedded server

  • Giving production features (metrics, health checks)

Deeper answer

Spring Boot’s auto-configuration tries to configure your app based on what’s on the classpath (e.g., if an embedded DB is present, it can auto-configure it).
It also “backs off” if you define your own beans (custom configuration overrides defaults). (This is a key concept interviewers like.)


Interview Question 6

Q: What are drawbacks of Spring Boot?

Simple answer

It can be heavier (startup time, memory), and sometimes feels “magical” which makes debugging harder.

Deeper answer

Typical drawbacks:

  • Classpath scanning + reflection (startup cost)

  • Large dependency graph (bigger footprint)

  • Hidden configuration unless you know where to look

  • Cold start concerns in serverless/container autoscaling scenarios

But: modern Spring has AOT processing and native images, which explicitly target startup time improvements by generating initialization code at build time.


Interview Question 7

Q: Alternatives to Spring Boot? Why would you choose them?

Options (with the “why”)

  • Quarkus: pushes work to build time (augmentation) to reduce runtime overhead; their performance page explains it scans and builds a model at build time to compute startup instructions.

  • Micronaut: designed to use DI/AOP with no runtime reflection, making it easier for GraalVM/native usage; Micronaut explicitly states its DI/AOP runtime uses no reflection.

  • Others you can mention (no need to over-explain unless asked): Dropwizard, Vert.x, Helidon, Ktor.


Follow-up questions I’d ask

  1. Explain Spring Boot auto-configuration and the “back off” behavior.

  2. What is Spring AOT and what restrictions does it introduce?

  3. Spring Boot JVM vs Native Image: tradeoffs?

  4. Compare Quarkus build-time model vs Spring runtime model.


4) Singleton Pattern + Reflection: Can It Be Broken?

Interview Question 8

Q: How does reflection break a singleton?

Simple answer

Even if the constructor is private, reflection can access it and create another instance.

Deeper answer

If you do:

  • Constructor<?> c = X.class.getDeclaredConstructor();

  • c.setAccessible(true);

  • c.newInstance();

…you can create a second instance unless you add protections.

(And yes, setAccessible is the classic “break encapsulation” lever.)


Interview Question 9

Q: If I create the singleton instance as static final, does that stop reflection?

Simple answer

Not by itself.

Better answer

  • static final (eager init) ensures thread safety and one instance via normal code paths.

  • But reflection can still call the constructor again unless the constructor blocks it.

A common mitigation:

  • Keep a static flag or check inside constructor and throw if instance already exists.
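
A minimal sketch of that guard (the class name is hypothetical):

public class GuardedSingleton {

    private static final GuardedSingleton INSTANCE = new GuardedSingleton();

    private GuardedSingleton() {
        // During class initialization INSTANCE is still null, so the first construction passes.
        // A later reflective call sees the existing instance and fails here.
        if (INSTANCE != null) {
            throw new IllegalStateException("Use getInstance(); reflective instantiation is blocked");
        }
    }

    public static GuardedSingleton getInstance() {
        return INSTANCE;
    }
}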

Important nuance (that interviewers like):
This mitigation helps, but reflection can sometimes still bypass if it’s called before class initialization in some tricky setups—or if attackers can modify private fields (again via reflection). So it’s a deterrent, not a perfect defense in hostile environments.


Interview Question 10

Q: Is enum singleton really unbreakable? The interviewer claimed “it can still be breakable.”

Simple answer

In normal Java, enum singletons are the safest.

Deeper, accurate answer

Java forbids reflectively instantiating enum types. Oracle’s tutorial shows attempting to instantiate an enum via reflection leads to an IllegalArgumentException and states enum instantiation is forbidden.

So in standard Java reflection, enum singleton is effectively “reflection-proof.”

But (to address the interviewer’s “claim” professionally):

  • With extreme techniques (internal JDK APIs, Unsafe, custom JVM args opening modules), you can sometimes do things Java normally forbids. That’s outside typical application threat models, but it’s good to acknowledge: “If the attacker can run arbitrary code with deep JVM access, many guarantees can be bypassed.”


Follow-up questions I’d ask

  1. How does serialization break singleton, and how do you fix it? (readResolve)

  2. Can cloning break singleton? How do you prevent it?

  3. Singleton in DI frameworks (Spring): do you still need manual singleton?

  4. When is singleton a bad idea? (testability, hidden dependencies)


5) Databases: Choosing Between MySQL, Postgres, Oracle, Mongo + Concurrency

You said you “played CAP theorem.” That’s a common move, but interviewers often want practical DB-engine reasoning: consistency/isolation model, locking, indexing, operational features, licensing/support.

Interview Question 11

Q: How do you choose between MySQL and PostgreSQL?

Simple answer

Pick based on:

  • Your query complexity

  • Needed features

  • Team expertise

  • Operational constraints

Deeper answer (useful decision points)

PostgreSQL

  • Strong concurrency story with MVCC (documented in the concurrency control chapter).

  • Needs maintenance like VACUUM because old row versions accumulate (Postgres docs describe why routine vacuuming matters).

  • Isolation levels are well defined; Postgres default is Read Committed and the docs explain snapshot visibility rules.

  • Architecture: “process per user” model (one backend process per connection).

MySQL (InnoDB)

  • InnoDB uses “consistent reads” and snapshot behavior differs by isolation level; MySQL docs explain how REPEATABLE READ and READ COMMITTED affect consistent reads.

  • Generally great for many OLTP workloads; ecosystem maturity is strong.

A clean interview positioning

  • “If we need advanced SQL, strong extension ecosystem, complex queries, and richer indexing types → Postgres.”

  • “If we want simpler operational patterns in some orgs, wide community tooling, and a straightforward OLTP setup → MySQL.”

  • Then ask: “What’s the workload? write-heavy? read-heavy? latency SLA? HA requirements?”


Interview Question 12

Q: How does PostgreSQL handle concurrency internally (MVCC + isolation)?

Simple answer

Postgres keeps multiple versions of rows so readers don’t block writers (most of the time). That’s MVCC.

Deeper answer

  • Postgres MVCC behavior is documented under Concurrency Control.

  • Isolation levels (Read Committed / Repeatable Read / Serializable) define what snapshot you see; Postgres docs clearly describe that Read Committed sees a snapshot per query, while Repeatable Read uses a snapshot from transaction start.

  • Because old row versions remain, VACUUM is needed to reclaim space and prevent bloat; the docs explain dead row versions and vacuum behavior.


Interview Question 13

Q: MongoDB concurrency/consistency—how do read/write concerns work?

Simple answer

MongoDB lets you tune how consistent reads are and how durable writes are.

Deeper answer

  • Write concern controls acknowledgment requirements.

  • Read concern "snapshot" gives snapshot guarantees only when the transaction commits with write concern "majority".
    This is a good point to mention when someone says “NoSQL is always eventually consistent” — MongoDB lets you choose the consistency/durability tradeoff depending on read/write concern and topology.


Interview Question 14

Q: Why is Oracle paid? Is it because Oracle is “more consistent” than Postgres?

Simple answer

No—both can be strongly consistent and ACID. Oracle is paid because of licensing + enterprise-grade features + paid support.

Deeper answer with specifics

  • Oracle has formal licensing documents detailing editions, entitlements, and restrictions.

  • Oracle also provides structured support tiers (Premier/Extended/Sustaining) described in its Lifetime Support Policy.

  • Oracle RAC is a flagship HA/scale architecture positioned as a major enterprise feature.
    So the “paid” part is commercial licensing + bundled enterprise tooling + support lifecycle + certain advanced features (often HA, clustering, management packs, etc.), not “Oracle is the only consistent DB.”

Meanwhile PostgreSQL is open source under the PostgreSQL License, and the project states it’s committed to remaining free and open source.


Follow-up questions I’d ask

  1. Explain MVCC vs locking reads in Postgres. When do readers still block writers?

  2. What causes table bloat in Postgres and how do you fix it?

  3. In MySQL InnoDB, what changes between REPEATABLE READ and READ COMMITTED?

  4. How would you design HA for each DB (replication, failover, RPO/RTO)?

  5. When does CAP theorem actually apply in DB selection? (distributed partitions)


6) Distributed Locking: “Redlock / Redis Locks” Style Question

Interview Question 15

Q: How do distributed locks work (e.g., Redis)?

Simple answer

A distributed lock makes sure only one process across many machines enters a critical section.

Deeper answer

Redis documents a distributed lock pattern and explains why you need it when multiple processes coordinate shared resources.

Key interview points:

  • A lock must have expiry (to avoid deadlocks if the holder crashes).

  • Unlock must be safe (only the owner can unlock).

  • In distributed systems, clock drift, network delays, and failover make “perfect locks” hard.
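
A rough sketch of the acquire/release pattern, assuming the Jedis client (method names can vary slightly between Jedis versions):

import java.util.Collections;
import java.util.UUID;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.params.SetParams;

public class RedisLockSketch {

    // Delete the key only if it still holds our token (atomic compare-and-delete on the server).
    private static final String UNLOCK_SCRIPT =
            "if redis.call('get', KEYS[1]) == ARGV[1] then return redis.call('del', KEYS[1]) else return 0 end";

    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            String lockKey = "lock:order:42";               // hypothetical resource
            String token = UUID.randomUUID().toString();    // identifies this lock holder

            // SET key token NX EX 30: succeeds only if the key is absent; the expiry
            // prevents a crashed holder from dead-locking everyone else.
            String reply = jedis.set(lockKey, token, SetParams.setParams().nx().ex(30));

            if ("OK".equals(reply)) {
                try {
                    // ... critical section ...
                } finally {
                    jedis.eval(UNLOCK_SCRIPT,
                            Collections.singletonList(lockKey),
                            Collections.singletonList(token));
                }
            }
        }
    }
}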


Follow-up questions I’d ask

  1. What happens if the client pauses (GC stop-the-world) while holding the lock?

  2. How do you prevent “unlocking someone else’s lock”?

  3. When should you avoid distributed locks and use DB constraints / idempotency instead?


7) Production Triage: “Heap is 99% Full” (What Do You Do?)

Interview Question 16

Q: Suppose JVM heap is ~99% utilized and errors are happening. What do you do?

Simple answer

  • Restart might reduce symptoms, but first find the root cause.

  • Check logs/metrics, take a heap dump, identify the leak or high allocation.

Strong step-by-step answer (what seniors say)

  1. Stabilize

    • If OOM is imminent, reduce load (rate limit), scale out, or temporarily increase heap to stop the bleeding (but don’t stop there).

  2. Decide: leak vs spike vs mis-sizing

    • Leak: heap grows steadily and doesn’t come down after GC cycles.

    • Spike: heap jumps during bursts and falls after GC.

    • Mis-sizing: heap too small for normal steady-state.

  3. Collect evidence

    • Heap dump at high-water mark, GC logs, thread dumps.

    • Use JDK Flight Recorder (JFR) for low-overhead profiling in production-style environments. JFR is described as an observability/monitoring framework built into the JVM.

    • If memory pressure might be off-heap/native, use Native Memory Tracking (NMT); Oracle docs describe NMT and that you access it via jcmd.

  4. Tune GC only after identifying the problem

    • Default in many JDKs is G1; good general-purpose.

    • For low-latency large heaps, consider ZGC or Shenandoah:

      • ZGC: scalable low-latency collector (OpenJDK project page).

      • Shenandoah: low-pause collector (JEP 189).

    • If interviewer mentions “GC1” they likely mean G1GC (common).

  5. Fix

    • Remove unbounded caches

    • Fix object retention (static maps, listeners, threadlocals)

    • Add backpressure for high allocation paths

    • Add memory/GC dashboards + alerts


Follow-up questions I’d ask

  1. How do you distinguish heap leak vs native memory leak? (heap dump vs NMT)

  2. What do you look for in a heap dump?

  3. When do you pick ZGC vs G1 vs Shenandoah?

  4. What metrics/alerts would you add so it never surprises you again?


Closing: How to Answer These in an Interview (Fast + Clear)

A good pattern that works for both freshers and experienced candidates:

  1. Define the concept in one sentence.

  2. Give a practical example (where it’s used in real systems).

  3. Discuss tradeoffs (pros/cons).

  4. Offer a better alternative / optimization when applicable.

If you are targeting a specific level (SDE‑1, SDE‑2, or Senior), tune the depth of these answers to that level: same topics, adjusted depth.