January 13, 2026

Mastering Advanced SQL Interview Questions

A Practical Guide for Freshers & Experienced Engineers

SQL interviews are not about memorizing syntax — they test data thinking, edge-case handling, and real-world querying skills.

This blog covers frequently asked SQL interview problems, explained step-by-step with:

  • clear intent

  • correct SQL

  • beginner-friendly explanations

  • advanced variations for experienced candidates

You can revisit this blog anytime — it’s written for long-term learning.


1️⃣ Delete Duplicate Records While Keeping One

📌 Problem

A table contains duplicate rows. You need to delete duplicates but keep one record per group.

Assume a table:

Employees(id, email)

✅ Solution Using ROW_NUMBER() (Best Practice)

DELETE FROM Employees
WHERE id IN (
    SELECT id
    FROM (
        SELECT id,
               ROW_NUMBER() OVER (PARTITION BY email ORDER BY id) AS rn
        FROM Employees
    ) t
    WHERE rn > 1
);

🧠 Explanation

  • PARTITION BY email groups duplicates

  • ROW_NUMBER() assigns 1, 2, 3…

  • Keep rn = 1, delete the rest

💡 Interview Tip

Always preview with SELECT before DELETE.


2️⃣ Find Employees Who Worked on All Projects

Tables:

Employees(emp_id)
EmployeeProjects(emp_id, project_id)
Projects(project_id)

✅ Solution Using GROUP BY + HAVING

SELECT emp_id
FROM EmployeeProjects
GROUP BY emp_id
HAVING COUNT(DISTINCT project_id) =
       (SELECT COUNT(*) FROM Projects);

🧠 Explanation

  • Count projects per employee

  • Compare with total project count

💡 Interview Tip

This “worked on all” / “matched all” pattern comes up constantly → remember it.


3️⃣ Customers With Most Orders but Lowest Total Spend (Last Month)

Table:

Orders(order_id, customer_id, amount, order_date)

✅ Step 1: Aggregate Last Month Data

WITH last_month_orders AS (
    SELECT customer_id,
           COUNT(*) AS order_count,
           SUM(amount) AS total_amount
    FROM Orders
    WHERE order_date >= DATE_TRUNC('month', CURRENT_DATE - INTERVAL '1 month')
      AND order_date < DATE_TRUNC('month', CURRENT_DATE)
    GROUP BY customer_id
)
SELECT *
FROM last_month_orders
ORDER BY order_count DESC, total_amount ASC;

🧠 Explanation

  • Highest orders → order_count DESC

  • Lowest spend → total_amount ASC

💡 Interview Tip

Ordering by multiple business conditions is very common.


4️⃣ Products With Sales Higher Than Average Monthly Sales

Tables:

Sales(product_id, sale_amount, sale_date)
Products(product_id, product_name)

✅ Using Subquery + JOIN

SELECT p.product_id, p.product_name
FROM Products p
JOIN (
    SELECT product_id,
           AVG(sale_amount) AS avg_sales
    FROM Sales
    GROUP BY product_id
) ps ON p.product_id = ps.product_id
WHERE ps.avg_sales >
      (SELECT AVG(sale_amount) FROM Sales);

🧠 Explanation

  • Inner query → avg per product

  • Subquery → global avg

  • Compare both

💡 Interview Tip

Interviewers love compare against average questions.


5️⃣ Students in Top 10% of Their Class (Window Functions)

Table:

Students(student_id, class_id, marks)

✅ Using PERCENT_RANK()

SELECT student_id, class_id, marks
FROM (
    SELECT student_id,
           class_id,
           marks,
           PERCENT_RANK() OVER (PARTITION BY class_id ORDER BY marks DESC) AS pr
    FROM Students
) t
WHERE pr <= 0.10;

🧠 Explanation

  • PERCENT_RANK() gives percentile

  • <= 0.10 → top 10%

💡 Interview Tip

For rank-based questions, always think window functions.


6️⃣ Suppliers With Products Cheaper Than Category Average

(Correlated Subquery + JOIN)

Tables:

Suppliers(supplier_id, name)
Products(product_id, supplier_id, category_id, price)

✅ Solution

SELECT DISTINCT s.supplier_id, s.name
FROM Suppliers s
JOIN Products p ON s.supplier_id = p.supplier_id
WHERE p.price <
      (
        SELECT AVG(p2.price)
        FROM Products p2
        WHERE p2.category_id = p.category_id
      );

🧠 Explanation

  • Subquery recalculates avg per category

  • Compares product price with category avg

💡 Interview Tip

This is a classic correlated subquery example.


7️⃣ Customers and Their Total Order Amount

(Include Customers With No Orders)

Tables:

Orders(order_id, customer_id, amount)
Customers(customer_id, name)

✅ Correct Solution Using LEFT JOIN

SELECT c.customer_id,
       c.name,
       COALESCE(SUM(o.amount), 0) AS total_order_amount
FROM Customers c
LEFT JOIN Orders o
       ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.name;

🧠 Explanation

  • LEFT JOIN keeps all customers

  • COALESCE converts NULL to 0

💡 Interview Tip

If they say include even if no records exist, think LEFT JOIN.


🧠 Key SQL Patterns to Remember

Problem Type → Pattern

  • Delete duplicates → ROW_NUMBER()

  • Worked on all items → GROUP BY + HAVING

  • Top N per group → RANK() / DENSE_RANK()

  • Compare with average → Subquery

  • Percentile / top % → PERCENT_RANK()

  • Include missing data → LEFT JOIN

🎯 What Interviewers Really Look For

✔ Correct joins
✔ Proper grouping
✔ No missing edge cases
✔ Business logic clarity
✔ Clean, readable SQL

Not just syntax.


📌 Final Advice for Long-Term SQL Mastery

  • Always start with SELECT, then DELETE/UPDATE

  • Think in sets, not rows

  • Ask clarifying questions (> vs >=, date ranges)

  • Practice explaining why, not just how


Identifying the Top 10 Sellers for Featured Promotions (E-commerce Platform)

E-commerce platforms frequently rank sellers to decide featured placements, promotions, and incentives.

This problem simulates a real-world backend/data scenario where seller performance is distributed across multiple months and must be evaluated using business rules.


📌 Problem Statement

An e-commerce platform wants to identify the Top 10 sellers for featured promotions based on their performance over the last 3 months.


📂 Data Sources

  • jan_orders.csv

  • feb_orders.csv

  • mar_orders.csv

Each Record

(sellerId, orderValue, customerRating)

Each record represents one completed order.


✅ Selection Criteria

A seller is considered eligible only if all the following conditions are met:

  1. At least 100 total orders across all months

  2. Average customer rating ≥ 4.2

  3. At least 20 orders in EACH month

Ranking Rule

Rank eligible sellers by average order value (highest first) and take the top 10.


🎯 Expected Output

List<String> → sellerIds sorted by average order value (highest first)

🧠 Key Clarifications (Interview Gold)

Before coding, clarify these assumptions:

  • Customer rating is per order (not per seller)

  • Average rating = totalRatings / totalOrders

  • Average order value = totalOrderValue / totalOrders

  • A seller missing even one month is not eligible

  • ≥ 4.2 means 4.2 is allowed

These clarifications show strong analytical thinking.


🏗 High-Level Approach

Step 1: Aggregate seller data

Use a map:

sellerId → SellerSummary

Track:

  • order count per month

  • total order value

  • total rating sum


Step 2: Compute derived metrics

For each seller:

  • totalOrders

  • averageRating

  • averageOrderValue


Step 3: Apply eligibility rules

Filter sellers who fail any condition:

  • totalOrders < 100

  • any month < 20 orders

  • averageRating < 4.2


Step 4: Rank & select

  • Sort by average order value (DESC)

  • Pick top 10


🧩 Java Data Model (Interview-Ready)

class SellerSummary {
    String sellerId;

    int ordersJan, ordersFeb, ordersMar;
    double totalOrderValue;
    double totalRating;

    int totalOrders() {
        return ordersJan + ordersFeb + ordersMar;
    }

    double averageRating() {
        return totalOrders() == 0 ? 0 : totalRating / totalOrders();
    }

    double averageOrderValue() {
        return totalOrders() == 0 ? 0 : totalOrderValue / totalOrders();
    }

    boolean isEligible() {
        return totalOrders() >= 100
            && ordersJan >= 20
            && ordersFeb >= 20
            && ordersMar >= 20
            && averageRating() >= 4.2;
    }
}
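
Before the pipeline below can run, the per-seller summaries have to be aggregated into a map. Here is a minimal aggregation sketch; OrderRecord, its field names, and the in-memory monthly lists are assumptions for illustration, while SellerSummary is the class above.

import java.util.*;

// Hypothetical input record parsed from the monthly CSVs.
class OrderRecord {
    String sellerId;
    double orderValue;
    double customerRating;

    OrderRecord(String sellerId, double orderValue, double customerRating) {
        this.sellerId = sellerId;
        this.orderValue = orderValue;
        this.customerRating = customerRating;
    }
}

class SellerAggregator {
    // monthIndex: 0 = Jan, 1 = Feb, 2 = Mar (mirrors the SellerSummary fields above)
    static void addMonth(Map<String, SellerSummary> sellers, List<OrderRecord> orders, int monthIndex) {
        for (OrderRecord o : orders) {
            SellerSummary s = sellers.computeIfAbsent(o.sellerId, id -> {
                SellerSummary created = new SellerSummary();
                created.sellerId = id;
                return created;
            });
            if (monthIndex == 0) s.ordersJan++;
            else if (monthIndex == 1) s.ordersFeb++;
            else s.ordersMar++;
            s.totalOrderValue += o.orderValue;   // running totals drive the averages
            s.totalRating += o.customerRating;
        }
    }
}

Calling addMonth(sellers, janOrders, 0) and the same for February and March produces the sellers map that the pipeline consumes.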

🧠 Stream Pipeline (Conceptual)

sellers.values().stream()
    .filter(SellerSummary::isEligible)
    .sorted(Comparator.comparingDouble(
        SellerSummary::averageOrderValue).reversed())
    .limit(10)
    .map(s -> s.sellerId)
    .toList();

⏱ Time & Space Complexity

Let:

  • N = total number of orders

  • S = number of sellers

Operation → Complexity

  • Aggregation → O(N)

  • Filtering → O(S)

  • Sorting → O(S log S)

  • Space → O(S)

This is optimal and production-ready.


❌ Common Mistakes to Avoid

  1. Averaging averages

    • Always calculate from totals

  2. Integer division

    • Use double for all averages

  3. Ignoring monthly constraints

    • Each month must meet minimum orders

  4. Using > instead of ≥

    • Ratings are inclusive here

  5. Ranking before filtering

    • Eligibility must come first


🚀 How to Extend This in Real Systems


🧪 Interview Follow-Up Questions

  • How would you handle seller fraud or fake ratings?

  • What if promotions require category-wise top sellers?

  • How would you optimize this for millions of orders/day?

  • Would you precompute rankings or calculate on demand?


✅ Final Takeaway

This problem tests more than coding:

✔ data modeling
✔ aggregation logic
✔ business rule enforcement
✔ ranking correctness
✔ scalability thinking

If you can clearly explain this flow, you demonstrate strong backend and system-level thinking—exactly what interviewers want.


Rewarding the Top 5 Sales Executives Across 3 Quarters (Java + Map + Streams)

This post shows an interview-ready way to solve a common “aggregate + filter + rank” problem in Java. We’ll go from the problem statement to a clean final solution, and then cover tips, pitfalls, and improvements that interviewers typically look for.


✅ Question

A company wants to reward the Top 5 sales executives based on performance over the last 3 quarters.

Data Files

  • q1_sales.csv

  • q2_sales.csv

  • q3_sales.csv

Each Record

(employeeId, dealsClosed, revenue)

Selection Criteria

An employee is eligible only if all conditions are met:

  1. Total deals closed ≥ 30 across all quarters

  2. Average revenue per deal > $5,000

  3. Deals closed ≥ 5 in each quarter

  4. From eligible employees, select Top 5 by total revenue (descending)

Output

Return a list of employeeIds sorted by total revenue DESC.


🧠 Approach

This is an “aggregation + eligibility + ranking” problem. The clean approach is:

  1. Aggregate quarter data into one summary per employee.

    • Track: deals and revenue per quarter, and totals.

  2. Apply filters:

    • totalDeals ≥ 30

    • dealsQ1 ≥ 5, dealsQ2 ≥ 5, dealsQ3 ≥ 5

    • averageRevenuePerDeal > 5000

  3. Sort by total revenue descending

  4. Take top 5

  5. Return list of employeeIds

Why Map?

A HashMap<String, EmployeeSalesSummary> gives you O(1) updates per record and keeps the code straightforward.


✅ Final Answer (Clean Java Solution)

This is a polished version of the code you started, with small improvements:

  • Uses constants for rule thresholds

  • Fixes the placeholder return (now returns top list)

  • Ensures filters match the problem statement

  • Adds a tie-breaker for consistent sorting (optional but good practice)

  • Removes unused imports

import java.util.*;
import java.util.stream.Collectors;

public class TopSalesExecutivesFinder {

    // ---------------- SAMPLE INPUT (Hardcoded) ----------------
    static List<SalesRecord> q1Sales = Arrays.asList(
            new SalesRecord("E1", 10, 60000),
            new SalesRecord("E2", 8, 30000),
            new SalesRecord("E3", 12, 80000)
    );

    static List<SalesRecord> q2Sales = Arrays.asList(
            new SalesRecord("E1", 9, 55000),
            new SalesRecord("E2", 12, 70000),
            new SalesRecord("E3", 10, 60000)
    );

    static List<SalesRecord> q3Sales = Arrays.asList(
            new SalesRecord("E1", 11, 65000),
            new SalesRecord("E2", 10, 50000),
            new SalesRecord("E3", 9, 55000)
    );

    public static List<String> findTopSalesExecutives() {

        // ---------------- RULE THRESHOLDS ----------------
        final int MIN_TOTAL_DEALS = 30;
        final int MIN_DEALS_EACH_QUARTER = 5;
        final double MIN_AVG_REV_PER_DEAL = 5000.0; // must be strictly greater
        final int TOP_K = 5;

        // ---------------- AGGREGATION ----------------
        Map<String, EmployeeSalesSummary> summaryMap = new HashMap<>();

        q1Sales.forEach(r ->
                summaryMap.computeIfAbsent(r.employeeId, EmployeeSalesSummary::new)
                          .addQ1(r.dealsClosed, r.revenue)
        );

        q2Sales.forEach(r ->
                summaryMap.computeIfAbsent(r.employeeId, EmployeeSalesSummary::new)
                          .addQ2(r.dealsClosed, r.revenue)
        );

        q3Sales.forEach(r ->
                summaryMap.computeIfAbsent(r.employeeId, EmployeeSalesSummary::new)
                          .addQ3(r.dealsClosed, r.revenue)
        );

        // ---------------- FILTER + SORT + PICK ----------------
        return summaryMap.values().stream()
                // 1) total deals >= 30
                .filter(s -> s.totalDeals() >= MIN_TOTAL_DEALS)
                // 2) deals >= 5 in each quarter
                .filter(s -> s.dealsEachQuarterAtLeast(MIN_DEALS_EACH_QUARTER))
                // 3) avg revenue per deal > 5000
                .filter(s -> s.averageRevenuePerDeal() > MIN_AVG_REV_PER_DEAL)
                // 4) sort by total revenue desc (tie-breaker optional)
                .sorted(Comparator.comparingDouble(EmployeeSalesSummary::totalRevenue).reversed()
                        .thenComparing(EmployeeSalesSummary::totalDeals, Comparator.reverseOrder())
                        .thenComparing(s -> s.employeeId))
                // 5) top 5
                .limit(TOP_K)
                // return employeeIds
                .map(s -> s.employeeId)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> result = findTopSalesExecutives();
        System.out.println("Top Sales Executives: " + result);
    }
}

// ---------------- SUPPORT CLASSES ----------------

class SalesRecord {
    String employeeId;
    int dealsClosed;
    double revenue;

    SalesRecord(String employeeId, int dealsClosed, double revenue) {
        this.employeeId = employeeId;
        this.dealsClosed = dealsClosed;
        this.revenue = revenue;
    }
}

class EmployeeSalesSummary {
    String employeeId;

    int dealsQ1, dealsQ2, dealsQ3;
    double revenueQ1, revenueQ2, revenueQ3;

    EmployeeSalesSummary(String employeeId) {
        this.employeeId = employeeId;
    }

    void addQ1(int deals, double revenue) {
        dealsQ1 += deals;
        revenueQ1 += revenue;
    }

    void addQ2(int deals, double revenue) {
        dealsQ2 += deals;
        revenueQ2 += revenue;
    }

    void addQ3(int deals, double revenue) {
        dealsQ3 += deals;
        revenueQ3 += revenue;
    }

    int totalDeals() {
        return dealsQ1 + dealsQ2 + dealsQ3;
    }

    double totalRevenue() {
        return revenueQ1 + revenueQ2 + revenueQ3;
    }

    double averageRevenuePerDeal() {
        int totalDeals = totalDeals();
        if (totalDeals == 0) return 0.0;
        return totalRevenue() / totalDeals; // safe double division
    }

    boolean dealsEachQuarterAtLeast(int minDeals) {
        return dealsQ1 >= minDeals && dealsQ2 >= minDeals && dealsQ3 >= minDeals;
    }

    @Override
    public String toString() {
        return "EmployeeSalesSummary{" +
                "employeeId='" + employeeId + '\'' +
                ", dealsQ1=" + dealsQ1 +
                ", dealsQ2=" + dealsQ2 +
                ", dealsQ3=" + dealsQ3 +
                ", totalDeals=" + totalDeals() +
                ", totalRevenue=" + totalRevenue() +
                ", avgRevPerDeal=" + averageRevenuePerDeal() +
                '}';
    }
}

🧾 What’s the Output for Your Sample Data?

With your sample input:

  • E1 total deals = 10 + 9 + 11 = 30
    avg revenue per deal = (60000+55000+65000)/30 = 180000/30 = 6000
    each quarter deals ≥ 5 ✅
    total revenue = 180000

  • E2 total deals = 8 + 12 + 10 = 30
    avg revenue per deal = (30000+70000+50000)/30 = 150000/30 = 5000 ❌ (must be > 5000)

  • E3 total deals = 12 + 10 + 9 = 31
    avg revenue per deal = (80000+60000+55000)/31 ≈ 6290
    each quarter deals ≥ 5 ✅
    total revenue = 195000

✅ Result:

Top Sales Executives: [E3, E1]

✅ Tips & Interview Notes

1) Don’t mix up “average revenue per deal”

The rule is:

average revenue per deal across ALL quarters

So compute:

totalRevenue / totalDeals

Not quarter averages.

2) Beware >= vs >

  • The problem says average revenue per deal > 5000

  • That means 5000 exactly should FAIL (like E2 above)

3) Avoid integer division

Here we’re safe because totalRevenue is double.
If revenue was int, you’d need:

(double) totalRevenue / totalDeals

4) Sorting tie-breakers (bonus)

If total revenue ties, you can stabilize output:

  • higher total deals wins

  • then alphabetical employeeId

Interviewers like deterministic sorting.

5) Make rules configurable

Putting thresholds in constants makes the function reusable:

  • change to Top 10

  • change min deals from 30 to 20

  • etc.


⏱ Complexity

Let N be total number of sales records across q1, q2, q3, and P employees.

  • Aggregation: O(N)

  • Sorting: O(P log P)

  • Memory: O(P)

This is optimal for the problem.


Optional Enhancements (If Asked)

  • Read from actual CSVs using BufferedReader (see the sketch after this list)

  • Support any number of quarters (use arrays instead of dealsQ1/Q2/Q3)

  • Add unit tests:

    • avg exactly 5000

    • missing quarter data

    • fewer than 5 eligible employees
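
For the CSV-reading enhancement flagged above, a minimal sketch, assuming a header-less, comma-separated format of employeeId,dealsClosed,revenue per line:

import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class SalesCsvReader {

    // Parses one quarterly file (e.g., "q1_sales.csv") into SalesRecord objects.
    static List<SalesRecord> read(String path) throws IOException {
        List<SalesRecord> records = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(Paths.get(path))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] parts = line.split(",");
                if (parts.length != 3) continue; // skip rows with the wrong number of fields
                records.add(new SalesRecord(
                        parts[0].trim(),
                        Integer.parseInt(parts[1].trim()),
                        Double.parseDouble(parts[2].trim())));
            }
        }
        return records;
    }
}

Whether malformed rows should be skipped, logged, or rejected outright is exactly the kind of clarifying question interviewers like to hear.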


Selecting Top Players Across 3 Months in Java (Map Aggregation + Streams)

When match data is split across multiple months, the cleanest way to compute “best players” is to:

  1. Aggregate all records into a per-player summary (matches + total score, month-wise counts)

  2. Filter by eligibility rules

  3. Sort by average score

  4. Pick the top K

This post walks through a simple, interview-friendly Java solution using a HashMap + a Streams pipeline.


Problem

You’re given match performance data for the last 3 months:

  • month1.csv

  • month2.csv

  • month3.csv

Each record is:

(playerId, score)

Each record represents one match played by that player.


Eligibility Rules (as used in the code below)

A player is eligible if:

  1. Played at least 4 matches overall (across all months)

  2. Played at least 1 match in each month

  3. Their average score across all matches is greater than 50

  4. From eligible players, return the TOP 3 by average score (descending)

Want the original stricter rules (like 20 overall and 5 per month)?
This solution is parameterized, so you only need to change the thresholds.


Example Input

month1:
(P1, 60), (P2, 45), (P3, 70)

month2:
(P1, 55), (P1, 56), (P2, 50), (P3, 75)

month3:
(P1, 65), (P2, 55), (P3, 68), (P3, 69)

Expected Output

["P3", "P1"]

Why?

  • P1 scores: 60, 55, 56, 65 → average = 59.0 ✅ (played all months + 4 matches)

  • P2 scores: 45, 50, 55 → average = 50.0 ❌ (must be > 50 and also only 3 matches)

  • P3 scores: 70, 75, 68, 69 → average = 70.5 ✅

Sorted by average score: P3, then P1


Approach

Step 1: Aggregate into a per-player summary

We build a map:

Map<String, PlayerScoreSummary> summaryByPlayer

Each player summary tracks:

  • matchesByMonth[m]

  • scoreByMonth[m]

This makes it easy to check:

  • “played each month”

  • “total matches”

  • “average score”

Step 2: Filter + sort + limit

After aggregation:

  • filter players that fail rules

  • sort by average score descending

  • take top K


Final Java Solution (Single File, Runnable)

import java.util.*;
import java.util.stream.Collectors;

public class TopPlayersSelectorBlog {

    // ---------------- SAMPLE INPUT ----------------
    static List<PlayerScore> month1 = Arrays.asList(
            new PlayerScore("P1", 60),
            new PlayerScore("P2", 45),
            new PlayerScore("P3", 70)
    );

    static List<PlayerScore> month2 = Arrays.asList(
            new PlayerScore("P1", 55),
            new PlayerScore("P1", 56),
            new PlayerScore("P2", 50),
            new PlayerScore("P3", 75)
    );

    static List<PlayerScore> month3 = Arrays.asList(
            new PlayerScore("P1", 65),
            new PlayerScore("P2", 55),
            new PlayerScore("P3", 68),
            new PlayerScore("P3", 69)
    );

    public static void main(String[] args) {
        List<List<PlayerScore>> months = List.of(month1, month2, month3);

        // Rules (easy to change)
        int minTotalMatches = 4;
        int minMatchesEachMonth = 1;
        double minAvgScore = 50.0;   // must be strictly greater than 50
        int topK = 3;

        List<String> selected = selectTopPlayers(
                months,
                minTotalMatches,
                minMatchesEachMonth,
                minAvgScore,
                topK
        );

        System.out.println("Selected Players: " + selected);
        // Expected: [P3, P1]
    }

    /**
     * Returns topK playerIds by average score (descending),
     * after applying eligibility rules.
     */
    static List<String> selectTopPlayers(
            List<List<PlayerScore>> months,
            int minTotalMatches,
            int minMatchesEachMonth,
            double minAvgScoreExclusive,
            int topK
    ) {
        int monthCount = months.size();
        Map<String, PlayerScoreSummary> summaryByPlayer = new HashMap<>();

        // 1) Aggregate (single pass over each month)
        for (int m = 0; m < monthCount; m++) {
            for (PlayerScore ps : months.get(m)) {
                summaryByPlayer
                        .computeIfAbsent(ps.playerId, id -> new PlayerScoreSummary(id, monthCount))
                        .addMatch(m, ps.score);
            }
        }

        // 2) Filter + sort + limit + map to playerId
        return summaryByPlayer.values().stream()
                .filter(s -> s.totalMatches() >= minTotalMatches)
                .filter(s -> s.matchesEachMonthAtLeast(minMatchesEachMonth))
                .filter(s -> s.averageScore() > minAvgScoreExclusive)
                .sorted(
                        Comparator.comparingDouble(PlayerScoreSummary::averageScore).reversed()
                                // optional tie-breaker for stability
                                .thenComparing(PlayerScoreSummary::totalMatches, Comparator.reverseOrder())
                                .thenComparing(s -> s.playerId)
                )
                .limit(topK)
                .map(s -> s.playerId)
                .collect(Collectors.toList());
    }

    // ---------------- DATA MODELS ----------------

    static class PlayerScore {
        final String playerId;
        final int score;

        PlayerScore(String playerId, int score) {
            this.playerId = playerId;
            this.score = score;
        }
    }

    static class PlayerScoreSummary {
        final String playerId;
        final int[] matchesByMonth;
        final int[] scoreByMonth;

        PlayerScoreSummary(String playerId, int monthCount) {
            this.playerId = playerId;
            this.matchesByMonth = new int[monthCount];
            this.scoreByMonth = new int[monthCount];
        }

        void addMatch(int monthIndex, int score) {
            matchesByMonth[monthIndex]++;
            scoreByMonth[monthIndex] += score;
        }

        int totalMatches() {
            int sum = 0;
            for (int c : matchesByMonth) sum += c;
            return sum;
        }

        int totalScore() {
            int sum = 0;
            for (int s : scoreByMonth) sum += s;
            return sum;
        }

        double averageScore() {
            int matches = totalMatches();
            if (matches == 0) return 0.0;
            return (double) totalScore() / matches;  // avoid integer division
        }

        boolean matchesEachMonthAtLeast(int minMatchesEachMonth) {
            for (int c : matchesByMonth) {
                if (c < minMatchesEachMonth) return false;
            }
            return true;
        }
    }
}

Complexity

Let N be the total number of match records across all months.

  • Aggregation: O(N)

  • Sorting eligible players: O(P log P) where P is number of players

  • Space: O(P) for the per-player summaries

In interviews, this is typically considered optimal and clean.


Common Pitfalls (and how this code avoids them)

1) Integer division bugs

If you do totalScore / totalMatches with integers, you truncate.
We use:

(double) totalScore / totalMatches

2) Missing months

If a player never appears in a month, their matchesByMonth[m] stays 0, and they fail the per-month filter.

3) Hardcoding month fields (scoreA/scoreB/scoreC)

Instead of scoreA, scoreB, scoreC, we use arrays so the code is:

  • less repetitive

  • easy to scale from 3 months to N months


How to Switch to the Original Stricter Rules

Just change the parameters:

int minTotalMatches = 20;
int minMatchesEachMonth = 5;
double minAvgScore = 50.0;
int topK = 3;

Everything else stays the same.


Interview Add-ons (If You Want to Impress)

  • Read CSVs (BufferedReader) → convert to List<PlayerScore>

  • Handle malformed rows safely

  • If the dataset is massive:

    • stream file line-by-line

    • aggregate in a map without storing all rows

  • Add unit tests for:

    • “exactly 50 average should fail”

    • “missing one month should fail”

    • “fewer than 3 eligible players”
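
For the first test idea above ("exactly 50 average should fail"), a minimal sketch, assuming JUnit 5 is on the classpath and the test class lives in the same package as TopPlayersSelectorBlog:

import static org.junit.jupiter.api.Assertions.assertTrue;

import java.util.List;
import org.junit.jupiter.api.Test;

class TopPlayersSelectorBlogTest {

    @Test
    void averageOfExactlyFiftyShouldFail() {
        // One player, one match per month, every score exactly 50 -> average is 50.0
        List<List<TopPlayersSelectorBlog.PlayerScore>> months = List.of(
                List.of(new TopPlayersSelectorBlog.PlayerScore("P9", 50)),
                List.of(new TopPlayersSelectorBlog.PlayerScore("P9", 50)),
                List.of(new TopPlayersSelectorBlog.PlayerScore("P9", 50)));

        // Thresholds: 3 total matches, 1 per month, average must be strictly greater than 50
        List<String> selected = TopPlayersSelectorBlog.selectTopPlayers(months, 3, 1, 50.0, 3);

        assertTrue(selected.isEmpty(), "an average of exactly 50 must not pass a strict > 50 rule");
    }
}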


Java Interview Deep‑Dive Blog (Pragmatic Play)

(DSA + HashMap/ConcurrentHashMap Internals + Polymorphism + Best Answers + Code + Learning Roadmap)

This blog is written as a complete interview-prep note based on the questions you described (DSA “move zeros in-place”, HashMap / ConcurrentHashMap internals, equals/hashCode, and polymorphism + overload resolution). It’s designed so freshers and experienced devs can both use it as a “ready-to-speak” guide.


1) Typical Interview Flow (What they were testing)

1. Self introduction (“Tell me about yourself”)

What they check:

  • Can you communicate clearly?

  • Can you summarize your work/learning?

  • Do you understand your own projects?

Expected answer (simple):

  • Name + current role/experience

  • Tech stack

  • 1–2 projects + impact

  • Why this role/company

Good answer:

  • Adds measurable impact (“reduced API latency by 30%”, “handled 10k RPM”, etc.)

  • Mentions 1 challenge + how you solved it

Best answer (interviewer loves this):

  • 60–90 seconds

  • Clear story: “I’m a Java backend developer… I built X… I enjoy Y… I’m now looking for Z…”

  • Ends with a bridge: “That’s why this role matches me: Java + concurrency + problem solving.”


2) DSA Problem: Move Zeros In‑Place (Order may / may not be required)

You gave this input:

{1,0,2,0,5,0,0,3,0,-2,-1}

And one possible output shown was:

{1,2,5,3,-2,-1,0,0,0,0,0}

That output preserves the relative order of non‑zero elements (stable).

But you later clarified an important twist:

“It should be in-place… and sequence should not be maintained.”

That means the interviewer may have been okay with an unstable solution (faster in practice, fewer writes).

The best move in interviews

Always ask one clarifying question:

  • “Should I preserve the order of non-zero elements, or any order is fine?”

That instantly shows maturity.


2.1 Expected Answer (Stable, preserves order of non‑zeros)

Approach: write all non-zeros forward, then fill remaining with zeros.

✅ Pros: stable, simple
✅ Time: O(n)
✅ Space: O(1)
⚠️ Writes: can be ~n (still fine)

Java Code (Stable)

import java.util.Arrays;

public class MoveZerosStable {
    public static void moveZerosStable(int[] arr) {
        int write = 0;

        // Move non-zeros forward (keeps order)
        for (int x : arr) {
            if (x != 0) arr[write++] = x;
        }

        // Fill rest with zeros
        while (write < arr.length) {
            arr[write++] = 0;
        }
    }

    public static void main(String[] args) {
        int[] arr = {1,0,2,0,5,0,0,3,0,-2,-1};
        moveZerosStable(arr);
        System.out.println(Arrays.toString(arr));
        // [1, 2, 5, 3, -2, -1, 0, 0, 0, 0, 0]
    }
}

Expected explanation in interview:

  • “I compact non-zeros using a write pointer.”

  • “Then I fill remaining positions with zeros.”

  • “O(n) time, O(1) space.”


2.2 Best Answer (Unstable, order NOT required, fewer writes)

If order does not matter, you can do a two‑pointer swap:

  • i from start

  • j from end

  • If arr[i] == 0, swap with arr[j] and decrement j

  • Otherwise increment i

  • Also skip trailing zeros at j

✅ Pros: in-place, single pass feel, fewer writes
✅ Time: O(n)
✅ Space: O(1)
⚠️ Order changes (which is allowed in this variant)

Java Code (Unstable)

import java.util.Arrays;

public class MoveZerosUnstable {
    public static void moveZerosUnstable(int[] arr) {
        int i = 0, j = arr.length - 1;

        while (i < j) {
            // Move j left past zeros
            while (i < j && arr[j] == 0) j--;

            if (arr[i] == 0) {
                // swap arr[i] and arr[j]
                int temp = arr[i];
                arr[i] = arr[j];
                arr[j] = temp;
                j--;
            } else {
                i++;
            }
        }
    }

    public static void main(String[] args) {
        int[] arr = {1,0,2,0,5,0,0,3,0,-2,-1};
        moveZerosUnstable(arr);
        System.out.println(Arrays.toString(arr));
        // One valid output example (order may differ):
        // [1, -1, 2, -2, 5, 3, 0, 0, 0, 0, 0]
    }
}

Best explanation:

  • “If order isn’t required, I’ll swap zeros with the last non-zero from the end.”

  • “Still O(n), but fewer writes and simpler than shifting.”


2.3 Common Follow‑Up Variant (If they bring “negative numbers to other side”)

Sometimes interviewers extend it into a 3‑way partition:

  • negatives on one side

  • zeros in the middle

  • positives on the other side

This is basically the Dutch National Flag pattern.

Java Code (Negatives | Zeros | Positives) — Unstable

import java.util.Arrays;

public class ThreeWayPartition {
    public static void partitionNegZeroPos(int[] arr) {
        int low = 0, mid = 0, high = arr.length - 1;

        while (mid <= high) {
            if (arr[mid] < 0) {
                swap(arr, low++, mid++);
            } else if (arr[mid] == 0) {
                mid++;
            } else { // > 0
                swap(arr, mid, high--);
            }
        }
    }

    private static void swap(int[] arr, int i, int j) {
        int t = arr[i];
        arr[i] = arr[j];
        arr[j] = t;
    }

    public static void main(String[] args) {
        int[] arr = {1,0,2,0,5,0,0,3,0,-2,-1};
        partitionNegZeroPos(arr);
        System.out.println(Arrays.toString(arr));
        // Output for this input: [-1, -2, 0, 0, 0, 0, 0, 3, 5, 2, 1]
    }
}

3) HashMap Internals (What interviewers expect you to know)

3.1 What HashMap is (best speaking answer)

“HashMap is a hash-table based Map implementation. It stores entries in an internal bucket array, uses hashCode() + equality checks to locate keys, allows null key/value, is not synchronized, and does not guarantee ordering.” (Oracle Docs)

Oracle explicitly notes:


3.2 equals(), hashCode(), hash value — how to explain clearly

The “contract” (must say)

  • If a.equals(b) is true → a.hashCode() == b.hashCode() must be true.

  • Same hashCode does not mean equals is true (collisions exist).

How HashMap uses them:

  1. Compute hashCode()

  2. Map it to a bucket index (implementation detail)

  3. If multiple keys land in same bucket, compare:

    • hash (quick check)

    • then equals() (final check)

  4. If equals matches → update value; else add new node

Best warning example (interviewer loves it):

  • “If you override equals() but not hashCode(), HashMap lookups can fail or behave incorrectly.”


3.3 Collisions: linked list → tree (Java 8+)

When collisions become frequent, HashMap can convert a bucket from a linked structure to a balanced tree to improve performance. That design was introduced to handle frequent collisions more efficiently. (OpenJDK)

Interview line:

“Worst-case lookup becomes closer to O(log n) in heavily-collided buckets due to tree bins.” (OpenJDK)


3.4 Resizing (rehash) — what to mention

  • HashMap resizes when size > capacity * loadFactor

  • Default load factor is 0.75

  • Resizing is expensive, so if you know approximate size, pass initial capacity (Oracle Docs)
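
A tiny sketch of the pre-sizing advice (the 0.75 load factor is the documented default; the capacity math is the usual expectedEntries / loadFactor rule of thumb):

import java.util.HashMap;
import java.util.Map;

public class PreSizedMapDemo {
    public static void main(String[] args) {
        int expectedEntries = 10_000;
        // Capacity must exceed expectedEntries / 0.75 so the resize threshold is never crossed.
        int initialCapacity = (int) (expectedEntries / 0.75f) + 1;

        Map<String, Integer> map = new HashMap<>(initialCapacity);
        for (int i = 0; i < expectedEntries; i++) {
            map.put("key-" + i, i); // no rehash on the way to 10,000 entries
        }
        System.out.println(map.size());
    }
}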


3.5 Fail-fast iterators

HashMap iterators are fail-fast (they can throw ConcurrentModificationException if structurally modified during iteration). This is documented behavior. (Oracle Docs)


4) ConcurrentHashMap (Need + internal behavior + best explanation)

4.1 Why ConcurrentHashMap is needed

HashMap is not thread-safe. In concurrent updates you can get:

  • Lost updates

  • Data races

  • Corrupted internal structure

So ConcurrentHashMap provides a thread-safe Map with high throughput.


4.2 The most important line to say (from Java docs)

“Retrieval operations do not entail locking.” (Oracle Docs)

That means:

  • get() is designed to be fast under concurrency

  • reads can overlap with writes safely

Also:

  • It does not support locking the entire table to block all access (no global “stop the world” table lock). (Oracle Docs)


4.3 Null handling difference

ConcurrentHashMap does not allow null keys or values (unlike HashMap). (Oracle Docs)

Best explanation:

  • In CHM, null is useful as a special meaning: “no mapping present”, so allowing null would break atomic methods and concurrent semantics.


4.4 Segment locking vs modern Java

You told them “segments + segment locking” — that was correct historically (older CHM designs), but interviewers today usually expect the Java 8+ design (more fine-grained, not segment-based).

Best safe way to answer in interview:

“Earlier versions used segmented locking. Modern implementations focus on lock-free reads and fine-grained coordination for updates, avoiding a single global lock.” (Oracle Docs)

(That phrasing is accurate and doesn’t trap you into version-specific internals.)


4.5 How CHM gives good reads + good writes

Reads (get):

  • Retrieval operations do not entail locking, so get() stays fast even while other threads update the map

  • Reads reflect the results of the most recently completed updates at the time the read begins

Writes (put/update):

  • Thread-safe updates

  • Avoids locking entire map

  • Supports atomic compound ops: computeIfAbsent, putIfAbsent, etc.

Java docs even highlight scalable frequency-map usage with LongAdder and computeIfAbsent: (Oracle Docs)

Example:

freqs.computeIfAbsent(key, k -> new LongAdder()).increment();

(That exact idea is from the docs.) (Oracle Docs)
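
A slightly fuller, runnable sketch of that pattern (the sample words are made up for illustration):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.stream.IntStream;

public class ConcurrentFrequencyDemo {
    public static void main(String[] args) {
        Map<String, LongAdder> freqs = new ConcurrentHashMap<>();
        String[] words = {"alpha", "beta", "alpha", "gamma", "alpha", "beta"};

        // Parallel updates are safe: computeIfAbsent is atomic, and LongAdder absorbs contention on hot keys.
        IntStream.range(0, words.length).parallel()
                 .forEach(i -> freqs.computeIfAbsent(words[i], k -> new LongAdder()).increment());

        freqs.forEach((word, count) -> System.out.println(word + " -> " + count.sum()));
    }
}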


5) Polymorphism Deep Dive (what they were really asking)

Polymorphism = “same call, different behavior”.

5.1 Types of polymorphism you should say

  1. Compile-time polymorphism → Method overloading

  2. Runtime polymorphism → Method overriding

  3. Parametric polymorphismGenerics (often a bonus mention)


6) The int/float/double overload question (compiler decision)

They asked something like:

“If you have methods with int, float, double… which gets called?”

This is compile-time method overload resolution.

6.1 Key rule

Overloading is resolved by the compiler using the most specific applicable method rule. (Oracle Docs)

6.2 Practical rules (in the order compiler prefers)

  1. Exact match

  2. Widening primitive conversion (int → long → float → double)

  3. Boxing (int → Integer)

  4. Varargs (last resort)

6.3 Example you can speak in interview

void f(int x) {}
void f(float x) {}
void f(double x) {}

f(10);     // calls f(int) - exact
f(10.0f);  // calls f(float) - exact
f(10.0);   // calls f(double) - exact

Best explanation line:

“The compiler picks the most specific method available; it prefers exact match, then widening, then boxing, then varargs.” (Oracle Docs)
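
To make rules 2-4 concrete, a small sketch with hypothetical overloads where no exact int match exists:

public class OverloadResolutionDemo {

    static void g(long x)    { System.out.println("long (widening)"); }
    static void g(Integer x) { System.out.println("Integer (boxing)"); }
    static void g(int... x)  { System.out.println("varargs"); }

    public static void main(String[] args) {
        int i = 10;
        g(i); // prints "long (widening)": widening beats boxing, and boxing beats varargs
    }
}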


7) “Expected vs Good vs Best” Answers (Interview-ready scripts)

Q1: “Explain HashMap internal working”

Expected:

  • Buckets + hashing + collisions

Good:

  • Mentions load factor + resizing, not thread-safe, allows null

Best:

  • Adds collision-to-tree optimization and fail-fast iterators, and explains equals/hashCode contract
    (and quotes: not synchronized + null allowed + ordering not guaranteed) (Oracle Docs)


Q2: “equals vs hashCode”

Expected:

  • If equals true then hashCode same

Good:

  • Explains collision possibility

Best:

  • Gives bug scenario: override equals only → HashMap lookups fail


Q3: “Why ConcurrentHashMap? How it works?”

Expected:

  • Thread-safe map, better than synchronized HashMap

Good:

  • Mentions better concurrency than Hashtable, no global lock

Best:

  • Says: “Reads don’t lock” (Oracle Docs)

  • Says: “No null keys/values” (Oracle Docs)

  • Mentions atomic operations like computeIfAbsent (and the LongAdder pattern) (Oracle Docs)


Q4: “Explain polymorphism”

Expected:

  • Overloading + overriding

Good:

  • Compile-time vs runtime

Best:

  • Adds: “Overloading is compile-time (compiler picks most specific). Overriding is runtime dispatch based on actual object.” (Oracle Docs)


8) Mini “Source Code Pack” You Can Keep in Your Notes

A) Move zeros (stable)

(Already given above)

B) Move zeros (unstable, best when order not required)

(Already given above)

C) equals/hashCode demo (HashMap bug)

import java.util.*;

class User {
    int id;
    User(int id) { this.id = id; }

    @Override public boolean equals(Object o) {
        if (!(o instanceof User)) return false;
        return this.id == ((User)o).id;
    }

    // Uncomment this to fix it
    // @Override public int hashCode() { return Integer.hashCode(id); }
}

public class EqualsHashCodeBug {
    public static void main(String[] args) {
        Map<User, String> map = new HashMap<>();
        map.put(new User(7), "Alice");

        System.out.println(map.get(new User(7))); // null if hashCode not overridden
    }
}

9) What to study next (Roadmap + Courses)

Track A (Freshers)

  1. Java basics + OOP + Collections

  2. DSA patterns (arrays, two pointers, hashing, sliding window)

  3. Core concurrency basics

  4. SQL + REST basics

  5. Mini backend project

Track B (Experienced)

  1. Deep collections internals (HashMap/CHM)

  2. Java Memory Model basics (visibility, happens-before)

  3. Threading + executors + locks

  4. System design basics (API design, caching, DB indexing)

  5. Performance thinking (latency vs throughput)


Recommended Courses (Practical + Recognized)

Java Foundation

  • Java Programming and Software Engineering Fundamentals (Duke / Coursera) (Coursera)

DSA in Java

  • Algorithms, Part I (Princeton / Coursera) (Coursera)
    (Strong for fundamentals like sorting/searching/data structures with performance analysis.)

Coding Interview Patterns

  • Grokking the Coding Interview Patterns (Educative) (Educative)

Java Multithreading / Concurrency

  • Java Multithreading, Concurrency & Performance Optimization (Udemy) (Udemy)
    (Very aligned with “volatile / concurrency / parallelism / performance” discussions.)


10) Final “Best Interview Tip” for This Exact Set of Questions

When they ask something broad like HashMap or polymorphism:

✅ Start with a clean high-level explanation (10–15 seconds)
✅ Then go 1 level deeper (internals / rules / trade-offs)
✅ Then give one example (tiny code or scenario)
✅ End with why it matters (performance, correctness, concurrency)

That’s how you sound “senior” even as a fresher.


From “Top‑K Frequent” to JVM Internals: A Full Interview Debrief with Clear Q&A + Follow‑ups (Univest)

This blog captures the same arc your interview followed: a frequency-counting coding problem → reflection & Spring Boot internals → singleton + reflection edge-cases → database choice & concurrency (Postgres/MySQL/Oracle/Mongo) → distributed locks → “heap is 99% full” production triage.

I’ll write every answer in a way a fresher can understand and include the deeper details that experienced folks are expected to know.


1) Coding Problem: “Print Top K Frequent Strings”

What you wrote (core idea)

You counted frequencies using a HashMap, then put unique keys into a max-heap (PriorityQueue) ordered by frequency.

That’s a standard and acceptable solution. The improvements are in complexity explanation, correctness edge cases, and scaling.


Interview Question 1

Q: Given a list of strings, return/print the top K most frequent strings.

Simple answer (fresher-friendly)

  1. Count how many times each string appears using a map.

  2. Put the strings in a heap ordered by frequency.

  3. Pop K times.

This works because the heap always gives you the most frequent remaining element.

Better answer (what interviewers expect)

Let:

  • n = total number of items

  • u = number of unique strings

  • k = number of top elements to output

Your approach is:

  • Counting: O(n) time, O(u) space

  • Building heap with all unique keys: addAll(u items)O(u log u) time

  • Polling k times: O(k log u) time (poll is not O(1) for a heap)

So overall: O(n + u log u + k log u) time, O(u) space.

That “n log n” vs “u log u” mix-up is a very common interview slip—fixing it makes your explanation look sharp.

What can be improved in your code (concrete fixes)

  1. Comparator overflow risk
    map.get(b) - map.get(a) can overflow if counts are large. Use Integer.compare(...) or Long.compare(...).

  2. Tie-breaking
    If two keys have the same frequency, the heap order is unstable. Add a tie-breaker if the interviewer asks for deterministic order (e.g., lexicographic).

  3. k bounds / null safety
    If k > u, poll() can return null. Guard it.

  4. Use long for counts
    Your comment mentions “1 billion”; counts can exceed Integer.MAX_VALUE in other variants. Use long.

  5. Better heap strategy when k is small
    Don’t heap all u keys if you only need top k.


A stronger “interview-ready” version (Top‑K with Min‑Heap of size K)

This is the version that usually impresses, because it’s more optimal when k << u:

  • Count: O(n)

  • Maintain a min-heap of size k over (freq, word):

    • Push each unique entry

    • If heap grows beyond k, pop the smallest

  • Complexity: O(n + u log k) time, O(u + k) space

import java.util.*;

public class TopKFrequentStrings {

    public static List<String> topKFrequent(String[] words, int k) {
        if (words == null || k <= 0) return List.of();

        Map<String, Long> freq = new HashMap<>();
        for (String w : words) {
            freq.merge(w, 1L, Long::sum);
        }

        // Min-heap by frequency; tie-break for deterministic output (optional)
        PriorityQueue<Map.Entry<String, Long>> minHeap =
                new PriorityQueue<>(
                        Comparator.<Map.Entry<String, Long>>comparingLong(Map.Entry::getValue)
                                .thenComparing(Map.Entry::getKey)
                );

        for (Map.Entry<String, Long> e : freq.entrySet()) {
            minHeap.offer(e);
            if (minHeap.size() > k) minHeap.poll();
        }

        List<String> result = new ArrayList<>(minHeap.size());
        while (!minHeap.isEmpty()) result.add(minHeap.poll().getKey());

        // heap gives smallest first; reverse to get most frequent first
        Collections.reverse(result);
        return result;
    }

    public static void main(String[] args) {
        String[] list = {"def","abc","def","abc","def","ghi"};
        System.out.println(topKFrequent(list, 3));
    }
}

When to “go deeper” (big input: “1 billion strings”)

If the interviewer really means 1B records, you usually can’t store them all in memory as a Java array.

Good directions to mention:

  • Streaming counts: Read input as a stream and update a map incrementally (see the sketch after this list).

  • External sort / MapReduce: Shard the input, count per shard, merge counts.

  • Approximate heavy hitters (if memory-bounded): Misra–Gries / Count–Min Sketch (great follow-up topic if the interviewer is senior).
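
The streaming-counts direction mentioned above, as a minimal sketch (the file name is hypothetical; one string per line is assumed):

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Stream;

public class StreamingFrequencyCount {
    public static void main(String[] args) throws IOException {
        Map<String, Long> freq = new HashMap<>();

        // Files.lines streams the file lazily, so all records never sit in memory at once;
        // only the map of unique strings (u entries) is held.
        try (Stream<String> lines = Files.lines(Paths.get("words.txt"))) {
            lines.forEach(w -> freq.merge(w, 1L, Long::sum));
        }

        System.out.println("unique strings: " + freq.size());
    }
}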


Related LeetCode questions (very close to this)

  • 347. Top K Frequent Elements

  • 692. Top K Frequent Words (adds lexicographic tie-breaking)

  • 451. Sort Characters By Frequency

  • 895. Maximum Frequency Stack (design + frequency tracking)


Follow-up questions I’d ask (as interviewer)

  1. If two strings have the same frequency, how do you order them (lexicographically)? (LeetCode 692 style)

  2. Optimize from O(u log u) to O(u log k)—why is it better?

  3. What if input doesn’t fit in memory? (streaming / distributed / approximate)

  4. How would you do it concurrently? (ConcurrentHashMap + LongAdder pattern)

  5. Can you do it in O(n) time? (Bucket sort approach; memory tradeoff)


2) Java Reflection: What, Why, Pain Points, and How It Works Internally

Interview Question 2

Q: What is reflection, and why do we need it?

Simple answer

Reflection lets Java code inspect and use classes at runtime:

  • List methods/fields/annotations

  • Create objects dynamically

  • Call methods dynamically

Frameworks use it to build “plug-and-play” behavior (e.g., dependency injection, serialization).
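
A minimal sketch of those three abilities (inspect, construct, invoke) using java.lang.String as the target:

import java.lang.reflect.Method;

public class ReflectionDemo {
    public static void main(String[] args) throws Exception {
        Class<?> clazz = Class.forName("java.lang.String");

        // 1) Inspect: list a declared method (just the first one, for brevity)
        Method first = clazz.getDeclaredMethods()[0];
        System.out.println("A declared method: " + first.getName());

        // 2) Construct dynamically via the public String(String) constructor
        Object s = clazz.getConstructor(String.class).newInstance("hello");

        // 3) Invoke a method dynamically
        Method toUpper = clazz.getMethod("toUpperCase");
        System.out.println(toUpper.invoke(s)); // HELLO
    }
}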

Deeper answer

Reflection is mainly used for:

  • Frameworks (DI containers, ORMs, serializers)

  • Libraries that work with unknown classes at compile time

  • Runtime tooling (debuggers, profilers, test frameworks)

But it comes with tradeoffs: performance overhead, weaker type safety, and potential security/encapsulation bypass.


Interview Question 3

Q: What’s the pain point of reflection?

Simple answer

It’s slower and less safe than normal method calls, and it can break encapsulation.

Deeper answer (real pain points)

  1. Performance overhead

    • Extra access checks (unless suppressed)

    • Boxing/unboxing + varargs arrays for Method.invoke

    • Harder for JVM to optimize vs direct calls (though modern JVMs got better)

  2. Security / encapsulation concerns
    Reflection can suppress access checks using setAccessible(true) (with restrictions). Oracle’s AccessibleObject docs describe that it can “suppress checks for Java language access control.”

  3. Maintainability
    Renaming a method/field breaks reflective code at runtime, not compile time.

  4. AOT / Native Image friendliness
    Reflection-heavy apps are harder to compile ahead-of-time because the compiler can’t see all runtime access paths. Spring’s AOT docs explicitly frame AOT as helping startup by using generated initialization code rather than doing everything dynamically at runtime.
    Spring Boot’s Native Image section explains key differences of native images (AOT compiled executables).


Interview Question 4

Q: How does reflection work internally, and where does the JIT come in?

Simple answer

Reflection uses class metadata stored in the JVM. When you call something like method.invoke(obj, args), the JVM uses that metadata to locate the method and execute it. The JIT can still optimize the underlying code paths once they become “hot.”

Deeper answer (what impresses)

  • Modern Java reflection internals evolved significantly.

  • OpenJDK’s JEP 416 reimplemented core reflection (Method, Constructor, Field) on top of method handles, aiming to reduce duplicated mechanisms and maintenance cost.

  • Once reflection uses method handles internally, HotSpot can JIT-compile those invocation paths as they get hot (just like other Java code).

Nice interview line:
“Reflection is dynamic, but it’s not ‘interpreted forever.’ Hot paths can still get optimized; however, it’s harder to get as clean as direct calls and can still carry extra overhead like argument adaptation.”


Follow-up questions I’d ask

  1. What is setAccessible(true) and why is it dangerous?

  2. Reflection vs dynamic proxies vs bytecode generation—when to use which?

  3. What breaks when moving to Java modules (Java 9+)?

  4. How does Spring reduce reflection overhead in newer versions? (AOT)


3) Spring Boot: Why Use It, Drawbacks, and Alternatives

Interview Question 5

Q: Why do we use Spring Boot? What problem does it solve?

Simple answer

Spring Boot makes building Spring applications faster by:

  • Auto-configuring common components

  • Providing “starter” dependencies

  • Running with an embedded server

  • Giving production features (metrics, health checks)

Deeper answer

Spring Boot’s auto-configuration tries to configure your app based on what’s on the classpath (e.g., if an embedded DB is present, it can auto-configure it).
It also “backs off” if you define your own beans (custom configuration overrides defaults). (This is a key concept interviewers like.)


Interview Question 6

Q: What are drawbacks of Spring Boot?

Simple answer

It can be heavier (startup time, memory), and sometimes feels “magical” which makes debugging harder.

Deeper answer

Typical drawbacks:

  • Classpath scanning + reflection (startup cost)

  • Large dependency graph (bigger footprint)

  • Hidden configuration unless you know where to look

  • Cold start concerns in serverless/container autoscaling scenarios

But: modern Spring has AOT processing and native images, which explicitly target startup time improvements by generating initialization code at build time.


Interview Question 7

Q: Alternatives to Spring Boot? Why would you choose them?

Options (with the “why”)

  • Quarkus: pushes work to build time (augmentation) to reduce runtime overhead; their performance page explains it scans and builds a model at build time to compute startup instructions.

  • Micronaut: designed to use DI/AOP with no runtime reflection, making it easier for GraalVM/native usage; Micronaut explicitly states its DI/AOP runtime uses no reflection.

  • Others you can mention (no need to over-explain unless asked): Dropwizard, Vert.x, Helidon, Ktor.


Follow-up questions I’d ask

  1. Explain Spring Boot auto-configuration and the “back off” behavior.

  2. What is Spring AOT and what restrictions does it introduce?

  3. Spring Boot JVM vs Native Image: tradeoffs?

  4. Compare Quarkus build-time model vs Spring runtime model.


4) Singleton Pattern + Reflection: Can It Be Broken?

Interview Question 8

Q: How does reflection break a singleton?

Simple answer

Even if the constructor is private, reflection can access it and create another instance.

Deeper answer

If you do:

  • Constructor<?> c = X.class.getDeclaredConstructor();

  • c.setAccessible(true);

  • c.newInstance();

…you can create a second instance unless you add protections.

(And yes, setAccessible is the classic “break encapsulation” lever.)


Interview Question 9

Q: If I create the singleton instance as static final, does that stop reflection?

Simple answer

Not by itself.

Better answer

  • static final (eager init) ensures thread safety and one instance via normal code paths.

  • But reflection can still call the constructor again unless the constructor blocks it.

A common mitigation:

  • Keep a static flag or check inside constructor and throw if instance already exists.
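
A minimal sketch of that guard (the class name is hypothetical):

public class GuardedSingleton {

    private static final GuardedSingleton INSTANCE = new GuardedSingleton();

    private GuardedSingleton() {
        // During class initialization INSTANCE is still null, so the first construction passes.
        // A later reflective call sees the existing instance and fails here.
        if (INSTANCE != null) {
            throw new IllegalStateException("Use getInstance(); reflective instantiation is blocked");
        }
    }

    public static GuardedSingleton getInstance() {
        return INSTANCE;
    }
}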

Important nuance (that interviewers like):
This mitigation helps, but reflection can sometimes still bypass if it’s called before class initialization in some tricky setups—or if attackers can modify private fields (again via reflection). So it’s a deterrent, not a perfect defense in hostile environments.


Interview Question 10

Q: Is enum singleton really unbreakable? The interviewer claimed “it can still be breakable.”

Simple answer

In normal Java, enum singletons are the safest.

Deeper, accurate answer

Java forbids reflectively instantiating enum types. Oracle’s tutorial shows attempting to instantiate an enum via reflection leads to an IllegalArgumentException and states enum instantiation is forbidden.

So in standard Java reflection, enum singleton is effectively “reflection-proof.”

But (to address the interviewer’s “claim” professionally):

  • With extreme techniques (internal JDK APIs, Unsafe, custom JVM args opening modules), you can sometimes do things Java normally forbids. That’s outside typical application threat models, but it’s good to acknowledge: “If the attacker can run arbitrary code with deep JVM access, many guarantees can be bypassed.”


Follow-up questions I’d ask

  1. How does serialization break singleton, and how do you fix it? (readResolve)

  2. Can cloning break singleton? How do you prevent it?

  3. Singleton in DI frameworks (Spring): do you still need manual singleton?

  4. When is singleton a bad idea? (testability, hidden dependencies)


5) Databases: Choosing Between MySQL, Postgres, Oracle, Mongo + Concurrency

You said you “played CAP theorem.” That’s a common move, but interviewers often want practical DB-engine reasoning: consistency/isolation model, locking, indexing, operational features, licensing/support.

Interview Question 11

Q: How do you choose between MySQL and PostgreSQL?

Simple answer

Pick based on:

  • Your query complexity

  • Needed features

  • Team expertise

  • Operational constraints

Deeper answer (useful decision points)

PostgreSQL

  • Strong concurrency story with MVCC (documented in the concurrency control chapter).

  • Needs maintenance like VACUUM because old row versions accumulate (Postgres docs describe why routine vacuuming matters).

  • Isolation levels are well defined; Postgres default is Read Committed and the docs explain snapshot visibility rules.

  • Architecture: “process per user” model (one backend process per connection).

MySQL (InnoDB)

  • InnoDB uses “consistent reads” and snapshot behavior differs by isolation level; MySQL docs explain how REPEATABLE READ and READ COMMITTED affect consistent reads.

  • Generally great for many OLTP workloads; ecosystem maturity is strong.

A clean interview positioning

  • “If we need advanced SQL, strong extension ecosystem, complex queries, and richer indexing types → Postgres.”

  • “If we want simpler operational patterns in some orgs, wide community tooling, and a straightforward OLTP setup → MySQL.”

  • Then ask: “What’s the workload? write-heavy? read-heavy? latency SLA? HA requirements?”


Interview Question 12

Q: How does PostgreSQL handle concurrency internally (MVCC + isolation)?

Simple answer

Postgres keeps multiple versions of rows so readers don’t block writers (most of the time). That’s MVCC.

Deeper answer

  • Postgres MVCC behavior is documented under Concurrency Control.

  • Isolation levels (Read Committed / Repeatable Read / Serializable) define what snapshot you see; Postgres docs clearly describe that Read Committed sees a snapshot per query, while Repeatable Read uses a snapshot from transaction start.

  • Because old row versions remain, VACUUM is needed to reclaim space and prevent bloat; the docs explain dead row versions and vacuum behavior.


Interview Question 13

Q: MongoDB concurrency/consistency—how do read/write concerns work?

Simple answer

MongoDB lets you tune how consistent reads are and how durable writes are.

Deeper answer

  • Write concern controls acknowledgment requirements.

  • Read concern "snapshot" gives snapshot guarantees only when the transaction commits with write concern "majority".
    This is a good point to mention when someone says “NoSQL is always eventually consistent” — MongoDB lets you choose the consistency/durability tradeoff depending on read/write concern and topology.


Interview Question 14

Q: Why is Oracle paid? Is it because Oracle is “more consistent” than Postgres?

Simple answer

No—both can be strongly consistent and ACID. Oracle is paid because of licensing + enterprise-grade features + paid support.

Deeper answer with specifics

  • Oracle has formal licensing documents detailing editions, entitlements, and restrictions.

  • Oracle also provides structured support tiers (Premier/Extended/Sustaining) described in its Lifetime Support Policy.

  • Oracle RAC is a flagship HA/scale architecture positioned as a major enterprise feature.
    So the “paid” part is commercial licensing + bundled enterprise tooling + support lifecycle + certain advanced features (often HA, clustering, management packs, etc.), not “Oracle is the only consistent DB.”

Meanwhile PostgreSQL is open source under the PostgreSQL License, and the project states it’s committed to remaining free and open source.


Follow-up questions I’d ask

  1. Explain MVCC vs locking reads in Postgres. When do readers still block writers?

  2. What causes table bloat in Postgres and how do you fix it?

  3. In MySQL InnoDB, what changes between REPEATABLE READ and READ COMMITTED?

  4. How would you design HA for each DB (replication, failover, RPO/RTO)?

  5. When does CAP theorem actually apply in DB selection? (distributed partitions)


6) Distributed Locking: “Redlock / Redis Locks” Style Question

Interview Question 15

Q: How do distributed locks work (e.g., Redis)?

Simple answer

A distributed lock makes sure only one process across many machines enters a critical section.

Deeper answer

Redis documents a distributed lock pattern and explains why you need it when multiple processes coordinate shared resources.

Key interview points:

  • A lock must have expiry (to avoid deadlocks if the holder crashes).

  • Unlock must be safe (only the owner can unlock).

  • In distributed systems, clock drift, network delays, and failover make “perfect locks” hard.
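
A rough sketch of the acquire/release pattern, assuming the Jedis client (method names can vary slightly between Jedis versions):

import java.util.Collections;
import java.util.UUID;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.params.SetParams;

public class RedisLockSketch {

    // Delete the key only if it still holds our token (atomic compare-and-delete on the server).
    private static final String UNLOCK_SCRIPT =
            "if redis.call('get', KEYS[1]) == ARGV[1] then return redis.call('del', KEYS[1]) else return 0 end";

    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            String lockKey = "lock:order:42";               // hypothetical resource
            String token = UUID.randomUUID().toString();    // identifies this lock holder

            // SET key token NX EX 30: succeeds only if the key is absent; the expiry
            // prevents a crashed holder from dead-locking everyone else.
            String reply = jedis.set(lockKey, token, SetParams.setParams().nx().ex(30));

            if ("OK".equals(reply)) {
                try {
                    // ... critical section ...
                } finally {
                    jedis.eval(UNLOCK_SCRIPT,
                            Collections.singletonList(lockKey),
                            Collections.singletonList(token));
                }
            }
        }
    }
}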


Follow-up questions I’d ask

  1. What happens if the client pauses (GC stop-the-world) while holding the lock?

  2. How do you prevent “unlocking someone else’s lock”?

  3. When should you avoid distributed locks and use DB constraints / idempotency instead?


7) Production Triage: “Heap is 99% Full” (What Do You Do?)

Interview Question 16

Q: Suppose JVM heap is ~99% utilized and errors are happening. What do you do?

Simple answer

  • Restart might reduce symptoms, but first find the root cause.

  • Check logs/metrics, take a heap dump, identify the leak or high allocation.

Strong step-by-step answer (what seniors say)

  1. Stabilize

    • If OOM is imminent, reduce load (rate limit), scale out, or temporarily increase heap to stop the bleeding (but don’t stop there).

  2. Decide: leak vs spike vs mis-sizing

    • Leak: heap grows steadily and doesn’t come down after GC cycles.

    • Spike: heap jumps during bursts and falls after GC.

    • Mis-sizing: heap too small for normal steady-state.

  3. Collect evidence

    • Heap dump at high-water mark, GC logs, thread dumps.

    • Use JDK Flight Recorder (JFR) for low-overhead profiling in production-style environments. JFR is described as an observability/monitoring framework built into the JVM.

    • If memory pressure might be off-heap/native, use Native Memory Tracking (NMT); Oracle docs describe NMT and that you access it via jcmd.

  4. Tune GC only after identifying the problem

    • Default in many JDKs is G1; good general-purpose.

    • For low-latency large heaps, consider ZGC or Shenandoah:

      • ZGC: scalable low-latency collector (OpenJDK project page).

      • Shenandoah: low-pause collector (JEP 189).

    • If interviewer mentions “GC1” they likely mean G1GC (common).

  5. Fix

    • Remove unbounded caches

    • Fix object retention (static maps, listeners, threadlocals)

    • Add backpressure for high allocation paths

    • Add memory/GC dashboards + alerts


Follow-up questions I’d ask

  1. How do you distinguish heap leak vs native memory leak? (heap dump vs NMT)

  2. What do you look for in a heap dump?

  3. When do you pick ZGC vs G1 vs Shenandoah?

  4. What metrics/alerts would you add so it never surprises you again?


Closing: How to Answer These in an Interview (Fast + Clear)

A good pattern that works for both freshers and experienced candidates:

  1. Define the concept in one sentence.

  2. Give a practical example (where it’s used in real systems).

  3. Discuss tradeoffs (pros/cons).

  4. Offer a better alternative / optimization when applicable.

If you are targeting a specific level (SDE‑1, SDE‑2, or Senior), tune the depth of these answers to that level: same topics, adjusted depth.