Logic & Cloud Labs

Real-world engineering scenarios focused on infrastructure, cloud computing, debugging and incident response.

Incident 01 — API Gateway Failure

CRITICAL

At 02:14 UTC, monitoring systems detected a spike in failed requests.

[02:14:22] API Gateway online.
[02:14:31] High latency detected on auth-service.
[02:14:37] Retry threshold exceeded.
[02:14:44] JWT validation failed.
[02:14:48] Database timeout after 3000ms.
[02:14:55] Worker queue saturation detected.
[02:15:03] HTTP 500 returned to client.
[02:15:11] Automatic recovery attempt started.

Engineering Tasks

Identify which service appears to be failing first.
Analyze the relationship between the auth-service latency and the JWT validation failures.
Explain worker queue saturation.
Suggest engineering actions to reduce incident impact.
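
As a starting point for the first task, the log excerpt can be parsed and ordered by timestamp. This is a minimal sketch that assumes the `[HH:MM:SS] message` format shown above. The names `parseLogLine`, `firstSymptom` and the `SYMPTOM` keyword list are hypothetical and chosen only for illustration. The earliest symptom in a log is a lead, not proof of the root cause.

```javascript
// Log lines copied from the incident excerpt above.
const logLines = [
  "[02:14:22] API Gateway online.",
  "[02:14:31] High latency detected on auth-service.",
  "[02:14:37] Retry threshold exceeded.",
  "[02:14:44] JWT validation failed.",
  "[02:14:48] Database timeout after 3000ms.",
  "[02:14:55] Worker queue saturation detected.",
  "[02:15:03] HTTP 500 returned to client.",
  "[02:15:11] Automatic recovery attempt started.",
];

// Parse "[HH:MM:SS] message" into { seconds, message }.
// Returns null when the line does not match the expected format.
function parseLogLine(line) {
  const match = /^\[(\d{2}):(\d{2}):(\d{2})\]\s+(.*)$/.exec(line);
  if (!match) return null;
  const [, h, m, s, message] = match;
  return { seconds: Number(h) * 3600 + Number(m) * 60 + Number(s), message };
}

// Assumed symptom keywords; a real system would use log levels instead.
const SYMPTOM = /latency|exceeded|failed|timeout|saturation|500/i;

// Return the earliest log entry whose message looks like a symptom.
function firstSymptom(lines) {
  const symptoms = lines
    .map(parseLogLine)
    .filter((entry) => entry !== null && SYMPTOM.test(entry.message))
    .sort((a, b) => a.seconds - b.seconds);
  return symptoms.length > 0 ? symptoms[0] : null;
}

console.log(firstSymptom(logLines).message);
// prints "High latency detected on auth-service."
```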

Incident Resolution

What is the MOST probable root cause of the incident?

Incident 07 — Kubernetes Node Failure

HIGH

Several workloads became unavailable after one cluster node stopped responding.

Node status: NotReady
Pods unavailable
Scheduler waiting for resources

Infrastructure Investigation

What is the infrastructure impact?

Code Investigation — SQL Injection

CRITICAL

Analyze the following backend authentication code.


// User input is concatenated directly into the SQL string.
const query =
   "SELECT * FROM users " +
   "WHERE email = '" + email + "' " +
   "AND password = '" + password + "'";

Security Analysis

What is the MOST critical vulnerability?
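
For comparison with the concatenated query above, a common remediation is a parameterized query, where user input travels separately from the SQL text. The sketch below assumes a PostgreSQL driver that accepts a `{ text, values }` query object with `$1`, `$2` placeholders, as node-postgres does. The function name `buildLoginQuery` is hypothetical. Comparing a stored password hash rather than a plain-text password would be a further improvement that is outside this sketch.

```javascript
// Build a parameterized login query.
// Assumption: the driver accepts { text, values }, as node-postgres does.
function buildLoginQuery(email, password) {
  return {
    // The SQL text is constant; it never contains user input.
    text: "SELECT * FROM users WHERE email = $1 AND password = $2",
    // Values are sent separately and are never parsed as SQL.
    values: [email, password],
  };
}

// A classic injection payload stays inert inside the values array.
const loginQuery = buildLoginQuery("user@example.com", "' OR '1'='1");
console.log(loginQuery.text);
console.log(loginQuery.values);
```

With node-postgres, the object would be passed as `client.query(loginQuery)`.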

Code Investigation — Memory Leak

HIGH

Production memory usage increases continuously over time.


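// Module-level array that lives for the lifetime of the process.
const cache = [];

app.get("/data", async (req, res) => {

   const result = await fetchData();

   // Every request appends an entry; nothing is ever removed.
   cache.push(result);

   res.send(result);

});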

Backend Analysis

What issue is MOST likely happening?
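
One possible mitigation is to cap the cache size and evict old entries. The sketch below is a small least-recently-used cache built on `Map`, which preserves insertion order. The class name `BoundedCache` and the idea of keying results (for example by request parameters) are assumptions, since the original snippet stores results without keys.

```javascript
// A size-limited cache that evicts the least recently used entry.
class BoundedCache {
  constructor(maxEntries) {
    this.maxEntries = maxEntries;
    this.entries = new Map(); // Map iterates in insertion order
  }

  get(key) {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key);
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key, value) {
    if (this.entries.has(key)) this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      // The first key in iteration order is the least recently used.
      const oldestKey = this.entries.keys().next().value;
      this.entries.delete(oldestKey);
    }
  }

  get size() {
    return this.entries.size;
  }
}

const boundedCache = new BoundedCache(2);
boundedCache.set("a", 1);
boundedCache.set("b", 2);
boundedCache.get("a");    // "a" is now the most recently used
boundedCache.set("c", 3); // evicts "b"
console.log(boundedCache.size); // prints 2
```

Memory use then stays bounded by `maxEntries` regardless of how many requests arrive.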

Code Investigation — Exposed Secret

CRITICAL

A Git repository containing credentials became public.


const AWS_SECRET_KEY = "AKIAXXXXXXXXXXXXX";

Cloud Security Investigation

What is the MOST severe issue?
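
To keep credentials out of source control, one common approach is to read them from the environment at startup. The sketch below uses the standard AWS variable name `AWS_SECRET_ACCESS_KEY`; the function name `loadAwsSecret` is hypothetical. This only prevents a repeat: a key that has already been public should be treated as compromised and rotated.

```javascript
// Read the secret from the environment and fail fast when it is missing.
// The env parameter defaults to process.env and can be replaced in tests.
function loadAwsSecret(env = process.env) {
  const secret = env.AWS_SECRET_ACCESS_KEY;
  if (!secret) {
    throw new Error("AWS_SECRET_ACCESS_KEY is not set");
  }
  return secret;
}

// Usage with an explicit environment object (placeholder value only).
const secretFromEnv = loadAwsSecret({ AWS_SECRET_ACCESS_KEY: "placeholder-value" });
console.log(secretFromEnv.length > 0); // prints true
```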

Incident 09 — Redis Queue Saturation

HIGH

Background jobs are delayed and message processing time has increased drastically.

Redis latency exceeded threshold
Worker queue backlog increasing
Processing delay: 4800ms

Queue Investigation

What infrastructure issue is MOST likely occurring?

Incident 10 — PostgreSQL Replication Lag

CRITICAL

Users report outdated account information after recent transactions.

Replication lag exceeded 60 seconds
Replica node delayed
Read replica outdated

Database Investigation

What platform risk is MOST likely happening?
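
One way to limit stale reads is to route queries by replica lag. The sketch below is illustrative only: the 5-second threshold, the `readYourWrites` option and the function name `chooseReadTarget` are all assumptions, and it presumes the application can obtain the current lag from its monitoring.

```javascript
// Assumed threshold; the right value depends on the workload.
const MAX_REPLICA_LAG_SECONDS = 5;

// Decide where a read should go.
// Reads that must reflect a just-completed write always use the primary.
function chooseReadTarget(replicaLagSeconds, { readYourWrites = false } = {}) {
  if (readYourWrites || replicaLagSeconds > MAX_REPLICA_LAG_SECONDS) {
    return "primary";
  }
  return "replica";
}

console.log(chooseReadTarget(60)); // prints "primary"
console.log(chooseReadTarget(1));  // prints "replica"
console.log(chooseReadTarget(1, { readYourWrites: true })); // prints "primary"
```

Sending every read to the primary during a lag spike increases its load, so the fallback is a trade-off rather than a free fix.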

Code Investigation — Missing Error Handling

HIGH

The production API crashes when the database fails unexpectedly.


app.get("/users", async (req, res) => {

   const users = await database.getUsers();

   res.send(users);

});

Backend Stability Analysis

What engineering issue exists in this code?
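
A minimal sketch of one remediation is to wrap the database call in `try`/`catch` so a failure produces an HTTP 500 response instead of an unhandled promise rejection. The factory name `makeUsersHandler` and the error payload are hypothetical; `database.getUsers()` is taken from the snippet above, and `res.status(...).send(...)` follows the Express response API.

```javascript
// Create a route handler that never lets a database error escape.
function makeUsersHandler(database) {
  return async function usersHandler(req, res) {
    try {
      const users = await database.getUsers();
      res.send(users);
    } catch (err) {
      // Log server-side; do not leak internal details to the client.
      console.error("getUsers failed:", err.message);
      res.status(500).send({ error: "Internal Server Error" });
    }
  };
}

// Usage with a stub database and a stub response object.
const failingDatabase = {
  getUsers: async () => {
    throw new Error("connection timeout");
  },
};
const stubResponse = {
  status(code) {
    console.log("status:", code);
    return this;
  },
  send(body) {
    console.log("body:", JSON.stringify(body));
    return this;
  },
};
makeUsersHandler(failingDatabase)({}, stubResponse);
```

In an Express app the handler would be registered as `app.get("/users", makeUsersHandler(database))`.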

Code Investigation — Blocking Operation

HIGH

The API becomes extremely slow under high traffic.


app.get("/report", (req, res) => {

   const file = fs.readFileSync("huge-report.csv");

   res.send(file);

});

Performance Analysis

What is the MOST probable issue?
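
One alternative to `fs.readFileSync` is to stream the file, so the event loop stays free and the whole file is never held in memory at once. The sketch below uses `fs.createReadStream` and `pipe`; the function name `sendReport` and the error response are hypothetical, and `res` is assumed to be a writable HTTP response such as the one Express provides.

```javascript
const fs = require("fs");

// Stream a file to the response in chunks instead of reading it synchronously.
function sendReport(filePath, res) {
  const stream = fs.createReadStream(filePath);

  stream.on("error", () => {
    // Illustrative error policy: report a generic failure to the client.
    res.statusCode = 500;
    res.end("Could not read report");
  });

  // pipe() handles backpressure between the file and the client.
  stream.pipe(res);
  return stream;
}
```

In an Express app this could be wired up as `app.get("/report", (req, res) => sendReport("huge-report.csv", res))`.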

Incident 11 — TLS Certificate Expired

MEDIUM

Users cannot establish secure HTTPS connections to the platform.

NET::ERR_CERT_DATE_INVALID
TLS certificate expired
Secure connection rejected

Security Investigation

What impact does this MOST likely create?
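
Expiry incidents like this one are often caught earlier by alerting on the days remaining before a certificate expires. The sketch below is a pure helper under stated assumptions: the function name `daysUntilExpiry` and the 14-day warning threshold are hypothetical, and the expiry date is passed in as a string that `Date` can parse.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Whole days between now and the certificate expiry date.
// A negative result means the certificate has already expired.
function daysUntilExpiry(validTo, now = new Date()) {
  const expiry = new Date(validTo);
  if (Number.isNaN(expiry.getTime())) {
    throw new Error("Unparseable expiry date: " + validTo);
  }
  return Math.floor((expiry.getTime() - now.getTime()) / MS_PER_DAY);
}

// Assumed alerting threshold.
const WARN_BELOW_DAYS = 14;

const remainingDays = daysUntilExpiry(
  "2030-01-31T00:00:00Z",
  new Date("2030-01-01T00:00:00Z")
);
console.log(remainingDays); // prints 30
console.log(remainingDays < WARN_BELOW_DAYS); // prints false
```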