What to Expect in 75 Minutes
The Format (confirmed from Optiver)
- ~10 min: Intro + warm-up (who you are, why Optiver, why tech in trading)
- ~25 min: Code snippet review — read code, find bugs/smells, suggest improvements
- ~15 min: Simple deployment plan for what you just reviewed
- ~15 min: CS fundamentals discussion — architecture, concurrency, networking, memory
- ~10 min: Behavioral + your questions
Optiver's Key Signal: HOW you think, not WHAT you know
- Ask clarifying questions before diving in — show you understand constraints
- Think out loud at all times — silence is penalized
- Engage with hints — if interviewer nudges you, pick it up and run with it
- Communicate trade-offs explicitly — "I chose X over Y because..."
- It's collaborative, not adversarial — treat it like pair programming
Your Strongest Angles (use these!)
- V2X intern: TTL caching, data adapters, real-time dashboards → real production systems
- ETH Big Data + Probabilistic AI → you understand performance and uncertainty
- TORCS KD-Tree project → you reason about time complexity and algorithmic choices
- Algorithms Lab at ETH → competitive-level DSA, you're current on this
- Head of Engineering @ ETH Entrepreneur Club → ownership, architecture decisions
Code Review — What Optiver Is Testing
They'll show you a short snippet (Python, Java, or C++ — pick your strongest). You need to do a structured review, not just spot typos.
🐛 Correctness & Edge Cases
- Off-by-one errors in loops
- Null/None pointer dereferences
- Integer overflow (critical in trading: prices, quantities)
- Empty collection handling
- Concurrency bugs: race conditions, non-atomic operations
- Exception handling — are errors swallowed silently?
⚡ Performance & Complexity
- Unnecessary O(n²) where O(n log n) or O(n) is possible
- Repeated work inside loops that could be cached
- Wrong data structure choice (e.g. membership test on a list is O(n); on a set/hash set it's O(1))
- Memory allocations in hot paths (critical for low-latency)
- String concatenation in loops (use StringBuilder/join)
- Locking granularity — too coarse = contention, too fine = complex
🏗️ Code Quality & Design
- Single Responsibility Principle — is the function doing too much?
- Magic numbers/strings — should be named constants
- Variable names — are they descriptive?
- Dead code or unreachable branches
- Testability — is this easily unit-testable?
- Missing input validation
💬 How to Structure Your Answer
- Step 1: "Let me read through this first..." (take 30-60s, read fully)
- Step 2: "At a high level, this code is doing X..." (show you understood it)
- Step 3: "I see a few issues, let me go through them by severity..."
- Step 4: Prioritize — critical (correctness) → performance → style
- Step 5: "Here's how I'd fix the most important one..." (write the fix)
- Step 6: "Are there specific constraints I should consider?"
```python
def get_prices(order_ids):
    prices = []
    for id in order_ids:
        if id in prices:  # BUG 1
            continue
        price = db.query(f"SELECT price FROM orders WHERE id={id}")  # BUG 2, 3
        prices.append(price)
    return prices
```
→ BUG 1: 'if id in prices' checks a list of prices, not ids — logic error
→ BUG 2: SQL injection via f-string — use parameterized queries
→ BUG 3: N queries to DB in a loop — should batch: WHERE id IN (...)
→ IMPROVEMENT: No error handling if DB query fails
→ IMPROVEMENT: No type hints — the return type and element types are unclear to callers
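A cleaned-up version might look like the sketch below. The `db` object and its `execute` method are hypothetical stand-ins for a DB-API-style connection ("?" placeholders are sqlite-style; drivers vary), shown here to make the fixes concrete: deduplicate ids, batch into one parameterized query, surface errors instead of swallowing them.

```python
import sqlite3
from typing import Dict, List


def get_prices(order_ids: List[int], db) -> Dict[int, float]:
    """Fetch prices for the given order ids in one batched, parameterized query.

    `db` is a hypothetical DB-API-style connection (e.g. sqlite3.Connection);
    adjust placeholder syntax to your driver.
    """
    unique_ids = list(dict.fromkeys(order_ids))  # dedupe, preserve order
    if not unique_ids:
        return {}
    placeholders = ",".join("?" for _ in unique_ids)
    query = f"SELECT id, price FROM orders WHERE id IN ({placeholders})"
    try:
        # One round trip instead of N; values are bound, not interpolated.
        rows = db.execute(query, unique_ids).fetchall()
    except Exception as exc:
        raise RuntimeError(f"price lookup failed for {len(unique_ids)} ids") from exc
    return {row_id: price for row_id, price in rows}
```

Returning a dict also makes the id→price mapping explicit, which the original list of bare prices obscured.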
Practice — Review These Snippets
Read each snippet as if the interviewer just showed it to you. Write your review in the box, then reveal the answer to compare.
```python
import threading

class OrderBook:
    def __init__(self):
        self.orders = {}
        self.total_volume = 0

    def add_order(self, order_id, price, quantity):
        self.orders[order_id] = {"price": price, "qty": quantity}
        self.total_volume += price * quantity

    def remove_order(self, order_id):
        order = self.orders[order_id]
        self.total_volume -= order["price"] * order["qty"]
        del self.orders[order_id]

    def get_best_bid(self):
        best = 0
        for oid in self.orders:
            if self.orders[oid]["price"] > best:
                best = self.orders[oid]["price"]
        return best
```
Model Answer
- Thread safety: No locking on shared state — add_order/remove_order will race. With multiple threads (common in trading), self.orders and self.total_volume can get corrupted. Need a mutex or use lock-free structures.
- KeyError: remove_order doesn't check if order_id exists — will throw KeyError on unknown ID. Add a check or use .pop() with default.
- get_best_bid is O(n): Scanning all orders every call is too slow for a hot path. Use a price-sorted structure (heap or sorted map) for O(1) peek of the best price with O(log n) updates.
- No side distinction: Orders have no buy/sell side — get_best_bid mixes bids and asks. An order book needs separate bid/ask sides.
- Float precision: price * quantity with floats will accumulate rounding errors. Use Decimal or integer cents.
- total_volume inconsistency: If remove_order throws (KeyError), total_volume won't be decremented but the operation is aborted — state stays consistent by accident, but fragile.
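A minimal sketch of the locking and KeyError fixes, keeping the original shape so the diff is easy to narrate in the interview (a real book would also track bid/ask sides and a price-sorted structure, as noted above):

```python
import threading


class OrderBook:
    """Sketch: one lock guards both fields so they change together, atomically."""

    def __init__(self):
        self._lock = threading.Lock()
        self.orders = {}
        self.total_volume = 0

    def add_order(self, order_id, price, quantity):
        with self._lock:  # orders and total_volume must update as a unit
            self.orders[order_id] = {"price": price, "qty": quantity}
            self.total_volume += price * quantity

    def remove_order(self, order_id):
        with self._lock:
            order = self.orders.pop(order_id, None)  # no KeyError on unknown id
            if order is not None:
                self.total_volume -= order["price"] * order["qty"]

    def get_best_bid(self):
        with self._lock:  # snapshot under the lock; still O(n) — see above
            return max((o["price"] for o in self.orders.values()), default=0)
```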
```cpp
struct Tick {
    int instrument_id;
    double price;
    int volume;
    char exchange[4];
};

void process_ticks(std::vector<Tick> ticks) {
    std::map<int, double> vwap;
    std::map<int, int> total_vol;
    for (int i = 0; i <= ticks.size(); i++) {
        Tick t = ticks[i];
        std::string ex(t.exchange);
        if (ex == "XAMS" || ex == "XLON") {
            vwap[t.instrument_id] += t.price * t.volume;
            total_vol[t.instrument_id] += t.volume;
        }
    }
    for (auto& [id, sum] : vwap) {
        vwap[id] = sum / total_vol[id];
    }
}
```
Model Answer
- Off-by-one (crash): `i <= ticks.size()` should be `i < ticks.size()` — the loop reads one past the end, undefined behaviour.
- Pass by value: `std::vector<Tick> ticks` copies the entire vector on every call. Should be `const std::vector<Tick>&`.
- String allocation in hot loop: `std::string ex(t.exchange)` constructs a string each iteration — and since `exchange` is `char[4]` with no room for a null terminator when it holds "XAMS", building a string from the raw pointer reads past the array (more UB). Use `strncmp` or `std::string_view(t.exchange, 4)`.
- Tick copied per iteration: `Tick t = ticks[i]` copies the struct. Use `const Tick& t = ticks[i]`.
- Division by zero: if `total_vol[id]` is 0 (possible when every matching tick has zero volume), the floating-point division yields inf or NaN — no crash, just silently corrupted VWAPs.
- Struct padding: the compiler inserts 4 bytes after `instrument_id` to align the `double`, so `sizeof(Tick)` is 24, not 20 — worth noting for cache-line packing.
- Double precision: accumulating `price * volume` in doubles loses precision for large totals. Consider fixed-point (integer ticks).
- No return value: the function computes VWAPs but never returns or stores them — dead computation.
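The corrected control flow, transliterated to Python for brevity (the C++-specific fixes — const references, string_view, bounds — are listed above; this sketch shows the exchange filter, the zero-volume guard, and actually returning the result):

```python
from collections import defaultdict

ALLOWED_EXCHANGES = {"XAMS", "XLON"}


def compute_vwap(ticks):
    """ticks: iterable of (instrument_id, price, volume, exchange) tuples.

    Returns {instrument_id: vwap}, skipping zero-volume instruments
    instead of dividing by zero.
    """
    notional = defaultdict(float)  # sum of price * volume per instrument
    total_vol = defaultdict(int)
    for instrument_id, price, volume, exchange in ticks:
        if exchange in ALLOWED_EXCHANGES:
            notional[instrument_id] += price * volume
            total_vol[instrument_id] += volume
    return {
        iid: notional[iid] / total_vol[iid]
        for iid in notional
        if total_vol[iid] > 0  # guard: no division by zero
    }
```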
```java
public class PriceCache {
    private HashMap<String, Double> cache = new HashMap<>();
    private long lastUpdate = System.currentTimeMillis();

    public Double getPrice(String instrument) {
        if (System.currentTimeMillis() - lastUpdate > 5000) {
            refreshAll();
        }
        return cache.get(instrument);
    }

    public void updatePrice(String instrument, double price) {
        cache.put(instrument, price);
        lastUpdate = System.currentTimeMillis();
    }

    private void refreshAll() {
        cache.clear();
        // ... reload from source
        lastUpdate = System.currentTimeMillis();
    }
}
```
Model Answer
- Not thread-safe: HashMap is not synchronized. Concurrent getPrice + updatePrice causes ConcurrentModificationException or corrupted state. Use ConcurrentHashMap or synchronize.
- Race on lastUpdate: long reads/writes are NOT atomic on 32-bit JVMs. Use AtomicLong or volatile.
- TOCTOU in getPrice: Between checking the timestamp and reading cache, another thread could call refreshAll() which calls cache.clear() — getPrice returns null for a valid instrument.
- refreshAll() is destructive: cache.clear() wipes everything before reload. During the reload window, all gets return null. Use swap pattern: build new map, then atomically replace reference.
- Every read can trigger refresh: If the source is slow, every reader thread that sees an expired cache will trigger refreshAll() simultaneously. Need a "refreshing" flag or scheduled refresh.
- Returning Double (boxed): Autoboxing/unboxing overhead on every call — it matters in a hot path. Also, cache.get() returns null for missing keys, so callers can't distinguish "instrument not tracked" from "price not yet loaded".
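The swap-pattern refresh sketched in Python (the Java fixes map directly: ConcurrentHashMap plus a volatile map reference). The `loader` callable is a hypothetical stand-in for "reload from source"; the key ideas are building the new map off to the side, swapping the reference atomically, and letting only one thread refresh while others keep serving the old snapshot:

```python
import threading
import time


class PriceCache:
    """Refresh by building a new dict and swapping the reference, so readers
    never see a half-cleared cache. `loader` is a hypothetical callable
    returning {instrument: price}."""

    def __init__(self, loader, ttl_seconds=5.0):
        self._loader = loader
        self._ttl = ttl_seconds
        self._cache = loader()
        self._last_update = time.monotonic()
        self._refresh_lock = threading.Lock()  # at most one refresher at a time

    def get_price(self, instrument):
        if time.monotonic() - self._last_update > self._ttl:
            # Non-blocking acquire: losers keep serving the old snapshot
            # instead of stampeding the source.
            if self._refresh_lock.acquire(blocking=False):
                try:
                    fresh = self._loader()   # build the new map off to the side
                    self._cache = fresh      # atomic reference swap
                    self._last_update = time.monotonic()
                finally:
                    self._refresh_lock.release()
        return self._cache.get(instrument)
```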
```python
def find_matching_trades(trades, target_value):
    results = []
    for i in range(len(trades)):
        for j in range(len(trades)):
            if trades[i]["value"] + trades[j]["value"] == target_value:
                results.append((trades[i], trades[j]))
    return results
```
Model Answer
- Pairs with itself: j starts from 0, so a trade can be paired with itself when i == j. Should start j from i+1.
- Duplicate pairs: Will find (A,B) and (B,A) — returns duplicates. Use j = i+1 to fix both issues.
- O(n²) complexity: Classic two-sum problem. Use a hash set: for each trade, check if (target - trade.value) is in the set. O(n) time.
- Float comparison: Using == on trade values (likely floats) is dangerous. Use abs(a + b - target) < epsilon.
- No input validation: If trades is empty or items don't have "value" key, this crashes.
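An O(n) fix might look like the sketch below: each trade is checked against earlier trades only (no self-pairing, each unordered pair reported once), and rounding stands in for an epsilon comparison — fine for illustration, but production code should use integer ticks or Decimal:

```python
def find_matching_trades(trades, target_value, decimals=9):
    """One pass with a dict of earlier trades keyed by rounded value: O(n)
    average, versus the original O(n^2) double loop."""
    seen = {}  # rounded value -> earlier trades with that value
    results = []
    for trade in trades:
        complement = round(target_value - trade["value"], decimals)
        # Only earlier trades are candidates: no self-pair, no (B, A) duplicate.
        for earlier in seen.get(complement, []):
            results.append((earlier, trade))
        seen.setdefault(round(trade["value"], decimals), []).append(trade)
    return results
```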
Simple Deployment Plan — Framework
After code review, they'll ask: "How would you deploy this to production?" This is NOT a complex SRE question — it's testing whether you think beyond just writing code.
1. Pre-deployment
- Code review + tests passing (unit, integration)
- Feature flags / toggles to control rollout
- Config management — no hardcoded secrets, use env vars or secrets manager
- Build artifact versioned and stored (Docker image, JAR, wheel)
- Staging environment validation — does it work in a prod-like setup?
2. Deployment Strategy
- Blue/green deploy: spin up new version, switch traffic, keep old as fallback
- Canary release: route 5-10% of traffic to new version, watch metrics
- Rolling update: replace instances one-by-one (good for stateless services)
- Key question to ask: "Is this service stateful or stateless?"
- For trading systems: zero-downtime is non-negotiable → prefer blue/green
3. Observability
- Logging: structured logs with request IDs for traceability
- Metrics: latency (p50/p99), error rate, throughput — not just uptime
- Alerts: define thresholds BEFORE deploying, not after
- Health checks: readiness (is it ready for traffic?) vs liveness (is it alive?)
- At Optiver scale: microsecond-level latency monitoring matters
4. Rollback Plan
- ALWAYS have a rollback plan — mention this proactively
- Blue/green: flip traffic back in seconds
- Database migrations: always write backward-compatible migrations first
- Define rollback triggers: error rate > X%, latency > Y ms
- Post-mortem process: what went wrong, how to prevent it
Practice — Deployment Scenarios
The interviewer describes a system and asks "How would you deploy this?" Think through your answer, then reveal the model response.
Model Answer
Pre-deployment:
- Unit tests on pricing formulas + integration tests with sample market data
- Validate outputs against existing pricing engine in staging (shadow mode)
- Feature flag to switch between old and new engine per-instrument
Deployment strategy:
- Blue/green — critical for trading: can't have stale prices reaching dashboards
- Run new engine in shadow mode first: consume same market data, compute prices, compare to old engine output but don't write to DB
- Once shadow results match within tolerance, switch writes to new engine
Observability:
- Latency of computation (p50/p99) — traders need fresh prices
- Compare new prices to old prices — alert if delta exceeds threshold
- Queue consumer lag — if falling behind, market data is stale
- DB write latency and error rate
Rollback:
- Feature flag: flip back to old engine instantly
- Blue/green: keep old instance running for 24h after switch
- Trigger: if price deviation > X bps or latency > Y ms, auto-rollback
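The shadow-mode comparison above can be sketched as follows. Both engines are hypothetical callables (instrument data → price); only the old engine's output would be written downstream, while this harness collects deltas to decide whether the new engine is safe to promote:

```python
def shadow_compare(old_engine, new_engine, market_data, tolerance_bps=1.0):
    """Run both engines on the same inputs; return instruments where the
    new engine deviates from the old by more than tolerance_bps."""
    mismatches = []
    for instrument, data in market_data.items():
        old_price = old_engine(data)
        new_price = new_engine(data)
        if old_price == 0:
            continue  # skip dead instruments: avoid division by zero
        delta_bps = abs(new_price - old_price) / abs(old_price) * 1e4
        if delta_bps > tolerance_bps:
            mismatches.append((instrument, old_price, new_price, delta_bps))
    return mismatches
```

An empty mismatch list over a representative window is the "shadow results match within tolerance" signal for flipping the feature flag.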
Model Answer
Key insight: Never deploy schema changes and code changes together.
Phase 1 — Expand:
- Add column as NULLABLE (not NOT NULL) with DEFAULT — this avoids locking the table
- On Postgres, adding a column with a default is instant (metadata-only since PG11)
- All 12 services still work — they ignore the new column
Phase 2 — Migrate:
- Backfill risk_score values in batches (not one UPDATE for 50M rows — that locks the table)
- Deploy services one-by-one to start writing risk_score on new inserts
Phase 3 — Contract:
- Once all rows have risk_score and all services write it, add NOT NULL constraint
- This is the "expand-and-contract" migration pattern
Rollback:
- At any phase, rolling back is safe — old services don't read the column
- Mention: "I'd never alter a column type in-place on a table this hot — always add new, migrate, drop old"
Model Answer
- Strategy: Rolling update — stateless service, so instances can be replaced one-by-one with zero downtime. Simplest approach that works.
- Pre-deploy: Run API tests against staging, load test to confirm latency SLAs, Docker image tagged and pushed to registry.
- Health checks: Readiness probe (can serve requests?) and liveness probe (process alive?). Load balancer stops sending traffic to unready instances.
- Observability: Request latency, error rate (4xx/5xx), and throughput. Set alerts before deploy.
- Rollback: Kubernetes-style: if new pods fail readiness check, rollout stops automatically. Manual rollback = redeploy previous image tag.
- Bonus: Since it's read-only, you could also use canary (route 10% of traffic first) at almost no risk.
CS Fundamentals — Optiver's Favourite Topics
- Stack vs Heap: stack is fast (push/pop), heap is dynamic but slower (malloc/free)
- Cache hierarchy: L1 (~4 cycles) → L2 (~12) → L3 (~40) → RAM (~200) → Disk
- Cache locality: accessing contiguous memory is MUCH faster (array vs linked list)
- Struct padding: compiler adds padding for alignment — know how to minimize it
- Example: struct { char a; int b; char c; } = 12 bytes due to padding
- Memory leak vs dangling pointer vs buffer overflow
- GC languages (Java/Python) vs manual (C++) trade-offs for latency
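The 12-byte padding example above can be verified from Python with ctypes, which follows the platform's native alignment rules (12 and 8 bytes on typical x86-64; other ABIs may differ):

```python
import ctypes


class Padded(ctypes.Structure):
    # char a; int b; char c;  ->  a(1) + pad(3) + b(4) + c(1) + pad(3) = 12
    _fields_ = [("a", ctypes.c_char), ("b", ctypes.c_int), ("c", ctypes.c_char)]


class Reordered(ctypes.Structure):
    # int b; char a; char c;  ->  b(4) + a(1) + c(1) + pad(2) = 8
    _fields_ = [("b", ctypes.c_int), ("a", ctypes.c_char), ("c", ctypes.c_char)]

# Takeaway: ordering fields largest-first minimizes padding.
```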
- Thread vs process: threads share memory (fast comms, race risks), processes don't
- Race condition: two threads read-modify-write shared state without sync
- Mutex: mutual exclusion lock — only one thread at a time
- Deadlock: Thread A holds lock X waiting for Y, Thread B holds Y waiting for X
- Atomic operations: hardware-guaranteed read-modify-write (lock-free)
- Memory ordering: volatile (Java), std::atomic, memory barriers
- Producer-consumer: queue + condition variable pattern
- At Optiver: lock-free data structures used extensively for low latency
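A minimal mutex demo for the race-condition and mutex bullets above. `counter += 1` is a read-modify-write (load, add, store); two threads interleaving those steps lose updates. CPython's GIL often hides this particular race in practice, but the pattern is exactly what bites in C++ and Java:

```python
import threading


def concurrent_count(n_threads=4, n_iters=50_000, use_lock=True):
    """Increment a shared counter from several threads, optionally guarded."""
    counter = 0
    lock = threading.Lock()

    def work():
        nonlocal counter
        for _ in range(n_iters):
            if use_lock:
                with lock:        # whole read-modify-write is one critical section
                    counter += 1
            else:
                counter += 1      # racy in general: load/add/store can interleave

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter
```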
- CPU pipeline: fetch → decode → execute → writeback — branch misprediction flushes it
- Branch prediction: CPUs predict which branch is taken; misprediction = ~15 cycle penalty
- SIMD: process multiple data with one instruction (vectorization)
- False sharing: two threads modify different fields in the same cache line → cache invalidations
- NUMA: Non-Uniform Memory Access — memory closer to some CPUs than others
- Context switch: OS saves/restores thread state — expensive (~microseconds)
- TCP vs UDP: TCP = reliable ordered, UDP = fast unreliable. Trading often uses UDP multicast for market data
- Latency sources: propagation (~5µs/km), serialization, queuing, OS kernel
- Kernel bypass (DPDK, RDMA): skip OS stack entirely for sub-microsecond latency
- TCP Nagle's algorithm: buffers small packets — disable with TCP_NODELAY in trading
- Load balancing: round-robin, least connections, consistent hashing
- Connection pooling: avoid reconnect overhead in hot paths
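Disabling Nagle's algorithm, as mentioned above, is a one-line socket option; a minimal sketch:

```python
import socket


def make_low_latency_socket():
    """TCP socket with Nagle's algorithm disabled: small writes are sent
    immediately instead of being buffered while waiting for an ACK."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock
```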
- Hash map: O(1) avg get/put, O(n) worst case with collisions — know open addressing vs chaining
- Binary search tree: O(log n) but unbalanced degrades to O(n) → use Red-Black tree
- Priority queue (heap): O(log n) insert, O(1) peek — used for order books
- Order book: usually implemented as sorted map (price → queue of orders)
- Bloom filter: probabilistic set membership, no false negatives, small false positive rate
- Know when to use each and their space complexity too
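The heap-for-best-bid idea above, as a small sketch with lazy deletion (Python's heapq is a min-heap, so prices are negated; real books usually keep a sorted map of price level → queue of orders instead):

```python
import heapq


class BidSide:
    """Best-bid tracking: O(log n) insert, amortized O(log n) cancel,
    near-O(1) best lookup via a max-heap with lazy deletion."""

    def __init__(self):
        self._heap = []   # entries are (-price, order_id)
        self._live = set()  # order ids still in the book

    def add(self, order_id, price):
        self._live.add(order_id)
        heapq.heappush(self._heap, (-price, order_id))

    def remove(self, order_id):
        self._live.discard(order_id)  # heap entry cleaned up lazily in best()

    def best(self):
        # Pop cancelled orders off the top until a live one surfaces.
        while self._heap and self._heap[0][1] not in self._live:
            heapq.heappop(self._heap)
        return -self._heap[0][0] if self._heap else None
```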
- Process scheduling: preemptive (OS interrupts) vs cooperative
- Virtual memory + paging: address translation via page table, TLB caches it
- System calls: crossing user/kernel boundary is expensive (~microseconds)
- Signals: async notifications to processes (SIGTERM, SIGKILL)
- File descriptors: abstract handle for I/O — sockets are file descriptors
- mmap: map file/device into memory address space — faster than read/write for large files
Practice — Rapid-Fire Flashcards
The interviewer asks short CS questions. Think of your answer, then click to reveal. Practice explaining out loud.
- False sharing fix: pad or align hot fields onto separate cache lines (alignas(64) in C++).
- Hash map internals: bucket = hash(key) % capacity. Collision handling: Chaining — each bucket is a linked list of entries (simple but cache-unfriendly). Open addressing — on collision, probe the next slot (linear/quadratic/double hashing). Open addressing has better cache locality. Average O(1) get/put, but worst case O(n) if all keys hash to the same bucket. Load factor matters: rehash when >0.75 full. Java's HashMap switches from chaining to a red-black tree at 8 collisions per bucket.
- Diagnosing cache problems: use perf stat to count cache misses.
Behavioral Prep — STAR Stories from YOUR Experience
Optiver assesses: intellectual curiosity, ownership, collaboration, and resilience. Here are ready-made stories from your background.
Situation: During my thesis internship at MinervaS building the V2X co-pilot module
Task: I had to integrate heterogeneous traffic and weather data streams with different update frequencies and formats into a real-time advisory system
Action: I designed a spatio-temporal filtering pipeline with TTL-based caching to normalize staleness across streams. I introduced a fuzzy logic layer to handle uncertainty in sensor data rather than hard thresholds
Result: The system could provide explainable, real-time speed advice for heavy vehicles with latency under the operational requirement. It became the core of my Bachelor thesis (110/110 cum laude)
Situation: Starting at ETH Zürich, my first semester included Big Data and Probabilistic AI simultaneously
Task: Both courses demanded graduate-level understanding of distributed systems and Bayesian inference — well beyond my BSc curriculum
Action: I built a structured study schedule, connected concepts between courses (uncertainty in ML ↔ uncertainty in distributed systems), and focused on implementation to solidify theory
Result: Scored 5.5/6 in both courses — top tier at ETH — while also starting as Head of Engineering at the Entrepreneur Club
Situation: In the TORCS autonomous driving project, I had to choose a nearest-neighbor search approach for behavioral cloning
Task: Simple linear scan was too slow for real-time decisions during simulated driving
Action: I implemented a KD-Tree structure, which reduced nearest-neighbor search from O(n) to O(log n). I also added dynamic-k selection and out-of-distribution detection so the system would fall back to rule-based safety mechanisms when the KD-Tree returned unreliable neighbors
Result: The driver could navigate the track in real time with robust safety fallbacks — a deliberate design choice to prioritize reliability over raw speed
Quick Tips
→ Keep each STAR story under 2 minutes when spoken aloud
→ End with what you LEARNED, not just what happened
→ Optiver values intellectual honesty — "it didn't fully work because X" is fine if you learned from it
→ Prepare 2-3 questions about engineering culture, team structure, and what they're building
Questions to Ask — Show Intellectual Curiosity
Asking good questions is evaluated. These show you've researched Optiver and are genuinely curious — not just going through the motions.