Behavioral · 7 min read

Data-Driven Technical Decisions

Use metrics, experiments, and analysis to make technical decisions and quantify impact.

Why This Matters

Senior engineers don't rely on intuition alone. They measure, experiment, and quantify impact. Interviewers want to see: Can you define success metrics? Do you run experiments? Can you interpret data?

Example 1: Performance Optimization

Situation: "Users complained the dashboard was slow. I suspected the issue was N+1 queries."

Measurement: "I instrumented the code with DataDog APM. Found 100+ database queries per page load. P95 load time was 4.2 seconds."

Action: "Implemented query batching + Redis caching. Deployed to 5% of users first (canary)."
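The batching-plus-caching fix described above can be sketched roughly as follows. This is an illustrative sketch, not the actual system: the `db` and `cache` objects and their method names are hypothetical stand-ins for a real database client and a Redis client.

```python
# Hypothetical sketch of fixing an N+1 pattern: one batched query for
# all cache misses, instead of one query per item per page load.

def fetch_users_batched(user_ids, db, cache):
    """Fetch users with a single batched query, consulting a cache first."""
    results = {}
    missing = []
    for uid in user_ids:
        cached = cache.get(uid)          # a Redis GET in production
        if cached is not None:
            results[uid] = cached
        else:
            missing.append(uid)
    if missing:
        # One query for all misses, instead of len(missing) queries.
        for row in db.query_many(missing):
            results[row["id"]] = row
            cache.set(row["id"], row)    # warm the cache for next time
    return results
```

The key property is that database round-trips no longer scale with the number of items on the page, which is exactly what collapses 100+ queries into one or two.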

Result: "P95 load time dropped to 800ms (81% improvement). Rolled out to 100%. User engagement (daily active users) increased 12%."

Why it works: Specific metrics (P95), before/after comparison, gradual rollout, business impact.
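The before/after comparison is simple arithmetic, but it is worth being able to produce it on the spot in an interview. Using the numbers from this example:

```python
# Relative improvement from a before/after latency comparison.

def pct_improvement(before_ms: float, after_ms: float) -> float:
    """Latency reduction as a percentage of the baseline."""
    return (before_ms - after_ms) / before_ms * 100

print(round(pct_improvement(4200, 800)))  # 4.2 s -> 800 ms: prints 81
```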

Example 2: A/B Testing a Technical Change

Situation: "Team debated whether to migrate from REST to GraphQL for our mobile app."

Experiment: "Built GraphQL API for 1 screen (user profile). A/B tested: 50% REST, 50% GraphQL for 2 weeks."

Metrics tracked: API latency, payload size, crash rate, screen load time, user retention.

Result: "GraphQL: 40% smaller payloads, 200ms faster screen loads, but 2x increase in server CPU. Decided against full migration due to cost. Used GraphQL only for slow, data-heavy screens."

Why it works: Real experiment, multiple metrics, cost-benefit analysis, nuanced decision.
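A 50/50 split like the one in this experiment is typically done by hashing a stable user id, so each user deterministically stays in the same arm for the full two weeks. A minimal sketch (the experiment name and bucket labels are illustrative):

```python
import hashlib

def ab_bucket(user_id: str, experiment: str = "graphql-profile") -> str:
    """Deterministically assign a user to 'rest' or 'graphql' (50/50)."""
    h = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return "graphql" if int(h, 16) % 2 == 0 else "rest"

# The same user always lands in the same arm:
assert ab_bucket("user-42") == ab_bucket("user-42")
```

Salting the hash with the experiment name keeps assignments independent across experiments, so users in the GraphQL arm here are not systematically in one arm of some other test.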

Example 3: Quantifying Tech Debt Impact

Situation: "PM asked if we should refactor the payment service."

Analysis: "I pulled metrics: 15 production incidents in 6 months, avg 45 min to debug each. 80% were caused by tight coupling. Estimated 1 sprint to refactor."

Business case: "15 incidents × 45 min ≈ 11 hours of debugging per 6 months; adding the interruption and context-switching cost around each incident, we estimated ~$3k/month of lost engineering time. Refactor cost: ~$8k (1 sprint), so ROI break-even in ~3 months."
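The break-even arithmetic behind a business case like this can be made explicit (a one-off sketch, not a real costing model):

```python
# Months until cumulative savings cover a one-time refactor cost.

def break_even_months(one_time_cost: float, monthly_savings: float) -> float:
    return one_time_cost / monthly_savings

# Refactor cost ~$8k, estimated savings ~$3k/month:
print(round(break_even_months(8000, 3000), 1))  # prints 2.7, i.e. ~3 months
```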

Result: "Got buy-in. Post-refactor: incidents dropped to 2 in 6 months (87% reduction)."

Why it works: Quantified pain (incident count, time), ROI calculation, validation with post-refactor data.

Metrics to Track

  • Performance: P50/P95/P99 latency, throughput (RPS), error rate
  • Reliability: Uptime %, MTTR (mean time to recovery), incident count
  • User impact: Conversion rate, bounce rate, daily active users
  • Developer productivity: Build time, deploy frequency, PR cycle time
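Percentile latencies like the P50/P95/P99 above are computed from raw samples, not averages. A sketch using only the standard library (the sample data is made up to show a fast median with a slow tail):

```python
import statistics

def latency_percentiles(samples_ms):
    """P50/P95/P99 from raw latency samples, via inclusive quantiles."""
    q = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {"p50": q[49], "p95": q[94], "p99": q[98]}

latencies = [100] * 90 + [500] * 9 + [3000]  # mostly fast, one slow outlier
print(latency_percentiles(latencies))
```

Note how the mean of this data (~160 ms) would hide the tail entirely, which is why tail percentiles, not averages, belong in latency SLOs.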

Red Flags

  • "We made the change and it felt faster" → no measurement
  • "I assumed it would improve performance" → no validation
  • "The metric improved but we don't know why" → no causal understanding

Best Practices

  1. Define success metrics upfront: Before starting work, agree on what "better" means
  2. Measure before AND after: Establish baseline, measure impact
  3. Use gradual rollouts: Feature flags, canary deploys, A/B tests
  4. Monitor business metrics: Not just technical metrics—user behavior, revenue, retention
  5. Share results: Post-mortems, dashboards, team retrospectives
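Gradual rollouts like the canary in Example 1 are usually gated on a guardrail comparison against the baseline cohort. A hypothetical sketch of such a check (the 10% tolerance is an arbitrary example threshold):

```python
# Guardrail check: proceed with rollout only if the canary's error
# rate stays within a tolerated relative increase over the baseline.

def canary_healthy(canary_error_rate: float,
                   baseline_error_rate: float,
                   max_relative_increase: float = 0.10) -> bool:
    allowed = baseline_error_rate * (1 + max_relative_increase)
    return canary_error_rate <= allowed

assert canary_healthy(0.010, 0.010)      # no regression: proceed
assert not canary_healthy(0.020, 0.010)  # errors doubled: halt rollout
```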

Interview Questions

  • How do you measure the success of a technical change?
  • Tell me about a time you used data to make a decision.
  • How do you prioritize performance optimizations?
  • Describe a time you ran an experiment to validate a hypothesis.

Master Data-Driven Answers

Practice quantifying impact with AI-powered mock interviews.
