System Design Interview - Building the Backend for a News In Shorts App (Part -2)
S1E2 - Backend of News Aggregator
Hey everyone, and welcome to Episode 2 of system design interview series!
We will go through a series of real-world system design questions and answers. Today I will design the News In Shorts application, along with how to answer in a real interview setting.
This is Part 2. If you want to read Part 1, check my earlier article here:
System Design Interview - Building the Backend for a News In Shorts App (Part-1)
In this part we will cover :-
Back-of-the-Envelope Calculations
Deep-dives
Enhancements
Before doing a deep dive it is better to do some calculations as this will impact our architecture choices and trade-offs.
7. Back-of-the-Envelope Calculations (3-5 minutes)
The most profound conclusion from this exercise is the stark imbalance between read and write operations: optimise aggressively for read latency.
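As an illustrative sketch, here is how the figures used below (~50k RPS, ~40 GB of article data per year) can be derived. All the inputs (DAU, requests per user, peak factor, article volume and size) are hypothetical interview assumptions, not measured numbers:

```python
# Back-of-the-envelope arithmetic; every input below is an assumption.
DAU = 50_000_000           # daily active users (assumed)
REQS_PER_USER = 20         # feed requests per user per day (assumed)
PEAK_FACTOR = 4            # peak-to-average traffic ratio (assumed)

daily_requests = DAU * REQS_PER_USER        # 1 billion requests/day
avg_rps = daily_requests / 86_400           # ~11.6k RPS on average
peak_rps = avg_rps * PEAK_FACTOR            # ~46k RPS, i.e. roughly 50k

ARTICLES_PER_DAY = 10_000  # ingested articles/day (assumed)
ARTICLE_SIZE_KB = 10       # average article payload (assumed)
yearly_gb = ARTICLES_PER_DAY * ARTICLE_SIZE_KB * 365 / 1_000_000  # ~36.5 GB/year
```

With these assumptions the peak read rate lands near 50k RPS and yearly article storage near 40 GB, which is what the deep dives below are sized against.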
8. Deep-dives (10-15 minutes)
Now that we've estimated scale (~50k RPS, ~40 GB article data/year), let's address NFRs like scalability, resilience, and availability. Optimize for high reads, efficient storage, and minimal downtime.
For structured data (e.g., user profiles/sources), PostgreSQL is reliable with indexing/partitioning. For semi-structured articles (high volume, evolving schemas), NoSQL (e.g., Cassandra/MongoDB) scales better horizontally. PostgreSQL can still serve read-heavy feeds if paired with aggressive caching.
Let's dive into the key NFRs, with guidance on how to articulate these in system design interviews. Use these as templates to sound confident and structured!
Deep Dive 1: Handling High Scale and Resilience
How to frame your answer in interviews: "When designing for high scale, I anchor on microservices, async communication, and caching."
Microservices Architecture: "We use microservices for independent scaling and fault isolation—if one (e.g., feed generation) fails, the app runs. Trade-off: Distributed complexity, but resilience wins for high-traffic feeds."
Asynchronous Communication (Kafka): "Kafka decouples services—if one (e.g., ingestion) slows, others continue. It buffers durably, absorbing spikes like viral events."
Stateless, Horizontally Scalable Services: "Stateless microservices scale behind load balancers with CPU/memory autoscaling for unpredictable loads (e.g., global surges)."
Deep Dive 2: Ensuring High Availability (99.99% Uptime Requirement)
How to confidently explain: "I design assuming failures, with redundancy to minimize downtime during breaking news."
Load Balancing: "Load balancers reroute traffic if instances fail, keeping experiences seamless."
Database Replication:
"PostgreSQL: Primary-replica—writes to primary, reads distributed; auto-failover for quick recovery."
"NoSQL (e.g., Cassandra): Built-in replication prevents data loss from node failures."
Designing for Failure:
"Circuit breakers on 3P calls prevent cascades—trip on failures to fail fast."
"API Gateway: Rate limiting/throttling blocks DDoS, protecting during spikes."
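The circuit-breaker idea from "Designing for Failure" can be sketched in a few lines. This is a minimal illustration, not a production library (use something like resilience4j or pybreaker in practice); the threshold and timeout values are arbitrary:

```python
import time

class CircuitBreaker:
    """Minimal circuit-breaker sketch: open after N consecutive failures,
    fail fast while open, and allow a trial call after a cool-down."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_timeout:
                # Fail fast instead of hammering a struggling 3P dependency.
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None  # half-open: let one trial call through
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.monotonic()  # trip the breaker
            raise
        self.failures = 0  # success closes the circuit again
        return result
```

Wrapping every third-party call in a breaker like this is what prevents one slow dependency from cascading into feed-serving latency.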
Deep Dive 3: Achieving Low Latency for News Feeds
Low latency is key—aim for <200ms loads. Use multi-layered caching with Redis for millisecond reads; optimize to avoid staleness.
How to frame: "I layer caching for sub-200ms responses, preferring CDC over TTLs for freshness."
Level 1: Client-Side Caching: "Apps cache locally for instant display; APIs use ETag/Cache-Control."
Level 2: CDN: "CloudFront caches globally: Static assets (<50ms from S3), dynamic feeds (1-5 mins offload)."
Level 3: Redis: "Stores pre-computed sets (e.g., feed:region:IN). TTL (30 mins) helps but risks misses, staleness, and thundering herds."
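The Level 1 ETag/Cache-Control flow can be sketched as a conditional-GET handler. This is an illustrative framework-agnostic sketch (the hashing choice and 60-second max-age are assumptions, not prescribed values):

```python
import hashlib

def make_etag(body: bytes) -> str:
    # Content-derived validator; any stable hash works (illustrative choice).
    return '"%s"' % hashlib.sha256(body).hexdigest()[:16]

def respond(body: bytes, if_none_match):
    """Sketch of conditional-GET handling: return 304 with no body when the
    client's cached copy (sent via If-None-Match) still matches the ETag."""
    etag = make_etag(body)
    if if_none_match == etag:
        return 304, {"ETag": etag}, b""  # client reuses its local cache
    headers = {"ETag": etag, "Cache-Control": "max-age=60"}  # 60s is illustrative
    return 200, headers, body
```

The win is bandwidth and client-side latency: unchanged feeds cost a 304 round-trip instead of a full payload.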
Better Approach: Real-Time Cached Feeds with Change Data Capture (CDC)
To fix staleness:
Pre-Computation: "Store feeds in Redis sorted sets (ZADD), with 2,000-3,000 latest IDs."
Real-Time Updates: "Debezium captures DB changes (e.g., new articles in Cassandra), publishes to Kafka; workers update Redis instantly."
Cache Management: "Size-based eviction (ZREMRANGEBYRANK) for freshness without TTLs."
Fast Reads: "ZREVRANGE queries: <5ms hits, <200ms overall."
Trade-Offs: Adds complexity (CDC/Kafka/workers) but ensures real-time accuracy—essential for news.
Scalable Caching: "Use Redis replicas/sharding for load distribution; regional setups for low latency."
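The sorted-set feed logic above (ZADD on ingest, ZREMRANGEBYRANK for size-based eviction, ZREVRANGE for reads) can be illustrated with a pure-Python stand-in, since the behaviour, not the Redis client API, is the point. The class and key naming are illustrative:

```python
import bisect

class FeedCache:
    """In-memory stand-in for a Redis sorted set (a feed:region:IN style key),
    mirroring the ZADD / ZREMRANGEBYRANK / ZREVRANGE steps for illustration."""

    def __init__(self, max_size=3000):
        self.max_size = max_size
        self.entries = []  # sorted (score, article_id); score = publish timestamp

    def zadd(self, score, article_id):
        bisect.insort(self.entries, (score, article_id))
        # Size-based eviction instead of TTLs: drop the oldest entries beyond
        # max_size (the ZREMRANGEBYRANK step from the text).
        if len(self.entries) > self.max_size:
            self.entries = self.entries[-self.max_size:]

    def zrevrange(self, start, stop):
        # Newest-first page of article IDs, like ZREVRANGE key start stop.
        newest_first = [aid for _, aid in reversed(self.entries)]
        return newest_first[start:stop + 1]
```

In the CDC pipeline, the Kafka worker would call zadd for each captured article insert, so reads stay a single cheap range query with no TTL-driven staleness.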
Deep Dive 4: Ensuring Timely Feed Delivery (Within 15 Minutes)
As discussed earlier, we will use schedulers like cron or Apache Airflow to poll RSS/APIs every 15 minutes, ingest articles, and update feeds via CDC (Deep Dive 3). This ensures batch efficiency and delivery within the window.
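As a sketch, the 15-minute cadence is just a standard cron schedule (the script path here is hypothetical):

```
*/15 * * * * /usr/local/bin/ingest_rss_feeds   # poll RSS/APIs every 15 minutes
```

In Airflow the equivalent would be a DAG with `schedule_interval="*/15 * * * *"`, which additionally gives retries and backfill for free.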
How do we publish Breaking news?
Advanced Approach: Real-Time Pushing for Breaking News
To handle urgent content, shift to proactive pushes:
Push-Based Webhooks: "Publishers POST to a secure endpoint (e.g., /webhooks/article-published), authenticated via API keys. A high-availability service validates and queues to Kafka for immediate processing."
Real-Time Delivery: "Articles bypass schedulers, updating Redis sets in seconds; content goes live within 30 seconds for webhook users, enabling real-time viral stories."
Hybrid System with Fallbacks: "Maintain RSS polling/scraping for non-webhook publishers, creating a resilient hybrid (e.g., 70-80% real-time, rest scheduled) to avoid gaps."
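The webhook endpoint can be sketched as a single validate-then-enqueue function. This is framework-agnostic and everything here (header names, key store, required fields, the in-memory deque standing in for a Kafka producer) is an illustrative assumption:

```python
import hmac
import json
from collections import deque

API_KEYS = {"publisher-123": "s3cret-key"}  # hypothetical key store
kafka_queue = deque()                        # stand-in for a Kafka producer

def handle_webhook(headers: dict, body: bytes):
    """Sketch of the /webhooks/article-published handler: authenticate the
    publisher's API key, validate the payload, then enqueue for processing."""
    key = headers.get("X-API-Key", "")
    expected = API_KEYS.get(headers.get("X-Publisher-Id", ""))
    if expected is None or not hmac.compare_digest(key, expected):
        return 401, "invalid API key"
    try:
        article = json.loads(body)
    except ValueError:
        return 400, "malformed JSON"
    if "title" not in article or "url" not in article:
        return 422, "missing required fields"
    kafka_queue.append(article)  # real system: produce to a Kafka topic
    return 202, "accepted"
```

Returning 202 (accepted, queued) rather than 200 keeps the endpoint fast: the expensive cleaning/deduplication work happens downstream off the Kafka topic.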
This is the final design for the backend of the News In Shorts app. Let's work on some enhancements as well; these are good-to-have but not necessary.
If you want me to deep-dive on a specific real-world system design problem, let me know in the comments.
Enhancements
Real-Time Updates
Alerting Service: Consumes from the cleaned_articles Kafka topic; detects breaking news via keywords ("BREAKING") or anomaly detection (e.g., source spikes).
Push Notifications: Triggers APNS/FCM for millions of devices (<10s delivery).
In-App Updates: WebSockets/SSE push "Breaking News" cards to active users' feeds instantly.
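The alerting check above can be sketched as a small predicate. The keyword list and the 3x spike threshold are illustrative assumptions; a real system would use a tuned anomaly-detection model:

```python
BREAKING_KEYWORDS = {"breaking", "urgent", "just in"}  # illustrative list

def is_breaking(article: dict, recent_count: int, baseline: float) -> bool:
    """Sketch of the alerting check: flag on keywords in the title, or when a
    source's recent article count spikes well above its baseline rate."""
    title = article.get("title", "").lower()
    if any(kw in title for kw in BREAKING_KEYWORDS):
        return True
    # Crude anomaly check: 3x the baseline rate is an assumed threshold.
    return baseline > 0 and recent_count > 3 * baseline
```

When this fires, the service fans out to APNS/FCM and pushes a "Breaking News" card over WebSockets/SSE, as described above.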
Personalised Feed Architecture
To move from generic to personalised (tailored) feeds, we'll implement a new system:
Signal Collection: Publish interactions (taps, reads) to the user_interactions Kafka topic.
Recommendation Engine:
Feature Generation: Spark jobs create interest vectors.
Model Training: Periodic ML (e.g., collaborative filtering) predicts engagement.
Ranking Service: /v1/feed/for-you scores and ranks candidates.
Pre-Computation: Background jobs generate category feeds, store IDs in Redis for fast reads (no DB hits).
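The Ranking Service step can be sketched as scoring candidate article vectors against a user's interest vector. Cosine similarity is one simple illustrative scoring choice (a trained model would replace it in practice), and the function names are hypothetical:

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length dense vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def rank_candidates(user_vector, candidates):
    """Sketch of the /v1/feed/for-you ranking step: score each candidate
    (article_id, feature_vector) pair against the user's interest vector
    and return article IDs best-match first."""
    scored = [(cosine(user_vector, vec), aid) for aid, vec in candidates]
    return [aid for _, aid in sorted(scored, reverse=True)]
```

The candidate set itself would come from the pre-computed category feeds in Redis, so ranking is a cheap in-memory pass rather than a DB scan.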
Stay tuned for more deep dives on real-world system design problems. Let me know in the comments if you want me to solve a specific design problem.
And hey, don’t forget to follow me on LinkedIn for more insights and updates. Let's stay connected!