Analytics Platform

Build your own Mixpanel/Amplitude

✓ Good to Vibe

Product analytics is one of the best candidates for vibe coding. The domain is well-understood, open source options provide reference implementations, and modern databases like ClickHouse handle billions of events effortlessly. You'll own your data, avoid per-event pricing that scales painfully, and build exactly the features your product needs.

Why This Works

Mature ecosystem

PostHog, Plausible, and Umami provide battle-tested patterns. You're not pioneering—you're adapting proven approaches.

Cost arbitrage

Mixpanel charges $20k+/year at scale. Self-hosted ClickHouse on a $50/mo server handles 100M+ events.

Data ownership

No vendor lock-in. Export, transform, and query your data however you want. Join with your production database.

Feature control

Build the exact dashboards and metrics your team needs. No compromise on what the vendor decided to build.

Privacy-first

GDPR/CCPA compliance is simpler when data never leaves your infrastructure. No third-party processors.

Real-time by default

Sub-second queries on recent data. No waiting for batch processing or daily refreshes.

Tech Stack

| Layer | Tools | Why |
| --- | --- | --- |
| Event Ingestion | Next.js API routes or Fastify | Handles high throughput with edge deployment. Validate and enrich events before storage. |
| Queue Layer | Upstash Redis or AWS SQS | Buffers events during traffic spikes. Guaranteed delivery even if the database is temporarily down. |
| Storage | ClickHouse (primary) or TimescaleDB | ClickHouse handles 1B+ events with sub-second aggregations. Column-oriented storage compresses 10-20x. |
| Dashboard UI | Tremor + Recharts | Production-ready chart components. Tremor provides dashboard primitives; Recharts handles custom visualizations. |
| JavaScript SDK | Custom (~2KB gzipped) | Lightweight client that auto-captures page views, identifies users, and batches events. |
| Query Engine | tRPC + ClickHouse SQL | Type-safe queries from frontend to database. Complex aggregations stay performant. |
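The ingestion layer's validation and deduplication step might look like the following sketch. The event shape and the `event_id + user_id` composite key come from the spec in this guide; the function and field names are otherwise illustrative, not a fixed API.

```typescript
// Sketch of server-side event validation for the ingestion API.
interface IncomingEvent {
  event_id: string;
  event_name: string;
  user_id?: string;
  anonymous_id?: string;
  properties?: Record<string, unknown>;
  timestamp: string; // ISO 8601
  session_id: string;
}

// Returns a list of validation errors; an empty list means the event is accepted.
function validateEvent(e: Partial<IncomingEvent>): string[] {
  const errors: string[] = [];
  if (!e.event_id) errors.push("event_id is required");
  if (!e.event_name || e.event_name.length > 200) {
    errors.push("event_name is required and must be under 200 chars");
  }
  if (!e.user_id && !e.anonymous_id) {
    errors.push("either user_id or anonymous_id is required");
  }
  if (!e.timestamp || Number.isNaN(Date.parse(e.timestamp))) {
    errors.push("timestamp must be a valid ISO 8601 string");
  }
  return errors;
}

// Composite key used to deduplicate events before queueing.
function dedupKey(e: IncomingEvent): string {
  return `${e.event_id}:${e.user_id ?? e.anonymous_id}`;
}
```

Returning a list of errors (rather than a boolean) makes it easy to reject malformed events with helpful messages, as the prompt below asks for.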

Architecture

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Website   │────▶│  Ingestion  │────▶│    Queue    │
│    (SDK)    │     │     API     │     │   (Redis)   │
└─────────────┘     └─────────────┘     └──────┬──────┘
                                               │
                    ┌─────────────┐     ┌──────▼──────┐
                    │  Dashboard  │◀────│ ClickHouse  │
                    │    (UI)     │     │  (Storage)  │
                    └─────────────┘     └─────────────┘
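The ingestion-to-queue hop in the diagram can be sketched as below. The `Queue` interface stands in for Upstash Redis or SQS; all names here are illustrative, assuming the batch endpoint accepts an array of events and drops malformed ones.

```typescript
// Minimal sketch of the ingestion -> queue hop from the diagram.
interface Queue {
  push(payload: string): Promise<void>;
}

interface IngestResult {
  accepted: number;
  rejected: number;
}

// Accepts a batch, drops malformed events, enqueues the rest as JSON
// with a server-side received_at timestamp.
async function ingestBatch(
  events: Array<{ event_name?: string; timestamp?: string }>,
  queue: Queue
): Promise<IngestResult> {
  let accepted = 0;
  let rejected = 0;
  for (const e of events) {
    if (!e.event_name || !e.timestamp) {
      rejected++;
      continue;
    }
    await queue.push(
      JSON.stringify({ ...e, received_at: new Date().toISOString() })
    );
    accepted++;
  }
  return { accepted, rejected };
}
```

Keeping the queue behind an interface is what makes it easy to swap Redis for SQS, or an in-memory array in tests.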

The Prompt

Copy this into Cursor, Claude, or ChatGPT to generate a working implementation:

Build a product analytics system with these components:

## Event Ingestion API
- POST /api/events - Accept batched events with validation
- Fields: event_name, user_id, anonymous_id, properties (JSON), timestamp, session_id
- Validate schema, reject malformed events with helpful errors
- Add server-side enrichment: IP geolocation, user agent parsing, UTM extraction
- Deduplicate events using event_id + user_id composite key
- Queue events to Redis for async processing

## JavaScript SDK
Create a lightweight (<3KB) browser SDK:
- analytics.init(writeKey) - Initialize with project key
- analytics.track(event, properties) - Track custom events
- analytics.identify(userId, traits) - Link anonymous to known user
- analytics.page() - Auto-capture page views with referrer, UTM params
- Automatic session management with 30-min timeout
- Batch events (max 10 or 5 seconds) before sending
- Retry failed requests with exponential backoff
- Queue events offline, send when connection returns

## ClickHouse Schema
CREATE TABLE events (
    event_id UUID,
    event_name LowCardinality(String),
    user_id String,
    anonymous_id String,
    session_id String,
    properties String, -- JSON
    timestamp DateTime64(3),
    received_at DateTime64(3),
    -- Enriched fields
    country LowCardinality(String),
    city String,
    device_type LowCardinality(String),
    browser LowCardinality(String),
    os LowCardinality(String),
    referrer String,
    utm_source LowCardinality(String),
    utm_medium LowCardinality(String),
    utm_campaign LowCardinality(String)
) ENGINE = MergeTree()
PARTITION BY toYYYYMM(timestamp)
ORDER BY (event_name, user_id, timestamp);

## Dashboard Features
Build these core views:
1. **Event Explorer** - Filter/group events by any property, time range
2. **Funnel Analysis** - Define multi-step funnels, see conversion rates and drop-off
3. **Retention Cohorts** - Weekly/monthly cohorts, show return rates over time
4. **User Segmentation** - Create segments based on behavior, compare metrics
5. **Real-time View** - Live event stream, active users counter
6. **User Profiles** - Individual user timeline, all events with properties

## Query Patterns
Implement these efficient ClickHouse queries:
- Event counts with arbitrary GROUP BY
- Funnel conversion with sequence matching
- Retention matrix using cohort joins
- Property breakdown with TopK
- Percentile calculations for timing metrics

Timeline

Week 1: Foundation

  • Set up ClickHouse (use ClickHouse Cloud for simplicity or self-host)
  • Build ingestion API with validation and enrichment
  • Create Redis queue and worker for async processing
  • Implement JavaScript SDK with batching and retry logic
  • Add basic event explorer UI with filters
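The retry logic mentioned above usually means exponential backoff: double the wait after each failure. A minimal sketch, with the sleep function injected so delays are observable in tests (the base delay and attempt count are illustrative choices):

```typescript
// Retry an async operation with exponential backoff: 1s, 2s, 4s, ...
async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  maxAttempts = 4,
  baseDelayMs = 1000,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise((resolve) => setTimeout(resolve, ms))
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts - 1) {
        await sleep(baseDelayMs * 2 ** attempt); // doubles each attempt
      }
    }
  }
  throw lastError;
}
```

In production you would typically add jitter to the delay so many clients recovering from the same outage don't all retry in lockstep.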

Week 2: Core Analytics

  • Build funnel analysis with visual builder
  • Implement retention cohort calculations
  • Create user segmentation with saved segments
  • Add real-time dashboard with WebSocket updates
  • Build user profile pages with event timelines
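The core of funnel analysis is a sequence count: for each user's time-ordered events, how far through the step list did they get? In production this runs in ClickHouse (e.g. with its sequence-matching functions), but the logic can be sketched in TypeScript; `funnelCounts` is an illustrative name.

```typescript
// For each user, advance through the funnel steps in order and count
// how many users reached each step. counts[i] = users who reached step i+1.
function funnelCounts(
  steps: string[],
  eventsByUser: Map<string, string[]> // user -> event names in time order
): number[] {
  const counts = new Array(steps.length).fill(0);
  for (const events of eventsByUser.values()) {
    let step = 0;
    for (const name of events) {
      if (step < steps.length && name === steps[step]) step++;
    }
    for (let i = 0; i < step; i++) counts[i]++;
  }
  return counts;
}
```

Conversion rate per step is then `counts[i] / counts[0]`, and drop-off between steps is `counts[i] - counts[i + 1]`.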

Week 3: Polish & Scale

  • Add dashboard templates for common use cases
  • Implement data export (CSV, API)
  • Build alerting for metric thresholds
  • Performance tune queries, add materialized views
  • Deploy SDK to npm, document integration

Open Source References

PostHog ↗

Full-featured analytics + session replay. Study their ClickHouse schema.

Plausible ↗

Privacy-focused, lightweight. Great reference for simple analytics.

Umami ↗

Simple, fast, privacy-focused. Good starting point for basic needs.

Jitsu ↗

Event collection and routing. Study their ingestion patterns.

Cost Comparison

Mixpanel

$20,000+/year
At 50M events/month. Grows with usage.

Self-Hosted

$600/year
ClickHouse Cloud starter + Vercel. Handles 100M+ events.
$19,400/year in potential savings

Watch Out For