About Scandal AI
What We Do
Scandal AI monitors the AI industry in real time for emerging controversies, heated debates, and significant disputes. We track social media, news sources, and community platforms to surface stories as they develop.
Data Sources
Content is crawled from four primary source types:
- Twitter/X — timelines of key AI industry figures and trending search terms
- Reddit — AI-focused subreddit RSS feeds
- Hacker News — front page and new stories
- AI news RSS feeds — 10+ curated publications
Analysis Pipeline
Every piece of content goes through a multi-stage pipeline:
- Ingestion — content is deduplicated (SHA-256) and matched against known AI figures
- Spike Detection — a 30-minute keyword co-occurrence buffer clusters related content into topics
- Classification — Claude Haiku quickly determines if content represents a genuine controversy
- Full Analysis — Claude Sonnet produces structured analysis: summaries, key points, party positions, timeline, and forecast
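The first two stages can be illustrated with a minimal Python sketch. It is not the production code: the text normalization before hashing, the class names (Deduplicator, CooccurrenceBuffer), and the pair-counting approach are all assumptions made for illustration. Only the SHA-256 hash and the 30-minute window come from the description above.

```python
import hashlib
from collections import Counter, deque
from itertools import combinations

WINDOW_SECONDS = 30 * 60  # the 30-minute buffer described above


def content_hash(text):
    """SHA-256 of lowercased, whitespace-collapsed text (normalization is an assumption)."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class Deduplicator:
    """Hypothetical helper: remembers hashes of content already ingested."""

    def __init__(self):
        self._seen = set()

    def is_new(self, text):
        digest = content_hash(text)
        if digest in self._seen:
            return False
        self._seen.add(digest)
        return True


class CooccurrenceBuffer:
    """Hypothetical helper: keeps keyword sets from the last 30 minutes."""

    def __init__(self, window_seconds=WINDOW_SECONDS):
        self.window_seconds = window_seconds
        self._items = deque()  # (timestamp, frozenset of keywords), oldest first

    def add(self, timestamp, keywords):
        self._items.append((timestamp, frozenset(keywords)))
        # Drop entries that have fallen out of the window.
        while self._items and timestamp - self._items[0][0] > self.window_seconds:
            self._items.popleft()

    def pair_counts(self):
        """Count how often each keyword pair appears together in the window."""
        counts = Counter()
        for _, keywords in self._items:
            for pair in combinations(sorted(keywords), 2):
                counts[pair] += 1
        return counts
```

In this sketch, keyword pairs whose count rises sharply within the window would be candidates for grouping into a topic. The actual clustering rule is not specified here.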
Noise Score
Each controversy receives a noise score from 0 to 100, computed from seven weighted factors:
- Reach (20%) — total audience exposed
- Engagement (20%) — interaction velocity (posts per hour)
- Star Power (15%) — involvement of high-profile industry figures
- Cross-Platform (15%) — spread across multiple sources
- Duration (10%) — how long the story has been active
- Polarity (10%) — how divided opinions are
- Industry Impact (10%) — potential lasting effect on the AI field
Scores decay with a 7-day half-life, so a controversy's score falls naturally as it loses momentum.
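As a rough illustration, the weighted sum and decay could be computed as in the Python sketch below. The weights and the 7-day half-life come from the list above. The sketch assumes each factor is already normalized to a 0 to 100 scale and that decay is measured in days since the last activity. Neither detail is stated here, and the function and key names are hypothetical.

```python
# Weights from the factor list above; they sum to 1.0.
WEIGHTS = {
    "reach": 0.20,
    "engagement": 0.20,
    "star_power": 0.15,
    "cross_platform": 0.15,
    "duration": 0.10,
    "polarity": 0.10,
    "industry_impact": 0.10,
}
HALF_LIFE_DAYS = 7.0


def noise_score(factors, days_since_activity=0.0):
    """Weighted sum of 0-100 factor values, halved every 7 days of inactivity.

    Assumption: `factors` maps each key in WEIGHTS to a value in [0, 100].
    """
    raw = sum(weight * factors[name] for name, weight in WEIGHTS.items())
    decay = 0.5 ** (days_since_activity / HALF_LIFE_DAYS)
    # Clamp to the documented 0-100 range.
    return max(0.0, min(100.0, raw * decay))
```

Under these assumptions, a controversy whose factors all sit at 80 scores about 80 while active and about 40 after seven quiet days.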
State Machine
Topics progress through five states based on signal thresholds:
- Emerging — initial detection of a potential controversy
- Growing — content velocity exceeds 15 items/hour
- Debated — spread across 2+ platforms
- Major Controversy — noise score reaches 75+
- Resolved — activity drops below threshold after cooldown period
Each transition has a cooldown (12–48 hours) before decay can step the topic back to a lower state.
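The transition rules can be sketched in Python as follows. The thresholds (15 items/hour, 2+ platforms, noise score 75+) come from the list above. Everything else is an assumption for illustration: the highest satisfied threshold wins, promotions apply immediately, decay steps back one state at a time, and the cooldown defaults to a hypothetical 24 hours within the stated 12–48 hour range. The Resolved state is left out because its activity threshold is not given here.

```python
from enum import IntEnum


class State(IntEnum):
    EMERGING = 1
    GROWING = 2
    DEBATED = 3
    MAJOR_CONTROVERSY = 4


def target_state(items_per_hour, platforms, noise):
    """Highest state whose documented threshold is currently met (assumed rule)."""
    if noise >= 75:
        return State.MAJOR_CONTROVERSY
    if platforms >= 2:
        return State.DEBATED
    if items_per_hour > 15:
        return State.GROWING
    return State.EMERGING


def next_state(current, hours_in_state, items_per_hour, platforms, noise,
               cooldown_hours=24.0):
    """Promote immediately; demote one step only after the cooldown has elapsed."""
    target = target_state(items_per_hour, platforms, noise)
    if target > current:
        return target
    if target < current and hours_in_state >= cooldown_hours:
        return State(current - 1)
    return current
```

In this sketch the cooldown acts as hysteresis: a brief lull does not immediately demote a topic, so states do not flap when signals hover near a threshold.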
AI Disclosure
Summaries, key points, forecasts, and party analysis are generated by Claude (Anthropic). All AI-generated content is based on sourced material from the data sources listed above. Forecasts are probabilistic assessments based on current signals — events may develop differently.