X/Twitter Scraping Pipeline
Pull your feed every morning, rank by signal, ignore the noise
See it work
Placeholder — real preview will be a Telegram screenshot of this morning's ranked intel brief.
Is this for me?
Yes, if you want a daily brief of what mattered on X without opening X yourself. You'll get 50-100 tweets/day filtered down to 5-10 high-signal ones, ranked by engagement + topic match, delivered wherever you want (Telegram, email, a file, a Slack channel).
What you need
- A Mac or PC
- Node.js 18+ (`brew install node` on Mac)
- An X account (any tier — free works for scraping your own feed)
- An X API key via rettiwt (one-time login, saved to env)
- Optional but recommended: Module 10 (Telegram Bot Orchestration) to ship briefs to Telegram
Includes the one-line rettiwt 7.0.0 parser patch that fixes silent empty-feed failures. Without the patch, your feed just... returns nothing. Most people don't realize. You will.
Do it
- step 1 / 6
Install the scrape pipeline skill
Generate install command below → copy → Terminal → enter. The skill lands in `~/.claude/skills/rettiwt-x-scraping/` and includes:
- `install-rettiwt.sh` — installs rettiwt-api + auto-applies the 7.0.0 parser patch
- `login.sh` — walks you through the one-time X cookie login
- `scrape-feed.sh` — pulls For You + Following feeds + per-account timelines
- `rank.js` — scores each tweet by engagement × topic match
- `filter.js` — dedup + de-noise against prior runs
- `send-to-telegram.sh` — formats + ships to the chat of your choice
- step 2 / 6
Install rettiwt + apply the parser patch
```
bash ~/.claude/skills/rettiwt-x-scraping/install-rettiwt.sh
```

This installs `rettiwt-api@7.0.0` and patches `/opt/homebrew/lib/node_modules/rettiwt-api/dist/models/data/User.js:49`:

```
// BEFORE (broken)
this.pinnedTweet = data.pinned_tweet_ids_str[0];

// AFTER (patched)
this.pinnedTweet = data.pinned_tweet_ids_str?.[0];
```

Without this patch, rettiwt crashes on any user without a pinned tweet — which silently kills ALL feed/search/timeline endpoints. Most 'rettiwt doesn't work' reports are this bug.
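The failure mode is plain JavaScript: indexing into an `undefined` field throws, while optional chaining just yields `undefined`. A minimal repro, independent of rettiwt:

```javascript
// User payloads with and without a pinned tweet
const withPin = { pinned_tweet_ids_str: ["123"] };
const noPin = {};

// Patched line's behavior: ?.[0] tolerates the missing field
console.assert(withPin.pinned_tweet_ids_str?.[0] === "123");
console.assert(noPin.pinned_tweet_ids_str?.[0] === undefined);

// Unpatched line's behavior: plain [0] throws on users with no pinned tweet
let threw = false;
try {
  noPin.pinned_tweet_ids_str[0];
} catch (e) {
  threw = e instanceof TypeError;
}
console.assert(threw);
```

One thrown `TypeError` deep in the parser is why every feed endpoint comes back empty instead of erroring loudly.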
- step 3 / 6
One-time login to get your API key
```
bash ~/.claude/skills/rettiwt-x-scraping/login.sh
```

It opens your browser → you log into X → it captures session cookies → outputs your `RETTIWT_API_KEY`. Save it to `~/.env.shared`:

```
RETTIWT_API_KEY="<long-base64-string>"
```

This key is tied to your session. If X forces a re-login, re-run `login.sh`. Treat the key like a password.
- step 4 / 6
Pull your first scrape
```
source ~/.env.shared
bash ~/.claude/skills/rettiwt-x-scraping/scrape-feed.sh 50
```

The `50` is the tweet count per source. The script pulls:
- 50 tweets from your For You feed
- 50 tweets from Following
- 5 tweets each from any monitored accounts you listed

Output: `memory/intel-reports/twitter-YYYY-MM-DD.json` — structured records with author/text/likes/views/retweets/timestamps/media flags/hashtags.
- step 5 / 6
Rank + filter to signal
```
node ~/.claude/skills/rettiwt-x-scraping/rank.js \
  --input memory/intel-reports/twitter-YYYY-MM-DD.json \
  --topics "your-topic-1,topic-2,topic-3" \
  --min-likes 100 \
  --top 10
```

Scoring formula: `likes × (1 + topic_match) - (is_ad × 1000) - duplicate_penalty`. The filter dedups against the previous 3 days of scrapes so you don't see the same tweet twice.

Output: top 10 ranked tweets in markdown, ready to send or read.
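The formula fits in a few lines of Node. This is an illustrative sketch, not `rank.js` internals — `topic_match` here is the fraction of your topics mentioned in the tweet text, the `is_ad` field name is assumed, and the duplicate penalty is omitted:

```javascript
// Sketch of the scoring idea: engagement scaled by topic relevance,
// with ads pushed to the bottom of the ranking.
function score(tweet, topics) {
  const text = tweet.text.toLowerCase();
  const hits = topics.filter((t) => text.includes(t.toLowerCase())).length;
  const topicMatch = topics.length ? hits / topics.length : 0;
  const adPenalty = tweet.is_ad ? 1000 : 0;
  return tweet.likes * (1 + topicMatch) - adPenalty;
}

const t = { text: "New AI agents paper", likes: 200, is_ad: false };
console.log(score(t, ["ai", "crypto"])); // 200 × (1 + 0.5) = 300
```

Note the ad penalty is a flat 1000, so even a viral promoted tweet with 900 likes ranks below a 100-like organic one.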
- step 6 / 6
Ship it to Telegram (optional) + cron it daily
If you did module 10:

```
bash ~/.claude/skills/rettiwt-x-scraping/send-to-telegram.sh "$CHAT_ID" today
```

To automate, add to your crontab:

```
0 7 * * * cd ~/your-project && bash scrape-feed.sh 50 && node rank.js ... && bash send-to-telegram.sh "$CHAT_ID" today
```

7am daily brief, hands-off forever.
Make it yours
- Following a specific industry? Add 10-20 accounts to `MONITOR_ACCOUNTS` in `scrape-feed.sh`. The timeline pull is additive — adds ~2 min per scrape, worth it for concentrated signal.
- Want topic-only (no engagement minimum)? Drop `--min-likes 100` and crank `--topics` — the ranker still prioritizes matches but surfaces niche tweets you'd miss otherwise.
- Running multiple scrapes/day? Run at 7am (overnight catch-up), 1pm (US morning cycle), 7pm (late-day signal). Diff each against the last run to surface genuinely new signal. Perfect for market-sensitive topics.
- Need long threads, not just tweets? Chain in the `expand-thread.sh` helper — it resolves quoted tweets + reply chains for the top 3 ranked items. Gives you full context without manual clicking.
Stuck?
- `rettiwt user followed` returns `{list: []}` even though my feed has tweets.
Parser patch didn't apply. Re-run `install-rettiwt.sh`. Check `/opt/homebrew/lib/node_modules/rettiwt-api/dist/models/data/User.js:49` manually — it should have `?.[0]`. If the line is wrong, the CLI silently returns empty.
- Login script can't capture cookies / browser never returns.
X changed their auth flow recently — you need to log in MANUALLY in your browser first, THEN run `login.sh` with the `--reuse-session` flag. The script reads existing cookies rather than forcing a new login.
- For You + Following return empty but per-user timeline works.
Account-level gating — X throttles scraping on low-activity accounts. Fix: have the account running the scrape like or post occasionally, OR pivot to per-user `timeline()` calls for the accounts you actually care about. The skill's scraper automatically falls back to timelines if the feed endpoints are gated.
- Rate-limited (429) after a few scrapes.
X caps at ~500 requests/15min. The skill batches + sleeps automatically but if you're running multiple scrapes in parallel or hitting it from multiple agents, split the scrapes across time. One scrape per 15 min is safe.
- Cron fires but nothing shows up in Telegram.
Cron runs with a sparse environment by default — it doesn't source your `.env.shared`. Wrap the cron command in a shell that explicitly sources it:

```
0 7 * * * bash -c 'source ~/.env.shared && cd ~/your-project && bash scrape-feed.sh 50'
```
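On the 429 point above: the skill's batch-and-sleep behavior amounts to retry with exponential backoff. A generic sketch of that pattern (illustrative, not the skill's actual code — it assumes the thrown error carries a `status` property):

```javascript
// Retry a request on HTTP 429, doubling the wait each attempt:
// 1s, 2s, 4s, ... Non-429 errors are rethrown immediately.
async function withBackoff(fn, retries = 3, baseDelayMs = 1000) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= retries || err.status !== 429) throw err;
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** attempt));
    }
  }
}
```

With X's ~500 requests/15 min cap, backoff smooths over a burst, but the real fix is still spacing whole scrapes at least 15 minutes apart.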
💰 money moves that use this
all money moves →
- ★ required: Lead-gen as a service
Sell warm leads to small businesses on retainer. Google Maps scrape + enrich + qualify, dropped in their CRM.
- ★ required: Playbook aggregator (meta)
Build YOUR own version of Money Moves — daily AI scan + curated $$ patterns + paid tier. The receipts ARE the product.
- AI agency generalist
Bespoke AI agents and automations for mid-size companies. Higher-touch, higher-ticket cousin of the local-biz play.
- AI-driven low-ticket info arbitrage
Resell publicly-available knowledge — packaged + AI-curated — at $7-37 to buyers who'd never find it themselves.
- Claude Code for local businesses
Install Claude Code at a plumber, lawyer, or clinic. Build them custom workflows. Hourly retainer + monthly fee.
- Content-at-scale for founders
Ghostwrite + clip founders' content into a high-volume LinkedIn / X / podcast presence. Premium DFY play.
Next up
Now your agent has INPUT (daily intel from X). The final module is about OUTPUT — how to run the agent on a task queue so it actually builds things based on that intel. The meta-module on how Escape 9 to 5 itself was built.