[etm]
module 011 · v1.0 · updated 2026-04-19

X/Twitter Scraping Pipeline

Pull your feed every morning, rank by signal, ignore the noise

01 / 07

See it work

Placeholder — real preview will be a Telegram screenshot of this morning's ranked intel brief.

02 / 07

Is this for me?

Yes, if you want a daily brief of what mattered on X without opening X yourself. You'll get 50-100 tweets/day filtered down to 5-10 high-signal ones, ranked by engagement + topic match, delivered wherever you want (Telegram, email, a file, a Slack channel).

03 / 07

What you need

  • A Mac or PC
  • Node.js 18+ (`brew install node` on Mac)
  • An X account (any tier — free works for scraping your own feed)
  • An X API key via rettiwt (one-time login, saved to env)
  • Optional but recommended: Module 10 (Telegram Bot Orchestration) to ship briefs to Telegram
time · 40 minutes for first setup; 5 min to add new accounts or filters after.
coding · no

Includes the one-line rettiwt 7.0.0 parser patch that fixes silent empty-feed failures. Without the patch, your feed just... returns nothing. Most people don't realize. You will.

04 / 07

Do it

  1. step 1 / 6

    Install the scrape pipeline skill

    Generate install command below → copy → Terminal → enter. The skill lands in ~/.claude/skills/rettiwt-x-scraping/ and includes:

    • install-rettiwt.sh — installs rettiwt-api + auto-applies the 7.0.0 parser patch
    • login.sh — walks you through the one-time X cookie login
    • scrape-feed.sh — pulls For You + Following feeds + per-account timelines
    • rank.js — scores each tweet by engagement × topic match
    • filter.js — dedup + de-noise against prior runs
    • send-to-telegram.sh — formats + ships to the chat of your choice
  2. step 2 / 6

    Install rettiwt + apply the parser patch

    bash ~/.claude/skills/rettiwt-x-scraping/install-rettiwt.sh
    

    This installs rettiwt-api@7.0.0 and patches /opt/homebrew/lib/node_modules/rettiwt-api/dist/models/data/User.js:49:

    // BEFORE (broken)
    this.pinnedTweet = data.pinned_tweet_ids_str[0];
    
    // AFTER (patched)
    this.pinnedTweet = data.pinned_tweet_ids_str?.[0];
    

    Without this patch, rettiwt crashes on any user without a pinned tweet — which silently kills ALL feed/search/timeline endpoints. Most 'rettiwt doesn't work' reports are this bug.
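    Conceptually, the patch is just a guarded string replacement over `User.js`. A minimal sketch of what the install script does (the `patchSource` helper is illustrative, not the script's actual code):

```javascript
// Swap the unguarded array access for optional chaining so users
// without a pinned tweet don't crash the parser.
const BROKEN  = "this.pinnedTweet = data.pinned_tweet_ids_str[0];";
const PATCHED = "this.pinnedTweet = data.pinned_tweet_ids_str?.[0];";

function patchSource(src) {
  // Idempotent: if the file is already patched, leave it alone.
  if (src.includes(PATCHED)) return src;
  return src.replace(BROKEN, PATCHED);
}

// Applying it to the real file would look like:
// const fs = require("fs");
// const p = "/opt/homebrew/lib/node_modules/rettiwt-api/dist/models/data/User.js";
// fs.writeFileSync(p, patchSource(fs.readFileSync(p, "utf8")));
```

    Because it's idempotent, re-running `install-rettiwt.sh` is always safe.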

  3. step 3 / 6

    One-time login to get your API key

    bash ~/.claude/skills/rettiwt-x-scraping/login.sh
    

    It opens your browser → you log into X → it captures session cookies → outputs your RETTIWT_API_KEY. Save to ~/.env.shared:

    RETTIWT_API_KEY="<long-base64-string>"
    

    This key is tied to your session. If X forces a re-login, re-run login.sh. Treat the key like a password.
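    The pipeline's shell scripts pick the key up via `source ~/.env.shared`. If you call the scraper from your own Node code instead, you need the key in `process.env` yourself. A minimal dotenv-style sketch, assuming simple `KEY="value"` lines like the one above:

```javascript
// Minimal .env.shared parser: KEY="value" or KEY=value lines; other lines ignored.
function parseEnvFile(text) {
  const env = {};
  for (const line of text.split("\n")) {
    const m = line.match(/^\s*([A-Z0-9_]+)\s*=\s*"?([^"]*)"?\s*$/);
    if (m) env[m[1]] = m[2];
  }
  return env;
}

// Usage:
// const fs = require("fs");
// Object.assign(process.env, parseEnvFile(fs.readFileSync(process.env.HOME + "/.env.shared", "utf8")));
// const key = process.env.RETTIWT_API_KEY;
```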

  4. step 4 / 6

    Pull your first scrape

    source ~/.env.shared
    bash ~/.claude/skills/rettiwt-x-scraping/scrape-feed.sh 50
    

    The 50 = tweets per source. Script pulls:

    • 50 tweets from your For You feed
    • 50 tweets from Following
    • 5 tweets each from any monitored accounts you listed

    Output: memory/intel-reports/twitter-YYYY-MM-DD.json — structured records with author/text/likes/views/retweets/timestamps/media flags/hashtags.
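    The exact JSON schema is the skill's own; a plausible record shape (field names assumed from the list above) plus a one-line digest for eyeballing a scrape file:

```javascript
// Hypothetical record shape, based on the fields described above.
const sample = {
  author: "someuser",
  text: "Example tweet text",
  likes: 240,
  views: 12000,
  retweets: 31,
  timestamp: "2026-04-19T07:00:12Z",
  hasMedia: false,
  hashtags: ["ai"]
};

// One-line summary per record, handy for a quick scan of the day's file.
function digest(t) {
  return `@${t.author} · ${t.likes}♥ ${t.retweets}RT · ${t.text.slice(0, 60)}`;
}
```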

  5. step 5 / 6

    Rank + filter to signal

    node ~/.claude/skills/rettiwt-x-scraping/rank.js \
      --input memory/intel-reports/twitter-YYYY-MM-DD.json \
      --topics "your-topic-1,topic-2,topic-3" \
      --min-likes 100 \
      --top 10
    

    Scoring formula: likes × (1 + topic_match) - (is_ad × 1000) - duplicate_penalty. The filter dedups against the previous 3 days of scrapes so you don't see the same tweet twice.

    Output: top 10 ranked tweets in markdown, ready to send or read.
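    The scoring formula reads directly as code. A sketch, assuming `topic_match` is the fraction of your topics that appear in the tweet text (the real `rank.js` may weight matches differently):

```javascript
// Score = likes × (1 + topic_match) − (is_ad × 1000) − duplicate_penalty
function topicMatch(text, topics) {
  const lower = text.toLowerCase();
  return topics.filter(t => lower.includes(t.toLowerCase())).length
       / Math.max(topics.length, 1);
}

function score(tweet, topics, seenIds) {
  const dupPenalty = seenIds.has(tweet.id) ? 1e9 : 0; // seen in last 3 days → bury it
  return tweet.likes * (1 + topicMatch(tweet.text, topics))
       - (tweet.isAd ? 1000 : 0)
       - dupPenalty;
}
```

    The ad penalty is deliberately huge relative to typical like counts, so promoted tweets never outrank organic ones.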

  6. step 6 / 6

    Ship it to Telegram (optional) + cron it daily

    If you did module 10:

    bash ~/.claude/skills/rettiwt-x-scraping/send-to-telegram.sh "$CHAT_ID" today
    

    To automate: add to your crontab:

    0 7 * * * cd ~/your-project && bash scrape-feed.sh 50 && node rank.js ... && bash send-to-telegram.sh "$CHAT_ID" today
    

    7am daily brief, hands-off forever.

05 / 07

Make it yours

Tune the scrape to your world
  • Following a specific industry? Add 10-20 accounts to `MONITOR_ACCOUNTS` in `scrape-feed.sh`. The timeline pull is additive — adds ~2 min per scrape, worth it for concentrated signal.
  • Want topic-only (no engagement minimum)? Drop `--min-likes 100` and crank `--topics` — the ranker still prioritizes matches but surfaces niche tweets you'd miss otherwise.
  • Running multiple scrapes/day? Run at 7am (overnight catch-up), 1pm (US morning cycle), 7pm (late-day signal). Diff each against the last run to surface genuinely new signal. Perfect for market-sensitive topics.
  • Need long threads, not just tweets? Chain in the `expand-thread.sh` helper — it resolves quoted tweets + reply chains for the top 3 ranked items. Gives you full context without manual clicking.
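For the multiple-scrapes-per-day setup, "diff each against the last run" is just a set difference on tweet ids. A sketch (the `id` field name is an assumption about the scrape JSON):

```javascript
// Diff two scrape runs by tweet id: keep only tweets absent from the previous run.
function newSince(prevTweets, currentTweets) {
  const seen = new Set(prevTweets.map(t => t.id));
  return currentTweets.filter(t => !seen.has(t.id));
}

// Usage: feed the 7am file as prev and the 1pm file as current,
// and only genuinely new tweets survive into the 1pm brief.
```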
06 / 07

Stuck?

  • `rettiwt user followed` returns `{list: []}` even though my feed has tweets.

    Parser patch didn't apply. Re-run `install-rettiwt.sh`. Check `/opt/homebrew/lib/node_modules/rettiwt-api/dist/models/data/User.js:49` manually — it should have `?.[0]`. If the line is wrong, the CLI silently returns empty.

  • Login script can't capture cookies / browser never returns.

    X changed their auth flow recently — you need to log in MANUALLY in your browser first, THEN run `login.sh` with the `--reuse-session` flag. The script reads your existing cookies rather than forcing a new login.

  • For You + Following return empty but per-user timeline works.

    Account-level gating — X throttles scraping on low-activity accounts. Fix: have the account that's running the scrape like/post occasionally, OR pivot to per-user `timeline()` calls for the accounts you actually care about. The skill's scraper auto-falls-back to timelines if feed endpoints are gated.

  • Rate-limited (429) after a few scrapes.

    X caps at ~500 requests/15min. The skill batches + sleeps automatically but if you're running multiple scrapes in parallel or hitting it from multiple agents, split the scrapes across time. One scrape per 15 min is safe.
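    The batch-and-sleep pattern the skill uses looks roughly like this (batch size and delay are illustrative, not the skill's actual numbers):

```javascript
// Run async jobs in batches with a pause between batches, to stay
// under X's ~500 requests / 15 min ceiling.
const sleep = ms => new Promise(r => setTimeout(r, ms));

async function inBatches(jobs, batchSize = 25, delayMs = 60_000) {
  const results = [];
  for (let i = 0; i < jobs.length; i += batchSize) {
    const batch = jobs.slice(i, i + batchSize);
    results.push(...await Promise.all(batch.map(fn => fn())));
    if (i + batchSize < jobs.length) await sleep(delayMs);
  }
  return results;
}
```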

  • Cron fires but nothing shows up in Telegram.

    Cron env is sparse by default — it doesn't source your `.env.shared`. Wrap the cron command in a shell that explicitly sources it:

    0 7 * * * bash -c 'source ~/.env.shared && cd ~/your-project && bash scrape-feed.sh 50'

💰 money moves that use this

all money moves →
07 / 07

Next up

module 012
Autonomous Build Loop

Now your agent has INPUT (daily intel from X). The final module is about OUTPUT — how to run the agent on a task queue so it actually builds things based on that intel. The meta-module on how Escape 9 to 5 itself was built.

open module →