Dashboard: Dashboard · landscape: competitor-landscape-report · roll-up: SENTIMENT-SYNTHESIS. This documents how the dossiers are built so the findings are defensible and the process is repeatable across tools.
Tool stack (evaluated 2026-06-16)
| Source | Tool | Notes |
|---|---|---|
| App Store metadata + marketing screenshots | rzn-tools app-store search/lookup (public iTunes API) | rating, rating count, price, screenshotUrls (downloaded) |
| App Store on-page reviews + screens | rzn-browser appstore/app-details | supplementary; web App Store shows limited reviews |
| Capterra reviews (volume) | rzn-browser capterra/product-details-reviews paginated ?page=N | ~25 reviews/page via JSON-LD; primary volume engine |
| G2 reviews | rzn-browser g2/product-details-reviews | first-page set; G2 anti-bot limits depth |
| Web search / review round-ups | rzn-tools exa search + exa answer (cited) | configured key; entity search + content extraction |
| Web search backup | rzn-tools xai-search | configured key |
| YouTube demos | rzn-tools search youtube (public) | walkthrough/demo videos |
| Real in-app screenshots (fallback) | rzn-phone (device “iGoobey”, iOS 18.7.8, available) | drive the real app for true in-app UI; last resort |
| Capterra real segmentation | rzn-browser run _research/capterra_dom_reviews.json + parse_capterra_rsc.py | RSC scrape → REAL firm-size/role/industry/sub-ratings/solicitation/competitive-intel |
| App Store hi-res screens | _research/harvest_visuals.py --match <bundle> | swaps mzstatic size token for full-res; filter by bundle id (search JSON carries many apps) |
| Walkthrough UI frames | _research/harvest_frames.sh (yt-dlp + ffmpeg scene-detect) | find videos via yt-dlp "ytsearch12:<tool> demo walkthrough" (rzn-tools youtube returned empty); frames show the web/office UI the App Store doesn’t |
| Review analysis | Haiku subagents (batched) | classify sentiment/role/theme/bias per _research/REVIEW_ANALYSIS_SCHEMA.md |
Not available: Parallel, Serper, SerpAPI, Firecrawl, Tavily, Perplexity (no keys); App Store RSS reviews feed (Apple deprecated — returns empty); rzn-tools reddit (permission_denied → use browser reddit or exa site:reddit.com).
Anti-bot reality: G2/Capterra throttle rapid repeated hits. Mitigation: 22–25s delay between page calls, retry-once on an empty page, seed from prior single-page pulls. Realistic yield ~100–250 Capterra reviews for well-reviewed tools; thin/newer tools yield less.
Per-tool pipeline
_research/pull.py— App Store (search/lookup/reviews + download screenshots), Play Store, exa (4 query angles + cited answer), YouTube. →raw/,screens/._research/pull_browser.py— Capterra + G2 search → match product → first-page reviews. →raw/._research/pull_reviews_volume.py— paginate Capterra (+G2) with delays →raw/reviews_corpus.json._research/make_batches.py— split corpus into Haiku batches (raw/batches/).- Haiku subagents — one per batch: Read batch → classify per schema → Write
raw/analysis/batch_NN_out.json. _research/aggregate.py— roll up →raw/analysis/SUMMARY.{json,md}(theme freq, sentiment/role split, bias counts, quotes).- Hand-write
dossier.md— profile, pricing, UX narrative, embedded screens, quantified + bias-filtered review synthesis, gap-vs-our-wedge.
Bias / credibility method
Every review tagged by reviewer role (field vs office vs unknown), specificity, and a bias_flag: solicited_vague (down-weight vague 5★), disgruntled_nonproduct (price/sales/culture rants), fit_mismatch (wrong-segment, not a defect), or none (credible). Headline numbers report the credible share; complaint themes are ranked by negative-sentiment mentions from credible reviews.
Dossier structure (the locked template — read like a coherent report)
The brief must read as a continuous report in full sentences, not a worksheet. A stranger should understand what the company does before any strategic judgement appears, so the narrative builds understanding first and reaches our verdict as a conclusion. No emojis, no telegraphic bullet-fragments for the analytical sections, and no internal jargon (terms like “counter-positioning” or “talk-vs-ship gap” are tools for our thinking in _OPPORTUNITY-LENS; in the brief, explain the idea in plain English). Tables are welcome for the coverage grid, ratings and the closing assessment; prose carries everything else. Order:
- Opening orientation — one short paragraph stating what the brief is for, then What the company is and what it does (plain prose: the product, the buyer, the core job, the business model).
- Where it plays across the market — score 0–100 against the 21 areas in _MARKET-PROBLEM-MAP, sorted, with a prose takeaway on deep-vs-thin. Same 21 areas every time, enabling the cross-competitor coverage heatmap.
- The input side — what is captured, the input methods, onboarding and ease, and the entry friction users report.
- The management side — what data lands in the dashboard, which teams consume it, what they value, and whether the view is operational or commercial.
- Where the value actually comes from — the sales story versus the real source of stickiness; state what not to attack and where the value stops.
- What users say, both sides — lead with the credibility caveat (solicited vs organic share), then the praise and the criticism in their own words, ending with the signal that matters for us.
- The opportunity for AI in this space — which parts of the job inexpensive AI can now take over versus which are structured plumbing; then what we would build (the baseline to match, the recurring pains we can solve, the specific niche to target first).
- How open the platform is — API, webhooks and integrations, and what that openness means for building on top versus replacing.
- The vendor’s own AI — what they have actually shipped (each item tagged shipped/beta/announced) versus what the industry is talking about; state the gap plainly; close with a confidence score on how much of their AI ambition they can realistically deliver, with the structural reasoning.
- Who actually uses it — real firm-size, role and industry segmentation from the review scrape; alternatives considered and switched-from.
- Our read: can we enter and win? — the strategic conclusion in prose: what is entrenched and off-limits, the verified gap, the way in, and the one situation that would undermine the approach (the “what would make us walk away” paragraph). Close with a compact plain-language assessment table. This is where the _OPPORTUNITY-LENS thinking surfaces, translated into readable judgement.
- The app itself — the public mobile app and its ratings as the concrete closing artifact.
- Screenshots — an Obsidian-embedded gallery (“, never copied bytes), starting with the contact sheets, linking screens/README.
- Sources and method — what was verified where, and a pointer to _RESEARCH-METHOD.
DISCIPLINE: evidence over bias (no absence-claims without proof)
Never assert a competitor “lacks X” (e.g. “no commercial event”) from not-having-looked. State only what facts show. If unverified, write “not verified yet,” not “absent.” Actively search the vendor site, RFI/claims/change-order pages, dashboard/insights pages, tier matrices, API docs, and walkthrough videos before concluding a gap. (Caught in Raken pilot: Raken DOES have RFIs with cost/schedule/plan impact + a public API — earlier “no commercial event / closed” claims were bias, corrected to the precise evidenced gap: no dedicated claims/entitlement module, operational-only dashboard.)
REQUIRED: map the full product surface (don’t research only the app)
The app-store listing is the front door, not the product. Sales-led / quote-only / seat-based pricing almost always means a field→office platform (web dashboards, analytics, time→payroll, integrations) where the office side carries the invoice. Every brief must answer:
- Full surface — field (mobile) vs office (web/admin) vs finance vs integrations.
- Who consumes centrally — PM/ops, executives, payroll/finance, bid managers.
- Why they actually get paid / what P&L line — labor-cost control? payroll automation? risk? productivity-vs-estimate? (vs the visible “nicer reports”).
- Real moat — usually integrations + the office data loop, not the capture UX.
- Sales motion — PLG vs sales-led, land-and-expand, contract size, buyer.
Source it from the vendor site (features/product/pricing/integrations pages) + exa
answer+ parallel-search + funding (Tracxn/PR). Caught in the Raken pilot: it looked like a voice→PDF app; it’s actually a labor/production/payroll platform wired into Sage/QuickBooks/Viewpoint.
Lessons from the Raken pilot (apply at scale)
- Capterra
body= headline only (JSON-LD), not full pros/cons. Great for volume + rating distribution + bias detection, shallow for deep themes. G2 + exa carry the qualitative. - Firm-size / industry / role / solicitation — SOLVED via the RSC payload. Capterra is Next.js; the page streams complete structured review objects in
self.__next_f(keys:reviewer.companySize/industry/jobTitle/timeUsedProduct, six sub-ratings,prosText/consText,reviewSource.code,incentivized,alternativeProducts,switchedProducts). Tooling:_research/capterra_dom_reviews.json(workflow: navigate →execute_javascriptreturns cards + raw size-script) +_research/parse_capterra_rsc.py(unescapes__next_fchunks, brace-matches each{"reviewId"...}, emits clean records) +_research/run_capterra_dom.sh(loops?page=N) +_research/aggregate_dom.py(real firm-size/role/industry/solicitation/sub-ratings/competitive-intel rollup). Raken: 248 reviews, real segmentation, 66% vendor-solicited surfaced. ~25/page, same Cloudflare-warm-profile rule.rzn-browser act --urlneeds an OpenAI key (absent) → drive navigation via the workflow’snavigate_to_urlstep, notact. The--output-fileflag needs afile_writeside-effect declared; just redirect stdout (strip the log header before the first{). - Reddit is low-reliability: rzn-tools reddit = HTTP 403 (anonymous block); rzn-browser reddit search returned off-topic junk. parallel-search is the usable Reddit/community path (finds real threads) — pair with exa
answerfor synthesis. - EXA discipline: prefer browser / parallel-search / public connectors for raw pulls; reserve exa for synthesis + hard-to-get (cited answers, people/company specifics). exa
answergave strong independent sentiment cheaply. - HN has ~no construction-SaaS coverage — skip for this niche.
- Volume that works: Capterra pagination
?page=Nwith ~22s delays + seed-from-prior + retry-once → all 248 in 10 pages. - Haiku batch analysis: ~43 reviews/agent, 6 agents parallel, ~60s each; disjoint file writes;
aggregate.pyrolls up. Cheap and consistent.
Tier plan (user chose broadest coverage)
- Tier A — full dossier (screens + volume reviews + Haiku analysis + gap): Raken (sample, complete), Fieldwire, PlanRadar, Procore (Daily Log + AI agents), ConstructionDailyReport.ai, Voice Log Pro, Document Crunch, Trunk Tools, OpenSpace — plus tender/cost adds (ContraVault, Civils.ai, Pype, Kreo) per “broadest.”
- Tier B — medium (reviews + pricing + summary): SiteLogs, EasySiteLog, BuildPass, Autodesk ACC, Trimble/ProjectSight, Buildots, DroneDeploy, Togal, STACK, Matrak, Sablono, Zutec, Operance.
- Tier C — light (matrix line + pricing): the takeoff/AR/handover tail.
Scaling at fan-out — the packet + pipeline (10-competitor run, 2026-06-16)
The dossiers now scale via _DOSSIER-PACKET.md — the runbook handed to one background agent per competitor, each owning one folder dossiers/<slug>/. Roll-up of every agent’s returned MATRIX + COVERAGE lines lives in _CROSS-COMPETITOR (the opportunity matrix + the 21-area coverage heatmap).
Pipeline split (the architecture that makes fan-out safe):
- Stage 0 — central, serial, main thread: the Capterra RSC scrape (
run_capterra_dom.sh). The browser is a single shared session with anti-bot delays — NEVER parallelise it across agents. Resolve Capterra URLs withrzn-browser run capterra search(shape:output.result.results[].product_url), scrape each competitor one at a time, hand the agent itscapterra_dom_corpus.json. - Stage 1+ — per-agent, parallel:
pull.py(App Store/exa/reddit/youtube — API, no browser),harvest_visuals.py,harvest_frames.sh,aggregate_dom.py, vendor-site reading, and the dossier write. None touch the shared browser, so N agents run concurrently with zero contention on disjoint folders.
Fixes baked in this run (all durable):
parse_capterra_rsc.pynow skips unparseable pages instead of aborting the whole corpus — a thin product (≤1 review page) returns non-JSON for the requested?page=2, which used to kill the entire parse (Kreo lost all 25 reviews to this).aggregate_dom.pynow excludes sub-ratings < 1 — Capterra’s RSC encodes an unrated sub-dimension as0, and averaging those zeros understated value-for-money / support by 0.5–1.2 stars and reversed conclusions (Raken value 3.92→4.54, Fieldwire 3.6→4.36, Kreo 3.16→4.39, Pype 2.75→4.12; only Procore’s “value is the weakest sub-rating” survived, 3.46→4.07). Lesson: never headline a “weakest sub-rating” claim without confirming blanks are excluded.render_dossier_html.pynow self-titles from the H1 and writes<slug>.htmldirectly — nomvstep, no stale “Raken” title.- macOS has no
timeoutbinary — agents must run commands directly, not wrapped intimeout.
Known gap (fix before the next browser wave): some Capterra products (e.g. PlanRadar) 301-redirect /reviews/ → /#reviews (the overview), where reviews lazy-load and the workflow captures none (__next_f present but zero reviewId objects). Workaround used: the thin-Capterra path. Proper fix: add a scroll-the-reviews-container step to capterra_dom_reviews.json before the extract.
Thin-data path (validated): AI-native tools with no Capterra and no app (Document Crunch, Trunk Tools, Civils.ai, OpenSpace) still yield strong briefs from exa + funding/press + walkthrough-video frames + vendor site, provided the credibility caveat leads. Ultra-thin tools (ContraVault, ConstructionDailyReport.ai, Voice Log Pro) are folded into one cohort note (_micro-entrants/cohort) rather than padded into solo 14-section briefs.