
Account Scoring & Propensity Modeling

Static ICP checklists break the moment your market shifts. A propensity model built from your actual wins and losses adapts with your data.

The Play

1. Export wins and losses from your CRM

Pull closed-won and closed-lost opportunities from HubSpot or Salesforce via MCP. Include company attributes: employee count, industry, revenue, funding stage, deal size, sales cycle length.

2. Enrich the historical dataset

For each company, pull firmographics via Crustdata or Apollo: headcount, tech stack, hiring velocity, funding rounds, revenue estimates. The more signal dimensions, the better the model.

3. Build a baseline scoring model

Ask Claude to analyze the enriched win/loss dataset and identify which attributes correlate with winning. Weight each dimension by its predictive power. Start simple — logistic regression or a weighted scorecard.
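A weighted scorecard can be sketched in a few lines. This is a minimal illustration, not the exact model Claude would build: the field names and toy deals are hypothetical, and each weight is simply the lift in win rate when a signal is present versus absent.

```python
# Minimal weighted-scorecard sketch. Field names and deals are hypothetical;
# in practice these come from your enriched CRM export.
deals = [
    {"employees_100_500": 1, "saas": 1, "series_b": 1, "won": 1},
    {"employees_100_500": 1, "saas": 0, "series_b": 1, "won": 1},
    {"employees_100_500": 0, "saas": 1, "series_b": 0, "won": 0},
    {"employees_100_500": 0, "saas": 0, "series_b": 0, "won": 0},
]
features = ["employees_100_500", "saas", "series_b"]

def feature_weight(deals, f):
    """Weight = win rate when the signal is present minus win rate when absent."""
    present = [d["won"] for d in deals if d[f] == 1]
    absent = [d["won"] for d in deals if d[f] == 0]
    rate = lambda xs: sum(xs) / len(xs) if xs else 0.0
    return rate(present) - rate(absent)

weights = {f: feature_weight(deals, f) for f in features}

def score(account):
    """Propensity score = sum of weights for the signals the account has."""
    return sum(weights[f] * account.get(f, 0) for f in features)
```

A logistic regression does the same thing with properly fitted coefficients; the scorecard is just the easiest version to inspect and explain.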

4. Backtest against held-out deals

Hold out 20-30% of your historical deals. Score them with the model. Check: does the model rank wins higher than losses? What’s the AUC? If it’s below 0.65, the model needs more signal dimensions or the data is too noisy.
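AUC has a direct interpretation here: the probability that a randomly chosen win outscores a randomly chosen loss. A sketch of the pairwise computation, with hypothetical holdout scores:

```python
def auc(scores_won, scores_lost):
    """AUC = probability a random win outscores a random loss (ties count 0.5)."""
    pairs = [(w, l) for w in scores_won for l in scores_lost]
    better = sum(1.0 if w > l else 0.5 if w == l else 0.0 for w, l in pairs)
    return better / len(pairs)

# Hypothetical holdout: three wins, two losses.
holdout_auc = auc([0.9, 0.7, 0.6], [0.4, 0.65])  # 5 of 6 pairs ranked correctly
```

An AUC of 0.5 means the model ranks no better than chance; the 0.65 floor above means it should get roughly two of every three win/loss pairs in the right order.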

5. Iterate until consistent but not overfit

Tune weights, add or drop dimensions, re-run. Watch for overfitting: if training accuracy is 95% but holdout accuracy is 55%, you’re memorizing noise. Target holdout accuracy within 10% of training accuracy.
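The stopping rule above can be encoded as a one-line gate to run after each iteration (the function name is ours, not part of any library):

```python
def generalizes(train_acc, holdout_acc, max_gap=0.10):
    """True when the train-holdout accuracy gap is within the 10% target."""
    return (train_acc - holdout_acc) <= max_gap

# The two cases from the text:
ok = generalizes(0.80, 0.72)       # gap 0.08 -> fine
overfit = not generalizes(0.95, 0.55)  # gap 0.40 -> memorizing noise
```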

6. Score new target accounts

Apply the model to your pipeline and target account list. Tier them by propensity score. Focus outbound on the top tier.
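Tiering is just ranking by score and cutting at fractions of the list. A sketch, with assumed cutoffs (top 20% → A, next 30% → B, rest → C) that you should tune to your team's capacity:

```python
def tier_accounts(scores, top_frac=0.2, mid_frac=0.3):
    """Rank accounts by propensity score descending, then cut into A/B/C tiers.
    Cutoff fractions are assumptions, not prescribed by the play."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    n = len(ranked)
    tiers = {}
    for i, name in enumerate(ranked):
        if i < n * top_frac:
            tiers[name] = "A"
        elif i < n * (top_frac + mid_frac):
            tiers[name] = "B"
        else:
            tiers[name] = "C"
    return tiers
```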

7. Find contacts at top accounts

For high-propensity accounts, use the company-to-contact waterfall to find decision-makers. Enrich with verified emails.

Tell Claude Code

“Pull all closed-won and closed-lost deals from HubSpot for the last 12 months. Enrich each company with Crustdata firmographics. Build a propensity model that predicts win vs loss based on company attributes. Backtest it on 25% held-out deals. Keep iterating until holdout accuracy is within 10% of training accuracy. Then score the 300 companies in target-accounts.csv and tier them. Find the VP of Sales at every top-tier company and write everything to scored-accounts.csv.”

How the Model Works

Input: your CRM’s closed-won and closed-lost deals, enriched with firmographic data.

What the model learns:

| Signal | Why it matters |
| --- | --- |
| Employee count range | You may win more often at 100-500 than at 5,000+ |
| Industry vertical | SaaS companies may convert at 3x the rate of manufacturing |
| Funding stage | Series B might be your sweet spot |
| Hiring velocity | Companies adding GTM roles signal active buying |
| Tech stack overlap | Using Salesforce + Outreach might correlate with wins |
| Revenue range | $10M-$50M ARR might be where you win on value |
| Geography | Regional patterns in close rates |
| Sales cycle length | Short cycles at similar companies predict fit |
Output: A weighted score per account, calibrated against your actual outcomes.

Avoiding Overfitting

The goal is a model that generalizes, not one that memorizes your training data.
  • Split your data — 75% train, 25% test. Never tune on the test set.
  • Start with fewer dimensions — 5-8 signals. Add more only if holdout accuracy improves.
  • Watch the gap — training accuracy 80%, holdout 72% = fine. Training 95%, holdout 55% = overfit.
  • Re-run quarterly — your market changes. The model should too. Pull fresh wins/losses and retrain.
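The first rule above is the one most often broken in practice: the holdout set must be carved off once, before any tuning. A minimal deterministic split (seed and fraction are illustrative defaults):

```python
import random

def train_test_split(deals, test_frac=0.25, seed=7):
    """Shuffle deterministically, then carve off the holdout.
    Never look at the holdout while tuning weights or picking features."""
    shuffled = list(deals)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_frac))
    return shuffled[:cut], shuffled[cut:]
```

A fixed seed matters for the quarterly re-runs: it keeps the split reproducible, so accuracy changes reflect the market or the model, not a reshuffled holdout.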

Cost Estimate

  • Company enrichment via Crustdata: ~0.4 credits per account
  • Contact finding (top tier): ~0.2-0.6 credits per company (via company_to_contact waterfall)
  • Email waterfall for found contacts: ~0.3 credits per contact
  • Model building and backtesting: no enrichment cost (runs on already-enriched data)

After your first outbound campaign, compare response rates by tier. If mid-tier converts as well as top-tier, retrain — the model’s weights are off.
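The per-unit figures above compose into a back-of-envelope budget. In this sketch the 0.4 contact-finding rate is the midpoint of the 0.2-0.6 range, and the top-tier fraction and contacts-per-company are assumptions to adjust:

```python
def estimate_credits(n_accounts, top_tier_frac=0.2, contacts_per_company=1):
    """Rough credit budget. 0.4/account enrichment, 0.4/company contact finding
    (midpoint of the 0.2-0.6 range), 0.3/contact email waterfall.
    Model building and backtesting cost nothing extra."""
    enrichment = 0.4 * n_accounts
    top_tier = n_accounts * top_tier_frac
    contact_finding = 0.4 * top_tier
    email_waterfall = 0.3 * top_tier * contacts_per_company
    return enrichment + contact_finding + email_waterfall

# The 300-account list from the prompt above:
budget = estimate_credits(300)  # 120 + 24 + 18 = 162 credits
```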
