Quick Answer: What Is an Account Scoring Model?
An account scoring model assigns a numerical value to each company based on how likely it is to become a customer. You build one by scoring firmographic data (revenue brackets, employee count, industry), technographic fit, engagement signals, and intent data, then combining them with a weighted formula. This article gives you the actual code.
What Is an Account Scoring Model?
An account scoring model is a system that ranks companies by their likelihood to buy your product. It takes raw data about a company -- revenue, headcount, industry, tech stack, how they interact with your brand -- and outputs a number that tells your sales team where to spend their time.
If you have worked in B2B, you have seen the alternative: reps cherry-picking accounts based on gut feel, or marketing dumping thousands of "qualified" leads into a CRM with no prioritization. Account scoring replaces that with a repeatable algorithm.
The difference between account scoring and lead scoring matters. Lead scoring evaluates individual people. Account scoring evaluates entire companies. In B2B, where buying committees of 6-10 stakeholders make purchase decisions, the company-level view is more predictive. Five engaged contacts at a well-fit company are worth more than fifty scattered leads at random organizations.
The rest of this article gives you the concrete algorithm. We will start with revenue bracket scoring (the most common firmographic signal), build up to a full weighted model in Python, show you how to calibrate weights against real data, and then cover what to do when the static model stops working.
The Revenue Bracket Scoring Algorithm
Revenue is the single most common firmographic signal in account scoring. The logic is simple: companies in certain revenue ranges are more likely to buy your product. A $5M ARR startup has different needs (and budgets) than a $500M enterprise.
The first step is defining your revenue brackets. These should come from your actual closed-won data, not guesswork. Pull your last 200 closed deals and look at the revenue distribution.
Revenue Bracket Scoring in SQL
If your account data lives in a database or warehouse, here is the SQL to assign bracket scores. This is a pattern you can plug directly into a dbt model or a CRM workflow.
-- Revenue bracket scoring algorithm
-- Adjust brackets and scores to match your ICP
SELECT
    account_id,
    account_name,
    annual_revenue,
    CASE
        WHEN annual_revenue IS NULL THEN 20          -- Unknown: low default
        WHEN annual_revenue < 1000000 THEN 15        -- Under $1M: too small
        WHEN annual_revenue < 10000000 THEN 55       -- $1M-$10M: emerging fit
        WHEN annual_revenue < 50000000 THEN 90       -- $10M-$50M: sweet spot
        WHEN annual_revenue < 200000000 THEN 100     -- $50M-$200M: ideal
        WHEN annual_revenue < 1000000000 THEN 75     -- $200M-$1B: good, longer cycle
        ELSE 40                                      -- $1B+: enterprise, different motion
    END AS revenue_score,
    CASE
        WHEN annual_revenue IS NULL THEN 'unknown'
        WHEN annual_revenue < 1000000 THEN 'smb'
        WHEN annual_revenue < 10000000 THEN 'lower_mid_market'
        WHEN annual_revenue < 50000000 THEN 'upper_mid_market'
        WHEN annual_revenue < 200000000 THEN 'mid_enterprise'
        WHEN annual_revenue < 1000000000 THEN 'enterprise'
        ELSE 'large_enterprise'
    END AS revenue_tier
FROM accounts
ORDER BY revenue_score DESC;
Two things to notice. First, null handling matters. Revenue data is frequently missing, especially for private companies. Assigning a low default score (rather than zero) keeps these accounts in the funnel without over-prioritizing them. Second, the score curve is not linear. Your "sweet spot" bracket gets the highest score, and the score drops off on both ends. A company that is too small cannot afford you; a company that is too large may need an entirely different sales motion.
Revenue Bracket Scoring in Python
If you are building scoring in a Python pipeline (common with Clay workflows or custom enrichment scripts), here is the equivalent function.
def score_revenue_bracket(annual_revenue: float | None) -> dict:
    """Score an account based on annual revenue bracket.

    Returns dict with score (0-100), tier label, and reasoning.
    Adjust BRACKETS to match your ICP's revenue sweet spot.
    """
    # Define brackets: (max_revenue, score, tier_name)
    # Order matters -- first match wins
    BRACKETS = [
        (1_000_000, 15, "smb"),
        (10_000_000, 55, "lower_mid_market"),
        (50_000_000, 90, "upper_mid_market"),
        (200_000_000, 100, "mid_enterprise"),
        (1_000_000_000, 75, "enterprise"),
        (float("inf"), 40, "large_enterprise"),
    ]

    if annual_revenue is None:
        return {
            "score": 20,
            "tier": "unknown",
            "reasoning": "Revenue data unavailable. Default low score.",
        }

    for max_rev, score, tier in BRACKETS:
        if annual_revenue < max_rev:
            return {
                "score": score,
                "tier": tier,
                "reasoning": f"Revenue ${annual_revenue:,.0f} falls in {tier} bracket.",
            }

    # Only reached for values that compare False everywhere (e.g. NaN)
    return {"score": 20, "tier": "unknown", "reasoning": "Fallback."}


# Usage
result = score_revenue_bracket(35_000_000)
# {'score': 90, 'tier': 'upper_mid_market', 'reasoning': 'Revenue $35,000,000 falls in upper_mid_market bracket.'}
Do not copy these brackets blindly. Pull your closed deals (both won and lost) from the last 12 months, bucket them by revenue, and calculate win rate per bucket. Your brackets should reflect where you actually win, not where you wish you did. If 60% of your wins come from the $10M-$50M range, that is your sweet spot.
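The bucketing step can be sketched in a few lines of Python. This is a minimal illustration, not a definitive method: the bracket labels, the `(annual_revenue, won)` tuple format, and the choice to scale scores so the best bracket gets 100 are all assumptions, and the sample deals are made up. Note that a win rate needs lost deals as well as won ones.

```python
from collections import defaultdict

# Hypothetical bracket edges: (exclusive upper bound, label). Replace with your own.
BRACKET_EDGES = [
    (1_000_000, "under_1m"),
    (10_000_000, "1m_10m"),
    (50_000_000, "10m_50m"),
    (200_000_000, "50m_200m"),
    (float("inf"), "200m_plus"),
]


def bracket_for(revenue):
    """Return the label of the first bracket whose upper bound exceeds revenue."""
    if revenue is None:
        return "unknown"
    for upper, label in BRACKET_EDGES:
        if revenue < upper:
            return label
    return "unknown"


def win_rate_by_bracket(deals):
    """deals: iterable of (annual_revenue, won) tuples for closed deals."""
    totals = defaultdict(int)
    wins = defaultdict(int)
    for revenue, won in deals:
        label = bracket_for(revenue)
        totals[label] += 1
        if won:
            wins[label] += 1
    return {label: wins[label] / totals[label] for label in totals}


def scores_from_win_rates(win_rates):
    """Scale win rates so the best-performing bracket scores 100 (one possible mapping)."""
    best = max(win_rates.values())
    if best == 0:
        return {label: 0 for label in win_rates}
    return {label: round(100 * rate / best) for label, rate in win_rates.items()}


# Made-up closed deals for illustration only
deals = [
    (500_000, False), (900_000, False),
    (5_000_000, True), (8_000_000, False),
    (20_000_000, True), (30_000_000, True), (40_000_000, False), (45_000_000, True),
    (None, False),
]
rates = win_rate_by_bracket(deals)
scores = scores_from_win_rates(rates)
```

With real data you would want far more deals per bracket than this toy sample before trusting the resulting curve.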
Building the Full Account Scoring Model
Revenue brackets are one signal. A complete account scoring model combines firmographic fit, technographic compatibility, engagement behavior, and intent data into a single weighted score. Here is a working implementation you can adapt; the brackets, industries, and weights are placeholders until you calibrate them.
Firmographic Scoring Function
This function scores the static attributes of a company: revenue, employee count, and industry. These three factors together form the firmographic foundation of your model.
def score_firmographics(account: dict) -> dict:
    """Score firmographic fit (0-100).

    Combines revenue bracket, employee count, and industry.
    """
    # Revenue score (reuse the function from above)
    rev = score_revenue_bracket(account.get("annual_revenue"))

    # Employee count score
    emp = account.get("employee_count")
    if emp is None:
        emp_score = 20
    elif emp < 50:
        emp_score = 15
    elif emp < 200:
        emp_score = 60
    elif emp < 1000:
        emp_score = 95
    elif emp < 5000:
        emp_score = 80
    else:
        emp_score = 50

    # Industry score -- your ICP verticals get the highest scores
    ICP_INDUSTRIES = {
        "saas": 100, "fintech": 95, "cybersecurity": 90,
        "healthtech": 80, "ecommerce": 70, "martech": 85,
    }
    # "or ''" guards against records where industry is present but None
    industry = (account.get("industry") or "").strip().lower()
    ind_score = ICP_INDUSTRIES.get(industry, 30)  # Default for non-ICP

    # Weighted combination within firmographics
    firmo_score = (rev["score"] * 0.45) + (emp_score * 0.30) + (ind_score * 0.25)

    return {
        "score": round(firmo_score, 1),
        "revenue": rev,
        "employee_score": emp_score,
        "industry_score": ind_score,
    }
Engagement Scoring with Time Decay
Engagement signals are inherently time-sensitive. A pricing page visit last week is a fundamentally different signal than one six months ago. The algorithm needs a decay function to prevent stale activity from inflating scores.
from datetime import datetime, timezone
import math


def score_engagement(events: list[dict], now: datetime | None = None) -> dict:
    """Score engagement signals with exponential time decay.

    Each event needs a type and a timestamp (naive UTC datetime);
    base points come from EVENT_WEIGHTS.
    Decay halves the value every 14 days.
    """
    # Naive UTC "now", so it can be subtracted from naive UTC event timestamps
    now = now or datetime.now(timezone.utc).replace(tzinfo=None)
    HALF_LIFE_DAYS = 14

    # Base points by event type
    EVENT_WEIGHTS = {
        "pricing_page_view": 25,
        "demo_request": 50,
        "webinar_attended": 20,
        "email_replied": 30,
        "content_download": 15,
        "email_opened": 5,
        "page_view": 3,
    }

    total_decayed = 0.0
    for event in events:
        base = EVENT_WEIGHTS.get(event["type"], 1)  # Unknown event types get 1 point
        # Clamp at 0 so a future-dated timestamp cannot inflate the score
        days_ago = max(0, (now - event["timestamp"]).days)
        # Exponential decay: value = base * (0.5 ^ (days / half_life))
        decay_factor = math.pow(0.5, days_ago / HALF_LIFE_DAYS)
        total_decayed += base * decay_factor

    # Normalize to 0-100 scale (cap at 200 raw points = 100 score)
    score = min(100, (total_decayed / 200) * 100)

    return {
        "score": round(score, 1),
        "raw_decayed_points": round(total_decayed, 1),
        "event_count": len(events),
    }
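To make the decay curve concrete, here is a small standalone sketch that applies the same 14-day half-life to a single 25-point pricing page view at different ages. The helper name `decayed_points` is introduced only for this illustration.

```python
import math

HALF_LIFE_DAYS = 14  # same half-life as the engagement function above


def decayed_points(base_points: float, days_ago: float) -> float:
    """Value of an event after exponential decay with a fixed half-life."""
    return base_points * math.pow(0.5, days_ago / HALF_LIFE_DAYS)


for days in (0, 7, 14, 28, 56):
    print(days, round(decayed_points(25, days), 2))
# 0 25.0
# 7 17.68
# 14 12.5
# 28 6.25
# 56 1.56
```

After two months the visit contributes under 2 of its original 25 points, which is the behavior you want: old activity fades out instead of being cut off at an arbitrary date.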
The Composite Scoring Function
Now combine everything into a single account score. The weighted formula is straightforward: normalize each component to 0-100, apply weights, sum them.
def score_account(account: dict, events: list[dict]) -> dict:
    """Calculate composite account score.

    Weights should be calibrated against your closed-won data.
    These defaults are a reasonable starting point.
    """
    WEIGHTS = {
        "firmographic": 0.45,   # Largest signal for most B2B
        "engagement": 0.25,     # Active interest indicator
        "technographic": 0.20,  # Stack compatibility
        "intent": 0.10,         # Third-party signals
    }

    # Score each component
    firmo = score_firmographics(account)
    engagement = score_engagement(events)
    techno_score = account.get("technographic_score", 50)
    intent_score = account.get("intent_score", 30)

    # Weighted composite
    composite = (
        firmo["score"] * WEIGHTS["firmographic"]
        + engagement["score"] * WEIGHTS["engagement"]
        + techno_score * WEIGHTS["technographic"]
        + intent_score * WEIGHTS["intent"]
    )

    # Assign priority tier
    if composite >= 80:
        tier = "hot"
    elif composite >= 60:
        tier = "warm"
    elif composite >= 40:
        tier = "nurture"
    else:
        tier = "cold"

    return {
        "composite_score": round(composite, 1),
        "tier": tier,
        "components": {
            "firmographic": firmo["score"],
            "engagement": engagement["score"],
            "technographic": techno_score,
            "intent": intent_score,
        },
        "weights": WEIGHTS,
    }
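One property worth guarding when you start editing weights: the composite only stays on a 0-100 scale if the weights sum to 1.0 and every component is itself 0-100. A minimal sketch of that check follows; the helper name `weighted_composite` and the example component values are hypothetical.

```python
import math

WEIGHTS = {
    "firmographic": 0.45,
    "engagement": 0.25,
    "technographic": 0.20,
    "intent": 0.10,
}


def weighted_composite(components: dict, weights: dict) -> float:
    """Weighted sum of 0-100 component scores, with basic sanity checks."""
    if not math.isclose(sum(weights.values()), 1.0, abs_tol=1e-9):
        raise ValueError("weights must sum to 1.0")
    missing = set(weights) - set(components)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    for name in weights:
        if not 0 <= components[name] <= 100:
            raise ValueError(f"{name} must be between 0 and 100")
    return round(sum(components[name] * w for name, w in weights.items()), 1)


# Hypothetical component scores
example = {"firmographic": 70, "engagement": 40, "technographic": 50, "intent": 30}
print(weighted_composite(example, WEIGHTS))  # 54.5
```

Failing loudly here is a design choice: a silent drift to weights summing to 1.1 would shift every account toward the "hot" tier without anyone noticing.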
Worked Example
Here is what the output looks like for a sample account running through this model.
Example: TechCorp Inc -- Score: 82.2/100
- Firmographic (94.0 x 0.45 = 42.3) -- $35M revenue (upper_mid_market, 90 points), 420 employees (95 points), SaaS industry (100 points). Strong ICP fit across all three firmographic dimensions.
- Engagement (68.0 x 0.25 = 17.0) -- Recent activity includes 3 pricing page views in the last 10 days, a reply to one email, and a case study download. Time decay is working in their favor because the activity is recent.
- Technographic (80 x 0.20 = 16.0) -- Uses Salesforce (direct integration), no competing product installed.
- Intent (69 x 0.10 = 6.9) -- Moderate research activity on G2 in the relevant category.
Tier: Hot (composite of 80 or above). But the score alone does not tell you who at TechCorp is driving the research, why they are looking now, or which product of yours is the best fit. We will come back to that gap.
How to Calibrate Your Scoring Weights
The algorithm above works out of the box, but the default weights are generic. The difference between a mediocre scoring model and one that actually predicts conversion is calibration against your own data.
The process is straightforward: calculate win rate by factor, then adjust weights so that the factors most correlated with winning get the most weight.
SQL: Win Rate by Revenue Bracket
-- Win rate by revenue bracket (last 12 months)
-- Note: COUNT(*) includes open opportunities; filter to closed stages
-- if you want wins / (wins + losses) instead.
SELECT
    CASE
        WHEN a.annual_revenue IS NULL THEN 'Unknown'   -- keep NULLs out of '$200M+'
        WHEN a.annual_revenue < 1000000 THEN 'Under $1M'
        WHEN a.annual_revenue < 10000000 THEN '$1M-$10M'
        WHEN a.annual_revenue < 50000000 THEN '$10M-$50M'
        WHEN a.annual_revenue < 200000000 THEN '$50M-$200M'
        ELSE '$200M+'
    END AS revenue_bracket,
    COUNT(*) AS total_opps,
    SUM(CASE WHEN o.stage = 'closed_won' THEN 1 ELSE 0 END) AS wins,
    ROUND(
        SUM(CASE WHEN o.stage = 'closed_won' THEN 1 ELSE 0 END) * 100.0
        / NULLIF(COUNT(*), 0)
    , 1) AS win_rate_pct
FROM opportunities o
JOIN accounts a ON o.account_id = a.id
WHERE o.created_at >= CURRENT_DATE - INTERVAL '12 months'
GROUP BY 1
ORDER BY win_rate_pct DESC;
Run the same analysis for employee count, industry, and each engagement signal. The pattern that emerges will tell you where your current weights are off. If your $10M-$50M bracket wins at 32% but your $50M-$200M bracket wins at only 18%, your revenue scoring curve needs to reflect that -- not the other way around.
As a rule of thumb, you need at least 30 opportunities per bracket before the win rate is stable enough to act on; below that, a single deal moves the rate by more than three percentage points. If you are pre-scale with fewer than 200 total opportunities, start with the default weights and recalibrate after two quarters of data. Premature optimization of scoring weights is a common trap.
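To see how much uncertainty remains at small sample sizes, you can put a confidence interval around each bracket's win rate. The sketch below uses the Wilson score interval, which is a standard choice for proportions; it is an addition for illustration, not part of the scoring model, and the 10-of-30 figures are made up.

```python
import math


def wilson_interval(wins: int, total: int, z: float = 1.96) -> tuple:
    """Wilson score interval for a win rate (z = 1.96 is roughly 95% confidence)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = wins / total
    denom = 1 + z ** 2 / total
    center = (p + z ** 2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return (max(0.0, center - half_width), min(1.0, center + half_width))


low, high = wilson_interval(wins=10, total=30)
print(f"{low:.1%} to {high:.1%}")
```

For 10 wins out of 30, the interval runs from roughly 19% to 51%, so two brackets with observed rates of 25% and 35% at that sample size are not clearly different. Treat 30 as a floor, not a guarantee.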
Adjusting Weights Based on Correlation
Once you have win rates by factor, the calibration logic is simple. Factors with higher win-rate variance between "good" and "bad" values deserve more weight. If industry predicts conversion at 3x the rate of employee count, it should carry more weight in the formula.
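One simple way to turn that idea into numbers is to weight each factor in proportion to the spread between its best and worst win rate. This is a rough heuristic sketched under that assumption, not a statistically rigorous method, and every win rate below is hypothetical.

```python
def weights_from_spread(win_rates_by_factor: dict) -> dict:
    """Weight each factor by (best win rate - worst win rate), normalized to sum to 1."""
    spreads = {
        factor: max(rates.values()) - min(rates.values())
        for factor, rates in win_rates_by_factor.items()
    }
    total = sum(spreads.values())
    if total == 0:
        # No factor separates wins from losses: fall back to equal weights
        return {factor: round(1 / len(spreads), 3) for factor in spreads}
    return {factor: round(spread / total, 3) for factor, spread in spreads.items()}


# Hypothetical win rates per factor value
observed = {
    "revenue": {"10m_50m": 0.32, "50m_200m": 0.18, "under_1m": 0.05},
    "industry": {"saas": 0.30, "other": 0.12},
    "employee_count": {"200_1000": 0.25, "under_50": 0.10},
}
weights = weights_from_spread(observed)
print(weights)  # {'revenue': 0.45, 'industry': 0.3, 'employee_count': 0.25}
```

Spread ignores sample size and correlation between factors (revenue and headcount usually move together), so use the output as a starting point for the next iteration rather than a final answer.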
A practical approach: score your closed-won deals and closed-lost deals through the model. If the model cannot distinguish between the two groups, the weights are wrong. Iterate until the score distributions for wins and losses have minimal overlap.
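"Minimal overlap" can be measured with a single number: the probability that a randomly chosen won deal scores higher than a randomly chosen lost deal (equivalent to the AUC of the score). A value near 0.5 means the model cannot tell the groups apart; values closer to 1.0 mean better separation. The sketch below is a plain-Python version with made-up scores.

```python
def separation_auc(win_scores: list, loss_scores: list) -> float:
    """Probability that a random won deal outscores a random lost deal (ties count half)."""
    if not win_scores or not loss_scores:
        raise ValueError("need at least one win and one loss")
    better = 0.0
    for w in win_scores:
        for l in loss_scores:
            if w > l:
                better += 1.0
            elif w == l:
                better += 0.5
    return better / (len(win_scores) * len(loss_scores))


# Made-up composite scores for closed deals
wins = [82, 75, 68, 90]
losses = [40, 55, 70, 35]
print(separation_auc(wins, losses))  # 0.9375
```

The pairwise loop is quadratic, which is fine for a few thousand deals. Track this number each time you change weights, and keep the change only if it goes up on deals the weights were not tuned on.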
Revisit this analysis quarterly. Your ICP shifts as you grow, and weights that were accurate six months ago may not reflect your current market position.
Where Static Scoring Breaks Down
The model above is solid for prioritization. It will outperform gut-feel account selection and it will give your SDRs a ranked list they can work from. But after running it for 12-18 months, most teams hit the same ceiling.
Three failure modes show up repeatedly.
Context collapse
A 78-point account at Company A and a 78-point account at Company B get the same treatment. But Company A is in your core vertical with a champion you met at a conference, while Company B is a stretch industry with no existing relationship. The score collapses the rich context that determines how you should actually engage into a single number.
Your best reps work around the score. They look at the account, do their own research, and make a judgment call. Your less experienced reps follow the score blindly and get worse results. The scoring model is only as useful as the context it throws away.
Tribal knowledge stays outside the model
Your best SDRs know which verticals are hot right now, which competitor situations are winnable, and which buyer personas convert fastest. This knowledge lives in their heads. It never makes it into the CASE statements or weight configurations.
As your persona-based targeting becomes more sophisticated, the gap between what the model knows and what the team knows widens. The scoring algorithm cannot encode "we just hired an AE who came from their industry and has relationships there."
Static weights in a dynamic market
You calibrate weights based on last quarter's data. A new competitor launches. A product release changes your ICP. A macroeconomic shift makes one vertical suddenly more budget-conscious. Your weights are now wrong, but the model still returns confident-looking numbers.
Even teams that commit to quarterly recalibration find the weights drift faster than they can update them. The market is continuous; your calibration cycle is discrete. That gap is structural, not operational.
The fundamental problem with static scoring is not the math. The math is fine. The problem is that a single number cannot carry enough information for a rep to take the right action. A score tells you which accounts to prioritize. It cannot tell you why they are a fit or how to approach them.
From Static Scores to AI-Powered Qualification
This is where the scoring conversation shifts. If your static model is working well enough for basic prioritization, keep it. But when you need the why behind the score -- the reasoning that tells a rep how to engage an account, not just whether to -- you need a different kind of system.
AI-powered qualification agents evaluate accounts against your actual ICP definition, product criteria, and competitive positioning. Instead of returning just a number, they return a score plus the reasoning behind it.
How Qualification Agents Work
Octave provides two qualification agents that operate this way: the Qualify Company Agent and the Qualify Person Agent.
The Qualify Company Agent evaluates a company against one or more of your products to determine fit. You define "good fit" and "bad fit" qualifying questions in your Octave Library (for example: "Does the company sell B2B software?" or "Is the company in a highly regulated industry where our compliance features matter?"). The agent researches the company and answers each question with a yes/no determination, a rationale, and a confidence level (LOW, MEDIUM, or HIGH).
The output includes an overall score, an overall rationale explaining the score, and individual answers to each qualifying question. This is a meaningfully different output than a static score. When a rep sees "Score: 82 -- Strong fit. They are a mid-market SaaS company using Salesforce (we integrate natively). Answered YES with HIGH confidence to 4 of 5 good-fit questions. One concern: they appear to have a small sales team, which may limit expansion potential" -- they know exactly what to do with that account.
The Qualify Person Agent does the same thing at the individual contact level, evaluating a person against both a product and a persona. The output includes product qualification, persona fit, and segment alignment, each with their own scores and reasoning.
Calling the API
Both agents are accessible via API, which means you can integrate them into Clay workflows, custom pipelines, or any orchestration tool. Here is what a call to the Qualify Company Agent looks like.
import requests


def qualify_company(company_domain: str, agent_id: str, api_key: str,
                    runtime_context: dict | None = None) -> dict:
    """Qualify a company using Octave's Qualify Company Agent.

    Args:
        company_domain: e.g. "techcorp.com"
        agent_id: Your agent OId from the Octave dashboard
        api_key: Your Octave API key
        runtime_context: Optional dict of known data to pass
            (e.g., {"employee_count": 420, "tech_stack": ["salesforce"]})
    """
    response = requests.post(
        "https://app.octavehq.com/api/v2/agents/qualify-company/run",
        headers={"api_key": api_key, "Content-Type": "application/json"},
        json={
            "agentOId": agent_id,
            "companyDomain": company_domain,
            "runtimeContext": runtime_context,
            "includeFullAnnotation": False,
        },
        timeout=30,  # Without a timeout, requests can wait indefinitely; tune as needed
    )
    response.raise_for_status()  # Raise on 4xx/5xx instead of returning an error body
    return response.json()


# Example usage
result = qualify_company(
    company_domain="techcorp.com",
    agent_id="your_agent_oid",
    api_key="your_api_key",
    runtime_context={"employee_count": 420},
)
# Response includes:
# - Overall score + rationale
# - Product qualification (answers to good/bad fit questions)
# - Segment qualification
# - Confidence levels (LOW, MEDIUM, HIGH) per answer
# - Disqualifier summary (if any instant disqualifiers triggered)
If you have specific quantitative data about a company (employee count, revenue, tech stack) from an enrichment step, pass it as runtimeContext. This prevents the agent from having to infer those values from public sources. For questions like "Does the company have more than 100 employees?", the agent will use your runtime context value for a precise answer rather than an estimate.
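A small helper can assemble that context from an enrichment record while leaving out fields you do not actually know, so the agent is not handed empty values. This is a sketch: the helper name and the `annual_revenue` key are assumptions, and the keys the agent makes best use of may differ in your setup.

```python
def build_runtime_context(enriched: dict) -> dict:
    """Keep only known, non-empty enrichment fields for the runtime context."""
    # Field names are illustrative; employee_count and tech_stack mirror the example above
    fields = ("employee_count", "annual_revenue", "tech_stack")
    return {
        key: enriched[key]
        for key in fields
        if enriched.get(key) not in (None, "", [])
    }


record = {
    "employee_count": 420,
    "annual_revenue": None,          # unknown: omitted from the context
    "tech_stack": ["salesforce"],
    "crm_owner": "jane",             # unrelated field: ignored
}
print(build_runtime_context(record))
# {'employee_count': 420, 'tech_stack': ['salesforce']}
```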
When to Use Each Approach
Static scoring and AI qualification are not mutually exclusive. They solve different problems at different points in your funnel.
| Use Case | Static Scoring | AI Qualification |
|---|---|---|
| Bulk prioritization of 10K+ accounts | Fast, cheap, good enough | Slower, higher cost per account |
| SDR outreach prep | Tells them which account to call | Tells them what to say and why |
| ICP drift detection | Requires manual recalibration | Adapts as you update Library criteria |
| Multi-product qualification | Needs separate models per product | One agent, multiple product evaluations |
| Rep trust and adoption | "The score says 75" (opaque) | "Here is why they fit" (transparent) |
A practical setup: use your static scoring model as the first filter to identify which accounts are worth qualifying, then run the Qualify Company Agent on the top tier. This keeps API costs proportional to the accounts that matter while giving reps the context they need on priority accounts.
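That two-stage setup can be sketched as a filter followed by a capped number of qualification calls. In this illustration the qualification step is passed in as a function so the example runs without network access; `qualify_top_accounts`, the `domain` key, and the stub are hypothetical names, and in practice you would pass a wrapper around your real API call.

```python
from typing import Callable


def qualify_top_accounts(
    scored_accounts: list,
    qualify: Callable[[str], dict],
    min_score: float = 80.0,
    max_calls: int = 100,
) -> list:
    """Run the (expensive) qualification step only on the highest static scores."""
    shortlist = sorted(
        (a for a in scored_accounts if a["composite_score"] >= min_score),
        key=lambda a: a["composite_score"],
        reverse=True,
    )[:max_calls]  # hard cap keeps API spend bounded
    return [{**a, "qualification": qualify(a["domain"])} for a in shortlist]


def stub_qualify(domain: str) -> dict:
    """Stand-in for a real qualification call; returns a placeholder."""
    return {"domain": domain, "note": "stub response, not real API output"}


accounts = [
    {"domain": "a.example", "composite_score": 85.0},
    {"domain": "b.example", "composite_score": 60.0},
    {"domain": "c.example", "composite_score": 92.0},
]
results = qualify_top_accounts(accounts, stub_qualify)
print([r["domain"] for r in results])  # ['c.example', 'a.example']
```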
Frequently Asked Questions
How do you build a scoring algorithm based on company revenue brackets?
Define revenue tiers that match your ICP (for example: under $1M, $1M-$10M, $10M-$50M, $50M-$200M, $200M+), assign each tier a score from 0-100 based on your historical win rates per bracket, then use a SQL CASE statement or Python conditional to map each account to its bracket score. Combine the revenue score with employee count, industry, and other signals using a weighted formula. The code examples in this article are production-ready starting points.
What is the difference between lead scoring and account scoring?
Lead scoring evaluates individual contacts. Account scoring evaluates entire companies. In B2B, where buying committees of 6-10 stakeholders make purchase decisions, account scoring is more predictive of deal potential. Most mature GTM teams use both: account scoring for prioritization, lead scoring for routing contacts within prioritized accounts.
How often should I recalibrate my scoring weights?
Quarterly at minimum. Run the win-rate-by-factor SQL queries shown in this article and compare the output to your current weight configuration. Major recalibration should happen after product launches that change your ICP, when entering new markets, or when you see score distributions for wins and losses converging (meaning the model is losing discriminative power).
What is a good score threshold for SDR follow-up?
It depends on your funnel capacity. If SDRs are capacity-constrained, set the threshold higher (70+). If you need top-of-funnel volume, lower it (50+). The right threshold balances coverage with conversion rate. Monitor both metrics and adjust weekly until you find the equilibrium where SDRs are neither starved nor overwhelmed.
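If capacity is the binding constraint, you can derive the threshold from it directly: rank accounts by score and take the score of the last account your team can work in a week. A minimal sketch, with a hypothetical function name and made-up scores:

```python
from typing import Optional


def threshold_for_capacity(scores: list, weekly_capacity: int) -> Optional[float]:
    """Lowest score that still fits within capacity, or None if nothing can be worked."""
    if weekly_capacity <= 0 or not scores:
        return None
    ranked = sorted(scores, reverse=True)
    if weekly_capacity >= len(ranked):
        return ranked[-1]  # capacity covers every account
    # Tied scores at the cutoff can push the count slightly over capacity
    return ranked[weekly_capacity - 1]


scores = [91, 84, 77, 72, 66, 58, 45]
print(threshold_for_capacity(scores, weekly_capacity=3))  # 77
```

Recompute this as the score distribution shifts; a fixed cutoff of 70 can mean 50 accounts one month and 500 the next.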
Should I use machine learning for account scoring?
ML can improve accuracy if you have enough closed-won data (500+ deals minimum) and clean input features. The tradeoff is explainability: a gradient-boosted model may score more accurately, but your reps cannot see why an account scored high or low, which hurts adoption. A practical middle ground is to use a rules-based model for transparency and layer AI qualification agents on top for contextual reasoning about why an account fits.
What data sources do I need for account scoring?
At minimum: CRM data (company name, industry, employee count, revenue) and your own engagement data (website visits, email interactions). To improve accuracy, add technographic data from providers like BuiltWith or SimilarTech, and third-party intent signals from Bombora or G2 Buyer Intent. The more signals you layer in, the more discriminative the model becomes -- but start with what you have and add sources incrementally.
