How to Build an Email Verification Solution for B2B SaaS Teams

A Decade In

The product launch that didn't make sense

I have been in email verification for over a decade. Before proxy25, I ran an email verification product. We were good at what we did — or so we thought. We ran head-to-head tests against competitors and came out ahead on valid detection rate. We launched with confidence.

Six weeks later, a customer came back with a bounce rate of 14 percent on a list our tool had classified as 94 percent valid.

We looked at the code. Nothing was wrong. We ran the list again. Similar output. We ran it through a competitor. Similar output. We spent two weeks inside the verification logic looking for a bug that did not exist.

The problem was not in the code. It was in how we had thought about what we were building. We had built a verification tool. What we needed to build was a verification system — four distinct layers, each with its own requirements, each dependent on the one above it being solid before it can return anything trustworthy.

We had built layer three before we fully understood what layers one, two, and four were supposed to do. And we paid for it in exactly the way you would expect: with customer complaints we could not diagnose and accuracy numbers we could not defend.

That experience — and the two years of fixing it properly — is what eventually led to proxy25. This is the complete picture of how to build it correctly.

The Architecture

The Four Layers — And Why the Sequence Is Not Optional

Most teams think of email verification as one thing. It is four things, and they are not interchangeable.

1. Format and Syntax Validation
Does this string structurally resemble an email address? Does it have an @ symbol, a valid local part, a domain with a recognisable structure? This layer is fast, cheap, and catches a meaningful amount of noise before you spend any real resources on it. It is also completely insufficient on its own.
2. Domain and DNS Validation
Does this domain exist? Does it have MX records pointing to a live mail server? A domain with no MX records (and no A or AAAA record for senders to fall back on, as RFC 5321 allows) cannot receive email. Running SMTP handshakes against such domains wastes infrastructure, inflates your unknown count, and tells you nothing useful.
Layer two is the filter that should run before layer three ever fires — and in too many builds, it does not.
3. SMTP Handshake Verification
Open a TCP connection to the mail server, send the handshake sequence, interpret the response. The layer most teams mean when they say "email verification." The layer that looks simple and is not. Not because the mechanics are complicated, but because the response you get back is not always the truth — and your logic has no way to know the difference without the right infrastructure underneath it.
4. Output Classification and Confidence Scoring
What does your output actually tell the customer, and how much should they trust each result? This layer is almost always built last, treated as presentation logic, and shipped without enough thought. It is the layer that determines whether your product builds trust or erodes it.
Ship layer four as an afterthought and you give customers false certainty. That's the mistake we made — and it took months of complaints to understand it.
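
The layer-three handshake described above can be sketched with Python's standard smtplib. This is a minimal illustration, not a production prober: the function name `probe_mailbox`, the placeholder `helo_name` and `mail_from` values, and the convention of returning code 0 when no usable reply arrives are all assumptions made for this example. The connection factory is injectable so the function can be exercised without touching a real mail server.

```python
import smtplib
from typing import Callable, Tuple


def probe_mailbox(
    address: str,
    mx_host: str,
    helo_name: str = "verifier.example.com",  # placeholder, not a real host
    mail_from: str = "probe@example.com",     # placeholder envelope sender
    smtp_factory: Callable[..., smtplib.SMTP] = smtplib.SMTP,
    timeout: float = 10.0,
) -> Tuple[int, str]:
    """Return (reply_code, reply_text) for RCPT TO, or (0, reason) on failure."""
    try:
        # Opens the TCP connection to port 25 and reads the server greeting.
        conn = smtp_factory(mx_host, 25, timeout=timeout)
    except (OSError, smtplib.SMTPException) as exc:
        return 0, f"connect failed: {exc}"
    try:
        conn.helo(helo_name)
        conn.mail(mail_from)
        # The RCPT reply is the signal; no message body is ever sent.
        code, message = conn.rcpt(address)
        text = message.decode(errors="replace") if isinstance(message, bytes) else str(message)
        return code, text
    except (OSError, smtplib.SMTPException) as exc:
        return 0, f"handshake failed: {exc}"
    finally:
        try:
            conn.quit()
        except (OSError, smtplib.SMTPException):
            pass  # the server may already have dropped the connection
```

The function only reports what the server said. Whether a 250 deserves to be believed depends on the reputation of the IP the connection came from, which this sketch does not model.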

Build them in this sequence. Layer three built before layer two means you are burning SMTP capacity on domains that could have been eliminated in milliseconds with a DNS lookup. Layer four built without understanding the failure modes of layer three means you are presenting false confidence to customers who will discover the inaccuracy at the worst possible moment — when their campaign is live.
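
The sequencing argument can be made concrete with a short sketch in which each layer runs only if the one before it passed. Everything named here is hypothetical: `verify`, the injected `lookup_mx` and `probe` callables, and the loose regular expression (which is far from a full RFC 5322 parser) are illustration choices, and the sketch ignores both catch-all detection and the A/AAAA fallback for domains without MX records.

```python
import re
from typing import Callable, List, Tuple

# Deliberately loose structural check: something@something.something
_SYNTAX = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

MxLookup = Callable[[str], List[str]]  # domain -> list of MX hostnames
SmtpProbe = Callable[[str, str], int]  # (address, mx_host) -> SMTP reply code


def verify(address: str, lookup_mx: MxLookup, probe: SmtpProbe) -> Tuple[str, str]:
    """Return (category, layer_that_decided)."""
    # Layer 1: format and syntax. Costs nothing, so it runs first.
    if not _SYNTAX.match(address):
        return "invalid", "syntax"

    # Layer 2: DNS. A domain with no MX hosts never reaches the SMTP layer.
    domain = address.rsplit("@", 1)[1]
    mx_hosts = lookup_mx(domain)
    if not mx_hosts:
        return "invalid", "dns"

    # Layer 3: SMTP handshake against the first MX host returned.
    code = probe(address, mx_hosts[0])

    # Layer 4: classification (confidence scoring omitted in this sketch).
    if code == 250:
        return "valid", "smtp"
    if code == 550:
        return "invalid", "smtp"
    return "unknown", "smtp"
```

Passing the DNS and SMTP steps in as callables keeps the ordering logic testable on its own, and makes it easy to confirm that no SMTP capacity is spent on an address that an earlier layer already rejected.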

The Output Problem

The Output Classification Mistake That Costs You Customers

Here is the question that reveals whether a verification system was built with real understanding or assembled from documentation: what exactly does "valid" mean in your output?

Most tools return four categories. The problem is not the categories. The problem is that these four categories carry fundamentally different levels of confidence — and almost every tool presents them with identical certainty.

Valid (least reliable)

A 250 response means the server told you the mailbox exists. It does not mean the mailbox exists. A degraded IP and a clean IP can both return 250, but they are not the same result.

Invalid (most reliable)

A genuine 550 hard reject means the mailbox does not exist. Mail servers have no strategic incentive to lie by returning 550. This result holds up across almost all infrastructure conditions.

Catch-all (refinable)

Domains configured to accept all email. Can be further classified as catch-all valid or catch-all invalid, a distinction that unlocks up to 40% of a typical list for senders.

Unknown (diagnostic signal)

A high unknown rate (above 8–10% on a well-sourced B2B list) is telling you something is wrong in your delivery layer, not in the addresses.
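
One way to keep those differences visible is to attach a confidence level to every result instead of returning the bare category. The sketch below is an assumption-laden illustration: the `Verdict` type, the `classify` function, the `ip_is_clean` and `domain_is_catch_all` inputs, and the specific confidence labels are invented for this example. Only the ordering (invalid most trustworthy, valid least) comes from the discussion above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Verdict:
    category: str    # "valid", "invalid", "catch-all" or "unknown"
    confidence: str  # illustrative labels, not an industry standard
    reason: str


def classify(smtp_code: int, ip_is_clean: bool, domain_is_catch_all: bool = False) -> Verdict:
    # A hard reject is the most trustworthy signal a server gives.
    if smtp_code == 550:
        return Verdict("invalid", "high", "hard reject from the mail server")

    if smtp_code == 250:
        # A catch-all domain answers 250 for every recipient.
        if domain_is_catch_all:
            return Verdict("catch-all", "needs-refinement", "domain accepts all recipients")
        # The same 250 is worth less when it came back through a degraded IP.
        if ip_is_clean:
            return Verdict("valid", "medium", "250 received via a clean IP")
        return Verdict("valid", "low", "250 received via a degraded IP")

    # Anything else is unresolved and counts toward the unknown rate.
    return Verdict("unknown", "none", f"unclassified reply code {smtp_code}")
```

For example, `classify(250, ip_is_clean=False)` still yields the category "valid", but with confidence "low", so the customer sees that the result is weaker than a 250 obtained through a clean IP.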

"Valid" is your least reliable output category, not your most reliable one. Almost nobody says this clearly. Your output classification should reflect the confidence level behind the category — not just the category itself.

The Measurement Problem

You Are Measuring Accuracy Wrong

For the first two years of running the verification product, we measured accuracy the standard way: verify a list, send to it, measure the bounce rate, back-calculate what percentage of our "valid" results were actually deliverable. When the number looked bad, we went into the code.

Back-calculating accuracy from bounce rate conflates two completely different failure modes and presents them as one number. We spent months fixing logic that did not need fixing — because the infrastructure was feeding it bad inputs and we had no way to see the difference.

Failure Mode 1: Logic Problem

Your SMTP interpretation has a flaw. Your catch-all detection is misclassifying domains. Something in the code is wrong.

→ Fix is in the product

Failure Mode 2: Infrastructure Problem

Your logic is working exactly as designed. Mail servers queried through degraded IPs are returning defensive responses. Your logic faithfully classifies those responses.

→ Fix is not in the product

The measurement that actually diagnoses which problem you have is tracking two numbers separately:

False Positive Rate (logic)

Tells you about your verification logic. When this is wrong, fix the code.

Unknown Rate (infrastructure)

Tells you about your delivery layer. When this is high, fix the infrastructure.

Track them separately. The behaviour of each tells you something the other cannot. Conflating them into a single accuracy number means you keep applying the right fix to the wrong layer.
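
As a rough sketch of what tracking the two numbers separately might look like, the function below takes one (category, bounced) pair per address and reports each rate on its own. The definitions are assumptions for this example: false positive rate is taken to mean the share of "valid" results that later hard-bounced, and unknown rate the share of all results classified "unknown". The name `verification_rates` is hypothetical.

```python
from typing import Dict, Iterable, Tuple


def verification_rates(results: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """results holds one (category, bounced) pair per verified address."""
    total = valid = valid_bounced = unknown = 0
    for category, bounced in results:
        total += 1
        if category == "unknown":
            unknown += 1
        elif category == "valid":
            valid += 1
            if bounced:
                valid_bounced += 1
    return {
        # Points at the verification logic.
        "false_positive_rate": valid_bounced / valid if valid else 0.0,
        # Points at the delivery layer.
        "unknown_rate": unknown / total if total else 0.0,
    }


sample = (
    [("valid", False)] * 9
    + [("valid", True)]
    + [("unknown", False)] * 2
    + [("invalid", False)] * 8
)
rates = verification_rates(sample)
```

On this made-up sample of 20 addresses, one of the ten "valid" results bounced and two results were unknown, so both rates come out at 10 percent. They describe different problems, and a single blended accuracy figure would hide which one moved.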

The Critical Decision

The Build vs Buy Decision Most Teams Get Backwards

When we were deep in fixing the verification product, we considered building our own IP infrastructure. The logic seemed sound — own the infrastructure, control the costs, remove the vendor dependency.

We did not do it. And the reason we did not do it is the same reason I would tell any team today not to do it.

IP reputation is not something you build on a product timeline. The IPs that mail servers trust for SMTP verification have years of consistent, clean history behind them. Not months. Years. An IP that has been sending clean, well-paced, legitimate traffic for three years carries trust that an IP activated six months ago simply does not have — regardless of how cleanly the newer IP has behaved since activation.

You cannot build that history. You can only accumulate it. Which means the infrastructure you build today will not perform at the level you need for another two to three years — while your product is live, your customers are using it, and your accuracy numbers are being evaluated against competitors whose infrastructure was built years before yours.

Build This: Your Verification Logic

This is where your product differentiation actually lives. The SMTP interpretation, the classification system, the confidence scoring: build this yourself.

Source This: The IP Infrastructure

Residential rotating proxies with port 25 access, with years of established reputation. Source from providers who have already built the history your logic needs to get honest answers.

That is the right division of labour. It took me two years and more customer complaints than I would like to remember to understand why.

The Complete Picture

Key Takeaways

✓ The four layers (format, DNS, SMTP, output classification) are not interchangeable. Build them in sequence or each layer below inherits the failures of the one above it.

✓ "Valid" is your least reliable output category, not your most reliable one. A 250 from a clean IP and a 250 from a flagged IP look identical in your output. They are not the same result.

✓ Invalid is the result you can trust most. A genuine 550 holds up across almost all infrastructure conditions.

✓ High unknown rate is a signal about your infrastructure, not your data. It is unresolved, not unresolvable.

✓ Track false positive rate and unknown rate separately. They diagnose different problems. Conflating them into a single accuracy number means you keep applying the right fix to the wrong layer.

✓ IP reputation takes years to build. Source infrastructure from providers who have already built it. Build the logic yourself. That is the right division of labour.

Built for teams that want honest answers

proxy25 residential rotating proxies with port 25 access are purpose-built for teams that want their verification logic to get honest answers from mail servers.
