What does it mean to normalize a root domain from a URL?

Normalizing a root domain means converting a full URL into a consistent registered domain format, such as turning https://www.blog.example.com/page?utm_source=x into example.com.

Should I remove www when normalizing domains?

Yes, for most SEO, reporting, and deduplication workflows, www.example.com and example.com should be normalized to the same root domain: example.com.

Why is the Public Suffix List important for root domain extraction?

The Public Suffix List helps identify multi-part suffixes like co.uk and com.au, so a parser returns example.co.uk instead of incorrectly reducing the domain to co.uk.

Can I normalize root domains in bulk?

Yes. You can paste multiple URLs into the URL to Domain tool and extract a deduplicated list of normalized root domains in one step.

Extract and Normalize Root Domain from URL

A raw URL often contains more than the domain you actually need: protocol, subdomains, folders, query strings, tracking parameters, and fragments. Root domain normalization turns messy URLs into one consistent value, such as example.com. This guide explains the process, common edge cases, and the fastest way to do it in bulk.


What Is a Normalized Root Domain?

A root domain (also called the registrable domain) is the registered name plus its public suffix, with no subdomains. For example, the root domain of https://blog.example.com/pricing?ref=ad is example.com: example is the registered name and com is the public suffix. Normalization means applying the same cleanup rules to every URL so that equivalent URLs produce the same domain.

Example normalization:

https://www.blog.example.co.uk/articles?id=42&utm_source=newsletter#intro

becomes

example.co.uk

Why Normalize Root Domains?

Deduplicate URL Lists

Backlink exports, analytics reports, search results, and crawl data often contain many URLs from the same site. Normalizing to root domains lets you count unique websites instead of unique pages.

Clean SEO and Outreach Data

SEO workflows usually care about referring domains, prospects, competitors, or publisher websites. A normalized root domain is easier to group, filter, enrich, and compare across tools.

Make Reporting Consistent

Without normalization, http://example.com, https://www.example.com/, and https://blog.example.com/post can appear as separate records. Normalization collapses them into one value.
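As a rough illustration of that collapse, the sketch below counts records before and after normalization for the three URLs mentioned above. The helper last_two_labels is a hypothetical stand-in that is only safe here because every sample URL ends in a single-label suffix (.com); real data needs suffix-aware parsing.

```python
from collections import Counter
from urllib.parse import urlsplit


def last_two_labels(url: str) -> str:
    """Hypothetical stand-in normalizer: valid only for single-label suffixes like .com."""
    hostname = urlsplit(url).hostname or ""
    return ".".join(hostname.split(".")[-2:])


urls = [
    "http://example.com",
    "https://www.example.com/",
    "https://blog.example.com/post",
]

# Without normalization every URL string is its own record.
print(len(set(urls)))  # 3

# With normalization all three records share one key.
site_counts = Counter(last_two_labels(u) for u in urls)
print(site_counts)  # Counter({'example.com': 3})
```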

Root Domain Normalization Rules

Step	Input Part	Action
1	Protocol	Remove http, https, and ftp prefixes.
2	Path and filename	Remove everything after the hostname.
3	Query string and fragment	Remove tracking parameters, IDs, and hash fragments.
4	www prefix	Normalize www.example.com to example.com.
5	Subdomains	Strip subdomains unless you intentionally need hostnames.
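The five steps can be sketched with Python's standard library. This is a minimal illustration, not the tool's actual implementation: the function name normalize_root_domain and the tiny SAMPLE_MULTI_LABEL_SUFFIXES set are assumptions made for this example, and a real workflow should load the full Public Suffix List instead.

```python
from urllib.parse import urlsplit

# Hypothetical, deliberately tiny sample of multi-label public suffixes.
# A real implementation should use the full Public Suffix List.
SAMPLE_MULTI_LABEL_SUFFIXES = {"co.uk", "com.au", "co.jp"}


def normalize_root_domain(url: str) -> str:
    """Apply steps 1-5 from the table to a single URL."""
    # Let urlsplit treat a bare domain such as "example.com/page" as a host.
    if "://" not in url:
        url = "//" + url

    # Steps 1-3: keeping only .hostname discards the scheme, path, query,
    # fragment, and any port. urlsplit also lowercases the hostname.
    hostname = (urlsplit(url).hostname or "").rstrip(".")

    # Steps 4-5: keep the suffix plus one label, so www and other
    # subdomains fall away together.
    labels = hostname.split(".")
    suffix_len = 2 if ".".join(labels[-2:]) in SAMPLE_MULTI_LABEL_SUFFIXES else 1
    return ".".join(labels[-(suffix_len + 1):])


print(normalize_root_domain("https://www.example.com/blog/post?utm_source=email"))  # example.com
print(normalize_root_domain("https://news.bbc.co.uk/sport/football"))  # bbc.co.uk
```

Treating www as just another subdomain keeps steps 4 and 5 in one code path. If you need to keep hostnames, stop after the hostname line and strip only a leading www.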

Before and After Examples

URL	Normalized Root Domain
`https://www.example.com/blog/post?utm_source=email`	example.com
`http://shop.example.com/products/123#reviews`	example.com
`https://news.bbc.co.uk/sport/football`	bbc.co.uk
`https://docs.github.com/en/actions?query=test`	github.com
`https://mail.yahoo.co.jp/login`	yahoo.co.jp

How to Extract and Normalize Root Domains

1. Use a Free URL to Domain Tool

Paste one or more URLs into the tool, click Extract Domains, and copy or export the normalized root domains. This is the fastest option for audits, spreadsheets, and one-off cleanup tasks.

Paste URLs, one per line or mixed inside text.
Run the extractor.
Review the normalized root domains.
Copy the results or export them as CSV.

2. JavaScript Approach

The built-in URL API can isolate the hostname. For production root-domain logic, pair it with a Public Suffix List parser.

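const input = 'https://www.blog.example.co.uk/path?utm_source=email';

// The URL API drops the scheme, path, query, and fragment.
const hostname = new URL(input).hostname.replace(/^www\./, '');
// hostname -> 'blog.example.co.uk'

// Naive parser: only correct for single-label suffixes such as .com.
const parts = hostname.split('.');
const simpleRoot = parts.slice(-2).join('.');
// simpleRoot -> 'co.uk' (wrong for this input)

// Use a PSL-aware library for real datasets:
// psl.parse(hostname).domain -> 'example.co.uk'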

3. Python Approach

In Python, tldextract is a common choice because it handles public suffixes.

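import tldextract

url = 'https://www.blog.example.co.uk/path?utm_source=email'
parts = tldextract.extract(url)  # subdomain='www.blog', domain='example', suffix='co.uk'

# Join the registered name and the public suffix; the subdomain is discarded.
# Inputs without a recognized suffix (for example localhost) keep only the domain part.
root_domain = f'{parts.domain}.{parts.suffix}' if parts.suffix else parts.domain

print(root_domain)  # example.co.uk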

Edge Cases to Watch For

Multi-Part Suffixes

Domains like example.co.uk and example.com.au need suffix-aware parsing. Splitting on dots and taking the last two parts is not enough.
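The difference can be seen by running both approaches side by side. The sketch below assumes a small hand-written SAMPLE_SUFFIXES set and a hypothetical suffix_aware_root helper that picks the longest matching suffix; the real Public Suffix List also has wildcard and exception rules that this sketch ignores.

```python
# Hypothetical sample of public suffixes; the real list has thousands of entries.
SAMPLE_SUFFIXES = {"com", "uk", "co.uk", "au", "com.au"}


def naive_root(hostname: str) -> str:
    """Split on dots and keep the last two labels."""
    return ".".join(hostname.split(".")[-2:])


def suffix_aware_root(hostname: str) -> str:
    """Keep the longest matching suffix plus the one label to its left."""
    labels = hostname.lower().split(".")
    # Scan from the longest candidate suffix to the shortest so co.uk wins over uk.
    for start in range(1, len(labels)):
        if ".".join(labels[start:]) in SAMPLE_SUFFIXES:
            return ".".join(labels[start - 1:])
    return hostname  # no known suffix: leave the hostname unchanged


for host in ["www.blog.example.co.uk", "shop.example.com.au", "blog.example.com"]:
    print(host, "->", naive_root(host), "vs", suffix_aware_root(host))
# www.blog.example.co.uk -> co.uk vs example.co.uk
# shop.example.com.au -> com.au vs example.com.au
# blog.example.com -> example.com vs example.com
```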

Subdomains That Matter

For SEO deduplication, root domains are usually best. For security review, app inventory, or hosting analysis, you may need to preserve full hostnames such as api.example.com.
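One way to support both needs is to store the hostname and the root domain as separate columns. The sketch below is an assumption about how such an export could look: build_rows and the column names are hypothetical, and last_two_labels is a stand-in that should be replaced with a suffix-aware function for real data.

```python
import csv
import io
from urllib.parse import urlsplit


def build_rows(urls, root_of):
    """Return one record per URL with both the hostname and the root domain."""
    rows = []
    for url in urls:
        hostname = urlsplit(url).hostname or ""
        rows.append({"url": url, "hostname": hostname, "root_domain": root_of(hostname)})
    return rows


def last_two_labels(hostname: str) -> str:
    """Hypothetical stand-in: only correct for single-label suffixes like .com."""
    return ".".join(hostname.split(".")[-2:])


rows = build_rows(
    ["https://api.example.com/v1/users", "https://www.example.com/pricing"],
    last_two_labels,
)

# Write the records as CSV so both columns travel together.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["url", "hostname", "root_domain"])
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

Grouping by root_domain then serves deduplication, while the hostname column still distinguishes api.example.com from www.example.com.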

Malformed URLs

Real datasets often include missing protocols, trailing punctuation, copied email text, or markdown links. A good extractor should handle plain domains and URLs embedded in text, not only perfectly formatted links.
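A tolerant first pass can pull hostname-like tokens out of free text before any root-domain logic runs. The pattern below is a hypothetical sketch, not a full URL grammar: it assumes ASCII hostnames with an alphabetic top-level label, and it treats the domain part of an email address as a hostname.

```python
import re

# Hypothetical pattern for hostname-like tokens in messy text.
HOST_PATTERN = re.compile(
    r"(?:[a-z][a-z0-9+.-]*://)?"                           # optional scheme
    r"((?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z]{2,})"   # hostname (captured)
    r"(?![a-z0-9@-])"                                      # stop at a label boundary, never just before '@'
    r"(?::\d+)?"                                           # optional port
    r"(?:[/?#]\S*)?",                                      # optional path, query, fragment (consumed, not captured)
    re.IGNORECASE,
)


def extract_hostnames(text: str) -> list[str]:
    """Return lowercased hostnames found in text, in order of appearance."""
    return [match.group(1).lower() for match in HOST_PATTERN.finditer(text)]


sample = (
    "Visit example.com. Docs: [guide](https://docs.github.com/en/actions?query=test), "
    "or mail jane.doe@shop.example.co.uk"
)
print(extract_hostnames(sample))
# ['example.com', 'docs.github.com', 'shop.example.co.uk']
```

Consuming the path as part of each match keeps file names such as page.html from being mistaken for hostnames.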

Try the Free URL to Domain Tool

Paste full URLs into the tool to convert them into normalized root domains. The tool runs in your browser and supports bulk input.


Frequently Asked Questions

What is the difference between a hostname and a root domain?

A hostname can include subdomains, such as blog.example.com. The root domain is the registered domain, such as example.com.

Does normalization remove tracking parameters?

Yes. Root domain extraction keeps only the domain, so the entire query string is removed, including tracking parameters such as utm_source and fbclid.

Is root domain normalization safe for all use cases?

It is ideal for deduplication, SEO reporting, and prospect lists. If your workflow depends on individual hosts or subdomains, keep a hostname column alongside the root domain column.
