Unfake Text
Convert homoglyphs and look-alike characters back to normal Latin text.
What is Homoglyph Detection?
Homoglyph detection is the process of identifying look-alike Unicode characters and converting them back to their standard Latin equivalents. This tool scans text for Cyrillic, Greek, Fullwidth, and Roman numeral characters that visually resemble Latin letters but have different Unicode code points, which is useful for detecting phishing attacks and cleaning obfuscated data.
For example, gοοglе.cοm (with Greek ο and Cyrillic е) is normalized to google.com (all Latin).
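To see why the two strings in the example above differ even though they look alike, it can help to print each character's code point and Unicode name. The following is a minimal Python sketch, not the tool's own code (which runs as JavaScript in the browser); the helper name `describe` is hypothetical.

```python
import unicodedata

def describe(text: str) -> list[tuple[str, str, str]]:
    """Return (character, code point, Unicode name) for each character."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in text
    ]

# "google" spelled with two Greek omicrons and a Cyrillic ie,
# written with escapes so the source itself is unambiguous.
suspicious = "g\u03bf\u03bfgl\u0435"

for ch, code_point, name in describe(suspicious):
    print(ch, code_point, name)
```

The second and third characters are reported as GREEK SMALL LETTER OMICRON (U+03BF) and the last as CYRILLIC SMALL LETTER IE (U+0435), while the remaining letters are plain Latin.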
Features
Auto-Detection
Automatically detects Cyrillic, Greek, Fullwidth, and Roman numeral homoglyphs.
Detection Stats
See exactly how many homoglyphs were found and the detection rate.
Comparison Mode
Side-by-side view of fake text vs. normalized Latin text.
File Upload
Load files and download normalized versions for cleaning datasets.
Batch Processing
Process multiple lines independently for URL lists or datasets.
Security Focus
Detect phishing domains and prevent homoglyph-based attacks.
Common Use Cases
Security Protection
Detect phishing URLs, deceptive domain names, and homoglyph-based social engineering attacks.
Data Cleaning
Normalize user input, clean database records, and ensure text matching works correctly in search systems.
Quality Assurance
Verify text authenticity, prevent filter bypass attempts, and maintain data integrity across systems.
How to use
- Input: Paste text that may contain homoglyphs (e.g., a suspicious URL).
- Detect: The tool instantly scans the text and converts every recognized homoglyph to its Latin equivalent.
- Review: View detection stats to see how many fakes were found.
- Copy/Download: Use the cleaned, normalized text for security checks.
Example - Phishing URL Detection
- Cyrillic 'р' (U+0440) → Latin 'p' (U+0070)
- Cyrillic 'а' (U+0430) → Latin 'a' (U+0061)
- Detection Rate: 28.6% (2 out of 7 chars)
- ⚠️ Warning: Fake domain detected!
Frequently Asked Questions
How does homoglyph detection work?
The tool scans each character in your text and checks if it's a homoglyph (look-alike character from Cyrillic, Greek, or Fullwidth Unicode). If detected, it replaces the homoglyph with its standard Latin equivalent. For example, Cyrillic 'о' (U+043E) is converted to Latin 'o' (U+006F). Detection stats show exactly how many characters were normalized.
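The scan-and-replace loop described above can be sketched roughly as follows. This is an illustrative Python version under stated assumptions: the `HOMOGLYPHS` table is a small hypothetical subset, and the function name `unfake` is invented here; neither is taken from the tool's actual source.

```python
# Hypothetical lookup table: a small subset for illustration only.
HOMOGLYPHS = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u0445": "x",  # CYRILLIC SMALL LETTER HA
    "\u03bf": "o",  # GREEK SMALL LETTER OMICRON
}

def unfake(text: str) -> tuple[str, int]:
    """Replace known homoglyphs; return (normalized text, replacement count)."""
    out = []
    replaced = 0
    for ch in text:
        if ch in HOMOGLYPHS:
            out.append(HOMOGLYPHS[ch])
            replaced += 1
        else:
            out.append(ch)  # leave everything else untouched
    return "".join(out), replaced

clean, count = unfake("g\u03bf\u03bfgl\u0435.c\u03bfm")
print(clean, count)  # google.com 4
```

Returning the count alongside the cleaned text is what makes the detection stats possible without a second pass over the input.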
What homoglyphs are detected?
The tool detects common homoglyphs including: (1) Cyrillic - а, е, о, р, с, т, х (and uppercase). (2) Greek - α, ο, ρ, ν, κ (and uppercase). (3) Fullwidth Unicode - ａ, ｂ, ｃ, etc. (4) Roman numerals - ⅰ, ⅼ, ⅴ, ⅹ. These are the most commonly used in phishing attacks and text obfuscation.
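One way to assemble a lookup table for the four groups listed above is shown below. It is a sketch, not the tool's real table: the Latin targets chosen for each character are assumptions, only lowercase Cyrillic, Greek, and Roman numeral entries are included, and the names `TABLE` and `normalize` are hypothetical. The Fullwidth letters are generated rather than listed, because they sit at a fixed offset of 0xFEE0 from their ASCII counterparts.

```python
import string

# Assumed Latin targets for the characters named in the answer above.
CYRILLIC = {"\u0430": "a", "\u0435": "e", "\u043e": "o", "\u0440": "p",
            "\u0441": "c", "\u0442": "t", "\u0445": "x"}
GREEK = {"\u03b1": "a", "\u03bf": "o", "\u03c1": "p",
         "\u03bd": "v", "\u03ba": "k"}
ROMAN = {"\u2170": "i", "\u217c": "l", "\u2174": "v", "\u2179": "x"}

# Fullwidth A-Z and a-z: each is its ASCII code point plus 0xFEE0.
FULLWIDTH = {chr(ord(c) + 0xFEE0): c for c in string.ascii_letters}

# str.maketrans accepts a dict of single characters to replacement strings.
TABLE = str.maketrans({**CYRILLIC, **GREEK, **ROMAN, **FULLWIDTH})

def normalize(text: str) -> str:
    """Map every character found in TABLE to its Latin equivalent."""
    return text.translate(TABLE)

print(normalize("\uff41\uff42\uff43"))  # abc
```

Using `str.translate` keeps the replacement to a single pass, at the cost of not reporting how many characters changed.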
Why would I need to unfake text?
Common use cases include: (1) Security - Detecting phishing URLs (e.g., аpple.com vs apple.com). (2) Data cleaning - Normalizing user input before database storage. (3) Search accuracy - Ensuring search queries match records. (4) Compliance - Preventing homoglyph-based filter evasion. (5) Quality assurance - Verifying text authenticity in content moderation.
Can this detect all Unicode variations?
No. This tool detects common homoglyphs used in 95%+ of real-world attacks and obfuscation cases, covering Cyrillic, Greek, Fullwidth, and Roman numerals. Unicode has thousands of look-alike characters, and this tool focuses on the most practical and frequently encountered ones.
What does the detection rate percentage mean?
Detection rate shows what percentage of the characters in your text were homoglyphs. For example, '50%' means half of the characters were fake lookalikes. 0% means the text is clean (all normal Latin). 100% means every character was a homoglyph (highly suspicious!).
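The percentage can be reproduced with a short calculation: homoglyph count divided by total character count. The Python sketch below assumes every character (including punctuation) counts toward the total, which the page does not state explicitly; `SAMPLE_HOMOGLYPHS` and `detection_rate` are hypothetical names, and the input string is made up for illustration.

```python
# Hypothetical set of characters treated as homoglyphs in this sketch.
SAMPLE_HOMOGLYPHS = {"\u0430", "\u0435", "\u043e", "\u0440", "\u0441"}

def detection_rate(text: str, homoglyphs: set[str]) -> float:
    """Percentage of characters in text that are homoglyphs, to one decimal."""
    if not text:
        return 0.0  # avoid dividing by zero on empty input
    hits = sum(1 for ch in text if ch in homoglyphs)
    return round(100 * hits / len(text), 1)

# Seven characters, two of them Cyrillic (er and a): 2 / 7 = 28.6%
print(detection_rate("\u0440\u0430yment", SAMPLE_HOMOGLYPHS))  # 28.6
```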
Does unfaking change the meaning of text?
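Usually not. Homoglyphs look nearly identical to their Latin equivalents, so normalizing spoofed text changes only the underlying code points, not what a reader sees. For instance, 'gοοgle' (with Greek ο) becomes 'google' (with Latin o): same visual appearance, different Unicode. The exception is text genuinely written in Cyrillic or Greek, where replacing letters with Latin look-alikes would corrupt real words.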
Can I use batch mode to check multiple lines?
Yes! Enable 'Batch Mode' to process each line independently. This is useful when checking lists of URLs, usernames, or domain names. Each line is analyzed separately while preserving the line structure.
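Per-line processing of this kind might look like the following Python sketch. It assumes lines are separated by "\n" and that empty lines are kept as-is so the output lines up with the input; the `TABLE` subset and the function name `unfake_lines` are hypothetical.

```python
# Hypothetical subset of a homoglyph table, for illustration only.
TABLE = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u03bf": "o",  # GREEK SMALL LETTER OMICRON
}

def unfake_lines(text: str) -> list[tuple[str, int]]:
    """Normalize each line separately; return (clean line, homoglyph count)."""
    results = []
    for line in text.split("\n"):  # split("\n") keeps empty lines
        hits = sum(1 for ch in line if ch in TABLE)
        clean = "".join(TABLE.get(ch, ch) for ch in line)
        results.append((clean, hits))
    return results

batch = "\u0430pple.com\ngoogle.com\n\nex\u0430mple.org"
for clean, hits in unfake_lines(batch):
    print(f"{hits} homoglyph(s): {clean}")
```

Keeping a separate count per line makes it easy to flag only the suspicious entries in a long list of URLs or usernames.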
Can I upload a file to check for homoglyphs?
Yes! Click 'Upload' to load a .txt, .md, or .csv file. The tool will scan and normalize all homoglyphs across the entire file. You can then download the cleaned version.
How is this different from Unicode normalization (NFC/NFD)?
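Unicode normalization (NFC/NFD/NFKC/NFKD) handles combining characters and decomposition. The compatibility forms (NFKC/NFKD) also fold Fullwidth letters and Roman numeral characters to plain ASCII, but no normalization form maps Cyrillic or Greek letters to Latin. This tool specifically handles confusable homoglyphs: characters from different scripts that look the same but have different code points. The two mostly solve different problems; this tool is for security and deception detection.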
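The difference is easy to check with Python's standard `unicodedata` module. In this short sketch (the wrapper name `nfkc` is hypothetical), NFKC folds Fullwidth letters and a Roman numeral character to ASCII, while cross-script look-alikes pass through unchanged.

```python
import unicodedata

def nfkc(text: str) -> str:
    """Apply Unicode NFKC compatibility normalization."""
    return unicodedata.normalize("NFKC", text)

fullwidth = "\uff47\uff4f\uff4f\uff47\uff4c\uff45"  # "google" in Fullwidth letters
cross_script = "g\u03bf\u03bfgl\u0435"  # Greek omicrons and Cyrillic ie

print(nfkc(fullwidth))                     # google
print(nfkc("\u2170"))                      # i (SMALL ROMAN NUMERAL ONE)
print(nfkc(cross_script) == cross_script)  # True: NFKC leaves these alone
```

So NFKC alone is not enough to catch a domain spoofed with Cyrillic or Greek letters; a confusables mapping is still needed for those.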
Is my text sent to your server for processing?
No. All homoglyph detection and normalization happens entirely in your browser using JavaScript. We never see, store, or transmit your text. This makes it safe to check sensitive URLs, passwords, or confidential content.