CascadiaPHP 2024

A classe Spoofchecker

(PHP 5 >= 5.4.0, PHP 7, PHP 8, PECL intl >= 2.0.0)

Introdução

Esta classe é fornecida porque o Unicode contém um grande número de caractares e incorpora sistemas variados de escrita de todo o mundo e seu uso incorreto pode expor programas ou sistemas a possíveis ataques à segurança usando similaridade de caracteres.

Os métodos fornecidos permitem verificar se uma string individual pode ser uma tentativa de confundir o leitor (spoof detection), como em "pаypаl" escrito com caracteres cirílicos 'а'.

Resumo da classe

class Spoofchecker {
/* Constantes */
public const int ANY_CASE;
public const int SINGLE_SCRIPT;
public const int INVISIBLE;
public const int CHAR_LIMIT;
public const int ASCII;
public const int HIGHLY_RESTRICTIVE;
public const int UNRESTRICTIVE;
public const int MIXED_NUMBERS;
public const int HIDDEN_OVERLAY;
/* Métodos */
public __construct()
public areConfusable(string $string1, string $string2, int &$errorCode = null): bool
public isSuspicious(string $string, int &$errorCode = null): bool
public setAllowedLocales(string $locales): void
public setChecks(int $checks): void
public setRestrictionLevel(int $level): void
}

Índice

add a note

User Contributed Notes 2 notes

up
3
Anonymous
7 years ago
From http://icu-project.org/apiref/icu4j/com/ibm/icu/text/SpoofChecker.html :
SINGLE_SCRIPT_CONFUSABLE: indicates that the two strings are visually confusable and that they are from the same script
MIXED_SCRIPT_CONFUSABLE: indicates that the two strings are visually confusable and that they are NOT from the same script
WHOLE_SCRIPT_CONFUSABLE: indicates that the two strings are visually confusable and that they are NOT from the same script BUT both of them are single-script strings
ANY_CASE: Deprecated.
SINGLE_SCRIPT: Deprecated.
INVISIBLE: Check an identifier for the presence of invisible characters, such as zero-width spaces, or character sequences that are likely not to display, such as multiple occurrences of the same non-spacing mark.
CHAR_LIMIT: Check that an identifier contains only characters from a specified set of acceptable characters.

Explanation of whole script, mixed script and single script confusables in UTS 39 section 4 : http://unicode.org/reports/tr39/#Confusable_Detection

Details from Java SpoofChecker class at http://icu-project.org/apiref/icu4j/com/ibm/icu/text/SpoofChecker.html
up
-1
Anonymous
6 years ago
Spoofchecker yields false positives by defaut when Whole-Script Confusables (WSC) and Mixed-Script Confusables (MSC) checks are used.
They have been deprecated since ICU 58:
http://bugs.icu-project.org/trac/ticket/12549#comment:10

Workarounds: upgrade ICU to 58+, or avoid the MSC and WSC checks with Spoofcheckers' setChecks() function.
To Top