How to find the length of a string in PHP

mb_strlen

The encoding parameter is the character encoding. If it is omitted or null, the internal character encoding is used.

Return values

Returns the number of characters in the string string, interpreted in the character encoding encoding. A multi-byte character is counted as 1.
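A minimal sketch of the difference between counting characters and counting bytes. It assumes the mbstring extension is loaded; the sample word is only an illustration and is written with a \u{...} escape (PHP 7.0 and later) so the result does not depend on how the source file is saved.

```php
<?php
// "naïve": 5 characters; "ï" (U+00EF) takes 2 bytes in UTF-8.
$word = "na\u{00EF}ve";

$chars = mb_strlen($word, 'UTF-8'); // counts characters in the given encoding
$bytes = strlen($word);             // counts bytes

echo "characters: $chars, bytes: $bytes\n"; // characters: 5, bytes: 6
```

Passing the encoding explicitly keeps the result independent of the configured internal encoding.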

Errors

If the encoding is unknown, an error of level E_WARNING is generated. As of PHP 8.0.0, an unknown encoding throws a ValueError instead.
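One way to handle an unknown encoding without depending on the PHP version is sketched below. The helper name lengthInEncoding is hypothetical. It assumes that older versions return false with a warning and that PHP 8.0 and later throw a ValueError.

```php
<?php
// Hypothetical helper: returns the character count, or null if the
// encoding name is not recognised by mbstring.
function lengthInEncoding(string $s, string $encoding): ?int
{
    try {
        // "@" suppresses the E_WARNING raised by PHP versions before 8.0.
        $length = @mb_strlen($s, $encoding);
    } catch (\ValueError $e) {
        // PHP 8.0 and later throw instead of warning.
        return null;
    }
    return is_int($length) ? $length : null;
}

var_dump(lengthInEncoding('abc', 'UTF-8'));            // int(3)
var_dump(lengthInEncoding('abc', 'no-such-encoding')); // NULL
```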

Changelog

Version Description
8.0.0 The encoding parameter is now nullable.

See also

  • mb_internal_encoding(): sets or gets the internal character encoding
  • grapheme_strlen(): gets the string length in grapheme units
  • iconv_strlen(): returns the character count of a string
  • strlen(): returns the string length

User Contributed Notes 7 notes

12 years ago

If you are unsure about what $encoding can be set to, here’s a full list of all the encodings supported by this extension:
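The supported encodings can also be listed at runtime. A minimal sketch using mb_list_encodings(), assuming the mbstring extension is loaded (the exact list depends on the PHP build):

```php
<?php
$encodings = mb_list_encodings(); // array of encoding names known to mbstring

echo count($encodings), " encodings supported\n";
echo implode(', ', array_slice($encodings, 0, 10)), ", ...\n";

// Check for a specific name before passing it to mb_strlen().
var_dump(in_array('UTF-8', $encodings, true)); // bool(true)
```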

16 years ago

Speed of mb_strlen varies a lot according to specified character set.

If you need the length of a string in bytes (strlen cannot be trusted anymore because of mbstring.func_overload) you should use mb_strlen($string, '8bit').
It's the fastest way (still way slower than strlen, though) to determine the byte length of a string. Other single-byte character sets (ASCII, ISO-8859-1, ...) are several times slower than 8bit.
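A short sketch of the byte count via the 8bit encoding, assuming mbstring is loaded. Note that mbstring.func_overload was removed in PHP 8.0, so on current versions plain strlen() always counts bytes as well.

```php
<?php
$text = "\u{00E5}\u{00E8}\u{00E4}"; // "åèä", 2 bytes each in UTF-8

$byteLength = mb_strlen($text, '8bit');  // 8bit treats every byte as one character
$charLength = mb_strlen($text, 'UTF-8'); // counts characters

echo "$byteLength bytes, $charLength characters\n"; // 6 bytes, 3 characters
```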

16 years ago

Just did a little benchmarking (1.000.000 times with lorem ipsum text) on the mbs functions

especially mb_strtolower and mb_strtoupper are really slow (up to 100 times slower compared to normal functions). Other functions are alike-ish, but sometimes up to 5 times slower.

just be cautious when using mb_ functions in high frequented scripts.

# test runs: 1000000
# benchmarking strlen vs. mb_strlen
# normal strlen: 3.6795361042023 ms, average: 3.6795361042023E-6 ms
# mb_strlen: 5.5934538841248 ms, average: 5.5934538841248E-6 ms
ok 1 — mb_strlen is slower than strlen
# mb_strlen is 1.52 slower than strlen
#
#
# benchmarking strpos vs. mb_strpos
# normal strpos: 5.5523281097412 ms, average: 5.5523281097412E-6 ms
# mb_strlen: 31.180974960327 ms, average: 3.1180974960327E-5 ms
ok 2 — mb_strlen is slower than strlen
# mb_strpos is 5.62 slower than strpos
#
#
# benchmarking substr vs. mb_substr
# normal substr: 3.4437320232391 ms, average: 3.4437320232391E-6 ms
# mb_strlen: 3.5374181270599 ms, average: 3.5374181270599E-6 ms
ok 3 — mb_strlen is slower than strlen
# mb_substr is 1.03 slower than substr
#
#
# benchmarking strtolower vs. mb_strtolower
# normal strtolower: 4.446839094162 ms, average: 4.446839094162E-6 ms
# mb_strlen: 193.44901108742 ms, average: 0.00019344901108742 ms
ok 4 — mb_strlen is slower than strlen
# mb_strtolower is 43.5 slower than strtolower
#
#
# benchmarking strtoupper vs. mb_strtoupper
# normal strtoupper: 3.0210740566254 ms, average: 3.0210740566254E-6 ms
# mb_strlen: 340.71775603294 ms, average: 0.00034071775603294 ms
ok 5 — mb_strlen is slower than strlen
# mb_strtoupper is 112.78 slower than strtoupper

4 years ago

It may not be clear whether PHP actually supports utf-8, which is the current de facto standard character encoding for Web documents, which supports most human languages. The good news is: it does.

I wrote a test program which successfully reads in a utf-8 file (without BOM) and manipulates the characters using mb_substr, mb_strlen, and mb_strpos (mb_substr should normally be avoided, as it must always start its search at character position 0).

The results with a variety of Unicode test characters in utf-8 encoding, up to four bytes in length, were mostly correct, except that accent marks were always mistakenly treated as separate characters instead of being combined with the previous character; this problem can be worked around by programming, when necessary.

15 years ago

If you find yourself without the mb string functions and can't easily change it, a quick hack replacement for mb_strlen for utf8 characters is to use a PCRE regex with utf8 turned on.

This is basically an ugly hack which counts all single character matches, and I’d expect it to be painfully slow on large strings.
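The regex approach described in this note could look roughly like the sketch below. The function name utf8_length_fallback is hypothetical, and this is not necessarily the code the note's author used.

```php
<?php
// Hypothetical fallback for mb_strlen($s, 'UTF-8') when mbstring is unavailable.
function utf8_length_fallback(string $s): ?int
{
    // /u treats pattern and subject as UTF-8; /s lets "." match newlines too.
    $count = preg_match_all('/./us', $s);

    // preg_match_all() returns false on failure, e.g. for invalid UTF-8 input.
    return $count === false ? null : $count;
}

var_dump(utf8_length_fallback("a\u{00E5}\n\u{1ECD}")); // int(4)
var_dump(utf8_length_fallback("\xFF"));                // NULL
```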

16 years ago

Thank you Peter Albertsson for presenting that!

After spending more than eight hours tracking down two specific bugs in my mbstring-func_overloaded environment I have learned a very important lesson:

Many developers rely on strlen to give the amount of bytes in a string. While mb-overloading has very many advantages, the most hard-spotted pitfall must be this issue.

Two examples (from the two bugs found earlier):

1. Writing a string to a file:

<?php
$str = "string with utf-8 chars åèä - doo-bee doo-bee dooh";
$fp = fopen($this->_file, "wb");
if ($fp) {
    $len = strlen($str);
    fwrite($fp, $str, $len);
}
?>

PS: this was found in the PEAR::Cache_Lite package (Lite.php) and has been reported.

2. Iterating through a string’s characters:

<?php
$str = "string with utf-8 chars åèö - doo-bee doo-bee dooh";
$newStr = "";
for ($i = 0; $i < strlen($str); $i++) {
    $newStr .= $str[$i];
}
?>

Both of these situations will fail to save / store the last characters in $str. This can be very hard to spot and can be especially fatal for, say, serialized strings, XML, etc.

So, try to avoid these situations to support overloaded environments, and remember Peter Albertsson's remark if you find problems under such an environment.

17 years ago

I have been working with some funny html characters lately and, due to the nightmare of manipulating them between mysql and php, I set the database column to utf8, stored characters with the html entity "ọ" as ọ in the database, and set the encoding on php to "utf8".

This is where mb_strlen became more useful than strlen. While strlen('ọ') gives 3, mb_strlen('ọ', 'UTF-8') gives 1 as expected.

But left(column1,1) in mysql still gives wrong char for a multibyte string. In the example above, I had to do left(column1,3) to get the correct string from mysql. I am now about to investigate multibyte manipulation in mysql.
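On the PHP side, the same byte-versus-character distinction shows up when taking the leftmost character. A small sketch, assuming mbstring is loaded; it illustrates PHP behaviour only and says nothing about how MySQL handles the column.

```php
<?php
$value = "\u{1ECD}bc"; // "ọbc"; "ọ" (U+1ECD) is 3 bytes in UTF-8

$firstByte = substr($value, 0, 1);             // 1 byte: an incomplete character
$firstChar = mb_substr($value, 0, 1, 'UTF-8'); // 1 character: "ọ"

var_dump(strlen($firstByte));                     // int(1)
var_dump($firstChar === "\u{1ECD}");              // bool(true)
var_dump(mb_check_encoding($firstByte, 'UTF-8')); // bool(false)
```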

    How to find the length of a string in PHP: strlen, mb_strlen

    In today's article we will look at how to find the length of a string in PHP.

    Whenever we work with string variables, or with other string-related values in PHP, we often need to know the string's length for further processing. The task looks trivial, but there is one important nuance: the encoding. PHP's string length functions can return different values, and the result depends on the encoding your site uses.

    PHP has several built-in functions that determine the size of a string, and we will go through them here.

    The first function is strlen. It works very simply: it takes a single required parameter, the string, and returns its length.

    php> echo strlen('Hellow World');
    12

    The function counted the length of the string and returned the result. As mentioned, there is one very important nuance when working with this function.

    In the next example the text is the same, but in Russian.

    php> echo strlen('Привет мир');
    19

    The result looks odd. We would expect 10, but we get 19. It turns out that strlen does not count characters the way we are used to; it counts the bytes in the string. In UTF-8 each Cyrillic letter takes 2 bytes and the space takes 1 byte, so 9 letters and one space give 9 × 2 + 1 = 19.

    Keep this nuance in mind when working with this function.
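The byte arithmetic can be checked per character. This is a sketch that assumes the source file is saved as UTF-8, mbstring is loaded, and PHP is 7.4 or later (for mb_str_split).

```php
<?php
$text = 'Привет мир';

// Print how many bytes each character occupies in UTF-8.
foreach (mb_str_split($text, 1, 'UTF-8') as $char) {
    printf("'%s': %d byte(s)\n", $char, strlen($char));
}

$bytes = strlen($text);             // 19: nine 2-byte letters plus a 1-byte space
$chars = mb_strlen($text, 'UTF-8'); // 10

echo "$bytes bytes, $chars characters\n";
```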

    The next function is mb_strlen. It takes two parameters: the first, required, is the string; the second, optional, is the encoding. Unlike strlen, mb_strlen counts characters the same way for English and Russian text: even if a character occupies several bytes, it is counted as one character.

    php> echo mb_strlen("Hellow World");
    12

    php> echo mb_strlen("Привет мир");
    10

    Thanks, everyone; I hope this article was of some help.

    strlen

    Note:

    strlen() returns the number of bytes rather than the number of characters in a string.

    See also

    • count(): counts all elements in an array or in a Countable object
    • grapheme_strlen(): gets the string length in grapheme units
    • iconv_strlen(): returns the character count of a string
    • mb_strlen(): gets the string length

    User Contributed Notes 7 notes

    8 years ago

    I want to share something seriously important for beginners in PHP who work with UTF-8 encoded strings in languages such as Arabic, Persian, Pashto, Dari, Chinese (simplified), Chinese (traditional), Japanese, Vietnamese, Urdu, Macedonian, Lithuanian, etc.
    As the manual says: "strlen() returns the number of bytes rather than the number of characters in a string.", so if you want the number of characters in a UTF-8 string, use mb_strlen() instead of strlen().

    <?php
    // the Arabic (Hello) string below is: 59 bytes and 32 characters
    $utf8 = "السلام علیکم ورحمة الله وبرکاته!";

    var_export(strlen($utf8)); // 59
    echo "\n";
    var_export(mb_strlen($utf8, 'utf8')); // 32
    ?>

    7 years ago

    When checking for length to make sure a value will fit in a database field, be mindful of using the right function.

    There are three possible situations:

    1. Most likely case: the database column is UTF-8 with a length defined in unicode code points (e.g. mysql varchar(200) for a utf-8 database).

    <?php
    // ok if php.ini default_charset set to UTF-8 (= default value)
    mb_strlen($value);
    iconv_strlen($value);
    // always ok
    mb_strlen($value, "UTF-8");
    iconv_strlen($value, "UTF-8");

    // BAD, do not use:
    strlen(utf8_decode($value)); // breaks for some multi-byte characters
    grapheme_strlen($value); // counts graphemes, not code points
    ?>

    2. The database column has a length defined in bytes (e.g. oracle’s VARCHAR2(200 BYTE))

    <?php
    // ok, but assumes mbstring.func_overload is 0 in php.ini (= default value)
    strlen($value);
    // ok, forces count in bytes
    mb_strlen($value, "8bit");
    ?>

    3. The database column is in another character set (UTF-16, ISO-8859-1, etc. ) with a length defined in characters / code points.

    Find the character set used, and pass it explicitly to the length function.
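The three situations above can be folded into one check. The helper below is a hypothetical sketch (the name fitsColumn and its parameters are not from any library); it assumes mbstring is loaded and that the source file is saved as UTF-8.

```php
<?php
/**
 * Hypothetical helper: does $value fit into a column of $max units?
 * $unit is 'chars' (code points in $charset) or 'bytes'.
 */
function fitsColumn(string $value, int $max, string $unit = 'chars', string $charset = 'UTF-8'): bool
{
    $length = $unit === 'bytes'
        ? mb_strlen($value, '8bit')    // byte count
        : mb_strlen($value, $charset); // code point count in the given charset

    return $length <= $max;
}

var_dump(fitsColumn('Привет', 6));          // bool(true): 6 code points
var_dump(fitsColumn('Привет', 6, 'bytes')); // bool(false): 12 bytes in UTF-8
```

Passing the charset explicitly avoids depending on the php.ini defaults mentioned in the note.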

    10 years ago

    PHP's strlen function behaves differently than the C strlen function in terms of its handling of null bytes ('\0').

    In PHP, a null byte in a string does NOT count as the end of the string, and any null bytes are included in the length of the string.

    For example, in PHP:

    strlen("te\0st") = 5

    In C, the same call would return 2.

    Thus, PHP’s strlen function can be used to find the number of bytes in a binary string (for example, binary data returned by base64_decode).
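A short sketch of both points: null bytes are counted, and strlen() gives the byte size of decoded binary data. The base64 sample string is only an illustration.

```php
<?php
$withNull = "te\0st";
$lengthWithNull = strlen($withNull); // 5: the null byte is counted like any other byte

$binary = base64_decode('AAECAwQ='); // the bytes 00 01 02 03 04
$binaryLength = strlen($binary);     // 5

echo $lengthWithNull, "\n", $binaryLength, "\n";
```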

    11 years ago

    I would like to demonstrate that you need more than just this function in order to truly test for an empty string. The reason being that strlen(null) will return 0. So how do you know if the value was null, or truly an empty string?

    $foo = null;
    $len = strlen(null);
    $bar = '';

    echo "Length: " . strlen($foo) . "\n";
    echo "Length: $len\n";
    echo "Length: " . strlen(null) . "\n";

    if (strlen($foo) === 0) echo "Null length is Zero\n";
    if ($len === 0) echo "Null length is still Zero\n";

    Null length is Zero
    Null length is still Zero

    !is_null(): $foo is probably null
    isset(): $foo is probably null

    !is_null(): $bar is truly an empty string
    isset(): $bar is truly an empty string
    // End Output

    So it would seem you need either is_null() or isset() in addition to strlen() if you care whether or not the original value was null.
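A compact way to tell the two cases apart is to compare against null before measuring. The function name describeValue is hypothetical; the sketch never passes null to strlen(), which also avoids the deprecation notice newer PHP versions emit for that.

```php
<?php
// Hypothetical helper: distinguishes null, the empty string, and other strings.
function describeValue(?string $value): string
{
    if ($value === null) {
        return 'null';
    }
    if ($value === '') {
        return 'empty string';
    }
    return 'string of ' . strlen($value) . ' byte(s)';
}

echo describeValue(null), "\n"; // null
echo describeValue(''), "\n";   // empty string
echo describeValue('0'), "\n";  // string of 1 byte(s)
```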

    13 years ago

    We just ran into what we thought was a bug but turned out to be a documented difference in behavior between PHP 5.2 & 5.3. Take the following code example:

    <?php
    $attributes = array('one', 'two', 'three');

    if (strlen($attributes) == 0 && !is_bool($attributes)) {
        echo "We are in the 'if'\n"; // PHP 5.3
    } else {
        echo "We are in the 'else'\n"; // PHP 5.2
    }

    ?>

    This is because in 5.2 strlen will automatically cast anything passed to it as a string, and casting an array to a string yields the string "Array". In 5.3, this changed, as noted in the following point in the backward incompatible changes in 5.3 (http://www.php.net/manual/en/migration53.incompatible.php):

    "The newer internal parameter parsing API has been applied across all the extensions bundled with PHP 5.3.x. This parameter parsing API causes functions to return NULL when passed incompatible parameters. There are some exceptions to this rule, such as the get_class() function, which will continue to return FALSE on error."

    So, in PHP 5.3, strlen($attributes) returns NULL, while in PHP 5.2, strlen($attributes) returns the integer 5. This likely affects other functions, so if you are getting different behaviors or new bugs suddenly, check if you have upgraded to 5.3 (which we did recently), and then check for some warnings in your logs like this:

    strlen() expects parameter 1 to be string, array given in /var/www/sis/lib/functions/advanced_search_lib.php on line 1028

    If so, then you are likely experiencing this changed behavior.
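Checking the type before calling strlen() sidesteps the version differences entirely (on PHP 8.0 and later, passing an array throws a TypeError). A sketch with a hypothetical helper name:

```php
<?php
// Hypothetical guard: only measure values that really are strings.
function lengthOrNull($value): ?int
{
    return is_string($value) ? strlen($value) : null;
}

var_dump(lengthOrNull(array('one', 'two', 'three'))); // NULL
var_dump(lengthOrNull('Array'));                      // int(5)
```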

    1 year ago

    Since PHP 8.1, passing null to strlen() is deprecated. To check for a blank string (not including '0'):

    // PHP >= 8.1
    if ($text === null || $text === '') {
        echo 'empty';
    }

    6 years ago

    There's a LOT of misinformation here, which I want to correct! Many people have warned against using strlen(), because it is "super slow". Well, that was probably true in old versions of PHP. But as of PHP7 that's definitely no longer true. It's now SUPER fast!

    I created a 20,000,000 byte string (~20 megabytes), and iterated ONE HUNDRED MILLION TIMES in a loop. Every loop iteration did a new strlen() on that very, very long string.

    The result: 100 million strlen() calls on a 20 megabyte string only took a total of 488 milliseconds. And the strlen() calls didn't get slower/faster even if I made the string smaller or bigger. The strlen() was pretty much a constant-time, super-fast operation.

    So either PHP7 stores the length of every string as a field that it can simply always look up without having to count characters. Or it caches the result of strlen() until the string contents actually change. Either way, you should now never, EVER worry about strlen() performance again. As of PHP7, it is super fast!

    Here is the complete benchmark code if you want to reproduce it on your machine:

    <?php
    $iterations = 100000000; // 100 million
    $str = str_repeat('0', 20000000);

    // benchmark loop and variable assignment to calculate loop overhead
    $start = microtime(true);
    for ($i = 0; $i < $iterations; ++$i) {
        $len = 0;
    }
    $end = microtime(true);
    $loop_elapsed = 1000 * ($end - $start);

    // benchmark strlen in a loop
    $len = 0;
    $start = microtime(true);
    for ($i = 0; $i < $iterations; ++$i) {
        $len = strlen($str);
    }
    $end = microtime(true);
    $strlen_elapsed = 1000 * ($end - $start);

    // subtract loop overhead from strlen() speed calculation
    $strlen_elapsed -= $loop_elapsed;

    echo "\nstring length: {$len}\ntest took: {$strlen_elapsed} milliseconds\n";

    grapheme_strlen

    The string being measured. It must be a valid UTF-8 string.

    Return values

    The length of the string on success, or false on failure.

    Examples

    Example #1 grapheme_strlen() example

    <?php
    $char_a_ring_nfd = "a\xCC\x8A"; // 'LATIN SMALL LETTER A WITH RING ABOVE' (U+00E5) normalization form "D"
    $char_o_diaeresis_nfd = "o\xCC\x88"; // 'LATIN SMALL LETTER O WITH DIAERESIS' (U+00F6) normalization form "D"

    print grapheme_strlen('abc' . $char_a_ring_nfd . $char_o_diaeresis_nfd . $char_a_ring_nfd);
    ?>

    The above example will output:

    6
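For comparison, the three length functions can be run on one decomposed character. This is a sketch that assumes mbstring is loaded; grapheme_strlen() additionally needs the intl extension, so that call is guarded.

```php
<?php
// "a" followed by COMBINING RING ABOVE (U+030A): one visible character in NFD form.
$text = "a\xCC\x8A";

$bytes      = strlen($text);             // 3 bytes
$codePoints = mb_strlen($text, 'UTF-8'); // 2 code points

echo "bytes: $bytes, code points: $codePoints\n";

if (function_exists('grapheme_strlen')) {
    echo "graphemes: ", grapheme_strlen($text), "\n"; // graphemes: 1
}
```

This is the reason accent marks may be counted as separate characters by mb_strlen(): it counts code points, not graphemes.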
