Libparserutils
UTF-8 manipulation macros (implementation).
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
Macros
| #define | UTF8_TO_UCS4(s, len, ucs4, clen, error) | Convert a UTF-8 multibyte sequence into a single UCS-4 character. |
| #define | UTF8_FROM_UCS4(ucs4, s, len, error) | Convert a single UCS-4 character into a UTF-8 multibyte sequence. |
| #define | UTF8_LENGTH(s, max, len, error) | Calculate the length (in characters) of a bounded UTF-8 string. |
| #define | UTF8_CHAR_BYTE_LENGTH(s, len, error) | Calculate the length (in bytes) of a UTF-8 character. |
| #define | UTF8_PREV(s, off, prevoff, error) | Find previous legal UTF-8 char in string. |
| #define | UTF8_NEXT(s, len, off, nextoff, error) | Find next legal UTF-8 char in string. |
| #define | UTF8_NEXT_PARANOID(s, len, off, nextoff, error) | Skip to start of next sequence in UTF-8 input. |
Variables
| const uint8_t | numContinuations[256] | Number of continuation bytes for a given start byte. |
Definition in file utf8impl.h.
#define UTF8_CHAR_BYTE_LENGTH(s, len, error)
Calculate the length (in bytes) of a UTF-8 character.
| s | Pointer to start of character |
| len | Pointer to location to receive length |
| error | Location to receive error code |
Definition at line 228 of file utf8impl.h.
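The byte length of a character can be read off its first byte alone. The following is a minimal sketch of that idea, not the macro's actual body: `sketch_char_byte_length` is a hypothetical name, it returns the length instead of writing through `len`, and it assumes the six-byte RFC 2279 scheme that the conversion macros in this file are documented to follow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of libparserutils.
 * Returns the total byte length (1 to 6) of the sequence introduced by
 * the byte at s, or 0 if that byte cannot start a sequence. */
static size_t sketch_char_byte_length(const uint8_t *s)
{
	uint8_t b;

	assert(s != NULL);
	b = s[0];

	if (b < 0x80)           return 1; /* 0xxxxxxx: ASCII */
	if ((b & 0xE0) == 0xC0) return 2; /* 110xxxxx */
	if ((b & 0xF0) == 0xE0) return 3; /* 1110xxxx */
	if ((b & 0xF8) == 0xF0) return 4; /* 11110xxx */
	if ((b & 0xFC) == 0xF8) return 5; /* 111110xx (RFC 2279 only) */
	if ((b & 0xFE) == 0xFC) return 6; /* 1111110x (RFC 2279 only) */

	return 0; /* continuation byte, or 0xFE / 0xFF */
}
```

A table lookup indexed by the start byte, in the spirit of `numContinuations`, would give the same answer without the chain of comparisons.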
#define UTF8_FROM_UCS4(ucs4, s, len, error)
Convert a single UCS-4 character into a UTF-8 multibyte sequence.
RFC 3629 removed the encoding of UCS values outside the range representable in UTF-16 (above 0x10FFFF). This macro nevertheless conforms to the older RFC 2279 and encodes the full 31-bit range.
| ucs4 | The character to process (0 <= ucs4 <= 0x7FFFFFFF) (host endian) |
| s | Pointer to pointer to output buffer, updated on exit |
| len | Pointer to length, in bytes, of output buffer, updated on exit |
| error | Location to receive error code |
Definition at line 123 of file utf8impl.h.
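The encoding step can be illustrated with a standalone function. This is a sketch under stated assumptions, not the macro's implementation: `sketch_from_ucs4` and its integer return codes are hypothetical, while the "pointer to pointer, updated on exit" and "length, updated on exit" behaviour mirrors the parameter descriptions above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of libparserutils.
 * Encodes ucs4 per RFC 2279 (up to 6 bytes) into *s.
 * Returns 0 on success, -1 if ucs4 is out of range or the buffer is
 * too small. On success *s is advanced and *len reduced by the number
 * of bytes written. */
static int sketch_from_ucs4(uint32_t ucs4, uint8_t **s, size_t *len)
{
	size_t n, i;
	uint8_t lead;

	assert(s != NULL && *s != NULL && len != NULL);

	if (ucs4 < 0x80)              { n = 1; lead = 0x00; }
	else if (ucs4 < 0x800)        { n = 2; lead = 0xC0; }
	else if (ucs4 < 0x10000)      { n = 3; lead = 0xE0; }
	else if (ucs4 < 0x200000)     { n = 4; lead = 0xF0; }
	else if (ucs4 < 0x4000000)    { n = 5; lead = 0xF8; }
	else if (ucs4 <= 0x7FFFFFFF)  { n = 6; lead = 0xFC; }
	else return -1;

	if (*len < n)
		return -1;

	/* Fill continuation bytes from the end, 6 payload bits each. */
	for (i = n - 1; i > 0; i--) {
		(*s)[i] = (uint8_t) (0x80 | (ucs4 & 0x3F));
		ucs4 >>= 6;
	}
	/* Whatever bits remain go into the lead byte. */
	(*s)[0] = (uint8_t) (lead | ucs4);

	*s += n;
	*len -= n;
	return 0;
}
```

Because the output pointer and remaining length are both advanced, successive calls can append characters to the same buffer without extra bookkeeping.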
#define UTF8_LENGTH(s, max, len, error)
Calculate the length (in characters) of a bounded UTF-8 string.
| s | The string |
| max | Maximum length |
| len | Pointer to location to receive length of string |
| error | Location to receive error code |
Definition at line 182 of file utf8impl.h.
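Counting characters rather than bytes can be sketched as follows. This is only an illustration: `sketch_utf8_length` is a hypothetical name, and it assumes well-formed input, so it simply counts every byte that is not a continuation byte and has no error reporting of the kind the macro provides through `error`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of libparserutils.
 * Counts the characters in the first max bytes of s, assuming the
 * input is well-formed UTF-8. Each character contributes exactly one
 * byte that is not of the form 10xxxxxx. */
static size_t sketch_utf8_length(const uint8_t *s, size_t max)
{
	size_t i, count = 0;

	assert(s != NULL || max == 0);

	for (i = 0; i < max; i++) {
		if ((s[i] & 0xC0) != 0x80)
			count++;
	}

	return count;
}
```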
#define UTF8_NEXT(s, len, off, nextoff, error)
Find next legal UTF-8 char in string.
| s | The string (assumed valid) |
| len | Maximum offset in string |
| off | Offset in the string to start at |
| nextoff | Pointer to location to receive offset of first byte of next legal character |
| error | Location to receive error code |
Definition at line 274 of file utf8impl.h.
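Since the string is assumed valid here, moving forward amounts to stepping past the current byte and any continuation bytes that follow it. The sketch below illustrates that idea only; `sketch_next` is a hypothetical name and returns the new offset directly instead of writing through `nextoff`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of libparserutils.
 * Returns the offset of the first byte of the character after the one
 * at off, or len if there is none. Assumes s is valid UTF-8. */
static size_t sketch_next(const uint8_t *s, size_t len, size_t off)
{
	assert(s != NULL && off <= len);

	if (off == len)
		return len;

	/* Step over the current byte, then over its continuation bytes. */
	off++;
	while (off < len && (s[off] & 0xC0) == 0x80)
		off++;

	return off;
}
```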
#define UTF8_NEXT_PARANOID(s, len, off, nextoff, error)
Skip to start of next sequence in UTF-8 input.
| s | The string (assumed to be of dubious validity) |
| len | Maximum offset in string |
| off | Offset in the string to start at |
| nextoff | Pointer to location to receive offset of first byte of next legal character |
| error | Location to receive error code |
Definition at line 303 of file utf8impl.h.
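When the input may be malformed, a forward step should not trust the lead byte blindly. One possible approach, sketched below with the hypothetical name `sketch_next_paranoid`, is to consume at most as many continuation bytes as the lead byte announces and to stop early at the first byte that is not a continuation. Whether the macro behaves exactly this way is an assumption; the sketch only shows the general resynchronisation idea.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of libparserutils.
 * Returns the offset at which the next sequence appears to start,
 * without assuming that s is valid UTF-8. */
static size_t sketch_next_paranoid(const uint8_t *s, size_t len, size_t off)
{
	size_t expect, seen = 0;
	uint8_t b;

	assert(s != NULL && off <= len);

	if (off == len)
		return len;

	b = s[off++];

	/* Number of continuation bytes the lead byte announces. */
	if (b < 0x80)                expect = 0;
	else if ((b & 0xE0) == 0xC0) expect = 1;
	else if ((b & 0xF0) == 0xE0) expect = 2;
	else if ((b & 0xF8) == 0xF0) expect = 3;
	else if ((b & 0xFC) == 0xF8) expect = 4;
	else if ((b & 0xFE) == 0xFC) expect = 5;
	else                         expect = 0; /* stray or invalid byte */

	/* Stop at the first non-continuation byte, even if truncated. */
	while (seen < expect && off < len && (s[off] & 0xC0) == 0x80) {
		off++;
		seen++;
	}

	return off;
}
```

With a truncated three-byte sequence followed by ASCII, this stops at the ASCII byte instead of swallowing it, and a stray continuation byte is skipped one byte at a time.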
#define UTF8_PREV(s, off, prevoff, error)
Find previous legal UTF-8 char in string.
| s | The string |
| off | Offset in the string to start at |
| prevoff | Pointer to location to receive offset of first byte of previous legal character |
| error | Location to receive error code |
Definition at line 249 of file utf8impl.h.
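Moving backwards relies on the fact that continuation bytes always have the form 10xxxxxx, so the start of the previous character is the nearest earlier byte that does not. The following sketch illustrates this; `sketch_prev` is a hypothetical name, it assumes well-formed input, and it returns the offset directly instead of writing through `prevoff`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of libparserutils.
 * Returns the offset of the first byte of the character before off,
 * or 0 if off is already at the start of the string. */
static size_t sketch_prev(const uint8_t *s, size_t off)
{
	assert(s != NULL);

	if (off == 0)
		return 0;

	/* Step back one byte, then keep going while on a continuation. */
	off--;
	while (off > 0 && (s[off] & 0xC0) == 0x80)
		off--;

	return off;
}
```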
#define UTF8_TO_UCS4(s, len, ucs4, clen, error)
Convert a UTF-8 multibyte sequence into a single UCS-4 character.
RFC 3629 removed the encoding of UCS values outside the range representable in UTF-16 (above 0x10FFFF). This macro nevertheless conforms to the older RFC 2279.
| s | The sequence to process |
| len | Length of sequence |
| ucs4 | Pointer to location to receive UCS-4 character (host endian) |
| clen | Pointer to location to receive byte length of UTF-8 sequence |
| error | Location to receive error code |
Definition at line 34 of file utf8impl.h.
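Decoding reverses the encoding: the lead byte gives the sequence length and the top payload bits, and each continuation byte adds six more. The sketch below shows one way to do this under RFC 2279 rules. It is not the macro's body: `sketch_to_ucs4` and its integer return codes are hypothetical, and the only validity checks it performs are for malformed continuation bytes and overlong forms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of libparserutils.
 * Decodes one character from the first len bytes of s.
 * Returns 0 on success, -1 on an invalid sequence, -2 if more input
 * is needed. On success *ucs4 holds the character and *clen the
 * number of bytes consumed. */
static int sketch_to_ucs4(const uint8_t *s, size_t len,
		uint32_t *ucs4, size_t *clen)
{
	uint32_t c, min;
	size_t n, i;

	assert(s != NULL && ucs4 != NULL && clen != NULL);

	if (len == 0)
		return -2;

	c = s[0];

	/* n: sequence length; min: smallest value needing n bytes. */
	if (c < 0x80)                { n = 1; min = 0; }
	else if ((c & 0xE0) == 0xC0) { c &= 0x1F; n = 2; min = 0x80; }
	else if ((c & 0xF0) == 0xE0) { c &= 0x0F; n = 3; min = 0x800; }
	else if ((c & 0xF8) == 0xF0) { c &= 0x07; n = 4; min = 0x10000; }
	else if ((c & 0xFC) == 0xF8) { c &= 0x03; n = 5; min = 0x200000; }
	else if ((c & 0xFE) == 0xFC) { c &= 0x01; n = 6; min = 0x4000000; }
	else return -1;

	if (len < n)
		return -2;

	for (i = 1; i < n; i++) {
		if ((s[i] & 0xC0) != 0x80)
			return -1;
		c = (c << 6) | (s[i] & 0x3F);
	}

	if (c < min)
		return -1; /* overlong encoding */

	*ucs4 = c;
	*clen = n;
	return 0;
}
```

Returning the consumed byte count alongside the character lets a caller advance through a buffer one character at a time.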