rt.util.utf
Encode and decode UTF-8, UTF-16 and UTF-32 strings.
For Win32 systems, the C wchar_t type is UTF-16 and corresponds to the D wchar type.
For Posix systems, the C wchar_t type is UTF-32 and corresponds to the D dchar type.
UTF character support is restricted to (\u0000 <= character <= \U0010FFFF).
Authors: Walter Bright, Sean Kelly
Source: src/rt/util/utf.d
- pure nothrow @nogc @safe bool isValidDchar(dchar c);
    Test if c is a valid UTF-32 character.
    \uFFFE and \uFFFF are considered valid by this function, as they are permitted for internal use by an application, but they are not allowed for interchange by the Unicode standard.
    Returns: true if it is, false if not.
- pure nothrow @nogc @safe uint stride(in char[] s, size_t i);
    stride() returns the length of a UTF-8 sequence starting at index i in string s.
    Returns: The number of bytes in the UTF-8 sequence, or 0xFF, meaning s[i] is not the start of a UTF-8 sequence.
- pure nothrow @nogc @safe uint stride(in wchar[] s, size_t i);
    stride() returns the length of a UTF-16 sequence starting at index i in string s.
- pure nothrow @nogc @safe uint stride(in dchar[] s, size_t i);
    stride() returns the length of a UTF-32 sequence starting at index i in string s.
    Returns: The return value will always be 1.
- pure @safe size_t toUCSindex(in char[] s, size_t i);
  pure @safe size_t toUCSindex(in wchar[] s, size_t i);
  pure nothrow @nogc @safe size_t toUCSindex(in dchar[] s, size_t i);
    Given an index i into an array of characters s[], and assuming that index i is at the start of a UTF character, determine the number of UCS characters up to that index i.
- pure @safe size_t toUTFindex(in char[] s, size_t n);
  pure nothrow @nogc @safe size_t toUTFindex(in wchar[] s, size_t n);
  pure nothrow @nogc @safe size_t toUTFindex(in dchar[] s, size_t n);
    Given a UCS index n into an array of characters s[], return the UTF index.
- pure @safe dchar decode(in char[] s, ref size_t idx);
  pure @safe dchar decode(in wchar[] s, ref size_t idx);
  pure @safe dchar decode(in dchar[] s, ref size_t idx);
    Decodes and returns the character starting at s[idx]. idx is advanced past the decoded character. If the character is not well formed, a UtfException is thrown and idx remains unchanged.
- pure nothrow @safe void encode(ref char[] s, dchar c);
  pure nothrow @safe void encode(ref wchar[] s, dchar c);
  pure nothrow @safe void encode(ref dchar[] s, dchar c);
    Encodes character c and appends it to array s[].
- pure nothrow @nogc @safe ubyte codeLength(C)(dchar c);
    Returns the code length of c in the encoding that uses C as its code unit type. The length is returned as a count of characters of type C, not in bytes.
- pure @safe void validate(S)(in S s);
    Checks whether the string is well formed. S can be an array of char, wchar, or dchar. Throws a UtfException if it is not. Use it to check all untrusted input for correctness.
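The index and length functions above are easiest to follow on a concrete string. rt.util.utf is internal to the D runtime and is normally not importable from user code, so the sketch below assumes the same-named public functions in Phobos's std.utf instead. That substitution is an assumption of this example, not something this page states. Error reporting may differ between the two modules (for instance the 0xFF return value of stride documented here), so only well-formed input is used.

```d
// Assumption: std.utf stands in for rt.util.utf, which user code cannot import.
import std.utf : codeLength, decode, stride, toUCSindex, toUTFindex, validate;

void main()
{
    // 'a', U+00E9 and U+20AC take 1 + 2 + 3 = 6 UTF-8 code units.
    string s = "a\u00E9\u20AC";
    assert(s.length == 6);

    // stride: length in code units of the sequence starting at an index.
    assert(stride(s, 0) == 1);
    assert(stride(s, 1) == 2);
    assert(stride(s, 3) == 3);

    // toUCSindex maps a code unit index to a character index;
    // toUTFindex maps a character index back to a code unit index.
    assert(toUCSindex(s, 3) == 2);
    assert(toUTFindex(s, 2) == 3);

    // decode returns the character at idx and advances idx past it.
    size_t idx = 1;
    dchar c = decode(s, idx);
    assert(c == '\u00E9');
    assert(idx == 3);

    // codeLength!C counts code units of type C, not bytes.
    assert(codeLength!char('\u20AC') == 3);
    assert(codeLength!wchar('\U0001F600') == 2);

    // validate returns normally for well-formed input and throws otherwise.
    validate(s);
}
```

The same calls work on wstring and dstring input; only the stride values and the index mapping change with the width of the code unit.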
- pure nothrow @safe string toUTF8(string s);
  pure @trusted string toUTF8(in wchar[] s);
  pure @trusted string toUTF8(in dchar[] s);
    Encodes string s into UTF-8 and returns the encoded string.
- pure @trusted wstring toUTF16(in char[] s);
  pure @safe wptr toUTF16z(in char[] s);
  pure nothrow @safe wstring toUTF16(wstring s);
  pure nothrow @trusted wstring toUTF16(in dchar[] s);
    Encodes string s into UTF-16 and returns the encoded string. toUTF16z() is suitable for calling the 'W' functions in the Win32 API that take an LPWSTR or LPCWSTR argument.
- pure @trusted dstring toUTF32(in char[] s);
  pure @trusted dstring toUTF32(in wchar[] s);
  pure nothrow @safe dstring toUTF32(dstring s);
    Encodes string s into UTF-32 and returns the encoded string.
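A short round trip shows how encode and the toUTF8, toUTF16 and toUTF32 conversions relate. As an assumption of this sketch, the calls come from Phobos's std.utf, which exposes functions with the same names, because rt.util.utf is internal to the D runtime and is normally not importable from user code.

```d
// Assumption: std.utf stands in for rt.util.utf, which user code cannot import.
import std.utf : encode, toUTF16, toUTF32, toUTF8;

void main()
{
    // encode appends the code units for one character to the array.
    char[] buf;
    encode(buf, 'a');       // 1 UTF-8 code unit
    encode(buf, '\u20AC');  // 3 UTF-8 code units
    assert(buf == "a\u20AC");
    assert(buf.length == 4);

    // The same text has a different length in each encoding.
    string s8 = "a\U0001F600";   // 1 + 4 code units
    wstring s16 = toUTF16(s8);   // 1 + 2 code units (surrogate pair)
    dstring s32 = toUTF32(s8);   // 1 + 1 code units
    assert(s8.length == 5);
    assert(s16.length == 3);
    assert(s32.length == 2);

    // Converting back yields the original UTF-8 string.
    assert(toUTF8(s16) == s8);
    assert(toUTF8(s32) == s8);
}
```

The length checks illustrate why indices into a char[], wchar[] and dchar[] holding the same text are not interchangeable.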
Copyright © 1999-2018 by the D Language Foundation | Page generated by
Ddoc on Sun Feb 18 23:22:48 2018