std.string
String handling functions. Objects of types string, wstring, and dstring
are value types and cannot be mutated element-by-element. To mutate
elements while building a string, use char[], wchar[], or dchar[]. The
*string types are preferable because they don't exhibit undesired
aliasing, which makes code more robust.
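As a minimal sketch of the distinction above, the following builds text in a mutable char[] buffer and then freezes it into a string. The helper name makeDashes is hypothetical and chosen only for illustration.

```d
import std.stdio;

// Hypothetical helper: builds a string by mutating a char[] buffer,
// then returns an immutable copy.
string makeDashes(size_t n)
{
    auto buf = new char[n];   // mutable buffer
    foreach (ref c; buf)
        c = '-';              // element-by-element mutation is allowed here
    return buf.idup;          // immutable copy, safe to share
}

void main()
{
    string s = makeDashes(5);
    assert(s == "-----");
    // s[0] = '+';            // would not compile: string elements are immutable
    writeln(s);
}
```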
License: Boost License 1.0.
Authors: Walter Bright, Andrei Alexandrescu, and Jonathan M Davis
Source: std/string.d
IMPORTANT NOTE: Beginning with version 2.052, the
following symbols have been generalized beyond strings and moved to
different modules. This action was prompted by the fact that
generalized routines belong better in other places, although they
still work for strings as expected. In order to use moved symbols, you
will need to import the respective modules as follows:
class
StringException: object.Exception;
- Exception thrown on errors in std.string functions.
this(string msg, string file = __FILE__, size_t line = __LINE__, Throwable next = null);
- Parameters:
string msg | The message for the exception.
string file | The file where the exception occurred.
size_t line | The line number where the exception occurred.
Throwable next | The previous exception in the chain of exceptions, if any.
immutable char[16u]
hexdigits;
- Scheduled for deprecation in January 2012.
Please use std.ascii.hexDigits instead.
0..9A..F
- Scheduled for deprecation in January 2012.
Please use std.ascii.digits instead.
0..9
immutable char[8u]
octdigits;
- Scheduled for deprecation in January 2012.
Please use std.ascii.octDigits instead.
0..7
immutable char[26u]
lowercase;
- Scheduled for deprecation in January 2012.
Please use std.ascii.lowercase instead.
a..z
immutable char[52u]
letters;
- Scheduled for deprecation in January 2012.
Please use std.ascii.letters instead.
A..Za..z
immutable char[26u]
uppercase;
- Scheduled for deprecation in January 2012.
Please use std.ascii.uppercase instead.
A..Z
- Scheduled for deprecation in January 2012.
Please use std.ascii.whitespace instead.
ASCII whitespace.
- Scheduled for deprecation in January 2012.
Please use std.uni.lineSep instead.
UTF line separator.
- Scheduled for deprecation in January 2012.
Please use std.uni.paraSep instead.
UTF paragraph separator.
- Scheduled for deprecation in January 2012.
Please use std.ascii.newline instead.
Newline sequence for this system.
- Scheduled for deprecation in January 2012.
Please use std.ascii.isWhite or std.uni.isWhite instead.
Returns true if c is ASCII whitespace or the Unicode line separator (LS) or paragraph separator (PS).
int
icmp(alias pred = "a < b", S1, S2)(S1
s1, S2
s2);
- Compares two ranges of characters lexicographically. The comparison is
case insensitive. Use std.algorithm.cmp for a case-sensitive
comparison. icmp works like std.algorithm.cmp except that it
converts characters to lowercase prior to applying pred. Technically,
icmp(r1, r2) is equivalent to
cmp!"std.uni.toLower(a) < std.uni.toLower(b)"(r1, r2).
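A short sketch of how the sign of the result might be used, assuming plain string arguments; only the sign is checked, not a specific magnitude.

```d
import std.string;

void main()
{
    assert(icmp("hello", "HELLO") == 0);   // equal when case is ignored
    assert(icmp("apple", "Banana") < 0);   // 'a' sorts before 'b' after lowercasing
    assert(icmp("Zebra", "apple") > 0);    // 'z' sorts after 'a' after lowercasing
}
```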
< 0 | s1 < s2 |
= 0 | s1 == s2 |
> 0 | s1 > s2 |
pure nothrow immutable(char)*
toStringz(const(char)[]
s);
pure nothrow immutable(char)*
toStringz(string
s);
- Returns a C-style 0-terminated string equivalent to s. s must not
contain embedded 0's, as any C function will treat the first 0
that it sees as the end of the string. If s is null or empty, then
a string containing only '\0' is returned.
Important Note: When passing a char* to a C function, and the C
function keeps it around for any reason, make sure that you keep a reference
to it in your D code. Otherwise, it may go away during a garbage collection
cycle and cause a nasty bug when the C code tries to use it.
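The following is a minimal sketch of passing the result to a C function, using strlen from core.stdc.string as a stand-in for arbitrary C code. The pointer is held in a D variable for as long as C might use it, as the note above recommends.

```d
import core.stdc.string : strlen;
import std.string : toStringz;

void main()
{
    string s = "hello";
    // Keep the pointer in a D variable while C code may still use it,
    // so the garbage collector can see a live reference.
    immutable(char)* p = toStringz(s);
    assert(strlen(p) == 5);

    // An empty string yields a pointer to a lone '\0'.
    assert(*toStringz("") == '\0');
}
```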
- Flag indicating whether a search is case-sensitive.
pure sizediff_t
indexOf(Char)(in Char[]
s, dchar
c, CaseSensitive
cs = CaseSensitive.yes);
- Returns the index of the first occurrence of c in s. If c
is not found, then -1 is returned.
cs indicates whether the comparisons are case sensitive.
sizediff_t
indexOf(Char1, Char2)(const(Char1)[]
s, const(Char2)[]
sub, CaseSensitive
cs = CaseSensitive.yes);
- Returns the index of the first occurrence of sub in s. If sub
is not found, then -1 is returned.
cs indicates whether the comparisons are case sensitive.
sizediff_t
lastIndexOf(Char)(const(Char)[]
s, dchar
c, CaseSensitive
cs = CaseSensitive.yes);
- Returns the index of the last occurrence of c in s. If c
is not found, then -1 is returned.
cs indicates whether the comparisons are case sensitive.
sizediff_t
lastIndexOf(Char1, Char2)(const(Char1)[]
s, const(Char2)[]
sub, CaseSensitive
cs = CaseSensitive.yes);
- Returns the index of the last occurrence of sub in s. If sub
is not found, then -1 is returned.
cs indicates whether the comparisons are case sensitive.
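A brief sketch of the four search overloads on an ASCII string, where indices count code units from 0:

```d
import std.string;

void main()
{
    string s = "Hello, World";
    assert(indexOf(s, 'o') == 4);                        // first 'o'
    assert(lastIndexOf(s, 'o') == 8);                    // last 'o'
    assert(indexOf(s, "World") == 7);                    // substring search
    assert(indexOf(s, "world") == -1);                   // case sensitive by default
    assert(indexOf(s, "world", CaseSensitive.no) == 7);  // case-insensitive match
    assert(lastIndexOf(s, "l") == 10);                   // last "l" is in "World"
}
```

Because -1 signals "not found", it is usually safest to test the result before using it as a slice bound.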
pure nothrow auto
representation(Char)(Char[]
s);
- Returns the representation type of a string, which is the same type
as the string except the character type is replaced by ubyte,
ushort, or uint depending on the character width.
Example:
string s = "hello";
static assert(is(typeof(representation(s)) == immutable(ubyte)[]));
- Scheduled for deprecation in January 2012.
Please use toLower instead.
Convert string s[] to lower case.
pure @trusted S
toLower(S)(S
s);
- Returns a string which is identical to s except that all of its
characters are lowercase (in Unicode, not just ASCII). If s does not
have any uppercase characters, then s is returned.
void
tolowerInPlace(C)(ref C[]
s);
- Scheduled for deprecation in January 2012.
Please use toLowerInPlace instead.
Converts s to lowercase in place.
void
toLowerInPlace(C)(ref C[]
s);
- Converts s to lowercase (in Unicode, not just ASCII) in place.
If s does not have any uppercase characters, then s is unaltered.
- Scheduled for deprecation in January 2012.
Please use toUpper instead.
Convert string s[] to upper case.
pure @trusted S
toUpper(S)(S
s);
- Returns a string which is identical to s except that all of its
characters are uppercase (in Unicode, not just ASCII). If s does not
have any lowercase characters, then s is returned.
void
toupperInPlace(C)(ref C[]
s);
- Scheduled for deprecation in January 2012.
Please use toUpperInPlace instead.
Converts s to uppercase in place.
void
toUpperInPlace(C)(ref C[]
s);
- Converts s to uppercase (in Unicode, not just ASCII) in place.
If s does not have any lowercase characters, then s is unaltered.
pure @trusted S
capitalize(S)(S
s);
- Capitalize the first character of s and convert the rest of s
to lowercase.
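The case-conversion functions above can be sketched together as follows. The in-place variant needs a mutable array, so the example duplicates a literal first.

```d
import std.string;

void main()
{
    assert(toLower("Hello World") == "hello world");
    assert(toUpper("Hello World") == "HELLO WORLD");
    // Only the first character is uppercased; the rest is lowercased.
    assert(capitalize("hELLO wORLD") == "Hello world");

    // The in-place variant works on a mutable array.
    char[] buf = "MiXeD".dup;
    toLowerInPlace(buf);
    assert(buf == "mixed");
}
```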
- Scheduled for deprecation in January 2012.
Capitalize all words in string s[].
Remove leading and trailing whitespace.
Replace all sequences of whitespace with a single space.
S
repeat(S)(S
s, size_t
n);
- Scheduled for deprecation in August 2011.
Please use std.array.replicate instead.
Repeats s n times.
- Split s[] into an array of lines,
using CR, LF, or CR-LF as the delimiter.
The delimiter is not included in the line.
- Split s into an array of lines using '\r', '\n',
"\r\n", std.uni.lineSep, and std.uni.paraSep as delimiters.
The delimiter is not included in the strings returned.
String
stripl(String)(String
s);
- Scheduled for deprecation in January 2012.
Please use stripLeft instead.
Strips leading whitespace.
pure @safe S
stripLeft(S)(S
s);
- Strips leading whitespace.
String
stripr(String)(String
s);
- Scheduled for deprecation in January 2012.
Please use stripRight instead.
Strips trailing whitespace.
- Strips trailing whitespace.
- Strips both leading and trailing whitespace.
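A small sketch of the three stripping functions. It assumes the last two entries above, whose signatures are missing from this listing, are stripRight and strip.

```d
import std.string;

void main()
{
    string s = "  hello \t\n";
    assert(stripLeft(s) == "hello \t\n");   // leading whitespace only
    assert(stripRight(s) == "  hello");     // trailing whitespace only
    assert(strip(s) == "hello");            // both ends
}
```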
S
chomp(S)(S
s);
S
chomp(S, C)(S
s, const(C)[]
delimiter);
- Returns s sans the trailing delimiter, if any. If no delimiter
is given, then any trailing '\r', '\n', "\r\n",
std.uni.lineSep, or std.uni.paraSep is removed.
C1[]
chompPrefix(C1, C2)(C1[]
longer, C2[]
shorter);
- If longer.startsWith(shorter), returns longer[shorter.length .. $].
Otherwise, returns longer.
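The behaviour of chomp and chompPrefix might be sketched like this, with plain ASCII input:

```d
import std.string;

void main()
{
    assert(chomp("line\r\n") == "line");                      // default: drop the line terminator
    assert(chomp("line") == "line");                          // nothing to remove
    assert(chomp("hello world", "world") == "hello ");        // explicit delimiter
    assert(chompPrefix("hello world", "hello ") == "world");  // prefix matches
    assert(chompPrefix("hello world", "bye") == "hello world"); // no match: unchanged
}
```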
- Returns s sans its last character, if there is one.
If s ends in "\r\n", then both are removed.
S
ljustify(S)(S
s, size_t
width);
- Scheduled for deprecation in January 2012.
Please use leftJustify instead.
Left justify string s[] in a field width chars wide.
@trusted S
leftJustify(S)(S
s, size_t
width, dchar
fillChar = ' ');
- Left justify s in a field width characters wide. fillChar
is the character that will be used to fill up the space in the field that
s doesn't fill.
S
rjustify(S)(S
s, size_t
width);
- Scheduled for deprecation in January 2012.
Please use rightJustify instead.
Right justify string s[] in a field width chars wide.
@trusted S
rightJustify(S)(S
s, size_t
width, dchar
fillChar = ' ');
- Right justify s in a field width characters wide. fillChar
is the character that will be used to fill up the space in the field that
s doesn't fill.
@trusted S
center(S)(S
s, size_t
width, dchar
fillChar = ' ');
- Center s in a field width characters wide. fillChar
is the character that will be used to fill up the space in the field that
s doesn't fill.
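A sketch of the three alignment functions with short ASCII inputs. The last assertion relies on the assumption that input wider than the field is returned as is, which the text above does not state explicitly.

```d
import std.string;

void main()
{
    assert(leftJustify("ab", 5) == "ab   ");
    assert(rightJustify("ab", 5) == "   ab");
    assert(rightJustify("42", 5, '0') == "00042");   // zero padding via fillChar
    assert(center("ab", 6, '*') == "**ab**");
    // Assumption: input wider than the field is not truncated.
    assert(leftJustify("toolong", 3) == "toolong");
}
```

Passing '0' as fillChar to rightJustify is the replacement suggested for the deprecated zfill.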
S
zfill(S)(S
s, int
width);
- Scheduled for deprecation in January 2012.
Please use rightJustify with a fill character of '0' instead.
Same as rjustify(), but fill with '0's.
S
insert(S)(S
s, size_t
index, S
sub);
- Scheduled for deprecation in August 2011.
Please use std.array.insertInPlace instead.
Insert sub[] into s[] at location index.
S
expandtabs(S)(S
str, size_t
tabsize = 8);
- Scheduled for deprecation in January 2012.
Please use detab instead.
Replace tabs with the appropriate number of spaces.
tabsize is the distance between tab stops.
pure @trusted S
detab(S)(S
s, size_t
tabSize = 8);
- Replace each tab character in s with the number of spaces necessary
to align the following character at the next tab stop where tabSize
is the distance between tab stops.
pure @trusted S
entab(S)(S
s, size_t
tabSize = 8);
- Replaces spaces in s with the optimal number of tabs.
All spaces and tabs at the end of a line are removed.
Parameters:
s | String to convert.
tabSize | Tab columns are tabSize spaces apart.
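To illustrate tab stops, here is a minimal sketch with one tab per input. Results are compared directly rather than stored, since the exact return type may differ between library versions.

```d
import std.string;

void main()
{
    // 'a' occupies column 0, so the tab expands to 7 spaces to reach column 8.
    assert(detab("a\tb") == "a       b");
    // With tab stops every 4 columns, only 3 spaces are needed.
    assert(detab("a\tb", 4) == "a   b");
    // Eight leading spaces collapse into a single tab.
    assert(entab("        x") == "\tx");
}
```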
string
maketrans(in char[]
from, in char[]
to);
- Construct translation table for translate().
BUG:
only works with ASCII
string
translate(in char[]
s, in char[]
transtab, in char[]
delchars);
- Translate characters in s[] using table created by maketrans().
Delete chars in delchars[].
BUG:
only works with ASCII
- Format arguments into a string.
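A minimal sketch, assuming the entry above (whose signature is missing from this listing) is std.string.format and that it takes a printf-style format string followed by the arguments:

```d
import std.string;

void main()
{
    string s = format("%s has %d items", "cart", 3);
    assert(s == "cart has 3 items");
    assert(format("%x", 255) == "ff");   // hexadecimal conversion
}
```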
char[]
sformat(char[]
s,...);
- Format arguments into string s which must be large
enough to hold the result. Throws RangeError if it is not.
Returns:
s
bool
inPattern(S)(dchar
c, in S
pattern);
- See if character c is in the pattern.
Patterns:
A pattern is an array of characters much like a character
class in regular expressions. A sequence of characters
can be given, such as "abcde". The '-' can represent a range
of characters, as "a-e" represents the same pattern as "abcde".
"a-fA-F0-9" represents all the hex characters.
If the first character of a pattern is '^', then the pattern
is negated, i.e. "^0-9" means any character except a digit.
The functions inPattern, countchars, removechars,
and squeeze
use patterns.
Note:
In the future, the pattern syntax may be improved
to be more like regular expression character classes.
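The pattern rules above can be sketched with a hand-written matcher. matchesPattern is a hypothetical helper, not the library's implementation; it assumes an ASCII pattern and handles only the ranges and leading '^' described here.

```d
import std.stdio;

// Hypothetical helper, not part of std.string. Assumes an ASCII pattern.
bool matchesPattern(dchar c, string pattern)
{
    bool negated = false;
    if (pattern.length > 0 && pattern[0] == '^')
    {
        negated = true;              // a leading '^' inverts the result
        pattern = pattern[1 .. $];
    }

    bool found = false;
    size_t i = 0;
    while (i < pattern.length)
    {
        if (i + 2 < pattern.length && pattern[i + 1] == '-')
        {
            // "x-y" covers every character from x through y inclusive.
            if (pattern[i] <= c && c <= pattern[i + 2])
                found = true;
            i += 3;
        }
        else
        {
            if (pattern[i] == c)
                found = true;
            ++i;
        }
    }
    return found != negated;
}

void main()
{
    assert(matchesPattern('c', "a-e"));          // same as "abcde"
    assert(!matchesPattern('f', "a-e"));
    assert(matchesPattern('B', "a-fA-F0-9"));    // hex digit
    assert(!matchesPattern('5', "^0-9"));        // negated: digits excluded
    assert(matchesPattern('x', "^0-9"));
    writeln("pattern checks passed");
}
```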
bool
inPattern(S)(dchar
c, S[]
patterns);
- See if character c is in the intersection of the patterns.
size_t
countchars(S, S1)(S
s, in S1
pattern);
- Count characters in s that match pattern.
S
removechars(S)(S
s, in S
pattern);
- Return string that is s with all characters removed that match pattern.
S
squeeze(S)(S
s, in S
pattern = null);
- Return string where sequences of a character in s[] from pattern[]
are replaced with a single instance of that character.
If pattern is null, it defaults to all characters.
S1
munch(S1, S2)(ref S1
s, S2
pattern);
- Finds the position pos of the first character in s that does not match pattern (in the terminology used by
inPattern). Updates s =
s[pos..$]. Returns the slice from the beginning of the original
(before update) string up to, and excluding, pos.
Example:
string s = "123abc";
string t = munch(s, "0123456789");
assert(t == "123" && s == "abc");
t = munch(s, "0123456789");
assert(t == "" && s == "abc");
The munch function is mostly convenient for skipping
a certain category of characters (e.g. whitespace) when parsing
strings. (In such cases, the return value is not used.)
- Return string that is the 'successor' to s[].
If the rightmost character is a-zA-Z0-9, it is incremented within
its case or digits. If it generates a carry, the process is
repeated with the one to its immediate left.
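A short sketch of the carry behaviour, assuming the entry above (whose signature is missing from this listing) is std.string.succ:

```d
import std.string;

void main()
{
    assert(succ("a1") == "a2");        // rightmost digit is incremented
    assert(succ("a9") == "b0");        // '9' wraps to '0' and carries left
    assert(succ("az") == "ba");        // 'z' wraps to 'a' and carries left
    assert(succ("zz99") == "aaa00");   // carry past the left edge adds a character
}
```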
string
tr(const(char)[]
str, const(char)[]
from, const(char)[]
to, const(char)[]
modifiers = null);
- Replaces characters in str[] that are in from[]
with corresponding characters in to[] and returns the resulting
string.
Parameters:
const(char)[] modifiers | a string of modifier characters
Modifiers:
Modifier | Description
c | Complement the list of characters in from[]
d | Removes matching characters with no corresponding replacement in to[]
s | Removes adjacent duplicates in the replaced characters
If modifier d is present, then the number of characters
in to[] may be only 0 or 1.
If modifier d is not present and to[] is null,
then to[] is taken to be the same as from[].
If modifier d is not present and to[] is shorter
than from[], then to[] is extended by replicating the
last character in to[].
Both from[] and to[] may contain ranges using the -
character, for example a-d is synonymous with abcd.
Neither accepts a leading ^ as meaning the complement of
the string (use the c modifier for that).
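The modifiers can be easier to follow with concrete inputs. This sketch exercises one call per rule described above:

```d
import std.string;

void main()
{
    assert(tr("abcdef", "cd", "CD") == "abCDef");     // one-to-one mapping
    assert(tr("abcdef", "b-d", "B-D") == "aBCDef");   // ranges on both sides
    // d: characters in from[] with no replacement in to[] are removed.
    assert(tr("abcdef", "ef", "", "d") == "abcd");
    // s: to[] is extended to "xx", then adjacent duplicates are squeezed.
    assert(tr("hello goodbye", "lo", "x", "s") == "hex gxdbye");
    // c: everything NOT in from[] is replaced.
    assert(tr("abcdef", "ef", "*", "c") == "****ef");
}
```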
bool
isNumeric(const(char)[]
s, in bool
bAllowSep = false);
- [in] string s can be formatted in the following ways:
Integer Whole Number:
(for byte, ubyte, short, ushort, int, uint, long, and ulong)
['+'|'-']digit(s)[U|L|UL]
Examples:
123, 123UL, 123L, +123U, -123L
Floating-Point Number:
(for float, double, real, ifloat, idouble, and ireal)
['+'|'-']digit(s)[.][digit(s)][[e-|e+]digit(s)][i|f|L|Li|fi]]
or [nan|nani|inf|-inf]
Examples:
+123., -123.01, 123.3e-10f, 123.3e-10fi, 123.3e-10L
(for cfloat, cdouble, and creal)
['+'|'-']digit(s)[.][digit(s)][[e-|e+]digit(s)][+]
[digit(s)[.][digit(s)][[e-|e+]digit(s)][i|f|L|Li|fi]]
or [nan|nani|nan+nani|inf|-inf]
Examples:
nan, -123e-1+456.9e-10Li, +123e+10+456i, 123+456
[in] bool bAllowSep
False by default. When set to true, the separator characters "," and "_"
are accepted within the string, but these characters should be stripped
from the string before using any of the conversion functions such as
toInt() and toFloat(), or else an error will occur.
Also note that no spaces are allowed anywhere within the string, whether
leading, trailing, or embedded; they too must be stripped from the string
before using this function or any of the conversion functions.
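A few of the accepted and rejected forms listed above, as a sketch that uses only the default bAllowSep = false:

```d
import std.string;

void main()
{
    // Integer forms, with optional sign and suffix.
    assert(isNumeric("123"));
    assert(isNumeric("123UL"));
    assert(isNumeric("-123L"));

    // Floating-point forms.
    assert(isNumeric("-123.01"));
    assert(isNumeric("123.3e-10f"));

    // Rejected: non-numeric text and any whitespace.
    assert(!isNumeric("abc"));
    assert(!isNumeric(" 123"));
}
```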
- Scheduled for deprecation in January 2012.
Allow any object as a parameter
bool
isNumeric(TypeInfo[]
_arguments, va_list
_argptr);
- Scheduled for deprecation in January 2012.
Check only the first parameter, all others will be ignored.
char[]
soundex(const(char)[]
string, char[]
buffer = null);
- Soundex algorithm.
The Soundex algorithm converts a word into 4 characters
based on how the word sounds phonetically. The idea is that
two spellings that sound alike will have the same Soundex
value, which means that Soundex can be used for fuzzy matching
of names.
Parameters:
const(char)[] string | String to convert to Soundex representation.
char[] buffer | Optional 4 char array to put the resulting Soundex characters into. If null, the return value buffer will be allocated on the heap.
Returns:
The four character array with the Soundex result in it.
Returns null if there is no Soundex representation for the string.
See Also:
Wikipedia,
The Soundex Indexing System
BUGS:
Only works well with English names.
There are other arguably better Soundex algorithms,
but this one is the standard one.
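A minimal sketch of fuzzy name matching with soundex, including the optional caller-supplied buffer. The names are the usual textbook examples for the algorithm.

```d
import std.string;

void main()
{
    // Names that sound alike share a code.
    assert(soundex("Robert") == "R163");
    assert(soundex("Rupert") == "R163");

    // A caller-supplied 4-char buffer avoids the heap allocation.
    char[4] buf;
    char[] code = soundex("Gauss", buf[]);
    assert(code == "G200");
}
```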
string[string]
abbrev(string[]
values);
- Construct an associative array consisting of all
abbreviations that uniquely map to the strings in values.
This is useful in cases where the user is expected to type
in one of a known set of strings, and the program will helpfully
autocomplete the string once sufficient characters have been
entered that uniquely identify it.
Example:
import std.stdio;
import std.string;
void main()
{
static string[] list = [ "food", "foxy" ];
auto abbrevs = std.string.abbrev(list);
foreach (key, value; abbrevs)
{
writefln("%s => %s", key, value);
}
}
produces the output:
fox => foxy
food => food
foxy => foxy
foo => food
size_t
column(S)(S
str, size_t
tabsize = 8);
- Compute column number after string if string starts in the
leftmost column, which is numbered starting from 0.
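How tabs affect the computed column can be sketched as follows, with ASCII input:

```d
import std.string;

void main()
{
    assert(column("") == 0);
    assert(column("abc") == 3);
    assert(column("\t") == 8);        // a tab advances to the next tab stop
    assert(column("a\tb") == 9);      // 'a' -> 1, tab -> 8, 'b' -> 9
    assert(column("a\tb", 4) == 5);   // tab stops every 4 columns
}
```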
S
wrap(S)(S
s, size_t
columns = 80, S
firstindent = null, S
indent = null, size_t
tabsize = 8);
- Wrap text into a paragraph.
The input text string s is formed into a paragraph
by breaking it up into a sequence of lines, delineated
by \n, such that the number of columns is not exceeded
on each line.
The last line is terminated with a \n.
Parameters:
s | text string to be wrapped
columns | maximum number of columns in the paragraph
firstindent | string used to indent first line of the paragraph
indent | string to use to indent following lines of the paragraph
tabsize | column spacing of tabs
Returns:
The resulting paragraph.
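A small sketch of wrapping without indentation. Note that the result ends with a newline even when everything fits on one line.

```d
import std.string;

void main()
{
    // "a short" is exactly 7 columns, so "string" moves to the next line.
    assert(wrap("a short string", 7) == "a short\nstring\n");
    // A single line still gets the terminating newline.
    assert(wrap("hi", 80) == "hi\n");
}
```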