std.string
String handling functions. Objects of types string, wstring, and dstring are value types and cannot be mutated element-by-element. To mutate characters while building a string, use char[], wchar[], or dchar[]. The *string types are preferable because they don't exhibit undesired aliasing, thus making code more robust.
License: Boost License 1.0.
Authors: Walter Bright, Andrei Alexandrescu
- StringException: thrown on errors in string functions.
immutable char[16u] hexdigits;
- 0..9A..F
immutable char[10u] digits;
- 0..9
immutable char[8u] octdigits;
- 0..7
immutable char[26u] lowercase;
- a..z
immutable char[52u] letters;
- A..Za..z
immutable char[26u] uppercase;
- A..Z
immutable char[6u] whitespace;
- ASCII whitespace
- LS: UTF line separator
- PS: UTF paragraph separator
immutable char[2u] newline;
- Newline sequence for this system
- iswhite: returns true if c is whitespace.
int cmp(C1, C2)(in C1[] s1, in C2[] s2);
sizediff_t icmp(C1, C2)(in C1[] s1, in C2[] s2);
- Compare two strings. cmp is case sensitive, icmp is case insensitive.
Returns:
< 0 | s1 < s2 |
= 0 | s1 == s2 |
> 0 | s1 > s2 |
immutable(char)* toStringz(const(char)[] s);
immutable(char)* toStringz(string s);
- Convert array of chars s[] to a C-style 0-terminated string. s[] must not contain embedded 0's.
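A minimal sketch of passing a D string to a C function through toStringz. It assumes core.stdc.string.strlen is available as the C function to call; the sample string is made up for illustration.

```d
import core.stdc.string : strlen;
import std.string;

void main()
{
    string s = "hello";

    // p points at the characters of s followed by a terminating 0,
    // so it can be handed to C functions such as strlen.
    const(char)* p = toStringz(s);
    assert(strlen(p) == s.length);
}
```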
- CaseSensitive: flag indicating whether a search is case-sensitive.
sizediff_t indexOf(Char)(in Char[] s, dchar c, CaseSensitive cs = CaseSensitive.yes);
ptrdiff_t lastIndexOf(in char[] s, dchar c, CaseSensitive cs = CaseSensitive.yes);
- indexOf: find first occurrence of c in string s. lastIndexOf: find last occurrence of c in string s. CaseSensitive.yes means the searches are case sensitive.
Returns:
Index in s where c is found, -1 if not found.
sizediff_t indexOf(Char1, Char2)(const(Char1)[] s, const(Char2)[] sub, CaseSensitive cs = CaseSensitive.yes);
ptrdiff_t lastIndexOf(in char[] s, in char[] sub, CaseSensitive cs = CaseSensitive.yes);
- indexOf: find first occurrence of sub[] in string s[]. lastIndexOf: find last occurrence of sub[] in string s[]. CaseSensitive cs controls whether the comparisons are case sensitive or not.
Returns:
Index in s where sub is found, -1 if not found.
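The character and substring searches above can be illustrated with a short sketch. The sample string is made up for the example, and the indices in the assertions are counted in code units from 0.

```d
import std.string;

void main()
{
    string s = "Hello, hello";

    // Single-character searches.
    assert(indexOf(s, 'l') == 2);
    assert(lastIndexOf(s, 'l') == 10);

    // Substring searches; the default is case sensitive.
    assert(indexOf(s, "hello") == 7);
    assert(indexOf(s, "hello", CaseSensitive.no) == 0);

    // -1 signals that nothing was found.
    assert(indexOf(s, "xyz") == -1);
}
```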
- tolower: convert string s[] to lower case.
void tolowerInPlace(C)(ref C[] s);
- Converts s to lowercase in place.
- toupper: convert string s[] to upper case.
void toupperInPlace(C)(ref C[] s);
- Converts s to uppercase in place.
string capitalize(string s);
- Capitalize first character of string s[], convert rest of string s[] to lower case.
string capwords(string s);
- Capitalize all words in string s[]. Remove leading and trailing whitespace. Replace all sequences of whitespace with a single space.
string repeat(string s, size_t n);
- Return a string that consists of s[] repeated n times.
string join(in string[] words, string sep);
- Concatenate all the strings in words[] together into one string; use sep[] as the separator.
- split: split s[] into an array of words, using whitespace as delimiter.
Unqual!(S1)[] split(S1, S2)(S1 s, S2 delim);
- Split s[] into an array of words, using delim[] as the delimiter.
- splitlines: split s[] into an array of lines, using CR, LF, or CR-LF as the delimiter. The delimiter is not included in the line.
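A small sketch of how join and the two forms of split relate, assuming they behave as described above. The input strings are made up for illustration.

```d
import std.string;

void main()
{
    // Whitespace splitting skips leading, trailing, and repeated whitespace.
    auto words = split("  the quick\tbrown fox ");
    assert(words == ["the", "quick", "brown", "fox"]);

    // join puts the separator between consecutive elements.
    assert(join(words, "-") == "the-quick-brown-fox");

    // With an explicit delimiter, adjacent delimiters yield an empty element.
    assert(split("a,b,,c", ",") == ["a", "b", "", "c"]);
}
```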
String stripl(String)(String s);
String stripr(String)(String s);
String strip(String)(String s);
- Strips leading or trailing whitespace, or both.
C[] chomp(C)(C[] s);
C[] chomp(C, C1)(C[] s, in C1[] delimiter);
- Returns s[] sans trailing delimiter[], if any. If delimiter[] is null, removes trailing CR, LF, or CRLF, if any.
C1[] chompPrefix(C1, C2)(C1[] longer, C2[] shorter);
- If longer.startsWith(shorter), returns longer[shorter.length .. $]. Otherwise, returns longer.
- chop: returns s[] sans trailing character, if there is one. If the last two characters are CR-LF, then both are removed.
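The trimming functions above differ mainly in what they remove and from which end. The sketch below assumes the last entry documents a function named chop; the sample strings are made up for illustration.

```d
import std.string;

void main()
{
    // strip removes whitespace on both sides.
    assert(strip("  hi \t\n") == "hi");

    // chomp without a delimiter removes one trailing line ending.
    assert(chomp("line\r\n") == "line");

    // chomp with a delimiter removes it only when it is actually a suffix.
    assert(chomp("file.txt", ".txt") == "file");
    assert(chomp("file.txt", ".d") == "file.txt");

    // chompPrefix is the counterpart for the front of the string.
    assert(chompPrefix("prefix-body", "prefix-") == "body");
    assert(chompPrefix("body", "prefix-") == "body");

    // chop removes the last character, or a trailing CR-LF pair as a unit.
    assert(chop("hello") == "hell");
    assert(chop("hello\r\n") == "hello");
}
```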
string ljustify(string s, size_t width);
string rjustify(string s, size_t width);
string center(string s, int width);
- Left justify, right justify, or center string s[] in a field width chars wide.
string zfill(string s, int width);
- Same as rjustify(), but fill with '0's.
string replace(string s, string from, string to);
- Replace occurrences of from[] with to[] in s[].
string replaceSlice(string s, in string slice, in string replacement);
- Return a string that is s[] with slice[] replaced by replacement[].
string insert(string s, size_t index, string sub);
- Insert sub[] into s[] at location index.
size_t count(in char[] s, in char[] sub);
- Count up all instances of sub[] in s[].
string expandtabs(string str, int tabsize = 8);
- Replace tabs with the appropriate number of spaces. tabsize is the distance between tab stops.
string entab(string s, int tabsize = 8);
- Replace spaces in string s with the optimal number of tabs. Trailing spaces or tabs in a line are removed.
Parameters:
string s | String to convert. |
int tabsize | Tab columns are tabsize spaces apart. tabsize defaults to 8. |
string maketrans(in string from, in string to);
- Construct translation table for translate().
BUG: only works with ASCII
string translate(string s, in string transtab, in string delchars);
- Translate characters in s[] using table created by maketrans(). Delete chars in delchars[].
BUG: only works with ASCII
- format: format arguments into a string.
char[] sformat(char[] s, ...);
- Format arguments into string s which must be large enough to hold the result. Throws RangeError if it is not.
Returns:
s
bool inPattern(dchar c, in string pattern);
- See if character c is in the pattern.
Patterns:
A pattern is an array of characters much like a character class in regular expressions. A sequence of characters can be given, such as "abcde". The '-' can represent a range of characters, as "a-e" represents the same pattern as "abcde". "a-fA-F0-9" represents all the hex characters. If the first character of a pattern is '^', then the pattern is negated, i.e. "^0-9" means any character except a digit. The functions inPattern, countchars, removechars, and squeeze use patterns.
Note:
In the future, the pattern syntax may be improved to be more like regular expression character classes.
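To make the pattern rules concrete, here is a self-contained sketch that re-implements them in a few lines. The function matchesPattern is a hypothetical helper written for this illustration; it is not the library's inPattern and may differ from it on edge cases such as a '-' at the start or end of a pattern.

```d
// Hypothetical helper that follows the pattern rules described above.
bool matchesPattern(dchar c, string pattern)
{
    bool negate = false;   // set when the pattern starts with '^'
    bool inRange = false;  // true just after the '-' of a range like "a-e"
    bool haveLast = false; // true when 'last' can start a range
    dchar last;            // most recent literal character
    bool found = false;

    foreach (i, dchar p; pattern)
    {
        if (i == 0 && p == '^')
            negate = true;
        else if (inRange)
        {
            // p closes the range last..p.
            inRange = false;
            haveLast = false;
            if (last <= c && c <= p)
                found = true;
        }
        else if (p == '-' && haveLast)
            inRange = true;
        else
        {
            if (p == c)
                found = true;
            last = p;
            haveLast = true;
        }
    }

    // A trailing '-' has no range end, so treat it as a literal.
    if (inRange && c == '-')
        found = true;

    return found != negate;
}

void main()
{
    assert(matchesPattern('c', "a-e"));
    assert(!matchesPattern('f', "a-e"));
    assert(matchesPattern('B', "a-fA-F0-9"));
    assert(!matchesPattern('g', "a-fA-F0-9"));
    assert(matchesPattern('x', "^0-9"));
    assert(!matchesPattern('5', "^0-9"));
}
```

The negation flag is applied once at the end, so a negated pattern simply inverts the result of the plain match.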
int inPattern(dchar c, string[] patterns);
- See if character c is in the intersection of the patterns.
size_t countchars(string s, string pattern);
- Count characters in s that match pattern.
string removechars(string s, in string pattern);
- Return string that is s with all characters removed that match pattern.
string squeeze(string s, string pattern = null);
- Return string where sequences of a character in s[] from pattern[] are replaced with a single instance of that character. If pattern is null, it defaults to all characters.
S1 munch(S1, S2)(ref S1 s, S2 pattern);
- Finds the position pos of the first character in s that does not match pattern (in the terminology used by inPattern). Updates s = s[pos..$]. Returns the slice from the beginning of the original (before update) string up to, and excluding, pos.
Example:
string s = "123abc";
string t = munch(s, "0123456789");
assert(t == "123" && s == "abc");
t = munch(s, "0123456789");
assert(t == "" && s == "abc");
The munch function is mostly convenient for skipping a certain category of characters (e.g. whitespace) when parsing strings. (In such cases, the return value is not used.)
- succ: return string that is the 'successor' to s[]. If the rightmost character is a-zA-Z0-9, it is incremented within its case or digits. If it generates a carry, the process is repeated with the one to its immediate left.
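The carry rule is easiest to see on a few inputs. This sketch assumes the entry above documents std.string.succ; the inputs are made up for illustration.

```d
import std.string;

void main()
{
    // No carry: only the rightmost character changes.
    assert(succ("1") == "2");

    // 'z' wraps to 'a' and carries into the character on its left.
    assert(succ("az") == "ba");

    // A carry out of the leftmost character lengthens the string.
    assert(succ("9") == "10");
    assert(succ("zz99") == "aaa00");
}
```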
string tr(string str, string from, string to, string modifiers = null);
- Replaces characters in str[] that are in from[] with corresponding characters in to[] and returns the resulting string.
Parameters:
string modifiers | a string of modifier characters |
Modifiers:
Modifier | Description |
c | Complement the list of characters in from[] |
d | Removes matching characters with no corresponding replacement in to[] |
s | Removes adjacent duplicates in the replaced characters |
If modifier d is present, then the number of characters in to[] may be only 0 or 1.
If modifier d is not present and to[] is null, then to[] is taken to be the same as from[].
If modifier d is not present and to[] is shorter than from[], then to[] is extended by replicating the last character in to[].
Both from[] and to[] may contain ranges using the - character, for example a-d is synonymous with abcd. Neither accepts a leading ^ as meaning the complement of the string (use the c modifier for that).
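A brief sketch of three of the rules above: one-to-one replacement, extension of a short to[], and deletion with the d modifier. The inputs are made up for illustration.

```d
import std.string;

void main()
{
    // Each character of from[] maps to the character at the same position in to[].
    assert(tr("abcdef", "cd", "CD") == "abCDef");

    // to[] is shorter than the range b-d, so its last character is replicated.
    assert(tr("abcdef", "b-d", "X") == "aXXXef");

    // With modifier d and an empty to[], matching characters are removed.
    assert(tr("abcdef", "ef", "", "d") == "abcd");
}
```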
final bool isNumeric(string s, in bool bAllowSep = false);
- [in] string s can be formatted in the following ways:
Integer Whole Number:
(for byte, ubyte, short, ushort, int, uint, long, and ulong)
['+'|'-']digit(s)[U|L|UL]
Examples:
123, 123UL, 123L, +123U, -123L
Floating-Point Number:
(for float, double, real, ifloat, idouble, and ireal)
['+'|'-']digit(s)[.][digit(s)][[e-|e+]digit(s)][i|f|L|Li|fi]]
or [nan|nani|inf|-inf]
Examples:
+123., -123.01, 123.3e-10f, 123.3e-10fi, 123.3e-10L
(for cfloat, cdouble, and creal)
['+'|'-']digit(s)[.][digit(s)][[e-|e+]digit(s)][+]
[digit(s)[.][digit(s)][[e-|e+]digit(s)][i|f|L|Li|fi]]
or [nan|nani|nan+nani|inf|-inf]
Examples:
nan, -123e-1+456.9e-10Li, +123e+10+456i, 123+456
[in] bool bAllowSep
False by default. When set to true, the separator characters "," and "_" are accepted within the string, but they must be stripped from the string before using any of the conversion functions such as toInt() and toFloat(), or an error will occur.
Note that no spaces are allowed anywhere in the string, whether leading, trailing, or embedded; they too must be stripped before using this function or any of the conversion functions.
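The effect of bAllowSep and the no-spaces rule can be checked with a few assertions. This is a sketch with made-up inputs, assuming isNumeric behaves as described above.

```d
import std.string;

void main()
{
    assert(isNumeric("123"));
    assert(isNumeric("-123.01"));
    assert(!isNumeric("abc"));

    // Spaces are rejected anywhere in the string.
    assert(!isNumeric(" 123"));

    // Separators are rejected unless bAllowSep is true.
    assert(!isNumeric("1,000"));
    assert(isNumeric("1,000", true));
}
```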
- isNumeric(...): allow any object as a parameter.
bool isNumeric(TypeInfo[] _arguments, va_list _argptr);
- Check only the first parameter, all others will be ignored.
char[] soundex(string string, char[] buffer = null);
- Soundex algorithm.
The Soundex algorithm converts a word into 4 characters based on how the word sounds phonetically. The idea is that two spellings that sound alike will have the same Soundex value, which means that Soundex can be used for fuzzy matching of names.
Parameters:
string string | String to convert to Soundex representation. |
char[] buffer | Optional 4 char array to put the resulting Soundex characters into. If null, the return value buffer will be allocated on the heap. |
Returns:
The four character array with the Soundex result in it. Returns null if there is no Soundex representation for the string.
See Also:
Wikipedia, The Soundex Indexing System
BUGS:
Only works well with English names. There are other arguably better Soundex algorithms, but this one is the standard one.
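As an illustration of the fuzzy-matching idea, the sketch below compares names that sound alike. The names are arbitrary examples, and the optional buffer argument is left at its default.

```d
import std.string;

void main()
{
    // Spellings that sound alike share one Soundex value.
    assert(soundex("Robert") == "R163");
    assert(soundex("Rupert") == "R163");

    assert(soundex("Gauss") == "G200");
    assert(soundex("Ghosh") == "G200");
}
```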
string[string] abbrev(string[] values);
- Construct an associative array consisting of all abbreviations that uniquely map to the strings in values.
This is useful in cases where the user is expected to type in one of a known set of strings, and the program will helpfully autocomplete the string once sufficient characters have been entered that uniquely identify it.
Example:
import std.stdio;
import std.string;

void main()
{
    static string[] list = [ "food", "foxy" ];
    auto abbrevs = std.string.abbrev(list);
    foreach (key, value; abbrevs)
    {
        writefln("%s => %s", key, value);
    }
}
produces the output:
fox => foxy
food => food
foxy => foxy
foo => food
size_t column(string str, int tabsize = 8);
- Compute column number after string if string starts in the leftmost column, which is numbered starting from 0.
string wrap(string s, int columns = 80, string firstindent = null, string indent = null, int tabsize = 8);
- Wrap text into a paragraph.
The input text string s is formed into a paragraph by breaking it up into a sequence of lines, delineated by \n, such that the number of columns is not exceeded on each line. The last line is terminated with a \n.
Parameters:
string s | text string to be wrapped |
int columns | maximum number of columns in the paragraph |
string firstindent | string used to indent first line of the paragraph |
string indent | string to use to indent following lines of the paragraph |
int tabsize | column spacing of tabs |
Returns:
The resulting paragraph.
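A short sketch of wrap with a narrow column limit and the default indents. The input text and column counts are made up for illustration.

```d
import std.string;

void main()
{
    // "a short" fits in 7 columns; "string" moves to the next line.
    assert(wrap("a short string", 7) == "a short\nstring\n");

    // A word longer than the limit is not split; it gets a line of its own.
    assert(wrap("a short string", 4) == "a\nshort\nstring\n");
}
```

Note that the result always ends with \n, including after the last line.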
struct ByCodeUnit(Range, Unit) if (isInputRange!(Range) && staticIndexOf!(Unqual!(Unit), char, wchar, dchar) >= 0 && staticIndexOf!(Unqual!(ElementType!(Range)), char, wchar, dchar) >= 0 && !is(Unqual!(ElementType!(Range)) == Unqual!(Unit)));
bool empty();
ElementType front();
ElementType back();
void popBack();
- Range primitives