Last update Aug 16, 2003

D Strings vs C++ Strings

Why have strings built into the core language of D rather than entirely in a library, as with C++ strings? What's the point? Where's the improvement?

Concatenation Operator

C++ strings are stuck with overloading existing operators. The obvious choices for concatenation are += and +. But someone just looking at the code will see + and think "addition". The reader has to look up the types (which are frequently buried behind multiple typedefs) to see that the operands are strings and that the code is concatenating them, not adding them.

Additionally, if one has an array of floats, is '+' overloaded to be the same as a vector addition, or an array concatenation?

In D, these problems are avoided by introducing a new binary operator ~ as the concatenation operator. It works with arrays (of which strings are a subset). ~= is the corresponding append operator. ~ on arrays of floats would concatenate them, while + would imply a vector add. Having a separate operator keeps the treatment of arrays orthogonal and consistent. (In D, strings are simply arrays of characters, not a special type.)

Interoperability With C String Syntax

Overloading of operators only really works if at least one of the operands is of an overloadable (class) type. So the C++ string class cannot consistently handle arbitrary expressions containing strings. Consider:
	const char abc[] = "world";
	string str = "hello" + abc;
That isn't going to work: both operands decay to const char*, and there is no + for two pointers. But it does work when the core language knows about strings:
	const char[5] abc = "world";
	char[] str = "hello" ~ abc;
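On the C++ side, the usual workaround is to convert one operand to a string explicitly so that the overloaded operator+ applies. A minimal sketch, in which the helper name greet is made up for illustration:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: concatenates the literal "hello" with a C string.
// Writing "hello" + abc directly does not compile, because both operands
// are plain character pointers and no operator+ exists for two pointers.
std::string greet(const char* abc)
{
    assert(abc != nullptr);  // std::string cannot be built from a null pointer
    // Making the left operand a std::string selects the overloaded operator+.
    return std::string("hello") + abc;
}
```

Calling greet("world") yields "helloworld". The explicit conversion is exactly the kind of special case the D version does not need.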

Consistency With C String Syntax

There are three ways to find the length of a string in C++:
	const char abc[] = "world";	:	sizeof(abc)/sizeof(abc[0])-1
					:	strlen(abc)
	string str			:	str.length()
That kind of inconsistency makes it hard to write generic templates. Consider D:
	char[5] abc = "world";	:	abc.length
	char[] str		:	str.length
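To make the C++ inconsistency concrete, the three expressions can be placed side by side. This is only a sketch; the function name world_length is an assumption made for the example:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical demo: computes the length of "world" in the three ways
// listed above and checks that they agree.
std::size_t world_length()
{
    const char abc[] = "world";
    std::string str = abc;

    std::size_t by_sizeof = sizeof(abc) / sizeof(abc[0]) - 1;  // array only, minus the NUL
    std::size_t by_strlen = std::strlen(abc);                  // any NUL-terminated buffer
    std::size_t by_member = str.length();                      // std::string only

    assert(by_sizeof == by_strlen && by_strlen == by_member);
    return by_member;
}
```

Each expression is valid for a different kind of operand, which is why a template written against one of them does not work with the others.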

Checking For Empty Strings

C++ strings use a function to determine if a string is empty:
	string str;
	if (str.empty())
		// string is empty
In D, an empty string is just null:
	char[] str;
	if (!str)
		// string is empty

Resizing Existing String

C++ handles this with the resize() member function:
	string str;
	str.resize(newsize);
D takes advantage of knowing that str is a string, and so resizing it is just changing the length property:
	char[] str;
	str.length = newsize;

Slicing a String

C++ slices an existing string using a special constructor:
	string s1 = "hello world";
	string s2(s1, 6, 5);		// s2 is "world"
D has the array slice syntax, not possible with C++:
	char[] s1 = "hello world";
	char[] s2 = s1[6 .. 11];	// s2 is "world"
Slicing, of course, works with any array in D, not just strings.
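The two notations also differ in how they describe the range: the C++ forms take a position and a count, while a D slice takes a begin and an end index. A small C++ sketch of the mapping, where slice is a hypothetical helper and not a library function:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: maps a half-open range [begin, end), as in the D
// slice s[begin .. end], onto the (position, count) form used by substr.
std::string slice(const std::string& s, std::size_t begin, std::size_t end)
{
    assert(begin <= end && end <= s.size());
    return s.substr(begin, end - begin);  // returns a new string (a copy)
}
```

With this, slice("hello world", 6, 11) gives "world", matching s1[6 .. 11] in the D example. The C++ result is always a separate copy of the characters.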

Copying a String

C++ copies characters out of a string with the copy() member function, whose destination is a raw character buffer rather than another string:
	string s1 = "hello world";
	char s2[] = "goodbye      ";
	s1.copy(s2, 5, 6);	// s2 is "worldye      "
D allows copying anywhere in the string:
	char[] s1 = "hello world";
	char[] s2 = "goodbye      ";
	s2[8..13] = s1[6..11];		// s2 is "goodbye world"
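For comparison, getting the same effect as the D slice assignment between two C++ strings takes the five-argument form of replace(). The following is a sketch under that assumption, and copy_into is a hypothetical name:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: overwrites count characters of dest, starting at
// dest_pos, with count characters of src, starting at src_pos.
std::string copy_into(std::string dest, std::size_t dest_pos,
                      const std::string& src, std::size_t src_pos,
                      std::size_t count)
{
    assert(dest_pos + count <= dest.size());
    assert(src_pos + count <= src.size());
    // replace(pos, len, str, subpos, sublen): same length in and out,
    // so the size of dest does not change.
    dest.replace(dest_pos, count, src, src_pos, count);
    return dest;
}
```

Called with "goodbye" followed by six spaces as dest, 8 as dest_pos, "hello world" as src, 6 as src_pos and 5 as count, it returns "goodbye world".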

Conversions to C Strings

This is needed for compatibility with C APIs. In C++, this uses the c_str() member function:
	void foo(const char *);
	string s1;
	foo(s1.c_str());
In D, strings can be implicitly converted to char*:
	void foo(char *);
	char[] s1;
	foo(s1);
Note: some will argue that it is a mistake in D to have an implicit conversion from char[] to char*.
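As a concrete instance of the C++ side, here is a sketch that hands a string to a real C function, strlen(), standing in for the placeholder foo. The wrapper name c_length is an assumption made for the example:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical wrapper: passes a std::string to a C API expecting a
// NUL-terminated const char*. The conversion has to be spelled out.
std::size_t c_length(const std::string& s)
{
    return std::strlen(s.c_str());
}
```

c_length("hello") returns 5. The pointer from c_str() is only valid while the string is alive and unmodified.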

Taking a Reference to a Single Character

In C++, use the at() member function:
	string str = "hello";
	str.at(2);		// refer to first 'l'
In D, use the usual & operator:
	char[] str = "hello";
	&str[2];
Or, it can directly be used as an out parameter:
	void foo(out char c);

	foo(str[2]);

Array Bounds Checking

In C++, the [] operator on a string does no bounds checking (only the at() member function checks). In D, array bounds checking is on by default, and it can be turned off with a compiler switch after the program is debugged.
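In C++, a checked access is available only by switching from [] to the at() member function, which throws std::out_of_range for a bad index. A minimal sketch, where at_rejects is a hypothetical helper written for this example:

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: reports whether s.at(index) rejects the index.
// The same index used with s[index] would not be checked at all.
bool at_rejects(const std::string& s, std::size_t index)
{
    try {
        (void)s.at(index);   // checked access
        return false;        // index was in range
    } catch (const std::out_of_range&) {
        return true;         // index was out of range
    }
}
```

at_rejects("hello", 10) is true and at_rejects("hello", 2) is false. The programmer has to choose the checked form at every call site, whereas the D check is a build setting.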

String Switch Statements

String switch statements are not possible in C++, nor is there any way to add them by adding more to the library. In D, they take the obvious syntactic form:
	switch (str)
	{
	    case "hello":
	    case "world":
		...
	}
where str can be a string literal, a fixed-size string array like char[10], or a dynamic string like char[]. A quality implementation can, of course, explore many strategies for implementing this efficiently, based on the contents of the case strings.
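The usual C++ substitute is a chain of comparisons. The sketch below mirrors the D example, in which both cases share one body; the function name is_known_word is made up for illustration:

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the D switch above: both case labels lead to
// the same body, so the two comparisons are combined with ||.
bool is_known_word(const std::string& str)
{
    if (str == "hello" || str == "world") {
        return true;    // body shared by case "hello" and case "world"
    }
    return false;       // no case matched
}
```

Each comparison is evaluated in turn, so the compiler has no single construct to optimize as a whole the way it can with a switch.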

Filling a String

In C++, this is done with the replace() member function:
	string str = "hello";
	str.replace(1,2,2,'?');		// str is "h??lo"
In D, use the array slicing syntax in the natural manner:
	char[] str = "hello";
	str[1..3] = '?';		// str is "h??lo"
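The arguments of the C++ replace() call are position, length to remove, number of fill characters, and fill character. A sketch that wraps this in a begin/end interface closer to the D slice, with fill_range being a hypothetical helper:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: fills the half-open range [begin, end) of s with c,
// using the (pos, len, n, c) overload of std::string::replace.
std::string fill_range(std::string s, std::size_t begin, std::size_t end, char c)
{
    assert(begin <= end && end <= s.size());
    std::size_t count = end - begin;
    s.replace(begin, count, count, c);  // remove count chars, insert count copies of c
    return s;
}
```

fill_range("hello", 1, 3, '?') returns "h??lo": the two characters "el" are replaced and the string keeps its length of five.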

Copyright (c) 2003 by Digital Mars, All Rights Reserved