Rule of five and move semantics for a string classCanonical Implementation of Move SemanticsYet another 'any'...
Is it true that real estate prices mainly go up?
How did Alan Turing break the enigma code using the hint given by the lady in the bar?
How can The Temple of Elementary Evil reliably protect itself against kinetic bombardment?
Are babies of evil humanoid species inherently evil?
Why would one plane in this picture not have gear down yet?
Good for you! in Russian
Is "history" a male-biased word ("his+story")?
Built-In Shelves/Bookcases - IKEA vs Built
Are there historical instances of the capital of a colonising country being temporarily or permanently shifted to one of its colonies?
Is Gradient Descent central to every optimizer?
Reverse string, can I make it faster?
Force user to remove USB token
Finding algorithms of QGIS commands?
Why was Goose renamed from Chewie for the Captain Marvel film?
What Happens when Passenger Refuses to Fly Boeing 737 Max?
infinitive telling the purpose
Can one live in the U.S. and not use a credit card?
They call me Inspector Morse
Signed and unsigned numbers
Should I take out a loan for a friend to invest on my behalf?
How much stiffer are 23c tires over 28c?
How do you like my writing?
Can someone explain what is being said here in color publishing in the American Mathematical Monthly?
PTIJ: where are Tzafra and Urta located?
Rule of five and move semantics for a string class
Canonical Implementation of Move SemanticsYet another 'any' class implementation, named 'some'Rule Of Three for a Coordinate classStack implementation with move semanticsBenchmarking copy semantics vs. move semanticsTemplated double ended queue (deque) move semantics edge cases C++C++ String classWriting order and move assignment when following the Rule of FiveString class in C++C++ Updated Stack Code
$begingroup$
I have been learning about the rule of five and I tried to implement an example myself. It all works exactly as I'd expect, but I was wondering if I've missed any obvious issues or easier/faster ways of implementing the constructors and operators.
class string {
public:
string() : data(nullptr) {
std::cout << "Default ctor" << std::endl;
}
string(const char * p, size_t size) : m_size(size) { //default constructor
std::cout << "Special ctor" << std::endl;
data = new char[m_size];
memcpy(data, p, m_size);
}
~string() noexcept { //default destructor
std::cout << "Default dtorrn";
delete[] data;
}
string(const string & str) { //copy constructor
std::cout << "Copy constructor" << std::endl;
m_size = str.m_size;
data = new char[m_size];
memcpy(data,str.data,m_size);
}
string(string && str) { //move constructor
std::cout << "Move constructor" << std::endl;
m_size = str.m_size;
data = str.data; //swap the values
str.data = nullptr;
}
string& operator=(const string & str) { //copy operator
std::cout << "Copy assignment" << std::endl;
if (this == &str) return * this;
delete[] data; //can safely delete a nullptr
m_size = str.m_size;
char * new_data = new char[m_size]; //we have to make a copy as this is not the move operator
memcpy(new_data,str.data,m_size);
data = new_data;
return * this;
}
string& operator=(string && str) { //move assignment operator
std::cout << "Move assignment" << std::endl;
if (this == &str) return * this;
delete[] data; //delete the old data, will be ok because data is always initialised to nullptr
m_size = str.m_size;
data = str.data; //swap over the pointers
str.data = nullptr;
return * this;
}
string& operator+=(const string & str) {
size_t new_size = m_size + str.m_size; //we need to create a new block of memory with the new appropriate size
char * new_data = new char[new_size];
memmove(new_data,data,m_size); //move the old memory
strcat(new_data,str.data); //concat the new memory
delete[] data; //delete the old data
m_size = new_size;
data = new_data;
return *this;
}
string operator+(const string & str) {
return static_cast<string&&>(string(*this) += str); //could be replaced with std::move() but I use static cast for education purposes
}
private:
char * data;
size_t m_size;
friend std::ostream& operator<<(std::ostream& os, const string& s) {
os << s.data << "rn";
return os;
}
};
int main() {
const char t1[] = "text";
const char t2[] = "more text";
const char * p1 = t1;
const char * p2 = t2;
string a(p1,sizeof(t1));
string b(p2,sizeof(t2));
string c(a);
string d(a + b);
string e;
a += b;
e = d;
c = b + a;
std::cout << "a:t" << a;
std::cout << "b:t" << b;
std::cout << "c:t" << c;
std::cout << "d:t" << d;
std::cout << "e:t" << e;
return 0;
}
I ran it with valgrind and recieved the following output: I am concerned that there are 11 uses of alloc and only 10 frees? I am vaguely familiar with the difference but not confident enough to tell if it's a mistake I've made or expected output?
==24703== HEAP SUMMARY:
==24703== in use at exit: 72,704 bytes in 1 blocks
==24703== total heap usage: 11 allocs, 10 frees, 73,833 bytes allocated
==24703==
==24703== LEAK SUMMARY:
==24703== definitely lost: 0 bytes in 0 blocks
==24703== indirectly lost: 0 bytes in 0 blocks
==24703== possibly lost: 0 bytes in 0 blocks
==24703== still reachable: 72,704 bytes in 1 blocks
==24703== suppressed: 0 bytes in 0 blocks
==24703== Rerun with --leak-check=full to see details of leaked memory
==24703==
==24703== For counts of detected and suppressed errors, rerun with: -v
==24703== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
c++ strings reinventing-the-wheel memory-management
$endgroup$
add a comment |
$begingroup$
I have been learning about the rule of five and I tried to implement an example myself. It all works exactly as I'd expect, but I was wondering if I've missed any obvious issues or easier/faster ways of implementing the constructors and operators.
class string {
public:
string() : data(nullptr) {
std::cout << "Default ctor" << std::endl;
}
string(const char * p, size_t size) : m_size(size) { //default constructor
std::cout << "Special ctor" << std::endl;
data = new char[m_size];
memcpy(data, p, m_size);
}
~string() noexcept { //default destructor
std::cout << "Default dtorrn";
delete[] data;
}
string(const string & str) { //copy constructor
std::cout << "Copy constructor" << std::endl;
m_size = str.m_size;
data = new char[m_size];
memcpy(data,str.data,m_size);
}
string(string && str) { //move constructor
std::cout << "Move constructor" << std::endl;
m_size = str.m_size;
data = str.data; //swap the values
str.data = nullptr;
}
string& operator=(const string & str) { //copy operator
std::cout << "Copy assignment" << std::endl;
if (this == &str) return * this;
delete[] data; //can safely delete a nullptr
m_size = str.m_size;
char * new_data = new char[m_size]; //we have to make a copy as this is not the move operator
memcpy(new_data,str.data,m_size);
data = new_data;
return * this;
}
string& operator=(string && str) { //move assignment operator
std::cout << "Move assignment" << std::endl;
if (this == &str) return * this;
delete[] data; //delete the old data, will be ok because data is always initialised to nullptr
m_size = str.m_size;
data = str.data; //swap over the pointers
str.data = nullptr;
return * this;
}
string& operator+=(const string & str) {
size_t new_size = m_size + str.m_size; //we need to create a new block of memory with the new appropriate size
char * new_data = new char[new_size];
memmove(new_data,data,m_size); //move the old memory
strcat(new_data,str.data); //concat the new memory
delete[] data; //delete the old data
m_size = new_size;
data = new_data;
return *this;
}
string operator+(const string & str) {
return static_cast<string&&>(string(*this) += str); //could be replaced with std::move() but I use static cast for education purposes
}
private:
char * data;
size_t m_size;
friend std::ostream& operator<<(std::ostream& os, const string& s) {
os << s.data << "rn";
return os;
}
};
int main() {
const char t1[] = "text";
const char t2[] = "more text";
const char * p1 = t1;
const char * p2 = t2;
string a(p1,sizeof(t1));
string b(p2,sizeof(t2));
string c(a);
string d(a + b);
string e;
a += b;
e = d;
c = b + a;
std::cout << "a:t" << a;
std::cout << "b:t" << b;
std::cout << "c:t" << c;
std::cout << "d:t" << d;
std::cout << "e:t" << e;
return 0;
}
I ran it with valgrind and recieved the following output: I am concerned that there are 11 uses of alloc and only 10 frees? I am vaguely familiar with the difference but not confident enough to tell if it's a mistake I've made or expected output?
==24703== HEAP SUMMARY:
==24703== in use at exit: 72,704 bytes in 1 blocks
==24703== total heap usage: 11 allocs, 10 frees, 73,833 bytes allocated
==24703==
==24703== LEAK SUMMARY:
==24703== definitely lost: 0 bytes in 0 blocks
==24703== indirectly lost: 0 bytes in 0 blocks
==24703== possibly lost: 0 bytes in 0 blocks
==24703== still reachable: 72,704 bytes in 1 blocks
==24703== suppressed: 0 bytes in 0 blocks
==24703== Rerun with --leak-check=full to see details of leaked memory
==24703==
==24703== For counts of detected and suppressed errors, rerun with: -v
==24703== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
c++ strings reinventing-the-wheel memory-management
$endgroup$
$begingroup$
In your default constructor, you forgot to initializem_size. Don't assume it will be automatically initialized to 0.
$endgroup$
– ncalmbeblpaicr0011
Dec 15 '16 at 19:24
$begingroup$
One thing that surprised me is that the ostream output operator is in aprivate:section, but is still accessible frommain()- presumably being declaredfriendmakes it visible? I'm still learning here...
$endgroup$
– Toby Speight
Dec 15 '16 at 22:40
add a comment |
$begingroup$
I have been learning about the rule of five and I tried to implement an example myself. It all works exactly as I'd expect, but I was wondering if I've missed any obvious issues or easier/faster ways of implementing the constructors and operators.
class string {
public:
string() : data(nullptr) {
std::cout << "Default ctor" << std::endl;
}
string(const char * p, size_t size) : m_size(size) { //default constructor
std::cout << "Special ctor" << std::endl;
data = new char[m_size];
memcpy(data, p, m_size);
}
~string() noexcept { //default destructor
std::cout << "Default dtorrn";
delete[] data;
}
string(const string & str) { //copy constructor
std::cout << "Copy constructor" << std::endl;
m_size = str.m_size;
data = new char[m_size];
memcpy(data,str.data,m_size);
}
string(string && str) { //move constructor
std::cout << "Move constructor" << std::endl;
m_size = str.m_size;
data = str.data; //swap the values
str.data = nullptr;
}
string& operator=(const string & str) { //copy operator
std::cout << "Copy assignment" << std::endl;
if (this == &str) return * this;
delete[] data; //can safely delete a nullptr
m_size = str.m_size;
char * new_data = new char[m_size]; //we have to make a copy as this is not the move operator
memcpy(new_data,str.data,m_size);
data = new_data;
return * this;
}
string& operator=(string && str) { //move assignment operator
std::cout << "Move assignment" << std::endl;
if (this == &str) return * this;
delete[] data; //delete the old data, will be ok because data is always initialised to nullptr
m_size = str.m_size;
data = str.data; //swap over the pointers
str.data = nullptr;
return * this;
}
string& operator+=(const string & str) {
size_t new_size = m_size + str.m_size; //we need to create a new block of memory with the new appropriate size
char * new_data = new char[new_size];
memmove(new_data,data,m_size); //move the old memory
strcat(new_data,str.data); //concat the new memory
delete[] data; //delete the old data
m_size = new_size;
data = new_data;
return *this;
}
string operator+(const string & str) {
return static_cast<string&&>(string(*this) += str); //could be replaced with std::move() but I use static cast for education purposes
}
private:
char * data;
size_t m_size;
friend std::ostream& operator<<(std::ostream& os, const string& s) {
os << s.data << "rn";
return os;
}
};
int main() {
const char t1[] = "text";
const char t2[] = "more text";
const char * p1 = t1;
const char * p2 = t2;
string a(p1,sizeof(t1));
string b(p2,sizeof(t2));
string c(a);
string d(a + b);
string e;
a += b;
e = d;
c = b + a;
std::cout << "a:t" << a;
std::cout << "b:t" << b;
std::cout << "c:t" << c;
std::cout << "d:t" << d;
std::cout << "e:t" << e;
return 0;
}
I ran it with valgrind and recieved the following output: I am concerned that there are 11 uses of alloc and only 10 frees? I am vaguely familiar with the difference but not confident enough to tell if it's a mistake I've made or expected output?
==24703== HEAP SUMMARY:
==24703== in use at exit: 72,704 bytes in 1 blocks
==24703== total heap usage: 11 allocs, 10 frees, 73,833 bytes allocated
==24703==
==24703== LEAK SUMMARY:
==24703== definitely lost: 0 bytes in 0 blocks
==24703== indirectly lost: 0 bytes in 0 blocks
==24703== possibly lost: 0 bytes in 0 blocks
==24703== still reachable: 72,704 bytes in 1 blocks
==24703== suppressed: 0 bytes in 0 blocks
==24703== Rerun with --leak-check=full to see details of leaked memory
==24703==
==24703== For counts of detected and suppressed errors, rerun with: -v
==24703== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
c++ strings reinventing-the-wheel memory-management
$endgroup$
I have been learning about the rule of five and I tried to implement an example myself. It all works exactly as I'd expect, but I was wondering if I've missed any obvious issues or easier/faster ways of implementing the constructors and operators.
class string {
public:
string() : data(nullptr) {
std::cout << "Default ctor" << std::endl;
}
string(const char * p, size_t size) : m_size(size) { //default constructor
std::cout << "Special ctor" << std::endl;
data = new char[m_size];
memcpy(data, p, m_size);
}
~string() noexcept { //default destructor
std::cout << "Default dtorrn";
delete[] data;
}
string(const string & str) { //copy constructor
std::cout << "Copy constructor" << std::endl;
m_size = str.m_size;
data = new char[m_size];
memcpy(data,str.data,m_size);
}
string(string && str) { //move constructor
std::cout << "Move constructor" << std::endl;
m_size = str.m_size;
data = str.data; //swap the values
str.data = nullptr;
}
string& operator=(const string & str) { //copy operator
std::cout << "Copy assignment" << std::endl;
if (this == &str) return * this;
delete[] data; //can safely delete a nullptr
m_size = str.m_size;
char * new_data = new char[m_size]; //we have to make a copy as this is not the move operator
memcpy(new_data,str.data,m_size);
data = new_data;
return * this;
}
string& operator=(string && str) { //move assignment operator
std::cout << "Move assignment" << std::endl;
if (this == &str) return * this;
delete[] data; //delete the old data, will be ok because data is always initialised to nullptr
m_size = str.m_size;
data = str.data; //swap over the pointers
str.data = nullptr;
return * this;
}
string& operator+=(const string & str) {
size_t new_size = m_size + str.m_size; //we need to create a new block of memory with the new appropriate size
char * new_data = new char[new_size];
memmove(new_data,data,m_size); //move the old memory
strcat(new_data,str.data); //concat the new memory
delete[] data; //delete the old data
m_size = new_size;
data = new_data;
return *this;
}
string operator+(const string & str) {
return static_cast<string&&>(string(*this) += str); //could be replaced with std::move() but I use static cast for education purposes
}
private:
char * data;
size_t m_size;
friend std::ostream& operator<<(std::ostream& os, const string& s) {
os << s.data << "rn";
return os;
}
};
int main() {
const char t1[] = "text";
const char t2[] = "more text";
const char * p1 = t1;
const char * p2 = t2;
string a(p1,sizeof(t1));
string b(p2,sizeof(t2));
string c(a);
string d(a + b);
string e;
a += b;
e = d;
c = b + a;
std::cout << "a:t" << a;
std::cout << "b:t" << b;
std::cout << "c:t" << c;
std::cout << "d:t" << d;
std::cout << "e:t" << e;
return 0;
}
I ran it with valgrind and recieved the following output: I am concerned that there are 11 uses of alloc and only 10 frees? I am vaguely familiar with the difference but not confident enough to tell if it's a mistake I've made or expected output?
==24703== HEAP SUMMARY:
==24703== in use at exit: 72,704 bytes in 1 blocks
==24703== total heap usage: 11 allocs, 10 frees, 73,833 bytes allocated
==24703==
==24703== LEAK SUMMARY:
==24703== definitely lost: 0 bytes in 0 blocks
==24703== indirectly lost: 0 bytes in 0 blocks
==24703== possibly lost: 0 bytes in 0 blocks
==24703== still reachable: 72,704 bytes in 1 blocks
==24703== suppressed: 0 bytes in 0 blocks
==24703== Rerun with --leak-check=full to see details of leaked memory
==24703==
==24703== For counts of detected and suppressed errors, rerun with: -v
==24703== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
c++ strings reinventing-the-wheel memory-management
c++ strings reinventing-the-wheel memory-management
edited Dec 15 '16 at 21:40
200_success
130k17153419
130k17153419
asked Dec 14 '16 at 20:34
TomJTomJ
1516
1516
$begingroup$
In your default constructor, you forgot to initializem_size. Don't assume it will be automatically initialized to 0.
$endgroup$
– ncalmbeblpaicr0011
Dec 15 '16 at 19:24
$begingroup$
One thing that surprised me is that the ostream output operator is in aprivate:section, but is still accessible frommain()- presumably being declaredfriendmakes it visible? I'm still learning here...
$endgroup$
– Toby Speight
Dec 15 '16 at 22:40
add a comment |
$begingroup$
In your default constructor, you forgot to initializem_size. Don't assume it will be automatically initialized to 0.
$endgroup$
– ncalmbeblpaicr0011
Dec 15 '16 at 19:24
$begingroup$
One thing that surprised me is that the ostream output operator is in aprivate:section, but is still accessible frommain()- presumably being declaredfriendmakes it visible? I'm still learning here...
$endgroup$
– Toby Speight
Dec 15 '16 at 22:40
$begingroup$
In your default constructor, you forgot to initialize
m_size. Don't assume it will be automatically initialized to 0.$endgroup$
– ncalmbeblpaicr0011
Dec 15 '16 at 19:24
$begingroup$
In your default constructor, you forgot to initialize
m_size. Don't assume it will be automatically initialized to 0.$endgroup$
– ncalmbeblpaicr0011
Dec 15 '16 at 19:24
$begingroup$
One thing that surprised me is that the ostream output operator is in a
private: section, but is still accessible from main() - presumably being declared friend makes it visible? I'm still learning here...$endgroup$
– Toby Speight
Dec 15 '16 at 22:40
$begingroup$
One thing that surprised me is that the ostream output operator is in a
private: section, but is still accessible from main() - presumably being declared friend makes it visible? I'm still learning here...$endgroup$
– Toby Speight
Dec 15 '16 at 22:40
add a comment |
2 Answers
2
active
oldest
votes
$begingroup$
Comments
Let's consider one of your constructors, and its accompanying comment:
string(const char * p, size_t size) : m_size(size) { //default constructor
At least in my opinion, this has at least two obvious problems. This first problem is that the comment is just wrong. A default constructor is one that can be invoked without supplying any arguments. In this case, two arguments must be supplied, so it's not a default ctor.
The second problem is almost simpler. As it stands right now, the comment is pointless. If the comment were corrected (or the code changed so it really was a default constructor), it would still wouldn't give any useful information. It's just telling me what's already obvious from the function's parameter list--it's a default ctor if and only if it can be invoked without supplying any arguments.
Comments should convey information that's not immediately obvious from the code itself. Most often, this is things like why you wrote the code the way it is.
The same applies to most of the other comments as well--everything they try to convey is either wrong or obvious (or, as above, both).
Non-templated string type
The days of strings supporting only char as the data type are (IMO) long past. Whether you like Unicode or not, it's better to support it (and for a string, it's much cleaner to manipulate UTF-32 than UTF-8).
Array new
The array form of new should generally be avoided. In the specific case of char (or unsigned char) it's marginally less evil than usual, but I'd advise avoiding it in general. For containers (and IMO, string is a container) you want to use operator new to allocate "raw" memory, then use placement new to construct objects in that raw memory. It's kind of overkill in the specific case of char, but at worst it's harmless, and at best it's a drastic improvement (it leaves the unused part of the memory as raw memory, where new [] always creates objects in all the memory you allocate).
operator+
I'm not particularly fond of your current implementation of operator+. It works, but (IMO) is fairly easy for somebody to misinterpret in subtle ways that could lead to problems down the road.
I'd do pretty much the same things, but split it into smaller pieces that made the intent a little more obvious:
string operator+(const string & str) const {
string r(*this);
r += str;
return r;
}
I'd have to think things through in detail to be sure, there's at least a reasonable chance that this is actually more efficient. In particular, it allows copy elision of the return value, where casting to an rvalue (whether done explicitly or in the form of std::move) can inhibit that.
Alternative Rule
In most case, you want to follow the rule of zero rather than the rule of five. In simple form, this says you should delegate essentially all the resource management to a class (most often unique_ptr or shared_ptr) devoted solely to that purpose, so your class only needs to deal with whatever it's really supposed to do. In the case of a (non-COW) string, each string representation has only a single owner, so this should be std::unique_ptr.
Copy assignment
Your current copy assignment operator isn't safe in the face of exceptions. Right now you have a sequence of deleting the existing data, then allocating new space, and then copying the new data into the new space. If (for example) that new throws an exception, your string won't contain either the old data or the new data. Since it's a constructed string, its destructor will execute when that scope is exited. That will attempt to delete the buffer pointed to by its data--which you just deleted, so you get a double delete, giving undefined behavior.
One easy way to get defined behavior is the copy and swap idiom, which could look something like this:
string& operator=(string str) {
std::cout << "Copy assignmentn";
using std::swap;
swap(data, str.data);
swap(m_size, str.m_size);
return *this;
}
Since we pass the parameter by value, the copy constructor is invoked to create a string for the parameter. We then essentially do a move from there to the target. To keep things as simple as possible, we swap the guts of the two strings, so when the function returns, the parameter goes out of scope, and gets destroyed automatically.
In this case, the allocation happens in the ctor, so if it throws, this function is never invoked at all. The function itself contains only operations that we know can't throw, so we don't have to worry about what would happen if an exception were thrown in the middle of it.
This isn't the only possible way, of course, but it's one way that's pretty easy to get right. More important, perhaps, is establishing the basic pattern:
Phase 1: carry out things that might throw, but don't affect any existing data if they do.
Phase 2: carry out operation that affect existing data, but can't throw.
This way, if anything throws, Phase 2 never happens at all, so existing data can never be affected. If phase 2 starts, it always executes completely. Therefore, we always get either the state from before we started this operation at all, or else the state after it finished completely. The outside world will never see an intermediate state.
Of course, move assignment can be handled much the same way, except taking an rvalue reference as the parameter. We can still swap the "guts" of the current string with those of the parameter, and then let the dtor take care of destroying the parameter when it goes out of scope:
string& operator=(string && str) {
std::cout << "Move assignmentn";
using std::swap;
swap(data, str.data);
swap(m_size, str.m_size);
return *this;
}
Use of strcat
As it stands right now, I'd call your use of strcat (e.g., in operator+=) basically a bug. To use strcat, you'd need to guarantee that every string contains exactly one NUL character at its very end.
As it stands now, a string can be created that contains NUL characters in its middle. A string can also be created that doesn't contain any NUL character at its end.
To work correctly, you could use memcpy for both operations:
string& operator+=(const string & str) {
size_t new_size = m_size + str.m_size;
char * new_data = new char[new_size];
mempy(new_data, data, m_size);
memcpy(new_data+m_size, str.data, str.m_size);
delete[] data;
m_size = new_size;
data = new_data;
return *this;
}
Note that your use of memmove would work correctly, but was overkill. memmove guarantees correct behavior even if the old and new locations might overlap. To do that, it has to check for overlap, and copy from beginning to end for overlap in one direction, and end to beginning for overlap in the other direction. In this case, we know there will never be any overlap, so we can use memcpy which only supports non-overlapping buffers (so it avoids checking for overlap).
$endgroup$
add a comment |
$begingroup$
When I run your code, Valgrind reports 11 allocations and a matching 11 frees, with no leaks. So I don't see why your run indicated a non-freed block.
I did have to add includes for <cstring> and <iostream> to get it to compile - please don't omit your includes, even if the code is long! Other than that, this code was complete; it's clear and easy to read.
Some observations:
Initializers
g++ -Weffc++ turns up some missing initialisers that are worth addressing. For example:
string() : data(nullptr) {
std::cout << "Default ctor" << std::endl;
}
Would be better if it also initialized m_size:
string()
: data(nullptr),
m_size(0)
{
std::cout << "Default ctor" << std::endl;
}
And this one (the comment lies):
string(const char * p, size_t size) : m_size(size) { //default constructor
std::cout << "Special ctor" << std::endl;
data = new char[m_size];
memcpy(data, p, m_size);
}
could be
string(const char *p, size_t size)
: data(new char[size]),
m_size(size)
{
std::cout << "Converting ctor" << std::endl;
std::memcpy(data, p, m_size);
}
I prefer one initialiser per line, as I do for assignments - but you're welcome to disagree.
Constructors - general
I'd add a constructor that takes a null-terminated string:
string(const char *p) : string(p, strlen(p)) {}
Assignment operator
You can simplify your assignment operator using the copy and swap idiom:
friend void swap(string &a, string& b)
{
using std::swap;
swap(a.data, b.data);
swap(a.m_size, b.m_size);
}
string& operator=(string str) {
std::cout << "Copy assignment" << std::endl;
swap(*this, str);
return *this;
}
Writing it like this means that you don't need to overload it on const string& and string&& - passing by value will construct str using the appropriate constructor. See the linked answer for details.
Copy-and-swap does have some performance cost to gain that simplicity and safety. See Scott Meyers's article Drawbacks of Implementing Move Assignment in Terms of Swap for a full explanation. If that's a concern, then you should keep the overload for move-assignment.
Append operator
You use strcat() to append the second string to the first, but that assumes that both are null-terminated. You need to use memmove() (preferably memcpy) for both:
string& operator+=(const string & str) {
size_t new_size = m_size + str.m_size;
char * new_data = new char[new_size];
memcpy(new_data, data, m_size);
memcpy(new_data+m_size, str.data, str.m_size);
delete[] data;
m_size = new_size;
data = new_data;
return *this;
}
The ostream operator also depends on a trailing NUL, so here's a safe replacement:
friend std::ostream& operator<<(std::ostream& os, const string& s) {
for (size_t i = 0; i < s.m_size; ++i)
os << s.data[i];
return os << "n";
}
Addition operator
Addition is conventionally a const method; I'm not very fond of casts:
string operator+(const string & str) const {
string s{*this};
return s += str;
}
You may prefer to add a constructor taking two (or more - you might employ std::initializer_list or C++17 fold expressions) string objects, to save allocating temporary space which is immediately freed and reallocated:
public:
string operator+(const string & str) const {
return {*this, str};
}
private:
string(const string& a, const string& b)
: data{new char[a.m_size + b.m_size]},
m_size{a.m_size + b.m_size}
{
if (a.data)
memcpy(data, a.data, a.m_size);
if (b.data)
memcpy(data+a.m_size, b.data, b.m_size);
}
That concatenation constructor could be more concise if you swap the order of m_size and data in the class.
Tests
It's great that you include the tests in the review. It helps show how you expect your class to be used, and it helps identify tests you've missed. In my opinion, all review requests should include tests.
Here I changed the initializer of a to not include its NUL and that of b to use my additional constructor:
string a(p1,sizeof t1-1);
string b(p2);
I took out the assignment e=d; so that I could test operator+() with the empty string:
std::cout << "a+e:t" << a+e;
std::cout << "e+e:t" << e+e;
std::cout << "e+a:t" << e+a;
It would be worth implementing the == operator soon, and then you could make the tests self-checking (affecting the exit status, rather than requiring manual inspection).
$endgroup$
$begingroup$
An additional observation:operator+=()can avoid some allocations. Check if*thisis empty (return *this = str;) or ifstris empty -return *this;.
$endgroup$
– Toby Speight
Jan 19 '17 at 7:18
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
return StackExchange.using("mathjaxEditing", function () {
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
});
});
}, "mathjax-editing");
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "196"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f149917%2frule-of-five-and-move-semantics-for-a-string-class%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
$begingroup$
Comments
Let's consider one of your constructors, and its accompanying comment:
string(const char * p, size_t size) : m_size(size) { //default constructor
At least in my opinion, this has at least two obvious problems. This first problem is that the comment is just wrong. A default constructor is one that can be invoked without supplying any arguments. In this case, two arguments must be supplied, so it's not a default ctor.
The second problem is almost simpler. As it stands right now, the comment is pointless. If the comment were corrected (or the code changed so it really was a default constructor), it would still wouldn't give any useful information. It's just telling me what's already obvious from the function's parameter list--it's a default ctor if and only if it can be invoked without supplying any arguments.
Comments should convey information that's not immediately obvious from the code itself. Most often, this is things like why you wrote the code the way it is.
The same applies to most of the other comments as well--everything they try to convey is either wrong or obvious (or, as above, both).
Non-templated string type
The days of strings supporting only char as the data type are (IMO) long past. Whether you like Unicode or not, it's better to support it (and for a string, it's much cleaner to manipulate UTF-32 than UTF-8).
Array new
The array form of new should generally be avoided. In the specific case of char (or unsigned char) it's marginally less evil than usual, but I'd advise avoiding it in general. For containers (and IMO, string is a container) you want to use operator new to allocate "raw" memory, then use placement new to construct objects in that raw memory. It's kind of overkill in the specific case of char, but at worst it's harmless, and at best it's a drastic improvement (it leaves the unused part of the memory as raw memory, where new [] always creates objects in all the memory you allocate).
operator+
I'm not particularly fond of your current implementation of operator+. It works, but (IMO) is fairly easy for somebody to misinterpret in subtle ways that could lead to problems down the road.
I'd do pretty much the same things, but split it into smaller pieces that made the intent a little more obvious:
string operator+(const string & str) const {
string r(*this);
r += str;
return r;
}
I'd have to think things through in detail to be sure, there's at least a reasonable chance that this is actually more efficient. In particular, it allows copy elision of the return value, where casting to an rvalue (whether done explicitly or in the form of std::move) can inhibit that.
Alternative Rule
In most case, you want to follow the rule of zero rather than the rule of five. In simple form, this says you should delegate essentially all the resource management to a class (most often unique_ptr or shared_ptr) devoted solely to that purpose, so your class only needs to deal with whatever it's really supposed to do. In the case of a (non-COW) string, each string representation has only a single owner, so this should be std::unique_ptr.
Copy assignment
Your current copy assignment operator isn't safe in the face of exceptions. Right now you have a sequence of deleting the existing data, then allocating new space, and then copying the new data into the new space. If (for example) that new throws an exception, your string won't contain either the old data or the new data. Since it's a constructed string, its destructor will execute when that scope is exited. That will attempt to delete the buffer pointed to by its data--which you just deleted, so you get a double delete, giving undefined behavior.
One easy way to get defined behavior is the copy and swap idiom, which could look something like this:
string& operator=(string str) {
std::cout << "Copy assignmentn";
using std::swap;
swap(data, str.data);
swap(m_size, str.m_size);
return *this;
}
Since we pass the parameter by value, the copy constructor is invoked to create a string for the parameter. We then essentially do a move from there to the target. To keep things as simple as possible, we swap the guts of the two strings, so when the function returns, the parameter goes out of scope, and gets destroyed automatically.
In this case, the allocation happens in the ctor, so if it throws, this function is never invoked at all. The function itself contains only operations that we know can't throw, so we don't have to worry about what would happen if an exception were thrown in the middle of it.
This isn't the only possible way, of course, but it's one way that's pretty easy to get right. More important, perhaps, is establishing the basic pattern:
Phase 1: carry out things that might throw, but don't affect any existing data if they do.
Phase 2: carry out operation that affect existing data, but can't throw.
This way, if anything throws, Phase 2 never happens at all, so existing data can never be affected. If phase 2 starts, it always executes completely. Therefore, we always get either the state from before we started this operation at all, or else the state after it finished completely. The outside world will never see an intermediate state.
Of course, move assignment can be handled much the same way, except taking an rvalue reference as the parameter. We can still swap the "guts" of the current string with those of the parameter, and then let the dtor take care of destroying the parameter when it goes out of scope:
string& operator=(string && str) {
std::cout << "Move assignmentn";
using std::swap;
swap(data, str.data);
swap(m_size, str.m_size);
return *this;
}
Use of strcat
As it stands right now, I'd call your use of strcat (e.g., in operator+=) basically a bug. To use strcat, you'd need to guarantee that every string contains exactly one NUL character at its very end.
As it stands now, a string can be created that contains NUL characters in its middle. A string can also be created that doesn't contain any NUL character at its end.
To work correctly, you could use memcpy for both operations:
string& operator+=(const string & str) {
size_t new_size = m_size + str.m_size;
char * new_data = new char[new_size];
mempy(new_data, data, m_size);
memcpy(new_data+m_size, str.data, str.m_size);
delete[] data;
m_size = new_size;
data = new_data;
return *this;
}
Note that your use of memmove would work correctly, but was overkill. memmove guarantees correct behavior even if the old and new locations might overlap. To do that, it has to check for overlap, and copy from beginning to end for overlap in one direction, and end to beginning for overlap in the other direction. In this case, we know there will never be any overlap, so we can use memcpy which only supports non-overlapping buffers (so it avoids checking for overlap).
$endgroup$
add a comment |
$begingroup$
Comments
Let's consider one of your constructors, and its accompanying comment:
string(const char * p, size_t size) : m_size(size) { //default constructor
At least in my opinion, this has at least two obvious problems. This first problem is that the comment is just wrong. A default constructor is one that can be invoked without supplying any arguments. In this case, two arguments must be supplied, so it's not a default ctor.
The second problem is almost simpler. As it stands right now, the comment is pointless. If the comment were corrected (or the code changed so it really was a default constructor), it would still wouldn't give any useful information. It's just telling me what's already obvious from the function's parameter list--it's a default ctor if and only if it can be invoked without supplying any arguments.
Comments should convey information that's not immediately obvious from the code itself. Most often, this is things like why you wrote the code the way it is.
The same applies to most of the other comments as well--everything they try to convey is either wrong or obvious (or, as above, both).
Non-templated string type
The days of strings supporting only char as the data type are (IMO) long past. Whether you like Unicode or not, it's better to support it (and for a string, it's much cleaner to manipulate UTF-32 than UTF-8).
Array new
The array form of new should generally be avoided. In the specific case of char (or unsigned char) it's marginally less evil than usual, but I'd advise avoiding it in general. For containers (and IMO, string is a container) you want to use operator new to allocate "raw" memory, then use placement new to construct objects in that raw memory. It's kind of overkill in the specific case of char, but at worst it's harmless, and at best it's a drastic improvement (it leaves the unused part of the memory as raw memory, where new [] always creates objects in all the memory you allocate).
operator+
I'm not particularly fond of your current implementation of operator+. It works, but (IMO) is fairly easy for somebody to misinterpret in subtle ways that could lead to problems down the road.
I'd do pretty much the same things, but split it into smaller pieces that made the intent a little more obvious:
string operator+(const string & str) const {
string r(*this);
r += str;
return r;
}
I'd have to think things through in detail to be sure, there's at least a reasonable chance that this is actually more efficient. In particular, it allows copy elision of the return value, where casting to an rvalue (whether done explicitly or in the form of std::move) can inhibit that.
Alternative Rule
In most case, you want to follow the rule of zero rather than the rule of five. In simple form, this says you should delegate essentially all the resource management to a class (most often unique_ptr or shared_ptr) devoted solely to that purpose, so your class only needs to deal with whatever it's really supposed to do. In the case of a (non-COW) string, each string representation has only a single owner, so this should be std::unique_ptr.
Copy assignment
Your current copy assignment operator isn't safe in the face of exceptions. Right now you have a sequence of deleting the existing data, then allocating new space, and then copying the new data into the new space. If (for example) that new throws an exception, your string won't contain either the old data or the new data. Since it's a constructed string, its destructor will execute when that scope is exited. That will attempt to delete the buffer pointed to by its data--which you just deleted, so you get a double delete, giving undefined behavior.
One easy way to get defined behavior is the copy and swap idiom, which could look something like this:
string& operator=(string str) {
std::cout << "Copy assignmentn";
using std::swap;
swap(data, str.data);
swap(m_size, str.m_size);
return *this;
}
Since we pass the parameter by value, the copy constructor is invoked to create a string for the parameter. We then essentially do a move from there to the target. To keep things as simple as possible, we swap the guts of the two strings, so when the function returns, the parameter goes out of scope, and gets destroyed automatically.
In this case, the allocation happens in the ctor, so if it throws, this function is never invoked at all. The function itself contains only operations that we know can't throw, so we don't have to worry about what would happen if an exception were thrown in the middle of it.
This isn't the only possible way, of course, but it's one way that's pretty easy to get right. More important, perhaps, is establishing the basic pattern:
Phase 1: carry out things that might throw, but don't affect any existing data if they do.
Phase 2: carry out operation that affect existing data, but can't throw.
This way, if anything throws, Phase 2 never happens at all, so existing data can never be affected. If phase 2 starts, it always executes completely. Therefore, we always get either the state from before we started this operation at all, or else the state after it finished completely. The outside world will never see an intermediate state.
Of course, move assignment can be handled much the same way, except taking an rvalue reference as the parameter. We can still swap the "guts" of the current string with those of the parameter, and then let the dtor take care of destroying the parameter when it goes out of scope:
string& operator=(string && str) {
std::cout << "Move assignmentn";
using std::swap;
swap(data, str.data);
swap(m_size, str.m_size);
return *this;
}
Use of strcat
As it stands right now, I'd call your use of strcat (e.g., in operator+=) basically a bug. To use strcat, you'd need to guarantee that every string contains exactly one NUL character at its very end.
As it stands now, a string can be created that contains NUL characters in its middle. A string can also be created that doesn't contain any NUL character at its end.
To work correctly, you could use memcpy for both operations:
string& operator+=(const string & str) {
size_t new_size = m_size + str.m_size;
char * new_data = new char[new_size];
mempy(new_data, data, m_size);
memcpy(new_data+m_size, str.data, str.m_size);
delete[] data;
m_size = new_size;
data = new_data;
return *this;
}
Note that your use of memmove would work correctly, but was overkill. memmove guarantees correct behavior even if the old and new locations might overlap. To do that, it has to check for overlap, and copy from beginning to end for overlap in one direction, and end to beginning for overlap in the other direction. In this case, we know there will never be any overlap, so we can use memcpy which only supports non-overlapping buffers (so it avoids checking for overlap).
$endgroup$
add a comment |
$begingroup$
Comments
Let's consider one of your constructors, and its accompanying comment:
string(const char * p, size_t size) : m_size(size) { //default constructor
At least in my opinion, this has at least two obvious problems. This first problem is that the comment is just wrong. A default constructor is one that can be invoked without supplying any arguments. In this case, two arguments must be supplied, so it's not a default ctor.
The second problem is almost simpler. As it stands right now, the comment is pointless. If the comment were corrected (or the code changed so it really was a default constructor), it would still wouldn't give any useful information. It's just telling me what's already obvious from the function's parameter list--it's a default ctor if and only if it can be invoked without supplying any arguments.
Comments should convey information that's not immediately obvious from the code itself. Most often, this is things like why you wrote the code the way it is.
The same applies to most of the other comments as well--everything they try to convey is either wrong or obvious (or, as above, both).
Non-templated string type
The days of strings supporting only char as the data type are (IMO) long past. Whether you like Unicode or not, it's better to support it (and for a string, it's much cleaner to manipulate UTF-32 than UTF-8).
Array new
The array form of new should generally be avoided. In the specific case of char (or unsigned char) it's marginally less evil than usual, but I'd advise avoiding it in general. For containers (and IMO, string is a container) you want to use operator new to allocate "raw" memory, then use placement new to construct objects in that raw memory. It's kind of overkill in the specific case of char, but at worst it's harmless, and at best it's a drastic improvement (it leaves the unused part of the memory as raw memory, where new [] always creates objects in all the memory you allocate).
operator+
I'm not particularly fond of your current implementation of operator+. It works, but (IMO) is fairly easy for somebody to misinterpret in subtle ways that could lead to problems down the road.
I'd do pretty much the same things, but split it into smaller pieces that made the intent a little more obvious:
string operator+(const string & str) const {
string r(*this);
r += str;
return r;
}
I'd have to think things through in detail to be sure, there's at least a reasonable chance that this is actually more efficient. In particular, it allows copy elision of the return value, where casting to an rvalue (whether done explicitly or in the form of std::move) can inhibit that.
Alternative Rule
In most case, you want to follow the rule of zero rather than the rule of five. In simple form, this says you should delegate essentially all the resource management to a class (most often unique_ptr or shared_ptr) devoted solely to that purpose, so your class only needs to deal with whatever it's really supposed to do. In the case of a (non-COW) string, each string representation has only a single owner, so this should be std::unique_ptr.
Copy assignment
Your current copy assignment operator isn't safe in the face of exceptions. Right now you have a sequence of deleting the existing data, then allocating new space, and then copying the new data into the new space. If (for example) that new throws an exception, your string won't contain either the old data or the new data. Since it's a constructed string, its destructor will execute when that scope is exited. That will attempt to delete the buffer pointed to by its data--which you just deleted, so you get a double delete, giving undefined behavior.
One easy way to get defined behavior is the copy and swap idiom, which could look something like this:
string& operator=(string str) {
std::cout << "Copy assignmentn";
using std::swap;
swap(data, str.data);
swap(m_size, str.m_size);
return *this;
}
Since we pass the parameter by value, the copy constructor is invoked to create a string for the parameter. We then essentially do a move from there to the target. To keep things as simple as possible, we swap the guts of the two strings, so when the function returns, the parameter goes out of scope, and gets destroyed automatically.
In this case, the allocation happens in the ctor, so if it throws, this function is never invoked at all. The function itself contains only operations that we know can't throw, so we don't have to worry about what would happen if an exception were thrown in the middle of it.
This isn't the only possible way, of course, but it's one way that's pretty easy to get right. More important, perhaps, is establishing the basic pattern:
Phase 1: carry out things that might throw, but don't affect any existing data if they do.
Phase 2: carry out operation that affect existing data, but can't throw.
This way, if anything throws, Phase 2 never happens at all, so existing data can never be affected. If phase 2 starts, it always executes completely. Therefore, we always get either the state from before we started this operation at all, or else the state after it finished completely. The outside world will never see an intermediate state.
Of course, move assignment can be handled much the same way, except taking an rvalue reference as the parameter. We can still swap the "guts" of the current string with those of the parameter, and then let the dtor take care of destroying the parameter when it goes out of scope:
string& operator=(string && str) {
std::cout << "Move assignmentn";
using std::swap;
swap(data, str.data);
swap(m_size, str.m_size);
return *this;
}
Use of strcat
As it stands right now, I'd call your use of strcat (e.g., in operator+=) basically a bug. To use strcat, you'd need to guarantee that every string contains exactly one NUL character at its very end.
As it stands now, a string can be created that contains NUL characters in its middle. A string can also be created that doesn't contain any NUL character at its end.
To work correctly, you could use memcpy for both operations:
string& operator+=(const string & str) {
size_t new_size = m_size + str.m_size;
char * new_data = new char[new_size];
mempy(new_data, data, m_size);
memcpy(new_data+m_size, str.data, str.m_size);
delete[] data;
m_size = new_size;
data = new_data;
return *this;
}
Note that your use of memmove would work correctly, but was overkill. memmove guarantees correct behavior even if the old and new locations might overlap. To do that, it has to check for overlap, and copy from beginning to end for overlap in one direction, and end to beginning for overlap in the other direction. In this case, we know there will never be any overlap, so we can use memcpy which only supports non-overlapping buffers (so it avoids checking for overlap).
$endgroup$
Comments
Let's consider one of your constructors, and its accompanying comment:
string(const char * p, size_t size) : m_size(size) { //default constructor
At least in my opinion, this has at least two obvious problems. This first problem is that the comment is just wrong. A default constructor is one that can be invoked without supplying any arguments. In this case, two arguments must be supplied, so it's not a default ctor.
The second problem is almost simpler. As it stands right now, the comment is pointless. If the comment were corrected (or the code changed so it really was a default constructor), it would still wouldn't give any useful information. It's just telling me what's already obvious from the function's parameter list--it's a default ctor if and only if it can be invoked without supplying any arguments.
Comments should convey information that's not immediately obvious from the code itself. Most often, this is things like why you wrote the code the way it is.
The same applies to most of the other comments as well--everything they try to convey is either wrong or obvious (or, as above, both).
Non-templated string type
The days of strings supporting only char as the data type are (IMO) long past. Whether you like Unicode or not, it's better to support it (and for a string, it's much cleaner to manipulate UTF-32 than UTF-8).
Array new
The array form of new should generally be avoided. In the specific case of char (or unsigned char) it's marginally less evil than usual, but I'd advise avoiding it in general. For containers (and IMO, string is a container) you want to use operator new to allocate "raw" memory, then use placement new to construct objects in that raw memory. It's kind of overkill in the specific case of char, but at worst it's harmless, and at best it's a drastic improvement (it leaves the unused part of the memory as raw memory, where new [] always creates objects in all the memory you allocate).
operator+
I'm not particularly fond of your current implementation of operator+. It works, but (IMO) is fairly easy for somebody to misinterpret in subtle ways that could lead to problems down the road.
I'd do pretty much the same things, but split it into smaller pieces that made the intent a little more obvious:
string operator+(const string & str) const {
string r(*this);
r += str;
return r;
}
I'd have to think things through in detail to be sure, there's at least a reasonable chance that this is actually more efficient. In particular, it allows copy elision of the return value, where casting to an rvalue (whether done explicitly or in the form of std::move) can inhibit that.
Alternative Rule
In most case, you want to follow the rule of zero rather than the rule of five. In simple form, this says you should delegate essentially all the resource management to a class (most often unique_ptr or shared_ptr) devoted solely to that purpose, so your class only needs to deal with whatever it's really supposed to do. In the case of a (non-COW) string, each string representation has only a single owner, so this should be std::unique_ptr.
Copy assignment
Your current copy assignment operator isn't safe in the face of exceptions. Right now you have a sequence of deleting the existing data, then allocating new space, and then copying the new data into the new space. If (for example) that new throws an exception, your string won't contain either the old data or the new data. Since it's a constructed string, its destructor will execute when that scope is exited. That will attempt to delete the buffer pointed to by its data--which you just deleted, so you get a double delete, giving undefined behavior.
One easy way to get defined behavior is the copy and swap idiom, which could look something like this:
string& operator=(string str) {
std::cout << "Copy assignmentn";
using std::swap;
swap(data, str.data);
swap(m_size, str.m_size);
return *this;
}
Since we pass the parameter by value, the copy constructor is invoked to create a string for the parameter. We then essentially do a move from there to the target. To keep things as simple as possible, we swap the guts of the two strings, so when the function returns, the parameter goes out of scope, and gets destroyed automatically.
In this case, the allocation happens in the ctor, so if it throws, this function is never invoked at all. The function itself contains only operations that we know can't throw, so we don't have to worry about what would happen if an exception were thrown in the middle of it.
This isn't the only possible way, of course, but it's one way that's pretty easy to get right. More important, perhaps, is establishing the basic pattern:
Phase 1: carry out things that might throw, but don't affect any existing data if they do.
Phase 2: carry out operation that affect existing data, but can't throw.
This way, if anything throws, Phase 2 never happens at all, so existing data can never be affected. If phase 2 starts, it always executes completely. Therefore, we always get either the state from before we started this operation at all, or else the state after it finished completely. The outside world will never see an intermediate state.
Of course, move assignment can be handled much the same way, except taking an rvalue reference as the parameter. We can still swap the "guts" of the current string with those of the parameter, and then let the dtor take care of destroying the parameter when it goes out of scope:
string& operator=(string && str) {
std::cout << "Move assignmentn";
using std::swap;
swap(data, str.data);
swap(m_size, str.m_size);
return *this;
}
Use of strcat
As it stands right now, I'd call your use of strcat (e.g., in operator+=) basically a bug. To use strcat, you'd need to guarantee that every string contains exactly one NUL character at its very end.
As it stands now, a string can be created that contains NUL characters in its middle. A string can also be created that doesn't contain any NUL character at its end.
To work correctly, you could use memcpy for both operations:
string& operator+=(const string & str) {
size_t new_size = m_size + str.m_size;
char * new_data = new char[new_size];
mempy(new_data, data, m_size);
memcpy(new_data+m_size, str.data, str.m_size);
delete[] data;
m_size = new_size;
data = new_data;
return *this;
}
Note that your use of memmove would work correctly, but was overkill. memmove guarantees correct behavior even if the old and new locations might overlap. To do that, it has to check for overlap, and copy from beginning to end for overlap in one direction, and end to beginning for overlap in the other direction. In this case, we know there will never be any overlap, so we can use memcpy which only supports non-overlapping buffers (so it avoids checking for overlap).
edited 4 mins ago
answered Dec 15 '16 at 0:17
Jerry CoffinJerry Coffin
28.4k462128
28.4k462128
add a comment |
add a comment |
$begingroup$
When I run your code, Valgrind reports 11 allocations and a matching 11 frees, with no leaks. So I don't see why your run indicated a non-freed block.
I did have to add includes for <cstring> and <iostream> to get it to compile - please don't omit your includes, even if the code is long! Other than that, this code was complete; it's clear and easy to read.
Some observations:
Initializers
g++ -Weffc++ turns up some missing initialisers that are worth addressing. For example:
string() : data(nullptr) {
std::cout << "Default ctor" << std::endl;
}
Would be better if it also initialized m_size:
string()
: data(nullptr),
m_size(0)
{
std::cout << "Default ctor" << std::endl;
}
And this one (the comment lies):
string(const char * p, size_t size) : m_size(size) { //default constructor
std::cout << "Special ctor" << std::endl;
data = new char[m_size];
memcpy(data, p, m_size);
}
could be
string(const char *p, size_t size)
: data(new char[size]),
m_size(size)
{
std::cout << "Converting ctor" << std::endl;
std::memcpy(data, p, m_size);
}
I prefer one initialiser per line, as I do for assignments - but you're welcome to disagree.
Constructors - general
I'd add a constructor that takes a null-terminated string:
string(const char *p) : string(p, strlen(p)) {}
Assignment operator
You can simplify your assignment operator using the copy and swap idiom:
friend void swap(string &a, string& b)
{
using std::swap;
swap(a.data, b.data);
swap(a.m_size, b.m_size);
}
string& operator=(string str) {
std::cout << "Copy assignment" << std::endl;
swap(*this, str);
return *this;
}
Writing it like this means that you don't need to overload it on const string& and string&& - passing by value will construct str using the appropriate constructor. See the linked answer for details.
Copy-and-swap does have some performance cost to gain that simplicity and safety. See Scott Meyers's article Drawbacks of Implementing Move Assignment in Terms of Swap for a full explanation. If that's a concern, then you should keep the overload for move-assignment.
Append operator
You use strcat() to append the second string to the first, but that assumes that both are null-terminated. You need to use memmove() (preferably memcpy) for both:
string& operator+=(const string & str) {
size_t new_size = m_size + str.m_size;
char * new_data = new char[new_size];
memcpy(new_data, data, m_size);
memcpy(new_data+m_size, str.data, str.m_size);
delete[] data;
m_size = new_size;
data = new_data;
return *this;
}
The ostream operator also depends on a trailing NUL, so here's a safe replacement:
friend std::ostream& operator<<(std::ostream& os, const string& s) {
for (size_t i = 0; i < s.m_size; ++i)
os << s.data[i];
return os << "n";
}
Addition operator
Addition is conventionally a const method; I'm not very fond of casts:
string operator+(const string & str) const {
string s{*this};
return s += str;
}
You may prefer to add a constructor taking two (or more - you might employ std::initializer_list or C++17 fold expressions) string objects, to save allocating temporary space which is immediately freed and reallocated:
public:
string operator+(const string & str) const {
return {*this, str};
}
private:
string(const string& a, const string& b)
: data{new char[a.m_size + b.m_size]},
m_size{a.m_size + b.m_size}
{
if (a.data)
memcpy(data, a.data, a.m_size);
if (b.data)
memcpy(data+a.m_size, b.data, b.m_size);
}
That concatenation constructor could be more concise if you swap the order of m_size and data in the class.
Tests
It's great that you include the tests in the review. It helps show how you expect your class to be used, and it helps identify tests you've missed. In my opinion, all review requests should include tests.
Here I changed the initializer of a to not include its NUL and that of b to use my additional constructor:
string a(p1,sizeof t1-1);
string b(p2);
I took out the assignment e=d; so that I could test operator+() with the empty string:
std::cout << "a+e:t" << a+e;
std::cout << "e+e:t" << e+e;
std::cout << "e+a:t" << e+a;
It would be worth implementing the == operator soon, and then you could make the tests self-checking (affecting the exit status, rather than requiring manual inspection).
$endgroup$
$begingroup$
An additional observation:operator+=()can avoid some allocations. Check if*thisis empty (return *this = str;) or ifstris empty -return *this;.
$endgroup$
– Toby Speight
Jan 19 '17 at 7:18
add a comment |
$begingroup$
When I run your code, Valgrind reports 11 allocations and a matching 11 frees, with no leaks. So I don't see why your run indicated a non-freed block.
I did have to add includes for <cstring> and <iostream> to get it to compile - please don't omit your includes, even if the code is long! Other than that, this code was complete; it's clear and easy to read.
Some observations:
Initializers
g++ -Weffc++ turns up some missing initialisers that are worth addressing. For example:
string() : data(nullptr) {
std::cout << "Default ctor" << std::endl;
}
Would be better if it also initialized m_size:
string()
: data(nullptr),
m_size(0)
{
std::cout << "Default ctor" << std::endl;
}
And this one (the comment lies):
string(const char * p, size_t size) : m_size(size) { //default constructor
std::cout << "Special ctor" << std::endl;
data = new char[m_size];
memcpy(data, p, m_size);
}
could be
string(const char *p, size_t size)
: data(new char[size]),
m_size(size)
{
std::cout << "Converting ctor" << std::endl;
std::memcpy(data, p, m_size);
}
I prefer one initialiser per line, as I do for assignments - but you're welcome to disagree.
Constructors - general
I'd add a constructor that takes a null-terminated string:
string(const char *p) : string(p, strlen(p)) {}
Assignment operator
You can simplify your assignment operator using the copy and swap idiom:
friend void swap(string &a, string& b)
{
using std::swap;
swap(a.data, b.data);
swap(a.m_size, b.m_size);
}
string& operator=(string str) {
std::cout << "Copy assignment" << std::endl;
swap(*this, str);
return *this;
}
Writing it like this means that you don't need to overload it on const string& and string&& - passing by value will construct str using the appropriate constructor. See the linked answer for details.
Copy-and-swap does have some performance cost to gain that simplicity and safety. See Scott Meyers's article Drawbacks of Implementing Move Assignment in Terms of Swap for a full explanation. If that's a concern, then you should keep the overload for move-assignment.
Append operator
You use strcat() to append the second string to the first, but that assumes that both are null-terminated. You need to use memmove() (preferably memcpy) for both:
string& operator+=(const string & str) {
size_t new_size = m_size + str.m_size;
char * new_data = new char[new_size];
memcpy(new_data, data, m_size);
memcpy(new_data+m_size, str.data, str.m_size);
delete[] data;
m_size = new_size;
data = new_data;
return *this;
}
The ostream operator also depends on a trailing NUL, so here's a safe replacement:
friend std::ostream& operator<<(std::ostream& os, const string& s) {
for (size_t i = 0; i < s.m_size; ++i)
os << s.data[i];
return os << "n";
}
Addition operator
Addition is conventionally a const method; I'm not very fond of casts:
string operator+(const string & str) const {
string s{*this};
return s += str;
}
You may prefer to add a constructor taking two (or more - you might employ std::initializer_list or C++17 fold expressions) string objects, to save allocating temporary space which is immediately freed and reallocated:
public:
string operator+(const string & str) const {
return {*this, str};
}
private:
string(const string& a, const string& b)
: data{new char[a.m_size + b.m_size]},
m_size{a.m_size + b.m_size}
{
if (a.data)
memcpy(data, a.data, a.m_size);
if (b.data)
memcpy(data+a.m_size, b.data, b.m_size);
}
That concatenation constructor could be more concise if you swap the order of m_size and data in the class.
Tests
It's great that you include the tests in the review. It helps show how you expect your class to be used, and it helps identify tests you've missed. In my opinion, all review requests should include tests.
Here I changed the initializer of a to not include its NUL and that of b to use my additional constructor:
string a(p1,sizeof t1-1);
string b(p2);
I took out the assignment e=d; so that I could test operator+() with the empty string:
std::cout << "a+e:t" << a+e;
std::cout << "e+e:t" << e+e;
std::cout << "e+a:t" << e+a;
It would be worth implementing the == operator soon, and then you could make the tests self-checking (affecting the exit status, rather than requiring manual inspection).
$endgroup$
$begingroup$
An additional observation:operator+=()can avoid some allocations. Check if*thisis empty (return *this = str;) or ifstris empty -return *this;.
$endgroup$
– Toby Speight
Jan 19 '17 at 7:18
add a comment |
$begingroup$
When I run your code, Valgrind reports 11 allocations and a matching 11 frees, with no leaks. So I don't see why your run indicated a non-freed block.
I did have to add includes for <cstring> and <iostream> to get it to compile - please don't omit your includes, even if the code is long! Other than that, this code was complete; it's clear and easy to read.
Some observations:
Initializers
g++ -Weffc++ turns up some missing initialisers that are worth addressing. For example:
string() : data(nullptr) {
std::cout << "Default ctor" << std::endl;
}
Would be better if it also initialized m_size:
string()
: data(nullptr),
m_size(0)
{
std::cout << "Default ctor" << std::endl;
}
And this one (the comment lies):
string(const char * p, size_t size) : m_size(size) { //default constructor
std::cout << "Special ctor" << std::endl;
data = new char[m_size];
memcpy(data, p, m_size);
}
could be
string(const char *p, size_t size)
: data(new char[size]),
m_size(size)
{
std::cout << "Converting ctor" << std::endl;
std::memcpy(data, p, m_size);
}
I prefer one initialiser per line, as I do for assignments - but you're welcome to disagree.
Constructors - general
I'd add a constructor that takes a null-terminated string:
string(const char *p) : string(p, strlen(p)) {}
Assignment operator
You can simplify your assignment operator using the copy and swap idiom:
friend void swap(string &a, string& b)
{
using std::swap;
swap(a.data, b.data);
swap(a.m_size, b.m_size);
}
string& operator=(string str) {
std::cout << "Copy assignment" << std::endl;
swap(*this, str);
return *this;
}
Writing it like this means that you don't need to overload it on const string& and string&& - passing by value will construct str using the appropriate constructor. See the linked answer for details.
Copy-and-swap does have some performance cost to gain that simplicity and safety. See Scott Meyers's article Drawbacks of Implementing Move Assignment in Terms of Swap for a full explanation. If that's a concern, then you should keep the overload for move-assignment.
Append operator
You use strcat() to append the second string to the first, but that assumes that both are null-terminated. You need to use memmove() (preferably memcpy) for both:
string& operator+=(const string & str) {
size_t new_size = m_size + str.m_size;
char * new_data = new char[new_size];
memcpy(new_data, data, m_size);
memcpy(new_data+m_size, str.data, str.m_size);
delete[] data;
m_size = new_size;
data = new_data;
return *this;
}
The ostream operator also depends on a trailing NUL, so here's a safe replacement:
friend std::ostream& operator<<(std::ostream& os, const string& s) {
for (size_t i = 0; i < s.m_size; ++i)
os << s.data[i];
return os << "n";
}
Addition operator
Addition is conventionally a const method; I'm not very fond of casts:
string operator+(const string & str) const {
string s{*this};
return s += str;
}
You may prefer to add a constructor taking two (or more - you might employ std::initializer_list or C++17 fold expressions) string objects, to save allocating temporary space which is immediately freed and reallocated:
public:
string operator+(const string & str) const {
return {*this, str};
}
private:
string(const string& a, const string& b)
: data{new char[a.m_size + b.m_size]},
m_size{a.m_size + b.m_size}
{
if (a.data)
memcpy(data, a.data, a.m_size);
if (b.data)
memcpy(data+a.m_size, b.data, b.m_size);
}
That concatenation constructor could be more concise if you swap the order of m_size and data in the class.
Tests
It's great that you include the tests in the review. It helps show how you expect your class to be used, and it helps identify tests you've missed. In my opinion, all review requests should include tests.
Here I changed the initializer of a to not include its NUL and that of b to use my additional constructor:
string a(p1,sizeof t1-1);
string b(p2);
I took out the assignment e=d; so that I could test operator+() with the empty string:
std::cout << "a+e:t" << a+e;
std::cout << "e+e:t" << e+e;
std::cout << "e+a:t" << e+a;
It would be worth implementing the == operator soon, and then you could make the tests self-checking (affecting the exit status, rather than requiring manual inspection).
$endgroup$
When I run your code, Valgrind reports 11 allocations and a matching 11 frees, with no leaks. So I don't see why your run indicated a non-freed block.
I did have to add includes for <cstring> and <iostream> to get it to compile - please don't omit your includes, even if the code is long! Other than that, this code was complete; it's clear and easy to read.
Some observations:
Initializers
g++ -Weffc++ turns up some missing initialisers that are worth addressing. For example:
string() : data(nullptr) {
std::cout << "Default ctor" << std::endl;
}
Would be better if it also initialized m_size:
string()
: data(nullptr),
m_size(0)
{
std::cout << "Default ctor" << std::endl;
}
And this one (the comment lies):
string(const char * p, size_t size) : m_size(size) { //default constructor
std::cout << "Special ctor" << std::endl;
data = new char[m_size];
memcpy(data, p, m_size);
}
could be
string(const char *p, size_t size)
: data(new char[size]),
m_size(size)
{
std::cout << "Converting ctor" << std::endl;
std::memcpy(data, p, m_size);
}
I prefer one initialiser per line, as I do for assignments - but you're welcome to disagree.
Constructors - general
I'd add a constructor that takes a null-terminated string:
string(const char *p) : string(p, strlen(p)) {}
Assignment operator
You can simplify your assignment operator using the copy and swap idiom:
friend void swap(string &a, string& b)
{
using std::swap;
swap(a.data, b.data);
swap(a.m_size, b.m_size);
}
string& operator=(string str) {
std::cout << "Copy assignment" << std::endl;
swap(*this, str);
return *this;
}
Writing it like this means that you don't need to overload it on const string& and string&& - passing by value will construct str using the appropriate constructor. See the linked answer for details.
Copy-and-swap does have some performance cost to gain that simplicity and safety. See Scott Meyers's article Drawbacks of Implementing Move Assignment in Terms of Swap for a full explanation. If that's a concern, then you should keep the overload for move-assignment.
Append operator
You use strcat() to append the second string to the first, but that assumes that both are null-terminated. You need to use memmove() (preferably memcpy) for both:
string& operator+=(const string & str) {
size_t new_size = m_size + str.m_size;
char * new_data = new char[new_size];
memcpy(new_data, data, m_size);
memcpy(new_data+m_size, str.data, str.m_size);
delete[] data;
m_size = new_size;
data = new_data;
return *this;
}
The ostream operator also depends on a trailing NUL, so here's a safe replacement:
friend std::ostream& operator<<(std::ostream& os, const string& s) {
for (size_t i = 0; i < s.m_size; ++i)
os << s.data[i];
return os << "n";
}
Addition operator
Addition is conventionally a const method; I'm not very fond of casts:
string operator+(const string & str) const {
string s{*this};
return s += str;
}
You may prefer to add a constructor taking two (or more - you might employ std::initializer_list or C++17 fold expressions) string objects, to save allocating temporary space which is immediately freed and reallocated:
public:
string operator+(const string & str) const {
return {*this, str};
}
private:
string(const string& a, const string& b)
: data{new char[a.m_size + b.m_size]},
m_size{a.m_size + b.m_size}
{
if (a.data)
memcpy(data, a.data, a.m_size);
if (b.data)
memcpy(data+a.m_size, b.data, b.m_size);
}
That concatenation constructor could be more concise if you swap the order of m_size and data in the class.
Tests
It's great that you include the tests in the review. It helps show how you expect your class to be used, and it helps identify tests you've missed. In my opinion, all review requests should include tests.
Here I changed the initializer of a to not include its NUL and that of b to use my additional constructor:
string a(p1,sizeof t1-1);
string b(p2);
I took out the assignment e=d; so that I could test operator+() with the empty string:
std::cout << "a+e:t" << a+e;
std::cout << "e+e:t" << e+e;
std::cout << "e+a:t" << e+a;
It would be worth implementing the == operator soon, and then you could make the tests self-checking (affecting the exit status, rather than requiring manual inspection).
edited Dec 16 '16 at 9:51
answered Dec 15 '16 at 18:51
Toby SpeightToby Speight
26.1k742118
26.1k742118
$begingroup$
An additional observation:operator+=()can avoid some allocations. Check if*thisis empty (return *this = str;) or ifstris empty -return *this;.
$endgroup$
– Toby Speight
Jan 19 '17 at 7:18
add a comment |
$begingroup$
An additional observation:operator+=()can avoid some allocations. Check if*thisis empty (return *this = str;) or ifstris empty -return *this;.
$endgroup$
– Toby Speight
Jan 19 '17 at 7:18
$begingroup$
An additional observation:
operator+=() can avoid some allocations. Check if *this is empty (return *this = str;) or if str is empty - return *this;.$endgroup$
– Toby Speight
Jan 19 '17 at 7:18
$begingroup$
An additional observation:
operator+=() can avoid some allocations. Check if *this is empty (return *this = str;) or if str is empty - return *this;.$endgroup$
– Toby Speight
Jan 19 '17 at 7:18
add a comment |
Thanks for contributing an answer to Code Review Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
Use MathJax to format equations. MathJax reference.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f149917%2frule-of-five-and-move-semantics-for-a-string-class%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
$begingroup$
In your default constructor, you forgot to initialize
m_size. Don't assume it will be automatically initialized to 0.$endgroup$
– ncalmbeblpaicr0011
Dec 15 '16 at 19:24
$begingroup$
One thing that surprised me is that the ostream output operator is in a
private:section, but is still accessible frommain()- presumably being declaredfriendmakes it visible? I'm still learning here...$endgroup$
– Toby Speight
Dec 15 '16 at 22:40