Basic substring algorithms + auxiliary string-generating functionsRecursive and iterative Euclidean...
Can a dragon be stuck looking like a human?
Placing an adverb between a verb and an object?
What is the purpose of easy combat scenarios that don't need resource expenditure?
Dilemma of explaining to interviewer that he is the reason for declining second interview
Chess tournament winning streaks
Notes in a lick that don't fit in the scale associated with the chord
Does fast page mode apply to ROM?
Parsing a string of key-value pairs as a dictionary
Do authors have to be politically correct in article-writing?
Disable the ">" operator in Rstudio linux terminal
Would a National Army of mercenaries be a feasible idea?
What's the most convenient time of year to end the world?
Check if the digits in the number are in increasing sequence in python
It took me a lot of time to make this, pls like. (YouTube Comments #1)
Cryptic with missing capitals
Isn't using the Extrusion Multiplier like cheating?
Word or phrase for showing great skill at something without formal training in it
Is there some relative to Dutch word "kijken" in German?
why a subspace is closed?
Program that converts a number to a letter of the alphabet
What does Cypher mean when he says Neo is "gonna pop"?
How would an AI self awareness kill switch work?
Why did this image turn out darker?
Does Improved Divine Smite trigger when a paladin makes an unarmed strike?
Basic substring algorithms + auxiliary string-generating functions
Recursive and iterative Euclidean algorithmsBinary tree traversal algorithmsString algorithms and localeSorting algorithms with utility functions and iteratorsPrefix- and z-functions in C++ (string algorithms)Minimal length of substring that forms cyclic stringVisualization of sorting algorithmsExtending functions to tuplesGenerating maze for complicated Hunt the Wumpus gameSTL-Style algorithms on tuples
$begingroup$
Summary: a bunch of algorithms to find if particular substring exists in the string. Meanwhile there's a lot to learn and I am looking forward to implementing more complex ones, I suppose there's some crucial insight regarding style, syntax, or design, which I might be missing. I have also managed to implement the auxiliary functions which potentially would assist me in the future.
Note: the question I have is how to avoid the boilerplate part in the main function. There should have been a some kind of "framework", which would accept the vector of functions and perform the tests. However, for me it's not quite clear how the architecture should be designed.
#include <algorithm>
#include <cassert>
#include <functional>
#include <iostream>
#include <random>
#include <string_view>
#include <utility>
#include <vector>
template <typename T>
std::ostream &operator<<(std::ostream &o, std::vector<T> const &vector)
{ for (auto const &element: vector) o << element << " "; return o; }
namespace string_generator_utility {
std::string generate_string(std::string::size_type length,
std::vector<char>&& alphabet = {'a', 'b', 'c', 'd'}) {
std::string result;
static std::random_device rd;
static std::mt19937_64 mersenne_twister_generator {rd ()};
const auto supremum = alphabet.size() - 1;
std::uniform_int_distribution<std::size_t> range {0, supremum};
for (std::size_t index = 0; index < length; ++index) {
const auto letter = range(mersenne_twister_generator);
result += alphabet[letter];
}
return result;
}
std::vector<std::string> generate_n_strings(std::size_t vector_size, std::size_t string_length,
std::vector<char>&& alphabet = {'a', 'b', 'c', 'd'}) {
std::vector<std::string> generated;
generated.reserve(vector_size);
std::generate_n(std::back_inserter(generated), vector_size,
[&]() { return generate_string(string_length, std::move(alphabet)); });
return generated;
}
} // namespace string_generator_utility
namespace algo {
std::vector<std::int64_t> naive_substring(std::string_view haystack, std::string_view needle) {
const auto haystack_size = haystack.size();
const auto needle_size = needle.size();
assert(haystack_size >= needle_size);
std::vector<std::int64_t> result;
for (std::size_t index = 0; index < haystack_size - needle_size + 1; ++index) {
std::size_t offset = 0;
for (; offset < needle_size; ++offset) {
if (haystack[index + offset] != needle[offset])
break;
}
if (offset == needle_size)
result.push_back(index);
}
return result;
}
std::vector<std::int64_t> rabin_karp_hash(std::string_view haystack, std::string_view needle) {
const auto haystack_size = haystack.size();
const auto needle_size = needle.size();
assert(haystack_size >= needle_size);
std::vector<std::int64_t> matches;
static const auto hash_function = [](std::string_view::iterator begin, std::string_view::iterator end) {
std::int64_t hash = 5381;
for (; begin != end; ++begin)
hash = ((hash << 5) + hash) + *begin;
return hash;
};
const auto needle_hashed = hash_function(std::begin(needle), std::end(needle));
for (std::size_t index = 0; index < haystack_size - needle_size + 1; ++index) {
const auto substring_hash = hash_function
(
std::begin(haystack) + index, std::begin(haystack) + index + needle_size
);
if (substring_hash == needle_hashed)
matches.push_back(index);
}
return matches;
}
} // namespace algo
int main() {
auto vector = string_generator_utility::generate_n_strings(25, 50);
std::cout << "naive substring:n";
for (std::size_t index = 0; index < vector.size(); ++index)
{
std::cout << vector[index] << ": ";
auto shift = algo::naive_substring(vector[index], "ab");
std::cout << shift << "n";
}
std::cout << "rabin-karp-substring:n";
for (std::size_t index = 0; index < vector.size(); ++index)
{
std::cout << vector[index] << ": ";
auto shift = algo::rabin_karp_hash(vector[index], "ab");
std::cout << shift << "n";
}
return 0;
}
c++ algorithm c++17
$endgroup$
This question has an open bounty worth +50
reputation from Inter Veridium ending in 7 days.
The current answers do not contain enough detail.
add a comment |
$begingroup$
Summary: a bunch of algorithms to find if particular substring exists in the string. Meanwhile there's a lot to learn and I am looking forward to implementing more complex ones, I suppose there's some crucial insight regarding style, syntax, or design, which I might be missing. I have also managed to implement the auxiliary functions which potentially would assist me in the future.
Note: the question I have is how to avoid the boilerplate part in the main function. There should have been a some kind of "framework", which would accept the vector of functions and perform the tests. However, for me it's not quite clear how the architecture should be designed.
#include <algorithm>
#include <cassert>
#include <functional>
#include <iostream>
#include <random>
#include <string_view>
#include <utility>
#include <vector>
template <typename T>
std::ostream &operator<<(std::ostream &o, std::vector<T> const &vector)
{ for (auto const &element: vector) o << element << " "; return o; }
namespace string_generator_utility {
std::string generate_string(std::string::size_type length,
std::vector<char>&& alphabet = {'a', 'b', 'c', 'd'}) {
std::string result;
static std::random_device rd;
static std::mt19937_64 mersenne_twister_generator {rd ()};
const auto supremum = alphabet.size() - 1;
std::uniform_int_distribution<std::size_t> range {0, supremum};
for (std::size_t index = 0; index < length; ++index) {
const auto letter = range(mersenne_twister_generator);
result += alphabet[letter];
}
return result;
}
std::vector<std::string> generate_n_strings(std::size_t vector_size, std::size_t string_length,
std::vector<char>&& alphabet = {'a', 'b', 'c', 'd'}) {
std::vector<std::string> generated;
generated.reserve(vector_size);
std::generate_n(std::back_inserter(generated), vector_size,
[&]() { return generate_string(string_length, std::move(alphabet)); });
return generated;
}
} // namespace string_generator_utility
namespace algo {
std::vector<std::int64_t> naive_substring(std::string_view haystack, std::string_view needle) {
const auto haystack_size = haystack.size();
const auto needle_size = needle.size();
assert(haystack_size >= needle_size);
std::vector<std::int64_t> result;
for (std::size_t index = 0; index < haystack_size - needle_size + 1; ++index) {
std::size_t offset = 0;
for (; offset < needle_size; ++offset) {
if (haystack[index + offset] != needle[offset])
break;
}
if (offset == needle_size)
result.push_back(index);
}
return result;
}
std::vector<std::int64_t> rabin_karp_hash(std::string_view haystack, std::string_view needle) {
const auto haystack_size = haystack.size();
const auto needle_size = needle.size();
assert(haystack_size >= needle_size);
std::vector<std::int64_t> matches;
static const auto hash_function = [](std::string_view::iterator begin, std::string_view::iterator end) {
std::int64_t hash = 5381;
for (; begin != end; ++begin)
hash = ((hash << 5) + hash) + *begin;
return hash;
};
const auto needle_hashed = hash_function(std::begin(needle), std::end(needle));
for (std::size_t index = 0; index < haystack_size - needle_size + 1; ++index) {
const auto substring_hash = hash_function
(
std::begin(haystack) + index, std::begin(haystack) + index + needle_size
);
if (substring_hash == needle_hashed)
matches.push_back(index);
}
return matches;
}
} // namespace algo
int main() {
auto vector = string_generator_utility::generate_n_strings(25, 50);
std::cout << "naive substring:n";
for (std::size_t index = 0; index < vector.size(); ++index)
{
std::cout << vector[index] << ": ";
auto shift = algo::naive_substring(vector[index], "ab");
std::cout << shift << "n";
}
std::cout << "rabin-karp-substring:n";
for (std::size_t index = 0; index < vector.size(); ++index)
{
std::cout << vector[index] << ": ";
auto shift = algo::rabin_karp_hash(vector[index], "ab");
std::cout << shift << "n";
}
return 0;
}
c++ algorithm c++17
$endgroup$
This question has an open bounty worth +50
reputation from Inter Veridium ending in 7 days.
The current answers do not contain enough detail.
add a comment |
$begingroup$
Summary: a bunch of algorithms to find if particular substring exists in the string. Meanwhile there's a lot to learn and I am looking forward to implementing more complex ones, I suppose there's some crucial insight regarding style, syntax, or design, which I might be missing. I have also managed to implement the auxiliary functions which potentially would assist me in the future.
Note: the question I have is how to avoid the boilerplate part in the main function. There should have been a some kind of "framework", which would accept the vector of functions and perform the tests. However, for me it's not quite clear how the architecture should be designed.
#include <algorithm>
#include <cassert>
#include <functional>
#include <iostream>
#include <random>
#include <string_view>
#include <utility>
#include <vector>
template <typename T>
std::ostream &operator<<(std::ostream &o, std::vector<T> const &vector)
{ for (auto const &element: vector) o << element << " "; return o; }
namespace string_generator_utility {
std::string generate_string(std::string::size_type length,
std::vector<char>&& alphabet = {'a', 'b', 'c', 'd'}) {
std::string result;
static std::random_device rd;
static std::mt19937_64 mersenne_twister_generator {rd ()};
const auto supremum = alphabet.size() - 1;
std::uniform_int_distribution<std::size_t> range {0, supremum};
for (std::size_t index = 0; index < length; ++index) {
const auto letter = range(mersenne_twister_generator);
result += alphabet[letter];
}
return result;
}
std::vector<std::string> generate_n_strings(std::size_t vector_size, std::size_t string_length,
std::vector<char>&& alphabet = {'a', 'b', 'c', 'd'}) {
std::vector<std::string> generated;
generated.reserve(vector_size);
std::generate_n(std::back_inserter(generated), vector_size,
[&]() { return generate_string(string_length, std::move(alphabet)); });
return generated;
}
} // namespace string_generator_utility
namespace algo {
std::vector<std::int64_t> naive_substring(std::string_view haystack, std::string_view needle) {
const auto haystack_size = haystack.size();
const auto needle_size = needle.size();
assert(haystack_size >= needle_size);
std::vector<std::int64_t> result;
for (std::size_t index = 0; index < haystack_size - needle_size + 1; ++index) {
std::size_t offset = 0;
for (; offset < needle_size; ++offset) {
if (haystack[index + offset] != needle[offset])
break;
}
if (offset == needle_size)
result.push_back(index);
}
return result;
}
std::vector<std::int64_t> rabin_karp_hash(std::string_view haystack, std::string_view needle) {
const auto haystack_size = haystack.size();
const auto needle_size = needle.size();
assert(haystack_size >= needle_size);
std::vector<std::int64_t> matches;
static const auto hash_function = [](std::string_view::iterator begin, std::string_view::iterator end) {
std::int64_t hash = 5381;
for (; begin != end; ++begin)
hash = ((hash << 5) + hash) + *begin;
return hash;
};
const auto needle_hashed = hash_function(std::begin(needle), std::end(needle));
for (std::size_t index = 0; index < haystack_size - needle_size + 1; ++index) {
const auto substring_hash = hash_function
(
std::begin(haystack) + index, std::begin(haystack) + index + needle_size
);
if (substring_hash == needle_hashed)
matches.push_back(index);
}
return matches;
}
} // namespace algo
int main() {
auto vector = string_generator_utility::generate_n_strings(25, 50);
std::cout << "naive substring:n";
for (std::size_t index = 0; index < vector.size(); ++index)
{
std::cout << vector[index] << ": ";
auto shift = algo::naive_substring(vector[index], "ab");
std::cout << shift << "n";
}
std::cout << "rabin-karp-substring:n";
for (std::size_t index = 0; index < vector.size(); ++index)
{
std::cout << vector[index] << ": ";
auto shift = algo::rabin_karp_hash(vector[index], "ab");
std::cout << shift << "n";
}
return 0;
}
c++ algorithm c++17
$endgroup$
Summary: a bunch of algorithms to find if particular substring exists in the string. Meanwhile there's a lot to learn and I am looking forward to implementing more complex ones, I suppose there's some crucial insight regarding style, syntax, or design, which I might be missing. I have also managed to implement the auxiliary functions which potentially would assist me in the future.
Note: the question I have is how to avoid the boilerplate part in the main function. There should have been a some kind of "framework", which would accept the vector of functions and perform the tests. However, for me it's not quite clear how the architecture should be designed.
#include <algorithm>
#include <cassert>
#include <functional>
#include <iostream>
#include <random>
#include <string_view>
#include <utility>
#include <vector>
template <typename T>
std::ostream &operator<<(std::ostream &o, std::vector<T> const &vector)
{ for (auto const &element: vector) o << element << " "; return o; }
namespace string_generator_utility {
std::string generate_string(std::string::size_type length,
std::vector<char>&& alphabet = {'a', 'b', 'c', 'd'}) {
std::string result;
static std::random_device rd;
static std::mt19937_64 mersenne_twister_generator {rd ()};
const auto supremum = alphabet.size() - 1;
std::uniform_int_distribution<std::size_t> range {0, supremum};
for (std::size_t index = 0; index < length; ++index) {
const auto letter = range(mersenne_twister_generator);
result += alphabet[letter];
}
return result;
}
std::vector<std::string> generate_n_strings(std::size_t vector_size, std::size_t string_length,
std::vector<char>&& alphabet = {'a', 'b', 'c', 'd'}) {
std::vector<std::string> generated;
generated.reserve(vector_size);
std::generate_n(std::back_inserter(generated), vector_size,
[&]() { return generate_string(string_length, std::move(alphabet)); });
return generated;
}
} // namespace string_generator_utility
namespace algo {
std::vector<std::int64_t> naive_substring(std::string_view haystack, std::string_view needle) {
const auto haystack_size = haystack.size();
const auto needle_size = needle.size();
assert(haystack_size >= needle_size);
std::vector<std::int64_t> result;
for (std::size_t index = 0; index < haystack_size - needle_size + 1; ++index) {
std::size_t offset = 0;
for (; offset < needle_size; ++offset) {
if (haystack[index + offset] != needle[offset])
break;
}
if (offset == needle_size)
result.push_back(index);
}
return result;
}
std::vector<std::int64_t> rabin_karp_hash(std::string_view haystack, std::string_view needle) {
const auto haystack_size = haystack.size();
const auto needle_size = needle.size();
assert(haystack_size >= needle_size);
std::vector<std::int64_t> matches;
static const auto hash_function = [](std::string_view::iterator begin, std::string_view::iterator end) {
std::int64_t hash = 5381;
for (; begin != end; ++begin)
hash = ((hash << 5) + hash) + *begin;
return hash;
};
const auto needle_hashed = hash_function(std::begin(needle), std::end(needle));
for (std::size_t index = 0; index < haystack_size - needle_size + 1; ++index) {
const auto substring_hash = hash_function
(
std::begin(haystack) + index, std::begin(haystack) + index + needle_size
);
if (substring_hash == needle_hashed)
matches.push_back(index);
}
return matches;
}
} // namespace algo
int main() {
auto vector = string_generator_utility::generate_n_strings(25, 50);
std::cout << "naive substring:n";
for (std::size_t index = 0; index < vector.size(); ++index)
{
std::cout << vector[index] << ": ";
auto shift = algo::naive_substring(vector[index], "ab");
std::cout << shift << "n";
}
std::cout << "rabin-karp-substring:n";
for (std::size_t index = 0; index < vector.size(); ++index)
{
std::cout << vector[index] << ": ";
auto shift = algo::rabin_karp_hash(vector[index], "ab");
std::cout << shift << "n";
}
return 0;
}
c++ algorithm c++17
c++ algorithm c++17
edited 10 hours ago
Inter Veridium
asked Feb 7 at 10:16
Inter VeridiumInter Veridium
355
355
This question has an open bounty worth +50
reputation from Inter Veridium ending in 7 days.
The current answers do not contain enough detail.
This question has an open bounty worth +50
reputation from Inter Veridium ending in 7 days.
The current answers do not contain enough detail.
add a comment |
add a comment |
2 Answers
2
active
oldest
votes
$begingroup$
From a readability viewpoint, your operator<<
for std::vector
declared at the top of your file should be spread out onto multiple lines so that it is easier to read and understand.
Also, the calculation of ((hash << 5) + hash)
should be rewritten as the simpler hash * 33
. The compiler will know the best way to multiply a number by 33. This could be a multiply, a shift-and-add like you've coded, or some sequence involving the address calculation instructions.
Rather than using an assert
to verify that the needle is not longer than the haystack (which will only check the condition if the NDEBUG
macro is not defined), just check the condition and return an empty collection.
In rabin_karp_hash
you assume that two strings match if their hash values are the same. This is not necessarily the case. It is possible, however unlikely, that two different strings will have the same hash value. This is a hash collision. To ensure that your potential match are identical strings, you still need to compare both strings when the hashes match.
To simplify the code in main
and eliminate the duplication, you can create a class with a virtual compare
member. Then derive two classes from it, one for the naive comparison, the other for the Rabin-Karp one. Put your loop into another function, and pass instances of the appropriate derived class to use the specific comparison you want to test.
$endgroup$
add a comment |
$begingroup$
I think the assertion in the
rabin_karp_hash
is redundant. There is nothing wrong with trying to find a substring with a size exceeding that of a text (although the result is self-evident), besides you can just return an empty vector there.
I'd replace
for (; offset < needle_size; ++offset) {
if (haystack[index + offset] != needle[offset])
break;
}
with something like
while (offset < needle_size && haystack[index + offset] == needle[offset])
++offset;
$endgroup$
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
return StackExchange.using("mathjaxEditing", function () {
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
});
});
}, "mathjax-editing");
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "196"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f213023%2fbasic-substring-algorithms-auxiliary-string-generating-functions%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
$begingroup$
From a readability viewpoint, your operator<<
for std::vector
declared at the top of your file should be spread out onto multiple lines so that it is easier to read and understand.
Also, the calculation of ((hash << 5) + hash)
should be rewritten as the simpler hash * 33
. The compiler will know the best way to multiply a number by 33. This could be a multiply, a shift-and-add like you've coded, or some sequence involving the address calculation instructions.
Rather than using an assert
to verify that the needle is not longer than the haystack (which will only check the condition if the NDEBUG
macro is not defined), just check the condition and return an empty collection.
In rabin_karp_hash
you assume that two strings match if their hash values are the same. This is not necessarily the case. It is possible, however unlikely, that two different strings will have the same hash value. This is a hash collision. To ensure that your potential match are identical strings, you still need to compare both strings when the hashes match.
To simplify the code in main
and eliminate the duplication, you can create a class with a virtual compare
member. Then derive two classes from it, one for the naive comparison, the other for the Rabin-Karp one. Put your loop into another function, and pass instances of the appropriate derived class to use the specific comparison you want to test.
$endgroup$
add a comment |
$begingroup$
From a readability viewpoint, your operator<<
for std::vector
declared at the top of your file should be spread out onto multiple lines so that it is easier to read and understand.
Also, the calculation of ((hash << 5) + hash)
should be rewritten as the simpler hash * 33
. The compiler will know the best way to multiply a number by 33. This could be a multiply, a shift-and-add like you've coded, or some sequence involving the address calculation instructions.
Rather than using an assert
to verify that the needle is not longer than the haystack (which will only check the condition if the NDEBUG
macro is not defined), just check the condition and return an empty collection.
In rabin_karp_hash
you assume that two strings match if their hash values are the same. This is not necessarily the case. It is possible, however unlikely, that two different strings will have the same hash value. This is a hash collision. To ensure that your potential match are identical strings, you still need to compare both strings when the hashes match.
To simplify the code in main
and eliminate the duplication, you can create a class with a virtual compare
member. Then derive two classes from it, one for the naive comparison, the other for the Rabin-Karp one. Put your loop into another function, and pass instances of the appropriate derived class to use the specific comparison you want to test.
$endgroup$
add a comment |
$begingroup$
From a readability viewpoint, your operator<<
for std::vector
declared at the top of your file should be spread out onto multiple lines so that it is easier to read and understand.
Also, the calculation of ((hash << 5) + hash)
should be rewritten as the simpler hash * 33
. The compiler will know the best way to multiply a number by 33. This could be a multiply, a shift-and-add like you've coded, or some sequence involving the address calculation instructions.
Rather than using an assert
to verify that the needle is not longer than the haystack (which will only check the condition if the NDEBUG
macro is not defined), just check the condition and return an empty collection.
In rabin_karp_hash
you assume that two strings match if their hash values are the same. This is not necessarily the case. It is possible, however unlikely, that two different strings will have the same hash value. This is a hash collision. To ensure that your potential match are identical strings, you still need to compare both strings when the hashes match.
To simplify the code in main
and eliminate the duplication, you can create a class with a virtual compare
member. Then derive two classes from it, one for the naive comparison, the other for the Rabin-Karp one. Put your loop into another function, and pass instances of the appropriate derived class to use the specific comparison you want to test.
$endgroup$
From a readability viewpoint, your operator<<
for std::vector
declared at the top of your file should be spread out onto multiple lines so that it is easier to read and understand.
Also, the calculation of ((hash << 5) + hash)
should be rewritten as the simpler hash * 33
. The compiler will know the best way to multiply a number by 33. This could be a multiply, a shift-and-add like you've coded, or some sequence involving the address calculation instructions.
Rather than using an assert
to verify that the needle is not longer than the haystack (which will only check the condition if the NDEBUG
macro is not defined), just check the condition and return an empty collection.
In rabin_karp_hash
you assume that two strings match if their hash values are the same. This is not necessarily the case. It is possible, however unlikely, that two different strings will have the same hash value. This is a hash collision. To ensure that your potential match are identical strings, you still need to compare both strings when the hashes match.
To simplify the code in main
and eliminate the duplication, you can create a class with a virtual compare
member. Then derive two classes from it, one for the naive comparison, the other for the Rabin-Karp one. Put your loop into another function, and pass instances of the appropriate derived class to use the specific comparison you want to test.
answered 10 hours ago
1201ProgramAlarm1201ProgramAlarm
3,2532923
3,2532923
add a comment |
add a comment |
$begingroup$
I think the assertion in the
rabin_karp_hash
is redundant. There is nothing wrong with trying to find a substring with a size exceeding that of a text (although the result is self-evident), besides you can just return an empty vector there.
I'd replace
for (; offset < needle_size; ++offset) {
if (haystack[index + offset] != needle[offset])
break;
}
with something like
while (offset < needle_size && haystack[index + offset] == needle[offset])
++offset;
$endgroup$
add a comment |
$begingroup$
I think the assertion in the
rabin_karp_hash
is redundant. There is nothing wrong with trying to find a substring with a size exceeding that of a text (although the result is self-evident), besides you can just return an empty vector there.
I'd replace
for (; offset < needle_size; ++offset) {
if (haystack[index + offset] != needle[offset])
break;
}
with something like
while (offset < needle_size && haystack[index + offset] == needle[offset])
++offset;
$endgroup$
add a comment |
$begingroup$
I think the assertion in the
rabin_karp_hash
is redundant. There is nothing wrong with trying to find a substring with a size exceeding that of a text (although the result is self-evident), besides you can just return an empty vector there.
I'd replace
for (; offset < needle_size; ++offset) {
if (haystack[index + offset] != needle[offset])
break;
}
with something like
while (offset < needle_size && haystack[index + offset] == needle[offset])
++offset;
$endgroup$
I think the assertion in the
rabin_karp_hash
is redundant. There is nothing wrong with trying to find a substring with a size exceeding that of a text (although the result is self-evident), besides you can just return an empty vector there.
I'd replace
for (; offset < needle_size; ++offset) {
if (haystack[index + offset] != needle[offset])
break;
}
with something like
while (offset < needle_size && haystack[index + offset] == needle[offset])
++offset;
edited Feb 8 at 4:24
Jamal♦
30.3k11119227
30.3k11119227
answered Feb 7 at 16:13
MikhailMikhail
111
111
add a comment |
add a comment |
Thanks for contributing an answer to Code Review Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
Use MathJax to format equations. MathJax reference.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f213023%2fbasic-substring-algorithms-auxiliary-string-generating-functions%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown