#What do we mean for “combinations” of K elements?
Given a set of N items, we need to calculate all subsets of K items. In each calculated subset, the selected items must appear only one time (no repetitions). In addition, the order of selected items should follow the criteria chosen for the entire set of N items.
In the following sections, it is always assumed that the elements in the original set are all unique and sorted according to some criteria.
#An example
- { ‘a’, ‘b’, ‘c’, ‘d’ }
- { ‘a’, ‘b’, ‘c’, ‘e’ }
- { ‘a’, ‘b’, ‘d’, ‘e’ }
- { ‘a’, ‘c’, ‘d’, ‘e’ }
- { ‘b’, ‘c’, ‘d’, ‘e’ }
#How to calculate the expected number of combinations?
The number of combinations of size k from a set of n elements is calculated as:
nCk = n! / ( k! * (n-k)! )
where ! denotes the factorial operation.
#A first approach: the brute force (iterative method)
This code assumes that the input values are sorted according to some criteria and are unique inside the input vector.
// Author : Pietro L. Carotenuto
// Website: www.pietrolc.com
#include <iostream>
#include <vector>
#include <deque>
using Tin = std::vector<int>;
using Tout = std::vector<std::deque<int>>;
void calculateCombinationsRec(const Tin& vec, const int k, Tout& sol, std::deque<int>& tmp, const int nextIndex )
{
tmp.emplace_back(vec[nextIndex]);
// Save and return
if (k == tmp.size())
{
// Add to solutions bucket
sol.emplace_back(tmp);
// Remove last item
tmp.pop_back();
// Check remaining combinations
return;
}
// Otherwise, continue checking remaining combinations
for (size_t i = nextIndex + 1; i < vec.size(); i++)
{
calculateCombinationsRec(vec, k, sol, tmp, i);
}
// Remove last item before checking new combinations
tmp.pop_back();
}
// This returns a vector of all calculated combinations
// Assumption: The input vector does not contain any duplicate values.
// Ideally, the input values are sorted according some criteria.
void calculateCombinations(const Tin& vec, const int k, Tout& sol)
{
// Iterate on all possible root indexes
for (size_t rootIdx = 0; rootIdx < vec.size(); rootIdx++)
{
// For each root, this temporary queue stores all combinations having
// the same starting point
std::deque<int> tmp;
calculateCombinationsRec(vec, k, sol, tmp, rootIdx);
}
}
int main()
{
Tin vec;
Tout sol;
const int k = 5;
for (int i = 0; i < 10; i++)
{
vec.emplace_back(i);
}
// Main function
calculateCombinations(vec, k, sol);
std::cout << "Combinations: " << std::endl;
for (const auto & s1 : sol)
{
for (const auto& it2 : s1)
{
std::cout << it2 << " ";
}
std::cout << std::endl;
}
std::cout << "N. combinations: " << sol.size() << std::endl;
}
Combinations:
0 1 2 3 4
0 1 2 3 5
0 1 2 3 6
0 1 2 3 7
0 1 2 3 8
0 1 2 3 9
0 1 2 4 5
0 1 2 4 6
0 1 2 4 7
0 1 2 4 8
0 1 2 4 9
...
3 5 6 8 9
3 5 7 8 9
3 6 7 8 9
4 5 6 7 8
4 5 6 7 9
4 5 6 8 9
4 5 7 8 9
4 6 7 8 9
5 6 7 8 9
N. combinations: 252
Given a specific starting element, then the function calculateCombinationsRec() is recursively called to check the remaining elements. The function calculateCombinationsRec() only stores the combinations having size k.
#The brute force (recursive optimized method)
The code above works pretty fine but it has a big disadvantage: it cannot handle a large data set of items. Why? Simply because the proposed function is recursive.
For a large set of items or, similarly, for large values of K, this recursive function might lead to a stack overflow.
// Author : Pietro L. Carotenuto
// Website: www.pietrolc.com
#include <iostream>
#include <vector>
#include <deque>
#include <stack>
using T = char;
using Tin = std::vector<T>;
using Tout = std::vector<std::deque<T>>;
typedef struct
{
int next; // the next index to pick up
std::deque<T> comb; // the combination
} StackItem;
void calculateCombinations(const Tin& vec, const int k, Tout& sol)
{
if (vec.empty() || k == 0)
{
return;
}
// init data
sol.clear();
int rootIdx = 0;
const int n = vec.size();
while (rootIdx < (n-(k-1)) )
{
std::stack<StackItem> myStack;
// adding initial item in the stack
StackItem s0{};
s0.next = rootIdx + 1;
s0.comb.push_back(vec[rootIdx]);
myStack.emplace(s0);
while (!myStack.empty())
{
// get item from the top of the stack
auto& it = myStack.top();
// check if it is a combination of size k
if (it.comb.size() == k)
{
sol.emplace_back(it.comb); // store the solution
myStack.pop(); // remove from the stack
continue;
}
// if the index exceeds the size of the input vector,
// pop the current item from the stack
if (it.next >= vec.size())
{
myStack.pop();
continue;
}
// Otherwise, let's check next item in the input array
// saving a new state into the stack
StackItem s1{};
s1.next = it.next + 1; // the next node to pick up
s1.comb = it.comb; // current incomplete combination
s1.comb.push_back(vec[it.next]); // adding current item
// increase the node index at the top of the stack
++it.next;
// save new state into the stack
myStack.push(s1);
}
// increase root index
++rootIdx;
}
}
int main()
{
Tin vec{'a','b','c','d','e'};
Tout sol;
const int k = 3;
// Main function
calculateCombinations(vec, k, sol);
std::cout << "Combinations: " << std::endl;
for (const auto & s1 : sol)
{
for (const auto& it2 : s1)
{
std::cout << it2 << " ";
}
std::cout << std::endl;
}
std::cout << "N. combinations: " << sol.size() << std::endl;
}
Combinations:
a b c
a b d
a b e
a c d
a c e
a d e
b c d
b c e
b d e
c d e
N. combinations: 10
#Second approach: a C++ specific method
The method proposed in this section has been adapted from Rosetta Code website.
I like this approach since it is not recursive and it takes advantage of the function std::prev_permutation() already available in the STL library “algorithm”.
#include <algorithm>
#include <iostream>
#include <string>
#include <vector>
using T = char;
using Tcomb = std::vector<T>;
using Tsol = std::vector<Tcomb>;
void comb(const int K, const Tcomb& v, Tsol& sol)
{
const int N = v.size();
if (K > N)
{
// log an error message here!
return;
}
std::string bitmask(K, 1); // K leading 1's
bitmask.resize(N, 0); // N-K trailing 0's
// store values and permute bitmask
do {
Tcomb tmp{};
for (int i = 0; i < N; ++i) // [0..N-1] integers
{
if (bitmask[i] != 0)
{
tmp.emplace_back(v[i]);
}
}
if (!tmp.empty())
{
sol.emplace_back(tmp);
}
} while (std::prev_permutation(bitmask.begin(), bitmask.end()));
}
int main()
{
Tsol sol;
Tcomb values{ 'a','b','c','d','e' };
comb(3, values, sol);
for ( auto& t : sol)
{
for (auto& e: t)
{
std::cout << e << " ";
}
std::cout << std::endl;
}
std::cout << "N. combinations: " << sol.size() << std::endl;
}
#Conclusions
The main assumptions for the proposed methods are that all items in the set are unique and – eventually – sorted according to some criteria. The calculated combinations are sorted according to the same criteria followed in the set of items used as input.
