Skip to content

Commit 02ad801

Browse files
Tsung-Wei HuangTsung-Wei Huang
authored andcommitted
removed group in reduce method
1 parent 23ea8b0 commit 02ad801

6 files changed

Lines changed: 39 additions & 79 deletions

File tree

README.md

Lines changed: 7 additions & 26 deletions
Original file line numberDiff line numberDiff line change
@@ -213,8 +213,8 @@ The table below summarizes its commonly used methods.
213213
| placeholder | none | task | insert a node without any work; work can be assigned later |
214214
| linearize | task list | none | create a linear dependency in the given task list |
215215
| parallel_for | beg, end, callable, group | task pair | apply the callable in parallel and group-by-group to the result of dereferencing every iterator in the range |
216-
| reduce | beg, end, res, bop, group | task pair | apply a binary operator group-by-group to reduce a range of elements to a single result |
217-
| transform_reduce | beg, end, res, bop, uop, group | task pair | apply a unary operator to each element in the range and reduce them to a single result group-by-group through a binary operator |
216+
| reduce | beg, end, res, bop | task pair | reduce a range of elements to a single result through a binary operator |
217+
| transform_reduce | beg, end, res, bop, uop | task pair | apply a unary operator to each element in the range and reduce them to a single result through a binary operator |
218218
| dispatch | none | future | dispatch the current graph and return a shared future to block on completeness |
219219
| silent_dispatch | none | none | dispatch the current graph |
220220
| wait_for_all | none | none | dispatch the current graph and block until all graphs including previously dispatched ones finish |
@@ -307,36 +307,17 @@ auto [S, T] = tf.parallel_for(
307307

308308
### *reduce/transform_reduce*
309309

310-
The method `reduce` creates a subgraph that applies a binary operator to a range of items in a container.
310+
The method `reduce` creates a subgraph that applies a binary operator to a range of items.
311311
The result will be stored in the referenced `res` object passed to the method.
312312
It is your responsibility to assign it a correct initial value to reduce.
313313

314-
<img align="right" width="45%" src="image/reduce.png">
315-
316-
```cpp
317-
auto v = {1, 2, 3, 4};
318-
int sum {0}; // initial value
319-
auto [S, T] = tf.reduce(
320-
v.begin(), // beg of range
321-
v.end(), // end of range
322-
sum, // pass by reference
323-
std::plus<int>(),
324-
1 // execute one task at a time
325-
);
326-
// add dependencies via S and T.
327-
```
328-
329-
Changing the group size can force intra-group tasks to run sequentially
330-
and inter-group tasks to run in parallel.
331-
Depending on applications, different group sizes can result in significant performance hit.
332-
333314
<img align="right" width="45%" src="image/reduce_2.png">
334315

335316
```cpp
336317
auto v = {1, 2, 3, 4};
337318
int sum {0};
338-
auto [S, T] = tf.reduce(
339-
v.begin(), v.end(), sum, std::plus<int>(), 2
319+
auto [S, T] = tf.reduce( // for example, 2 threads
320+
v.begin(), v.end(), sum, std::plus<int>()
340321
);
341322
```
342323
@@ -352,8 +333,7 @@ auto [S, T] = tf.transform_reduce(v.begin(), v.end(), min,
352333
);
353334
```
354335

355-
All reduce methods have overloads with no group size,
356-
in which the workload is evenly partitioned across threads.
336+
By default, all reduce methods distribute the workload evenly across threads.
357337

358338
### *dispatch/silent_dispatch/wait_for_all*
359339
Dispatching a taskflow graph will schedule threads to execute the current graph and return immediately.
@@ -370,6 +350,7 @@ std::cout << "all tasks complete" << '\n';
370350
```
371351

372352
If you need to block your program flow until all tasks finish, use `wait_for_all` instead.
353+
373354
```cpp
374355
tf.wait_for_all();
375356
std::cout << "all tasks complete" << '\n';

image/pdf2eps

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,4 @@
1+
rm -rf $1.eps $1.ps
2+
pdfcrop $1.pdf $1.pdf
3+
pdftops $1.pdf
4+
ps2eps $1.ps

image/reduce.png

-34.5 KB
Loading

image/reduce_2.png

-16.3 KB
Binary file not shown.

taskflow.hpp

Lines changed: 12 additions & 35 deletions
Original file line numberDiff line numberDiff line change
@@ -448,9 +448,6 @@ class BasicTaskflow {
448448
template <typename T, typename C, std::enable_if_t<is_iterable_v<T>, void>* = nullptr>
449449
auto parallel_for(T&, C&&, size_t = 1);
450450

451-
template <typename I, typename T, typename B>
452-
auto reduce(I, I, T&, B&&, size_t);
453-
454451
template <typename I, typename T, typename B>
455452
auto reduce(I, I, T&, B&&);
456453

@@ -460,9 +457,6 @@ class BasicTaskflow {
460457
template <typename I, typename T>
461458
auto reduce_max(I, I, T&);
462459

463-
template <typename I, typename T, typename B, typename U>
464-
auto transform_reduce(I, I, T&, B&&, U&&, size_t);
465-
466460
template <typename I, typename T, typename B, typename U>
467461
auto transform_reduce(I, I, T&, B&&, U&&);
468462

@@ -987,16 +981,16 @@ auto BasicTaskflow<F>::_reduce(I beg, size_t n, O op, size_t g, Task& S) {
987981
// Function: reduce
988982
template <typename F>
989983
template <typename I, typename T, typename B>
990-
auto BasicTaskflow<F>::reduce(I beg, I end, T& result, B&& op, size_t g) {
984+
auto BasicTaskflow<F>::reduce(I beg, I end, T& result, B&& op/*, size_t g*/) {
991985

992986
using category = typename std::iterator_traits<I>::iterator_category;
993987

994988
// Evenly partition
995-
if(g == 0) {
996-
auto d = std::distance(beg, end);
997-
auto w = std::max(size_t{1}, num_workers());
998-
g = (d + w - 1) / w;
999-
}
989+
//if(g == 0) {
990+
size_t d = std::distance(beg, end);
991+
size_t w = std::max(size_t{1}, num_workers());
992+
size_t g = std::max((d + w - 1) / w, size_t{2});
993+
//}
1000994

1001995
auto source = placeholder();
1002996
auto target = placeholder();
@@ -1060,21 +1054,14 @@ auto BasicTaskflow<F>::reduce(I beg, I end, T& result, B&& op, size_t g) {
10601054
}*/
10611055
}
10621056

1063-
// Function: reduce
1064-
template <typename F>
1065-
template <typename I, typename T, typename B>
1066-
auto BasicTaskflow<F>::reduce(I beg, I end, T& result, B&& op) {
1067-
return reduce(beg, end, result, std::forward<B>(op), 0);
1068-
}
1069-
10701057
// Function: reduce_min
10711058
// Find the minimum element over a range of items.
10721059
template <typename F>
10731060
template <typename I, typename T>
10741061
auto BasicTaskflow<F>::reduce_min(I beg, I end, T& result) {
10751062
return reduce(beg, end, result, [] (const auto& l, const auto& r) {
10761063
return std::min(l, r);
1077-
}, 0);
1064+
});
10781065
}
10791066

10801067
// Function: reduce_max
@@ -1084,22 +1071,20 @@ template <typename I, typename T>
10841071
auto BasicTaskflow<F>::reduce_max(I beg, I end, T& result) {
10851072
return reduce(beg, end, result, [] (const auto& l, const auto& r) {
10861073
return std::max(l, r);
1087-
}, 0);
1074+
});
10881075
}
10891076

10901077
// Function: transform_reduce
10911078
template <typename F>
10921079
template <typename I, typename T, typename B, typename U>
1093-
auto BasicTaskflow<F>::transform_reduce(I beg, I end, T& result, B&& bop, U&& uop, size_t g) {
1080+
auto BasicTaskflow<F>::transform_reduce(I beg, I end, T& result, B&& bop, U&& uop) {
10941081

10951082
using category = typename std::iterator_traits<I>::iterator_category;
10961083

10971084
// Even partition
1098-
if(g == 0) {
1099-
auto d = std::distance(beg, end);
1100-
auto w = std::max(size_t{1}, num_workers());
1101-
g = (d + w - 1) / w;
1102-
}
1085+
size_t d = std::distance(beg, end);
1086+
size_t w = std::max(size_t{1}, num_workers());
1087+
size_t g = std::max((d + w - 1) / w, size_t{2});
11031088

11041089
auto source = placeholder();
11051090
auto target = placeholder();
@@ -1146,14 +1131,6 @@ auto BasicTaskflow<F>::transform_reduce(I beg, I end, T& result, B&& bop, U&& uo
11461131
return std::make_pair(source, target);
11471132
}
11481133

1149-
// Function: transform_reduce
1150-
template <typename F>
1151-
template <typename I, typename T, typename B, typename U>
1152-
auto BasicTaskflow<F>::transform_reduce(I beg, I end, T& result, B&& bop, U&& uop) {
1153-
return transform_reduce(beg, end, result, std::forward<B>(bop), std::forward<U>(uop), 0);
1154-
}
1155-
1156-
11571134
/*// Function: parallel_range
11581135
template <typename F>
11591136
template <typename I, typename C>

unittest/taskflow.cpp

Lines changed: 16 additions & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -256,40 +256,40 @@ TEST_CASE("Taskflow.ParallelFor") {
256256
// --------------------------------------------------------
257257
TEST_CASE("Taskflow.Reduce") {
258258

259-
const auto plus_test = [](const size_t num_workers, auto &&data, size_t group){
259+
const auto plus_test = [](const size_t num_workers, auto &&data){
260260
tf::Taskflow tf(num_workers);
261261
int result {0};
262262
std::iota(data.begin(), data.end(), 1);
263-
tf.reduce(data.begin(), data.end(), result, std::plus<int>(), group);
263+
tf.reduce(data.begin(), data.end(), result, std::plus<int>());
264264
tf.wait_for_all();
265265
REQUIRE(result == std::accumulate(data.begin(), data.end(), 0, std::plus<int>()));
266266
};
267267

268-
const auto multiply_test = [](const size_t num_workers, auto &&data, size_t group){
268+
const auto multiply_test = [](const size_t num_workers, auto &&data){
269269
tf::Taskflow tf(num_workers);
270270
std::fill(data.begin(), data.end(), 1.0);
271271
double result {2.0};
272-
tf.reduce(data.begin(), data.end(), result, std::multiplies<double>(), group);
272+
tf.reduce(data.begin(), data.end(), result, std::multiplies<double>());
273273
tf.wait_for_all();
274274
REQUIRE(result == std::accumulate(data.begin(), data.end(), 2.0, std::multiplies<double>()));
275275
};
276276

277-
const auto max_test = [](const size_t num_workers, auto &&data, size_t group){
277+
const auto max_test = [](const size_t num_workers, auto &&data){
278278
tf::Taskflow tf(num_workers);
279279
std::iota(data.begin(), data.end(), 1);
280280
int result {0};
281281
auto lambda = [](const auto& l, const auto& r){return std::max(l, r);};
282-
tf.reduce(data.begin(), data.end(), result, lambda, group);
282+
tf.reduce(data.begin(), data.end(), result, lambda);
283283
tf.wait_for_all();
284284
REQUIRE(result == std::accumulate(data.begin(), data.end(), 0, lambda));
285285
};
286286

287-
const auto min_test = [](const size_t num_workers, auto &&data, size_t group){
287+
const auto min_test = [](const size_t num_workers, auto &&data){
288288
tf::Taskflow tf(num_workers);
289289
std::iota(data.begin(), data.end(), 1);
290290
int result {std::numeric_limits<int>::max()};
291291
auto lambda = [](const auto& l, const auto& r){return std::min(l, r);};
292-
tf.reduce(data.begin(), data.end(), result, lambda, group);
292+
tf.reduce(data.begin(), data.end(), result, lambda);
293293
tf.wait_for_all();
294294
REQUIRE(result == std::accumulate(
295295
data.begin(), data.end(), std::numeric_limits<int>::max(), lambda)
@@ -298,19 +298,17 @@ TEST_CASE("Taskflow.Reduce") {
298298

299299
for(size_t i=0; i<=4; ++i){
300300
for(size_t j=0; j<=256; j=j*2+1){
301-
for(size_t k=0; k<=256; k++){
302-
plus_test(i, std::vector<int>(j), k);
303-
plus_test(i, std::list<int>(j) , k);
301+
plus_test(i, std::vector<int>(j));
302+
plus_test(i, std::list<int>(j));
304303

305-
multiply_test(i, std::vector<double>(j), k);
306-
multiply_test(i, std::list<double>(j), k);
304+
multiply_test(i, std::vector<double>(j));
305+
multiply_test(i, std::list<double>(j));
307306

308-
max_test(i, std::vector<int>(j), k);
309-
max_test(i, std::list<int>(j), k);
307+
max_test(i, std::vector<int>(j));
308+
max_test(i, std::list<int>(j));
310309

311-
min_test(i, std::vector<int>(j), k);
312-
min_test(i, std::list<int>(j), k);
313-
}
310+
min_test(i, std::vector<int>(j));
311+
min_test(i, std::list<int>(j));
314312
}
315313
}
316314
}

0 commit comments

Comments
 (0)