Skip to content

Commit 8a385f6

Browse files
before merging to the master
1 parent 7108f77 commit 8a385f6

24 files changed

Lines changed: 271 additions & 128 deletions

LICENSE

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
MIT License
22

3-
Copyright (c) 2018 Tsung-Wei Huang, Chun-Xun Lin, Guannan Guo, and Martin D. F. Wong
3+
Copyright (c) 2018 Tsung-Wei Huang, Chun-Xun Lin, and Martin D. F. Wong
44

55
Permission is hereby granted, free of charge, to any person obtaining a copy
66
of this software and associated documentation files (the "Software"), to deal

README.md

Lines changed: 10 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -28,7 +28,7 @@ and is by far faster, more expressive, and easier for drop-in integration than e
2828

2929
*"Cpp-Taskflow has a very simple and elegant tasking interface. The performance also scales very well." [totalgee][totalgee]*
3030

31-
*"Best poster award for open-source parallel programming library." Cpp Conference 2018*
31+
*"Best poster award for open-source parallel programming library." [Cpp Conference 2018][Cpp Conference 2018]*
3232

3333
# Get Started with Cpp-Taskflow
3434

@@ -43,18 +43,18 @@ int main(){
4343
tf::Taskflow tf(std::thread::hardware_concurrency());
4444

4545
auto [A, B, C, D] = tf.silent_emplace(
46-
[] () { std::cout << "TaskA\n"; }, // the taskflow graph
46+
[] () { std::cout << "TaskA\n"; }, // task dependency graph
4747
[] () { std::cout << "TaskB\n"; }, //
4848
[] () { std::cout << "TaskC\n"; }, // +---+
4949
[] () { std::cout << "TaskD\n"; } // +---->| B |-----+
5050
); // | +---+ |
5151
// +---+ +-v-+
52-
A.precede(B); // B runs after A // | A | | D |
53-
A.precede(C); // C runs after A // +---+ +-^-+
54-
B.precede(D); // D runs after B // | +---+ |
55-
C.precede(D); // D runs after C // +---->| C |-----+
52+
A.precede(B); // A runs before B // | A | | D |
53+
A.precede(C); // A runs before C // +---+ +-^-+
54+
B.precede(D); // B runs before D // | +---+ |
55+
C.precede(D); // C runs before D // +---->| C |-----+
5656
// +---+
57-
tf.wait_for_all(); // block until finished
57+
tf.wait_for_all(); // block until finish
5858

5959
return 0;
6060
}
@@ -732,7 +732,7 @@ The folder `example/` contains several examples and is a great place to learn to
732732
| [reduce.cpp](./example/reduce.cpp)| performs reduce operations over linear containers |
733733
| [subflow.cpp](./example/subflow.cpp)| demonstrates how to create a subflow graph that spawns three dynamic tasks |
734734
| [threadpool.cpp](./example/threadpool.cpp)| benchmarks different threadpool implementations |
735-
| [taskflow.cpp](./example/taskflow.cpp)| benchmarks different threadpool implementations |
735+
| [taskflow.cpp](./example/taskflow.cpp)| benchmarks taskflow on different task dependency graphs |
736736
| [threadpool_cxx14.cpp](./example/threadpool_cxx14.cpp)| shows use of the C++14-compatible threadpool implementation, which may be used when you have no inter-task (taskflow) dependencies to express |
737737

738738
# Get Involved
@@ -769,16 +769,13 @@ that incorporate complex task dependencies.
769769

770770
Please [let me know][email me] if I forgot your project!
771771

772-
If you work for a company using Cpp-Taskflow or has the means to do so,
773-
please consider [financial support][PayMe].
774-
775772
# License
776773

777774
Cpp-Taskflow is licensed under the [MIT License](./LICENSE).
778775

779776
* * *
780777

781-
[Tsung-Wei Huang]: http://web.engr.illinois.edu/~thuang19/
778+
[Tsung-Wei Huang]: https://twhuang.ece.illinois.edu/
782779
[Chun-Xun Lin]: https://github.com/clin99
783780
[Martin Wong]: https://ece.illinois.edu/directory/profile/mdfwong
784781
[Andreas Olofsson]: https://github.com/aolofsson
@@ -805,5 +802,5 @@ Cpp-Taskflow is licensed under the [MIT License](./LICENSE).
805802
[PayMe]: https://www.paypal.me/twhuang/10
806803
[C++17]: https://en.wikipedia.org/wiki/C%2B%2B17
807804
[email me]: mailto:[email protected]
808-
805+
[Cpp Conference 2018]: https://github.com/CppCon/CppCon2018
809806

doc/cookbook/a.out

77 KB
Binary file not shown.
Lines changed: 43 additions & 36 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
11
# Execute a Task Dependency Graph
22

33
After you create a task dependency graph,
4-
you need to dispatch it for execution.
4+
you need to dispatch it to threads for execution
55
In this tutorial, we will show you how to execute a
66
task dependency graph.
77

@@ -14,20 +14,27 @@ task dependency graph.
1414

1515
# Graph and Topology
1616

17-
Each taskflow object has exactly one graph at a time to represent
18-
task dependencies constructed so far.
17+
Each taskflow object has exactly one graph at a time that represents
18+
the tasks and the dependencies constructed so far.
1919
The graph exists until users dispatch it for execution.
20-
We call a dispatched graph a *topology*.
20+
In Cpp-Taskflow, we call a dispatched graph a *topology*.
21+
A topology is a data structure that wraps up a dispatched graph
22+
and stores a few metadata obtained at runtime.
2123
Each taskflow object has a list of topologies to keep track of the
2224
execution status of dispatched graphs.
23-
All tasks are executed in a shared thread pool.
25+
Users can retrieve this information later on for graph inspection and debugging.
26+
27+
All tasks are executed in a shared thread storage coupled with a
28+
*task scheduler* to decide which thread runs which task.
29+
Cpp-Taskflow provides two ways to dispatch a task dependency graph,
30+
*blocking* and *non-blocking*.
2431

2532
# Blocking Execution
2633

2734
One way to dispatch the present task dependency graph
2835
is to use the method `wait_for_all`.
29-
Calling `wait_for_all` will dispatch the present graph to a shared thread storage
30-
and block the program until all tasks finish.
36+
Calling `wait_for_all` dispatches the graph to threads and blocks the program flow
37+
until all tasks finish.
3138

3239
```cpp
3340
1: tf::Taskflow tf(4);
@@ -47,16 +54,16 @@ All topologies will be cleaned up as well.
4754
4855
Another way to dispatch the present task dependency graph
4956
is to use the method `dispatch` or `silent_dispatch`.
50-
These two methods will dispatch the present graph to threads
51-
and returns immediately without blocking the program.
52-
Non-blocking methods allows the program to perform other computations
53-
that can overlap with the execution of topologies.
57+
These two methods both dispatch the present graph to threads
58+
and return immediately without blocking the program flow.
59+
Non-blocking methods allow the program to perform other computations
60+
that can overlap the graph execution.
5461
5562
```cpp
5663
1: tf::Taskflow tf(4);
5764
2:
58-
3: auto A = tf.silent_emplace([] () { std::cout << "TaskA\n"; });
59-
4: auto B = tf.silent_emplace([] () { std::cout << "TaskB\n"; });
65+
3: auto A = tf.silent_emplace([] () { std::cout << "Task A\n"; });
66+
4: auto B = tf.silent_emplace([] () { std::cout << "Task B\n"; });
6067
5: A.precede(B);
6168
6:
6269
7: auto F = tf.dispatch();
@@ -68,16 +75,15 @@ that can overlap with the execution of topologies.
6875
Debrief:
6976

7077
+ Line 1-5 creates a graph with two tasks and one dependency
71-
+ Line 7 dispatches this graph to a topology
72-
and obtains a `std::future` to access its execution status
73-
+ Line 8-9 performs some computations to overlap the execution of this topology
78+
+ Line 7 dispatches this graph
79+
and obtains a `std::future` object to access its execution status
80+
+ Line 8-9 performs some computations to overlap the execution of task A and task B
7481
+ Line 10 blocks the program until this topology finishes
7582

7683
If you do not care the status of a dispatched graph,
7784
use the method `silent_dispatch`.
7885
This method does not return anything.
7986

80-
8187
```cpp
8288
1: tf::Taskflow tf(4);
8389
2:
@@ -92,11 +98,12 @@ This method does not return anything.
9298
9399
# Wait on Topologies
94100
95-
When you call `dispatch` or `silent_dispatch`,
96-
the taskflow object will dispatch the present graph to threads
97-
and maintain a list of data structures called *topology*
98-
to store the execution status.
99-
These topologies are not cleaned up automatically on completion.
101+
Unlike `wait_for_all`, calling `dispatch` or `silent_dispatch`
102+
will not clean up the topologies upon completion.
103+
This allows users to dump the graph structure, in particular, created from dynamic tasking.
104+
However, it may be necessary at some points of the program
105+
to synchronize with the previously dispatched graphs.
106+
Cpp-Taskflow provides a method `wait_for_topologies` for this purpose.
100107
101108
102109
```cpp
@@ -120,20 +127,21 @@ These topologies are not cleaned up automatically on completion.
120127
Debrief
121128
+ Line 1 creates a taskflow object with four worker threads
122129
+ Line 3-5 creates a dependency graph of two tasks and one dependency
123-
+ Line 7 dispatches the present graph to threads and obtains a future object
130+
+ Line 7 dispatches this graph to threads and obtains a future object for users
124131
to access the execution status
125132
+ Line 9 starts with a new dependency graph with one task
126-
+ Line 11 dispatches the present graph to threads
127-
+ Line 13 blocks the program until all running topologies finish
133+
+ Line 11 dispatches the graph to threads
134+
+ Line 13 blocks the program until both graphs finish
128135

129-
It's clear now Line 9 overlaps the execution of the first graph.
130-
After Line 11, there are two topologies running inside the taskflow object.
136+
It is clear now Line 9 overlaps the execution of the first graph.
137+
After Line 11, there are two topologies in the taskflow object.
131138
Calling the method `wait_for_topologies` blocks the
132-
program until both complete.
139+
program until both graph complete.
133140

134141
# Example 1: Multiple Dispatches
135142

136-
The example below demonstrates how to create multiple task dependency graphs and dispatch each of them asynchronously.
143+
The example below demonstrates how to create multiple task dependency graphs and
144+
dispatch each of them asynchronously.
137145

138146
```cpp
139147
1: #include <taskflow/taskflow.hpp>
@@ -171,14 +179,13 @@ Debrief:
171179
+ Line 12-19 defines a function that iteratively creates a task dependency graph and dispatches it asynchronously
172180
+ Line 23 starts the procedure of multiple dispatches
173181
174-
Notice in Line 24 the counter ends up with 20.
175-
The destructor of a taskflow object will not leave until all running
176-
topologies finish.
182+
Notice in Line 24 the counter ends up being 20.
183+
By default, destructing a taskflow object will wait on all topologies to finish.
177184
178185
# Example 2: Connect Two Dependency Graphs
179186
180-
The example demonstrates how to use the `std::future` to manually impose a dependency link
181-
on two dispatched graphs.
187+
The example demonstrates how to use the `std::future` to explicitly impose a dependency
188+
link on two dispatched graphs.
182189
183190
```cpp
184191
1: #include <taskflow/taskflow.hpp>
@@ -233,10 +240,10 @@ on two dispatched graphs.
233240
Debrief:
234241
+ Line 5 creates a taskflow object with four worker threads
235242
+ Line 7-8 creates a vector of integer items and an integer variable to store the summation value
236-
+ Line 10-21 creates a dependency graph that resizes the vector size and fills it with sequentially increasing values starting with zero
243+
+ Line 10-21 creates a dependency graph that resizes the vector and fills it with sequentially increasing values starting with zero
237244
+ Line 23-39 creates another dependency graph that sums up the values in the vector
238245
+ Line 25-27 creates a task that initializes the variable `sum` to zero, and overlaps its execution with the first dependency graph
239-
+ Line 30-35 creates a task that blocks until the first dependency graph completes and then sums up all integer values in the propertly initialized vector
246+
+ Line 30-35 creates a task that blocks until the first dependency graph completes and then sums up all integer values in the properly initialized vector
240247
+ Line 42 blocks until the second dependency graph finishes
241248
+ Line 44 puts an assertion guard on the final summation value
242249

Lines changed: 9 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -1,10 +1,8 @@
11
# Spawn a Task Dependency Graph at Runtime
22

33
It is very common for a parallel program to
4-
spawn tasks at runtime.
5-
In Cpp-Taskflow, we call this *dynamic tasking* -
6-
creating another task dependency graph
7-
during the execution of a task.
4+
spawn task dependency graphs at runtime.
5+
In Cpp-Taskflow, we call this *dynamic tasking*.
86
In this tutorial, we are going to demonstrate how to enable dynamic tasking
97
in Cpp-Taskflow.
108

@@ -19,10 +17,10 @@ These tasks are spawned from a parent task and are grouped together to a
1917
*subflow* dependency graph.
2018
Cpp-Taskflow has an unified interface for static and dynamic tasking.
2119
To create a subflow for dynamic tasking, emplace a callable
22-
that takes one argument of type `SubflowBuilder`.
23-
A `SubflowBuilder` object will be created during the runtime and
20+
that takes one argument of type `tf::SubflowBuilder`.
21+
A `tf::SubflowBuilder` object will be created during the runtime and
2422
passed to the task.
25-
All graph building methods you find in taskflow can be also used in a subflow builder.
23+
All graph building methods you find in taskflow are applicable for a subflow builder.
2624

2725
```cpp
2826
1: tf::Taskflow tf(4); // create a taskflow object with four worker threads
@@ -59,7 +57,8 @@ Debrief:
5957
+ Line 21 dumps the topology that represents the entire task dependency graph
6058
6159
Line 7-13 is the main coding block to enable dynamic tasking.
62-
Cpp-Taskflow uses a variant date type to unify the interface of static tasking and dynamic tasking.
60+
Cpp-Taskflow uses a [std::variant][std::variant] date type to
61+
unify the interface of static tasking and dynamic tasking.
6362
The runtime will create a *subflow builder* passing it to task B,
6463
and spawn a dependency graph as described by the associated callable.
6564
This new subflow graph will be added to the topology to which its parent task B belongs to.
@@ -151,5 +150,7 @@ A detached subflow will run independently and eventually join the topology
151150
of its parent subflow.
152151
153152
153+
* * *
154154
155+
[std::variant]: https://en.cppreference.com/w/cpp/header/variant
155156
File renamed without changes.

doc/cookbook/parallel_for.cpp

Lines changed: 23 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,23 @@
1+
#include <taskflow/taskflow.hpp>
2+
3+
int main() {
4+
5+
tf::Taskflow tf(4);
6+
7+
std::vector<int> items {1, 2, 3, 4, 5, 6, 7, 8};
8+
9+
auto [S, T] = tf.parallel_for(items.begin(), items.end(), [] (int item) {
10+
std::cout << std::this_thread::get_id() << " runs " << item << std::endl;
11+
});
12+
13+
S.work([] () { std::cout << "S\n"; }).name("S");
14+
T.work([] () { std::cout << "T\n"; }).name("T");
15+
16+
tf.wait_for_all();
17+
18+
return 0;
19+
}
20+
21+
22+
23+
Lines changed: 7 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -13,7 +13,7 @@ to run a for loop in parallel.
1313
Cpp-Taskflow has a STL-style method `parallel_for`
1414
that takes a range of items and applies a callable to each of the item in parallel.
1515
The method constructs a sub-graph representing this workload
16-
and returns a task pair as synchronization points.
16+
and returns a task pair as two synchronization points to this task pattern.
1717

1818
```cpp
1919
1: tf::Taskflow tf(4);
@@ -32,8 +32,8 @@ and returns a task pair as synchronization points.
3232
14: tf.wait_for_all();
3333
```
3434
35-
The above code generate the following task dependency graph.
36-
The label 0x56\* represents an internal node to execute the callable.
35+
The above code generates the following task dependency graph.
36+
The label 0x56\* represents an internal task node to execute the callable object.
3737
By default, Cpp-Taskflow evenly partitions and distributes the workload
3838
to all threads.
3939
In our example of eight tasks and four workers, each internal node is responsible for two items.
@@ -82,10 +82,10 @@ and would like to enable more efficient parallelization.
8282

8383
![](parallel_for2.png)
8484

85-
## Construct the Graph Manually
85+
## Construct the Graph Explicitly
8686

87-
You can manually construct a dependency graph that represents a parallel execution
88-
of a for loop
87+
You can explicitly construct a dependency graph that represents a parallel execution
88+
of a for loop.
8989
using only the basic methods `silent_emplace` and `precede`.
9090

9191

@@ -210,14 +210,12 @@ The output of this programs is:
210210
sum is: 1024
211211
```
212212

213-
214-
215213
Debrief:
216214
+ Line 5 creates a taskflow object with four worker threads
217215
+ Line 7 creates a vector of 1024 uninitialized integers
218216
+ Line 8 creates an atomic integer variable
219217
+ Line 10-14 creates a task that captures the vector to initialize all items to one
220-
+ Line 16-18 sums up all items with each partition of 128 items (total 1024/128=8 partitions)
218+
+ Line 16-18 sums up all items with each thread running on a partition of 128 items (total 1024/128=8 partitions)
221219
+ Line 20-22 creates a task that outputs the summation value
222220
+ Line 24-25 pipelines the parallel-for workload with the two tasks
223221
+ Line 27 dispatches the graph to threads and blocks until the execution completes

0 commit comments

Comments
 (0)