Optimizing Processing Pipelines Discussion
Background
Since the early 2010s I've dabbled, one way or another, in optimizing multi-threaded processing pipelines. My first attempt, although difficult, was a success. That pipeline had 4 segments (BlockingCollections), and I used all available threads and a custom TaskScheduler to optimize flow through the system.
Since then, I've been experimenting with different processing schemes and have had some success with TPL.Dataflow blocks. But they aren't necessarily a perfect answer.
Push vs Pull
Neither method appears to be a clear winner, and it may be that both have to be in play to truly maximize throughput. Pushing can cause work to bunch up at the start of the pipeline, while pulling may simply not be able to get enough input.
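To make the two failure modes concrete, here is a minimal Python sketch using the standard library's bounded queue. It is only an illustration, not the .NET code from my pipelines; the helper names try_push and try_pull are made up for this example. A full segment rejects the push (work bunching up), and an empty segment has nothing to pull (starvation).

```python
import queue

def try_push(q, item):
    """Producer side: return False instead of blocking when the segment is full."""
    try:
        q.put_nowait(item)
        return True
    except queue.Full:
        return False

def try_pull(q):
    """Consumer side: return (True, item), or (False, None) when the segment is empty."""
    try:
        return True, q.get_nowait()
    except queue.Empty:
        return False, None

# A segment that holds at most 2 items (capacity chosen arbitrarily for the demo).
segment = queue.Queue(maxsize=2)
pushed = [try_push(segment, n) for n in range(3)]  # third push finds the segment full
pulled = [try_pull(segment) for _ in range(3)]     # third pull finds it empty
```

A real pipeline would block or reschedule rather than just report failure, but the two return values mark exactly the points where a pusher or a puller has to make a decision.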
Reverse Pipelining
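This is an interesting scheme where all segments are queues, but segment priority runs in reverse: the closer a segment is to the end of the pipe, the higher its priority. The end of the pipe is more important than the beginning. Because items that are nearly finished get completed first, the beginning of the pipeline is prevented from bunching up.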
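The selection rule can be sketched in a few lines of Python. This is a single-threaded illustration under my own assumptions, not a real scheduler: pick_stage and run_reverse_pipeline are hypothetical names, and a multi-threaded version would need each worker to make the same choice under a lock or with concurrent queues.

```python
from collections import deque

def pick_stage(queues):
    """Return the index of the last non-empty queue, or None if all are empty."""
    for i in range(len(queues) - 1, -1, -1):
        if queues[i]:
            return i
    return None

def run_reverse_pipeline(items, stages):
    """Drain items through stages, always servicing the latest stage that has work."""
    queues = [deque() for _ in stages]
    queues[0].extend(items)
    results, order = [], []
    while (i := pick_stage(queues)) is not None:
        value = stages[i](queues[i].popleft())
        order.append(i)  # record which stage ran, to show the scheduling pattern
        if i + 1 < len(stages):
            queues[i + 1].append(value)  # hand off to the next segment
        else:
            results.append(value)        # the last segment emits the final result
    return results, order

# Three toy stages standing in for real work.
stages = [lambda x: x + 1, lambda x: x * 2, str]
results, order = run_reverse_pipeline([1, 2], stages)
```

With one worker, each item is carried all the way to the end before the next one is admitted, so the first queue only drains as fast as the end of the pipe can finish work.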
Adaptive Priority
As queues drain, priorities can change: you may never want a segment's supply to get to zero, nor do you want it to fill to its boundary. So processing can shift in priority depending on how full each segment is.
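One way to express this is a score per segment based on fill levels. The heuristic below is purely an assumption for illustration (I'm not claiming it is optimal): a segment is more urgent the fuller its own queue is and the emptier the queue it feeds. The names stage_scores and pick_adaptive are hypothetical, and every queue is assumed to share one capacity.

```python
def fill_ratio(length, capacity):
    """Fraction of a bounded queue that is currently occupied."""
    return length / capacity

def stage_scores(lengths, capacity):
    """Score each segment: high when its input is full and its output has room."""
    scores = []
    for i, length in enumerate(lengths):
        # The last segment feeds no queue, so treat its downstream as empty.
        downstream = lengths[i + 1] if i + 1 < len(lengths) else 0
        room = 1.0 - fill_ratio(downstream, capacity)
        scores.append(fill_ratio(length, capacity) * room)
    return scores

def pick_adaptive(lengths, capacity):
    """Return the segment to service next, or None if no segment can make progress."""
    scores = stage_scores(lengths, capacity)
    # Ties go to the later segment, favoring the end of the pipe.
    best = max(range(len(scores)), key=lambda i: (scores[i], i))
    return best if scores[best] > 0 else None

first_choice = pick_adaptive([8, 2, 0], 10)   # full input, roomy output
blocked_case = pick_adaptive([8, 10, 0], 10)  # segment 0 is blocked by a full queue 1
```

In the second call the first segment scores zero because the queue it feeds is at its boundary, so priority shifts to the segment that can relieve it.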
Discussion
I'm curious what people have experienced and which methods have worked for them. For the pipeline I'm working on now, I may have to go back to what I did before: generating a set of queues and having every available thread select one of them based on which is more important (higher priority). Making that selection well can take some logic, though.