Halide 14.0.0
Halide compiler and libraries
Schedule.h
1#ifndef HALIDE_SCHEDULE_H
2#define HALIDE_SCHEDULE_H
3
4/** \file
5 * Defines the internal representation of the schedule for a function
6 */
7
8#include <map>
9#include <string>
10#include <utility>
11#include <vector>
12
13#include "DeviceAPI.h"
14#include "Expr.h"
15#include "FunctionPtr.h"
16#include "Parameter.h"
17#include "PrefetchDirective.h"
18
19namespace Halide {
20
21class Func;
22struct VarOrRVar;
23
24namespace Internal {
25class Function;
26struct FunctionContents;
27struct LoopLevelContents;
28} // namespace Internal
29
30/** Different ways to handle a tail case in a split when the
31 * factor does not provably divide the extent. */
32enum class TailStrategy {
33 /** Round up the extent to be a multiple of the split
34 * factor. Not legal for RVars, as it would change the meaning
35 * of the algorithm. Pros: generates the simplest, fastest
36 * code. Cons: if used on a stage that reads from the input or
37 * writes to the output, constrains the input or output size
38 * to be a multiple of the split factor. */
39 RoundUp,
40
41 /** Guard the inner loop with an if statement that prevents
42 * evaluation beyond the original extent. Always legal. The if
43 * statement is treated like a boundary condition, and
44 * factored out into a loop epilogue if possible. Pros: no
45 * redundant re-evaluation; does not constrain input or
46 * output sizes. Cons: increases code size due to separate
47 * tail-case handling; vectorization will scalarize in the tail
48 * case to handle the if statement. */
49 GuardWithIf,
50
51 /** Guard the loads and stores in the loop with an if statement
52 * that prevents evaluation beyond the original extent. Always
53 * legal. The if statement is treated like a boundary condition,
54 * and factored out into a loop epilogue if possible.
55 * Pros: no redundant re-evaluation; does not constrain input or
56 * output sizes. Cons: increases code size due to separate
57 * tail-case handling. */
58 Predicate,
59
60 /** Guard the loads in the loop with an if statement that
61 * prevents evaluation beyond the original extent. Only legal
62 * for innermost splits. Not legal for RVars, as it would change
63 * the meaning of the algorithm. The if statement is treated like
64 * a boundary condition, and factored out into a loop epilogue if
65 * possible.
66 * Pros: does not constrain input sizes, output size constraints
67 * are simpler than full predication. Cons: increases code size
68 * due to separate tail-case handling, constrains the output size
69 * to be a multiple of the split factor. */
70 PredicateLoads,
71
72 /** Guard the stores in the loop with an if statement that
73 * prevents evaluation beyond the original extent. Only legal
74 * for innermost splits. Not legal for RVars, as it would change
75 * the meaning of the algorithm. The if statement is treated like
76 * a boundary condition, and factored out into a loop epilogue if
77 * possible.
78 * Pros: does not constrain output sizes, input size constraints
79 * are simpler than full predication. Cons: increases code size
80 * due to separate tail-case handling, constrains the input size
81 * to be a multiple of the split factor. */
82 PredicateStores,
83
84 /** Prevent evaluation beyond the original extent by shifting
85 * the tail case inwards, re-evaluating some points near the
86 * end. Only legal for pure variables in pure definitions. If
87 * the inner loop is very simple, the tail case is treated
88 * like a boundary condition and factored out into an
89 * epilogue.
90 *
91 * This is a good trade-off between several factors. Like
92 * RoundUp, it supports vectorization well, because the inner
93 * loop is always a fixed size with no data-dependent
94 * branching. It increases code size slightly for inner loops
95 * due to the epilogue handling, but not for outer loops
96 * (e.g. loops over tiles). If used on a stage that reads from
97 * an input or writes to an output, this strategy only requires
98 * that the input/output extent be at least the split factor,
99 * instead of a multiple of the split factor as with RoundUp. */
100 ShiftInwards,
101
102 /** For pure definitions use ShiftInwards. For pure vars in
103 * update definitions use RoundUp. For RVars in update
104 * definitions use GuardWithIf. */
105 Auto
106};
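The tail strategies above are easiest to see as concrete loop rewrites. The following standalone C++ sketch (illustrative only, not Halide's actual lowered code; it assumes `extent >= factor`) counts how often each point gets evaluated under GuardWithIf versus ShiftInwards:

```cpp
#include <algorithm>
#include <vector>

// How often is each point of [0, extent) evaluated when the loop over x
// is split by `factor`? Assumes extent >= factor.
inline std::vector<int> split_guard_with_if(int extent, int factor) {
    std::vector<int> count(extent, 0);
    int tiles = (extent + factor - 1) / factor;  // round up
    for (int outer = 0; outer < tiles; outer++) {
        for (int inner = 0; inner < factor; inner++) {
            int x = outer * factor + inner;
            if (x < extent) {  // the guard: never evaluate past the extent
                count[x]++;
            }
        }
    }
    return count;
}

inline std::vector<int> split_shift_inwards(int extent, int factor) {
    std::vector<int> count(extent, 0);
    int tiles = (extent + factor - 1) / factor;
    for (int outer = 0; outer < tiles; outer++) {
        // Shift the final tile left so it stays in bounds; points near the
        // end may be evaluated twice (legal only for pure definitions).
        int base = std::min(outer * factor, extent - factor);
        for (int inner = 0; inner < factor; inner++) {
            count[base + inner]++;
        }
    }
    return count;
}
```

With extent 10 and factor 4, GuardWithIf evaluates every point exactly once, while ShiftInwards shifts the last tile to cover 6..9 and so re-evaluates points 6 and 7.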
107
108/** Different ways to handle the case when the start/end of the loops of stages
109 * computed with (fused) are not aligned. */
110enum class LoopAlignStrategy {
111 /** Shift the start of the fused loops to align. */
112 AlignStart,
113
114 /** Shift the end of the fused loops to align. */
115 AlignEnd,
116
117 /** compute_with will make no attempt to align the start/end of the
118 * fused loops. */
119 NoAlign,
120
121 /** By default, LoopAlignStrategy is set to NoAlign. */
122 Auto
123};
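A rough numeric model of what alignment means for two fused loop ranges (this is an assumption-level sketch of the idea, not Halide's internal representation; `HalfOpenRange` and the helper names are invented for illustration):

```cpp
#include <algorithm>

// Hypothetical model: each fused stage iterates over [min, min + extent).
struct HalfOpenRange {
    int min, extent;
};

// NoAlign: the fused loop simply spans both ranges as-is.
inline HalfOpenRange fuse_no_align(HalfOpenRange a, HalfOpenRange b) {
    int lo = std::min(a.min, b.min);
    int hi = std::max(a.min + a.extent, b.min + b.extent);
    return {lo, hi - lo};
}

// AlignStart: shift b so both loops begin together; the shift is then
// compensated for inside b's loop body.
inline int align_start_shift(HalfOpenRange a, HalfOpenRange b) {
    return a.min - b.min;
}

// AlignEnd: shift b so both loops end together.
inline int align_end_shift(HalfOpenRange a, HalfOpenRange b) {
    return (a.min + a.extent) - (b.min + b.extent);
}
```

For ranges `[0, 8)` and `[2, 7)`, NoAlign spans `[0, 8)`; AlignStart would shift the second range by -2, AlignEnd by +1.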
124
125/** A reference to a site in a Halide statement at the top of the
126 * body of a particular for loop. Evaluating a region of a Halide
127 * function is done by generating a loop nest that spans its
128 * dimensions. We schedule the inputs to that function by
129 * recursively injecting realizations for them at particular sites
130 * in this loop nest. A LoopLevel identifies such a site. The site
131 * can either be a loop nest within all stages of a function
132 * or it can refer to a loop nest within a particular function's
133 * stage (initial definition or updates).
134 *
135 * Note that a LoopLevel is essentially a pointer to an underlying value;
136 * all copies of a LoopLevel refer to the same site, so mutating one copy
137 * (via the set() method) will effectively mutate all copies:
138 \code
139 Func f;
140 Var x;
141 LoopLevel a(f, x);
142 // Both a and b refer to LoopLevel(f, x)
143 LoopLevel b = a;
144 // Now both a and b refer to LoopLevel::root()
145 a.set(LoopLevel::root());
146 \endcode
147 * This is quite useful when splitting Halide code into utility libraries, as it allows
148 * a library to schedule code according to a caller's specifications, even if the caller
149 * hasn't fully defined its pipeline yet:
150 \code
151 Func demosaic(Func input,
152 LoopLevel intermed_compute_at,
153 LoopLevel intermed_store_at,
154 LoopLevel output_compute_at) {
155 Func intermed = ...;
156 Func output = ...;
157 intermed.compute_at(intermed_compute_at).store_at(intermed_store_at);
158 output.compute_at(output_compute_at);
159 return output;
160 }
161
162 void process() {
163 // Note that these LoopLevels are all undefined when we pass them to demosaic()
164 LoopLevel intermed_compute_at, intermed_store_at, output_compute_at;
165 Func input = ...;
166 Func demosaiced = demosaic(input, intermed_compute_at, intermed_store_at, output_compute_at);
167 Func output = ...;
168
169 // We need to ensure all LoopLevels have a well-defined value prior to lowering:
170 intermed_compute_at.set(LoopLevel(output, y));
171 intermed_store_at.set(LoopLevel(output, y));
172 output_compute_at.set(LoopLevel(output, x));
173 }
174 \endcode
175 */
176class LoopLevel {
177 Internal::IntrusivePtr<Internal::LoopLevelContents> contents;
178
179 explicit LoopLevel(Internal::IntrusivePtr<Internal::LoopLevelContents> c)
180 : contents(std::move(c)) {
181 }
182 LoopLevel(const std::string &func_name, const std::string &var_name,
183 bool is_rvar, int stage_index, bool locked = false);
184
185public:
186 /** Return the index of the function stage associated with this loop level.
187 * Asserts if undefined */
188 int stage_index() const;
189
190 /** Identify the loop nest corresponding to some dimension of some function */
191 // @{
192 LoopLevel(const Internal::Function &f, const VarOrRVar &v, int stage_index = -1);
193 LoopLevel(const Func &f, const VarOrRVar &v, int stage_index = -1);
194 // @}
195
196 /** Construct an undefined LoopLevel. Calling any method on an undefined
197 * LoopLevel (other than set()) will assert. */
198 LoopLevel();
199
200 /** Construct a special LoopLevel value that implies
201 * that a function should be inlined away. */
202 static LoopLevel inlined();
203
204 /** Construct a special LoopLevel value which represents the
205 * location outside of all for loops. */
206 static LoopLevel root();
207
208 /** Mutate our contents to match the contents of 'other'. */
209 void set(const LoopLevel &other);
210
211 // All the public methods below this point are meant only for internal
212 // use by Halide, rather than user code; hence, they are deliberately
213 // documented with plain comments (rather than Doxygen) to avoid being
214 // present in user documentation.
215
216 // Lock this LoopLevel.
217 LoopLevel &lock();
218
219 // Return the Func name. Asserts if the LoopLevel is_root() or is_inlined() or !defined().
220 std::string func() const;
221
222 // Return the VarOrRVar. Asserts if the LoopLevel is_root() or is_inlined() or !defined().
223 VarOrRVar var() const;
224
225 // Return true iff the LoopLevel is defined. (Only LoopLevels created
226 // with the default ctor are undefined.)
227 bool defined() const;
228
229 // Test if a loop level corresponds to inlining the function.
230 bool is_inlined() const;
231
232 // Test if a loop level is 'root', which describes the site
233 // outside of all for loops.
234 bool is_root() const;
235
236 // Return a string of the form func.var -- note that this is safe
237 // to call for root or inline LoopLevels, but asserts if !defined().
238 std::string to_string() const;
239
240 // Compare this loop level against the variable name of a for
241 // loop, to see if this loop level refers to the site
242 // immediately inside this loop. Asserts if !defined().
243 bool match(const std::string &loop) const;
244
245 bool match(const LoopLevel &other) const;
246
247 // Check if two loop levels are exactly the same.
248 bool operator==(const LoopLevel &other) const;
249
250 bool operator!=(const LoopLevel &other) const {
251 return !(*this == other);
252 }
253
254private:
255 void check_defined() const;
256 void check_locked() const;
257 void check_defined_and_locked() const;
258};
259
260struct FuseLoopLevel {
261 LoopLevel level;
262 /** Contains alignment strategies for the fused dimensions (indexed by the
263 * dimension name). If not in the map, use the default alignment strategy
264 * to align the fused dimension (see \ref LoopAlignStrategy::Auto).
265 */
266 std::map<std::string, LoopAlignStrategy> align;
267
268 FuseLoopLevel()
269 : level(LoopLevel::inlined().lock()) {
270 }
271 FuseLoopLevel(const LoopLevel &level, const std::map<std::string, LoopAlignStrategy> &align)
272 : level(level), align(align) {
273 }
274};
275
276namespace Internal {
277
278class IRMutator;
279struct ReductionVariable;
280
281struct Split {
282 std::string old_var, outer, inner;
283 Expr factor;
284 bool exact; // Is it required that the factor divides the extent
285 // of the old var. True for splits of RVars. Forces
286 // tail strategy to be GuardWithIf.
287 TailStrategy tail;
288
289 enum SplitType { SplitVar = 0,
290 RenameVar,
291 FuseVars,
292 PurifyRVar };
293
294 // If split_type is Rename, then this is just a renaming of the
295 // old_var to the outer and not a split. The inner var should
296 // be ignored, and factor should be one. Renames are kept in
297 // the same list as splits so that ordering between them is
298 // respected.
299
300 // If split_type is Purify, this replaces the old_var RVar with
301 // the outer Var. The inner var should be ignored, and factor
302 // should be one.
303
304 // If split_type is Fuse, then this does the opposite of a
305 // split: it joins the outer and inner into the old_var.
306 SplitType split_type;
307
308 bool is_rename() const {
309 return split_type == RenameVar;
310 }
311 bool is_split() const {
312 return split_type == SplitVar;
313 }
314 bool is_fuse() const {
315 return split_type == FuseVars;
316 }
317 bool is_purify() const {
318 return split_type == PurifyRVar;
319 }
320};
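The four cases can be made concrete with a simplified mirror of the struct above (a sketch for illustration: `MiniSplit` and the `make_*` helpers are invented names, and `Expr factor` is replaced by a plain `int`):

```cpp
#include <string>

// Simplified stand-in for Internal::Split showing how the four
// SplitType cases are distinguished.
struct MiniSplit {
    std::string old_var, outer, inner;
    int factor;
    enum SplitType { SplitVar = 0, RenameVar, FuseVars, PurifyRVar } split_type;

    bool is_split() const { return split_type == SplitVar; }
    bool is_rename() const { return split_type == RenameVar; }
    bool is_fuse() const { return split_type == FuseVars; }
    bool is_purify() const { return split_type == PurifyRVar; }
};

// A split records the old var, both new vars, and the factor.
inline MiniSplit make_split(const std::string &v, const std::string &o,
                            const std::string &i, int f) {
    return {v, o, i, f, MiniSplit::SplitVar};
}

// A rename keeps its position in the same list as splits (so ordering is
// respected) but carries factor == 1 and an ignored inner var.
inline MiniSplit make_rename(const std::string &v, const std::string &o) {
    return {v, o, "", 1, MiniSplit::RenameVar};
}
```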
321
322/** Each Dim below has a dim_type, which tells you what
323 * transformations are legal on it. When you combine two Dims of
324 * distinct DimTypes (e.g. with Stage::fuse), the combined result has
325 * the greater enum value of the two types. */
326enum class DimType {
327 /** This dim originated from a Var. You can evaluate a Func at
328 * distinct values of this Var in any order over an interval
329 * that's at least as large as the interval required. In pure
330 * definitions you can even redundantly re-evaluate points. */
331 PureVar = 0,
332
333 /** The dim originated from an RVar. You can evaluate a Func at
334 * distinct values of this RVar in any order (including in
335 * parallel) over exactly the interval specified in the
336 * RDom. PureRVars can also be reordered arbitrarily in the dims
337 * list, as there are no data hazards between the evaluation of
338 * the Func at distinct values of the RVar.
339 *
340 * The most common case where RVars are considered pure is when
341 * they are used in a way that obeys all the syntactic
342 * constraints that a Var does, e.g.:
343 *
344 \code
345 RDom r(0, 100);
346 f(r.x) = f(r.x) + 5;
347 \endcode
348 *
349 * Other cases where RVars are pure are where the sites being
350 * written to by the Func evaluated at one value of the RVar
351 * couldn't possibly collide with the sites being written or read
352 * by the Func at a distinct value of the RVar. For example, r.x
353 * is pure in the following three definitions:
354 *
355 \code
356
357 // This definition writes to even coordinates and reads from the
358 // same site (which no other value of r.x is writing to) and odd
359 // sites (which no other value of r.x is writing to):
360 f(2*r.x) = max(f(2*r.x), f(2*r.x + 7));
361
362 // This definition writes to scanline zero and reads from the
363 // same site and scanline one:
364 f(r.x, 0) += f(r.x, 1);
365
366 // This definition reads and writes over non-overlapping ranges:
367 f(r.x + 100) += f(r.x);
368 \endcode
369 *
370 * To give two counterexamples, r.x is not pure in the following
371 * definitions:
372 *
373 \code
374 // The same site is written by distinct values of the RVar
375 // (write-after-write hazard):
376 f(r.x / 2) += f(r.x);
377
378 // One value of r.x reads from a site that another value of r.x
379 // is writing to (read-after-write hazard):
380 f(r.x) += f(r.x + 1);
381 \endcode
382 */
383 PureRVar,
384
385 /** The dim originated from an RVar. You must evaluate a Func at
386 * distinct values of this RVar in increasing order over precisely
387 * the interval specified in the RDom. ImpureRVars may not be
388 * reordered with respect to other ImpureRVars.
389 *
390 * All RVars are impure by default. Those for which we can prove
391 * no data hazards exist get promoted to PureRVar. There are two
392 * instances in which ImpureRVars may be parallelized or reordered
393 * even in the presence of hazards:
394 *
395 * 1) In the case of an update definition that has been proven to be
396 * an associative and commutative reduction, reordering of
397 * ImpureRVars is allowed, and parallelizing them is allowed if
398 * the update has been made atomic.
399 *
400 * 2) ImpureRVars can also be reordered and parallelized if
401 * Func::allow_race_conditions() has been set. This is the escape
402 * hatch for when there are no hazards but the checks above failed
403 * to prove that (RDom::where can encode arbitrary facts about
404 * non-linear integer arithmetic, which is undecidable), or for
405 * when you don't actually care about the non-determinism
406 * introduced by data hazards (e.g. in the algorithm HOGWILD!).
407 */
408 ImpureRVar,
409};
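The hazard counterexamples above can be checked numerically. This plain-C++ sketch (illustrative; `shift_sum` is an invented helper, not Halide code) evaluates the update `f(r.x) += f(r.x + 1)` in forward and reverse order and shows the read-after-write hazard makes the result order-dependent, which is why such an RVar stays impure:

```cpp
#include <numeric>
#include <vector>

// Evaluate f(r.x) += f(r.x + 1) over r.x in [0, n) in a given order,
// starting from f(x) = x. The read-after-write hazard makes the result
// depend on the traversal order.
inline std::vector<int> shift_sum(int n, bool reversed) {
    std::vector<int> f(n + 1);
    std::iota(f.begin(), f.end(), 0);  // f(x) = x initially
    for (int i = 0; i < n; i++) {
        int rx = reversed ? n - 1 - i : i;
        f[rx] += f[rx + 1];
    }
    return f;
}
```

For n = 3, forward order yields f(0) = 1 but reverse order yields f(0) = 6: distinct values of r.x read sites that other values of r.x write, so reordering changes the algorithm's meaning.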
410
411/** The Dim struct represents one loop in the schedule's
412 * representation of a loop nest. */
413struct Dim {
414 /** Name of the loop variable */
415 std::string var;
416
417 /** How are the loop values traversed (e.g. unrolled, vectorized, parallel) */
418 ForType for_type;
419
420 /** On what device does the body of the loop execute (e.g. Host, GPU, Hexagon) */
421 DeviceAPI device_api;
422
423 /** The DimType tells us what transformations are legal on this
424 * loop (see the DimType enum above). */
425 DimType dim_type;
426
427 /** Can this loop be evaluated in any order (including in
428 * parallel)? Equivalently, are there no data hazards between
429 * evaluations of the Func at distinct values of this var? */
430 bool is_pure() const {
431 return (dim_type == DimType::PureVar) || (dim_type == DimType::PureRVar);
432 }
433
434 /** Did this loop originate from an RVar (in which case the bounds
435 * of the loops are algorithmically meaningful)? */
436 bool is_rvar() const {
437 return (dim_type == DimType::PureRVar) || (dim_type == DimType::ImpureRVar);
438 }
439
440 /** Could multiple iterations of this loop happen at the same
441 * time, with reads and writes interleaved in arbitrary ways
442 * according to the memory model of the underlying compiler and
443 * machine? */
444 bool is_unordered_parallel() const {
445 return Halide::Internal::is_unordered_parallel(for_type);
446 }
447
448 /** Could multiple iterations of this loop happen at the same
449 * time? Vectorized and GPULanes loop types are parallel but not
450 * unordered, because the loop iterations proceed together in
451 * lockstep with some well-defined outcome if there are hazards. */
452 bool is_parallel() const {
453 return Halide::Internal::is_parallel(for_type);
454 }
455};
456
457/** A bound on a loop, typically from Func::bound */
458struct Bound {
459 /** The loop var being bounded */
460 std::string var;
461
462 /** Declared min and extent of the loop. min may be undefined if
463 * Func::bound_extent was used. */
464 Expr min, extent;
465
466 /** If defined, the number of iterations will be a multiple of
467 * "modulus", and the first iteration will be at a value congruent
468 * to "remainder" modulo "modulus". Set by Func::align_bounds and
469 * Func::align_extent. */
470 Expr modulus, remainder;
471};
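The modulus/remainder contract can be sketched arithmetically. The helpers below are an assumption (one plausible realization for plain ints, not Halide's exact lowering): widen a bound so its min is congruent to `remainder` mod `modulus` and its extent is a multiple of `modulus`:

```cpp
// Largest value <= min that is congruent to `remainder` mod `modulus`.
// The double-% keeps the shift non-negative for negative mins.
inline int aligned_min(int min, int modulus, int remainder) {
    int shift = ((min - remainder) % modulus + modulus) % modulus;
    return min - shift;
}

// Round the extent up to the next multiple of `modulus`.
inline int aligned_extent(int extent, int modulus) {
    return ((extent + modulus - 1) / modulus) * modulus;
}
```

For example, aligning a min of 10 to remainder 1 mod 4 moves it down to 9, and an extent of 10 rounds up to 12.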
472
473/** Properties of one axis of the storage of a Func */
474struct StorageDim {
475 /** The var in the pure definition corresponding to this axis */
476 std::string var;
477
478 /** The bounds allocated (not computed) must be a multiple of
479 * "alignment". Set by Func::align_storage. */
480 Expr alignment;
481
482 /** The bounds allocated (not computed). Set by Func::bound_storage. */
483 Expr bound;
484
485 /** If the Func is explicitly folded along this axis (with
486 * Func::fold_storage) this gives the extent of the circular
487 * buffer used, and whether it is used in increasing order
488 * (fold_forward = true) or decreasing order (fold_forward =
489 * false). */
490 Expr fold_factor;
491 bool fold_forward;
492};
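Storage folding amounts to circular-buffer indexing. A minimal sketch (illustrative, not Halide's generated code; `FoldedRows` is an invented name and y is assumed non-negative): with fold factor F, only F rows of the producer are live at once, so row y lives at slot y % F:

```cpp
#include <vector>

// Circular row buffer: row y is stored at slot y % fold_factor. Valid
// only when the consumer never needs more than fold_factor rows of the
// producer at a time (which is what Func::fold_storage requires).
struct FoldedRows {
    int fold_factor;
    std::vector<std::vector<int>> rows;  // rows[y % fold_factor]
    FoldedRows(int f, int width)
        : fold_factor(f), rows(f, std::vector<int>(width, 0)) {}
    std::vector<int> &row(int y) { return rows[y % fold_factor]; }
};
```

With fold_factor 3, writing row 4 reuses the storage slot of row 1, so row 1 must no longer be needed by then.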
493
494/** This represents two stages with fused loop nests from outermost to
495 * a specific loop level. The loops to compute func_1(stage_1) are
496 * fused with the loops to compute func_2(stage_2) from outermost to
497 * loop level var_name and the computation from stage_1 of func_1
498 * occurs first.
499 */
500struct FusedPair {
501 std::string func_1;
502 std::string func_2;
503 size_t stage_1;
504 size_t stage_2;
505 std::string var_name;
506
507 FusedPair() = default;
508 FusedPair(const std::string &f1, size_t s1, const std::string &f2,
509 size_t s2, const std::string &var)
510 : func_1(f1), func_2(f2), stage_1(s1), stage_2(s2), var_name(var) {
511 }
512
513 bool operator==(const FusedPair &other) const {
514 return (func_1 == other.func_1) && (func_2 == other.func_2) &&
515 (stage_1 == other.stage_1) && (stage_2 == other.stage_2) &&
516 (var_name == other.var_name);
517 }
518 bool operator<(const FusedPair &other) const {
519 if (func_1 != other.func_1) {
520 return func_1 < other.func_1;
521 }
522 if (func_2 != other.func_2) {
523 return func_2 < other.func_2;
524 }
525 if (var_name != other.var_name) {
526 return var_name < other.var_name;
527 }
528 if (stage_1 != other.stage_1) {
529 return stage_1 < other.stage_1;
530 }
531 return stage_2 < other.stage_2;
532 }
533};
534
535struct FuncScheduleContents;
536struct StageScheduleContents;
537struct FunctionContents;
538
539/** A schedule for a Function of a Halide pipeline. This schedule is
540 * applied to all stages of the Function. Right now this interface is
541 * basically a struct, offering mutable access to its innards.
542 * In the future it may become more encapsulated. */
543class FuncSchedule {
544 IntrusivePtr<FuncScheduleContents> contents;
545
546public:
547 FuncSchedule(IntrusivePtr<FuncScheduleContents> c)
548 : contents(std::move(c)) {
549 }
550 FuncSchedule(const FuncSchedule &other) = default;
551 FuncSchedule();
552
553 /** Return a deep copy of this FuncSchedule. It recursively deep copies all
554 * called functions, schedules, specializations, and reduction domains. This
555 * method takes a map of <old FunctionContents, deep-copied version> as input
556 * and uses the deep-copied FunctionContents from the map if one exists,
557 * rather than creating a new deep copy, to avoid deep-copying the
558 * same FunctionContents multiple times.
559 */
560 FuncSchedule deep_copy(
561 std::map<FunctionPtr, FunctionPtr> &copied_map) const;
562
563 /** This flag is set to true if the schedule is memoized. */
564 // @{
565 bool &memoized();
566 bool memoized() const;
567 // @}
568
569 /** This flag is set to true if the schedule is memoized and has an attached
570 * eviction key. */
571 // @{
572 Expr &memoize_eviction_key();
573 Expr memoize_eviction_key() const;
574 // @}
575
576 /** Is the production of this Function done asynchronously? */
577 bool &async();
578 bool async() const;
579
580 /** The list and order of dimensions used to store this
581 * function. The first dimension in the vector corresponds to the
582 * innermost dimension for storage (i.e. which dimension is
583 * tightly packed in memory) */
584 // @{
585 const std::vector<StorageDim> &storage_dims() const;
586 std::vector<StorageDim> &storage_dims();
587 // @}
588
589 /** The memory type (heap/stack/shared/etc) used to back this Func. */
590 // @{
591 MemoryType memory_type() const;
592 MemoryType &memory_type();
593 // @}
594
595 /** You may explicitly bound some of the dimensions of a function,
596 * or constrain them to lie on multiples of a given factor. See
597 * \ref Func::bound and \ref Func::align_bounds and \ref Func::align_extent. */
598 // @{
599 const std::vector<Bound> &bounds() const;
600 std::vector<Bound> &bounds();
601 // @}
602
603 /** You may explicitly specify an estimate of some of the function
604 * dimensions. See \ref Func::set_estimate */
605 // @{
606 const std::vector<Bound> &estimates() const;
607 std::vector<Bound> &estimates();
608 // @}
609
610 /** Mark calls of a function by 'f' to be replaced with its identity
611 * wrapper or clone during the lowering stage. If the string 'f' is empty,
612 * it means replace all calls to the function by all other functions
613 * (excluding itself) in the pipeline with the global identity wrapper.
614 * See \ref Func::in and \ref Func::clone_in for more details. */
615 // @{
616 const std::map<std::string, Internal::FunctionPtr> &wrappers() const;
617 std::map<std::string, Internal::FunctionPtr> &wrappers();
618 void add_wrapper(const std::string &f,
619 const Internal::FunctionPtr &wrapper);
620 // @}
621
622 /** At what sites should we inject the allocation and the
623 * computation of this function? The store_level must be outside
624 * of or equal to the compute_level. If the compute_level is
625 * inline, the store_level is meaningless. See \ref Func::store_at
626 * and \ref Func::compute_at */
627 // @{
628 const LoopLevel &store_level() const;
629 const LoopLevel &compute_level() const;
630 LoopLevel &store_level();
631 LoopLevel &compute_level();
632 // @}
633
634 /** Pass an IRVisitor through to all Exprs referenced in the
635 * Schedule. */
636 void accept(IRVisitor *) const;
637
638 /** Pass an IRMutator through to all Exprs referenced in the
639 * Schedule. */
640 void mutate(IRMutator *);
641};
642
643/** A schedule for a single stage of a Halide pipeline. Right now this
644 * interface is basically a struct, offering mutable access to its
645 * innards. In the future it may become more encapsulated. */
646class StageSchedule {
647 IntrusivePtr<StageScheduleContents> contents;
648
649public:
650 StageSchedule(IntrusivePtr<StageScheduleContents> c)
651 : contents(std::move(c)) {
652 }
653 StageSchedule(const StageSchedule &other) = default;
654 StageSchedule();
655
656 /** Return a copy of this StageSchedule. */
657 StageSchedule get_copy() const;
658
659 /** This flag is set to true if the dims list has been manipulated
660 * by the user (or if a ScheduleHandle was created that could have
661 * been used to manipulate it). It controls the warning that
662 * occurs if you schedule the vars of the pure step but not the
663 * update steps. */
664 // @{
665 bool &touched();
666 bool touched() const;
667 // @}
668
670 /** The RVars of the reduction domain associated with this schedule, if any. */
670 // @{
671 const std::vector<ReductionVariable> &rvars() const;
672 std::vector<ReductionVariable> &rvars();
673 // @}
674
675 /** The traversal of the domain of a function can have some of its
676 * dimensions split into sub-dimensions. See \ref Func::split */
677 // @{
678 const std::vector<Split> &splits() const;
679 std::vector<Split> &splits();
680 // @}
681
682 /** The list and ordering of dimensions used to evaluate this
683 * function, after all splits have taken place. The first
684 * dimension in the vector corresponds to the innermost for loop,
685 * and the last is the outermost. Also specifies what type of for
686 * loop to use for each dimension. Does not specify the bounds on
687 * each dimension. These get inferred from how the function is
688 * used, what the splits are, and any optional bounds in the list below. */
689 // @{
690 const std::vector<Dim> &dims() const;
691 std::vector<Dim> &dims();
692 // @}
693
694 /** You may perform prefetching in some of the dimensions of a
695 * function. See \ref Func::prefetch */
696 // @{
697 const std::vector<PrefetchDirective> &prefetches() const;
698 std::vector<PrefetchDirective> &prefetches();
699 // @}
700
701 /** Innermost loop level of fused loop nest for this function stage.
702 * Fusion runs from outermost to this loop level. The stages being fused
703 * should not have a producer/consumer relationship. See
704 * \ref Func::compute_with */
705 // @{
706 const FuseLoopLevel &fuse_level() const;
707 FuseLoopLevel &fuse_level();
708 // @}
709
710 /** List of function stages that are to be fused with this function stage
711 * from the outermost loop to a certain loop level. Those function stages
712 * are to be computed AFTER this function stage at the last fused loop level.
713 * This list is populated when realization_order() is called. See
714 * \ref Func::compute_with */
715 // @{
716 const std::vector<FusedPair> &fused_pairs() const;
717 std::vector<FusedPair> &fused_pairs();
718 // @}
719 /** Are race conditions permitted? */
720 // @{
721 bool allow_race_conditions() const;
722 bool &allow_race_conditions();
723 // @}
724
725 /** Use atomic update? */
726 // @{
727 bool atomic() const;
728 bool &atomic();
729 // @}
730
731 /** Atomic updates are only allowed on associative reductions.
732 * We try to prove the associativity, but the user can override
733 * the associativity test and suppress the compiler error if the prover
734 * fails to recognize the associativity or the user does not care. */
735 // @{
736 bool override_atomic_associativity_test() const;
737 bool &override_atomic_associativity_test();
738 // @}
739
740 /** Pass an IRVisitor through to all Exprs referenced in the
741 * Schedule. */
742 void accept(IRVisitor *) const;
743
744 /** Pass an IRMutator through to all Exprs referenced in the
745 * Schedule. */
746 void mutate(IRMutator *);
747};
748
749} // namespace Internal
750} // namespace Halide
751
752#endif
Defines DeviceAPI.
Base classes for Halide expressions (Halide::Expr) and statements (Halide::Internal::Stmt)
Defines the internal representation of parameters to halide piplines.
Defines the PrefetchDirective struct.
A halide function.
Definition: Func.h:703
A schedule for a Function of a Halide pipeline.
Definition: Schedule.h:543
Expr & memoize_eviction_key()
This flag is set to true if the schedule is memoized and has an attached eviction key.
void add_wrapper(const std::string &f, const Internal::FunctionPtr &wrapper)
const std::vector< Bound > & estimates() const
You may explicitly specify an estimate of some of the function dimensions.
const std::vector< Bound > & bounds() const
You may explicitly bound some of the dimensions of a function, or constrain them to lie on multiples ...
void mutate(IRMutator *)
Pass an IRMutator through to all Exprs referenced in the Schedule.
bool & async()
Is the production of this Function done asynchronously.
void accept(IRVisitor *) const
Pass an IRVisitor through to all Exprs referenced in the Schedule.
const LoopLevel & store_level() const
At what sites should we inject the allocation and the computation of this function?...
const std::map< std::string, Internal::FunctionPtr > & wrappers() const
Mark calls of a function by 'f' to be replaced with its identity wrapper or clone during the lowering...
MemoryType memory_type() const
The memory type (heap/stack/shared/etc) used to back this Func.
std::vector< StorageDim > & storage_dims()
FuncSchedule(const FuncSchedule &other)=default
FuncSchedule(IntrusivePtr< FuncScheduleContents > c)
Definition: Schedule.h:547
std::map< std::string, Internal::FunctionPtr > & wrappers()
FuncSchedule deep_copy(std::map< FunctionPtr, FunctionPtr > &copied_map) const
Return a deep copy of this FuncSchedule.
bool & memoized()
This flag is set to true if the schedule is memoized.
const LoopLevel & compute_level() const
std::vector< Bound > & estimates()
const std::vector< StorageDim > & storage_dims() const
The list and order of dimensions used to store this function.
std::vector< Bound > & bounds()
A reference-counted handle to Halide's internal representation of a function.
Definition: Function.h:38
A base class for passes over the IR which modify it (e.g.
Definition: IRMutator.h:26
A base class for algorithms that need to recursively walk over the IR.
Definition: IRVisitor.h:19
A schedule for a single stage of a Halide pipeline.
Definition: Schedule.h:646
StageSchedule(IntrusivePtr< StageScheduleContents > c)
Definition: Schedule.h:650
std::vector< PrefetchDirective > & prefetches()
std::vector< ReductionVariable > & rvars()
const std::vector< ReductionVariable > & rvars() const
RVars of reduction domain associated with this schedule if there is any.
StageSchedule get_copy() const
Return a copy of this StageSchedule.
bool & touched()
This flag is set to true if the dims list has been manipulated by the user (or if a ScheduleHandle wa...
std::vector< FusedPair > & fused_pairs()
const std::vector< FusedPair > & fused_pairs() const
List of function stages that are to be fused with this function stage from the outermost loop to a ce...
std::vector< Split > & splits()
bool allow_race_conditions() const
Are race conditions permitted?
const FuseLoopLevel & fuse_level() const
Innermost loop level of fused loop nest for this function stage.
bool atomic() const
Use atomic update?
const std::vector< Split > & splits() const
The traversal of the domain of a function can have some of its dimensions split into sub-dimensions.
std::vector< Dim > & dims()
const std::vector< Dim > & dims() const
The list and ordering of dimensions used to evaluate this function, after all splits have taken place...
void mutate(IRMutator *)
Pass an IRMutator through to all Exprs referenced in the Schedule.
void accept(IRVisitor *) const
Pass an IRVisitor through to all Exprs referenced in the Schedule.
const std::vector< PrefetchDirective > & prefetches() const
You may perform prefetching in some of the dimensions of a function.
bool override_atomic_associativity_test() const
Atomic updates are only allowed on associative reductions.
StageSchedule(const StageSchedule &other)=default
A reference to a site in a Halide statement at the top of the body of a particular for loop.
Definition: Schedule.h:176
VarOrRVar var() const
std::string to_string() const
static LoopLevel root()
Construct a special LoopLevel value which represents the location outside of all for loops.
LoopLevel(const Internal::Function &f, const VarOrRVar &v, int stage_index=-1)
Identify the loop nest corresponding to some dimension of some function.
static LoopLevel inlined()
Construct a special LoopLevel value that implies that a function should be inlined away.
LoopLevel(const Func &f, const VarOrRVar &v, int stage_index=-1)
bool operator==(const LoopLevel &other) const
std::string func() const
int stage_index() const
Return the index of the function stage associated with this loop level.
LoopLevel()
Construct an undefined LoopLevel.
void set(const LoopLevel &other)
Mutate our contents to match the contents of 'other'.
bool match(const LoopLevel &other) const
bool is_root() const
bool is_inlined() const
bool operator!=(const LoopLevel &other) const
Definition: Schedule.h:250
bool defined() const
bool match(const std::string &loop) const
LoopLevel & lock()
DimType
Each Dim below has a dim_type, which tells you what transformations are legal on it.
Definition: Schedule.h:326
@ ImpureRVar
The dim originated from an RVar.
@ PureRVar
The dim originated from an RVar.
@ PureVar
This dim originated from a Var.
ForType
An enum describing a type of loop traversal.
Definition: Expr.h:399
bool is_unordered_parallel(ForType for_type)
Check if for_type executes for loop iterations in parallel and unordered.
bool is_parallel(ForType for_type)
Returns true if for_type executes for loop iterations in parallel.
This file defines the class FunctionDAG, which is our representation of a Halide pipeline, and contains methods that use Halide's bounds tools to query properties of it.
@ Internal
Not visible externally, similar to 'static' linkage in C.
@ GuardWithIf
Guard the prefetch with if-guards that skip the prefetch if any of the prefetched region ever goes beyond the valid memory bounds.
TailStrategy
Different ways to handle a tail case in a split when the factor does not provably divide the extent.
Definition: Schedule.h:32
@ RoundUp
Round up the extent to be a multiple of the split factor.
@ Predicate
Guard the loads and stores in the loop with an if statement that prevents evaluation beyond the original extent.
@ PredicateStores
Guard the stores in the loop with an if statement that prevents evaluation beyond the original extent.
@ ShiftInwards
Prevent evaluation beyond the original extent by shifting the tail case inwards, re-evaluating some points near the end.
@ PredicateLoads
Guard the loads in the loop with an if statement that prevents evaluation beyond the original extent.
LoopAlignStrategy
Different ways to handle the case when the start/end of the loops of stages computed with (fused) are not aligned.
Definition: Schedule.h:110
@ NoAlign
compute_with will make no attempt to align the start/end of the fused loops.
@ AlignEnd
Shift the end of the fused loops to align.
@ AlignStart
Shift the start of the fused loops to align.
DeviceAPI
An enum describing a type of device API.
Definition: DeviceAPI.h:15
MemoryType
An enum describing different address spaces to be used with Func::store_in.
Definition: Expr.h:346
@ Auto
Let Halide select a storage type automatically.
A fragment of Halide syntax.
Definition: Expr.h:256
std::map< std::string, LoopAlignStrategy > align
Contains alignment strategies for the fused dimensions (indexed by the dimension name).
Definition: Schedule.h:266
FuseLoopLevel(const LoopLevel &level, const std::map< std::string, LoopAlignStrategy > &align)
Definition: Schedule.h:271
A bound on a loop, typically from Func::bound.
Definition: Schedule.h:458
Expr min
Declared min and extent of the loop.
Definition: Schedule.h:464
std::string var
The loop var being bounded.
Definition: Schedule.h:460
Expr modulus
If defined, the number of iterations will be a multiple of "modulus", and the first iteration will be at a value congruent to "remainder" modulo "modulus".
Definition: Schedule.h:470
The Dim struct represents one loop in the schedule's representation of a loop nest.
Definition: Schedule.h:413
std::string var
Name of the loop variable.
Definition: Schedule.h:415
DimType dim_type
The DimType tells us what transformations are legal on this loop (see the DimType enum above).
Definition: Schedule.h:425
bool is_rvar() const
Did this loop originate from an RVar (in which case the bounds of the loops are algorithmically meaningful)?
Definition: Schedule.h:436
ForType for_type
How are the loop values traversed (e.g. serially, in parallel, unrolled, vectorized)?
Definition: Schedule.h:418
DeviceAPI device_api
On what device does the body of the loop execute (e.g. the host CPU, a GPU, or a DSP)?
Definition: Schedule.h:421
bool is_parallel() const
Could multiple iterations of this loop happen at the same time? Vectorized and GPULanes loop types are parallel but not unordered, because their iterations proceed together in lockstep.
Definition: Schedule.h:452
bool is_unordered_parallel() const
Could multiple iterations of this loop happen at the same time, with reads and writes interleaved in arbitrary ways?
Definition: Schedule.h:444
bool is_pure() const
Can this loop be evaluated in any order (including in parallel)? Equivalently, are there no data hazards between evaluations of the Func at distinct values of this var?
Definition: Schedule.h:430
A possibly-weak pointer to a Halide function.
Definition: FunctionPtr.h:27
This represents two stages with fused loop nests from outermost to a specific loop level.
Definition: Schedule.h:500
bool operator==(const FusedPair &other) const
Definition: Schedule.h:513
FusedPair(const std::string &f1, size_t s1, const std::string &f2, size_t s2, const std::string &var)
Definition: Schedule.h:508
bool operator<(const FusedPair &other) const
Definition: Schedule.h:518
bool is_fuse() const
Definition: Schedule.h:314
bool is_rename() const
Definition: Schedule.h:308
bool is_split() const
Definition: Schedule.h:311
TailStrategy tail
Definition: Schedule.h:287
std::string old_var
Definition: Schedule.h:282
bool is_purify() const
Definition: Schedule.h:317
Properties of one axis of the storage of a Func.
Definition: Schedule.h:474
std::string var
The var in the pure definition corresponding to this axis.
Definition: Schedule.h:476
Expr alignment
The bounds allocated (not computed) must be a multiple of "alignment".
Definition: Schedule.h:480
Expr bound
The bounds allocated (not computed).
Definition: Schedule.h:483
Expr fold_factor
If the Func is explicitly folded along this axis (with Func::fold_storage) this gives the extent of the circular buffer used.
Definition: Schedule.h:490
A class that can represent Vars or RVars.
Definition: Func.h:30