.. default-domain:: chpl

.. module:: ChapelIteratorSupport
   :synopsis: Data parallel constructs (such as ``forall`` loops) are implicitly

Vectorizing Iterator
====================

Data parallel constructs (such as ``forall`` loops) are implicitly
vectorizable. If the ``--vectorize`` compiler flag is thrown (implied by
``--fast``), the Chapel compiler will emit vectorization hints to the backend
compiler, though the effects will vary based on the target compiler.

In order to allow users to explicitly request vectorization, this prototype
vectorizing iterator is being provided. Loops that invoke this iterator will
be marked with vectorization hints, provided the ``--vectorize`` flag is
thrown.

This iterator is currently available for all Chapel programs and does not
require a ``use`` statement to make it available. In future releases it will
be moved to a standard module and will likely require a ``use`` statement to
make it available.


.. iterfunction:: iter vectorizeOnly(iterables ...)

   
   Vectorize only "wrapper" iterator:
   
   This iterator wraps and vectorizes other iterators. It takes one or more
   iterables (an iterator or class/record with a these() iterator) and yields
   the same elements as the wrapped iterables.
   
   This iterator exists to provide a way to vectorize data parallel loops
   without invoking a parallel iterator with the goal of avoiding task
   creation for loops with small trip counts or where task creation isn't
   desirable.
   
   Data parallel operations in Chapel such as forall loops are
   order-independent. However, a forall is implemented in terms of either
   leader/follower or standalone iterators which typically create tasks.
   This iterator exists to allow vectorization of order-independent loops
   without requiring task creation. By using this wrapper iterator you are
   asserting that the loop is order-independent (and thus a candidate for
   vectorization) just as you are when using a forall loop.
   
   When invoked from a serial for loop, this iterator will simply mark your
   iterator(s) as order-independent. When invoked from a parallel forall loop
   this iterator will implicitly be order-independent because of the
   semantics of a forall, and additionally it will invoke the serial
   iterator instead of the parallel iterators. For instance:
   
   .. code-block:: chapel
   
       forall i in vectorizeOnly(1..10) do;
       for    i in vectorizeOnly(1..10) do;
   
   will both effectively generate:
   
   .. code-block:: c
   
       CHPL_PRAGMA_IVDEP
       for (i=0; i<=10; i+=1) {}
   
   The ``vectorizeOnly`` iterator  automatically handles zippering, so the
   ``zip`` keyword is not needed. For instance, to vectorize:
   
   .. code-block:: chapel
   
       for (i, j) in zip(1..10, 1..10) do;
   
   simply write:
   
   .. code-block:: chapel
   
       for (i, j) in vectorizeOnly(1..10, 1..10) do;
   
   Note that the use of ``zip`` is not explicitly prevented, but all
   iterators being zipped must be wrapped by a ``vectorizeOnly`` iterator.
   Future releases may explicitly prevent the use ``zip`` with this iterator.
   

