| \input texinfo |
| @c % |
| @c % /**-----------------------------------------------------------------** |
| @c % ** CLooG ** |
| @c % **-----------------------------------------------------------------** |
| @c % ** cloog.texi ** |
| @c % **-----------------------------------------------------------------** |
| @c % ** First version: july 6th 2002 ** |
| @c % **-----------------------------------------------------------------**/ |
| @c % |
| @c % release 1.0: September 17th 2002 |
| @c % release 1.1: December 5th 2002 |
| @c % release 1.2: April 22th 2003 |
| @c % release 2.0: November 21th 2005 (and now in texinfo instead of LaTeX) |
| @c % release 2.1: October 15th 2007 |
| @c % |
| @c %/************************************************************************** |
| @c % * CLooG : the Chunky Loop Generator (experimental) * |
| @c % **************************************************************************/ |
| @c %/* CAUTION: the English used is probably the worst you ever read, please |
| @c % * feel free to correct and improve it ! |
| @c % */ |
| |
| @c %\textit{"I found the ultimate transformation functions, optimization for |
| @c %static control programs is now a closed problem, I have \textnormal{just} |
| @c %to generate the target code !"} |
| |
| |
| |
| @c % /************************************************************************* |
| @c % * PART I: HEADER * |
| @c % *************************************************************************/ |
| @c %**start of header |
| @setfilename cloog.info |
| @settitle CLooG - a loop generator for scanning polyhedra |
| |
| @set EDITION 2.1 |
| @include gitversion.texi |
| @set UPDATED October 15th 2007 |
| @setchapternewpage odd |
| |
| @c %**end of header |
| |
| @c % /************************************************************************* |
| @c % * PART II: SUMMARY DESCRIPTION AND COPYRIGHT * |
| @c % *************************************************************************/ |
| |
| @copying |
| This manual is for CLooG version @value{VERSION}, a software |
| which generates loops for scanning Z-polyhedra. That is, CLooG produces a |
| code visiting each integral point of a union of parametrized |
| polyhedra. CLooG is designed to avoid control overhead and to produce a very |
| efficient code. |
| |
| It would be quite kind to refer the following paper in any publication that |
| results from the use of the CLooG software or its library: |
| |
| @example |
| @@InProceedings@{Bas04, |
| @ @ author =@ @ @ @ @{C. Bastoul@}, |
| @ @ title =@ @ @ @ @ @{Code Generation in the Polyhedral Model |
| @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ Is Easier Than You Think@}, |
| @ @ booktitle = @{PACT'13 IEEE International Conference on |
| @ @ @ @ @ @ @ @ @ @ @ @ @ @ @ Parallel Architecture and Compilation Techniques@}, |
| @ @ year =@ @ @ @ @ @ 2004, |
| @ @ pages =@ @ @ @ @ @{7--16@}, |
| @ @ month =@ @ @ @ @ @{september@}, |
| @ @ address =@ @ @ @{Juan-les-Pins@} |
| @} |
| @end example |
| |
| Copyright @copyright{} 2002-2005 C@'edric Bastoul. |
| |
| @c quotation |
| Permission is granted to copy, distribute and/or modify this document under |
| the terms of the GNU Free Documentation License, Version 1.2 |
| published by the Free Software Foundation. To receive a copy of the |
| GNU Free Documentation License, write to the Free Software Foundation, Inc., |
| 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA. |
| @c end quotation |
| @end copying |
| |
| @c % /************************************************************************* |
| @c % * PART III: TITLEPAGE, CONTENTS, COPYRIGHT * |
| @c % *************************************************************************/ |
| @titlepage |
| @title CLooG |
| @subtitle A Loop Generator For Scanning Polyhedra |
| @subtitle Edition @value{EDITION}, for CLooG @value{VERSION} |
| @subtitle @value{UPDATED} |
| @author C@'edric Bastoul |
| |
| @c The following two commands start the copyright page. |
| @page |
| @noindent (September 2001) |
| @table @code |
| @item C@'edric Bastoul |
| SCHEDULES GENERATE !!! I just need to apply them now, where can I find |
| a good code generator ?! |
| |
| @item Paul Feautrier |
| Hmmm. I fear that if you want something powerful enough, you'll have to |
| write it yourself ! |
| @end table |
| |
| @vskip 0pt plus 1filll |
| @insertcopying |
| @end titlepage |
| |
| @c Output the table of contents at the beginning. |
| @contents |
| |
| @c % /************************************************************************* |
| @c % * PART IV: TOP NODE AND MASTER MENU * |
| @c % *************************************************************************/ |
| @ifnottex |
| @node Top |
| @top CLooG |
| |
| @insertcopying |
| @end ifnottex |
| |
| @menu |
| * Introduction:: |
| * CLooG Software:: |
| * CLooG Library:: |
| @c * Hacking:: |
| * Installing:: |
| * Documentation:: |
| * References:: |
| @end menu |
| |
| |
| |
| @c % /************************************************************************* |
| @c % * PART V: BODY OF THE DOCUMENT * |
| @c % *************************************************************************/ |
| |
| @c % ****************************** INTRODUCTION ****************************** |
| @node Introduction |
| @chapter Introduction |
| CLooG is a free software and library generating loops for scanning Z-polyhedra. |
| That is, it finds a code (e.g. in C, FORTRAN...) that reaches each integral |
| point of one or more parameterized polyhedra. CLooG has been originally |
| written to solve the code generation problem for optimizing compilers based on |
| the polytope model. Nevertheless it is used now in various area, e.g., to build |
| control automata for high-level synthesis or to find the best polynomial |
| approximation of a function. CLooG may help in any situation where scanning |
| polyhedra matters. It uses the best state-of-the-art code generation |
| algorithm known as the Quiller@'e et al. algorithm (@pxref{Qui00}) |
| with our own improvements and extensions (@pxref{Bas04}). |
| The user has full control on generated code quality. |
| On one hand, generated code size has to be tuned for sake of |
| readability or instruction cache use. On the other hand, we must ensure that |
| a bad control management does not hamper performance of the generated code, |
| for instance by producing redundant guards or complex loop bounds. |
| CLooG is specially designed to avoid control overhead and to produce a very |
| efficient code. |
| |
| CLooG stands for @emph{Chunky Loop Generator}: it is a part of the Chunky |
| project, a research tool for data locality improvement (@pxref{Bas03a}). |
| It is designed |
| also to be the back-end of automatic parallelizers like LooPo (@pxref{Gri04}). |
| Thus it is very |
| compilable code oriented and provides powerful program transformation |
| facilities. Mainly, it allows the user to specify very general schedules where, |
| e.g., unimodularity or invertibility of the transformation doesn't matter. |
| |
| The current version is still under |
| evaluation, and there is no guarantee that the upward compatibility |
| will be respected (but the previous API has been stable for two years, |
| we hope this one will be as successful -and we believe it-). |
| A lot of reports are necessary to freeze the library |
| API and the input file shape. Most API changes from 0.12.x to 0.14.x |
| have been requested by the users themselves. |
| Thus you are very welcome and encouraged |
| to post reports on bugs, wishes, critics, comments, suggestions or |
| successful experiences in the forum of @code{http://www.CLooG.org} |
| or to send them to cedric.bastoul@@inria.fr directly. |
| |
| @menu |
| * Basics:: |
| * Scattering:: |
| @end menu |
| |
| @node Basics |
| @section Basically, what's the point ? |
| If you want to use CLooG, this is because you want to scan or to find |
| something inside the integral points of a set of polyhedra. There are many |
| reasons for that. Maybe you need the generated code itself because it |
| actually implements a very smart program transformation you found. |
| Maybe you want to use the generated code |
| because you know that the solution of your problem belongs to the integral |
| points of those damned polyhedra and you don't know which one. Maybe you just |
| want to know if a polyhedron has integral points depending on some parameters, |
| which is the lexicographic minimum, maximum, the third on the basis of the |
| left etc. Probably you have your own reasons to use CLooG. |
| |
| Let us illustrate a basic use of CLooG. Suppose we have a set of affine |
| constraints that describes a part of a whatever-dimensional space, |
| called a @strong{domain}, and we |
| want to scan it. Let us consider for instance the following set of constraints |
| where @samp{i} |
| and @samp{j} are the unknown (the two dimensions of the space) and |
| @samp{m} and @samp{n} are the parameters (some symbolic constants): |
| @example |
| @group |
| 2<=i<=n |
| 2<=j<=m |
| j<=n+2-i |
| @end group |
| @end example |
| Let us also consider that we have a partial knowledge of the parameter values, |
| called the @strong{context}, expressed as affine constraints as well, |
| for instance: |
| @example |
| @group |
| m>=2 |
| n>=2 |
| @end group |
| @end example |
| Note that using parameters is optional, if you are not comfortable with |
| parameter manipulation, just replace them with any scalar value that fits |
| @code{m>=2} and @code{n>=2}. |
| A graphical representation of this part of the 2-dimensional space, where |
| the integral points are represented using heavy dots would be for instance: |
| |
| @image{images/basic,6cm} |
| |
| The affine constraints of both the domain and the context are what we will |
| provide to CLooG as input (in a particular shape that will be described later). |
| The output of CLooG is a pseudo-code to scan the integral points of the |
| input domain according to the context: |
| @example |
| @group |
| for (i=2;i<=n;i++) @{ |
| for (j=2;j<=min(m,-i+n+2);j++) @{ |
| S1(i,j) ; |
| @} |
| @} |
| @end group |
| @end example |
| If you felt such a basic example is yet interesting, there is a good chance |
| that CLooG is appropriate for you. CLooG can do much more: scanning several |
| polyhedra or unions of polyhedra at the same time, applying general affine |
| transformations to the polyhedra, generate compilable code etc. Welcome |
| to the CLooG's user's guide ! |
| |
| @node Scattering |
| @section Defining a Scanning Order: Scattering Functions |
| In CLooG, domains only define the set of integral points to scan and their |
| coordinates. In particular, CLooG is free to choose the scanning order for |
| generating the most efficient code. This means, for optimizing/parallelizing |
| compiler people, that CLooG doesn't make any speculation on dependences on and |
| between statements (by the way, it's not its job !). |
| For instance, if an user give to |
| CLooG only two domains @code{S1:1<=i<=n}, @code{S2:1<=i<=n} and the context |
| @code{n>=1}, the following pseudo-codes are considered to be equivalent: |
| |
| @example |
| @group |
| /* A convenient target pseudo-code. */ |
| for (i=1;i<=N;i++) @{ |
| S1(i) ; |
| @} |
| for (i=1;i<=N;i++) @{ |
| S2(i) ; |
| @} |
| @end group |
| @end example |
| |
| @example |
| @group |
| /* Another convenient target pseudo-code. */ |
| for (i=1;i<=N;i++) @{ |
| S1(i) ; |
| S2(i) ; |
| @} |
| @end group |
| @end example |
| |
| The default behaviour |
| of CLooG is to generate the second one, since it is optimized in control. |
| It is right if there are no data dependences |
| between @code{S1} and @code{S2}, but wrong otherwise. |
| |
| Thus it is often useful to force scanning to respect a given order. This can be |
| done in CLooG by using @strong{scattering functions}. Scattering is a |
| shortcut for scheduling, allocation, chunking functions and the like we can |
| find in the restructuring compilation literature. There are a lot of reasons |
| to scatter the integral points of the domains (i.e. the statement instances |
| of a program, for compilation people), parallelization or optimization are good |
| examples. For instance, if the user wants for any reason to set some |
| precedence constraints between the statements of our example above |
| in order to force the generation of the |
| first code, he can do it easily by setting (for example) the following |
| scheduling functions: |
| |
| @tex |
| $$\theta _{S1}(i) = (1)$$ |
| $$\theta _{S2}(j) = (2)$$ |
| @end tex |
| |
| @ifnottex |
| @example |
| @group |
| T_S1(i) = (1) |
| T_S2(j) = (2) |
| @end group |
| @end example |
| @end ifnottex |
| |
| This scattering means that each integral point of the domain @code{S1} |
| is scanned at logical date @code{1} while each integral point of the domain |
| @code{S2} is scanned at logical date @code{2}. As a result, the whole |
| domain @code{S1} is scanned before domain @code{S2} and the first code in our |
| example is generated. |
| |
| The user can set every kind of affine scanning order thanks to the |
| scattering functions. Each domain has its own scattering function and |
| each scattering function may be multi-dimensional. A multi-dimensional logical |
| date may be seen as classical date (year,month,day,hour,minute,etc.) where |
| the first dimensions are the most significant. Each scattering dimension |
| may depend linearly on the original dimensions (e.g., @code{i}), the |
| parameters (e.g., @code{n}) ans scalars (e.g., @code{2}). |
| |
| A very useful example of multi-dimensional scattering functions is, for |
| compilation people, the scheduling of the original program. |
| The basic data to use for code generation are statement iteration domains. |
| As we saw, these data are not sufficient to rebuild the original |
| program (what is the ordering between instances of different statements ?). |
| The missing data can be put in the scattering functions as the original |
| scheduling. The method to compute it is quite simple (@pxref{Fea92}). The idea is to |
| build an abstract syntax tree of the program and to read the scheduling for |
| each statement. For instance, let us consider the following implementation of |
| a Cholesky factorization: |
| |
| @example |
| @group |
| /* A Cholesky factorization kernel. */ |
| for (i=1;i<=N;i++) @{ |
| for (j=1;j<=i-1;j++) @{ |
| a[i][i] -= a[i][j] ; /* S1 */ |
| @} |
| a[i][i] = sqrt(a[i][i]) ; /* S2 */ |
| for (j=i+1;j<=N;j++) @{ |
| for (k=1;k<=i-1;k++) @{ |
| a[j][i] -= a[j][k]*a[i][k] ; /* S3 */ |
| @} |
| a[j][i] /= a[i][i] ; /* S4 */ |
| @} |
| @} |
| @} |
| @end group |
| @end example |
| |
| The corresponding abstract syntax tree is given in the following figure. |
| It directly gives the scattering functions (schedules) for all the |
| statements of the program. |
| |
| @image{images/tree,6cm} |
| |
| @tex |
| $$ |
| \hbox{$ \cases{ \theta _{S1}(i,j)^T &$= (0,i,0,j,0)^T$\cr |
| \theta _{S2}(i) &$= (0,i,1)^T$\cr |
| \theta _{S3}(i,j,k)^T &$= (0,i,2,j,0,k,0)^T$\cr |
| \theta _{S4}(i,j)^T &$= (0,i,2,j,1)^T$}$} |
| $$ |
| @end tex |
| |
| @ifnottex |
| @example |
| @group |
| T_S1(i,j)^T = (0,i,0,j,0)^T |
| T_S2(i) = (0,i,1)^T |
| T_S3(i,j,k)^T = (0,i,2,j,0,k,0)^T |
| T_S4(i,j)^T = (0,i,2,j,1)^T |
| @end group |
| @end example |
| @end ifnottex |
| |
| These schedules depend on the iterators and give for each instance of each |
| statement a unique execution date. Using such scattering functions allow |
| CLooG to re-generate the input code. |
| |
| |
| |
| |
| |
| @c % ***********************Using the CLooG Software ************************** |
| @node CLooG Software |
| @chapter Using the CLooG Software |
| |
| |
| @menu |
| * A First Example:: |
| * Writing The Input File:: |
| * Calling CLooG:: |
| * CLooG Options:: |
| * Full Example:: |
| @end menu |
| |
| @c %/************************************************************************* |
| @c % * A FIRST EXAMPLE * |
| @c % *************************************************************************/ |
| @node A First Example |
| @section A First Example |
| CLooG takes as input a file that must be written accordingly to a grammar |
| described in depth in a further section (@pxref{Writing The Input File}). |
| Moreover it supports many options to tune the target code presentation or |
| quality as discussed in a dedicated section (@pxref{Calling CLooG}). |
| However, a basic use |
| of CLooG is not very complex and we present in this section how to generate the |
| code corresponding to a basic example discussed earlier (@pxref{Basics}). |
| |
| The problem is to find the code that scans a 2-dimensional polyhedron |
| where @samp{i} and @samp{j} are the unknown (the two dimensions of the space) |
| and @samp{m} and @samp{n} are the parameters (the symbolic constants), |
| defined by the following set of constraints: |
| @example |
| @group |
| 2<=i<=n |
| 2<=j<=m |
| j<=n+2-i |
| @end group |
| @end example |
| @noindent We also consider a partial knowledge of the parameter values, |
| expressed thanks to the following affine constraints: |
| @example |
| @group |
| m>=2 |
| n>=2 |
| @end group |
| @end example |
| |
| An input file that corresponds to this problem, and asks for a generated |
| code in C, may be the following. Note that we do not describe here precisely |
| the structure and the components of this file (@pxref{Writing The Input File} |
| for such information, if you feel it necessary): |
| |
| @example |
| # ---------------------- CONTEXT ---------------------- |
| c # language is C |
| |
| # Context (constraints on two parameters) |
| 2 4 # 2 lines and 4 columns |
| # eq/in m n 1 eq/in: 1 for inequality >=0, 0 for equality =0 |
| 1 1 0 -2 # 1*m + 0*n -2*1 >= 0, i.e. m>=2 |
| 1 0 1 -2 # 0*m + 1*n -2*1 >= 0, i.e. n>=2 |
| |
| 1 # We want to set manually the parameter names |
| m n # parameter names |
| |
| # --------------------- STATEMENTS -------------------- |
| 1 # Number of statements |
| |
| 1 # First statement: one domain |
| # First domain |
| 5 6 # 5 lines and 6 columns |
| # eq/in i j m n 1 |
| 1 1 0 0 0 -2 # i >= 2 |
| 1 -1 0 0 1 0 # i <= n |
| 1 0 1 0 0 -2 # j >= 2 |
| 1 0 -1 1 0 0 # j <= m |
| 1 -1 -1 0 1 2 # n+2-i>=j |
| 0 0 0 # for future options |
| |
| 1 # We want to set manually the iterator names |
| i j # iterator names |
| |
| # --------------------- SCATTERING -------------------- |
| 0 # No scattering functions |
| @end example |
| |
| This file may be called @samp{basic.cloog} |
| (this example is provided in the CLooG distribution as |
| @code{test/manual_basic.cloog}) and we can ask CLooG to process it |
| and to generate the code by a simple calling to CLooG with this file as input: |
| @samp{cloog basic.cloog}. By default, CLooG will print the generated code in |
| the standard output: |
| |
| @example |
| @group |
| /* Generated by CLooG v@value{VERSION} in 0.00s. */ |
| for (i=2;i<=n;i++) @{ |
| for (j=2;j<=min(m,-i+n+2);j++) @{ |
| S1(i,j) ; |
| @} |
| @} |
| @end group |
| @end example |
| |
| @c %/************************************************************************* |
| @c % * Input file * |
| @c % *************************************************************************/ |
| @node Writing The Input File |
| @section Writing The Input File |
| The input text file contains a problem description, i.e. the context, |
| the domains and the scattering functions. |
| Because CLooG is very 'compilable code generation oriented', we can associate |
| some additional informations to each domain. We call this association a |
| @emph{statement}. The set of all informations is |
| called a @emph{program}. The input file respects the grammar below |
| (terminals are preceded by "_"): |
| |
| @example |
| File ::= Program |
| Program ::= Context Statements Scattering |
| Context ::= Language Domain_union Naming |
| Statements ::= Nb_statements Statement_list Naming |
| Scatterings ::= Nb_functions Scattering_list Naming |
| Naming ::= Option Name_list |
| Name_list ::= _String Name_list | (void) |
| Statement_list ::= Statement Statement_list | (void) |
| Domain_list ::= _Domain Domain_list | (void) |
| Scattering_list ::= Domain_union Scattering_list | (void) |
| Statement ::= Iteration_domain 0 0 0 |
| Iteration_domain ::= Domain_union |
| Domain_union ::= Nb_domains Domain_list |
| Option ::= 0 | 1 |
| Language ::= c | f |
| Nb_statements ::= _Integer |
| Nb_domains ::= _Integer |
| Nb_functions ::= _Integer |
| @end example |
| |
| Note: if there is only one domain in a @samp{Domain_union}, |
| i.e., if @samp{Nb_domains} is 1, then this 1 may be omitted. |
| |
| @itemize @bullet |
| @item @samp{Context} represents the informations that are |
| shared by all the statements. It consists on |
| the language used (which can be @samp{c} for C or @samp{f} for FORTRAN 90) |
| and the global constraints on parameters. |
| These constraints are essential |
| since they give to CLooG the number of parameters. If there is no |
| parameter or no constraints on parameters, just give a constraint |
| always satisfied like @math{1 \geq 0}. @samp{Naming} sets the parameter |
| names. |
| If the naming option @samp{Option} is 1, parameter names will be read |
| on the next line. There must be exactly as many names as parameters. |
| If the naming option @samp{Option} is 0, parameter names are |
| automatically generated. The name of the first parameter will |
| be @samp{M}, and the name of the @math{(n+1)^{th}} parameter directly |
| follows the name of the @math{n^{th}} parameter in ASCII code. |
| It is the user responsibility to ensure that parameter names, |
| iterators and scattering dimension names are different. |
| @item @samp{Statements} represents the informations on the statements. |
| @samp{Nb_statements} is the number of statements in the program, |
| i.e. the number of @samp{Statement} items in the @samp{Statement_list}. |
| @samp{Statement} represents the informations on a given statement. |
| To each statement is associated a domain |
| (the statement iteration domain: @samp{Iteration_domain}) and three |
| zeroes that represents future options. |
| @samp{Naming} sets the iterator names. If the naming option |
| @samp{Option} is 1, the iterator names |
| will be read on the next line. There must be exactly as many names as |
| nesting level in the deepest iteration domain. If the naming option |
| @samp{Option} is 0, iterator names are automatically generated. |
| The iterator name of the outermost loop will be @samp{i}, and the |
| iterator name of the loop at level @math{n+1} directly follows the |
| iterator name of the loop at level @math{n} in ASCII code. |
| @item @samp{Scatterings} represents the informations on scattering functions. |
| @samp{Nb_functions} is the number of functions (it must be |
| equal to the number of statements or 0 if there is no scattering |
| function). The functions themselves are represented through |
| @samp{Scattering_list}. |
| @samp{Naming} sets the scattering dimension names. If the naming option |
| @samp{Option} is 1, the scattering dimension names will be read on the |
| next line. |
| There must be exactly as many names as scattering dimensions. If the |
| naming option @samp{Option} is 0, scattering dimension names are automatically |
| generated. The name of the @math{n^{th}} scattering dimension |
| will be @samp{cn}. |
| @end itemize |
| |
| @menu |
| * Domain Representation:: |
| * Scattering Representation:: |
| @end menu |
| |
| @node Domain Representation |
| @subsection Domain Representation |
| As shown by the grammar, the input file describes the various informations |
| thanks to characters, integers and domains. Each domain is defined by a set of |
| constraints in the PolyLib format (@pxref{Wil93}). They have the |
| following syntax: |
| @enumerate |
| @item some optional comment lines beginning with @samp{#}, |
| @item the row and column numbers, possibly followed by comments, |
| @item the constraint rows, each row corresponds to a constraint the |
| domain have to satisfy. Each row must be on a single line and is possibly |
| followed by comments. The constraint is an equality @math{p(x) = 0} if the |
| first element is 0, an inequality @math{p(x) \geq 0} if the first element |
| is 1. The next elements are the unknown coefficients, followed by |
| the parameter coefficients. The last element is the constant factor. |
| @end enumerate |
| For instance, assuming that @samp{i}, @samp{j} and @samp{k} are iterators and |
| @samp{m} and @samp{n} are parameters, the domain defined by the following |
| constraints : |
| |
| @tex |
| $$ |
| \hbox{$ \cases{ -i + m &$\geq 0$\cr |
| -j + n &$\geq 0$\cr |
| i + j - k &$\geq 0$}$} |
| $$ |
| @end tex |
| |
| @ifnottex |
| @example |
| @group |
| -i + m >= 0 |
| -j + n >= 0 |
| i + j - k >= 0 |
| @end group |
| @end example |
| @end ifnottex |
| |
| @noindent can be written in the input file as follows : |
| |
| @example |
| @group |
| # This is the domain |
| 3 7 # 3 lines and 7 columns |
| # eq/in i j k m n 1 |
| 1 -1 0 0 1 0 0 # -i + m >= 0 |
| 1 0 -1 0 0 1 0 # -j + n >= 0 |
| 1 1 1 -1 0 0 0 # i + j - k >= 0 |
| @end group |
| @end example |
| |
| Each iteration domain @samp{Iteration_domain} of a given statement |
| is a union of polyhedra |
| @samp{Domain_union}. A union is defined by its number of elements |
| @samp{Nb_domains} and the elements themselves @samp{Domain_list}. |
| For instance, let us consider the following pseudo-code: |
| |
| @example |
| @group |
| for (i=1;i<=n;i++) @{ |
| if ((i >= m) || (i <= 2*m)) |
| S1 ; |
| for (j=i+1;j<=m;j++) |
| S2 ; |
| @} |
| @end group |
| @end example |
| |
| @noindent The iteration domain of @samp{S1} can be divided into two |
| polyhedra and written in the input file as follows: |
| |
| @example |
| @group |
| 2 # Number of polyhedra in the union |
| # First domain |
| 3 5 # 3 lines and 5 columns |
| # eq/in i m n 1 |
| 1 1 0 0 -1 # i >= 1 |
| 1 -1 0 1 0 # i <= n |
| 1 1 -1 0 0 # i >= m |
| # Second domain |
| 3 5 # 3 lines and 5 columns |
| # eq/in i m n 1 |
| 1 1 0 0 -1 # i >= 1 |
| 1 -1 0 1 0 # i <= n |
| 1 -1 2 0 0 # i <= 2*m |
| @end group |
| @end example |
| |
| @node Scattering Representation |
| @subsection Scattering Function Representation |
| Scattering functions are depicted in the input file thanks a representation |
| very close to the domain one. |
| An integer gives the number of functions @samp{Nb_functions} and each function |
| is represented by a domain. Each line of the domain corresponds to an equality |
| defining a dimension of the function. Note that at present |
| (CLooG @value{VERSION}) |
| @strong{all functions must have the same scattering dimension number}. If a |
| user wants to set scattering functions with different dimensionality, he has |
| to complete the smaller one with zeroes to reach the maximum dimensionality. |
| For instance, let us consider the following code and |
| scheduling functions: |
| |
| @example |
| @group |
| for (i=1;i<=n;i++) @{ |
| if ((i >= m) || (i <= 2*m)) |
| S1 ; |
| for (j=i+1;j<=m;j++) |
| S2 ; |
| @} |
| @end group |
| @end example |
| |
| @tex |
| $$ |
| \hbox{$ \cases{ \theta _{S1}(i) &$= (i,0)^T$\cr |
| \theta _{S2}(i,j)^T &$= (n,i+j)^T$}$} |
| $$ |
| @end tex |
| |
| @ifnottex |
| @example |
| @group |
| T_S1(i) = (i,0)^T |
| T_S2(i,j)^T = (n,i+j)^T |
| @end group |
| @end example |
| @end ifnottex |
| |
| |
| @noindent This scheduling can be written in the input file as follows: |
| |
| @example |
| @group |
| 2 # Number of scattering functions |
| # First function |
| 2 7 # 2 lines and 7 columns |
| # eq/in c1 c2 i m n 1 |
| 0 1 0 -1 0 0 0 # c1 = i |
| 0 0 1 0 0 0 0 # c2 = 0 |
| # Second function |
| 2 8 # 2 lines and 8 columns |
| # eq/in c1 c2 i j m n 1 |
| 0 1 0 0 0 0 -1 0 # c1 = n |
| 0 0 1 -1 -1 0 0 0 # c2 = i+j |
| @end group |
| @end example |
| The complete input file for the user who wants to generate the code for this |
| example with the preceding scheduling would be |
| (this file is provided in the CLooG distribution |
| as @code{test/manual_scattering.cloog}: |
| |
| @example |
| # ---------------------- CONTEXT ---------------------- |
| c # language is C |
| |
| # Context (no constraints on two parameters) |
| 1 4 # 1 lines and 4 columns |
| # eq/in m n 1 |
| 1 0 0 0 # 0 >= 0, always true |
| |
| 1 # We want to set manually the parameter names |
| m n # parameter names |
| |
| # --------------------- STATEMENTS -------------------- |
| 2 # Number of statements |
| |
| 2 # First statement: two domains |
| # First domain |
| 3 5 # 3 lines and 5 columns |
| # eq/in i m n 1 |
| 1 1 0 0 -1 # i >= 1 |
| 1 -1 0 1 0 # i <= n |
| 1 1 -1 0 0 # i >= m |
| # Second domain |
| 3 5 # 3 lines and 5 columns |
| # eq/in i m n 1 |
| 1 1 0 0 -1 # i >= 1 |
| 1 -1 0 1 0 # i <= n |
| 1 -1 2 0 0 # i <= 2*m |
| 0 0 0 # for future options |
| |
| 1 # Second statement: one domain |
| 4 6 # 4 lines and 6 columns |
| # eq/in i j m n 1 |
| 1 1 0 0 0 -1 # i >= 1 |
| 1 -1 0 0 1 0 # i <= n |
| 1 -1 1 0 0 -1 # j >= i+1 |
| 1 0 -1 1 0 0 # j <= m |
| 0 0 0 # for future options |
| |
| 1 # We want to set manually the iterator names |
| i j # iterator names |
| |
| # --------------------- SCATTERING -------------------- |
| 2 # Scattering functions |
| # First function |
| 2 7 # 2 lines and 7 columns |
| # eq/in p1 p2 i m n 1 |
| 0 1 0 -1 0 0 0 # p1 = i |
| 0 0 1 0 0 0 0 # p2 = 0 |
| # Second function |
| 2 8 # 2 lines and 8 columns |
| # eq/in p1 p2 i j m n 1 |
| 0 1 0 0 0 0 -1 0 # p1 = n |
| 0 0 1 -1 -1 0 0 0 # p2 = i+j |
| |
| 1 # We want to set manually the scattering dimension names |
| p1 p2 # scattering dimension names |
| @end example |
| |
| |
| @c %/************************************************************************* |
| @c % * Calling CLooG * |
| @c % *************************************************************************/ |
| @node Calling CLooG |
| @section Calling CLooG |
| CLooG is called by the following command: |
| @example |
| cloog [ options | file ] |
| @end example |
| The default behavior of CLooG is to read the input informations from a file and |
| to print the generated code or pseudo-code on the standard output. |
| CLooG's behavior and the output code shape is under the user control thanks |
| to many options which are detailed a further section (@pxref{CLooG Options}). |
| @code{file} is the input file. @code{stdin} is a special value: when used, |
| input is standard input. For instance, we can call CLooG to treat the |
| input file @code{basic.cloog} with default options by typing: |
| @code{cloog basic.cloog} or @code{more basic.cloog | cloog stdin}. |
| |
| @c %/************************************************************************* |
| @c % * CLooG Options * |
| @c % *************************************************************************/ |
| @node CLooG Options |
| @section CLooG Options |
| |
| @menu |
| * Last Depth to Optimize Control:: |
| * First Depth to Optimize Control:: |
| * Simplify Convex Hull:: |
| * Once Time Loop Elimination:: |
| * Equality Spreading:: |
| * First Level for Spreading:: |
| * Statement Block:: |
| * Loop Strides:: |
| * Unrolling:: |
| * Compilable Code:: |
| * Output:: |
| * OpenScop:: |
| * Help:: |
| * Version :: |
| * Quiet :: |
| @end menu |
| |
| @node Last Depth to Optimize Control |
| @subsection Last Depth to Optimize Control @code{-l <depth>} |
| |
| @code{-l <depth>}: this option sets the last loop depth to be optimized in |
| control. The higher this depth, the less control overhead. |
| For instance, with some input file, a user can generate |
| different pseudo-codes with different @code{depth} values as shown below. |
| @example |
| @group |
| /* Generated using a given input file and @strong{option -l 1} */ |
| for (i=0;i<=M;i++) @{ |
| S1 ; |
| for (j=0;j<=N;j++) @{ |
| S2 ; |
| @} |
| for (j=0;j<=N;j++) @{ |
| S3 ; |
| @} |
| S4 ; |
| @} |
| @end group |
| @end example |
| @example |
| @group |
| /* Generated using the same input file but @strong{option -l 2} */ |
| for (i=0;i<=M;i++) @{ |
| S1 ; |
| for (j=0;j<=N;j++) @{ |
| S2 ; |
| S3 ; |
| @} |
| S4 ; |
| @} |
| @end group |
| @end example |
| In this example we can see that this option can change the operation |
| execution order between statements. Let us remind that CLooG does not |
| make any speculation on dependences between statements |
| (@pxref{Scattering}). Thus if nothing (i.e. scattering functions) |
| forbids this, CLooG considers the above codes to be equivalent. |
| If there is no scattering functions, the minimum value for @code{depth} |
| is 1 (in the case of 0, the user doesn't really need a loop generator !), |
| and the number of scattering dimensions otherwise (CLooG will warn the |
| user if he doesn't respect such constraint). |
| The maximum value for depth is -1 (infinity). |
| Default value is infinity. |
| |
| @node First Depth to Optimize Control |
| @subsection First Depth to Optimize Control @code{-f <depth>} |
| |
| @code{-f <depth>}: this option sets the first loop depth to be optimized |
| in control. The lower this depth, the less control overhead (and the longer |
| the generated code). For instance, with some input file, a user |
| can generate different pseudo-codes with different @code{depth} values |
| as shown below. |
| The minimum value for @code{depth} is 1, and the |
| maximum value is -1 (infinity). |
| Default value is 1. |
| @example |
| @group |
| /* Generated using a given input file and @strong{option -f 3} */ |
| for (i=1;i<=N;i++) @{ |
| for (j=1;j<=M;j++) @{ |
| S1 ; |
| if (j >= 10) @{ |
| S2 ; |
| @} |
| @} |
| @} |
| @end group |
| @end example |
| @example |
| @group |
| /* Generated using the same input file but @strong{option -f 2} */ |
| for (i=1;i<=N;i++) @{ |
| for (j=1;j<=9;j++) @{ |
| S1 ; |
| @} |
| for (j=10;j<=M;j++) @{ |
| S1 ; |
| S2 ; |
| @} |
| @} |
| @end group |
| @end example |
| |
| @node Simple Convex Hull |
| @subsection Simple Convex Hull @code{-sh <boolean>} |
| |
| @code{-sh <boolean>}: this option enables (@code{boolean=1}) |
| or forbids (@code{boolean=0}) the use of an overapproximation |
| of the convex hull that may be easier to compute |
| (especially in the isl backend) and that may result in |
| simpler bounds. |
| This option works only for generated code without |
| code duplication (it means, you have to tune @code{-f} and |
| @code{-l} options first to generate only a loop nest with internal |
| guards). For instance, with the input file @code{test/union.cloog}, a user |
| can generate different pseudo-codes as shown below. |
| Default value is 0. |
| @example |
| @group |
| /* Generated using test/union.cloog and @strong{option -f -1 -l 2 -override} */ |
| for (i=0;i<=11;i++) @{ |
| for (j=max(0,5*i-50);j<=min(15,5*i+10);j++) @{ |
| if ((i <= 10) && (j <= 10)) @{ |
| S1 ; |
| @} |
| if ((i >= 1) && (j >= 5)) @{ |
| S2 ; |
| @} |
| @} |
| @} |
| @end group |
| @end example |
| @example |
| @group |
| /* Generated using the same input file but @strong{option -sh 1 -f -1 -l 2 -override} */ |
| for (i=0;i<=11;i++) @{ |
| for (j=0;j<=15;j++) @{ |
| if ((i <= 10) && (j <= 10)) @{ |
| S1 ; |
| @} |
| if ((i >= 1) && (j >= 5)) @{ |
| S2 ; |
| @} |
| @} |
| @} |
| @end group |
| @end example |
| |
| @node Once Time Loop Elimination |
| @subsection Once Time Loop Elimination @code{-otl <boolean>} |
| |
| @code{-otl <boolean>}: this option allows (@code{boolean=1}) or |
| forbids (@code{boolean=0}) the simplification of loops running |
| once. Default value is 1. |
| @example |
| @group |
| /* Generated using a given input file and @strong{option -otl 0} */ |
| for (j=i+1;j<=i+1;j++) @{ |
| S1 ; |
| @} |
| @end group |
| @end example |
| @example |
| @group |
| /* Generated using the same input file but @strong{option -otl 1} */ |
| j = i+1 ; |
| S1 ; |
| @end group |
| @end example |
| |
| |
| @node Equality Spreading |
| @subsection Equality Spreading @code{-esp <boolean>} |
| |
| @code{-esp <boolean>}: this option allows (@code{boolean=1}) or |
| forbids (@code{boolean=0}) values spreading when there |
| are equalities. Default value is 1. |
| @example |
| @group |
| /* Generated using a given input file and @strong{option -esp 0} */ |
| i = M+2 ; |
| j = N ; |
| for (k=i;k<=j+M;k++) @{ |
| S1 ; |
| @} |
| @end group |
| @end example |
| @example |
| @group |
| /* Generated using the same input file but @strong{option -esp 1} */ |
| for (k=M+2;k<=N+M;k++) @{ |
| S1(i = M+2, j = N) ; |
| @} |
| @end group |
| @end example |
| |
| |
| @node First Level for Spreading |
| @subsection First Level for Spreading @code{-fsp <level>} |
| |
| @code{-fsp <level>}: it can be useful to set a |
| first level to begin equality spreading. Particularly when using |
| scattering functions, the user may want to see the scattering dimension |
| values instead of spreading or hiding them. If user has set a |
| spreading, @code{level} is |
| the first level to start it. Default value is 1. |
| @example |
| @group |
| /* Generated using a given input file and @strong{option -fsp 1} */ |
| for (j=0;j<=N+M;j++) @{ |
| S1(i = N) ; |
| @} |
| for (j=0;j<=N+M;j++) @{ |
| S1(i = M) ; |
| @} |
| @end group |
| @end example |
| @example |
| @group |
| /* Generated using the same input file but @strong{option -fsp 2} */ |
| c1 = N ; |
| for (j=0;j<=c1+M;j++) @{ |
| S1(i = c1) ; |
| @} |
| c1 = M ; |
| for (j=0;j<=N+c1;j++) @{ |
| S1(i = c1) ; |
| @} |
| @end group |
| @end example |
| |
| |
| @node Statement Block |
| @subsection Statement Block @code{-block <boolean>} |
| |
| @code{-block <boolean>}: this option allows (@code{boolean=1}) to |
| create a statement block for each new iterator, even if there is only |
| an equality. This can be useful in order to parse the generated |
| pseudo-code. When @code{boolean} is set to 0 or when the generation |
| language is FORTRAN, this feature is disabled. Default value is 0. |
| @example |
| @group |
| /* Generated using a given input file and @strong{option -block 0} */ |
| i = M+2 ; |
| j = N ; |
| S1 ; |
| @end group |
| @end example |
| @example |
| @group |
| /* Generated using the same input file but @strong{option -block 1} */ |
| @{ i = M+2 ; |
| @{ j = N ; |
| S1 ; |
| @} |
| @} |
| @end group |
| @end example |
| |
| |
| @node Loop Strides |
| @subsection Loop Strides @code{-strides <boolean>} |
| |
| @code{-strides <boolean>}: this options allows (@code{boolean=1}) to |
| handle non-unit strides for loop increments. This can remove a lot of |
| guards and make the generated code more efficient. Default value is 0. |
| @example |
| @group |
| /* Generated using a given input file and @strong{option -strides 0} */ |
| for (i=1;i<=n;i++) @{ |
| if (i%2 == 0) @{ |
| S1(j = i/2) ; |
| @} |
| if (i%4 == 0) @{ |
| S2(j = i/4) ; |
| @} |
| @} |
| @end group |
| @end example |
| @example |
| @group |
| /* Generated using the same input file but @strong{option -strides 1} */ |
| for (i=2;i<=n;i+=2) @{ |
| S1(j = i/2) ; |
| if (i%4 == 0) @{ |
| S2(j = i/4) ; |
| @} |
| @} |
| @end group |
| @end example |
| |
| |
| @node Unrolling |
| @subsection First Depth to Unroll @code{-first-unroll <depth>} |
| |
| @code{-first-unroll <depth>}: this option sets the first loop depth |
| to unroll. Note that a loop is only unrolled when it is supported |
| by the backend. In case of the isl backend, a loop is unrolled |
| if it has a lower bound that can only be incremented |
| a fixed (non-parametric) amount of times. |
| |
| |
| @node Compilable Code |
| @subsection Compilable Code @code{-compilable <value>} |
| |
| @code{-compilable <value>}: this options allows (@code{value} is not 0) |
| to generate a compilable code where all parameters have the integral value |
| @code{value}. This option creates a macro for each statement. Since |
| CLooG do not know anything about the statement sources, it fills the |
| macros with a basic increment that computes the total number of |
| scanned integral points. The user may change easily the macros according |
| to his own needs. This option is possible only if the generated code is |
| in C. Default value is 0. |
| @example |
| @group |
| /* Generated using a given input file and @strong{option -compilable 0} */ |
| for (i=0;i<=n;i++) @{ |
| for (j=0;j<=n;j++) @{ |
| S1 ; |
| S2 ; |
| @} |
| S3 ; |
| @} |
| @end group |
| @end example |
| @example |
| /* Generated using the same input file but @strong{option -compilable 10} */ |
| /* DON'T FORGET TO USE -lm OPTION TO COMPILE. */ |
| |
| /* Useful headers. */ |
| #include <stdio.h> |
| #include <stdlib.h> |
| #include <math.h> |
| |
| /* Parameter value. */ |
| #define PARVAL 10 |
| |
| /* Statement macros (please set). */ |
| #define S1(i,j) @{total++;@} |
| #define S2(i,j) @{total++;@} |
| #define S3(i) @{total++;@} |
| |
| int main() @{ |
| /* Original iterators. */ |
| int i, j ; |
| /* Parameters. */ |
| int n=PARVAL, total=0 ; |
| |
| for (i=0;i<=n;i++) @{ |
| for (j=0;j<=n;j++) @{ |
| S1(i,j) ; |
| S2(i,j) ; |
| @} |
| S3(i) ; |
| @} |
| |
| printf("Number of integral points: %d.\n",total) ; |
| return 0 ; |
| @} |
| @end example |
| |
| @node Callable Code |
| @subsection Callable Code @code{-callable <boolean>} |
| |
| @code{-callable <boolean>}: if @code{boolean=1}, then a @code{test} |
| function will be generated that has the parameters as arguments. |
| Similarly to the @code{-compilable} option, |
| a macro for each statement is generated. The generated definitions of |
| these macros are as used during the correctness testing, but they |
| can easily be changed by the user to suit her own needs. |
| This option is only available if the target language is C. |
| The default value is 0. |
| |
| @example |
| /* Generated from double.cloog with @strong{option -callable 0} */ |
| for (i=0;i<=M;i++) @{ |
| S1 ; |
| for (j=0;j<=N;j++) @{ |
| S2 ; |
| S3 ; |
| @} |
| S4 ; |
| @} |
| @end example |
| @example |
| /* Generated from double.cloog with @strong{option -callable 1} */ |
| extern void hash(int); |
| |
| /* Useful macros. */ |
| #define floord(n,d) (((n)<0) ? ((n)-(d)+1)/(d) : (n)/(d)) |
| #define ceild(n,d) (((n)<0) ? (n)/(d) : ((n)+(d)+1)/(d)) |
| #define max(x,y) ((x) > (y) ? (x) : (y)) |
| #define min(x,y) ((x) < (y) ? (x) : (y)) |
| |
| #define S1(i) @{ hash(1); hash(i); @} |
| #define S2(i,j) @{ hash(2); hash(i); hash(j); @} |
| #define S3(i,j) @{ hash(3); hash(i); hash(j); @} |
| #define S4(i) @{ hash(4); hash(i); @} |
| |
| void test(int M, int N) |
| @{ |
| /* Original iterators. */ |
| int i, j; |
| for (i=0;i<=M;i++) @{ |
| S1(i) ; |
| for (j=0;j<=N;j++) @{ |
| S2(i,j) ; |
| S3(i,j) ; |
| @} |
| S4(i) ; |
| @} |
| @} |
| @end example |
| |
| @node Output |
| @subsection Output @code{-o <output>} |
| |
| @code{-o <output>}: this option sets the output file. @code{stdout} is a |
| special value: when used, output is standard output. |
| Default value is @code{stdout}. |
| |
| @node OpenScop |
| @subsection OpenScop @code{-openscop} |
| |
| @code{-openscop}: this option states that the input file complies to |
| the OpenScop specification instead of the native file format |
| (@pxref{Bas11}). This option is available only if the OpenScop |
| support has been enabled at compile time (@pxref{Optional Features}). |
| |
| @node Help |
| @subsection Help @code{--help} or @code{-h} |
| |
| @code{--help} or @code{-h}: this option ask CLooG to print a short help. |
| |
| @node Version |
| @subsection Version @code{--version} or @code{-v} |
| |
| @code{--version} or @code{-v}: this option ask CLooG to print some version |
| informations. |
| |
| @node Quiet |
| @subsection Quiet @code{--quiet} or @code{-q} |
| |
| @code{--quiet} or @code{-q}: this option tells CLooG not to print |
| any informational messages. |
| |
| |
| @c %/************************************************************************* |
| @c % * A Full Example * |
| @c % *************************************************************************/ |
| @node Full Example |
| @section A Full Example |
| |
| Let us consider the allocation problem of a Gaussian elimination, i.e. we want |
| to distribute the various statement instances of the compute kernel onto |
| different processors. The original code is the following: |
| @example |
| @group |
| for (i=1;j<=N-1;i++) @{ |
| for (j=i+1;j<=N;j++) @{ |
| c[i][j] = a[j][i]/a[i][i] ; /* S1 */ |
| for (k=i+1;k<=N;k++) @{ |
| a[j][k] -= c[i][j]*a[i][k] ; /* S2 */ |
| @} |
| @} |
| @} |
| @end group |
| @end example |
| |
| @noindent The best affine allocation functions can be found by any good automatic |
| parallelizer like LooPo (@pxref{Gri04}): |
| |
| @tex |
| $$ |
| \hbox{$ \cases{ \theta _{S1}(i,j)^T &$= (i)$\cr |
| \theta _{S2}(i,j,k)^T &$= (k)$}$} |
| $$ |
| @end tex |
| |
| @ifnottex |
| @example |
| @group |
| T_S1(i,j)^T = (i) |
| T_S2(i,j,k)^T = (k) |
| @end group |
| @end example |
| @end ifnottex |
| |
| @noindent To ensure that on each processor, the set of statement instances is |
| executed according to the original ordering, we add as minor scattering |
| dimensions the original scheduling (@pxref{Scattering}): |
| |
| @tex |
| $$ |
| \hbox{$ \cases{ \theta _{S1}(i,j)^T &$= (i,0,i,0,j,0)^T$\cr |
| \theta _{S2}(i,j,k)^T &$= (k,0,i,0,j,1,k,0)^T$}$} |
| $$ |
| @end tex |
| |
| @ifnottex |
| @example |
| @group |
| T_S1(i,j)^T = (i,0,i,0,j,0)^T |
| T_S2(i,j,k)^T = (k,0,i,0,j,1,k,0)^T |
| @end group |
| @end example |
| @end ifnottex |
| |
| @noindent To ensure that the scattering functions have the same dimensionality, we |
| complete the first function with zeroes |
| (this is a CLooG @value{VERSION} and previous versions requirement, |
| it should be removed in a future version, don't worry it's absolutely legal !): |
| |
| @tex |
| $$ |
| \hbox{$ \cases{ \theta _{S1}(i,j)^T &$= (i,0,i,0,j,0,0,0)^T$\cr |
| \theta _{S2}(i,j,k)^T &$= (k,0,i,0,j,1,k,0)^T$}$} |
| $$ |
| @end tex |
| |
| @ifnottex |
| @example |
| @group |
| T_S1(i,j)^T = (i,0,i,0,j,0,0,0)^T |
| T_S2(i,j,k)^T = (k,0,i,0,j,1,k,0)^T |
| @end group |
| @end example |
| @end ifnottex |
| |
| @noindent The input file corresponding to this code generation problem |
| could be (this file is provided in the CLooG distribution |
| as @code{test/manual_gauss.cloog}: |
| |
| @example |
| # ---------------------- CONTEXT ---------------------- |
| c # language is C |
| |
| # Context (no constraints on one parameter) |
| 1 3 # 1 line and 3 columns |
| # eq/in n 1 |
| 1 0 0 # 0 >= 0, always true |
| |
| 1 # We want to set manually the parameter name |
| n # parameter name |
| |
| # --------------------- STATEMENTS -------------------- |
| 2 # Number of statements |
| |
| 1 # First statement: one domain |
| 4 5 # 4 lines and 3 columns |
| # eq/in i j n 1 |
| 1 1 0 0 -1 # i >= 1 |
| 1 -1 0 1 -1 # i <= n-1 |
| 1 -1 1 0 -1 # j >= i+1 |
| 1 0 -1 1 0 # j <= n |
| 0 0 0 # for future options |
| |
| 1 |
| # Second statement: one domain |
| 6 6 # 6 lines and 3 columns |
| # eq/in i j k n 1 |
| 1 1 0 0 0 -1 # i >= 1 |
| 1 -1 0 0 1 -1 # i <= n-1 |
| 1 -1 1 0 0 -1 # j >= i+1 |
| 1 0 -1 0 1 0 # j <= n |
| 1 -1 0 1 0 -1 # k >= i+1 |
| 1 0 0 -1 1 0 # k <= n |
| 0 0 0 # for future options |
| |
| 0 # We let CLooG set the iterator names |
| |
| # --------------------- SCATTERING -------------------- |
| 2 # Scattering functions |
| # First function |
| 8 13 # 3 lines and 3 columns |
| # eq/in p1 p2 p3 p4 p5 p6 p7 p8 i j n 1 |
| 0 1 0 0 0 0 0 0 0 -1 0 0 0 # p1 = i |
| 0 0 1 0 0 0 0 0 0 0 0 0 0 # p2 = 0 |
| 0 0 0 1 0 0 0 0 0 -1 0 0 0 # p3 = i |
| 0 0 0 0 1 0 0 0 0 0 0 0 0 # p4 = 0 |
| 0 0 0 0 0 1 0 0 0 0 -1 0 0 # p5 = j |
| 0 0 0 0 0 0 1 0 0 0 0 0 0 # p6 = 0 |
| 0 0 0 0 0 0 0 1 0 0 0 0 0 # p7 = 0 |
| 0 0 0 0 0 0 0 0 1 0 0 0 0 # p8 = 0 |
| # Second function |
| 8 14 # 3 lines and 3 columns |
| # eq/in p1 p2 p3 p4 p5 p6 p7 p8 i j k n 1 |
| 0 1 0 0 0 0 0 0 0 0 0 -1 0 0 # p1 = k |
| 0 0 1 0 0 0 0 0 0 0 0 0 0 0 # p2 = 0 |
| 0 0 0 1 0 0 0 0 0 -1 0 0 0 0 # p3 = i |
| 0 0 0 0 1 0 0 0 0 0 0 0 0 0 # p4 = 0 |
| 0 0 0 0 0 1 0 0 0 0 -1 0 0 0 # p5 = j |
| 0 0 0 0 0 0 1 0 0 0 0 0 0 -1 # p6 = 1 |
| 0 0 0 0 0 0 0 1 0 0 0 -1 0 0 # p7 = k |
| 0 0 0 0 0 0 0 0 1 0 0 0 0 0 # p8 = 0 |
| |
| 1 # We want to set manually the scattering dimension names |
| p1 p2 p3 p4 p5 p6 p7 p8 # scattering dimension names |
| @end example |
| |
| Calling CLooG, with for instance the command line |
| @code{cloog -fsp 2 gauss.cloog} for a better view |
| of the allocation (the processor number is given by @code{p1}), |
| will result on the following target code that actually implements |
| the transformation. A minor processing on the dimension @code{p1} |
| to implement, e.g., MPI calls, which is not shown here may |
| result in dramatic speedups ! |
| |
| @example |
| if (n >= 2) @{ |
| p1 = 1 ; |
| for (p5=2;p5<=n;p5++) @{ |
| S1(i = 1,j = p5) ; |
| @} |
| @} |
| for (p1=2;p1<=n-1;p1++) @{ |
| for (p3=1;p3<=p1-1;p3++) @{ |
| for (p5=p3+1;p5<=n;p5++) @{ |
| S2(i = p3,j = p5,k = p1) ; |
| @} |
| @} |
| for (p5=p1+1;p5<=n;p5++) @{ |
| S1(i = p1,j = p5) ; |
| @} |
| @} |
| if (n >= 2) @{ |
| p1 = n ; |
| for (p3=1;p3<=n-1;p3++) @{ |
| for (p5=p3+1;p5<=n;p5++) @{ |
| S2(i = p3,j = p5,k = n) ; |
| @} |
| @} |
| @} |
| @end example |
| |
| |
| @c %/************************************************************************* |
| @c % * A Full Example * |
| @c % *************************************************************************/ |
| @node CLooG Library |
| @chapter Using the CLooG Library |
| The CLooG Library was implemented to allow the user to call CLooG |
| directly from his programs, without file accesses or system calls. The |
| user only needs to link his programs with C libraries. The CLooG |
| library mainly provides one function (@code{cloog_clast_create_from_input}) |
| which takes as input the problem |
| description with some options, and returns the data structure corresponding |
| to the generated code (a @code{struct clast_stmt} structure) |
| which is more or less an abstract syntax tree. |
| The user can work with this data structure and/or use |
| our pretty printing function to write the final code in either C or FORTRAN. |
| Some other functions are provided for convenience reasons. |
| These functions as well as the data structures are described in this section. |
| |
| @menu |
| * CLooG Data Structures:: |
| * CLooG Output:: |
| * Retrieving version information:: |
| * Example of Library Utilization:: |
| @end menu |
| |
| |
| @node CLooG Data Structures |
| @section CLooG Data Structures Description |
| In this section, we describe the data structures used by the loop |
| generator to represent and to process a code generation problem. |
| |
| @menu |
| * CloogState:: |
| * CloogMatrix:: |
| * CloogDomain:: |
| * CloogScattering:: |
| * CloogUnionDomain:: |
| * CloogStatement:: |
| * CloogOptions:: |
| * CloogInput:: |
| @end menu |
| |
| |
| @node CloogState |
| @subsection CloogState |
| @example |
| @group |
| CloogState *cloog_state_malloc(void); |
| void cloog_state_free(CloogState *state); |
| @end group |
| @end example |
| |
| @noindent The @code{CloogState} structure is (implicitly) needed to perform |
| any CLooG operation. It should be created using @code{cloog_state_malloc} |
| before any other CLooG objects are created and destroyed using |
| @code{cloog_state_free} after all objects have been freed. |
| It is allowed to use more than one @code{CloogState} structure at |
| the same time, but an object created within the state of a one |
| @code{CloogState} structure is not allowed to interact with an object |
| created within the state of an other @code{CloogState} structure. |
| |
| |
| @node CloogMatrix |
| @subsection CloogMatrix |
| |
| @noindent The @code{CloogMatrix} structure is equivalent to the PolyLib |
| @code{Matrix} data structure (@pxref{Wil93}). This structure is devoted to |
| represent a set of constraints. |
| |
| @example |
| @group |
| struct cloogmatrix |
| @{ unsigned NbRows ; /* Number of rows. */ |
| unsigned NbColumns ; /* Number of columns. */ |
| cloog_int_t **p; /* Array of pointers to the matrix rows. */ |
| cloog_int_t *p_Init; /* Matrix rows contiguously in memory. */ |
| @}; |
| typedef struct cloogmatrix CloogMatrix; |
| |
| CloogMatrix *cloog_matrix_alloc(unsigned NbRows, unsigned NbColumns); |
| void cloog_matrix_print(FILE *foo, CloogMatrix *m); |
| void cloog_matrix_free(CloogMatrix *matrix); |
| @end group |
| @end example |
| |
| @noindent The whole matrix is stored in memory row after row at the |
| @code{p_Init} address. @code{p} is an array of pointers where |
| @code{p[i]} points to the first element of the @math{i^{th}} row. |
| @code{NbRows} and @code{NbColumns} are respectively the number of |
| rows and columns of the matrix. |
| Each row corresponds to a constraint. The first element of each row is an |
| equality/inequality tag. The |
| constraint is an equality @math{p(x) = 0} if the first element is 0, but it is |
| an inequality @math{p(x) \geq 0} if the first element is 1. |
| The next elements are the coefficients of the unknowns, |
| followed by the coefficients of the parameters, and finally the constant term. |
| For instance, the following three constraints: |
| |
| @tex |
| $$ |
| \hbox{$ \cases{ -i + m &$= 0$\cr |
| -j + n &$\geq 0$\cr |
| j + i - k &$\geq 0$}$} |
| $$ |
| @end tex |
| |
| @ifnottex |
| @example |
| @group |
| -i + m = 0 |
| -j + n >= 0 |
| i + j - k >= 0 |
| @end group |
| @end example |
| @end ifnottex |
| |
| @noindent would be represented by the following rows: |
| |
| @example |
| @group |
| # eq/in i j k m n cst |
| 0 0 -1 0 1 0 0 |
| 1 -1 0 0 0 1 0 |
| 1 1 1 -1 0 0 0 |
| @end group |
| @end example |
| |
| @noindent To be able to provide different precision version (CLooG |
| supports 32 bits, 64 bits and arbitrary precision through the GMP library), |
| the @code{cloog_int_t} type depends on the configuration options (it may be |
| @code{long int} for 32 bits version, @code{long long int} for 64 bits version, |
| and @code{mpz_t} for multiple precision version). |
| |
| @node CloogDomain |
| @subsection CloogDomain |
| @example |
| @group |
| CloogDomain *cloog_domain_union_read(CloogState *state, |
| FILE *input, int nb_parameters); |
| CloogDomain *cloog_domain_from_cloog_matrix(CloogState *state, |
| CloogMatrix *matrix, int nb_par); |
| void cloog_domain_free(CloogDomain *domain); |
| @end group |
| @end example |
| |
| @noindent @code{CloogDomain} is an opaque type representing a polyhedral |
| domain (a union of polyhedra). |
| A @code{CloogDomain} can be read |
| from a file using @code{cloog_domain_union_read} or |
| converted from a @code{CloogMatrix}. |
| The input format for @code{cloog_domain_union_read} |
| is that of @ref{Domain Representation}. |
| The function @code{cloog_domain_from_cloog_matrix} takes a @code{CloogState}, a |
| @code{CloogMatrix} and @code{int} as input and returns a pointer to a |
| @code{CloogDomain}. @code{matrix} describes the domain and @code{nb_par} is the |
| number of parameters in this domain. The input data structures are neither |
| modified nor freed. |
| The @code{CloogDomain} can be freed using @code{cloog_domain_free}. |
| There are also some backend dependent functions for creating |
| @code{CloogDomain}s. |
| |
| @menu |
| * CloogDomain/PolyLib:: |
| * CloogDomain/isl:: |
| @end menu |
| |
| @node CloogDomain/PolyLib |
| @subsubsection PolyLib |
| |
| @example |
| #include <cloog/polylib/cloog.h> |
| CloogDomain *cloog_domain_from_polylib_polyhedron(CloogState *state, |
| Polyhedron *, int nb_par); |
| @end example |
| @noindent |
| The function @code{cloog_domain_from_polylib_polyhedron} takes a PolyLib |
| @code{Polyhedron} as input and returns a pointer to a @code{CloogDomain}. |
| The @code{nb_par} parameter indicates the number of parameters |
| in the domain. The input data structure if neither modified nor freed. |
| |
| @node CloogDomain/isl |
| @subsubsection isl |
| |
| @example |
| #include <cloog/isl/cloog.h> |
| CloogDomain *cloog_domain_from_isl_set(struct isl_set *set); |
| __isl_give isl_set *isl_set_from_cloog_domain(CloogDomain *domain); |
| @end example |
| @noindent |
| The function @code{cloog_domain_from_isl_set} takes a |
| @code{struct isl_set} as input and returns a pointer to a @code{CloogDomain}. |
| The function consumes a reference to the given @code{struct isl_set}. |
| Similarly, @code{isl_set_from_cloog_domain} consumes a reference |
| to a @code{CloogDomain} and returns an @code{isl_set}. |
| |
| |
| @node CloogScattering |
| @subsection CloogScattering |
| @example |
| @group |
| CloogScattering *cloog_domain_read_scattering(CloogDomain *domain, |
| FILE *foo); |
| CloogScattering *cloog_scattering_from_cloog_matrix(CloogState *state, |
| CloogMatrix *matrix, int nb_scat, int nb_par); |
| void cloog_scattering_free(CloogScattering *); |
| @end group |
| @end example |
| |
| @noindent |
| The @code{CloogScattering} type represents a scattering function. |
| A @code{CloogScattering} for a given @code{CloogDomain} can be read |
| from a file using @code{cloog_scattering_read} or converted |
| from a @code{CloogMatrix} using @code{cloog_scattering_from_cloog_matrix}. |
| The function @code{cloog_scattering_from_cloog_matrix} takes a |
| @code{CloogState}, a @code{CloogMatrix} and two @code{int}s as input and |
| returns a |
| pointer to a @code{CloogScattering}. |
| @code{matrix} describes the scattering, while @code{nb_scat} and |
| @code{nb_par} are the number of scattering dimensions and |
| the number of parameters, respectively. The input data structures are |
| neither modified nor freed. |
| A @code{CloogScattering} can be freed using @code{cloog_scattering_free}. |
| There are also some backend dependent functions for creating |
| @code{CloogScattering}s. |
| |
| @menu |
| * CloogScattering/PolyLib:: |
| * CloogScattering/isl:: |
| @end menu |
| |
| @node CloogScattering/PolyLib |
| @subsubsection PolyLib |
| |
| @example |
| #include <cloog/polylib/cloog.h> |
| CloogScattering *cloog_scattering_from_polylib_polyhedron( |
| CloogState *state, Polyhedron *polyhedron, int nb_par); |
| @end example |
| @noindent |
| The function @code{cloog_scattering_from_polylib_polyhedron} takes a PolyLib |
| @code{Polyhedron} as input and returns a pointer to a @code{CloogScattering}. |
| The @code{nb_par} parameter indicates the number of parameters |
| in the domain. The input data structure if neither modified nor freed. |
| |
| @node CloogScattering/isl |
| @subsubsection isl |
| |
| @example |
| #include <cloog/isl/cloog.h> |
| CloogScattering *cloog_scattering_from_isl_map(struct isl_map *map); |
| @end example |
| @noindent |
| The function @code{cloog_scattering_from_isl_map} takes a |
| @code{struct isl_map} as input and returns a pointer to a @code{CloogScattering}. |
| The output dimensions of the @code{struct isl_map} correspond to the |
| scattering dimensions, while the input dimensions correspond to the |
| domain dimensions. |
| The function consumes a reference to the given @code{struct isl_map}. |
| |
| |
| @node CloogUnionDomain |
| @subsection CloogUnionDomain |
| @example |
| @group |
| enum cloog_dim_type @{ CLOOG_PARAM, CLOOG_ITER, CLOOG_SCAT @}; |
| |
| CloogUnionDomain *cloog_union_domain_alloc(int nb_par); |
| CloogUnionDomain *cloog_union_domain_add_domain(CloogUnionDomain *ud, |
| const char *name, CloogDomain *domain, |
| CloogScattering *scattering, void *usr); |
| CloogUnionDomain *cloog_union_domain_set_name(CloogUnionDomain *ud, |
| enum cloog_dim_type type, int index, const char *name); |
| void cloog_union_domain_free(CloogUnionDomain *ud); |
| @end group |
| @end example |
| |
| @noindent A @code{CloogUnionDomain} structure represents a union |
| of scattered named domains. A @code{CloogUnionDomain} is |
| initialized by a call to @code{cloog_union_domain_alloc}, |
| after which domains can be added using @code{cloog_union_domain_add_domain}. |
| |
| @code{cloog_union_domain_alloc} takes the number of parameters as input. |
| @code{cloog_union_domain_add_domain} takes a previously created |
| @code{CloogUnionDomain} as input along with an optional name, |
| a domain, an optional scattering function and a user pointer. |
| The name may be @code{NULL} and is duplicated if it is not. |
| If no name is specified, then the statements will be named according |
| to the order in which they were added. |
| @code{domain} and @code{scattering} are taken over |
| by the @code{CloogUnionDomain}. @code{scattering} may be @code{NULL}, |
| but it must be consistently @code{NULL} or not over all calls |
| to @code{cloog_union_domain_add_domain}. |
| @code{cloog_union_domain_set_name} can be used to set the names |
| of parameters, iterators and scattering dimensions. |
| The names of iterators and scattering dimensions can only be set |
| after all domains have been added. |
| |
| There is also a backend dependent function for creating |
| @code{CloogUnionDomain}s. |
| |
| @menu |
| * CloogUnionDomain/isl:: |
| @end menu |
| |
| @node CloogUnionDomain/isl |
| @subsubsection isl |
| |
| @example |
| #include <cloog/isl/cloog.h> |
| CloogUnionDomain *cloog_union_domain_from_isl_union_map( |
| __isl_take isl_union_map *umap); |
| CloogUnionDomain *cloog_union_domain_from_isl_set( |
| __isl_take isl_set *set); |
| @end example |
| @noindent |
| The function @code{cloog_union_domain_from_isl_union_map} takes a |
| @code{isl_union_map} as input and returns a pointer |
| to a @code{CloogUnionDomain}. |
| The input is a mapping from different |
| spaces (different tuple names and possibly different dimensions) |
| to a common space. The iteration domains are set to the domains |
| in each space. The statement names are set to the names of the |
| spaces. The parameter names of the result are set to those of |
| the input, but the iterator and scattering dimension names are |
| left unspecified. |
| The function consumes a reference to the given @code{isl_union_map}. The |
| function @code{cloog_union_domain_from_isl_set} is similar, but takes an |
| unscattered domain as input. It is not defined for an union_set, because the |
| order of iterations from two different isl_sets is undefined, if no scattering |
| is provided. |
| |
| |
| @node CloogStatement |
| @subsection CloogStatement |
| @example |
| @group |
| struct cloogstatement |
| @{ int number ; /* The statement unique number. */ |
| char *name; /* Name of the statement. */ |
| void * usr ; /* Pointer for user's convenience. */ |
| struct cloogstatement * next ;/* Next element of the linked list. */ |
| @} ; |
| typedef struct cloogstatement CloogStatement ; |
| |
| CloogStatement *cloog_statement_malloc(CloogState *state); |
| void cloog_statement_print(FILE *, CloogStatement *); |
| void cloog_statement_free(CloogStatement *); |
| @end group |
| @end example |
| |
| @noindent The @code{CloogStatement} structure represents a @code{NULL} |
| terminated linked |
| list of statements. In CLooG, a statement is only defined by its unique |
| number (@code{number}). The user can use the pointer @code{usr} for his |
| own convenience to link his own statement representation to the |
| corresponding @code{CloogStatement} structure. The whole management of the |
| @code{usr} pointer is under the responsibility of the user, in particular, |
| CLooG never tries to print, to allocate or to free a memory block pointed |
| by @code{usr}. |
| |
| |
| |
| @node CloogOptions |
| @subsection CloogOptions |
| @example |
| @group |
| struct cloogoptions |
| @{ int l; /* -l option. */ |
| int f; /* -f option. */ |
| int strides; /* -strides option. */ |
| int sh; /* -sh option. */ |
| int first_unroll; /* -first-unroll option. */ |
| int esp; /* -esp option. */ |
| int fsp; /* -fsp option. */ |
| int otl; /* -otl option. */ |
| int block; /* -block option. */ |
| int compilable; /* -compilable option. */ |
| int language; /* CLOOG_LANGUAGE_C or CLOOG_LANGUAGE_FORTRAN */ |
| int save_domains; /* Save unsimplified copy of domain. */ |
| @} ; |
| typedef struct cloogoptions CloogOptions ; |
| |
| CloogOptions *cloog_options_malloc(CloogState *state); |
| void cloog_options_print(FILE *foo, CloogOptions *options); |
| void cloog_options_free(CloogOptions *options); |
| @end group |
| @end example |
| |
| @noindent The @code{CloogOptions} structure contains all the possible options to |
| rule CLooG's behaviour (@pxref{Calling CLooG}). |
| As a reminder, the default values are: |
| @itemize @bullet |
| @item @math{l = -1} (optimize control until the innermost loops), |
| @item @math{f = 1} (optimize control from the outermost loops), |
| @item @math{strides = 0} (use only unit strides), |
| @item @math{sh = 0} (do not compute simple convex hulls), |
| @item @math{first\_unroll = -1} (do not perform unrolling), |
| @item @math{esp = 1} (spread complex equalities), |
| @item @math{fsp = 1} (start to spread from the first iterators), |
| @item @math{otl = 1} (simplify loops running only once). |
| @item @math{block = 0} (do not make statement blocks when not necessary). |
| @item @math{compilable = 0} (do not generate a compilable code). |
| @end itemize |
| |
| The @code{save_domains} option is only useful for users of the CLooG |
| library. This option defaults to 0, but when it is set, the @code{domain} |
| field of each @code{clast_user_stmt} will be set to the set of values for the |
| scattering dimensions for which this instance of the user statement is executed. |
| The @code{domain} field of each @code{clast_for} contains the set of values for |
| the scattering dimensions for which an instance of a user statement is executed |
| inside the @code{clast_for}. It is only available if the @code{clast_for} |
| enumerates a scattering dimension. |
| |
| @node CloogInput |
| @subsection CloogInput |
| @example |
| @group |
| CloogInput *cloog_input_read(FILE *file, CloogOptions *options); |
| CloogInput *cloog_input_alloc(CloogDomain *context, |
| CloogUnionDomain *ud); |
| void cloog_input_free(CloogInput *input); |
| |
| void cloog_input_dump_cloog(FILE *, CloogInput *, CloogOptions *); |
| @end group |
| @end example |
| |
| @noindent A @code{CloogInput} structure represents the input to CLooG. |
| It is essentially a @code{CloogUnionDomain} along with a context |
| @code{CloogDomain}. A @code{CloogInput} can be created from |
| a @code{CloogDomain} and a @code{CloogUnionDomains} using |
| @code{cloog_input_alloc}, or it can be read from a CLooG input |
| file using @code{cloog_input_read}. The latter also modifies |
| the @code{language} field of the @code{CloogOptions} structure. |
| The constructed @code{CloogInput} can be used as input |
| to a @code{cloog_clast_create_from_input} call. |
| |
| A @code{CloogInput} data structure and a @code{CloogOptions} contain |
| the same information as a .cloog file. This function dumps the .cloog |
| description of the given data structures into a file. |
| |
| @node Dump CLooG Input File Function |
| @subsection Dump CLooG Input File Function |
| @example |
| @end example |
| |
| @node CLooG Output |
| @section CLooG Output |
| |
| @noindent |
| Given a description of the input, |
| an AST corresponding to the @code{CloogInput} can be constructed |
| using @code{cloog_clast_create_from_input} and destroyed using |
| @code{free_clast_stmt}. |
| @example |
| struct clast_stmt *cloog_clast_create_from_input(CloogInput *input, |
| CloogOptions *options); |
| void free_clast_stmt(struct clast_stmt *s); |
| @end example |
| @noindent |
| @code{clast_stmt} represents a linked list of ``statements''. |
| @example |
| struct clast_stmt @{ |
| const struct clast_stmt_op *op; |
| struct clast_stmt *next; |
| @}; |
| @end example |
| @noindent |
| The entries in the list are not of type @code{clast_stmt} itself, |
| but of some larger type. The following statement types are defined |
| by CLooG. |
| |
| @example |
| struct clast_root @{ |
| struct clast_stmt stmt; |
| CloogNames * names; |
| @}; |
| struct clast_root *new_clast_root(CloogNames *names); |
| |
| struct clast_assignment @{ |
| struct clast_stmt stmt; |
| const char * LHS; |
| struct clast_expr * RHS; |
| @}; |
| struct clast_assignment *new_clast_assignment(const char *lhs, |
| struct clast_expr *rhs); |
| |
| struct clast_block @{ |
| struct clast_stmt stmt; |
| struct clast_stmt * body; |
| @}; |
| struct clast_block *new_clast_block(void); |
| |
| struct clast_user_stmt @{ |
| struct clast_stmt stmt; |
| CloogDomain * domain; |
| CloogStatement * statement; |
| struct clast_stmt * substitutions; |
| @}; |
| struct clast_user_stmt *new_clast_user_stmt(CloogDomain *domain, |
| CloogStatement *stmt, struct clast_stmt *subs); |
| |
| struct clast_for @{ |
| struct clast_stmt stmt; |
| CloogDomain * domain; |
| const char * iterator; |
| struct clast_expr * LB; |
| struct clast_expr * UB; |
| cloog_int_t stride; |
| struct clast_stmt * body; |
| @}; |
| struct clast_for *new_clast_for(CloogDomain *domain, const char *it, |
| struct clast_expr *LB, struct clast_expr *UB, |
| cloog_int_t stride); |
| |
| struct clast_guard @{ |
| struct clast_stmt stmt; |
| struct clast_stmt * then; |
| int n; |
| struct clast_equation eq[1]; |
| @}; |
| struct clast_guard *new_clast_guard(int n); |
| @end example |
| @noindent |
| The @code{clast_stmt} returned by @code{cloog_clast_create} |
| is a @code{clast_root}. |
| It contains a placeholder for all the variable names that appear |
| in the AST and a (list of) nested statement(s). |
| |
| @noindent |
| A @code{clast_assignment} assigns the value given by |
| the @code{clast_expr} @code{RHS} to a variable named @code{LHS}. |
| |
| @noindent |
| A @code{clast_block} groups a list of statements into one statement. |
| These statements are only generated if the @code{block} option is set, |
| @pxref{Statement Block} and @ref{CloogOptions}. |
| |
| @noindent |
| A @code{clast_user_stmt} represents a call to a statement specified |
| by the user, @pxref{CloogStatement}. |
| @code{substitutions} is a list of @code{clast_assignment} statements |
| assigning an expression in terms of the scattering dimensions to |
| each of the original iterators in the original order. |
| The @code{LHS}s of these assignments are left blank (@code{NULL}). |
| The @code{domain} is set to @code{NULL} if the @code{save_domains} option |
| is not set. Otherwise, it is set to the set |
| of values for the scattering dimensions |
| for which this instance of the user statement is executed. |
| Note that unless the @code{noscalars} option has been set, the |
| constant scattering dimensions may have been removed from this set. |
| |
| @noindent |
| A @code{clast_for} represents a for loop, iterating @code{body} for each |
| value of @code{iterator} between @code{LB} and @code{UB} in steps |
| of size @code{stride}. |
| The @code{domain} is set to @code{NULL} if the @code{save_domains} option is not |
| set. Otherwise, it is set to the set of values for the scattering dimensions |
| for which a user statement is executed inside this @code{clast_for}. Note that |
| unless the @code{noscalars} option has been set, the constant scattering |
| dimensions may have been removed from this set. |
| |
| @noindent |
| A @code{clast_guard} represents the guarded execution of the @code{then} |
| (list of) statement(s) by a conjunction of @code{n} (in)equalities. |
| Each (in)equality is represented by a @code{clast_equation}. |
| @example |
| struct clast_equation @{ |
| struct clast_expr * LHS; |
| struct clast_expr * RHS; |
| int sign; |
| @}; |
| @end example |
| @noindent |
| The condition expressed by a @code{clast_equation} is |
| @code{LHS <= RHS}, @code{LHS == RHS} or @code{LHS >= RHS} |
| depending on whether @code{sign} is less than zero, equal |
| to zero, or greater than zero. |
| |
| The dynamic type of a @code{clast_stmt} can be determined |
| using the macro @code{CLAST_STMT_IS_A(stmt,type)}, |
| where @code{stmt} is a pointer to a @code{clast_stmt} |
| and @code{type} is one of @code{stmt_root}, @code{stmt_ass}, |
| @code{stmt_user}, @code{stmt_block}, @code{stmt_for} or |
| @code{stmt_guard}. |
| Users are allowed to define their own statement types by |
| assigning the @code{op} field of the statements a pointer |
| to a @code{clast_stmt_op} structure. |
| @example |
| struct clast_stmt_op @{ |
| void (*free)(struct clast_stmt *); |
| @}; |
| @end example |
| @noindent |
| The @code{free} field of this structure should point |
| to a function that frees the user defined statement. |
| |
| @noindent |
| A @code{clast_expr} can be an identifier, a term, |
| a binary expression or a reduction. |
| @example |
| enum clast_expr_type @{ |
| clast_expr_name, |
| clast_expr_term, |
| clast_expr_bin, |
| clast_expr_red |
| @}; |
| struct clast_expr @{ |
| enum clast_expr_type type; |
| @}; |
| void free_clast_expr(struct clast_expr *e); |
| @end example |
| |
| @noindent |
| Identifiers are of subtype @code{clast_name}. |
| @example |
| struct clast_name @{ |
| struct clast_expr expr; |
| const char * name; |
| @}; |
| struct clast_name *new_clast_name(const char *name); |
| void free_clast_name(struct clast_name *t); |
| @end example |
| @noindent |
| The character string pointed to by @code{name} is |
| assumed to be part of the @code{CloogNames} structure |
| in the root of the clast as is therefore not copied. |
| |
| @noindent |
| Terms are of type @code{clast_term}. |
| @example |
| struct clast_term @{ |
| struct clast_expr expr; |
| cloog_int_t val; |
| struct clast_expr *var; |
| @}; |
| struct clast_term *new_clast_term(cloog_int_t c, struct clast_expr *v); |
| void free_clast_term(struct clast_term *t); |
| @end example |
| @noindent |
| If @code{var} is set to @code{NULL}, then the term represents |
| the integer value @code{val}. Otherwise, it represents |
| the term @code{val * var}. |
| @code{new_clast_term} simply copies the @code{v} pointer |
| without copying the underlying @code{clast_expr}. |
| @code{free_clast_term}, on the other hand, recursively frees |
| @code{var}. |
| |
| @noindent |
| Binary expressions are of type @code{clast_bin_type} and |
| represent either the floor of a division (fdiv), |
| the ceil of a division (cdiv), an exact division or |
| the remainder of an fdiv. |
| @example |
| enum clast_bin_type @{ clast_bin_fdiv, clast_bin_cdiv, |
| clast_bin_div, clast_bin_mod @}; |
| struct clast_binary @{ |
| struct clast_expr expr; |
| enum clast_bin_type type; |
| struct clast_expr* LHS; |
| cloog_int_t RHS; |
| @}; |
| struct clast_binary *new_clast_binary(enum clast_bin_type t, |
| struct clast_expr *lhs, cloog_int_t rhs); |
| void free_clast_binary(struct clast_binary *b); |
| @end example |
| |
| @noindent |
| Reductions are of type @code{clast_reduction} and |
| can represent either the sum, the minimum or the maximum |
| of its elements. |
| @example |
| enum clast_red_type @{ clast_red_sum, clast_red_min, clast_red_max @}; |
| struct clast_reduction @{ |
| struct clast_expr expr; |
| enum clast_red_type type; |
| int n; |
| struct clast_expr* elts[1]; |
| @}; |
| struct clast_reduction *new_clast_reduction(enum clast_red_type t, |
| int n); |
| void free_clast_reduction(struct clast_reduction *r); |
| @end example |
| |
| @node Retrieving version information |
| @section Retrieving version information |
| CLooG provides static and dynamic version checks to assist on |
| including a compatible version of the library. |
| A static version check at compile time can be achieved by |
| querying the version constants defined in @code{version.h}: |
| |
| @itemize @bullet |
| @item @code{CLOOG_VERSION_MAJOR} |
| @item @code{CLOOG_VERSION_MINOR} |
| @item @code{CLOOG_VERSION_REVISION} |
| @end itemize |
| |
| This way it is possible to ensure the included headers are of the |
| correct version. It is still possible that the installed CLooG |
| library version differs from the installed headers. |
| In order to avoid this, a dynamic version check is provided with |
| the functions: |
| |
| @example |
| @group |
| int cloog_version_major(void); |
| int cloog_version_minor(void); |
| int cloog_version_revision(void); |
| @end group |
| @end example |
| |
| By using both the static and the dynamic version check, it is possible |
| to match CLooG's header version with the library's version. |
| |
| @node Example of Library Utilization |
| @section Example of Library Utilization |
| Here is a basic example showing how it is possible to use the CLooG library, |
| assuming that a standard installation has been done. |
| The following C program reads a CLooG input file on the standard input, |
| then prints the solution on the standard output. |
| Options are preselected to the default values of the CLooG software. |
| This example is provided in the @code{example} directory of the |
| CLooG distribution. |
| @example |
| /* example.c */ |
| # include <stdio.h> |
| # include <cloog/cloog.h> |
| |
| int main() |
| @{ |
| CloogState *state; |
| CloogInput *input; |
| CloogOptions * options ; |
| struct clast_stmt *root; |
| |
| /* Setting options and reading program informations. */ |
| state = cloog_state_malloc(); |
| options = cloog_options_malloc(state); |
| input = cloog_input_read(stdin, options); |
| |
| /* Generating and printing the code. */ |
| root = cloog_clast_create_from_input(input, options); |
| clast_pprint(stdout, root, 0, options); |
| |
| cloog_clast_free(root); |
| cloog_options_free(options) ; |
| cloog_state_free(state); |
| return 0; |
| @} |
| @end example |
| |
| @noindent The compilation command could be: |
| @example |
| gcc example.c -lcloog -o example |
| @end example |
| @noindent A calling command with the input file test.cloog could be: |
| @example |
| more test.cloog | ./example |
| @end example |
| |
| |
| @c % ******************************** HACKING ********************************* |
| @c @node Hacking |
| @c @chapter Hacking CLooG |
| |
| @c @menu |
| @c * Program organization:: |
| @c * Special Options:: |
| @c * CLooG Coding Standards:: |
| @c @end menu |
| |
| @c @node Program organization |
| @c @section Program organization |
| |
| @c @node Special Options |
| @c @section Special Options |
| |
| @c @node CLooG Coding Standards |
| @c @section CLooG Coding Standards |
| |
| |
| @c % ****************************** INSTALLING ******************************** |
| @node Installing |
| @chapter Installing CLooG |
| |
| @menu |
| * License:: |
| * Requirements:: |
| * Basic Installation:: |
| * Optional Features:: |
| * Uninstallation:: |
| @end menu |
| |
| @node License |
| @section License |
| First of all, it would be very kind to refer the following paper in any |
| publication that result from the use of the CLooG software or its library, |
| @pxref{Bas04} (a bibtex entry is provided behind the title page of this |
| manual, along with copyright notice, and in the CLooG home |
| @code{http://www.CLooG.org}. |
| |
| This library is free software; you can redistribute it and/or |
| modify it under the terms of the GNU Lesser General Public |
| License as published by the Free Software Foundation; either |
| version 2.1 of the License, or (at your option) any later version. |
| This library is distributed in the hope that it will be useful, |
| but WITHOUT ANY WARRANTY; without even the implied warranty of |
| MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU |
| Lesser General Public License for more details. |
| @code{http://www.gnu.org/licenses/lgpl-2.1.html} |
| |
| Note, though, that if you link CLooG against a GPL library such |
| as the PolyLib backend, then the combination becomes GPL too. |
| In particular, a CLooG library based on the PolyLib backend |
| is GPL version 2 only. |
| Since the isl backend is LGPL, linking against it does not affect |
| the license of CLooG. |
| |
| |
| @node Requirements |
| @section Requirements |
| |
| CLooG can be used with one of two possible backends, |
| one using isl and one using PolyLib. |
| The isl library is included in the CLooG distribution, |
| while the PolyLib library needs to be obtained separately. |
| On the other hand, isl requires GMP, while PolyLib can be |
| compiled with or without the use of GMP. |
| The user therefore needs to install at least one of |
| PolyLib or GMP. |
| |
| @menu |
| * PolyLib:: |
| * GMP Library:: |
| @end menu |
| |
| |
| @node PolyLib |
| @subsection PolyLib (optional) |
| To successfully install CLooG with the PolyLib backend, |
| the user first needs to install PolyLib |
| version 5.22.1 or above (default 64 bits version is satisfying |
| as well as 32 bits or GMP multiple precision version). |
| Polylib can be downloaded freely |
| at @code{http://icps.u-strasbg.fr/PolyLib/} or |
| @code{http://www.irisa.fr/polylib/}. Once downloaded and unpacked |
| (e.g. using the @samp{tar -zxvf polylib-5.22.3.tar.gz} command), |
| the user can compile |
| it by typing the following commands on the PolyLib's root directory: |
| |
| @itemize @bullet |
| @item @code{./configure} |
| @item @code{make} |
| @item And as root: @code{make install} |
| @end itemize |
| |
| Alternatively, the latest development version can be obtained from the |
| git repository: |
| @itemize @bullet |
| @item @code{git clone git://repo.or.cz/polylib.git} |
| @item @code{cd polylib} |
| @item @code{./autogen.sh} |
| @item @code{./configure} |
| @item @code{make} |
| @item And as root: @code{make install} |
| @end itemize |
| |
| The PolyLib default installation is @code{/usr/local}. This directory may |
| not be inside your library path. To fix the problem, the user should set |
| @example |
| export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib |
| @end example |
| @noindent if your shell is, e.g., bash or |
| @example |
| setenv LD_LIBRARY_PATH $LD_LIBRARY_PATH:/usr/local/lib |
| @end example |
| @noindent if your shell is, e.g., tcsh. Add the line to your .bashrc or .tcshrc (or |
| whatever convenient file) to make this change permanent. Another solution |
| is to ask PolyLib to install in the standard path by using the prefix |
| option of the configure script: |
| @samp{./configure --prefix=/usr}. |
| |
| CLooG makes intensive calls to polyhedral operations, and PolyLib |
| functions do the job. Polylib is a free library written in C for the |
| manipulation of polyhedra. The library is operating on objects like |
| vectors, matrices, lattices, polyhedra, Z-polyhedra, unions of |
| polyhedra and a lot of other intermediary structures. It provides |
| functions for all the important operations on these structures. |
| |
| @node GMP Library |
| @subsection GMP Library (optional) |
| |
| To be able to deal with insanely large coefficient, the user will need to |
| install the GNU Multiple Precision Library (GMP for short) version 4.1.4 |
| or above. It can be freely downloaded from @code{http://www.swox.com/gmp}. |
| Note that the isl backend currently requires GMP. |
| The user can compile GMP by typing the following commands on the GMP root |
| directory: |
| |
| @itemize @bullet |
| @item @code{./configure} |
| @item @code{make} |
| @item And as root: @code{make install} |
| @end itemize |
| |
| The GMP default installation is @code{/usr/local}, the same method to |
| fix a library path problem applies as with PolyLib (@pxref{PolyLib}). |
| |
| If you want to use the PolyLib backend, then |
| PolyLib has to be built using the GMP library by specifying the option |
| @samp{--with-libgmp=PATH_TO_GMP} to the PolyLib configure script |
| (where @code{PATH_TO_GMP} is @code{/usr/local} if you did not change the GMP |
| installation directory). Then you have to set the convenient CLooG configure |
| script options to build the GMP version (@pxref{Optional Features}). |
| |
| |
| @node Basic Installation |
| @section CLooG Basic Installation |
| |
| Once downloaded and unpacked |
| (e.g. using the @samp{tar -zxvf cloog-@value{VERSION}.tar.gz} command), |
| you can compile CLooG by typing the following commands on the CLooG's root |
| directory: |
| |
| @itemize @bullet |
| @item @code{./configure} |
| @item @code{make} |
| @item And as root: @code{make install} |
| @end itemize |
| |
| Alternatively, the latest development version can be obtained from the |
| git repository: |
| @itemize @bullet |
| @item @code{git clone git://repo.or.cz/cloog.git} |
| @item @code{cd cloog} |
| @item @code{./get_submodules.sh} |
| @item @code{./autogen.sh} |
| @item @code{./configure} |
| @item @code{make} |
| @item And as root: @code{make install} |
| @end itemize |
| |
| Depending on which backend you want to use and where they |
| are located, you may need to pass some |
| options to the configure script, @pxref{Optional Features}. |
| |
| The program binaries and object files can be removed from the |
| source code directory by typing @code{make clean}. To also remove the |
| files that the @code{configure} script created (so you can compile the |
| package for a different kind of computer) type @code{make distclean}. |
| |
| Both the CLooG software and library have been successfully compiled |
| on the following systems: |
| @itemize @bullet |
| @item PC's under Linux, with the @code{gcc} compiler, |
| @item PC's under Windows (Cygwin), with the @code{gcc} compiler, |
| @item Sparc and UltraSparc Stations, with the @code{gcc} compiler. |
| @end itemize |
| |
| @node Optional Features |
| @section Optional Features |
| The @code{configure} shell script attempts to guess correct values for |
| various system-dependent variables and user options used during compilation. |
| It uses those values to create the @code{Makefile}. Various user options |
| are provided by the CLooG's configure script. They are summarized in the |
| following list and may be printed by typing @code{./configure --help} in the |
| CLooG top-level directory. |
| |
| @itemize @bullet |
| @item By default, the installation directory is @code{/usr/local}: |
| @code{make install} will install the package's files in |
| @code{/usr/local/bin}, @code{/usr/local/lib} and @code{/usr/local/include}. |
| The user can specify an installation prefix other than @code{/usr/local} by |
| giving @code{configure} the option @code{--prefix=PATH}. |
| |
| @item By default, the isl backend will use the version of isl |
| that is @code{bundled} together with CLooG. |
| Using the @code{--with-isl} option of @code{configure} |
| the user can specify that @code{no} isl, |
| a previously installed (@code{system}) isl or a @code{build} isl |
| should be used. |
| In the latter case, the user should also specify the build location |
| using @code{--with-isl-builddir=PATH}. |
| In case of an installed isl, |
| the installation location can be specified using the |
| @code{--with-isl-prefix=PATH} and |
| @code{--with-isl-exec-prefix=PATH} options of @code{configure}. |
| |
| @item By default, the PolyLib backend will use an installed |
| (@code{system}) PolyLib, if any. |
| The installation location can be specified using the |
| @code{--with-polylib-prefix=PATH} and |
| @code{--with-polylib-exec-prefix=PATH} options of @code{configure}. |
| Using the @code{--with-polylib} option of @code{configure} |
| the user can specify that @code{no} PolyLib or a @code{build} PolyLib |
| should be used. |
| In the latter case, the user should also specify the build location |
| using @code{--with-polylib-builddir=PATH}. |
| |
| @item By default, the PolyLib backend of CLooG is built |
| in 64bits version if such version of the |
| PolyLib is found by @code{configure}. If the only existing version of the |
| PolyLib is the 32bits or if the user give to @code{configure} the option |
| @code{--with-bits=32}, the 32bits version of CLooG will be compiled. In the |
| same way, the option @code{--with-bits=gmp} have to be used to build |
| the multiple precision version. |
| |
| @item By default, @code{configure} will look for the GMP library |
| (necessary to build the multiple precision version) in standard |
| locations. If necessary, the user can specify the GMP path by giving |
| @code{configure} the option @code{--with-gmp-prefix=PATH} and/or |
| @code{--with-gmp-exec-prefix=PATH}. |
| |
| @item By default, the OpenScop Library (osl) support is not enabled. |
| @c @code{configure} will use the bundled OpenScop Library (osl). |
| Using the @code{--with-osl} option of @code{configure} |
| the user can specify that @code{no} osl, |
| a previously installed (@code{system}) osl, a @code{bundled} osl, or a |
| @code{build} osl should be used. |
| In the latter case, the user should also specify the build location |
| using @code{--with-osl-builddir=PATH}. |
| In case of an installed osl, |
| the installation location can be specified using the |
| @code{--with-osl-prefix=PATH} and |
| @code{--with-osl-exec-prefix=PATH} options of @code{configure}. |
| @end itemize |
| |
| @node Uninstallation |
| @section Uninstallation |
| The user can easily remove the CLooG software and library from his system |
| by typing (as root if necessary) from the CLooG top-level directory |
| @code{make uninstall}. |
| |
| @c % **************************** DOCUMENTATION ****************************** |
| @node Documentation |
| @chapter Documentation |
| The CLooG distribution provides several documentation sources. First, the |
| source code itself is as documented as possible. The code comments use a |
| Doxygen-compatible presentation (something similar to what JavaDoc does for |
| JAVA). The user may install Doxygen |
| (see @code{http://www.stack.nl/~dimitri/doxygen}) to automatically |
| generate a technical documentation by typing @code{make doc} or |
| @code{doxygen ./autoconf/Doxyfile} at the CLooG top-level directory after |
| running the configure script (@pxref{Installing}). Doxygen will generate |
| documentation sources (in HTML, LaTeX and man) in the @code{doc/source} |
| directory of the CLooG distribution. |
| |
| The Texinfo sources of the present document are also provided in the @code{doc} |
| directory. You can build it in either DVI format (by typing |
| @code{texi2dvi cloog.texi}) or PDF format |
| (by typing @code{texi2pdf cloog.texi}) or HTML format |
| (by typing @code{makeinfo --html cloog.texi}, using @code{--no-split} |
| option to generate a single HTML file) or info format |
| (by typing @code{makeinfo cloog.texi}). |
| |
| @c % ****************************** REFERENCES ******************************** |
| @node References |
| @chapter References |
| |
| @itemize |
| @item |
| @anchor{Bas03a}[Bas03a] C. Bastoul, P. Feautrier. Improving data locality |
| by chunking. CC'12 International Conference on Compiler Construction, |
| LNCS 2622, pages 320-335, Warsaw, april 2003. |
| |
| @item |
| @anchor{Bas03b}[Bas03b] C. Bastoul. Efficient code generation for automatic |
| parallelization and optimization. ISPDC'03 IEEE International Symposium on |
| Parallel and Distributed Computing, pages 23-30, Ljubljana, october 2003. |
| |
| @item |
| @anchor{Bas04}[Bas04] C. Bastoul. Code Generation in the Polyhedral Model |
| Is Easier Than You Think. PACT'13 IEEE International Conference on Parallel |
| Architecture and Compilation Techniques, pages 7-16, Juan-les-Pins, |
| september 2004. |
| |
| @item |
| @anchor{Bas11}[Bas11] C. Bastoul. A Specification and a Library for Data |
| Exchange in Polyhedral Compilation Tools. Technical Report, |
| Paris-Sud University, France, September 2011. |
| |
| @item |
| @anchor{Fea92}[Fea92] P. Feautrier Some efficient solutions to the affine |
| scheduling problem, part II: multidimensional time. |
| International Journal of Parallel Programming, 21(6):389--420, December 1992. |
| |
| @item |
| @anchor{Gri04}[Gri04] M. Griebl. Automatic parallelization of loop programs |
| for distributed memory architectures. Habilitation Thesis. Facult@"at f@"ur |
| Mathematik und Informatik, Universit@"at Passau, 2004. |
| @emph{http://www.infosun.fmi.uni-passau.de/cl/loopo/} |
| |
| @item |
| @anchor{Qui00}[Qui00] F. Quiller@'e, S. Rajopadhye, and D. Wilde. |
| Generation of efficient nested loops from polyhedra. |
| International Journal of Parallel Programming, 28(5):469-498, |
| october 2000. |
| |
| @item |
| @anchor{Wil93}[Wil93] Doran K. Wilde. |
| A library for doing polyhedral operations. |
| Technical Report 785, IRISA, Rennes, France, 1993. |
| |
| @end itemize |
| |
| |
| |
| |
| @c % /************************************************************************* |
| @c % * PART VI: END OF THE DOCUMENT * |
| @c % *************************************************************************/ |
| @c @unnumbered Index |
| |
| @c @printindex cp |
| |
| @bye |