"An Efficient Implementation of Nested Loop Control Instructions for Fine Grain Parallelism"

(by V. Andronache, N. L. Passos, and R. Simpson) in the Proceedings of the Ninth Annual South Central Conference, in the Journal of Computing in Small Colleges, April 1998, Jackson, MS, pp. 67-76.



  A significant amount of research has sought techniques that make maximum use of parallel capabilities in nested loops. Recent years have produced a number of theoretical results that attempt to extract the largest possible amount of parallelism. In this paper, one such technique, multidimensional retiming, is integrated with a novel implementation of the way loop indices control the iterations, providing a practical approach to the parallelization problem. Previous approaches have considered the problem as a whole and emphasized the use of multiple processors, without optimizing execution on an individual processor. This paper presents an efficient implementation of nested loops for a single processor. The theory and the algorithm used to achieve this result are presented, and a detailed example illustrates the use of the technique.


