A preprocessor that gives C multidimensional arrays

CnD is a source-to-source translator that makes using n-dimensional arrays in C more pleasant. It will turn this code:

void sgemm(float *a, float *b, float *c, int n)
{
  dimension "fortran" a[n; n];
  dimension "fortran" b[n; n];
  dimension c[n; n];

  for (int i = 1; i <= n; ++i)
    for (int j = 1; j <= n; ++j)
    {
      float tmp = 0;

      for (int k = 1; k <= n; ++k)
        tmp += a[i;k]*b[k;j];

      c[i-1;j-1] = tmp;
    }
}

into this:

void sgemm(float *a, float *b, float *c, int n)
{
  for (int i = 1; i <= n; ++i)
    for (int j = 1; j <= n; ++j)
  {
    float tmp = 0;
    for (int k = 1; k <= n; ++k)
      tmp += a[((k - 1) * ((n - 1) + 1)) + (i - 1)] * b[((j - 1) * ((n - 1) + 1)) + (k - 1)];

    c[((i - 1) * n) + (j - 1)] = tmp;
  }
}

You may also take a look at a more comprehensive example that shows a few extra bells and whistles.

The only effect of a dimension declaration is to modify the interpretation of the array(idx) subscript operator. dimension declarations obey regular C scoping rules. Note that in order to prevent hard-to-find bugs, multi-dimensional array references using square brackets are considered an error.

I’d also like to note that CnD is a robust, parser-based translator, not a flaky text replacement tool. It understands all of C99.

Each axis specification in a dimension declaration has the following form:

start:end:stride:leading_dimension

start may be omitted. end and stride may also be omitted, but if entries after them are to be specified, their trailing colons must remain in place. For example, the axis specification :5 specifies only a stride of 5. The stride acts as a multiplier on the index. The dimension declaration is not checked for plausibility in any way, so you may shoot yourself in the foot any way you like.
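To make the stride rule concrete, here is a minimal sketch in plain C (not CnD output) of how one axis index might turn into an offset. The helper name axis_offset is hypothetical, and shifting the index by start before multiplying is an assumption; the text above only says that the stride multiplies the index.

```c
#include <assert.h>

/* Hypothetical helper, not part of CnD: offset contributed by one axis.
 * Assumption: the index is shifted by the axis start first and then
 * multiplied by the stride. */
long axis_offset(long idx, long start, long stride)
{
  return (idx - start) * stride;
}

void axis_offset_examples(void)
{
  /* Start 0, stride 5: index 3 lands 15 elements in. */
  assert(axis_offset(3, 0, 5) == 15);
  /* Start 1, stride 5: the first valid index maps to offset 0. */
  assert(axis_offset(1, 1, 5) == 0);
}
```

With a stride of 1 this reduces to the plain (i - 1) terms visible in the translated sgemm example.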

If the layout is given as “c” or not given at all, the following things are true:

  • The array is laid out in row-major order.

  • The end index is taken to be exclusive, if specified.

  • The start index defaults to 0.

If the layout is given as “fortran”, the following things are true:

  • The array is laid out in column-major order.

  • The end index is taken to be inclusive, if specified.

  • The start index defaults to 1.
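The two layouts can be compared with a small sketch in plain C. The helpers offset_c and offset_fortran are hypothetical names, not part of CnD. They mirror the index arithmetic in the translated sgemm example above for a two-axis array with default strides, and assume that the leading dimension equals the axis length.

```c
#include <assert.h>

/* Hypothetical helpers, not part of CnD. */

/* "c" layout: row-major, indices start at 0, last index varies fastest. */
long offset_c(long i, long j, long ncols)
{
  return i * ncols + j;
}

/* "fortran" layout: column-major, indices start at 1,
 * first index varies fastest. */
long offset_fortran(long i, long j, long nrows)
{
  return (j - 1) * nrows + (i - 1);
}

void layout_examples(void)
{
  /* In a 3x3 array the last element sits at offset 8 in both layouts. */
  assert(offset_c(2, 2, 3) == 8);
  assert(offset_fortran(3, 3, 3) == 8);
  /* Second row, first column: offset 3 row-major, offset 1 column-major. */
  assert(offset_c(1, 0, 3) == 3);
  assert(offset_fortran(2, 1, 3) == 1);
}
```

The same logical element therefore lives at different offsets depending on the layout string, which is why the sgemm example can mix "fortran" inputs with a "c" output.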

Most of the knowledge contained in the dimension declaration may be retrieved programmatically with the following functions:

  • rankof(a)

  • nitemsof(a)

  • lboundof(a, axis)

  • uboundof(a, axis) (returns the user-specified upper bound)

  • puboundof(a, axis) (returns the index just past the end of axis)

  • ldimof(a, axis)

  • strideof(a, axis)

In each case, axis must be a plain integer literal, not a constant expression.
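The difference between uboundof and puboundof follows from the inclusive and exclusive end conventions described earlier. The sketch below models that relationship in plain C. The struct axis_model and both helpers are hypothetical and are not CnD's internal representation; they only encode the layout rules stated in this document.

```c
#include <assert.h>

/* Hypothetical model of one axis; not CnD's internal representation. */
struct axis_model {
  long start;         /* lower bound, as lboundof would report it */
  long end;           /* user-specified upper bound (uboundof) */
  int end_inclusive;  /* 1 for "fortran", 0 for "c" */
};

/* Index just past the end of the axis, matching the description
 * of puboundof. */
long past_end(struct axis_model ax)
{
  return ax.end_inclusive ? ax.end + 1 : ax.end;
}

/* Number of valid indices along the axis. */
long axis_length(struct axis_model ax)
{
  return past_end(ax) - ax.start;
}

void bounds_examples(void)
{
  long n = 4;
  struct axis_model f = { 1, n, 1 };  /* axis "n" under "fortran" */
  struct axis_model c = { 0, n, 0 };  /* axis "n" under "c" */

  /* Both axes hold n entries, but the past-the-end index differs. */
  assert(past_end(f) == 5 && axis_length(f) == 4);
  assert(past_end(c) == 4 && axis_length(c) == 4);
}
```

Under this model, a loop of the form for (i = lower bound; i < past-the-end index; ++i) covers the axis in either layout.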

Installation / Usage

You may obtain CnD by downloading the tarball from the package index, or from github:

$ git clone git://github.com/inducer/cnd.git
$ cd cnd
$ git submodule init
$ git submodule update

To use CnD, simply add distribution-dir/bin to your PATH.

To get started, simply run (from within the cnd root):

$ cd examples
$ ../bin/cndcc gcc -std=c99 basic.c
$ ./a.out

If you would like more fine-grained control over the translation process, the cnd command exposes just the source-to-source translation. Note that cnd expects preprocessed source. You may pass the option -E to have cnd run the preprocessor on your source for you. Run:

$ cnd -h

to get full help on the command line interface. You may set the CND_CPP environment variable to the preprocessor you wish to use.

FAQ

Semicolons (not commas) to separate indices? Are you kidding me?

No. It turns out our hand is forced in this matter by a curious interaction with the C preprocessor. Consider the following situation:

#define MY_MACRO(a) /* something rather */

MY_MACRO(array[i,j])

The preprocessor sees the comma and rips our array access apart into two macro arguments, and then complains that MY_MACRO takes only one argument. Not very smart, but such is life. Thus the most natural choice for array access syntax is out.
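The effect can be observed without CnD, using only the standard C99 preprocessor. In the sketch below, the macros STRINGIFY, PICK_THIRD and COUNT_ARGS are hypothetical helpers written for this illustration. COUNT_ARGS expands to the number of arguments (one or two) that the preprocessor sees in a call.

```c
#include <assert.h>
#include <string.h>

#define STRINGIFY(x) #x

/* Hypothetical helpers: COUNT_ARGS expands to 1 or 2, depending on how
 * many arguments the preprocessor finds in the call. */
#define PICK_THIRD(a, b, c, ...) c
#define COUNT_ARGS(...) PICK_THIRD(__VA_ARGS__, 2, 1, 0)

void macro_examples(void)
{
  /* Square brackets do not protect a comma: two arguments are seen. */
  assert(COUNT_ARGS(array[i,j]) == 2);

  /* A semicolon is not an argument separator: one argument is seen. */
  assert(COUNT_ARGS(array[i;j]) == 1);

  /* The semicolon form also survives intact as a single argument. */
  assert(strcmp(STRINGIFY(array[i;j]), "array[i;j]") == 0);
}
```

Only parentheses shield a comma from the preprocessor, so array[i,j] would have to be wrapped in an extra pair of parentheses at every macro call site.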

(Credit for discovering this goes to Zydrunas Gimbutas.)

After discovering the above fact, we went through a number of choices for the syntax. First, we tried:

a(i,j)

While this was fine technically (and Fortran-compatible), it felt decidedly out of place in a C program, to the point of making the code hard to decipher.

We also considered:

a[i][j]

but this seemed wordy and deemphasized the fact that this was not ‘classic’ C-style array lookup.

But Vim highlights semicolons as an error!

Good point. Add this line:

let g:c_no_bracket_error = 1

to your .vimrc.

Version History

2011.3

  • Syntax change from a(i,j) to a[i;j].

  • Parser support for many more GNU extensions; tgmath.h now works on OS X (10.7) and Linux.

2011.2

  • Syntax change from a[i,j] to a(i,j).

  • Fixes for OS X and two bugs.

  • Generate #line directives.

2011.1

Initial release.

Future Features

  • Caching of lexer/parser tables (faster startup)

  • Bounds checking.

Author

Andreas Kloeckner <inform@tiker.net>, based on discussions with Zydrunas Gimbutas.
