RFC 203 (v2) Arrays: Notation for declaring and creating arrays
From: Perl6 RFC Librarian
Date: September 21, 2000 13:14
Subject: RFC 203 (v2) Arrays: Notation for declaring and creating arrays
Message ID: 20000921201448.8899.qmail@tmtowtdi.perl.org
This and other RFCs are available on the web at
http://dev.perl.org/rfc/
=head1 TITLE
Arrays: Notation for declaring and creating arrays
=head1 VERSION
Maintainer: Jeremy Howard <j@howard.fm>
Date: 8 Sep 2000
Last Modified: 21 Sep 2000
Mailing List: perl6-language-data@perl.org
Number: 203
Version: 2
Status: Frozen
=head1 DISCUSSION
No objections were noted to the proposals in this RFC. A change of name
from :bounds to :shape was accepted.
=head1 ABSTRACT
RFC 202 described the need to be able to declare a data structure that
contains elements of the same type stored contiguously in memory, which is
called an I<array>. This RFC outlines the syntax to declare and create
arrays. The syntax to create arrays is identical to that to create lists
of lists (described in L<perllol> in the Perl 5 documentation). The syntax
to declare the type of elements is the standard type syntax.
RFC 204 describes a syntax for multidimensional indexing of arrays. A
syntax to declare the bounds of the dimensions is described in this RFC
using a new ':shape' attribute.
=head1 DESCRIPTION
=head2 Compact arrays
It is proposed that if a list is declared that specifies a simple type for
its elements:
my int @a;
then that list be stored as an array--that is, contiguously in memory.
These arrays support I<all> the same syntax as lists. Therefore any
description of syntax for a 'list' also applies to an 'array', and vice
versa. However, their implementation is very different.
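The distinction between identical syntax and different storage can be illustrated outside Perl. The sketch below is only an analogy, using Python's standard C<array> module; the RFC does not prescribe any implementation, and the variable names are made up for this example.

```python
from array import array

# An ordinary list holds references to separately allocated objects.
as_list = [1, 2, 3, 4]

# A typed array stores the raw machine values back to back in one buffer.
as_array = array('i', as_list)   # 'i' = signed C int

# Both containers are indexed with exactly the same syntax.
same_syntax = all(as_list[k] == as_array[k] for k in range(len(as_list)))

# The buffer holds nothing but the elements: count times element width.
buffer_bytes = len(as_array.tobytes())
expected_bytes = len(as_array) * as_array.itemsize
```

The point of the comparison is that code reading C<as_list[k]> or C<as_array[k]> cannot tell which representation it is using, which is the property this RFC asks of C<my int @a>.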
=head2 :shape attribute
Furthermore, it is proposed that lists accept a new C<:shape> attribute:
my @a :shape(3);
that defines the number of elements in a list.
This is equivalent to:
my @a;
$#a = 3-1;
except that an attempt at accessing beyond $a[2] would result in an error
if :shape(3) is set. Specifically, adding a :shape attribute to a
declaration has two effects on the array:
=over 4
=item 1
Adds range checking (since exceptions are defined for access outside
the range)
=item 2
Allows the block of memory to be preallocated
=back
(2) can also be achieved by simply setting the bottom right element to
undef, or by setting @#array. The behaviour of (1) removes
autovivification of new elements, since an exception is raised
instead.
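The two effects can be modelled in a few lines. This is a minimal sketch in Python, not a proposed implementation: the class name C<ShapedList> is hypothetical, and rejecting negative indices is a simplifying assumption (Perl counts negative indices from the end).

```python
class ShapedList:
    """Hypothetical model of `my @a :shape(3)`; not part of any Perl API."""

    def __init__(self, size):
        self._size = size
        self._data = [None] * size   # effect 2: storage preallocated up front

    def _check(self, index):
        # effect 1: range checking replaces autovivification
        if not 0 <= index < self._size:
            raise IndexError(f"index {index} outside :shape({self._size})")

    def __getitem__(self, index):
        self._check(index)
        return self._data[index]

    def __setitem__(self, index, value):
        self._check(index)
        self._data[index] = value

    def __len__(self):
        return self._size


a = ShapedList(3)
a[2] = "last valid slot"        # indices 0..2 are in range
try:
    a[3] = "too far"            # would autovivify in an unshaped list
    overflow_raised = False
except IndexError:
    overflow_raised = True
```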
:shape doesn't actually reshape. If an array assigned to the variable
overshoots the bounds specified by :shape, an exception is raised. Reshaping is done
with reshape() (RFC 148), merge()/demerge() (RFC 90), and
part()/flatten() (RFC 91).
The :shape attribute can also accept a list:
my @b :shape(3,3);
The second argument is the number of elements in each inner list, as
before. The first argument is the number of inner lists that are referenced
as elements of the outer list. Therefore
my @b :shape(3,3);
$b[3][4] = 0; # Error: access beyond bounds of @b
would result in an error. :shape can take as many arguments as
required--an n-element list declares a list with at most n levels of
nesting, with the maximum index at level x being one less than the x-th
argument. Because lists of
lists support multidimensional indexing (see RFC 204) the :shape attribute
effectively specifies the bounds of a multidimensional structure.
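The bounds rule for several dimensions can be sketched as a small check. The Python below is an illustration under stated assumptions: the function names C<in_bounds> and C<max_indices> are invented here, and treating an index with fewer levels than the shape as valid (it selects a whole inner list) is an assumption rather than something the RFC states.

```python
def max_indices(shape):
    """Largest valid index at each nesting level, outermost first."""
    return tuple(extent - 1 for extent in shape)


def in_bounds(shape, index):
    """Return True if `index` is valid for a list declared with `shape`."""
    if len(index) > len(shape):
        return False             # more nesting levels than were declared
    # Each level must fall in 0 .. extent-1 for that level.
    return all(0 <= i < extent for i, extent in zip(index, shape))


shape_b = (3, 3)                 # models `my @b :shape(3,3)`
ok = in_bounds(shape_b, (2, 2))          # the last valid element
bad = in_bounds(shape_b, (3, 4))         # models `$b[3][4] = 0`
too_deep = in_bounds(shape_b, (0, 0, 0)) # three levels into a 2-level shape
```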
The parameters of :shape are optional if an array is assigned in the
declaration. Therefore:
my int @array :shape = @rvalue;
is equivalent to:
my int @array :shape(@#rvalue) = @rvalue;
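Deriving the bounds from the assigned value amounts to walking the nested list and recording the extent at each level. The following Python sketch assumes that C<@#> yields the extent of each dimension, outermost first, as RFC 206 proposes; the function name C<infer_shape> and the decision to reject ragged input are assumptions of this example.

```python
def infer_shape(value):
    """Return the extents of a rectangular nested list, outermost first."""
    if not isinstance(value, list):
        return ()                        # a scalar contributes no dimension
    if not value:
        return (0,)
    inner = [infer_shape(item) for item in value]
    if any(shape != inner[0] for shape in inner):
        raise ValueError("ragged nested list has no single shape")
    return (len(value),) + inner[0]


rvalue = [[1, 2], [3, 4]]
shape = infer_shape(rvalue)              # the bounds :shape would receive
flat_shape = infer_shape([1, 2, 3])
```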
The bounds of an array or list can be specified at run time, of course:
my @t1 :shape(@dimList) = getFromSomeplace();
=head2 Combining compact storage and :shape attribute
Efficient multidimensional arrays can be declared by combining a fixed
simple type with the :shape attribute:
my int @b :shape(4,4);
Perl in this case would set aside enough room for sixteen ints, and store
an attribute with @b that it had two dimensions, each indexed by (0..3).
Because @b here is stored as an array, and supports multidimensional
indexing (see RFC 204), it is a true multidimensional array.
Although @b looks just like a normal list of lists that happens to
have a type and an attribute, it is implemented as a multidimensional
array. Therefore
my int @b :shape(4,4);
@b = ([1,2,3,4],
      [5,6,7,8],
      [9,10,11,12],
      [13,14,15,16]);
creates a multidimensional array @b that contains all sixteen ints in a
contiguous block of memory, but can be accessed using standard list of
lists syntax, along with the extensions proposed in RFC 204.
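One way to picture the contiguous block is as a flat buffer with an index calculation on top. The Python sketch below assumes row-major order, which the RFC does not specify; the helper C<at> and the constants are hypothetical names for this illustration.

```python
from array import array

ROWS, COLS = 4, 4                # models `my int @b :shape(4,4)`

nested = [[1, 2, 3, 4],
          [5, 6, 7, 8],
          [9, 10, 11, 12],
          [13, 14, 15, 16]]

# All sixteen ints stored back to back, one row after another.
flat = array('i', [value for row in nested for value in row])


def at(i, j):
    """Read element [i][j] from the flat buffer, checking the bounds."""
    if not (0 <= i < ROWS and 0 <= j < COLS):
        raise IndexError(f"[{i}][{j}] outside :shape({ROWS},{COLS})")
    return flat[i * COLS + j]    # row-major offset (an assumption)
```

With this layout C<at(2, 3)> reads the same value as C<nested[2][3]>, so list-of-lists syntax can be kept while the data lives in a single block.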
Where the type and bounds of an array can be derived at run time, it is
not necessary to specify them explicitly:
my int @t1 :shape(@dimList) = getFromSomeplace();
my int @t2 :shape(@dimList) = getFromSomeplaceElse();
my @prod = @t1 * @t2; # @prod magically has type (int) and :shape (@dimList)
Note that this is using an element-wise multiplication operation,
described in RFC 82. If either @t1 or @t2 was unbounded (i.e. had no
:shape attribute) then @prod would also be unbounded.
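The propagation rule can be sketched as follows. This Python model represents each operand as a flat list of values plus a shape, with C<None> standing for "unbounded"; the function name is hypothetical, and raising an error when two bounded operands disagree is an assumption, since that case is left to RFC 82.

```python
def elementwise_product(a, shape_a, b, shape_b):
    """Multiply element by element; return (values, shape of the result)."""
    if len(a) != len(b):
        raise ValueError("operands differ in length")
    if shape_a is not None and shape_b is not None and shape_a != shape_b:
        raise ValueError("operands have different shapes")  # assumption
    values = [x * y for x, y in zip(a, b)]
    # The result is bounded only when both operands are bounded.
    both_bounded = shape_a is not None and shape_b is not None
    return values, (shape_a if both_bounded else None)


t1 = [1, 2, 3, 4]
t2 = [10, 20, 30, 40]
bounded = elementwise_product(t1, (2, 2), t2, (2, 2))
unbounded = elementwise_product(t1, (2, 2), t2, None)
```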
A list (of lists...) that contains elements of the same type can be
converted to an array by specifying its type:
my @some_LOL = ([1,2],
                [3,4]);
my int @array = @some_LOL;
and its bounds can be locked in as well if required:
my @some_LOL = ([1,2],
                [3,4]);
my int @array :shape(@#some_LOL) = @some_LOL;
=head1 IMPLEMENTATION
Too early to get into much detail beyond the obvious... Clearly arrays are
not a list of SVs, but are the raw data stored contiguously in memory,
along with the attributes of the array stored someplace.
:shape applies to lists as well, of course (because lists and arrays share
identical syntax), but this should be fine because it is just an
attribute.
Arrays do not require :shape to be specified. If not specified, they
should grow by doubling in size (like lists), but programmers will be
encouraged to avoid this because a new contiguous memory block will have
to be found each time. Programmers may also allocate memory after the fact
with '@#' (see RFC 206).
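The cost of growing by doubling can be made concrete by counting how often a new block would be needed. The Python sketch below is a simple counting model, not a description of any real allocator; the function name and the starting capacity of one element are assumptions.

```python
def reallocations_when_appending(n, initial_capacity=1):
    """Count how many times the block must be replaced to hold n elements."""
    capacity = initial_capacity
    moves = 0
    for size in range(1, n + 1):
        if size > capacity:
            capacity *= 2        # find a new contiguous block twice as large
            moves += 1
    return moves


grown = reallocations_when_appending(1000)
preallocated = reallocations_when_appending(1000, initial_capacity=1000)
```

Under this model an array that grows to 1000 elements is moved ten times, while one whose size is known in advance, for example through :shape, is never moved.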
=head1 REFERENCES
RFC 202: Overview of multidimensional array RFCs
perllol in the Perl 5 documentation
Implementation in PDL: http://pdl.sourceforge.net/PDLdocs/Internals.html