An Overview of Common Racket Data Structures
The Racket language provides a variety of readytouse data structures and containers, however each such container has advantages and disadvantages when it comes to performance. The “everything is a list” approach makes things simple when you are learning Racket, but if you start using it for more complex programs, you need to be aware that lists are not always the best data structure to use. This guide looks at a few alternatives.
There are many data structures available in Racket, but this guide looks at only three of them: lists, vectors and hashtables. These are readily available with every Racket installation, are versatile and, used carefully, have good performance characteristics.
Lists
Lists are perhaps the most popular data structure used in Racket programs plus Lisp and Scheme too. If you learned Scheme from classics such as HTDP or SICP, you have seen them used a lot. The Racket documentation provides an overview, there is also an overview on cons and pairs, plus reference documentation for them. Lists are a very flexible data structure, and Racket has lots of functions that manipulate them: you can construct, search and reference elements in a variety of ways and for this reason you will find lists used in most example programs.
Not all operations on lists are efficient, and using lists containing a lot of elements can quickly result in performance issues. If the size of the dataset is more than a few hundred items, you need to be aware of the following performance considerations.
Constructing lists is only efficient when it is done by adding an element to the front of the list, using cons
. If you need to construct a list by adding an element to the end of the list, it will be more efficient to construct the list in reverse first, using cons
, than reverse the list at the end. The example below constructs a list of 10 integers by accumulating the result in reverse, than reverting the result at the end to produce the final list.
1 2 3 4 5 6 
The above pattern might be difficult to read and understand and use for complex scenarios, so consider using for/list
or for*/list
to construct lists by iteration.
Iterating over all the elements in a list, using map
or foreach
is usually efficient, however if you need to traverse a lot of large collections, a vector
might be a better alternative. When using iterations and comprehensions, like for
or for/list
to iterate over lists, consider using an inlist
sequence, as this is significantly more efficient, than just passing the list itself to the for
construct. For example, use:
Searching for an element in a list using member
, findf
and related functions takes a linear time and this can become a problem for large lists. If you need to do a lot of lookups, consider using a hash
or hashset
instead.
Referencing an element by its position, using listref
takes a linear time and will become slower as the referenced element is further down the list. If you need to reference elements by their position, consider using a vector
.
Inserting, updating, deleting elements and appending lists are inefficient for large lists. In Racket, lists are immutable, which means that whenever you insert, delete an element from a list, an entire new list is constructed and returned and for large lists, this is very inefficient. Appending a list to another also has to create a new list, making it an inefficient operation. If you need to insert, update or delete elements in a container, consider using a hash
or a hashset
instead. A vector
is also efficient at updating elements.
Vectors
vectors are fixed length arrays of arbitrary values and element access and update is done in a constant time. They are a versatile and efficient; in languages such a Python and C++, vectors are probably the most used container. Note that what Python calls “lists” are really implemented as vectors — this can be confusing if you just start out with Racket, as Python “lists” have different performance characteristics (they are really vectors).
The reference documentation lists all the functions available on vectors and these are similar to the functions that operate on lists (for example the equivalent of map
function for vectors is vectormap
). In Racket, vectors can be either mutable or immutable — immutable vectors cannot have their elements changed after they are created.
Just as with lists, some operations are provided for convenience, but they are not very efficient and you need to be aware of these.
Constructing vectors requires the size to be known in advance, and this size cannot be changed after a vector is constructed, although, Racket also provides a gvector
which can grow. To construct a vector, you can use one of the following:

makevector
to construct a vector where all elements have the same value 
buildvector
to construct a vector with elements provided by a user supplied function 
the
for/vector
form will construct by iterating over some sequences. In general, this form is convenient if you need to construct a vector without knowing the final number of elements, but it will be inefficient. However, the form has a#:length
keyword which can be used to provide the length, making the construction efficient.
Iterating over all the elements in a vector can be done using iterations and comprehensions, like for
, however, when doing so, consider using an invector
sequence, as this is significantly more efficient, than just passing the vector itself to the for
construct. For example, use:
1 2 3 4 
(define myvector (buildvector 10 (lambda (i) i))) (for ([element (invector myvector))) (prinf "~a~%" element)) 
Note that the vectormap
function will produce a new vector with the results of the map function, so it is not a good idea to use it just for simply iterating over the elements. vectormap!
can be used however, to update all elements inplace.
Searching for an element in a vector can be done using vectormember
or vectormemq
but, just like with lists, these functions need to scan the vector linearly and will be inefficient for large vectors. Still, a vector will be more efficient at this linear search than a list with a similar number of elements, as elements in a vector (or at least their references) are next to each other and will have better cache access characteristics in the computer memory.
Referencing an element by its position, using vectorref
, is an efficient operation, and if you need to do that often, vectors are the best container for this.
Updating an element in a vector using vectorset!
is efficient, but inserting or deleting elements or appending two vectors are inefficient as they have to construct new vectors, you should avoid them in performance sensitive code, especially with vectors containing a large number of elements.
Other vector types, such as gvector
and flvector
have better performance characteristics for some use cases. Two of the most important of these are gvector
and flvector
.
A gvector is a vector data type that supports efficient appending of elements at the end of the vector. It is great for constructing vectors when you don’t know the final number of elements.
If you need to store only floating point values in your vector, consider using a flvector which provides a more compact and efficient vector representation for these values. A corresponding suite of functions is available for operating on flvector
instances, see the reference for more details.
If the program you write has a lot of numerical calculations, consider using the [flonum][fl] library, which provides operations for floating point numbers (for example, it provides a fl+
operation which is more efficient than the +
operation).
Hash Tables
Hash tables are containers which map keys to values and can lookup keys efficiently. The Racket manuals have overview and reference sections for hash tables explaining how to use them and what functions are available. Like vectors, hash tables can be mutable or immutable, where immutable ones cannot be changed after they are created.
Just as with lists and vectors, some operations are provided for convenience, but they are not very efficient and you need to be aware of these.
Constructing hash tables can be efficiently done by creating an empty hash table first and adding elements to it one by one. Note that the for/hash
, while convenient to use will actually create an immutable hash table, so you cannot add any elements to it after it was created.
Iterating over all the elements in a hash table* can be done using iterations and comprehensions, but consider using inhash
, inhashkeys
, or inhashvalues
for these for
loops. Avoid using hashkeys
or hashvalues
as these functions construct an intermediate list, which can be a performance problem. For example, to print out all elements in a hash table, you can use:
Searching for an element in a hash table using hashref
is efficient. If you need to scan all elements to look for a value, this will take a linear amount of time, it will be inefficient and should be avoided. If you do need to scan all elements to look for one, at least consider using a for
form and inhashvalues
— the alternative of obtaining a list of elements and scanning them using list operations has even worse performance characteristics.
Inserting, updating, deleting elements can be done efficiently, so if you need to perform a lot of these operations, a hash table is the best container to use. Here is a summary of the available operations:

hashref
will find an element by a key. By default, if the element is not found, an error is raised, but you can specify what value to return if the key is not present. For example to return#f
when a key is not found, use(hashref ahash key #f)

hashref!
is similar tohashref
but it will create a new value for the key, of one does not already exist. It will be more efficient to use this instead of checking if a key exists and inserting it if it does now. 
hashset!
will set a value for a key, possibly replacing an old value. 
hashupdate!
combines can be used to update or create a new value, it effectively combineshashref
andhashset!
into one operation. 
hashremove!
will remove an element from the hash table 
hashclear!
will remove all elements in a hash table and will be more efficient than removing them one by one usinghashremove!
.
Appending hash tables can be done using hashunion
or hashunion!
, but these will not be very efficient for large hash tables, however, hashunion!
can be used for efficiently appending a smaller hash table to a larger one.
Conclusion
One can write performance sensitive applications in Racket, but the choice of data structures and algorithms will greatly affect the final performance. This post is a high level overview of the most important data structures and provided some guidelines on how to choose between them, but ultimately, the final choice will depend on the type of processing program needs to do and what algorithms you choose.