Racket Binary Packages

2019-09-15 :: racket

The Racket source code is distributed under the LGPL, which means that anyone distributing a proprietary Racket application must also provide a way for the user to re-link an updated Racket runtime and produce a new executable for that application. In this blog post we’ll explore how to achieve this.

There are discussions about moving Racket to a different license, but at least as of Racket 7.4 this has not been completed, so distributing proprietary Racket applications still requires compliance with the LGPL. The LGPL allows proprietary applications to be built and linked against the software, but requires the publisher to distribute, along with the application, the binary bytecode files, so that the LGPL libraries can be modified and re-linked into the application. Exactly what this means is subject to debate, but the Racket developers provide us with the following license clarification (the emphasis is mine):

First, if you distribute your Racket application in source form or as compiled bytecode files, the Racket license does not restrict you at all.

Second, if you distribute your Racket application as compiled binary generated by raco exe, there are no requirements placed on the licensing of your software. However, the LGPL requires that you make it possible to re-link your software with modified versions of Racket. This means, basically, that you need to provide the compiled bytecode files used to produce the compiled binary, if requested by someone who got your software from you. Note that this does not mean that your software has to be made open source, nor do you have to give the source code to anyone, nor do you have to make the compiled bytecode files available to the public or let other people redistribute them. Furthermore, this is not revealing any more of your source code than the raco exe format, since the bytecode is embedded in an extractable way in the resulting executable.

The above paragraph raises the question of how one provides the bytecode files for a proprietary application, so that they can be re-linked into a final executable with a modified version of Racket, plus any other LGPL libraries the application might use.

I asked this question in this racket-users thread, and since no one answered, it seems that this is not common knowledge. I did some investigating and discovered that, while it is relatively simple to accomplish, the details are non-obvious. This blog post records my findings.

DISCLAIMER: before we continue, I need to clarify that I am not a lawyer and this blog post is not legal advice. I only address the technical problem of distributing applications as bytecode files. I don’t know if this is sufficient to comply with the current Racket license.

Distributing a binary-only package

The simplest use case is to distribute a binary-only package which the user can then install on their machine and use without having access to the source code.

First you’ll need to build and install the proprietary package on your local or build machine. In the example below I’ll use one of my public packages, but these steps will work even if the package source is not publicly available:

$ git clone https://github.com/alex-hhh/data-frame
$ raco pkg install ./data-frame

Next, you can use the raco pkg create command to create the binary package:

$ raco pkg create --binary --dest . --from-dir data-frame
packing into C:\Users\alexh\Projects\Racket\.\data-frame.zip
writing package checksum to C:\Users\alexh\Projects\Racket\.\data-frame.zip.CHECKSUM

The above command creates two files: an archive containing the package data without the source code, and a checksum file containing the SHA1 checksum of the package. The package, along with the checksum file, can be installed on any machine with Racket using raco pkg install data-frame.zip, without needing the source code.

The resulting ZIP file contains all package files except the source code, which means it contains more than is strictly needed for a binary-only distribution: in the case of the data-frame package, the binary package also contains the continuous integration scripts and the test data. There are two ways to remove these unneeded files: (1) remove them from the package before the raco pkg create step, or (2) edit the package archive and remove them there. If you choose the second option, here are a few things to be aware of:

If you extract the archive to remove the files, the new archive must be created so that the files are directly at the top level of the archive, without an intermediate directory for the package name. Most archive utilities create archives that extract into a directory named after the archive, but this is not what raco pkg expects.
Regardless of how the archive is edited, the checksum will now be invalid and must be recalculated. The checksum is SHA1 and can be calculated with the sha1sum utility.
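
To make the two points above concrete, here is a minimal sketch of the re-packaging step. It is written in Python purely for illustration and is not part of the Racket tooling. The helper name repack and the placeholder files are made up for the demo, and the sketch assumes that the .CHECKSUM file holds only the hexadecimal SHA1 digest, which is worth confirming against a file produced by raco pkg create.

```python
import hashlib
import tempfile
import zipfile
from pathlib import Path


def repack(src_dir: Path, zip_path: Path) -> str:
    """Zip the files under src_dir with entries at the archive's top level,
    write <zip_path>.CHECKSUM, and return the SHA1 hex digest."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(src_dir.rglob("*")):
            if path.is_file():  # empty directories are skipped in this sketch
                # Names are relative to src_dir, so the archive has no
                # intermediate "data-frame/" directory.
                zf.write(path, path.relative_to(src_dir).as_posix())
    digest = hashlib.sha1(zip_path.read_bytes()).hexdigest()
    # Assumption: the checksum file contains just the hex digest.
    zip_path.with_name(zip_path.name + ".CHECKSUM").write_text(digest)
    return digest


# Demo on a throwaway directory standing in for an extracted, edited package.
with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp)
    pkg = work / "data-frame"
    (pkg / "compiled").mkdir(parents=True)
    (pkg / "info.rkt").write_text("#lang info\n")               # placeholder
    (pkg / "compiled" / "main_rkt.zo").write_bytes(b"placeholder")
    archive = work / "data-frame.zip"
    digest = repack(pkg, archive)
    with zipfile.ZipFile(archive) as zf:
        names = sorted(zf.namelist())
    checksum_text = (work / "data-frame.zip.CHECKSUM").read_text()
```

The archive is written next to the package directory rather than inside it, so it never ends up packing itself.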

It is also worth mentioning that Racket comes with built-in libraries for zip archive creation and SHA1 checksum calculation, so the entire open, edit and re-package cycle can be implemented as a Racket script as part of an automated build process.

The resulting binary package will be tied to the Racket version used to build it and to any other packages it uses and will only work correctly with those versions, and perhaps with small variations of those versions.

Distributing byte code (ZO files) for an application

It is not immediately obvious from the raco pkg create documentation, but this command can also be used to distribute the bytecode files of an application for re-linking purposes. Before we go on, it is worth clarifying that you can build standalone executables using raco exe and raco distribute, or by calling create-embedding-executable and assemble-distribution from a Racket script. The steps described in this section are not needed if you simply want to build an executable for the application: they allow distributing the compiled bytecode of a proprietary application so that it can be re-linked, possibly with a modified Racket version, for compliance with the LGPL.

As was the case for packages, you’ll need to create the bytecode files first; this can be done by running raco make on the Racket files or by calling managed-compile-zo from a Racket script. In the example below, I’ll use my ActivityLog2 application, which has publicly available source code, but this process will work with a proprietary application too. After compiling the source files, the binary package can be created as for a normal Racket package. It is interesting to note that ActivityLog2 is not a package and has no info.rkt file, but the raco pkg command does not seem to need it:

$ raco pkg create --binary --dest . --from-dir ActivityLog2
packing into C:\Users\alexh\Projects\.\ActivityLog2.zip
writing package checksum to C:\Users\alexh\Projects\.\ActivityLog2.zip.CHECKSUM

As was the case for distributing a binary package, the resulting ZIP file will contain files which are not strictly needed for a binary-only distribution and the resulting archive will need to be edited to remove unnecessary files — in the case of ActivityLog2, these would be the docs, scripts and test folders, plus if an executable was built, the ActivityLog2.exe and dist folder.
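
The pruning can also be done without extracting the archive, by copying the wanted entries into a new archive. The sketch below is in Python and is for illustration only. The prune helper, the UNWANTED list and the demo file names are made up for the example, and the checksum of the new archive still has to be recalculated afterwards.

```python
import io
import zipfile

# Hypothetical top-level entries to drop; adjust for your own application.
UNWANTED = ("docs/", "scripts/", "test/", "dist/", "ActivityLog2.exe")


def prune(src, dst, unwanted=UNWANTED):
    """Copy the zip archive src to dst, skipping entries whose names start
    with one of the unwanted prefixes. Returns the names that were kept."""
    kept = []
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        for info in zin.infolist():
            if info.filename.startswith(unwanted):
                continue
            # Passing the ZipInfo keeps the entry name and compression type,
            # so kept entries stay at the top level of the new archive.
            zout.writestr(info, zin.read(info.filename))
            kept.append(info.filename)
    return kept


# Demo with an in-memory archive and made-up entries.
source = io.BytesIO()
with zipfile.ZipFile(source, "w") as zf:
    for name in ("compiled/run_rkt.zo", "build.rkt", "docs/guide.md",
                 "test/data.csv", "ActivityLog2.exe"):
        zf.writestr(name, b"placeholder")

pruned = io.BytesIO()
kept = prune(source, pruned)
with zipfile.ZipFile(pruned) as zf:
    pruned_names = zf.namelist()
```

Matching on name prefixes is a deliberately simple choice; a real build script would probably keep an explicit allow-list instead, so that new files are not shipped by accident.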

Re-creating the executable and distribution is trickier: a simple raco exe command will not work, since the "run" module is available only as a compiled ZO file in the compiled sub-folder:

$ raco exe --gui run.rkt
raco exe: source file does not exist
  path: run.rkt

However, my application has a build.rkt script which handles compilation, executable creation and assembling a distribution by calling create-embedding-executable and assemble-distribution directly. Since there is no compiled ZO file for this script, it is copied into the package as-is, and it can be run to produce the final ActivityLog2 executable and distribution:

$ racket build.rkt
Compiling .zo files... done.
Building application executable... done.
Assembling distribution... done.

It seems that a simple Racket build script needs to be shipped with the compiled bytecode files, but this should not be a major problem. If you are interested in what a build script might look like, you can have a look at the one used by the ActivityLog2 application here.

Final Thoughts

The above mechanisms outline the technical basis on which someone who decides to publish a proprietary Racket application can comply with the LGPL, at least in good faith and as clarified by the Racket authors in their license clarification page. I don’t know if this is sufficient and I am not a lawyer.

There are some limitations to this mechanism: first, the compiled ZO files are dependent on the Racket version, so a binary distribution can only be used with the same Racket version (or perhaps one in which a limited number of modifications are made by the user). This limitation also applies to other libraries which the binary distribution might use, especially since code from other libraries can be inlined in the binary files. At best, this mechanism would be useful for a user to just re-construct the environment in which the application was originally built and perhaps apply some small and limited modifications to that environment before re-linking the application.

A second limitation is that the individual file names of the source code are still present as the names of the compiled ZO files in the binary distribution. Depending on how proprietary a proprietary application is, this might reveal too much of the original source.
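
To see what a given archive gives away, the entry names can be listed and mapped back to source file names. The sketch below is in Python and is for illustration only. It assumes the usual layout where a source file such as run.rkt is compiled to compiled/run_rkt.zo; the revealed_sources helper and the demo entries are made up.

```python
import io
import re
import zipfile

# Assumed layout: <dir>/compiled/<name>_<ext>.zo corresponds to <dir>/<name>.<ext>
ZO_NAME = re.compile(r"^(?P<dir>.*/)?compiled/(?P<stem>[^/]+)_(?P<ext>[^_/]+)\.zo$")


def revealed_sources(archive):
    """Return the source file paths implied by the .zo entries in a zip archive."""
    found = []
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            m = ZO_NAME.match(name)
            if m:
                found.append(f"{m['dir'] or ''}{m['stem']}.{m['ext']}")
    return sorted(found)


# Demo with an in-memory archive and made-up entries.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as zf:
    for name in ("compiled/run_rkt.zo", "rkt/compiled/session-df_rkt.zo",
                 "build.rkt", "docs/readme.md"):
        zf.writestr(name, b"placeholder")

sources = revealed_sources(demo)
```

Running this over a real binary distribution before publishing it gives a quick list of the module names a recipient would be able to see.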

I have no plans for building proprietary applications in Racket myself, but I hope these notes might be useful to others.