Internals

Working on the JIT library

Having checked out the source code (to “src”), you can configure and build the JIT library like this:

mkdir build
mkdir install
PREFIX=$(pwd)/install
cd build
../src/configure \
   --enable-host-shared \
   --enable-languages=jit \
   --disable-bootstrap \
   --enable-checking=release \
   --prefix=$PREFIX
nice make -j4 # altering the "4" to however many cores you have

This should build a libgccjit.so within jit/build/gcc:

[build] $ file gcc/libgccjit.so*
gcc/libgccjit.so:       symbolic link to `libgccjit.so.0'
gcc/libgccjit.so.0:     symbolic link to `libgccjit.so.0.0.1'
gcc/libgccjit.so.0.0.1: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not stripped

Here’s what those configuration options mean:

--enable-host-shared

Configuring with this option means that the compiler is built as position-independent code, which incurs a slight performance hit, but is necessary for a shared library.

--enable-languages=jit

This specifies which frontends to build. The JIT library looks like a frontend to the rest of the code.

--disable-bootstrap

For hacking on the “jit” subdirectory, performing a full bootstrap can be overkill, since that subdirectory is not used during a bootstrap. However, when submitting patches, you should remove this option, to ensure that the compiler can still bootstrap itself.

--enable-checking=release

The compiler can perform extensive self-checking as it runs, which is useful when debugging but slows things down.

For maximum speed, configure with --enable-checking=release to disable this self-checking.

Running the test suite

[build] $ cd gcc
[gcc] $ make check-jit RUNTESTFLAGS="-v -v -v"

A summary of the tests can then be seen in:

jit/build/gcc/testsuite/jit/jit.sum

and detailed logs in:

jit/build/gcc/testsuite/jit/jit.log

The test executables can be seen as:

jit/build/gcc/testsuite/jit/*.exe

which can be run independently.

You can compile and run individual tests by passing “jit.exp=TESTNAME” to RUNTESTFLAGS e.g.:

[gcc] $ make check-jit RUNTESTFLAGS="-v -v -v jit.exp=test-factorial.c"

and once a test has been compiled, you can debug it directly:

[gcc] $ PATH=.:$PATH \
        LD_LIBRARY_PATH=. \
        LIBRARY_PATH=. \
          gdb --args \
            testsuite/jit/test-factorial.exe

Environment variables

When running client code against a locally-built libgccjit, three environment variables need to be set up:

LD_LIBRARY_PATH

libgccjit.so is dynamically linked into client code, so if running against a locally-built library, LD_LIBRARY_PATH needs to be set up appropriately. The library can be found within the “gcc” subdirectory of the build tree:

$ file libgccjit.so*
libgccjit.so:       symbolic link to `libgccjit.so.0'
libgccjit.so.0:     symbolic link to `libgccjit.so.0.0.1'
libgccjit.so.0.0.1: ELF 64-bit LSB shared object, x86-64, version 1 (GNU/Linux), dynamically linked, not stripped

PATH

The library uses a driver executable for converting from .s assembler files to .so shared libraries. Specifically, it looks for a name expanded from ${target_noncanonical}-gcc-${gcc_BASEVER}${exeext} such as x86_64-unknown-linux-gnu-gcc-5.0.0.

Hence PATH needs to include a directory where the library can locate this executable.

The executable is normally installed to the installation bindir (e.g. /usr/bin), but a copy is also created within the “gcc” subdirectory of the build tree for running the testsuite, and for ease of development.
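
As a rough illustration of the naming scheme above, the driver name is the concatenation of three configure-time values. The helper below is a hypothetical sketch: the function name driver_name and its parameters are illustrative and are not part of libgccjit. It only mirrors the ${target_noncanonical}-gcc-${gcc_BASEVER}${exeext} pattern.

```cpp
#include <string>

// Hypothetical helper, for illustration only: composes the driver
// executable name from the three pieces named in the text.
std::string
driver_name (const std::string &target_noncanonical,
             const std::string &gcc_basever,
             const std::string &exeext)
{
  return target_noncanonical + "-gcc-" + gcc_basever + exeext;
}
```

With target_noncanonical set to x86_64-unknown-linux-gnu, gcc_BASEVER set to 5.0.0 and an empty exeext, this gives the x86_64-unknown-linux-gnu-gcc-5.0.0 name quoted above.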

LIBRARY_PATH

The driver executable invokes the linker, and the latter needs to locate support libraries needed by the generated code, or you will see errors like:

ld: cannot find crtbeginS.o: No such file or directory
ld: cannot find -lgcc
ld: cannot find -lgcc_s

Hence if running directly from a locally-built copy (without installing), LIBRARY_PATH needs to contain the “gcc” subdirectory of the build tree.

For example, to run a binary that uses the library against a non-installed build of the library in LIBGCCJIT_BUILD_DIR, you need an invocation of the client code like this, which prepends the directory to each of the environment variables:

$ LD_LIBRARY_PATH=${LIBGCCJIT_BUILD_DIR}:${LD_LIBRARY_PATH} \
  PATH=${LIBGCCJIT_BUILD_DIR}:${PATH} \
  LIBRARY_PATH=${LIBGCCJIT_BUILD_DIR}:${LIBRARY_PATH} \
    ./jit-hello-world
hello world

Overview of code structure

  • libgccjit.c implements the API entrypoints. It performs error checking, then calls into classes of the gcc::jit::recording namespace within jit-recording.c and jit-recording.h.

  • The gcc::jit::recording classes (within jit-recording.c and jit-recording.h) record the API calls that are made:

      /* Indentation indicates inheritance: */
      class context;
      class builtins_manager; // declared within jit-builtins.h
      class memento;
        class string;
        class location;
        class type;
          class function_type;
          class compound_type;
            class struct_;
            class union_;
        class field;
        class fields;
        class function;
        class block;
        class rvalue;
          class lvalue;
            class local;
            class global;
            class param;
        class statement;
    
  • When the context is compiled, the gcc::jit::playback classes (within jit-playback.c and jit-playback.h) replay the API calls within langhook:parse_file:

      /* Indentation indicates inheritance: */
      class context;
      class wrapper;
        class type;
          class compound_type;
        class field;
        class function;
        class block;
        class rvalue;
          class lvalue;
            class param;
        class source_file;
        class source_line;
        class location;
    
    Client Code   . Generated .            libgccjit.so
                  . code      .
                  .           . JIT API  . JIT "Frontend". (libbackend.a)
    ....................................................................................
       │          .           .          .               .
        ──────────────────────────>      .               .
                  .           .    │     .               .
                  .           .    V     .               .
                  .           .    ──> libgccjit.c       .
                  .           .        │ (error-checking).
                  .           .        │                 .
                  .           .        ──> jit-recording.c
                  .           .              (record API calls)
                  .           .    <───────              .
                  .           .    │     .               .
       <───────────────────────────      .               .
       │          .           .          .               .
       │          .           .          .               .
       V          .           .  gcc_jit_context_compile .
        ──────────────────────────>      .               .
                  .           .    │     .               .
                  .           .    │ ACQUIRE MUTEX       .
                  .           .    │     .               .
                  .           .    V───────────────────────> toplev::main (for now)
                  .           .          .               .       │
                  .           .          .               .   (various code)
                  .           .          .               .       │
                  .           .          .               .       V
                  .           .          .    <───────────────── langhook:parse_file
                  .           .          .    │          .
                  .           .          .    │ (jit_langhook_parse_file)
                  .           .          .    │          .
    ..........................................│..................VVVVVVVVVVVVV...
                  .           .          .    │          .       No GC in here
                  .           .          .    │ jit-playback.c
                  .           .          .    │   (playback of API calls)
                  .           .          .    ───────────────> creation of functions,
                  .           .          .               .     types, expression trees
                  .           .          .    <──────────────── etc
                  .           .          .    │(handle_locations: add locations to
                  .           .          .    │ linemap and associate them with trees)
                  .           .          .    │          .
                  .           .          .    │          .       No GC in here
    ..........................................│..................AAAAAAAAAAAAA...
                  .           .          .    │ for each function
                  .           .          .    ──> postprocess
                  .           .          .        │      .
                  .           .          .        ────────────> cgraph_finalize_function
                  .           .          .        <────────────
                  .           .          .     <──       .
                  .           .          .    │          .
                  .           .          .    ──────────────────> (end of
                  .           .          .               .       │ langhook_parse_file)
                  .           .          .               .       │
                  .           .          .               .   (various code)
                  .           .          .               .       │
                  .           .          .               .       ↓
                  .           .          .    <───────────────── langhook:write_globals
                  .           .          .    │          .
                  .           .          .    │ (jit_langhook_write_globals)
                  .           .          .    │          .
                  .           .          .    │          .
                  .           .          .    ──────────────────> finalize_compilation_unit
                  .           .          .               .       │
                  .           .          .               .   (the middle─end and backend)
                  .           .          .               .       ↓
                  .           .    <───────────────────────────── end of toplev::main
                  .           .    │ RELEASE MUTEX       .
                  .           .    │     .               .
                  .           .    │ Convert assembler to DSO
                  .           .    │     .               .
                  .           .    │ Load DSO            .
       <───────────────────────────      .               .
       │          .           .          .               .
       Get (void*).           .          .               .
       │          .           .          .               .
       │ Call it  .           .          .               .
       ───────────────>       .          .               .
                  .    │      .          .               .
                  .    │      .          .               .
       <───────────────       .          .               .
       │          .           .          .               .
       │          .           .          .               .
    etc
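
The first hop in the diagram (client code into libgccjit.c, then into jit-recording.c) can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's code: sketch::recording_context and sketch_context_new_local are hypothetical names, and the recorded "call" is reduced to a string.

```cpp
#include <cstddef>
#include <string>
#include <vector>

namespace sketch {

// Stand-in for gcc::jit::recording::context: it only stores the calls.
class recording_context
{
public:
  // Record one API call; returns its index in the log.
  int record (const std::string &call)
  {
    m_calls.push_back (call);
    return static_cast<int> (m_calls.size ()) - 1;
  }

  // Remember the first error reported against this context.
  void add_error (const std::string &msg)
  {
    if (m_first_error.empty ())
      m_first_error = msg;
  }

  const std::string &first_error () const { return m_first_error; }
  std::size_t num_calls () const { return m_calls.size (); }

private:
  std::vector<std::string> m_calls;
  std::string m_first_error;
};

} // namespace sketch

// Hypothetical entrypoint in the style of libgccjit.c: validate the
// arguments first, and only then hand over to the recording layer.
// Returns the index of the recorded call, or -1 on error.
int
sketch_context_new_local (sketch::recording_context *ctxt, const char *name)
{
  if (!ctxt)
    return -1; // no context to report the error against
  if (!name)
    {
      ctxt->add_error ("sketch_context_new_local: NULL name");
      return -1;
    }
  return ctxt->record (std::string ("new_local ") + name);
}
```

The design point is that a rejected call never reaches the recording layer, so the playback stage only ever sees calls that passed validation.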
    

Here is a high-level summary from jit-common.h:

In order to allow jit objects to be usable outside of a compile, whilst working with the existing structure of GCC’s code, the C API is implemented in terms of a gcc::jit::recording::context, which records the calls made to it.

When a gcc_jit_context is compiled, the recording context creates a playback context. The playback context invokes the bulk of the GCC code, and within the “frontend” parsing hook, plays back the recorded API calls, creating GCC tree objects.

So there are two parallel families of classes: those relating to recording, and those relating to playback:

  • Visibility: recording objects are exposed back to client code, whereas playback objects are internal to the library.
  • Lifetime: recording objects have a lifetime equal to that of the recording context that created them, whereas playback objects only exist within the frontend hook.
  • Memory allocation: recording objects are allocated by the recording context, and automatically freed by it when the context is released, whereas playback objects are allocated within the GC heap, and garbage-collected; they can own GC-references.
  • Integration with rest of GCC: recording objects are unrelated to the rest of GCC, whereas playback objects are wrappers around “tree” instances. Hence you can’t ask a recording rvalue or lvalue what its type is, whereas you can for a playback rvalue or lvalue (since it can work with the underlying GCC tree nodes).
  • Instancing: There can be multiple recording contexts “alive” at once (albeit with only one compiling at a time), whereas there can only be one playback context alive at one time (since it interacts with the GC).

Ultimately if GCC could support multiple GC heaps and contexts, and finer-grained initialization, then this recording vs playback distinction could be eliminated.

During a playback, we associate objects from the recording with their counterparts during this playback. For simplicity, we store this within the recording objects, as void *m_playback_obj, casting it to the appropriate playback object subclass. For these casts to make sense, the two class hierarchies need to have the same structure.

Note that the playback objects that m_playback_obj points to are GC-allocated, but the recording objects don’t own references: these associations only exist within a part of the code where the GC doesn’t collect, and are set back to NULL before the GC can run.
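
The recording/playback split and the m_playback_obj association can be sketched in miniature as follows. Everything in the sketch namespace is hypothetical and greatly simplified: the playback objects are plain values owned by the playback context rather than GC-allocated wrappers around trees, and the only "API calls" are integer literals and sums. It is meant to show the shape of the technique, not the library's implementation.

```cpp
#include <cassert>
#include <memory>
#include <vector>

namespace sketch {

// Stand-in for a playback object. In the real library these wrap GCC
// "tree" nodes and live in the GC heap; here it is a plain value.
struct playback_rvalue
{
  explicit playback_rvalue (int v) : value (v) {}
  int value;
};

// Owns the playback objects; they die with it.
class playback_context
{
public:
  playback_rvalue *new_rvalue (int v)
  {
    m_objs.push_back (std::unique_ptr<playback_rvalue> (new playback_rvalue (v)));
    return m_objs.back ().get ();
  }
private:
  std::vector<std::unique_ptr<playback_rvalue> > m_objs;
};

// Base class of everything that gets recorded.
class memento
{
public:
  virtual ~memento () {}
  virtual void replay_into (playback_context &pb) = 0;

  // Cast the untyped association back to the playback class.
  playback_rvalue *playback_obj () const
  {
    return static_cast<playback_rvalue *> (m_playback_obj);
  }
  void clear_playback_obj () { m_playback_obj = nullptr; }

protected:
  void *m_playback_obj = nullptr;
};

class int_literal : public memento
{
public:
  explicit int_literal (int v) : m_value (v) {}
  void replay_into (playback_context &pb) override
  {
    m_playback_obj = pb.new_rvalue (m_value);
  }
private:
  int m_value;
};

class sum : public memento
{
public:
  sum (memento *a, memento *b) : m_a (a), m_b (b) {}
  void replay_into (playback_context &pb) override
  {
    // Operands were recorded earlier, so they have been replayed already.
    assert (m_a->playback_obj () && m_b->playback_obj ());
    m_playback_obj = pb.new_rvalue (m_a->playback_obj ()->value
                                    + m_b->playback_obj ()->value);
  }
private:
  memento *m_a;
  memento *m_b;
};

// Owns the mementos and frees them when it is destroyed.
class recording_context
{
public:
  memento *new_int (int v) { return record (new int_literal (v)); }
  memento *new_sum (memento *a, memento *b) { return record (new sum (a, b)); }

  // "Compile": replay every recorded call into a fresh playback context,
  // read the last result, then reset the associations before the
  // playback objects go away.
  int compile ()
  {
    if (m_mementos.empty ())
      return 0;
    playback_context pb;
    for (auto &m : m_mementos)
      m->replay_into (pb);
    int result = m_mementos.back ()->playback_obj ()->value;
    for (auto &m : m_mementos)
      m->clear_playback_obj ();
    return result;
  }

private:
  memento *record (memento *m)
  {
    m_mementos.push_back (std::unique_ptr<memento> (m));
    return m;
  }
  std::vector<std::unique_ptr<memento> > m_mementos;
};

} // namespace sketch
```

Recording the literals 2 and 3 and their sum, then calling compile(), yields 5. The recorded objects outlive the compile, so the same context can be compiled again, while every m_playback_obj is back to null once compile() returns.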
