From d8fe362715804e59cd199edda25fd27d8663b7f9 Mon Sep 17 00:00:00 2001 From: David Malcolm Date: Wed, 21 Oct 2015 17:32:14 -0400 Subject: [PATCH 28/58] FIXME: Describe the range-packing idea in line-map.h --- libcpp/include/line-map.h | 49 ++++++++++++++++++++++++++++++++++++++++------- 1 file changed, 42 insertions(+), 7 deletions(-) diff --git a/libcpp/include/line-map.h b/libcpp/include/line-map.h index 4422707..3d13741 100644 --- a/libcpp/include/line-map.h +++ b/libcpp/include/line-map.h @@ -70,13 +70,48 @@ typedef unsigned int linenum_type; | ordmap[0]->start_location) | first line in ordmap 0 -----------+-------------------------------+------------------------------- | ordmap[1]->start_location | First line in ordmap 1 - | ordmap[1]->start_location+1 | First column in that line - | ordmap[1]->start_location+2 | 2nd column in that line - | | Subsequent lines are offset by - | | (1 << column_bits), - | | e.g. 128 for 7 bits, with a - | | column value of 0 representing - | | "the whole line". + | ordmap[1]->start_location+32 | First column in that line + | (assuming range_bits == 5) | + | ordmap[1]->start_location+64 | 2nd column in that line + | ordmap[1]->start_location+4096| Second line in ordmap 1 + | (assuming column_bits == 12) + | + | Subsequent lines are offset by (1 << column_bits), + | e.g. 4096 for 12 bits, with a column value of 0 representing + | "the whole line". + | + | Within a line, the low "range_bits" (typically 5) are used for + | storing short ranges, so that there's an offset of + | (1 << range_bits) between individual columns within a line, + | typically 32. + | The low range_bits store the offset of the end point from the + | start point, and the start point is found by masking away + | the range bits. + | + | For example: + | ordmap[1]->start_location+64 "2nd column in that line" + | above means a caret at that location, with a range + | starting and finishing at the same place (the range bits + | are 0), a range of length 1. + | + | By contrast: + | ordmap[1]->start_location+68 + | has range bits 0x4, meaning a caret with a range starting at + | that location, but with endpoint 4 columns further on: a range + | of length 5. + | + | Ranges that have caret != start, or have an endpoint too + | far away to fit in range_bits are instead stored as ad-hoc + | locations. Hence for range_bits == 5 we can compactly store + | tokens of length <= 32 without needing to use the ad-hoc + | table. + | + | This packing scheme means we effectively have + | (column_bits - range_bits) + | of bits for the columns, typically (12 - 5) = 7, for 128 + | columns; longer line widths are accomodated by starting a + | new ordmap with a higher column_bits. + | | ordmap[2]->start_location-1 | Final location in ordmap 1 -----------+-------------------------------+------------------------------- | ordmap[2]->start_location | First line in ordmap 2 -- 1.8.5.3