path: root/bench/statistics.py
author: John MacFarlane <jgm@berkeley.edu> 2015-06-11 14:32:22 -0700
committer: John MacFarlane <jgm@berkeley.edu> 2015-06-16 17:47:19 -0700
commit: 7f491b0bdf8e206458d284938efa8a0890c9d352
tree: d102cc7abad2628771a26dc35a26a577f0f524cd /bench/statistics.py
parent: ef77d908553bfdd37b83ae4832d7e6ff36874f24
Preliminary changes for new tab handling.

We no longer preprocess tabs to spaces before parsing. Instead, we keep track of both the byte offset and the (virtual) column as we parse block starts. This allows us to handle tabs without converting to spaces first. Tabs are left as tabs in the output.

Added `column` and `first_nonspace_column` fields to `parser`. Added utility function to advance the offset, computing the virtual column too.

Note that we don't need to deal with UTF-8 here at all. Only ASCII occurs in block starts.

Significant performance improvement due to the fact that we're not doing UTF-8 validation -- though we might want to add that back in.
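
To illustrate the offset/column bookkeeping described above, here is a minimal Python sketch. It is not the code from this commit: the names `advance` and `first_nonspace` and the `TAB_STOP` constant are hypothetical, and a tab stop of 4 columns is an assumption. It only shows how a byte offset and a virtual column can be tracked side by side without expanding tabs.

```python
from typing import Tuple

TAB_STOP = 4  # assumption: tab stops fall every 4 columns


def advance(line: bytes, offset: int, column: int, count: int) -> Tuple[int, int]:
    """Advance `count` bytes from `offset`; return the new (offset, column).

    Hypothetical helper. Every non-tab byte counts as one column, which is
    only safe where the input is ASCII, as in block starts.
    """
    end = min(offset + count, len(line))
    while offset < end:
        if line[offset] == 0x09:
            # A tab moves the virtual column to the next tab stop.
            column += TAB_STOP - (column % TAB_STOP)
        else:
            column += 1
        offset += 1
    return offset, column


def first_nonspace(line: bytes, offset: int, column: int) -> Tuple[int, int]:
    """Skip spaces and tabs; return (offset, column) of the first other byte."""
    while offset < len(line) and line[offset] in b" \t":
        offset, column = advance(line, offset, column, 1)
    return offset, column
```

For the line `b"  \t- item"`, `first_nonspace(line, 0, 0)` yields a byte offset of 3 and a virtual column of 4: the tab is consumed as a single byte but moves the column to the next tab stop, and the line itself is never rewritten.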
Diffstat (limited to 'bench/statistics.py')
0 files changed, 0 insertions, 0 deletions