Commit Graph

2 Commits

Author SHA1 Message Date
Tom Thorogood 17c1bc6792
Eliminate lexer goroutines (#792)
* Eliminate zlexer goroutine

This replaces the zlexer goroutine and channels with a zlexer struct
that maintains state and provides a channel-like API.

* Eliminate klexer goroutine

This replaces the klexer goroutine and channels with a klexer struct
that maintains state and provides a channel-like API.

* Merge scan into zlexer and klexer

This does result in tokenText existing twice, but it's pretty simple
and small so it's not that bad.

* Avoid using text/scanner.Position to track position

* Track escape within zlexer.Next

* Avoid zl.commt check on space and tab in zlexer

* Track stri within zlexer.Next

* Track comi within zlexer.Next

There is one special case at the start of a comment that needs to be
handled, otherwise this is as simple as stri was.

* Use a single token buffer in zlexer

This is safe as there is never both a non-empty string buffer and a
non-empty comment buffer.

* Don't hardcode length of zl.tok in zlexer

* Eliminate lex.length field

This is always set to len(l.token) and is only queried in a few places.

It was added in 47cc5b052d without any
obvious need.

* Add whitespace to klexer.Next

* Track lex within klexer.Next

* Use a strings.Builder in klexer.Next

* Simplify : case in klexer.Next

* Add whitespace to zlexer.Next

* Change for loop style in zlexer.Next and klexer.Next

* Surface read errors in zlexer

* Surface read errors from klexer

* Remove debug line from parseKey

* Rename tokenText to readByte

* Make readByte return ok bool

Also change the for loop style to match the Next for loops.

* Make readByte errors sticky

klexer.Next calls readByte separately from within the loop. Without
readByte being sticky, an error that occurs during that readByte call
may be lost.

* Panic in testRR if the error is non-nil

* Add whitespace and unify field setting in zlexer.Next

* Remove eof fields from zlexer and klexer

With readByte having sticky errors, this no longer needed. zl.eof = true
was also in the wrong place and could mask an unbalanced brace error.

* Merge zl.tok blocks in zlexer.Next

* Split the tok buffer into separate string and comment buffers

The invariant of stri > 0 && comi > 0 never being true was broken when
x == '\n' && !zl.quote && zl.commt && zl.brace != 0 (the
"If not in a brace this ends the comment AND the RR" block).

Split the buffer back out into two separate buffers to avoid clobbering.

* Replace token slices with arrays in zlexer

* Add a NewRR benchmark

* Move token buffers into zlexer.Next

These don't need to be retained across Next calls and can be stack
allocated inside Next. This drastically reduces memory consumption as
they accounted for nearly half of all the memory used.

name      old alloc/op   new alloc/op   delta
NewRR-12    9.72kB ± 0%    4.98kB ± 0%  -48.72%  (p=0.000 n=10+10)

* Add a ReadRR benchmark

Unlike NewRR, this will use an io.Reader that does not implement any
methods aside from Read. In particular it does not implement
io.ByteReader.

* Avoid using a bufio.Reader for io.ByteReader readers

At the same time use a smaller buffer size of 1KiB rather than the
bufio.NewReader default of 4KiB.

name       old time/op    new time/op    delta
NewRR-12     11.0µs ± 3%     9.5µs ± 2%  -13.77%  (p=0.000 n=9+10)
ReadRR-12    11.2µs ±16%     9.8µs ± 1%  -13.03%  (p=0.000 n=10+10)

name       old alloc/op   new alloc/op   delta
NewRR-12     4.98kB ± 0%    0.81kB ± 0%  -83.79%  (p=0.000 n=10+10)
ReadRR-12    4.87kB ± 0%    1.82kB ± 0%  -62.73%  (p=0.000 n=10+10)

name       old allocs/op  new allocs/op  delta
NewRR-12       19.0 ± 0%      17.0 ± 0%  -10.53%  (p=0.000 n=10+10)
ReadRR-12      19.0 ± 0%      19.0 ± 0%     ~     (all equal)

ReadRR-12    11.2µs ±16%     9.8µs ± 1%  -13.03%  (p=0.000 n=10+10)

* Surface any remaining comment from zlexer.Next

* Improve comment handling in zlexer.Next

This both fixes a regression where comments could be lost under certain
circumstances and now emits comments that occur within braces.

* Remove outdated comment from zlexer.Next and klexer.Next

* Delay converting LF to space in braced comment

* Fixup TestParseZoneComments

* Remove tokenUpper field from lex

Not computing this for every token, and instead only
when needed is a substantial performance improvement.

name       old time/op    new time/op    delta
NewRR-12     9.56µs ± 0%    6.30µs ± 1%  -34.08%  (p=0.000 n=9+10)
ReadRR-12    9.93µs ± 1%    6.67µs ± 1%  -32.77%  (p=0.000 n=10+10)

name       old alloc/op   new alloc/op   delta
NewRR-12       824B ± 0%      808B ± 0%   -1.94%  (p=0.000 n=10+10)
ReadRR-12    1.83kB ± 0%    1.82kB ± 0%   -0.87%  (p=0.000 n=10+10)

name       old allocs/op  new allocs/op  delta
NewRR-12       17.0 ± 0%      17.0 ± 0%     ~     (all equal)
ReadRR-12      19.0 ± 0%      19.0 ± 0%     ~     (all equal)

* Update ParseZone documentation to match comment changes

The zlexer code was changed to return comments more often, so update the
ParseZone documentation to match.
2018-10-15 17:42:31 +10:30
Miek Gieben 388f6eea29
Tests updates (#556)
Use :0 for loopback testing. This is more portable between testing environments.
Add testRR that calls NewRR and throws error away - apply it everywhere where needed.

It seems only Go 1.9 can deal with :0 being used. Disable 1.8 in travis.
2017-11-08 10:01:19 +00:00