dns/dnssec_keyscan.go

310 lines
6.2 KiB
Go
Raw Normal View History

2011-12-15 16:38:14 +00:00
package dns
import (
Eliminate lexer goroutines (#792) * Eliminate zlexer goroutine This replaces the zlexer goroutine and channels with a zlexer struct that maintains state and provides a channel-like API. * Eliminate klexer goroutine This replaces the klexer goroutine and channels with a klexer struct that maintains state and provides a channel-like API. * Merge scan into zlexer and klexer This does result in tokenText existing twice, but it's pretty simple and small so it's not that bad. * Avoid using text/scanner.Position to track position * Track escape within zlexer.Next * Avoid zl.commt check on space and tab in zlexer * Track stri within zlexer.Next * Track comi within zlexer.Next There is one special case at the start of a comment that needs to be handled, otherwise this is as simple as stri was. * Use a single token buffer in zlexer This is safe as there is never both a non-empty string buffer and a non-empty comment buffer. * Don't hardcode length of zl.tok in zlexer * Eliminate lex.length field This is always set to len(l.token) and is only queried in a few places. It was added in 47cc5b052df9b4e5d5a9900cfdd622607de10a6d without any obvious need. * Add whitespace to klexer.Next * Track lex within klexer.Next * Use a strings.Builder in klexer.Next * Simplify : case in klexer.Next * Add whitespace to zlexer.Next * Change for loop style in zlexer.Next and klexer.Next * Surface read errors in zlexer * Surface read errors from klexer * Remove debug line from parseKey * Rename tokenText to readByte * Make readByte return ok bool Also change the for loop style to match the Next for loops. * Make readByte errors sticky klexer.Next calls readByte separately from within the loop. Without readByte being sticky, an error that occurs during that readByte call may be lost. * Panic in testRR if the error is non-nil * Add whitespace and unify field setting in zlexer.Next * Remove eof fields from zlexer and klexer With readByte having sticky errors, this no longer needed. zl.eof = true was also in the wrong place and could mask an unbalanced brace error. * Merge zl.tok blocks in zlexer.Next * Split the tok buffer into separate string and comment buffers The invariant of stri > 0 && comi > 0 never being true was broken when x == '\n' && !zl.quote && zl.commt && zl.brace != 0 (the "If not in a brace this ends the comment AND the RR" block). Split the buffer back out into two separate buffers to avoid clobbering. * Replace token slices with arrays in zlexer * Add a NewRR benchmark * Move token buffers into zlexer.Next These don't need to be retained across Next calls and can be stack allocated inside Next. This drastically reduces memory consumption as they accounted for nearly half of all the memory used. name old alloc/op new alloc/op delta NewRR-12 9.72kB ± 0% 4.98kB ± 0% -48.72% (p=0.000 n=10+10) * Add a ReadRR benchmark Unlike NewRR, this will use an io.Reader that does not implement any methods aside from Read. In particular it does not implement io.ByteReader. * Avoid using a bufio.Reader for io.ByteReader readers At the same time use a smaller buffer size of 1KiB rather than the bufio.NewReader default of 4KiB. name old time/op new time/op delta NewRR-12 11.0µs ± 3% 9.5µs ± 2% -13.77% (p=0.000 n=9+10) ReadRR-12 11.2µs ±16% 9.8µs ± 1% -13.03% (p=0.000 n=10+10) name old alloc/op new alloc/op delta NewRR-12 4.98kB ± 0% 0.81kB ± 0% -83.79% (p=0.000 n=10+10) ReadRR-12 4.87kB ± 0% 1.82kB ± 0% -62.73% (p=0.000 n=10+10) name old allocs/op new allocs/op delta NewRR-12 19.0 ± 0% 17.0 ± 0% -10.53% (p=0.000 n=10+10) ReadRR-12 19.0 ± 0% 19.0 ± 0% ~ (all equal) ReadRR-12 11.2µs ±16% 9.8µs ± 1% -13.03% (p=0.000 n=10+10) * Surface any remaining comment from zlexer.Next * Improve comment handling in zlexer.Next This both fixes a regression where comments could be lost under certain circumstances and now emits comments that occur within braces. * Remove outdated comment from zlexer.Next and klexer.Next * Delay converting LF to space in braced comment * Fixup TestParseZoneComments * Remove tokenUpper field from lex Not computing this for every token, and instead only when needed is a substantial performance improvement. name old time/op new time/op delta NewRR-12 9.56µs ± 0% 6.30µs ± 1% -34.08% (p=0.000 n=9+10) ReadRR-12 9.93µs ± 1% 6.67µs ± 1% -32.77% (p=0.000 n=10+10) name old alloc/op new alloc/op delta NewRR-12 824B ± 0% 808B ± 0% -1.94% (p=0.000 n=10+10) ReadRR-12 1.83kB ± 0% 1.82kB ± 0% -0.87% (p=0.000 n=10+10) name old allocs/op new allocs/op delta NewRR-12 17.0 ± 0% 17.0 ± 0% ~ (all equal) ReadRR-12 19.0 ± 0% 19.0 ± 0% ~ (all equal) * Update ParseZone documentation to match comment changes The zlexer code was changed to return comments more often, so update the ParseZone documentation to match.
2018-10-15 07:12:31 +00:00
"bufio"
"crypto"
2011-12-16 16:32:15 +00:00
"crypto/ecdsa"
"crypto/ed25519"
2011-12-16 16:32:15 +00:00
"crypto/rsa"
2011-12-15 16:38:14 +00:00
"io"
2011-12-16 16:32:15 +00:00
"math/big"
"strconv"
2011-12-16 16:32:15 +00:00
"strings"
2011-12-15 16:38:14 +00:00
)
// NewPrivateKey returns a PrivateKey by parsing the string s.
// s should be in the same form of the BIND private key files.
func (k *DNSKEY) NewPrivateKey(s string) (crypto.PrivateKey, error) {
if s == "" || s[len(s)-1] != '\n' { // We need a closing newline
return k.ReadPrivateKey(strings.NewReader(s+"\n"), "")
}
return k.ReadPrivateKey(strings.NewReader(s), "")
}
2014-01-26 13:05:48 +00:00
// ReadPrivateKey reads a private key from the io.Reader q. The string file is
2012-09-02 19:22:24 +00:00
// only used in error reporting.
// The public key must be known, because some cryptographic algorithms embed
// the public inside the privatekey.
func (k *DNSKEY) ReadPrivateKey(q io.Reader, file string) (crypto.PrivateKey, error) {
m, err := parseKey(q, file)
2011-12-16 16:32:15 +00:00
if m == nil {
return nil, err
2011-12-16 16:32:15 +00:00
}
if _, ok := m["private-key-format"]; !ok {
return nil, ErrPrivKey
}
if m["private-key-format"] != "v1.2" && m["private-key-format"] != "v1.3" {
return nil, ErrPrivKey
}
// TODO(mg): check if the pubkey matches the private key
algo, err := strconv.ParseUint(strings.SplitN(m["algorithm"], " ", 2)[0], 10, 8)
if err != nil {
return nil, ErrPrivKey
}
switch uint8(algo) {
case RSASHA1, RSASHA1NSEC3SHA1, RSASHA256, RSASHA512:
priv, err := readPrivateKeyRSA(m)
if err != nil {
return nil, err
2012-04-15 19:00:17 +00:00
}
pub := k.publicKeyRSA()
if pub == nil {
return nil, ErrKey
}
priv.PublicKey = *pub
return priv, nil
case ECDSAP256SHA256, ECDSAP384SHA384:
priv, err := readPrivateKeyECDSA(m)
if err != nil {
return nil, err
2012-04-15 19:00:17 +00:00
}
pub := k.publicKeyECDSA()
if pub == nil {
return nil, ErrKey
}
priv.PublicKey = *pub
return priv, nil
case ED25519:
return readPrivateKeyED25519(m)
default:
return nil, ErrAlg
2011-12-16 16:32:15 +00:00
}
}
// Read a private key (file) string and create a public key. Return the private key.
func readPrivateKeyRSA(m map[string]string) (*rsa.PrivateKey, error) {
2011-12-16 16:32:15 +00:00
p := new(rsa.PrivateKey)
p.Primes = []*big.Int{nil, nil}
for k, v := range m {
switch k {
case "modulus", "publicexponent", "privateexponent", "prime1", "prime2":
v1, err := fromBase64([]byte(v))
2011-12-16 16:32:15 +00:00
if err != nil {
return nil, err
}
switch k {
case "modulus":
p.PublicKey.N = new(big.Int).SetBytes(v1)
2011-12-16 16:32:15 +00:00
case "publicexponent":
i := new(big.Int).SetBytes(v1)
2011-12-16 16:32:15 +00:00
p.PublicKey.E = int(i.Int64()) // int64 should be large enough
case "privateexponent":
p.D = new(big.Int).SetBytes(v1)
2011-12-16 16:32:15 +00:00
case "prime1":
p.Primes[0] = new(big.Int).SetBytes(v1)
2011-12-16 16:32:15 +00:00
case "prime2":
p.Primes[1] = new(big.Int).SetBytes(v1)
2011-12-16 16:32:15 +00:00
}
case "exponent1", "exponent2", "coefficient":
// not used in Go (yet)
case "created", "publish", "activate":
// not used in Go (yet)
}
}
return p, nil
}
func readPrivateKeyECDSA(m map[string]string) (*ecdsa.PrivateKey, error) {
2011-12-16 16:32:15 +00:00
p := new(ecdsa.PrivateKey)
p.D = new(big.Int)
// TODO: validate that the required flags are present
2011-12-16 16:32:15 +00:00
for k, v := range m {
switch k {
case "privatekey":
v1, err := fromBase64([]byte(v))
2011-12-16 16:32:15 +00:00
if err != nil {
return nil, err
}
p.D.SetBytes(v1)
case "created", "publish", "activate":
2011-12-16 16:32:15 +00:00
/* not used in Go (yet) */
}
}
return p, nil
}
func readPrivateKeyED25519(m map[string]string) (ed25519.PrivateKey, error) {
var p ed25519.PrivateKey
// TODO: validate that the required flags are present
for k, v := range m {
switch k {
case "privatekey":
p1, err := fromBase64([]byte(v))
if err != nil {
return nil, err
}
if len(p1) != ed25519.SeedSize {
return nil, ErrPrivKey
}
p = ed25519.NewKeyFromSeed(p1)
case "created", "publish", "activate":
/* not used in Go (yet) */
}
}
return p, nil
}
// parseKey reads a private key from r. It returns a map[string]string,
2011-12-16 16:32:15 +00:00
// with the key-value pairs, or an error when the file is not correct.
func parseKey(r io.Reader, file string) (map[string]string, error) {
2011-12-16 13:48:30 +00:00
m := make(map[string]string)
Eliminate lexer goroutines (#792) * Eliminate zlexer goroutine This replaces the zlexer goroutine and channels with a zlexer struct that maintains state and provides a channel-like API. * Eliminate klexer goroutine This replaces the klexer goroutine and channels with a klexer struct that maintains state and provides a channel-like API. * Merge scan into zlexer and klexer This does result in tokenText existing twice, but it's pretty simple and small so it's not that bad. * Avoid using text/scanner.Position to track position * Track escape within zlexer.Next * Avoid zl.commt check on space and tab in zlexer * Track stri within zlexer.Next * Track comi within zlexer.Next There is one special case at the start of a comment that needs to be handled, otherwise this is as simple as stri was. * Use a single token buffer in zlexer This is safe as there is never both a non-empty string buffer and a non-empty comment buffer. * Don't hardcode length of zl.tok in zlexer * Eliminate lex.length field This is always set to len(l.token) and is only queried in a few places. It was added in 47cc5b052df9b4e5d5a9900cfdd622607de10a6d without any obvious need. * Add whitespace to klexer.Next * Track lex within klexer.Next * Use a strings.Builder in klexer.Next * Simplify : case in klexer.Next * Add whitespace to zlexer.Next * Change for loop style in zlexer.Next and klexer.Next * Surface read errors in zlexer * Surface read errors from klexer * Remove debug line from parseKey * Rename tokenText to readByte * Make readByte return ok bool Also change the for loop style to match the Next for loops. * Make readByte errors sticky klexer.Next calls readByte separately from within the loop. Without readByte being sticky, an error that occurs during that readByte call may be lost. * Panic in testRR if the error is non-nil * Add whitespace and unify field setting in zlexer.Next * Remove eof fields from zlexer and klexer With readByte having sticky errors, this no longer needed. zl.eof = true was also in the wrong place and could mask an unbalanced brace error. * Merge zl.tok blocks in zlexer.Next * Split the tok buffer into separate string and comment buffers The invariant of stri > 0 && comi > 0 never being true was broken when x == '\n' && !zl.quote && zl.commt && zl.brace != 0 (the "If not in a brace this ends the comment AND the RR" block). Split the buffer back out into two separate buffers to avoid clobbering. * Replace token slices with arrays in zlexer * Add a NewRR benchmark * Move token buffers into zlexer.Next These don't need to be retained across Next calls and can be stack allocated inside Next. This drastically reduces memory consumption as they accounted for nearly half of all the memory used. name old alloc/op new alloc/op delta NewRR-12 9.72kB ± 0% 4.98kB ± 0% -48.72% (p=0.000 n=10+10) * Add a ReadRR benchmark Unlike NewRR, this will use an io.Reader that does not implement any methods aside from Read. In particular it does not implement io.ByteReader. * Avoid using a bufio.Reader for io.ByteReader readers At the same time use a smaller buffer size of 1KiB rather than the bufio.NewReader default of 4KiB. name old time/op new time/op delta NewRR-12 11.0µs ± 3% 9.5µs ± 2% -13.77% (p=0.000 n=9+10) ReadRR-12 11.2µs ±16% 9.8µs ± 1% -13.03% (p=0.000 n=10+10) name old alloc/op new alloc/op delta NewRR-12 4.98kB ± 0% 0.81kB ± 0% -83.79% (p=0.000 n=10+10) ReadRR-12 4.87kB ± 0% 1.82kB ± 0% -62.73% (p=0.000 n=10+10) name old allocs/op new allocs/op delta NewRR-12 19.0 ± 0% 17.0 ± 0% -10.53% (p=0.000 n=10+10) ReadRR-12 19.0 ± 0% 19.0 ± 0% ~ (all equal) ReadRR-12 11.2µs ±16% 9.8µs ± 1% -13.03% (p=0.000 n=10+10) * Surface any remaining comment from zlexer.Next * Improve comment handling in zlexer.Next This both fixes a regression where comments could be lost under certain circumstances and now emits comments that occur within braces. * Remove outdated comment from zlexer.Next and klexer.Next * Delay converting LF to space in braced comment * Fixup TestParseZoneComments * Remove tokenUpper field from lex Not computing this for every token, and instead only when needed is a substantial performance improvement. name old time/op new time/op delta NewRR-12 9.56µs ± 0% 6.30µs ± 1% -34.08% (p=0.000 n=9+10) ReadRR-12 9.93µs ± 1% 6.67µs ± 1% -32.77% (p=0.000 n=10+10) name old alloc/op new alloc/op delta NewRR-12 824B ± 0% 808B ± 0% -1.94% (p=0.000 n=10+10) ReadRR-12 1.83kB ± 0% 1.82kB ± 0% -0.87% (p=0.000 n=10+10) name old allocs/op new allocs/op delta NewRR-12 17.0 ± 0% 17.0 ± 0% ~ (all equal) ReadRR-12 19.0 ± 0% 19.0 ± 0% ~ (all equal) * Update ParseZone documentation to match comment changes The zlexer code was changed to return comments more often, so update the ParseZone documentation to match.
2018-10-15 07:12:31 +00:00
var k string
c := newKLexer(r)
for l, ok := c.Next(); ok; l, ok = c.Next() {
2011-12-16 13:48:30 +00:00
// It should alternate
2011-12-15 16:38:14 +00:00
switch l.value {
2015-02-19 10:45:59 +00:00
case zKey:
2011-12-16 13:48:30 +00:00
k = l.token
2015-02-19 10:45:59 +00:00
case zValue:
2011-12-16 16:32:15 +00:00
if k == "" {
2012-02-15 22:04:46 +00:00
return nil, &ParseError{file, "no private key seen", l}
2011-12-16 16:32:15 +00:00
}
Eliminate lexer goroutines (#792) * Eliminate zlexer goroutine This replaces the zlexer goroutine and channels with a zlexer struct that maintains state and provides a channel-like API. * Eliminate klexer goroutine This replaces the klexer goroutine and channels with a klexer struct that maintains state and provides a channel-like API. * Merge scan into zlexer and klexer This does result in tokenText existing twice, but it's pretty simple and small so it's not that bad. * Avoid using text/scanner.Position to track position * Track escape within zlexer.Next * Avoid zl.commt check on space and tab in zlexer * Track stri within zlexer.Next * Track comi within zlexer.Next There is one special case at the start of a comment that needs to be handled, otherwise this is as simple as stri was. * Use a single token buffer in zlexer This is safe as there is never both a non-empty string buffer and a non-empty comment buffer. * Don't hardcode length of zl.tok in zlexer * Eliminate lex.length field This is always set to len(l.token) and is only queried in a few places. It was added in 47cc5b052df9b4e5d5a9900cfdd622607de10a6d without any obvious need. * Add whitespace to klexer.Next * Track lex within klexer.Next * Use a strings.Builder in klexer.Next * Simplify : case in klexer.Next * Add whitespace to zlexer.Next * Change for loop style in zlexer.Next and klexer.Next * Surface read errors in zlexer * Surface read errors from klexer * Remove debug line from parseKey * Rename tokenText to readByte * Make readByte return ok bool Also change the for loop style to match the Next for loops. * Make readByte errors sticky klexer.Next calls readByte separately from within the loop. Without readByte being sticky, an error that occurs during that readByte call may be lost. * Panic in testRR if the error is non-nil * Add whitespace and unify field setting in zlexer.Next * Remove eof fields from zlexer and klexer With readByte having sticky errors, this no longer needed. zl.eof = true was also in the wrong place and could mask an unbalanced brace error. * Merge zl.tok blocks in zlexer.Next * Split the tok buffer into separate string and comment buffers The invariant of stri > 0 && comi > 0 never being true was broken when x == '\n' && !zl.quote && zl.commt && zl.brace != 0 (the "If not in a brace this ends the comment AND the RR" block). Split the buffer back out into two separate buffers to avoid clobbering. * Replace token slices with arrays in zlexer * Add a NewRR benchmark * Move token buffers into zlexer.Next These don't need to be retained across Next calls and can be stack allocated inside Next. This drastically reduces memory consumption as they accounted for nearly half of all the memory used. name old alloc/op new alloc/op delta NewRR-12 9.72kB ± 0% 4.98kB ± 0% -48.72% (p=0.000 n=10+10) * Add a ReadRR benchmark Unlike NewRR, this will use an io.Reader that does not implement any methods aside from Read. In particular it does not implement io.ByteReader. * Avoid using a bufio.Reader for io.ByteReader readers At the same time use a smaller buffer size of 1KiB rather than the bufio.NewReader default of 4KiB. name old time/op new time/op delta NewRR-12 11.0µs ± 3% 9.5µs ± 2% -13.77% (p=0.000 n=9+10) ReadRR-12 11.2µs ±16% 9.8µs ± 1% -13.03% (p=0.000 n=10+10) name old alloc/op new alloc/op delta NewRR-12 4.98kB ± 0% 0.81kB ± 0% -83.79% (p=0.000 n=10+10) ReadRR-12 4.87kB ± 0% 1.82kB ± 0% -62.73% (p=0.000 n=10+10) name old allocs/op new allocs/op delta NewRR-12 19.0 ± 0% 17.0 ± 0% -10.53% (p=0.000 n=10+10) ReadRR-12 19.0 ± 0% 19.0 ± 0% ~ (all equal) ReadRR-12 11.2µs ±16% 9.8µs ± 1% -13.03% (p=0.000 n=10+10) * Surface any remaining comment from zlexer.Next * Improve comment handling in zlexer.Next This both fixes a regression where comments could be lost under certain circumstances and now emits comments that occur within braces. * Remove outdated comment from zlexer.Next and klexer.Next * Delay converting LF to space in braced comment * Fixup TestParseZoneComments * Remove tokenUpper field from lex Not computing this for every token, and instead only when needed is a substantial performance improvement. name old time/op new time/op delta NewRR-12 9.56µs ± 0% 6.30µs ± 1% -34.08% (p=0.000 n=9+10) ReadRR-12 9.93µs ± 1% 6.67µs ± 1% -32.77% (p=0.000 n=10+10) name old alloc/op new alloc/op delta NewRR-12 824B ± 0% 808B ± 0% -1.94% (p=0.000 n=10+10) ReadRR-12 1.83kB ± 0% 1.82kB ± 0% -0.87% (p=0.000 n=10+10) name old allocs/op new allocs/op delta NewRR-12 17.0 ± 0% 17.0 ± 0% ~ (all equal) ReadRR-12 19.0 ± 0% 19.0 ± 0% ~ (all equal) * Update ParseZone documentation to match comment changes The zlexer code was changed to return comments more often, so update the ParseZone documentation to match.
2018-10-15 07:12:31 +00:00
2011-12-16 16:32:15 +00:00
m[strings.ToLower(k)] = l.token
k = ""
2011-12-15 16:38:14 +00:00
}
}
Eliminate lexer goroutines (#792) * Eliminate zlexer goroutine This replaces the zlexer goroutine and channels with a zlexer struct that maintains state and provides a channel-like API. * Eliminate klexer goroutine This replaces the klexer goroutine and channels with a klexer struct that maintains state and provides a channel-like API. * Merge scan into zlexer and klexer This does result in tokenText existing twice, but it's pretty simple and small so it's not that bad. * Avoid using text/scanner.Position to track position * Track escape within zlexer.Next * Avoid zl.commt check on space and tab in zlexer * Track stri within zlexer.Next * Track comi within zlexer.Next There is one special case at the start of a comment that needs to be handled, otherwise this is as simple as stri was. * Use a single token buffer in zlexer This is safe as there is never both a non-empty string buffer and a non-empty comment buffer. * Don't hardcode length of zl.tok in zlexer * Eliminate lex.length field This is always set to len(l.token) and is only queried in a few places. It was added in 47cc5b052df9b4e5d5a9900cfdd622607de10a6d without any obvious need. * Add whitespace to klexer.Next * Track lex within klexer.Next * Use a strings.Builder in klexer.Next * Simplify : case in klexer.Next * Add whitespace to zlexer.Next * Change for loop style in zlexer.Next and klexer.Next * Surface read errors in zlexer * Surface read errors from klexer * Remove debug line from parseKey * Rename tokenText to readByte * Make readByte return ok bool Also change the for loop style to match the Next for loops. * Make readByte errors sticky klexer.Next calls readByte separately from within the loop. Without readByte being sticky, an error that occurs during that readByte call may be lost. * Panic in testRR if the error is non-nil * Add whitespace and unify field setting in zlexer.Next * Remove eof fields from zlexer and klexer With readByte having sticky errors, this no longer needed. zl.eof = true was also in the wrong place and could mask an unbalanced brace error. * Merge zl.tok blocks in zlexer.Next * Split the tok buffer into separate string and comment buffers The invariant of stri > 0 && comi > 0 never being true was broken when x == '\n' && !zl.quote && zl.commt && zl.brace != 0 (the "If not in a brace this ends the comment AND the RR" block). Split the buffer back out into two separate buffers to avoid clobbering. * Replace token slices with arrays in zlexer * Add a NewRR benchmark * Move token buffers into zlexer.Next These don't need to be retained across Next calls and can be stack allocated inside Next. This drastically reduces memory consumption as they accounted for nearly half of all the memory used. name old alloc/op new alloc/op delta NewRR-12 9.72kB ± 0% 4.98kB ± 0% -48.72% (p=0.000 n=10+10) * Add a ReadRR benchmark Unlike NewRR, this will use an io.Reader that does not implement any methods aside from Read. In particular it does not implement io.ByteReader. * Avoid using a bufio.Reader for io.ByteReader readers At the same time use a smaller buffer size of 1KiB rather than the bufio.NewReader default of 4KiB. name old time/op new time/op delta NewRR-12 11.0µs ± 3% 9.5µs ± 2% -13.77% (p=0.000 n=9+10) ReadRR-12 11.2µs ±16% 9.8µs ± 1% -13.03% (p=0.000 n=10+10) name old alloc/op new alloc/op delta NewRR-12 4.98kB ± 0% 0.81kB ± 0% -83.79% (p=0.000 n=10+10) ReadRR-12 4.87kB ± 0% 1.82kB ± 0% -62.73% (p=0.000 n=10+10) name old allocs/op new allocs/op delta NewRR-12 19.0 ± 0% 17.0 ± 0% -10.53% (p=0.000 n=10+10) ReadRR-12 19.0 ± 0% 19.0 ± 0% ~ (all equal) ReadRR-12 11.2µs ±16% 9.8µs ± 1% -13.03% (p=0.000 n=10+10) * Surface any remaining comment from zlexer.Next * Improve comment handling in zlexer.Next This both fixes a regression where comments could be lost under certain circumstances and now emits comments that occur within braces. * Remove outdated comment from zlexer.Next and klexer.Next * Delay converting LF to space in braced comment * Fixup TestParseZoneComments * Remove tokenUpper field from lex Not computing this for every token, and instead only when needed is a substantial performance improvement. name old time/op new time/op delta NewRR-12 9.56µs ± 0% 6.30µs ± 1% -34.08% (p=0.000 n=9+10) ReadRR-12 9.93µs ± 1% 6.67µs ± 1% -32.77% (p=0.000 n=10+10) name old alloc/op new alloc/op delta NewRR-12 824B ± 0% 808B ± 0% -1.94% (p=0.000 n=10+10) ReadRR-12 1.83kB ± 0% 1.82kB ± 0% -0.87% (p=0.000 n=10+10) name old allocs/op new allocs/op delta NewRR-12 17.0 ± 0% 17.0 ± 0% ~ (all equal) ReadRR-12 19.0 ± 0% 19.0 ± 0% ~ (all equal) * Update ParseZone documentation to match comment changes The zlexer code was changed to return comments more often, so update the ParseZone documentation to match.
2018-10-15 07:12:31 +00:00
// Surface any read errors from r.
if err := c.Err(); err != nil {
return nil, &ParseError{file: file, err: err.Error()}
}
2011-12-16 13:48:30 +00:00
return m, nil
2011-12-15 16:38:14 +00:00
}
Eliminate lexer goroutines (#792) * Eliminate zlexer goroutine This replaces the zlexer goroutine and channels with a zlexer struct that maintains state and provides a channel-like API. * Eliminate klexer goroutine This replaces the klexer goroutine and channels with a klexer struct that maintains state and provides a channel-like API. * Merge scan into zlexer and klexer This does result in tokenText existing twice, but it's pretty simple and small so it's not that bad. * Avoid using text/scanner.Position to track position * Track escape within zlexer.Next * Avoid zl.commt check on space and tab in zlexer * Track stri within zlexer.Next * Track comi within zlexer.Next There is one special case at the start of a comment that needs to be handled, otherwise this is as simple as stri was. * Use a single token buffer in zlexer This is safe as there is never both a non-empty string buffer and a non-empty comment buffer. * Don't hardcode length of zl.tok in zlexer * Eliminate lex.length field This is always set to len(l.token) and is only queried in a few places. It was added in 47cc5b052df9b4e5d5a9900cfdd622607de10a6d without any obvious need. * Add whitespace to klexer.Next * Track lex within klexer.Next * Use a strings.Builder in klexer.Next * Simplify : case in klexer.Next * Add whitespace to zlexer.Next * Change for loop style in zlexer.Next and klexer.Next * Surface read errors in zlexer * Surface read errors from klexer * Remove debug line from parseKey * Rename tokenText to readByte * Make readByte return ok bool Also change the for loop style to match the Next for loops. * Make readByte errors sticky klexer.Next calls readByte separately from within the loop. Without readByte being sticky, an error that occurs during that readByte call may be lost. * Panic in testRR if the error is non-nil * Add whitespace and unify field setting in zlexer.Next * Remove eof fields from zlexer and klexer With readByte having sticky errors, this no longer needed. zl.eof = true was also in the wrong place and could mask an unbalanced brace error. * Merge zl.tok blocks in zlexer.Next * Split the tok buffer into separate string and comment buffers The invariant of stri > 0 && comi > 0 never being true was broken when x == '\n' && !zl.quote && zl.commt && zl.brace != 0 (the "If not in a brace this ends the comment AND the RR" block). Split the buffer back out into two separate buffers to avoid clobbering. * Replace token slices with arrays in zlexer * Add a NewRR benchmark * Move token buffers into zlexer.Next These don't need to be retained across Next calls and can be stack allocated inside Next. This drastically reduces memory consumption as they accounted for nearly half of all the memory used. name old alloc/op new alloc/op delta NewRR-12 9.72kB ± 0% 4.98kB ± 0% -48.72% (p=0.000 n=10+10) * Add a ReadRR benchmark Unlike NewRR, this will use an io.Reader that does not implement any methods aside from Read. In particular it does not implement io.ByteReader. * Avoid using a bufio.Reader for io.ByteReader readers At the same time use a smaller buffer size of 1KiB rather than the bufio.NewReader default of 4KiB. name old time/op new time/op delta NewRR-12 11.0µs ± 3% 9.5µs ± 2% -13.77% (p=0.000 n=9+10) ReadRR-12 11.2µs ±16% 9.8µs ± 1% -13.03% (p=0.000 n=10+10) name old alloc/op new alloc/op delta NewRR-12 4.98kB ± 0% 0.81kB ± 0% -83.79% (p=0.000 n=10+10) ReadRR-12 4.87kB ± 0% 1.82kB ± 0% -62.73% (p=0.000 n=10+10) name old allocs/op new allocs/op delta NewRR-12 19.0 ± 0% 17.0 ± 0% -10.53% (p=0.000 n=10+10) ReadRR-12 19.0 ± 0% 19.0 ± 0% ~ (all equal) ReadRR-12 11.2µs ±16% 9.8µs ± 1% -13.03% (p=0.000 n=10+10) * Surface any remaining comment from zlexer.Next * Improve comment handling in zlexer.Next This both fixes a regression where comments could be lost under certain circumstances and now emits comments that occur within braces. * Remove outdated comment from zlexer.Next and klexer.Next * Delay converting LF to space in braced comment * Fixup TestParseZoneComments * Remove tokenUpper field from lex Not computing this for every token, and instead only when needed is a substantial performance improvement. name old time/op new time/op delta NewRR-12 9.56µs ± 0% 6.30µs ± 1% -34.08% (p=0.000 n=9+10) ReadRR-12 9.93µs ± 1% 6.67µs ± 1% -32.77% (p=0.000 n=10+10) name old alloc/op new alloc/op delta NewRR-12 824B ± 0% 808B ± 0% -1.94% (p=0.000 n=10+10) ReadRR-12 1.83kB ± 0% 1.82kB ± 0% -0.87% (p=0.000 n=10+10) name old allocs/op new allocs/op delta NewRR-12 17.0 ± 0% 17.0 ± 0% ~ (all equal) ReadRR-12 19.0 ± 0% 19.0 ± 0% ~ (all equal) * Update ParseZone documentation to match comment changes The zlexer code was changed to return comments more often, so update the ParseZone documentation to match.
2018-10-15 07:12:31 +00:00
type klexer struct {
br io.ByteReader
readErr error
line int
column int
key bool
eol bool // end-of-line
}
func newKLexer(r io.Reader) *klexer {
br, ok := r.(io.ByteReader)
if !ok {
br = bufio.NewReaderSize(r, 1024)
}
return &klexer{
br: br,
line: 1,
key: true,
}
}
func (kl *klexer) Err() error {
if kl.readErr == io.EOF {
return nil
}
return kl.readErr
}
// readByte returns the next byte from the input
func (kl *klexer) readByte() (byte, bool) {
if kl.readErr != nil {
return 0, false
}
c, err := kl.br.ReadByte()
if err != nil {
kl.readErr = err
return 0, false
}
// delay the newline handling until the next token is delivered,
// fixes off-by-one errors when reporting a parse error.
if kl.eol {
kl.line++
kl.column = 0
kl.eol = false
}
if c == '\n' {
kl.eol = true
} else {
kl.column++
}
return c, true
}
func (kl *klexer) Next() (lex, bool) {
var (
l lex
str strings.Builder
commt bool
)
for x, ok := kl.readByte(); ok; x, ok = kl.readByte() {
l.line, l.column = kl.line, kl.column
switch x {
case ':':
Eliminate lexer goroutines (#792) * Eliminate zlexer goroutine This replaces the zlexer goroutine and channels with a zlexer struct that maintains state and provides a channel-like API. * Eliminate klexer goroutine This replaces the klexer goroutine and channels with a klexer struct that maintains state and provides a channel-like API. * Merge scan into zlexer and klexer This does result in tokenText existing twice, but it's pretty simple and small so it's not that bad. * Avoid using text/scanner.Position to track position * Track escape within zlexer.Next * Avoid zl.commt check on space and tab in zlexer * Track stri within zlexer.Next * Track comi within zlexer.Next There is one special case at the start of a comment that needs to be handled, otherwise this is as simple as stri was. * Use a single token buffer in zlexer This is safe as there is never both a non-empty string buffer and a non-empty comment buffer. * Don't hardcode length of zl.tok in zlexer * Eliminate lex.length field This is always set to len(l.token) and is only queried in a few places. It was added in 47cc5b052df9b4e5d5a9900cfdd622607de10a6d without any obvious need. * Add whitespace to klexer.Next * Track lex within klexer.Next * Use a strings.Builder in klexer.Next * Simplify : case in klexer.Next * Add whitespace to zlexer.Next * Change for loop style in zlexer.Next and klexer.Next * Surface read errors in zlexer * Surface read errors from klexer * Remove debug line from parseKey * Rename tokenText to readByte * Make readByte return ok bool Also change the for loop style to match the Next for loops. * Make readByte errors sticky klexer.Next calls readByte separately from within the loop. Without readByte being sticky, an error that occurs during that readByte call may be lost. * Panic in testRR if the error is non-nil * Add whitespace and unify field setting in zlexer.Next * Remove eof fields from zlexer and klexer With readByte having sticky errors, this no longer needed. zl.eof = true was also in the wrong place and could mask an unbalanced brace error. * Merge zl.tok blocks in zlexer.Next * Split the tok buffer into separate string and comment buffers The invariant of stri > 0 && comi > 0 never being true was broken when x == '\n' && !zl.quote && zl.commt && zl.brace != 0 (the "If not in a brace this ends the comment AND the RR" block). Split the buffer back out into two separate buffers to avoid clobbering. * Replace token slices with arrays in zlexer * Add a NewRR benchmark * Move token buffers into zlexer.Next These don't need to be retained across Next calls and can be stack allocated inside Next. This drastically reduces memory consumption as they accounted for nearly half of all the memory used. name old alloc/op new alloc/op delta NewRR-12 9.72kB ± 0% 4.98kB ± 0% -48.72% (p=0.000 n=10+10) * Add a ReadRR benchmark Unlike NewRR, this will use an io.Reader that does not implement any methods aside from Read. In particular it does not implement io.ByteReader. * Avoid using a bufio.Reader for io.ByteReader readers At the same time use a smaller buffer size of 1KiB rather than the bufio.NewReader default of 4KiB. name old time/op new time/op delta NewRR-12 11.0µs ± 3% 9.5µs ± 2% -13.77% (p=0.000 n=9+10) ReadRR-12 11.2µs ±16% 9.8µs ± 1% -13.03% (p=0.000 n=10+10) name old alloc/op new alloc/op delta NewRR-12 4.98kB ± 0% 0.81kB ± 0% -83.79% (p=0.000 n=10+10) ReadRR-12 4.87kB ± 0% 1.82kB ± 0% -62.73% (p=0.000 n=10+10) name old allocs/op new allocs/op delta NewRR-12 19.0 ± 0% 17.0 ± 0% -10.53% (p=0.000 n=10+10) ReadRR-12 19.0 ± 0% 19.0 ± 0% ~ (all equal) ReadRR-12 11.2µs ±16% 9.8µs ± 1% -13.03% (p=0.000 n=10+10) * Surface any remaining comment from zlexer.Next * Improve comment handling in zlexer.Next This both fixes a regression where comments could be lost under certain circumstances and now emits comments that occur within braces. * Remove outdated comment from zlexer.Next and klexer.Next * Delay converting LF to space in braced comment * Fixup TestParseZoneComments * Remove tokenUpper field from lex Not computing this for every token, and instead only when needed is a substantial performance improvement. name old time/op new time/op delta NewRR-12 9.56µs ± 0% 6.30µs ± 1% -34.08% (p=0.000 n=9+10) ReadRR-12 9.93µs ± 1% 6.67µs ± 1% -32.77% (p=0.000 n=10+10) name old alloc/op new alloc/op delta NewRR-12 824B ± 0% 808B ± 0% -1.94% (p=0.000 n=10+10) ReadRR-12 1.83kB ± 0% 1.82kB ± 0% -0.87% (p=0.000 n=10+10) name old allocs/op new allocs/op delta NewRR-12 17.0 ± 0% 17.0 ± 0% ~ (all equal) ReadRR-12 19.0 ± 0% 19.0 ± 0% ~ (all equal) * Update ParseZone documentation to match comment changes The zlexer code was changed to return comments more often, so update the ParseZone documentation to match.
2018-10-15 07:12:31 +00:00
if commt || !kl.key {
2011-12-15 16:38:14 +00:00
break
}
Eliminate lexer goroutines (#792) * Eliminate zlexer goroutine This replaces the zlexer goroutine and channels with a zlexer struct that maintains state and provides a channel-like API. * Eliminate klexer goroutine This replaces the klexer goroutine and channels with a klexer struct that maintains state and provides a channel-like API. * Merge scan into zlexer and klexer This does result in tokenText existing twice, but it's pretty simple and small so it's not that bad. * Avoid using text/scanner.Position to track position * Track escape within zlexer.Next * Avoid zl.commt check on space and tab in zlexer * Track stri within zlexer.Next * Track comi within zlexer.Next There is one special case at the start of a comment that needs to be handled, otherwise this is as simple as stri was. * Use a single token buffer in zlexer This is safe as there is never both a non-empty string buffer and a non-empty comment buffer. * Don't hardcode length of zl.tok in zlexer * Eliminate lex.length field This is always set to len(l.token) and is only queried in a few places. It was added in 47cc5b052df9b4e5d5a9900cfdd622607de10a6d without any obvious need. * Add whitespace to klexer.Next * Track lex within klexer.Next * Use a strings.Builder in klexer.Next * Simplify : case in klexer.Next * Add whitespace to zlexer.Next * Change for loop style in zlexer.Next and klexer.Next * Surface read errors in zlexer * Surface read errors from klexer * Remove debug line from parseKey * Rename tokenText to readByte * Make readByte return ok bool Also change the for loop style to match the Next for loops. * Make readByte errors sticky klexer.Next calls readByte separately from within the loop. Without readByte being sticky, an error that occurs during that readByte call may be lost. * Panic in testRR if the error is non-nil * Add whitespace and unify field setting in zlexer.Next * Remove eof fields from zlexer and klexer With readByte having sticky errors, this no longer needed. zl.eof = true was also in the wrong place and could mask an unbalanced brace error. * Merge zl.tok blocks in zlexer.Next * Split the tok buffer into separate string and comment buffers The invariant of stri > 0 && comi > 0 never being true was broken when x == '\n' && !zl.quote && zl.commt && zl.brace != 0 (the "If not in a brace this ends the comment AND the RR" block). Split the buffer back out into two separate buffers to avoid clobbering. * Replace token slices with arrays in zlexer * Add a NewRR benchmark * Move token buffers into zlexer.Next These don't need to be retained across Next calls and can be stack allocated inside Next. This drastically reduces memory consumption as they accounted for nearly half of all the memory used. name old alloc/op new alloc/op delta NewRR-12 9.72kB ± 0% 4.98kB ± 0% -48.72% (p=0.000 n=10+10) * Add a ReadRR benchmark Unlike NewRR, this will use an io.Reader that does not implement any methods aside from Read. In particular it does not implement io.ByteReader. * Avoid using a bufio.Reader for io.ByteReader readers At the same time use a smaller buffer size of 1KiB rather than the bufio.NewReader default of 4KiB. name old time/op new time/op delta NewRR-12 11.0µs ± 3% 9.5µs ± 2% -13.77% (p=0.000 n=9+10) ReadRR-12 11.2µs ±16% 9.8µs ± 1% -13.03% (p=0.000 n=10+10) name old alloc/op new alloc/op delta NewRR-12 4.98kB ± 0% 0.81kB ± 0% -83.79% (p=0.000 n=10+10) ReadRR-12 4.87kB ± 0% 1.82kB ± 0% -62.73% (p=0.000 n=10+10) name old allocs/op new allocs/op delta NewRR-12 19.0 ± 0% 17.0 ± 0% -10.53% (p=0.000 n=10+10) ReadRR-12 19.0 ± 0% 19.0 ± 0% ~ (all equal) ReadRR-12 11.2µs ±16% 9.8µs ± 1% -13.03% (p=0.000 n=10+10) * Surface any remaining comment from zlexer.Next * Improve comment handling in zlexer.Next This both fixes a regression where comments could be lost under certain circumstances and now emits comments that occur within braces. * Remove outdated comment from zlexer.Next and klexer.Next * Delay converting LF to space in braced comment * Fixup TestParseZoneComments * Remove tokenUpper field from lex Not computing this for every token, and instead only when needed is a substantial performance improvement. name old time/op new time/op delta NewRR-12 9.56µs ± 0% 6.30µs ± 1% -34.08% (p=0.000 n=9+10) ReadRR-12 9.93µs ± 1% 6.67µs ± 1% -32.77% (p=0.000 n=10+10) name old alloc/op new alloc/op delta NewRR-12 824B ± 0% 808B ± 0% -1.94% (p=0.000 n=10+10) ReadRR-12 1.83kB ± 0% 1.82kB ± 0% -0.87% (p=0.000 n=10+10) name old allocs/op new allocs/op delta NewRR-12 17.0 ± 0% 17.0 ± 0% ~ (all equal) ReadRR-12 19.0 ± 0% 19.0 ± 0% ~ (all equal) * Update ParseZone documentation to match comment changes The zlexer code was changed to return comments more often, so update the ParseZone documentation to match.
2018-10-15 07:12:31 +00:00
kl.key = false
// Next token is a space, eat it
kl.readByte()
l.value = zKey
l.token = str.String()
return l, true
case ';':
2011-12-15 16:38:14 +00:00
commt = true
case '\n':
2011-12-15 16:38:14 +00:00
if commt {
// Reset a comment
commt = false
}
Eliminate lexer goroutines (#792) * Eliminate zlexer goroutine This replaces the zlexer goroutine and channels with a zlexer struct that maintains state and provides a channel-like API. * Eliminate klexer goroutine This replaces the klexer goroutine and channels with a klexer struct that maintains state and provides a channel-like API. * Merge scan into zlexer and klexer This does result in tokenText existing twice, but it's pretty simple and small so it's not that bad. * Avoid using text/scanner.Position to track position * Track escape within zlexer.Next * Avoid zl.commt check on space and tab in zlexer * Track stri within zlexer.Next * Track comi within zlexer.Next There is one special case at the start of a comment that needs to be handled, otherwise this is as simple as stri was. * Use a single token buffer in zlexer This is safe as there is never both a non-empty string buffer and a non-empty comment buffer. * Don't hardcode length of zl.tok in zlexer * Eliminate lex.length field This is always set to len(l.token) and is only queried in a few places. It was added in 47cc5b052df9b4e5d5a9900cfdd622607de10a6d without any obvious need. * Add whitespace to klexer.Next * Track lex within klexer.Next * Use a strings.Builder in klexer.Next * Simplify : case in klexer.Next * Add whitespace to zlexer.Next * Change for loop style in zlexer.Next and klexer.Next * Surface read errors in zlexer * Surface read errors from klexer * Remove debug line from parseKey * Rename tokenText to readByte * Make readByte return ok bool Also change the for loop style to match the Next for loops. * Make readByte errors sticky klexer.Next calls readByte separately from within the loop. Without readByte being sticky, an error that occurs during that readByte call may be lost. * Panic in testRR if the error is non-nil * Add whitespace and unify field setting in zlexer.Next * Remove eof fields from zlexer and klexer With readByte having sticky errors, this no longer needed. zl.eof = true was also in the wrong place and could mask an unbalanced brace error. * Merge zl.tok blocks in zlexer.Next * Split the tok buffer into separate string and comment buffers The invariant of stri > 0 && comi > 0 never being true was broken when x == '\n' && !zl.quote && zl.commt && zl.brace != 0 (the "If not in a brace this ends the comment AND the RR" block). Split the buffer back out into two separate buffers to avoid clobbering. * Replace token slices with arrays in zlexer * Add a NewRR benchmark * Move token buffers into zlexer.Next These don't need to be retained across Next calls and can be stack allocated inside Next. This drastically reduces memory consumption as they accounted for nearly half of all the memory used. name old alloc/op new alloc/op delta NewRR-12 9.72kB ± 0% 4.98kB ± 0% -48.72% (p=0.000 n=10+10) * Add a ReadRR benchmark Unlike NewRR, this will use an io.Reader that does not implement any methods aside from Read. In particular it does not implement io.ByteReader. * Avoid using a bufio.Reader for io.ByteReader readers At the same time use a smaller buffer size of 1KiB rather than the bufio.NewReader default of 4KiB. name old time/op new time/op delta NewRR-12 11.0µs ± 3% 9.5µs ± 2% -13.77% (p=0.000 n=9+10) ReadRR-12 11.2µs ±16% 9.8µs ± 1% -13.03% (p=0.000 n=10+10) name old alloc/op new alloc/op delta NewRR-12 4.98kB ± 0% 0.81kB ± 0% -83.79% (p=0.000 n=10+10) ReadRR-12 4.87kB ± 0% 1.82kB ± 0% -62.73% (p=0.000 n=10+10) name old allocs/op new allocs/op delta NewRR-12 19.0 ± 0% 17.0 ± 0% -10.53% (p=0.000 n=10+10) ReadRR-12 19.0 ± 0% 19.0 ± 0% ~ (all equal) ReadRR-12 11.2µs ±16% 9.8µs ± 1% -13.03% (p=0.000 n=10+10) * Surface any remaining comment from zlexer.Next * Improve comment handling in zlexer.Next This both fixes a regression where comments could be lost under certain circumstances and now emits comments that occur within braces. * Remove outdated comment from zlexer.Next and klexer.Next * Delay converting LF to space in braced comment * Fixup TestParseZoneComments * Remove tokenUpper field from lex Not computing this for every token, and instead only when needed is a substantial performance improvement. name old time/op new time/op delta NewRR-12 9.56µs ± 0% 6.30µs ± 1% -34.08% (p=0.000 n=9+10) ReadRR-12 9.93µs ± 1% 6.67µs ± 1% -32.77% (p=0.000 n=10+10) name old alloc/op new alloc/op delta NewRR-12 824B ± 0% 808B ± 0% -1.94% (p=0.000 n=10+10) ReadRR-12 1.83kB ± 0% 1.82kB ± 0% -0.87% (p=0.000 n=10+10) name old allocs/op new allocs/op delta NewRR-12 17.0 ± 0% 17.0 ± 0% ~ (all equal) ReadRR-12 19.0 ± 0% 19.0 ± 0% ~ (all equal) * Update ParseZone documentation to match comment changes The zlexer code was changed to return comments more often, so update the ParseZone documentation to match.
2018-10-15 07:12:31 +00:00
if kl.key && str.Len() == 0 {
// ignore empty lines
break
}
Eliminate lexer goroutines (#792) * Eliminate zlexer goroutine This replaces the zlexer goroutine and channels with a zlexer struct that maintains state and provides a channel-like API. * Eliminate klexer goroutine This replaces the klexer goroutine and channels with a klexer struct that maintains state and provides a channel-like API. * Merge scan into zlexer and klexer This does result in tokenText existing twice, but it's pretty simple and small so it's not that bad. * Avoid using text/scanner.Position to track position * Track escape within zlexer.Next * Avoid zl.commt check on space and tab in zlexer * Track stri within zlexer.Next * Track comi within zlexer.Next There is one special case at the start of a comment that needs to be handled, otherwise this is as simple as stri was. * Use a single token buffer in zlexer This is safe as there is never both a non-empty string buffer and a non-empty comment buffer. * Don't hardcode length of zl.tok in zlexer * Eliminate lex.length field This is always set to len(l.token) and is only queried in a few places. It was added in 47cc5b052df9b4e5d5a9900cfdd622607de10a6d without any obvious need. * Add whitespace to klexer.Next * Track lex within klexer.Next * Use a strings.Builder in klexer.Next * Simplify : case in klexer.Next * Add whitespace to zlexer.Next * Change for loop style in zlexer.Next and klexer.Next * Surface read errors in zlexer * Surface read errors from klexer * Remove debug line from parseKey * Rename tokenText to readByte * Make readByte return ok bool Also change the for loop style to match the Next for loops. * Make readByte errors sticky klexer.Next calls readByte separately from within the loop. Without readByte being sticky, an error that occurs during that readByte call may be lost. * Panic in testRR if the error is non-nil * Add whitespace and unify field setting in zlexer.Next * Remove eof fields from zlexer and klexer With readByte having sticky errors, this no longer needed. zl.eof = true was also in the wrong place and could mask an unbalanced brace error. * Merge zl.tok blocks in zlexer.Next * Split the tok buffer into separate string and comment buffers The invariant of stri > 0 && comi > 0 never being true was broken when x == '\n' && !zl.quote && zl.commt && zl.brace != 0 (the "If not in a brace this ends the comment AND the RR" block). Split the buffer back out into two separate buffers to avoid clobbering. * Replace token slices with arrays in zlexer * Add a NewRR benchmark * Move token buffers into zlexer.Next These don't need to be retained across Next calls and can be stack allocated inside Next. This drastically reduces memory consumption as they accounted for nearly half of all the memory used. name old alloc/op new alloc/op delta NewRR-12 9.72kB ± 0% 4.98kB ± 0% -48.72% (p=0.000 n=10+10) * Add a ReadRR benchmark Unlike NewRR, this will use an io.Reader that does not implement any methods aside from Read. In particular it does not implement io.ByteReader. * Avoid using a bufio.Reader for io.ByteReader readers At the same time use a smaller buffer size of 1KiB rather than the bufio.NewReader default of 4KiB. name old time/op new time/op delta NewRR-12 11.0µs ± 3% 9.5µs ± 2% -13.77% (p=0.000 n=9+10) ReadRR-12 11.2µs ±16% 9.8µs ± 1% -13.03% (p=0.000 n=10+10) name old alloc/op new alloc/op delta NewRR-12 4.98kB ± 0% 0.81kB ± 0% -83.79% (p=0.000 n=10+10) ReadRR-12 4.87kB ± 0% 1.82kB ± 0% -62.73% (p=0.000 n=10+10) name old allocs/op new allocs/op delta NewRR-12 19.0 ± 0% 17.0 ± 0% -10.53% (p=0.000 n=10+10) ReadRR-12 19.0 ± 0% 19.0 ± 0% ~ (all equal) ReadRR-12 11.2µs ±16% 9.8µs ± 1% -13.03% (p=0.000 n=10+10) * Surface any remaining comment from zlexer.Next * Improve comment handling in zlexer.Next This both fixes a regression where comments could be lost under certain circumstances and now emits comments that occur within braces. * Remove outdated comment from zlexer.Next and klexer.Next * Delay converting LF to space in braced comment * Fixup TestParseZoneComments * Remove tokenUpper field from lex Not computing this for every token, and instead only when needed is a substantial performance improvement. name old time/op new time/op delta NewRR-12 9.56µs ± 0% 6.30µs ± 1% -34.08% (p=0.000 n=9+10) ReadRR-12 9.93µs ± 1% 6.67µs ± 1% -32.77% (p=0.000 n=10+10) name old alloc/op new alloc/op delta NewRR-12 824B ± 0% 808B ± 0% -1.94% (p=0.000 n=10+10) ReadRR-12 1.83kB ± 0% 1.82kB ± 0% -0.87% (p=0.000 n=10+10) name old allocs/op new allocs/op delta NewRR-12 17.0 ± 0% 17.0 ± 0% ~ (all equal) ReadRR-12 19.0 ± 0% 19.0 ± 0% ~ (all equal) * Update ParseZone documentation to match comment changes The zlexer code was changed to return comments more often, so update the ParseZone documentation to match.
2018-10-15 07:12:31 +00:00
kl.key = true
2015-02-19 10:45:59 +00:00
l.value = zValue
Eliminate lexer goroutines (#792) * Eliminate zlexer goroutine This replaces the zlexer goroutine and channels with a zlexer struct that maintains state and provides a channel-like API. * Eliminate klexer goroutine This replaces the klexer goroutine and channels with a klexer struct that maintains state and provides a channel-like API. * Merge scan into zlexer and klexer This does result in tokenText existing twice, but it's pretty simple and small so it's not that bad. * Avoid using text/scanner.Position to track position * Track escape within zlexer.Next * Avoid zl.commt check on space and tab in zlexer * Track stri within zlexer.Next * Track comi within zlexer.Next There is one special case at the start of a comment that needs to be handled, otherwise this is as simple as stri was. * Use a single token buffer in zlexer This is safe as there is never both a non-empty string buffer and a non-empty comment buffer. * Don't hardcode length of zl.tok in zlexer * Eliminate lex.length field This is always set to len(l.token) and is only queried in a few places. It was added in 47cc5b052df9b4e5d5a9900cfdd622607de10a6d without any obvious need. * Add whitespace to klexer.Next * Track lex within klexer.Next * Use a strings.Builder in klexer.Next * Simplify : case in klexer.Next * Add whitespace to zlexer.Next * Change for loop style in zlexer.Next and klexer.Next * Surface read errors in zlexer * Surface read errors from klexer * Remove debug line from parseKey * Rename tokenText to readByte * Make readByte return ok bool Also change the for loop style to match the Next for loops. * Make readByte errors sticky klexer.Next calls readByte separately from within the loop. Without readByte being sticky, an error that occurs during that readByte call may be lost. * Panic in testRR if the error is non-nil * Add whitespace and unify field setting in zlexer.Next * Remove eof fields from zlexer and klexer With readByte having sticky errors, this no longer needed. zl.eof = true was also in the wrong place and could mask an unbalanced brace error. * Merge zl.tok blocks in zlexer.Next * Split the tok buffer into separate string and comment buffers The invariant of stri > 0 && comi > 0 never being true was broken when x == '\n' && !zl.quote && zl.commt && zl.brace != 0 (the "If not in a brace this ends the comment AND the RR" block). Split the buffer back out into two separate buffers to avoid clobbering. * Replace token slices with arrays in zlexer * Add a NewRR benchmark * Move token buffers into zlexer.Next These don't need to be retained across Next calls and can be stack allocated inside Next. This drastically reduces memory consumption as they accounted for nearly half of all the memory used. name old alloc/op new alloc/op delta NewRR-12 9.72kB ± 0% 4.98kB ± 0% -48.72% (p=0.000 n=10+10) * Add a ReadRR benchmark Unlike NewRR, this will use an io.Reader that does not implement any methods aside from Read. In particular it does not implement io.ByteReader. * Avoid using a bufio.Reader for io.ByteReader readers At the same time use a smaller buffer size of 1KiB rather than the bufio.NewReader default of 4KiB. name old time/op new time/op delta NewRR-12 11.0µs ± 3% 9.5µs ± 2% -13.77% (p=0.000 n=9+10) ReadRR-12 11.2µs ±16% 9.8µs ± 1% -13.03% (p=0.000 n=10+10) name old alloc/op new alloc/op delta NewRR-12 4.98kB ± 0% 0.81kB ± 0% -83.79% (p=0.000 n=10+10) ReadRR-12 4.87kB ± 0% 1.82kB ± 0% -62.73% (p=0.000 n=10+10) name old allocs/op new allocs/op delta NewRR-12 19.0 ± 0% 17.0 ± 0% -10.53% (p=0.000 n=10+10) ReadRR-12 19.0 ± 0% 19.0 ± 0% ~ (all equal) ReadRR-12 11.2µs ±16% 9.8µs ± 1% -13.03% (p=0.000 n=10+10) * Surface any remaining comment from zlexer.Next * Improve comment handling in zlexer.Next This both fixes a regression where comments could be lost under certain circumstances and now emits comments that occur within braces. * Remove outdated comment from zlexer.Next and klexer.Next * Delay converting LF to space in braced comment * Fixup TestParseZoneComments * Remove tokenUpper field from lex Not computing this for every token, and instead only when needed is a substantial performance improvement. name old time/op new time/op delta NewRR-12 9.56µs ± 0% 6.30µs ± 1% -34.08% (p=0.000 n=9+10) ReadRR-12 9.93µs ± 1% 6.67µs ± 1% -32.77% (p=0.000 n=10+10) name old alloc/op new alloc/op delta NewRR-12 824B ± 0% 808B ± 0% -1.94% (p=0.000 n=10+10) ReadRR-12 1.83kB ± 0% 1.82kB ± 0% -0.87% (p=0.000 n=10+10) name old allocs/op new allocs/op delta NewRR-12 17.0 ± 0% 17.0 ± 0% ~ (all equal) ReadRR-12 19.0 ± 0% 19.0 ± 0% ~ (all equal) * Update ParseZone documentation to match comment changes The zlexer code was changed to return comments more often, so update the ParseZone documentation to match.
2018-10-15 07:12:31 +00:00
l.token = str.String()
return l, true
2011-12-15 16:38:14 +00:00
default:
if commt {
break
}
Eliminate lexer goroutines (#792) * Eliminate zlexer goroutine This replaces the zlexer goroutine and channels with a zlexer struct that maintains state and provides a channel-like API. * Eliminate klexer goroutine This replaces the klexer goroutine and channels with a klexer struct that maintains state and provides a channel-like API. * Merge scan into zlexer and klexer This does result in tokenText existing twice, but it's pretty simple and small so it's not that bad. * Avoid using text/scanner.Position to track position * Track escape within zlexer.Next * Avoid zl.commt check on space and tab in zlexer * Track stri within zlexer.Next * Track comi within zlexer.Next There is one special case at the start of a comment that needs to be handled, otherwise this is as simple as stri was. * Use a single token buffer in zlexer This is safe as there is never both a non-empty string buffer and a non-empty comment buffer. * Don't hardcode length of zl.tok in zlexer * Eliminate lex.length field This is always set to len(l.token) and is only queried in a few places. It was added in 47cc5b052df9b4e5d5a9900cfdd622607de10a6d without any obvious need. * Add whitespace to klexer.Next * Track lex within klexer.Next * Use a strings.Builder in klexer.Next * Simplify : case in klexer.Next * Add whitespace to zlexer.Next * Change for loop style in zlexer.Next and klexer.Next * Surface read errors in zlexer * Surface read errors from klexer * Remove debug line from parseKey * Rename tokenText to readByte * Make readByte return ok bool Also change the for loop style to match the Next for loops. * Make readByte errors sticky klexer.Next calls readByte separately from within the loop. Without readByte being sticky, an error that occurs during that readByte call may be lost. * Panic in testRR if the error is non-nil * Add whitespace and unify field setting in zlexer.Next * Remove eof fields from zlexer and klexer With readByte having sticky errors, this no longer needed. zl.eof = true was also in the wrong place and could mask an unbalanced brace error. * Merge zl.tok blocks in zlexer.Next * Split the tok buffer into separate string and comment buffers The invariant of stri > 0 && comi > 0 never being true was broken when x == '\n' && !zl.quote && zl.commt && zl.brace != 0 (the "If not in a brace this ends the comment AND the RR" block). Split the buffer back out into two separate buffers to avoid clobbering. * Replace token slices with arrays in zlexer * Add a NewRR benchmark * Move token buffers into zlexer.Next These don't need to be retained across Next calls and can be stack allocated inside Next. This drastically reduces memory consumption as they accounted for nearly half of all the memory used. name old alloc/op new alloc/op delta NewRR-12 9.72kB ± 0% 4.98kB ± 0% -48.72% (p=0.000 n=10+10) * Add a ReadRR benchmark Unlike NewRR, this will use an io.Reader that does not implement any methods aside from Read. In particular it does not implement io.ByteReader. * Avoid using a bufio.Reader for io.ByteReader readers At the same time use a smaller buffer size of 1KiB rather than the bufio.NewReader default of 4KiB. name old time/op new time/op delta NewRR-12 11.0µs ± 3% 9.5µs ± 2% -13.77% (p=0.000 n=9+10) ReadRR-12 11.2µs ±16% 9.8µs ± 1% -13.03% (p=0.000 n=10+10) name old alloc/op new alloc/op delta NewRR-12 4.98kB ± 0% 0.81kB ± 0% -83.79% (p=0.000 n=10+10) ReadRR-12 4.87kB ± 0% 1.82kB ± 0% -62.73% (p=0.000 n=10+10) name old allocs/op new allocs/op delta NewRR-12 19.0 ± 0% 17.0 ± 0% -10.53% (p=0.000 n=10+10) ReadRR-12 19.0 ± 0% 19.0 ± 0% ~ (all equal) ReadRR-12 11.2µs ±16% 9.8µs ± 1% -13.03% (p=0.000 n=10+10) * Surface any remaining comment from zlexer.Next * Improve comment handling in zlexer.Next This both fixes a regression where comments could be lost under certain circumstances and now emits comments that occur within braces. * Remove outdated comment from zlexer.Next and klexer.Next * Delay converting LF to space in braced comment * Fixup TestParseZoneComments * Remove tokenUpper field from lex Not computing this for every token, and instead only when needed is a substantial performance improvement. name old time/op new time/op delta NewRR-12 9.56µs ± 0% 6.30µs ± 1% -34.08% (p=0.000 n=9+10) ReadRR-12 9.93µs ± 1% 6.67µs ± 1% -32.77% (p=0.000 n=10+10) name old alloc/op new alloc/op delta NewRR-12 824B ± 0% 808B ± 0% -1.94% (p=0.000 n=10+10) ReadRR-12 1.83kB ± 0% 1.82kB ± 0% -0.87% (p=0.000 n=10+10) name old allocs/op new allocs/op delta NewRR-12 17.0 ± 0% 17.0 ± 0% ~ (all equal) ReadRR-12 19.0 ± 0% 19.0 ± 0% ~ (all equal) * Update ParseZone documentation to match comment changes The zlexer code was changed to return comments more often, so update the ParseZone documentation to match.
2018-10-15 07:12:31 +00:00
str.WriteByte(x)
2011-12-15 16:38:14 +00:00
}
}
Eliminate lexer goroutines (#792) * Eliminate zlexer goroutine This replaces the zlexer goroutine and channels with a zlexer struct that maintains state and provides a channel-like API. * Eliminate klexer goroutine This replaces the klexer goroutine and channels with a klexer struct that maintains state and provides a channel-like API. * Merge scan into zlexer and klexer This does result in tokenText existing twice, but it's pretty simple and small so it's not that bad. * Avoid using text/scanner.Position to track position * Track escape within zlexer.Next * Avoid zl.commt check on space and tab in zlexer * Track stri within zlexer.Next * Track comi within zlexer.Next There is one special case at the start of a comment that needs to be handled, otherwise this is as simple as stri was. * Use a single token buffer in zlexer This is safe as there is never both a non-empty string buffer and a non-empty comment buffer. * Don't hardcode length of zl.tok in zlexer * Eliminate lex.length field This is always set to len(l.token) and is only queried in a few places. It was added in 47cc5b052df9b4e5d5a9900cfdd622607de10a6d without any obvious need. * Add whitespace to klexer.Next * Track lex within klexer.Next * Use a strings.Builder in klexer.Next * Simplify : case in klexer.Next * Add whitespace to zlexer.Next * Change for loop style in zlexer.Next and klexer.Next * Surface read errors in zlexer * Surface read errors from klexer * Remove debug line from parseKey * Rename tokenText to readByte * Make readByte return ok bool Also change the for loop style to match the Next for loops. * Make readByte errors sticky klexer.Next calls readByte separately from within the loop. Without readByte being sticky, an error that occurs during that readByte call may be lost. * Panic in testRR if the error is non-nil * Add whitespace and unify field setting in zlexer.Next * Remove eof fields from zlexer and klexer With readByte having sticky errors, this no longer needed. zl.eof = true was also in the wrong place and could mask an unbalanced brace error. * Merge zl.tok blocks in zlexer.Next * Split the tok buffer into separate string and comment buffers The invariant of stri > 0 && comi > 0 never being true was broken when x == '\n' && !zl.quote && zl.commt && zl.brace != 0 (the "If not in a brace this ends the comment AND the RR" block). Split the buffer back out into two separate buffers to avoid clobbering. * Replace token slices with arrays in zlexer * Add a NewRR benchmark * Move token buffers into zlexer.Next These don't need to be retained across Next calls and can be stack allocated inside Next. This drastically reduces memory consumption as they accounted for nearly half of all the memory used. name old alloc/op new alloc/op delta NewRR-12 9.72kB ± 0% 4.98kB ± 0% -48.72% (p=0.000 n=10+10) * Add a ReadRR benchmark Unlike NewRR, this will use an io.Reader that does not implement any methods aside from Read. In particular it does not implement io.ByteReader. * Avoid using a bufio.Reader for io.ByteReader readers At the same time use a smaller buffer size of 1KiB rather than the bufio.NewReader default of 4KiB. name old time/op new time/op delta NewRR-12 11.0µs ± 3% 9.5µs ± 2% -13.77% (p=0.000 n=9+10) ReadRR-12 11.2µs ±16% 9.8µs ± 1% -13.03% (p=0.000 n=10+10) name old alloc/op new alloc/op delta NewRR-12 4.98kB ± 0% 0.81kB ± 0% -83.79% (p=0.000 n=10+10) ReadRR-12 4.87kB ± 0% 1.82kB ± 0% -62.73% (p=0.000 n=10+10) name old allocs/op new allocs/op delta NewRR-12 19.0 ± 0% 17.0 ± 0% -10.53% (p=0.000 n=10+10) ReadRR-12 19.0 ± 0% 19.0 ± 0% ~ (all equal) ReadRR-12 11.2µs ±16% 9.8µs ± 1% -13.03% (p=0.000 n=10+10) * Surface any remaining comment from zlexer.Next * Improve comment handling in zlexer.Next This both fixes a regression where comments could be lost under certain circumstances and now emits comments that occur within braces. * Remove outdated comment from zlexer.Next and klexer.Next * Delay converting LF to space in braced comment * Fixup TestParseZoneComments * Remove tokenUpper field from lex Not computing this for every token, and instead only when needed is a substantial performance improvement. name old time/op new time/op delta NewRR-12 9.56µs ± 0% 6.30µs ± 1% -34.08% (p=0.000 n=9+10) ReadRR-12 9.93µs ± 1% 6.67µs ± 1% -32.77% (p=0.000 n=10+10) name old alloc/op new alloc/op delta NewRR-12 824B ± 0% 808B ± 0% -1.94% (p=0.000 n=10+10) ReadRR-12 1.83kB ± 0% 1.82kB ± 0% -0.87% (p=0.000 n=10+10) name old allocs/op new allocs/op delta NewRR-12 17.0 ± 0% 17.0 ± 0% ~ (all equal) ReadRR-12 19.0 ± 0% 19.0 ± 0% ~ (all equal) * Update ParseZone documentation to match comment changes The zlexer code was changed to return comments more often, so update the ParseZone documentation to match.
2018-10-15 07:12:31 +00:00
Add new ZoneParser API (#794) * Improve ParseZone tests * Add new ZoneParser API * Use the ZoneParser API directly in ReadRR * Merge parseZoneHelper into ParseZone * Make generate string building slightly more efficient * Add SetDefaultTTL method to ZoneParser This makes it possible for external consumers to implement ReadRR. * Make $INCLUDE directive opt-in The $INCLUDE directive opens a user controlled file and parses it as a DNS zone file. The error messages may reveal portions of sensitive files, such as: /etc/passwd: dns: not a TTL: "root:x:0:0:root:/root:/bin/bash" at line: 1:31 /etc/shadow: dns: not a TTL: "root:$6$<redacted>::0:99999:7:::" at line: 1:125 Both ParseZone and ReadRR are currently opt-in for backward compatibility. * Disable $INCLUDE support in ReadRR ReadRR and NewRR are often passed untrusted input. At the same time, $INCLUDE isn't really useful for ReadRR as it only ever returns the first record. This is a breaking change, but it currently represents a slight security risk. * Document the need to drain the ParseZone chan * Cleanup the documentation of NewRR, ReadRR and ParseZone * Document the ZoneParser API * Deprecated the ParseZone function * Add whitespace to ZoneParser.Next * Remove prevName field from ZoneParser This doesn't track anything meaningful as both zp.prevName and h.Name are only ever set at the same point and to the same value. * Use uint8 for ZoneParser.include field It has a maximum value of 7 which easily fits within uint8. This reduces the size of ZoneParser from 160 bytes to 152 bytes. * Add setParseError helper to ZoneParser * Surface $INCLUDE os.Open error in error message * Rename ZoneParser.include field to includeDepth * Make maximum $INCLUDE depth a const * Add ParseZone and ZoneParser benchmarks * Parse $GENERATE directive with a single ZoneParser This should be more efficient than calling NewRR for each generated record. * Run go fmt on generate_test.go * Add a benchmark for $GENERATE directives * Use a custom reader for generate This avoids the overhead and memory usage of building the zone string. name old time/op new time/op delta Generate-12 165µs ± 4% 157µs ± 2% -5.06% (p=0.000 n=25+25) name old alloc/op new alloc/op delta Generate-12 42.1kB ± 0% 31.8kB ± 0% -24.42% (p=0.000 n=20+23) name old allocs/op new allocs/op delta Generate-12 1.56k ± 0% 1.55k ± 0% -0.38% (p=0.000 n=25+25) * Return correct ParseError from generateReader The last commit made these regular errors while they had been ParseErrors before. * Return error message as string from modToPrintf This is slightly simpler and they don't need to be errors. * Skip setting includeDepth in generate This sub parser isn't allowed to use $INCLUDE directives anyway. Note: If generate is ever changed to allow $INCLUDE directives, then this line must be added back. Without doing that, it would be be possible to exceed maxIncludeDepth. * Make generateReader errors sticky ReadByte should not be called after an error has been returned, but this is cheap insurance. * Move file and lex fields to end of generateReader These are only used for creating a ParseError and so are unlikely to be accessed. * Don't return offset with error in modToPrintf Along for the ride, are some whitespace and style changes. * Add whitespace to generate and simplify step * Use a for loop instead of goto in generate * Support $INCLUDE directives inside $GENERATE directives This was previously supported and may be useful. This is now more rigorous as the maximum include depth is respected and relative $INCLUDE directives are now supported from within $GENERATE. * Don't return any lexer tokens after read error Without this, read errors are likely to be lost and become parse errors of the remaining str. The $GENERATE code relies on surfacing errors from the reader. * Support $INCLUDE in NewRR and ReadRR Removing $INCLUDE support from these is a breaking change and should not be included in this pull request. * Add test to ensure $GENERATE respects $INCLUDE support * Unify TestZoneParserIncludeDisallowed with other tests * Remove stray whitespace from TestGenerateSurfacesErrors * Move ZoneParser SetX methods above Err method * $GENERATE should not accept step of 0 If step is allowed to be 0, then generateReader (and the code it replaced) will get stuck in an infinite loop. This is a potential DOS vulnerability. * Fix ReadRR comment for file argument I missed this previosuly. The file argument is also used to resolve relative $INCLUDE directives. * Prevent test panics on nil error * Rework ZoneParser.subNext This is slightly cleaner and will close the underlying *os.File even if an error occurs. * Make ZoneParser.generate call subNext This also moves the calls to setParseError into generate. * Report errors when parsing rest of $GENERATE directive * Report proper error location in $GENERATE directive This makes error messages much clearer. * Simplify modToPrintf func Note: When width is 0, the leading 0 of the fmt string is now excluded. This should not alter the formatting of numbers in anyway. * Add comment explaining sub field * Remove outdated error comment from generate
2018-10-20 01:17:56 +00:00
if kl.readErr != nil && kl.readErr != io.EOF {
// Don't return any tokens after a read error occurs.
return lex{value: zEOF}, false
}
Eliminate lexer goroutines (#792) * Eliminate zlexer goroutine This replaces the zlexer goroutine and channels with a zlexer struct that maintains state and provides a channel-like API. * Eliminate klexer goroutine This replaces the klexer goroutine and channels with a klexer struct that maintains state and provides a channel-like API. * Merge scan into zlexer and klexer This does result in tokenText existing twice, but it's pretty simple and small so it's not that bad. * Avoid using text/scanner.Position to track position * Track escape within zlexer.Next * Avoid zl.commt check on space and tab in zlexer * Track stri within zlexer.Next * Track comi within zlexer.Next There is one special case at the start of a comment that needs to be handled, otherwise this is as simple as stri was. * Use a single token buffer in zlexer This is safe as there is never both a non-empty string buffer and a non-empty comment buffer. * Don't hardcode length of zl.tok in zlexer * Eliminate lex.length field This is always set to len(l.token) and is only queried in a few places. It was added in 47cc5b052df9b4e5d5a9900cfdd622607de10a6d without any obvious need. * Add whitespace to klexer.Next * Track lex within klexer.Next * Use a strings.Builder in klexer.Next * Simplify : case in klexer.Next * Add whitespace to zlexer.Next * Change for loop style in zlexer.Next and klexer.Next * Surface read errors in zlexer * Surface read errors from klexer * Remove debug line from parseKey * Rename tokenText to readByte * Make readByte return ok bool Also change the for loop style to match the Next for loops. * Make readByte errors sticky klexer.Next calls readByte separately from within the loop. Without readByte being sticky, an error that occurs during that readByte call may be lost. * Panic in testRR if the error is non-nil * Add whitespace and unify field setting in zlexer.Next * Remove eof fields from zlexer and klexer With readByte having sticky errors, this no longer needed. zl.eof = true was also in the wrong place and could mask an unbalanced brace error. * Merge zl.tok blocks in zlexer.Next * Split the tok buffer into separate string and comment buffers The invariant of stri > 0 && comi > 0 never being true was broken when x == '\n' && !zl.quote && zl.commt && zl.brace != 0 (the "If not in a brace this ends the comment AND the RR" block). Split the buffer back out into two separate buffers to avoid clobbering. * Replace token slices with arrays in zlexer * Add a NewRR benchmark * Move token buffers into zlexer.Next These don't need to be retained across Next calls and can be stack allocated inside Next. This drastically reduces memory consumption as they accounted for nearly half of all the memory used. name old alloc/op new alloc/op delta NewRR-12 9.72kB ± 0% 4.98kB ± 0% -48.72% (p=0.000 n=10+10) * Add a ReadRR benchmark Unlike NewRR, this will use an io.Reader that does not implement any methods aside from Read. In particular it does not implement io.ByteReader. * Avoid using a bufio.Reader for io.ByteReader readers At the same time use a smaller buffer size of 1KiB rather than the bufio.NewReader default of 4KiB. name old time/op new time/op delta NewRR-12 11.0µs ± 3% 9.5µs ± 2% -13.77% (p=0.000 n=9+10) ReadRR-12 11.2µs ±16% 9.8µs ± 1% -13.03% (p=0.000 n=10+10) name old alloc/op new alloc/op delta NewRR-12 4.98kB ± 0% 0.81kB ± 0% -83.79% (p=0.000 n=10+10) ReadRR-12 4.87kB ± 0% 1.82kB ± 0% -62.73% (p=0.000 n=10+10) name old allocs/op new allocs/op delta NewRR-12 19.0 ± 0% 17.0 ± 0% -10.53% (p=0.000 n=10+10) ReadRR-12 19.0 ± 0% 19.0 ± 0% ~ (all equal) ReadRR-12 11.2µs ±16% 9.8µs ± 1% -13.03% (p=0.000 n=10+10) * Surface any remaining comment from zlexer.Next * Improve comment handling in zlexer.Next This both fixes a regression where comments could be lost under certain circumstances and now emits comments that occur within braces. * Remove outdated comment from zlexer.Next and klexer.Next * Delay converting LF to space in braced comment * Fixup TestParseZoneComments * Remove tokenUpper field from lex Not computing this for every token, and instead only when needed is a substantial performance improvement. name old time/op new time/op delta NewRR-12 9.56µs ± 0% 6.30µs ± 1% -34.08% (p=0.000 n=9+10) ReadRR-12 9.93µs ± 1% 6.67µs ± 1% -32.77% (p=0.000 n=10+10) name old alloc/op new alloc/op delta NewRR-12 824B ± 0% 808B ± 0% -1.94% (p=0.000 n=10+10) ReadRR-12 1.83kB ± 0% 1.82kB ± 0% -0.87% (p=0.000 n=10+10) name old allocs/op new allocs/op delta NewRR-12 17.0 ± 0% 17.0 ± 0% ~ (all equal) ReadRR-12 19.0 ± 0% 19.0 ± 0% ~ (all equal) * Update ParseZone documentation to match comment changes The zlexer code was changed to return comments more often, so update the ParseZone documentation to match.
2018-10-15 07:12:31 +00:00
if str.Len() > 0 {
2011-12-18 18:59:01 +00:00
// Send remainder
2015-02-19 10:45:59 +00:00
l.value = zValue
Eliminate lexer goroutines (#792) * Eliminate zlexer goroutine This replaces the zlexer goroutine and channels with a zlexer struct that maintains state and provides a channel-like API. * Eliminate klexer goroutine This replaces the klexer goroutine and channels with a klexer struct that maintains state and provides a channel-like API. * Merge scan into zlexer and klexer This does result in tokenText existing twice, but it's pretty simple and small so it's not that bad. * Avoid using text/scanner.Position to track position * Track escape within zlexer.Next * Avoid zl.commt check on space and tab in zlexer * Track stri within zlexer.Next * Track comi within zlexer.Next There is one special case at the start of a comment that needs to be handled, otherwise this is as simple as stri was. * Use a single token buffer in zlexer This is safe as there is never both a non-empty string buffer and a non-empty comment buffer. * Don't hardcode length of zl.tok in zlexer * Eliminate lex.length field This is always set to len(l.token) and is only queried in a few places. It was added in 47cc5b052df9b4e5d5a9900cfdd622607de10a6d without any obvious need. * Add whitespace to klexer.Next * Track lex within klexer.Next * Use a strings.Builder in klexer.Next * Simplify : case in klexer.Next * Add whitespace to zlexer.Next * Change for loop style in zlexer.Next and klexer.Next * Surface read errors in zlexer * Surface read errors from klexer * Remove debug line from parseKey * Rename tokenText to readByte * Make readByte return ok bool Also change the for loop style to match the Next for loops. * Make readByte errors sticky klexer.Next calls readByte separately from within the loop. Without readByte being sticky, an error that occurs during that readByte call may be lost. * Panic in testRR if the error is non-nil * Add whitespace and unify field setting in zlexer.Next * Remove eof fields from zlexer and klexer With readByte having sticky errors, this no longer needed. zl.eof = true was also in the wrong place and could mask an unbalanced brace error. * Merge zl.tok blocks in zlexer.Next * Split the tok buffer into separate string and comment buffers The invariant of stri > 0 && comi > 0 never being true was broken when x == '\n' && !zl.quote && zl.commt && zl.brace != 0 (the "If not in a brace this ends the comment AND the RR" block). Split the buffer back out into two separate buffers to avoid clobbering. * Replace token slices with arrays in zlexer * Add a NewRR benchmark * Move token buffers into zlexer.Next These don't need to be retained across Next calls and can be stack allocated inside Next. This drastically reduces memory consumption as they accounted for nearly half of all the memory used. name old alloc/op new alloc/op delta NewRR-12 9.72kB ± 0% 4.98kB ± 0% -48.72% (p=0.000 n=10+10) * Add a ReadRR benchmark Unlike NewRR, this will use an io.Reader that does not implement any methods aside from Read. In particular it does not implement io.ByteReader. * Avoid using a bufio.Reader for io.ByteReader readers At the same time use a smaller buffer size of 1KiB rather than the bufio.NewReader default of 4KiB. name old time/op new time/op delta NewRR-12 11.0µs ± 3% 9.5µs ± 2% -13.77% (p=0.000 n=9+10) ReadRR-12 11.2µs ±16% 9.8µs ± 1% -13.03% (p=0.000 n=10+10) name old alloc/op new alloc/op delta NewRR-12 4.98kB ± 0% 0.81kB ± 0% -83.79% (p=0.000 n=10+10) ReadRR-12 4.87kB ± 0% 1.82kB ± 0% -62.73% (p=0.000 n=10+10) name old allocs/op new allocs/op delta NewRR-12 19.0 ± 0% 17.0 ± 0% -10.53% (p=0.000 n=10+10) ReadRR-12 19.0 ± 0% 19.0 ± 0% ~ (all equal) ReadRR-12 11.2µs ±16% 9.8µs ± 1% -13.03% (p=0.000 n=10+10) * Surface any remaining comment from zlexer.Next * Improve comment handling in zlexer.Next This both fixes a regression where comments could be lost under certain circumstances and now emits comments that occur within braces. * Remove outdated comment from zlexer.Next and klexer.Next * Delay converting LF to space in braced comment * Fixup TestParseZoneComments * Remove tokenUpper field from lex Not computing this for every token, and instead only when needed is a substantial performance improvement. name old time/op new time/op delta NewRR-12 9.56µs ± 0% 6.30µs ± 1% -34.08% (p=0.000 n=9+10) ReadRR-12 9.93µs ± 1% 6.67µs ± 1% -32.77% (p=0.000 n=10+10) name old alloc/op new alloc/op delta NewRR-12 824B ± 0% 808B ± 0% -1.94% (p=0.000 n=10+10) ReadRR-12 1.83kB ± 0% 1.82kB ± 0% -0.87% (p=0.000 n=10+10) name old allocs/op new allocs/op delta NewRR-12 17.0 ± 0% 17.0 ± 0% ~ (all equal) ReadRR-12 19.0 ± 0% 19.0 ± 0% ~ (all equal) * Update ParseZone documentation to match comment changes The zlexer code was changed to return comments more often, so update the ParseZone documentation to match.
2018-10-15 07:12:31 +00:00
l.token = str.String()
return l, true
2011-12-18 18:59:01 +00:00
}
Eliminate lexer goroutines (#792) * Eliminate zlexer goroutine This replaces the zlexer goroutine and channels with a zlexer struct that maintains state and provides a channel-like API. * Eliminate klexer goroutine This replaces the klexer goroutine and channels with a klexer struct that maintains state and provides a channel-like API. * Merge scan into zlexer and klexer This does result in tokenText existing twice, but it's pretty simple and small so it's not that bad. * Avoid using text/scanner.Position to track position * Track escape within zlexer.Next * Avoid zl.commt check on space and tab in zlexer * Track stri within zlexer.Next * Track comi within zlexer.Next There is one special case at the start of a comment that needs to be handled, otherwise this is as simple as stri was. * Use a single token buffer in zlexer This is safe as there is never both a non-empty string buffer and a non-empty comment buffer. * Don't hardcode length of zl.tok in zlexer * Eliminate lex.length field This is always set to len(l.token) and is only queried in a few places. It was added in 47cc5b052df9b4e5d5a9900cfdd622607de10a6d without any obvious need. * Add whitespace to klexer.Next * Track lex within klexer.Next * Use a strings.Builder in klexer.Next * Simplify : case in klexer.Next * Add whitespace to zlexer.Next * Change for loop style in zlexer.Next and klexer.Next * Surface read errors in zlexer * Surface read errors from klexer * Remove debug line from parseKey * Rename tokenText to readByte * Make readByte return ok bool Also change the for loop style to match the Next for loops. * Make readByte errors sticky klexer.Next calls readByte separately from within the loop. Without readByte being sticky, an error that occurs during that readByte call may be lost. * Panic in testRR if the error is non-nil * Add whitespace and unify field setting in zlexer.Next * Remove eof fields from zlexer and klexer With readByte having sticky errors, this no longer needed. zl.eof = true was also in the wrong place and could mask an unbalanced brace error. * Merge zl.tok blocks in zlexer.Next * Split the tok buffer into separate string and comment buffers The invariant of stri > 0 && comi > 0 never being true was broken when x == '\n' && !zl.quote && zl.commt && zl.brace != 0 (the "If not in a brace this ends the comment AND the RR" block). Split the buffer back out into two separate buffers to avoid clobbering. * Replace token slices with arrays in zlexer * Add a NewRR benchmark * Move token buffers into zlexer.Next These don't need to be retained across Next calls and can be stack allocated inside Next. This drastically reduces memory consumption as they accounted for nearly half of all the memory used. name old alloc/op new alloc/op delta NewRR-12 9.72kB ± 0% 4.98kB ± 0% -48.72% (p=0.000 n=10+10) * Add a ReadRR benchmark Unlike NewRR, this will use an io.Reader that does not implement any methods aside from Read. In particular it does not implement io.ByteReader. * Avoid using a bufio.Reader for io.ByteReader readers At the same time use a smaller buffer size of 1KiB rather than the bufio.NewReader default of 4KiB. name old time/op new time/op delta NewRR-12 11.0µs ± 3% 9.5µs ± 2% -13.77% (p=0.000 n=9+10) ReadRR-12 11.2µs ±16% 9.8µs ± 1% -13.03% (p=0.000 n=10+10) name old alloc/op new alloc/op delta NewRR-12 4.98kB ± 0% 0.81kB ± 0% -83.79% (p=0.000 n=10+10) ReadRR-12 4.87kB ± 0% 1.82kB ± 0% -62.73% (p=0.000 n=10+10) name old allocs/op new allocs/op delta NewRR-12 19.0 ± 0% 17.0 ± 0% -10.53% (p=0.000 n=10+10) ReadRR-12 19.0 ± 0% 19.0 ± 0% ~ (all equal) ReadRR-12 11.2µs ±16% 9.8µs ± 1% -13.03% (p=0.000 n=10+10) * Surface any remaining comment from zlexer.Next * Improve comment handling in zlexer.Next This both fixes a regression where comments could be lost under certain circumstances and now emits comments that occur within braces. * Remove outdated comment from zlexer.Next and klexer.Next * Delay converting LF to space in braced comment * Fixup TestParseZoneComments * Remove tokenUpper field from lex Not computing this for every token, and instead only when needed is a substantial performance improvement. name old time/op new time/op delta NewRR-12 9.56µs ± 0% 6.30µs ± 1% -34.08% (p=0.000 n=9+10) ReadRR-12 9.93µs ± 1% 6.67µs ± 1% -32.77% (p=0.000 n=10+10) name old alloc/op new alloc/op delta NewRR-12 824B ± 0% 808B ± 0% -1.94% (p=0.000 n=10+10) ReadRR-12 1.83kB ± 0% 1.82kB ± 0% -0.87% (p=0.000 n=10+10) name old allocs/op new allocs/op delta NewRR-12 17.0 ± 0% 17.0 ± 0% ~ (all equal) ReadRR-12 19.0 ± 0% 19.0 ± 0% ~ (all equal) * Update ParseZone documentation to match comment changes The zlexer code was changed to return comments more often, so update the ParseZone documentation to match.
2018-10-15 07:12:31 +00:00
return lex{value: zEOF}, false
2011-12-15 16:38:14 +00:00
}