Eliminate lexer goroutines (#792)

* Eliminate zlexer goroutine

This replaces the zlexer goroutine and channels with a zlexer struct
that maintains state and provides a channel-like API.

* Eliminate klexer goroutine

This replaces the klexer goroutine and channels with a klexer struct
that maintains state and provides a channel-like API.
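
In outline, each of these changes swaps a producer goroutine feeding a
channel for a struct whose Next method is called on demand. A minimal
sketch of that channel-like shape (illustrative only, not the code from
this commit; the real lexers read bytes and carry line/column state):

package main

import "fmt"

type token struct{ text string }

type lexer struct {
	input []string // stand-in for the real buffered byte reader
	pos   int
}

// Next replaces the channel receive: the ok result plays the role the
// closed channel used to, so no goroutine has to be cleaned up.
func (l *lexer) Next() (token, bool) {
	if l.pos >= len(l.input) {
		return token{}, false
	}
	t := token{text: l.input[l.pos]}
	l.pos++
	return t, true
}

func main() {
	l := &lexer{input: []string{"foo.", "IN", "A"}}

	// The old `for t := range c` consumer loop becomes:
	for t, ok := l.Next(); ok; t, ok = l.Next() {
		fmt.Println(t.text)
	}
}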

* Merge scan into zlexer and klexer

This does mean tokenText now exists twice, but it is small and simple
enough that the duplication is acceptable.

* Avoid using text/scanner.Position to track position

* Track escape within zlexer.Next

* Avoid zl.commt check on space and tab in zlexer

* Track stri within zlexer.Next

* Track comi within zlexer.Next

There is one special case at the start of a comment that needs to be
handled, otherwise this is as simple as stri was.

* Use a single token buffer in zlexer

This is safe as there is never both a non-empty string buffer and a
non-empty comment buffer.

* Don't hardcode length of zl.tok in zlexer

* Eliminate lex.length field

This is always set to len(l.token) and is only queried in a few places.

It was added in 47cc5b052d without any
obvious need.

* Add whitespace to klexer.Next

* Track lex within klexer.Next

* Use a strings.Builder in klexer.Next

* Simplify : case in klexer.Next

* Add whitespace to zlexer.Next

* Change for loop style in zlexer.Next and klexer.Next

* Surface read errors in zlexer

* Surface read errors from klexer

* Remove debug line from parseKey

* Rename tokenText to readByte

* Make readByte return ok bool

Also change the for loop style to match the Next for loops.

* Make readByte errors sticky

klexer.Next calls readByte separately from within the loop. Without
readByte being sticky, an error that occurs during that readByte call
may be lost.
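
A minimal sketch of what "sticky" means here (illustrative; the real
readByte also tracks line and column):

package main

import (
	"fmt"
	"io"
	"strings"
)

type lexer struct {
	br      io.ByteReader
	readErr error // once set, readByte refuses to read again
}

func (l *lexer) readByte() (byte, bool) {
	if l.readErr != nil {
		return 0, false // sticky: the first error keeps being reported
	}

	c, err := l.br.ReadByte()
	if err != nil {
		l.readErr = err
		return 0, false
	}

	return c, true
}

func main() {
	l := &lexer{br: strings.NewReader("ab")}
	for c, ok := l.readByte(); ok; c, ok = l.readByte() {
		fmt.Printf("%c\n", c)
	}

	// A later out-of-loop call (as in klexer.Next eating a space) still
	// observes the failure instead of losing it.
	_, ok := l.readByte()
	fmt.Println(ok, l.readErr) // false EOF
}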

* Panic in testRR if the error is non-nil

* Add whitespace and unify field setting in zlexer.Next

* Remove eof fields from zlexer and klexer

With readByte having sticky errors, this is no longer needed. zl.eof = true
was also in the wrong place and could mask an unbalanced brace error.

* Merge zl.tok blocks in zlexer.Next

* Split the tok buffer into separate string and comment buffers

The invariant of stri > 0 && comi > 0 never being true was broken when
x == '\n' && !zl.quote && zl.commt && zl.brace != 0 (the
"If not in a brace this ends the comment AND the RR" block).

Split the buffer back out into two separate buffers to avoid clobbering.

* Replace token slices with arrays in zlexer

* Add a NewRR benchmark

* Move token buffers into zlexer.Next

These don't need to be retained across Next calls and can be stack
allocated inside Next. This drastically reduces memory consumption as
they accounted for nearly half of all the memory used.

name      old alloc/op   new alloc/op   delta
NewRR-12    9.72kB ± 0%    4.98kB ± 0%  -48.72%  (p=0.000 n=10+10)
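
The gist: a fixed-size array declared inside Next does not escape, so it
can live on the stack, whereas a slice held on the struct is a heap
allocation retained for the lexer's lifetime. A sketch, assuming the
maxTok constant from scan.go:

package main

const maxTok = 2048 // mirrors the constant in scan.go

// Before: buffers allocated once and retained on the struct (heap).
//	str := make([]byte, maxTok)
//	com := make([]byte, maxTok)

// After: locals of Next, so the compiler can stack allocate them per call.
type zlexerSketch struct{}

func (zl *zlexerSketch) Next() (string, bool) {
	var (
		str [maxTok]byte // Hold string text
		com [maxTok]byte // Hold comment text
	)
	_ = com

	n := copy(str[:], "token")
	return string(str[:n]), true // string() copies, so str never escapes
}

func main() {
	var zl zlexerSketch
	tok, _ := zl.Next()
	println(tok)
}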

* Add a ReadRR benchmark

Unlike NewRR, this will use an io.Reader that does not implement any
methods aside from Read. In particular it does not implement
io.ByteReader.

* Avoid using a bufio.Reader for io.ByteReader readers

At the same time use a smaller buffer size of 1KiB rather than the
bufio.NewReader default of 4KiB.

name       old time/op    new time/op    delta
NewRR-12     11.0µs ± 3%     9.5µs ± 2%  -13.77%  (p=0.000 n=9+10)
ReadRR-12    11.2µs ±16%     9.8µs ± 1%  -13.03%  (p=0.000 n=10+10)

name       old alloc/op   new alloc/op   delta
NewRR-12     4.98kB ± 0%    0.81kB ± 0%  -83.79%  (p=0.000 n=10+10)
ReadRR-12    4.87kB ± 0%    1.82kB ± 0%  -62.73%  (p=0.000 n=10+10)

name       old allocs/op  new allocs/op  delta
NewRR-12       19.0 ± 0%      17.0 ± 0%  -10.53%  (p=0.000 n=10+10)
ReadRR-12      19.0 ± 0%      19.0 ± 0%     ~     (all equal)
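
The reader setup behind this win boils down to a type assertion: only
wrap in bufio when the source cannot already hand out single bytes. A
sketch of the pattern used by the newKLexer/newZLexer constructors in
the diff below:

package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

func byteReader(r io.Reader) io.ByteReader {
	if br, ok := r.(io.ByteReader); ok {
		// e.g. *strings.Reader or *bufio.Reader: no extra buffer needed.
		return br
	}

	// 1KiB rather than bufio.NewReader's 4KiB default keeps the
	// per-lexer allocation small.
	return bufio.NewReaderSize(r, 1024)
}

func main() {
	br := byteReader(strings.NewReader("foo. IN A 10.0.0.1"))
	for {
		c, err := br.ReadByte()
		if err != nil {
			break
		}
		fmt.Printf("%c", c)
	}
	fmt.Println()
}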

* Surface any remaining comment from zlexer.Next

* Improve comment handling in zlexer.Next

This both fixes a regression where comments could be lost under certain
circumstances and now emits comments that occur within braces.

* Remove outdated comment from zlexer.Next and klexer.Next

* Delay converting LF to space in braced comment

* Fixup TestParseZoneComments

* Remove tokenUpper field from lex

Computing this only when needed, rather than for every token, is a
substantial performance improvement.

name       old time/op    new time/op    delta
NewRR-12     9.56µs ± 0%    6.30µs ± 1%  -34.08%  (p=0.000 n=9+10)
ReadRR-12    9.93µs ± 1%    6.67µs ± 1%  -32.77%  (p=0.000 n=10+10)

name       old alloc/op   new alloc/op   delta
NewRR-12       824B ± 0%      808B ± 0%   -1.94%  (p=0.000 n=10+10)
ReadRR-12    1.83kB ± 0%    1.82kB ± 0%   -0.87%  (p=0.000 n=10+10)

name       old allocs/op  new allocs/op  delta
NewRR-12       17.0 ± 0%      17.0 ± 0%     ~     (all equal)
ReadRR-12      19.0 ± 0%      19.0 ± 0%     ~     (all equal)
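
The change amounts to uppercasing lazily at the few comparison sites
instead of eagerly for every token. A sketch (the directive names are
the ones handled in zlexer.Next):

package main

import (
	"fmt"
	"strings"
)

type lex struct{ token string }

// Before: every token paid for an uppercase copy up front.
//	l.tokenUpper = strings.ToUpper(l.token)

// After: uppercase on demand, only when a token is actually classified.
func isDirective(l lex) bool {
	switch strings.ToUpper(l.token) { // computed just in time
	case "$TTL", "$ORIGIN", "$INCLUDE", "$GENERATE":
		return true
	}
	return false
}

func main() {
	fmt.Println(isDirective(lex{token: "$origin"})) // true
	fmt.Println(isDirective(lex{token: "foo."}))    // false
}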

* Update ParseZone documentation to match comment changes

The zlexer code was changed to return comments more often, so update the
ParseZone documentation to match.
Tom Thorogood 2018-10-15 17:42:31 +10:30 committed by GitHub
parent 4a9ca7e98d
commit 17c1bc6792
10 changed files with 980 additions and 676 deletions

@@ -1,6 +1,7 @@
 package dns
 
 import (
+	"bufio"
 	"crypto"
 	"crypto/dsa"
 	"crypto/ecdsa"
@@ -194,23 +195,12 @@ func readPrivateKeyED25519(m map[string]string) (ed25519.PrivateKey, error) {
 // parseKey reads a private key from r. It returns a map[string]string,
 // with the key-value pairs, or an error when the file is not correct.
 func parseKey(r io.Reader, file string) (map[string]string, error) {
-	s, cancel := scanInit(r)
 	m := make(map[string]string)
-	c := make(chan lex)
-	k := ""
-	defer func() {
-		cancel()
-		// zlexer can send up to two tokens, the next one and possibly 1 remainders.
-		// Do a non-blocking read.
-		_, ok := <-c
-		_, ok = <-c
-		if !ok {
-			// too bad
-		}
-	}()
-	// Start the lexer
-	go klexer(s, c)
-	for l := range c {
+	var k string
+
+	c := newKLexer(r)
+
+	for l, ok := c.Next(); ok; l, ok = c.Next() {
 		// It should alternate
 		switch l.value {
 		case zKey:
@@ -219,41 +209,111 @@ func parseKey(r io.Reader, file string) (map[string]string, error) {
 			if k == "" {
 				return nil, &ParseError{file, "no private key seen", l}
 			}
-			//println("Setting", strings.ToLower(k), "to", l.token, "b")
+
 			m[strings.ToLower(k)] = l.token
 			k = ""
 		}
 	}
+
+	// Surface any read errors from r.
+	if err := c.Err(); err != nil {
+		return nil, &ParseError{file: file, err: err.Error()}
+	}
+
 	return m, nil
 }
 
-// klexer scans the sourcefile and returns tokens on the channel c.
-func klexer(s *scan, c chan lex) {
-	var l lex
-	str := "" // Hold the current read text
-	commt := false
-	key := true
-	x, err := s.tokenText()
-	defer close(c)
-	for err == nil {
-		l.column = s.position.Column
-		l.line = s.position.Line
+type klexer struct {
+	br io.ByteReader
+
+	readErr error
+
+	line   int
+	column int
+
+	key bool
+
+	eol bool // end-of-line
+}
+
+func newKLexer(r io.Reader) *klexer {
+	br, ok := r.(io.ByteReader)
+	if !ok {
+		br = bufio.NewReaderSize(r, 1024)
+	}
+
+	return &klexer{
+		br: br,
+
+		line: 1,
+
+		key: true,
+	}
+}
+
+func (kl *klexer) Err() error {
+	if kl.readErr == io.EOF {
+		return nil
+	}
+
+	return kl.readErr
+}
+
+// readByte returns the next byte from the input
+func (kl *klexer) readByte() (byte, bool) {
+	if kl.readErr != nil {
+		return 0, false
+	}
+
+	c, err := kl.br.ReadByte()
+	if err != nil {
+		kl.readErr = err
+		return 0, false
+	}
+
+	// delay the newline handling until the next token is delivered,
+	// fixes off-by-one errors when reporting a parse error.
+	if kl.eol {
+		kl.line++
+		kl.column = 0
+		kl.eol = false
+	}
+
+	if c == '\n' {
+		kl.eol = true
+	} else {
+		kl.column++
+	}
+
+	return c, true
+}
+
+func (kl *klexer) Next() (lex, bool) {
+	var (
+		l lex
+
+		str strings.Builder
+
+		commt bool
+	)
+
+	for x, ok := kl.readByte(); ok; x, ok = kl.readByte() {
+		l.line, l.column = kl.line, kl.column
+
 		switch x {
 		case ':':
-			if commt {
+			if commt || !kl.key {
 				break
 			}
-			l.token = str
-			if key {
-				l.value = zKey
-				c <- l
-				// Next token is a space, eat it
-				s.tokenText()
-				key = false
-				str = ""
-			} else {
-				l.value = zValue
-			}
+
+			kl.key = false
+
+			// Next token is a space, eat it
+			kl.readByte()
+
+			l.value = zKey
+			l.token = str.String()
+
+			return l, true
 		case ';':
 			commt = true
 		case '\n':
@@ -261,24 +321,27 @@ func klexer(s *scan, c chan lex) {
 				// Reset a comment
 				commt = false
 			}
+
+			kl.key = true
+
 			l.value = zValue
-			l.token = str
-			c <- l
-			str = ""
-			commt = false
-			key = true
+			l.token = str.String()
+
+			return l, true
 		default:
 			if commt {
 				break
 			}
-			str += string(x)
+
+			str.WriteByte(x)
 		}
-		x, err = s.tokenText()
 	}
-	if len(str) > 0 {
+
+	if str.Len() > 0 {
 		// Send remainder
-		l.token = str
 		l.value = zValue
-		c <- l
+		l.token = str.String()
+		return l, true
 	}
+
+	return lex{value: zEOF}, false
 }

@@ -846,3 +846,13 @@ func TestRsaExponentUnpack(t *testing.T) {
 		t.Fatalf("cannot verify RRSIG with keytag [%d]. Cause [%s]", ksk.KeyTag(), e.Error())
 	}
 }
+
+func TestParseKeyReadError(t *testing.T) {
+	m, err := parseKey(errReader{}, "")
+	if err == nil || !strings.Contains(err.Error(), errTestReadError.Error()) {
+		t.Errorf("expected error to contain %q, but got %v", errTestReadError, err)
+	}
+	if m != nil {
+		t.Errorf("expected a nil map, but got %v", m)
+	}
+}

@@ -20,7 +20,7 @@ import (
 // of $ after that are interpreted.
 // Any error are returned as a string value, the empty string signals
 // "no error".
-func generate(l lex, c chan lex, t chan *Token, o string) string {
+func generate(l lex, c *zlexer, t chan *Token, o string) string {
 	step := 1
 	if i := strings.IndexAny(l.token, "/"); i != -1 {
 		if i+1 == len(l.token) {
@@ -52,11 +52,11 @@ func generate(l lex, c chan lex, t chan *Token, o string) string {
 		return "bad range in $GENERATE range"
 	}
 
-	<-c // _BLANK
+	c.Next() // _BLANK
 	// Create a complete new string, which we then parse again.
 	s := ""
 BuildRR:
-	l = <-c
+	l, _ = c.Next()
 	if l.value != zNewline && l.value != zEOF {
 		s += l.token
 		goto BuildRR
@@ -107,7 +107,7 @@ BuildRR:
 	mod, offset, err = modToPrintf(s[j+2 : j+2+sep])
 	if err != nil {
 		return err.Error()
-	} else if start + offset < 0 || end + offset > 1<<31-1 {
+	} else if start+offset < 0 || end+offset > 1<<31-1 {
 		return "bad offset in $GENERATE"
 	}
 	j += 2 + sep // Jump to it

@@ -847,20 +847,25 @@ func TestPX(t *testing.T) {
 func TestComment(t *testing.T) {
 	// Comments we must see
-	comments := map[string]bool{"; this is comment 1": true,
-		"; this is comment 4": true, "; this is comment 6": true,
-		"; this is comment 7": true, "; this is comment 8": true}
+	comments := map[string]bool{
+		"; this is comment 1": true,
+		"; this is comment 2": true,
+		"; this is comment 4": true,
+		"; this is comment 6": true,
+		"; this is comment 7": true,
+		"; this is comment 8": true,
+	}
 	zone := `
 foo. IN A 10.0.0.1 ; this is comment 1
 foo. IN A (
-10.0.0.2 ; this is comment2
+10.0.0.2 ; this is comment 2
 )
-; this is comment3
+; this is comment 3
 foo. IN A 10.0.0.3
 foo. IN A ( 10.0.0.4 ); this is comment 4
 foo. IN A 10.0.0.5
-; this is comment5
+; this is comment 5
 foo. IN A 10.0.0.6
@@ -872,13 +877,112 @@ foo. IN TXT "THIS IS TEXT MAN"; this is comment 8
 		if x.Error == nil {
 			if x.Comment != "" {
 				if _, ok := comments[x.Comment]; !ok {
-					t.Errorf("wrong comment %s", x.Comment)
+					t.Errorf("wrong comment %q", x.Comment)
 				}
 			}
 		}
 	}
 }
 
+func TestParseZoneComments(t *testing.T) {
+	for i, test := range []struct {
+		zone     string
+		comments []string
+	}{
+		{
+			`name. IN SOA a6.nstld.com. hostmaster.nic.name. (
+203362132 ; serial
+5m ; refresh (5 minutes)
+5m ; retry (5 minutes)
+2w ; expire (2 weeks)
+300 ; minimum (5 minutes)
+) ; y
+. 3600000 IN NS ONE.MY-ROOTS.NET. ; x`,
+			[]string{"; serial ; refresh (5 minutes) ; retry (5 minutes) ; expire (2 weeks) ; minimum (5 minutes) ; y", "; x"},
+		},
+		{
+			`name. IN SOA a6.nstld.com. hostmaster.nic.name. (
+203362132 ; serial
+5m ; refresh (5 minutes)
+5m ; retry (5 minutes)
+2w ; expire (2 weeks)
+300 ; minimum (5 minutes)
+) ; y
+. 3600000 IN NS ONE.MY-ROOTS.NET.`,
+			[]string{"; serial ; refresh (5 minutes) ; retry (5 minutes) ; expire (2 weeks) ; minimum (5 minutes) ; y", ""},
+		},
+		{
+			`name. IN SOA a6.nstld.com. hostmaster.nic.name. (
+203362132 ; serial
+5m ; refresh (5 minutes)
+5m ; retry (5 minutes)
+2w ; expire (2 weeks)
+300 ; minimum (5 minutes)
+)
+. 3600000 IN NS ONE.MY-ROOTS.NET.`,
+			[]string{"; serial ; refresh (5 minutes) ; retry (5 minutes) ; expire (2 weeks) ; minimum (5 minutes)", ""},
+		},
+		{
+			`name. IN SOA a6.nstld.com. hostmaster.nic.name. (
+203362132 ; serial
+5m ; refresh (5 minutes)
+5m ; retry (5 minutes)
+2w ; expire (2 weeks)
+300 ; minimum (5 minutes)
+)
+. 3600000 IN NS ONE.MY-ROOTS.NET. ; x`,
+			[]string{"; serial ; refresh (5 minutes) ; retry (5 minutes) ; expire (2 weeks) ; minimum (5 minutes)", "; x"},
+		},
+		{
+			`name. IN SOA a6.nstld.com. hostmaster.nic.name. (
+203362132 ; serial
+5m ; refresh (5 minutes)
+5m ; retry (5 minutes)
+2w ; expire (2 weeks)
+300 ; minimum (5 minutes)
+)`,
+			[]string{"; serial ; refresh (5 minutes) ; retry (5 minutes) ; expire (2 weeks) ; minimum (5 minutes)"},
+		},
+		{
+			`. 3600000 IN NS ONE.MY-ROOTS.NET. ; x`,
+			[]string{"; x"},
+		},
+		{
+			`. 3600000 IN NS ONE.MY-ROOTS.NET.`,
+			[]string{""},
+		},
+		{
+			`. 3600000 IN NS ONE.MY-ROOTS.NET. ;;x`,
+			[]string{";;x"},
+		},
+	} {
+		r := strings.NewReader(test.zone)
+
+		var j int
+		for r := range ParseZone(r, "", "") {
+			if r.Error != nil {
+				t.Fatal(r.Error)
+			}
+
+			if j >= len(test.comments) {
+				t.Fatalf("too many records for zone %d at %d record, expected %d", i, j+1, len(test.comments))
+			}
+
+			if r.Comment != test.comments[j] {
+				t.Errorf("invalid comment for record %d:%d %v / %v", i, j, r.RR, r.Error)
+				t.Logf("expected %q", test.comments[j])
+				t.Logf("got %q", r.Comment)
+			}
+
+			j++
+		}
+
+		if j != len(test.comments) {
+			t.Errorf("too few records for zone %d, got %d, expected %d", i, j, len(test.comments))
+		}
+	}
+}
+
 func TestEUIxx(t *testing.T) {
 	tests := map[string]string{
 		"host.example. IN EUI48 00-00-5e-90-01-2a": "host.example.\t3600\tIN\tEUI48\t00-00-5e-90-01-2a",

@@ -105,7 +105,7 @@ func PrivateHandle(rtypestr string, rtype uint16, generator func() PrivateRdata)
 		return rr, off, err
 	}
 
-	setPrivateRR := func(h RR_Header, c chan lex, o, f string) (RR, *ParseError, string) {
+	setPrivateRR := func(h RR_Header, c *zlexer, o, f string) (RR, *ParseError, string) {
 		rr := mkPrivateRR(h.Rrtype)
 		rr.Hdr = h
@@ -115,7 +115,7 @@ func PrivateHandle(rtypestr string, rtype uint16, generator func() PrivateRdata)
 		for {
 			// TODO(miek): we could also be returning _QUOTE, this might or might not
 			// be an issue (basically parsing TXT becomes hard)
-			switch l = <-c; l.value {
+			switch l, _ = c.Next(); l.value {
 			case zNewline, zEOF:
 				break Fetch
 			case zString:

@@ -1,7 +1,11 @@
 package dns
 
-// testRR returns the RR from string s. The error is thrown away.
+// testRR is a helper that wraps a call to NewRR and panics if the error is non-nil.
 func testRR(s string) RR {
-	r, _ := NewRR(s)
+	r, err := NewRR(s)
+	if err != nil {
+		panic(err)
+	}
+
 	return r
 }

scan.go

@@ -1,6 +1,7 @@
 package dns
 
 import (
+	"bufio"
 	"fmt"
 	"io"
 	"os"
@@ -74,15 +75,13 @@ func (e *ParseError) Error() (s string) {
 }
 
 type lex struct {
-	token      string // text of the token
-	tokenUpper string // uppercase text of the token
-	length     int    // length of the token
-	err        bool   // when true, token text has lexer error
-	value      uint8  // value: zString, _BLANK, etc.
-	torc       uint16 // type or class as parsed in the lexer, we only need to look this up in the grammar
-	line       int    // line in the file
-	column     int    // column in the file
-	comment    string // any comment text seen
+	token   string // text of the token
+	err     bool   // when true, token text has lexer error
+	value   uint8  // value: zString, _BLANK, etc.
+	torc    uint16 // type or class as parsed in the lexer, we only need to look this up in the grammar
+	line    int    // line in the file
+	column  int    // column in the file
+	comment string // any comment text seen
 }
 
 // Token holds the token that are returned when a zone file is parsed.
@@ -152,7 +151,8 @@ func ReadRR(q io.Reader, filename string) (RR, error) {
 // foo. IN A 10.0.0.1 ; this is a comment
 //
 // The text "; this is comment" is returned in Token.Comment. Comments inside the
-// RR are discarded. Comments on a line by themselves are discarded too.
+// RR are returned concatenated along with the RR. Comments on a line by themselves
+// are discarded.
 func ParseZone(r io.Reader, origin, file string) chan *Token {
 	return parseZoneHelper(r, origin, file, nil, 10000)
 }
@@ -169,22 +169,9 @@ func parseZone(r io.Reader, origin, f string, defttl *ttlState, t chan *Token, i
 			close(t)
 		}
 	}()
-	s, cancel := scanInit(r)
-	c := make(chan lex)
-	// Start the lexer
-	go zlexer(s, c)
-	defer func() {
-		cancel()
-		// zlexer can send up to three tokens, the next one and possibly 2 remainders.
-		// Do a non-blocking read.
-		_, ok := <-c
-		_, ok = <-c
-		_, ok = <-c
-		if !ok {
-			// too bad
-		}
-	}()
+
+	c := newZLexer(r)
+
 	// 6 possible beginnings of a line, _ is a space
 	// 0. zRRTYPE -> all omitted until the rrtype
 	// 1. zOwner _ zRrtype -> class/ttl omitted
@@ -206,7 +193,7 @@ func parseZone(r io.Reader, origin, f string, defttl *ttlState, t chan *Token, i
 	st := zExpectOwnerDir // initial state
 	var h RR_Header
 	var prevName string
-	for l := range c {
+	for l, ok := c.Next(); ok; l, ok = c.Next() {
 		// Lexer spotted an error already
 		if l.err {
 			t <- &Token{Error: &ParseError{f, l.token, l}}
@@ -279,9 +266,9 @@ func parseZone(r io.Reader, origin, f string, defttl *ttlState, t chan *Token, i
 				return
 			}
 			neworigin := origin // There may be optionally a new origin set after the filename, if not use current one
-			switch l := <-c; l.value {
+			switch l, _ := c.Next(); l.value {
 			case zBlank:
-				l := <-c
+				l, _ := c.Next()
 				if l.value == zString {
 					name, ok := toAbsoluteName(l.token, origin)
 					if !ok {
@@ -482,69 +469,157 @@ func parseZone(r io.Reader, origin, f string, defttl *ttlState, t chan *Token, i
 	}
 	// If we get here, we and the h.Rrtype is still zero, we haven't parsed anything, this
 	// is not an error, because an empty zone file is still a zone file.
+
+	// Surface any read errors from r.
+	if err := c.Err(); err != nil {
+		t <- &Token{Error: &ParseError{file: f, err: err.Error()}}
+	}
 }
 
-// zlexer scans the sourcefile and returns tokens on the channel c.
-func zlexer(s *scan, c chan lex) {
-	var l lex
-	str := make([]byte, maxTok) // Should be enough for any token
-	stri := 0                   // Offset in str (0 means empty)
-	com := make([]byte, maxTok) // Hold comment text
-	comi := 0
-	quote := false
-	escape := false
-	space := false
-	commt := false
-	rrtype := false
-	owner := true
-	brace := 0
-	x, err := s.tokenText()
-	defer close(c)
-	for err == nil {
-		l.column = s.position.Column
-		l.line = s.position.Line
-		if stri >= maxTok {
+type zlexer struct {
+	br io.ByteReader
+
+	readErr error
+
+	line   int
+	column int
+
+	com string
+
+	l lex
+
+	brace  int
+	quote  bool
+	space  bool
+	commt  bool
+	rrtype bool
+	owner  bool
+
+	nextL bool
+
+	eol bool // end-of-line
+}
+
+func newZLexer(r io.Reader) *zlexer {
+	br, ok := r.(io.ByteReader)
+	if !ok {
+		br = bufio.NewReaderSize(r, 1024)
+	}
+
+	return &zlexer{
+		br: br,
+
+		line: 1,
+
+		owner: true,
+	}
+}
+
+func (zl *zlexer) Err() error {
+	if zl.readErr == io.EOF {
+		return nil
+	}
+
+	return zl.readErr
+}
+
+// readByte returns the next byte from the input
+func (zl *zlexer) readByte() (byte, bool) {
+	if zl.readErr != nil {
+		return 0, false
+	}
+
+	c, err := zl.br.ReadByte()
+	if err != nil {
+		zl.readErr = err
+		return 0, false
+	}
+
+	// delay the newline handling until the next token is delivered,
+	// fixes off-by-one errors when reporting a parse error.
+	if zl.eol {
+		zl.line++
+		zl.column = 0
+		zl.eol = false
+	}
+
+	if c == '\n' {
+		zl.eol = true
+	} else {
+		zl.column++
+	}
+
+	return c, true
+}
+
+func (zl *zlexer) Next() (lex, bool) {
+	l := &zl.l
+	if zl.nextL {
+		zl.nextL = false
+		return *l, true
+	}
+	if l.err {
+		// Parsing errors should be sticky.
+		return lex{value: zEOF}, false
+	}
+
+	var (
+		str [maxTok]byte // Hold string text
+		com [maxTok]byte // Hold comment text
+
+		stri int // Offset in str (0 means empty)
+		comi int // Offset in com (0 means empty)
+
+		escape bool
+	)
+
+	if zl.com != "" {
+		comi = copy(com[:], zl.com)
+		zl.com = ""
+	}
+
+	for x, ok := zl.readByte(); ok; x, ok = zl.readByte() {
+		l.line, l.column = zl.line, zl.column
+		l.comment = ""
+
+		if stri >= len(str) {
 			l.token = "token length insufficient for parsing"
 			l.err = true
-			c <- l
-			return
+			return *l, true
 		}
-		if comi >= maxTok {
+		if comi >= len(com) {
 			l.token = "comment length insufficient for parsing"
 			l.err = true
-			c <- l
-			return
+			return *l, true
 		}
 
 		switch x {
 		case ' ', '\t':
-			if escape {
+			if escape || zl.quote {
+				// Inside quotes or escaped this is legal.
+				str[stri] = x
+				stri++
+
 				escape = false
-				str[stri] = x
-				stri++
 				break
 			}
-			if quote {
-				// Inside quotes this is legal
-				str[stri] = x
-				stri++
-				break
-			}
-			if commt {
+
+			if zl.commt {
 				com[comi] = x
 				comi++
 				break
 			}
+
+			var retL lex
 			if stri == 0 {
 				// Space directly in the beginning, handled in the grammar
-			} else if owner {
+			} else if zl.owner {
 				// If we have a string and its the first, make it an owner
 				l.value = zOwner
 				l.token = string(str[:stri])
-				l.tokenUpper = strings.ToUpper(l.token)
-				l.length = stri
+
 				// escape $... start with a \ not a $, so this will work
-				switch l.tokenUpper {
+				switch strings.ToUpper(l.token) {
 				case "$TTL":
 					l.value = zDirTTL
 				case "$ORIGIN":
@@ -554,258 +629,311 @@ func zlexer(s *scan, c chan lex) {
 				case "$GENERATE":
 					l.value = zDirGenerate
 				}
-				c <- l
+
+				retL = *l
 			} else {
 				l.value = zString
 				l.token = string(str[:stri])
-				l.tokenUpper = strings.ToUpper(l.token)
-				l.length = stri
-				if !rrtype {
-					if t, ok := StringToType[l.tokenUpper]; ok {
-						l.value = zRrtpe
-						l.torc = t
-						rrtype = true
-					} else {
-						if strings.HasPrefix(l.tokenUpper, "TYPE") {
-							t, ok := typeToInt(l.token)
-							if !ok {
-								l.token = "unknown RR type"
-								l.err = true
-								c <- l
-								return
-							}
-							l.value = zRrtpe
-							rrtype = true
-							l.torc = t
-						}
-					}
-					if t, ok := StringToClass[l.tokenUpper]; ok {
-						l.value = zClass
-						l.torc = t
-					} else {
-						if strings.HasPrefix(l.tokenUpper, "CLASS") {
-							t, ok := classToInt(l.token)
-							if !ok {
-								l.token = "unknown class"
-								l.err = true
-								c <- l
-								return
-							}
-							l.value = zClass
-							l.torc = t
-						}
-					}
+
+				if !zl.rrtype {
+					tokenUpper := strings.ToUpper(l.token)
+					if t, ok := StringToType[tokenUpper]; ok {
+						l.value = zRrtpe
+						l.torc = t
+
+						zl.rrtype = true
+					} else if strings.HasPrefix(tokenUpper, "TYPE") {
+						t, ok := typeToInt(l.token)
+						if !ok {
+							l.token = "unknown RR type"
+							l.err = true
+							return *l, true
+						}
+
+						l.value = zRrtpe
+						l.torc = t
+
+						zl.rrtype = true
+					}
+
+					if t, ok := StringToClass[tokenUpper]; ok {
+						l.value = zClass
+						l.torc = t
+					} else if strings.HasPrefix(tokenUpper, "CLASS") {
+						t, ok := classToInt(l.token)
+						if !ok {
+							l.token = "unknown class"
+							l.err = true
+							return *l, true
+						}
+
+						l.value = zClass
+						l.torc = t
+					}
 				}
-				c <- l
+
+				retL = *l
 			}
-			stri = 0
-			if !space && !commt {
+
+			zl.owner = false
+
+			if !zl.space {
+				zl.space = true
+
 				l.value = zBlank
 				l.token = " "
-				l.length = 1
-				c <- l
+
+				if retL == (lex{}) {
+					return *l, true
+				}
+
+				zl.nextL = true
+			}
+
+			if retL != (lex{}) {
+				return retL, true
 			}
-			owner = false
-			space = true
 		case ';':
-			if escape {
+			if escape || zl.quote {
+				// Inside quotes or escaped this is legal.
+				str[stri] = x
+				stri++
+
 				escape = false
-				str[stri] = x
-				stri++
 				break
 			}
-			if quote {
-				// Inside quotes this is legal
-				str[stri] = x
-				stri++
-				break
-			}
-			if stri > 0 {
-				l.value = zString
-				l.token = string(str[:stri])
-				l.tokenUpper = strings.ToUpper(l.token)
-				l.length = stri
-				c <- l
-				stri = 0
-			}
-			commt = true
+
+			zl.commt = true
+			zl.com = ""
+
+			if comi > 1 {
+				// A newline was previously seen inside a comment that
+				// was inside braces and we delayed adding it until now.
+				com[comi] = ' ' // convert newline to space
+				comi++
+			}
+
 			com[comi] = ';'
 			comi++
+
+			if stri > 0 {
+				zl.com = string(com[:comi])
+
+				l.value = zString
+				l.token = string(str[:stri])
+
+				return *l, true
+			}
 		case '\r':
 			escape = false
-			if quote {
+
+			if zl.quote {
 				str[stri] = x
 				stri++
 			}
+
 			// discard if outside of quotes
 		case '\n':
 			escape = false
+
 			// Escaped newline
-			if quote {
+			if zl.quote {
 				str[stri] = x
 				stri++
 				break
 			}
-			// inside quotes this is legal
-			if commt {
+
+			if zl.commt {
 				// Reset a comment
-				commt = false
-				rrtype = false
-				stri = 0
+				zl.commt = false
+				zl.rrtype = false
+
 				// If not in a brace this ends the comment AND the RR
-				if brace == 0 {
-					owner = true
+				if zl.brace == 0 {
+					zl.owner = true
+
 					l.value = zNewline
 					l.token = "\n"
-					l.tokenUpper = l.token
-					l.length = 1
 					l.comment = string(com[:comi])
-					c <- l
-					l.comment = ""
-					comi = 0
-					break
+					return *l, true
 				}
-				com[comi] = ' ' // convert newline to space
-				comi++
+
+				zl.com = string(com[:comi])
 				break
 			}
-			if brace == 0 {
+
+			if zl.brace == 0 {
 				// If there is previous text, we should output it here
+				var retL lex
 				if stri != 0 {
 					l.value = zString
 					l.token = string(str[:stri])
-					l.tokenUpper = strings.ToUpper(l.token)
-					l.length = stri
-					if !rrtype {
-						if t, ok := StringToType[l.tokenUpper]; ok {
+
+					if !zl.rrtype {
+						tokenUpper := strings.ToUpper(l.token)
+						if t, ok := StringToType[tokenUpper]; ok {
+							zl.rrtype = true
+
 							l.value = zRrtpe
 							l.torc = t
-							rrtype = true
 						}
 					}
-					c <- l
+
+					retL = *l
 				}
+
 				l.value = zNewline
 				l.token = "\n"
-				l.tokenUpper = l.token
-				l.length = 1
-				c <- l
-				stri = 0
-				commt = false
-				rrtype = false
-				owner = true
-				comi = 0
+
+				l.comment = zl.com
+				zl.com = ""
+				zl.rrtype = false
+				zl.owner = true
+
+				if retL != (lex{}) {
+					zl.nextL = true
+					return retL, true
+				}
+
+				return *l, true
 			}
 		case '\\':
 			// comments do not get escaped chars, everything is copied
-			if commt {
+			if zl.commt {
 				com[comi] = x
 				comi++
 				break
 			}
+
 			// something already escaped must be in string
 			if escape {
 				str[stri] = x
 				stri++
 				escape = false
 				break
 			}
+
 			// something escaped outside of string gets added to string
 			str[stri] = x
 			stri++
+
 			escape = true
 		case '"':
-			if commt {
+			if zl.commt {
 				com[comi] = x
 				comi++
 				break
 			}
+
 			if escape {
 				str[stri] = x
 				stri++
 				escape = false
 				break
 			}
-			space = false
+
+			zl.space = false
+
 			// send previous gathered text and the quote
+			var retL lex
 			if stri != 0 {
 				l.value = zString
 				l.token = string(str[:stri])
-				l.tokenUpper = strings.ToUpper(l.token)
-				l.length = stri
-				c <- l
-				stri = 0
+
+				retL = *l
 			}
+
 			// send quote itself as separate token
 			l.value = zQuote
 			l.token = "\""
-			l.tokenUpper = l.token
-			l.length = 1
-			c <- l
-			quote = !quote
+
+			zl.quote = !zl.quote
+
+			if retL != (lex{}) {
+				zl.nextL = true
+				return retL, true
+			}
+
+			return *l, true
 		case '(', ')':
-			if commt {
+			if zl.commt {
 				com[comi] = x
 				comi++
 				break
 			}
-			if escape {
+
+			if escape || zl.quote {
+				// Inside quotes or escaped this is legal.
 				str[stri] = x
 				stri++
 				escape = false
 				break
 			}
-			if quote {
-				str[stri] = x
-				stri++
-				break
-			}
+
 			switch x {
 			case ')':
-				brace--
-				if brace < 0 {
+				zl.brace--
+
+				if zl.brace < 0 {
 					l.token = "extra closing brace"
-					l.tokenUpper = l.token
 					l.err = true
-					c <- l
-					return
+					return *l, true
 				}
 			case '(':
-				brace++
+				zl.brace++
 			}
 		default:
 			escape = false
-			if commt {
+
+			if zl.commt {
 				com[comi] = x
 				comi++
 				break
 			}
+
 			str[stri] = x
 			stri++
-			space = false
+
+			zl.space = false
 		}
-		x, err = s.tokenText()
 	}
+
+	var retL lex
 	if stri > 0 {
-		// Send remainder
-		l.token = string(str[:stri])
-		l.tokenUpper = strings.ToUpper(l.token)
-		l.length = stri
+		// Send remainder of str
 		l.value = zString
-		c <- l
+		l.token = string(str[:stri])
+
+		retL = *l
+
+		if comi <= 0 {
+			return retL, true
+		}
 	}
-	if brace != 0 {
+
+	if comi > 0 {
+		// Send remainder of com
+		l.value = zNewline
+		l.token = "\n"
+		l.comment = string(com[:comi])
+
+		if retL != (lex{}) {
+			zl.nextL = true
+			return retL, true
+		}
+
+		return *l, true
+	}
+
+	if zl.brace != 0 {
+		l.comment = "" // in case there was left over string and comment
+
 		l.token = "unbalanced brace"
-		l.tokenUpper = l.token
 		l.err = true
-		c <- l
+		return *l, true
 	}
+
+	return lex{value: zEOF}, false
 }
 
 // Extract the class number from CLASSxx
@@ -966,12 +1094,12 @@ func locCheckEast(token string, longitude uint32) (uint32, bool) {
 }
 
 // "Eat" the rest of the "line". Return potential comments
-func slurpRemainder(c chan lex, f string) (*ParseError, string) {
-	l := <-c
+func slurpRemainder(c *zlexer, f string) (*ParseError, string) {
+	l, _ := c.Next()
 	com := ""
 	switch l.value {
 	case zBlank:
-		l = <-c
+		l, _ = c.Next()
 		com = l.comment
 		if l.value != zNewline && l.value != zEOF {
 			return &ParseError{f, "garbage after rdata", l}, ""

File diff suppressed because it is too large

@@ -1,6 +1,7 @@
 package dns
 
 import (
+	"io"
 	"io/ioutil"
 	"net"
 	"os"
@@ -89,3 +90,47 @@ func TestParseTA(t *testing.T) {
 		t.Fatal(`expected a normal RR, but got nil`)
 	}
 }
+
+var errTestReadError = &Error{"test error"}
+
+type errReader struct{}
+
+func (errReader) Read(p []byte) (int, error) { return 0, errTestReadError }
+
+func TestParseZoneReadError(t *testing.T) {
+	rr, err := ReadRR(errReader{}, "")
+	if err == nil || !strings.Contains(err.Error(), errTestReadError.Error()) {
+		t.Errorf("expected error to contain %q, but got %v", errTestReadError, err)
+	}
+	if rr != nil {
+		t.Errorf("expected a nil RR, but got %v", rr)
+	}
+}
+
+func BenchmarkNewRR(b *testing.B) {
+	const name1 = "12345678901234567890123456789012345.12345678.123."
+	const s = name1 + " 3600 IN MX 10 " + name1
+
+	for n := 0; n < b.N; n++ {
+		_, err := NewRR(s)
+		if err != nil {
+			b.Fatal(err)
+		}
+	}
+}
+
+func BenchmarkReadRR(b *testing.B) {
+	const name1 = "12345678901234567890123456789012345.12345678.123."
+	const s = name1 + " 3600 IN MX 10 " + name1 + "\n"
+
+	for n := 0; n < b.N; n++ {
+		r := struct{ io.Reader }{strings.NewReader(s)}
+
+		// r is now only an io.Reader and won't benefit from the
+		// io.ByteReader special-case in zlexer.Next.
+		_, err := ReadRR(r, "")
+		if err != nil {
+			b.Fatal(err)
+		}
+	}
+}

@@ -1,56 +0,0 @@
-package dns
-
-// Implement a simple scanner, return a byte stream from an io reader.
-
-import (
-	"bufio"
-	"context"
-	"io"
-	"text/scanner"
-)
-
-type scan struct {
-	src      *bufio.Reader
-	position scanner.Position
-	eof      bool // Have we just seen a eof
-	ctx      context.Context
-}
-
-func scanInit(r io.Reader) (*scan, context.CancelFunc) {
-	s := new(scan)
-	s.src = bufio.NewReader(r)
-	s.position.Line = 1
-
-	ctx, cancel := context.WithCancel(context.Background())
-	s.ctx = ctx
-
-	return s, cancel
-}
-
-// tokenText returns the next byte from the input
-func (s *scan) tokenText() (byte, error) {
-	c, err := s.src.ReadByte()
-	if err != nil {
-		return c, err
-	}
-
-	select {
-	case <-s.ctx.Done():
-		return c, context.Canceled
-	default:
-		break
-	}
-
-	// delay the newline handling until the next token is delivered,
-	// fixes off-by-one errors when reporting a parse error.
-	if s.eof {
-		s.position.Line++
-		s.position.Column = 0
-		s.eof = false
-	}
-	if c == '\n' {
-		s.eof = true
-		return c, nil
-	}
-	s.position.Column++
-	return c, nil
-}