From b22c4c43077f5ac3fb7aeb129afec7ea2e1a024a Mon Sep 17 00:00:00 2001 From: Nick Craig-Wood Date: Mon, 19 Jun 2017 17:36:14 +0100 Subject: [PATCH] http: fix, tidy and rework ready for release * Fix remaining problems * Refactor to make testing easier and add a test suite * Make path parsing more robust. * Add single file operations * Add MimeType reading for objects * Add documentation * Note go1.7+ is required to build --- README.md | 1 + bin/make_manual.py | 3 +- cmd/cmd.go | 1 + docs/content/about.md | 1 + docs/content/docs.md | 1 + docs/content/http.md | 137 +++++++++ docs/content/overview.md | 2 + docs/layouts/chrome/navbar.html | 1 + http/http.go | 422 ++++++++++++--------------- http/http_internal_test.go | 308 +++++++++++++++++++ http/http_unsupported.go | 6 + http/test/files/four/underfour.txt | 1 + http/test/files/one.txt | 1 + http/test/files/three/underthree.txt | 1 + http/test/files/two.html | 1 + http/test/index_files/apache.html | 28 ++ http/test/index_files/caddy.html | 378 ++++++++++++++++++++++++ http/test/index_files/empty.html | 0 http/test/index_files/memstore.html | 77 +++++ http/test/index_files/nginx.html | 12 + 20 files changed, 1148 insertions(+), 234 deletions(-) create mode 100644 docs/content/http.md create mode 100644 http/http_internal_test.go create mode 100644 http/http_unsupported.go create mode 100644 http/test/files/four/underfour.txt create mode 100644 http/test/files/one.txt create mode 100644 http/test/files/three/underthree.txt create mode 100644 http/test/files/two.html create mode 100644 http/test/index_files/apache.html create mode 100644 http/test/index_files/caddy.html create mode 100644 http/test/index_files/empty.html create mode 100644 http/test/index_files/memstore.html create mode 100644 http/test/index_files/nginx.html diff --git a/README.md b/README.md index d0920d2e9..71ea3a568 100644 --- a/README.md +++ b/README.md @@ -27,6 +27,7 @@ Rclone is a command line program to sync files and directories to and from * 
Yandex Disk * SFTP * FTP + * HTTP * The local filesystem Features diff --git a/bin/make_manual.py b/bin/make_manual.py index 8c38ba8b2..2b35753cb 100755 --- a/bin/make_manual.py +++ b/bin/make_manual.py @@ -30,8 +30,9 @@ docs = [ "b2.md", "yandex.md", "sftp.md", - "crypt.md", "ftp.md", + "http.md", + "crypt.md", "local.md", "changelog.md", "bugs.md", diff --git a/cmd/cmd.go b/cmd/cmd.go index e2bc2cc0a..dc02c1f57 100644 --- a/cmd/cmd.go +++ b/cmd/cmd.go @@ -53,6 +53,7 @@ from various cloud storage systems and using file transfer services, such as: * Yandex Disk * SFTP * FTP + * HTTP * The local filesystem Features diff --git a/docs/content/about.md b/docs/content/about.md index 4079a8d91..c9f6e3ea3 100644 --- a/docs/content/about.md +++ b/docs/content/about.md @@ -25,6 +25,7 @@ Rclone is a command line program to sync files and directories to and from * Yandex Disk * SFTP * FTP + * HTTP * The local filesystem Features diff --git a/docs/content/docs.md b/docs/content/docs.md index 6c55a323a..34d941aaa 100644 --- a/docs/content/docs.md +++ b/docs/content/docs.md @@ -32,6 +32,7 @@ See the following for detailed instructions for * [Yandex Disk](/yandex/) * [SFTP](/sftp/) * [FTP](/ftp/) + * [HTTP](/http/) * [Crypt](/crypt/) - to encrypt other remotes Usage diff --git a/docs/content/http.md b/docs/content/http.md new file mode 100644 index 000000000..27944e907 --- /dev/null +++ b/docs/content/http.md @@ -0,0 +1,137 @@ +--- +title: "HTTP Remote" +description: "Read only remote for HTTP servers" +date: "2017-06-19" +--- + + HTTP +------------------------------------------------- + +The HTTP remote is a read only remote for reading files of a +webserver. The webserver should provide file listings which rclone +will read and turn into a remote. This has been tested with common +webservers such as Apache/Nginx/Caddy and will likely work with file +listings from most web servers. (If it doesn't then please file an +issue, or send a pull request!) 
+ +Paths are specified as `remote:` or `remote:path/to/dir`. + +Here is an example of how to make a remote called `remote`. First +run: + + rclone config + +This will guide you through an interactive setup process: + +``` +No remotes found - make a new one +n) New remote +s) Set configuration password +q) Quit config +n/s/q> n +name> remote +Type of storage to configure. +Choose a number from below, or type in your own value + 1 / Amazon Drive + \ "amazon cloud drive" + 2 / Amazon S3 (also Dreamhost, Ceph, Minio) + \ "s3" + 3 / Backblaze B2 + \ "b2" + 4 / Dropbox + \ "dropbox" + 5 / Encrypt/Decrypt a remote + \ "crypt" + 6 / FTP Connection + \ "ftp" + 7 / Google Cloud Storage (this is not Google Drive) + \ "google cloud storage" + 8 / Google Drive + \ "drive" + 9 / Hubic + \ "hubic" +10 / Local Disk + \ "local" +11 / Microsoft OneDrive + \ "onedrive" +12 / Openstack Swift (Rackspace Cloud Files, Memset Memstore, OVH) + \ "swift" +13 / SSH/SFTP Connection + \ "sftp" +14 / Yandex Disk + \ "yandex" +15 / http Connection + \ "http" +Storage> http +URL of http host to connect to +Choose a number from below, or type in your own value + 1 / Connect to example.com + \ "https://example.com" +url> https://beta.rclone.org +Remote config +-------------------- +[remote] +url = https://beta.rclone.org +-------------------- +y) Yes this is OK +e) Edit this remote +d) Delete this remote +y/e/d> y +Current remotes: + +Name Type +==== ==== +remote http + +e) Edit existing remote +n) New remote +d) Delete remote +r) Rename remote +c) Copy remote +s) Set configuration password +q) Quit config +e/n/d/r/c/s/q> q +``` + +This remote is called `remote` and can now be used like this + +See all the top level directories + + rclone lsd remote: + +List the contents of a directory + + rclone ls remote:directory + +Sync the remote `directory` to `/home/local/directory`, deleting any excess files. 
+ + rclone sync remote:directory /home/local/directory + +### Read only ### + +This remote is read only - you can't upload files to an HTTP server. + +### Modified time ### + +Most HTTP servers store time accurate to 1 second. + +### Checksum ### + +No checksums are stored. + +### Usage without a config file ### + +Note that since only two environment variables need to be set, it is +easy to use without a config file like this. + +``` +RCLONE_CONFIG_ZZ_TYPE=http RCLONE_CONFIG_ZZ_URL=https://beta.rclone.org rclone lsd zz: +``` + +Or if you prefer + +``` +export RCLONE_CONFIG_ZZ_TYPE=http +export RCLONE_CONFIG_ZZ_URL=https://beta.rclone.org +rclone lsd zz: +``` diff --git a/docs/content/overview.md b/docs/content/overview.md index 971c991c7..f2515fac2 100644 --- a/docs/content/overview.md +++ b/docs/content/overview.md @@ -29,6 +29,7 @@ Here is an overview of the major features of each cloud storage system. | Yandex Disk | MD5 | Yes | No | No | R/W | | SFTP | - | Yes | Depends | No | - | | FTP | - | No | Yes | No | - | +| HTTP | - | No | Yes | No | R | | The local filesystem | All | Yes | Depends | No | - | ### Hash ### @@ -122,6 +123,7 @@ operations more efficient. | Yandex Disk | Yes | No | No | No | No [#575](https://github.com/ncw/rclone/issues/575) | Yes | | SFTP | No | No | Yes | Yes | No | No | | FTP | No | No | Yes | Yes | No | No | +| HTTP | No | No | No | No | No | No | | The local filesystem | Yes | No | Yes | Yes | No | No | diff --git a/docs/layouts/chrome/navbar.html b/docs/layouts/chrome/navbar.html index e699d6039..cc9c56c2d 100644 --- a/docs/layouts/chrome/navbar.html +++ b/docs/layouts/chrome/navbar.html @@ -62,6 +62,7 @@
  • Yandex Disk
  • SFTP
  • FTP
+ • HTTP
  • Crypt (encrypts the above)
  • diff --git a/http/http.go b/http/http.go index 87f8f4f7c..b10373b85 100644 --- a/http/http.go +++ b/http/http.go @@ -1,18 +1,16 @@ // Package http provides a filesystem interface using golang.org/net/http // -// It treads HTML pages served from the endpoint as directory +// It treats HTML pages served from the endpoint as directory // listings, and includes any links found as files. -// +build !plan9 +// +build go1.7 package http import ( - "fmt" "io" "net/http" "net/url" - "os" "path" "strconv" "strings" @@ -23,7 +21,10 @@ import ( "golang.org/x/net/html" ) -var errorReadOnly = errors.New("http remotes are read only") +var ( + errorReadOnly = errors.New("http remotes are read only") + timeUnset = time.Unix(0, 0) +) func init() { fsi := &fs.RegInfo{ @@ -31,7 +32,7 @@ func init() { Description: "http Connection", NewFs: NewFs, Options: []fs.Option{{ - Name: "endpoint", + Name: "url", Help: "URL of http host to connect to", Optional: false, Examples: []fs.OptionExample{{ @@ -54,49 +55,86 @@ type Fs struct { // Object is a remote object that has been stat'd (so it exists, but is not necessarily open for reading) type Object struct { - fs *Fs - remote string - info os.FileInfo + fs *Fs + remote string + size int64 + modTime time.Time + contentType string } -// ObjectReader holds the File interface to a remote http file opened for reading -type ObjectReader struct { - object *Object - httpFile io.ReadCloser -} - -func urlJoin(u *url.URL, paths ...string) string { - r := u - for _, p := range paths { - if p == "/" { - continue - } - rel, _ := url.Parse(p) - r = r.ResolveReference(rel) +// Join a URL and a path returning a new URL +func urlJoin(base *url.URL, path string) *url.URL { + rel, err := url.Parse(path) + if err != nil { + fs.Errorf(nil, "Error parsing %q as URL: %v", path, err) } - return r.String() + return base.ResolveReference(rel) +} + +// statusError returns an error if the res contained an error +func statusError(res *http.Response, err error) error { 
+ if err != nil { + return err + } + if res.StatusCode < 200 || res.StatusCode > 299 { + _ = res.Body.Close() + return errors.Errorf("HTTP Error %d: %s", res.StatusCode, res.Status) + } + return nil } // NewFs creates a new Fs object from the name and root. It connects to // the host specified in the config file. func NewFs(name, root string) (fs.Fs, error) { - endpoint := fs.ConfigFileGet(name, "endpoint") + endpoint := fs.ConfigFileGet(name, "url") + if !strings.HasSuffix(endpoint, "/") { + endpoint += "/" + } - u, err := url.Parse(endpoint) + // Parse the endpoint and stick the root onto it + base, err := url.Parse(endpoint) + if err != nil { + return nil, err + } + rootURL, err := url.Parse(root) + if err != nil { + return nil, err + } + u := base.ResolveReference(rootURL) + + client := fs.Config.Client() + + var isFile = false + if !strings.HasSuffix(u.String(), "/") { + // Make a client which doesn't follow redirects so the server + // doesn't redirect http://host/dir to http://host/dir/ + noRedir := *client + noRedir.CheckRedirect = func(req *http.Request, via []*http.Request) error { + return http.ErrUseLastResponse + } + // check to see if points to a file + res, err := noRedir.Head(u.String()) + err = statusError(res, err) + if err == nil { + isFile = true + } + } + + newRoot := u.String() + if isFile { + // Point to the parent if this is a file + newRoot, _ = path.Split(u.String()) + } else { + if !strings.HasSuffix(newRoot, "/") { + newRoot += "/" + } + } + + u, err = url.Parse(newRoot) if err != nil { return nil, err } - if !strings.HasSuffix(root, "/") && root != "" { - root += "/" - } - - client := fs.Config.Client() - - _, err = client.Head(urlJoin(u, root)) - if err != nil { - return nil, errors.Wrap(err, "couldn't connect http") - } f := &Fs{ name: name, root: root, @@ -104,6 +142,9 @@ func NewFs(name, root string) (fs.Fs, error) { endpoint: u, } f.features = (&fs.Features{}).Fill(f) + if isFile { + return f, fs.ErrorIsFile + } return f, nil } @@ 
-119,7 +160,7 @@ func (f *Fs) Root() string { // String returns the URL for the filesystem func (f *Fs) String() string { - return urlJoin(f.endpoint, f.root) + return f.endpoint.String() } // Features returns the optional features of this Fs @@ -145,51 +186,6 @@ func (f *Fs) NewObject(remote string) (fs.Object, error) { return o, nil } -// dirExists returns true,nil if the directory exists, false, nil if -// it doesn't or false, err -func (f *Fs) dirExists(dir string) (bool, error) { - res, err := f.httpClient.Head(urlJoin(f.endpoint, dir)) - if err != nil { - return false, err - } - if res.StatusCode == http.StatusOK { - return true, nil - } - return false, nil -} - -type entry struct { - name string - url string - size int64 - mode os.FileMode - mtime int64 -} - -func (e *entry) Name() string { - return e.name -} - -func (e *entry) Size() int64 { - return e.size -} - -func (e *entry) Mode() os.FileMode { - return os.FileMode(e.mode) -} - -func (e *entry) ModTime() time.Time { - return time.Unix(e.mtime, 0) -} - -func (e *entry) IsDir() bool { - return e.mode&os.ModeDir != 0 -} - -func (e *entry) Sys() interface{} { - return nil -} - func parseInt64(s string) int64 { n, e := strconv.ParseInt(s, 10, 64) if e != nil { @@ -198,84 +194,95 @@ func parseInt64(s string) int64 { return n } -func parseBool(s string) bool { - b, e := strconv.ParseBool(s) - if e != nil { - return false +// parseName turns a name as found in the page into a remote path or returns false +func parseName(base *url.URL, val string) (string, bool) { + name, err := url.QueryUnescape(val) + if err != nil { + return "", false } - return b -} - -func prepareTimeString(ts string) string { - return strings.Trim(strings.Join(strings.SplitN(strings.Trim(ts, "\t "), " ", 3)[0:2], " "), "\r\n\t ") -} - -func parseTime(n *html.Node) (t time.Time) { - if ts := prepareTimeString(n.Data); ts != "" { - t, _ = time.Parse("2-Jan-2006 15:04", ts) + u := urlJoin(base, name) + uStr := u.String() + if 
strings.Index(uStr, "?") >= 0 { + return "", false } - return t + baseStr := base.String() + // check has URL prefix + if !strings.HasPrefix(uStr, baseStr) { + return "", false + } + // check has path prefix + if !strings.HasPrefix(u.Path, base.Path) { + return "", false + } + // calculate the name relative to the base + name = u.Path[len(base.Path):] + // mustn't be empty + if name == "" { + return "", false + } + // mustn't contain a / other than a trailing one + slash := strings.Index(name, "/") + if slash >= 0 && slash != len(name)-1 { + return "", false + } + return name, true } -func (f *Fs) readDir(p string) ([]*entry, error) { - entries := make([]*entry, 0) - res, err := f.httpClient.Get(urlJoin(f.endpoint, p)) +// Parse turns HTML for a directory into names +// base should be the base URL to resolve any relative names from +func parse(base *url.URL, in io.Reader) (names []string, err error) { + doc, err := html.Parse(in) if err != nil { return nil, err } - if res.Body == nil || res.StatusCode != http.StatusOK { - //return nil, errors.Errorf("directory listing failed with error: % (%d)", res.Status, res.StatusCode) - return nil, nil + var walk func(*html.Node) + walk = func(n *html.Node) { + if n.Type == html.ElementNode && n.Data == "a" { + for _, a := range n.Attr { + if a.Key == "href" { + name, ok := parseName(base, a.Val) + if ok { + names = append(names, name) + } + break + } + } + } + for c := n.FirstChild; c != nil; c = c.NextSibling { + walk(c) + } + } + walk(doc) + return names, nil +} + +// Read the directory passed in +func (f *Fs) readDir(dir string) (names []string, err error) { + u := urlJoin(f.endpoint, dir) + if !strings.HasSuffix(u.String(), "/") { + return nil, errors.Errorf("internal error: readDir URL %q didn't end in /", u.String()) + } + res, err := f.httpClient.Get(u.String()) + if err == nil && res.StatusCode == http.StatusNotFound { + return nil, fs.ErrorDirNotFound + } + err = statusError(res, err) + if err != nil { + return nil, errors.Wrap(err, "failed to 
readDir") } defer fs.CheckClose(res.Body, &err) - switch strings.SplitN(res.Header.Get("Content-Type"), ";", 2)[0] { + contentType := strings.SplitN(res.Header.Get("Content-Type"), ";", 2)[0] + switch contentType { case "text/html": - doc, err := html.Parse(res.Body) + names, err = parse(u, res.Body) if err != nil { - return nil, err + return nil, errors.Wrap(err, "readDir") } - var walk func(*html.Node) - walk = func(n *html.Node) { - if n.Type == html.ElementNode && n.Data == "a" { - for _, a := range n.Attr { - if a.Key == "href" { - name, err := url.QueryUnescape(a.Val) - if err != nil { - continue - } - if name == "../" || name == "./" || name == ".." { - break - } - if strings.Index(name, "?") >= 0 || strings.HasPrefix(name, "http") { - break - } - u, err := url.Parse(name) - if err != nil { - break - } - name = path.Clean(u.Path) - e := &entry{ - name: strings.TrimRight(name, "/"), - url: name, - } - if a.Val[len(a.Val)-1] == '/' { - e.mode = os.FileMode(0555) | os.ModeDir - } else { - e.mode = os.FileMode(0444) - } - entries = append(entries, e) - break - } - } - } - for c := n.FirstChild; c != nil; c = c.NextSibling { - walk(c) - } - } - walk(doc) + default: + return nil, errors.Errorf("Can't parse content type %q", contentType) } - return entries, nil + return names, nil } // List the objects and directories in dir into entries. The @@ -288,36 +295,21 @@ func (f *Fs) readDir(p string) ([]*entry, error) { // This should return ErrDirNotFound if the directory isn't // found. 
func (f *Fs) List(dir string) (entries fs.DirEntries, err error) { - endpoint := path.Join(f.root, dir) - if !strings.HasSuffix(dir, "/") { - endpoint += "/" + if !strings.HasSuffix(dir, "/") && dir != "" { + dir += "/" } - ok, err := f.dirExists(endpoint) - if err != nil { - return nil, errors.Wrap(err, "List failed") - } - if !ok { - return nil, fs.ErrorDirNotFound - } - httpDir := path.Join(f.root, dir) - if !strings.HasSuffix(dir, "/") { - httpDir += "/" - } - infos, err := f.readDir(httpDir) + names, err := f.readDir(dir) if err != nil { return nil, errors.Wrapf(err, "error listing %q", dir) } - for _, info := range infos { - remote := "" - if dir != "" { - remote = dir + "/" + info.Name() - } else { - remote = info.Name() - } - if info.IsDir() { + for _, name := range names { + isDir := name[len(name)-1] == '/' + name = strings.TrimRight(name, "/") + remote := path.Join(dir, name) + if isDir { dir := &fs.Dir{ Name: remote, - When: info.ModTime(), + When: timeUnset, Bytes: 0, Count: 0, } @@ -326,7 +318,6 @@ func (f *Fs) List(dir string) (entries fs.DirEntries, err error) { file := &Object{ fs: f, remote: remote, - info: info, } if err = file.stat(); err != nil { continue @@ -371,12 +362,12 @@ func (o *Object) Hash(r fs.HashType) (string, error) { // Size returns the size in bytes of the remote http file func (o *Object) Size() int64 { - return o.info.Size() + return o.size } // ModTime returns the modification time of the remote http file func (o *Object) ModTime() time.Time { - return o.info.ModTime() + return o.modTime } // path returns the native path of the object @@ -386,37 +377,19 @@ func (o *Object) path() string { // stat updates the info field in the Object func (o *Object) stat() error { - endpoint := urlJoin(o.fs.endpoint, o.fs.root, o.remote) - if o.info.IsDir() { - endpoint += "/" - } + endpoint := urlJoin(o.fs.endpoint, o.remote).String() res, err := o.fs.httpClient.Head(endpoint) + err = statusError(res, err) if err != nil { - return err + 
return errors.Wrap(err, "failed to stat") } - if res.StatusCode != http.StatusOK { - return errors.New("failed to stat") - } - var mtime int64 t, err := http.ParseTime(res.Header.Get("Last-Modified")) if err != nil { - mtime = 0 - } else { - mtime = t.Unix() + t = timeUnset } - size := parseInt64(res.Header.Get("Content-Length")) - e := &entry{ - name: o.remote, - size: size, - mtime: mtime, - mode: os.FileMode(0444), - } - if strings.HasSuffix(o.remote, "/") { - e.mode = os.FileMode(0555) | os.ModeDir - e.size = 0 - e.name = o.remote[:len(o.remote)-1] - } - o.info = e + o.size = parseInt64(res.Header.Get("Content-Length")) + o.modTime = t + o.contentType = res.Header.Get("Content-Type") return nil } @@ -429,52 +402,29 @@ func (o *Object) SetModTime(modTime time.Time) error { // Storable returns whether the remote http file is a regular file (not a directory, symbolic link, block device, character device, named pipe, etc) func (o *Object) Storable() bool { - return o.info.Mode().IsRegular() -} - -// Read from a remote http file object reader -func (file *ObjectReader) Read(p []byte) (n int, err error) { - n, err = file.httpFile.Read(p) - return n, err -} - -// Close a reader of a remote http file -func (file *ObjectReader) Close() (err error) { - return file.httpFile.Close() + return true } // Open a remote http file object for reading. 
Seek is supported func (o *Object) Open(options ...fs.OpenOption) (in io.ReadCloser, err error) { - var offset int64 - endpoint := urlJoin(o.fs.endpoint, o.fs.root, o.remote) - offset = 0 - for _, option := range options { - switch x := option.(type) { - case *fs.SeekOption: - offset = x.Offset - default: - if option.Mandatory() { - fs.Logf(o, "Unsupported mandatory option: %v", option) - } - } - } - + endpoint := urlJoin(o.fs.endpoint, o.remote).String() req, err := http.NewRequest("GET", endpoint, nil) if err != nil { return nil, errors.Wrap(err, "Open failed") } - if offset > 0 { - req.Header.Set("Range", fmt.Sprintf("bytes=%d-", offset)) + + // Add optional headers + for k, v := range fs.OpenOptionHeaders(options) { + req.Header.Add(k, v) } + + // Do the request res, err := o.fs.httpClient.Do(req) + err = statusError(res, err) if err != nil { return nil, errors.Wrap(err, "Open failed") } - in = &ObjectReader{ - object: o, - httpFile: res.Body, - } - return in, nil + return res.Body, nil } // Hashes returns fs.HashNone to indicate remote hashing is unavailable @@ -502,8 +452,14 @@ func (o *Object) Update(in io.Reader, src fs.ObjectInfo, options ...fs.OpenOptio return errorReadOnly } +// MimeType of an Object if known, "" otherwise +func (o *Object) MimeType() string { + return o.contentType +} + // Check the interfaces are satisfied var ( - _ fs.Fs = &Fs{} - _ fs.Object = &Object{} + _ fs.Fs = &Fs{} + _ fs.Object = &Object{} + _ fs.MimeTyper = &Object{} ) diff --git a/http/http_internal_test.go b/http/http_internal_test.go new file mode 100644 index 000000000..1cc8c8fcb --- /dev/null +++ b/http/http_internal_test.go @@ -0,0 +1,308 @@ +// +build go1.7 + +package http + +import ( + "fmt" + "io/ioutil" + "net/http" + "net/http/httptest" + "net/url" + "os" + "path/filepath" + "sort" + "testing" + "time" + + "github.com/ncw/rclone/fs" + "github.com/ncw/rclone/fstest" + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" +) + +var ( + 
remoteName = "TestHTTP" + testPath = "test" + filesPath = filepath.Join(testPath, "files") +) + +// prepareServer the test server and return a function to tidy it up afterwards +func prepareServer(t *testing.T) func() { + // file server for test/files + fileServer := http.FileServer(http.Dir(filesPath)) + + // Make the test server + ts := httptest.NewServer(fileServer) + + // Configure the remote + fs.LoadConfig() + // fs.Config.LogLevel = fs.LogLevelDebug + // fs.Config.DumpHeaders = true + // fs.Config.DumpBodies = true + fs.ConfigFileSet(remoteName, "type", "http") + fs.ConfigFileSet(remoteName, "url", ts.URL) + + // return a function to tidy up + return ts.Close +} + +// prepare the test server and return a function to tidy it up afterwards +func prepare(t *testing.T) (fs.Fs, func()) { + tidy := prepareServer(t) + + // Instantiate it + f, err := NewFs(remoteName, "") + require.NoError(t, err) + + return f, tidy +} + +func testListRoot(t *testing.T, f fs.Fs) { + entries, err := f.List("") + require.NoError(t, err) + + sort.Sort(entries) + + require.Equal(t, 4, len(entries)) + + e := entries[0] + assert.Equal(t, "four", e.Remote()) + assert.Equal(t, int64(0), e.Size()) + _, ok := e.(*fs.Dir) + assert.True(t, ok) + + e = entries[1] + assert.Equal(t, "one.txt", e.Remote()) + assert.Equal(t, int64(6), e.Size()) + _, ok = e.(*Object) + assert.True(t, ok) + + e = entries[2] + assert.Equal(t, "three", e.Remote()) + assert.Equal(t, int64(0), e.Size()) + _, ok = e.(*fs.Dir) + assert.True(t, ok) + + e = entries[3] + assert.Equal(t, "two.html", e.Remote()) + assert.Equal(t, int64(7), e.Size()) + _, ok = e.(*Object) + assert.True(t, ok) +} + +func TestListRoot(t *testing.T) { + f, tidy := prepare(t) + defer tidy() + testListRoot(t, f) +} + +func TestListSubDir(t *testing.T) { + f, tidy := prepare(t) + defer tidy() + + entries, err := f.List("three") + require.NoError(t, err) + + sort.Sort(entries) + + assert.Equal(t, 1, len(entries)) + + e := entries[0] + assert.Equal(t, 
"three/underthree.txt", e.Remote()) + assert.Equal(t, int64(9), e.Size()) + _, ok := e.(*Object) + assert.True(t, ok) +} + +func TestNewObject(t *testing.T) { + f, tidy := prepare(t) + defer tidy() + + o, err := f.NewObject("four/underfour.txt") + require.NoError(t, err) + + assert.Equal(t, "four/underfour.txt", o.Remote()) + assert.Equal(t, int64(9), o.Size()) + _, ok := o.(*Object) + assert.True(t, ok) + + // Test the time is correct on the object + + tObj := o.ModTime() + + fi, err := os.Stat(filepath.Join(filesPath, "four", "underfour.txt")) + require.NoError(t, err) + tFile := fi.ModTime() + + dt, ok := fstest.CheckTimeEqualWithPrecision(tObj, tFile, time.Second) + assert.True(t, ok, fmt.Sprintf("%s: Modification time difference too big |%s| > %s (%s vs %s) (precision %s)", o.Remote(), dt, time.Second, tObj, tFile, time.Second)) +} + +func TestOpen(t *testing.T) { + f, tidy := prepare(t) + defer tidy() + + o, err := f.NewObject("four/underfour.txt") + require.NoError(t, err) + + // Test normal read + fd, err := o.Open() + require.NoError(t, err) + data, err := ioutil.ReadAll(fd) + require.NoError(t, fd.Close()) + assert.Equal(t, "beetroot\n", string(data)) + + // Test with range request + fd, err = o.Open(&fs.RangeOption{Start: 1, End: 5}) + require.NoError(t, err) + data, err = ioutil.ReadAll(fd) + require.NoError(t, fd.Close()) + assert.Equal(t, "eetro", string(data)) +} + +func TestMimeType(t *testing.T) { + f, tidy := prepare(t) + defer tidy() + + o, err := f.NewObject("four/underfour.txt") + require.NoError(t, err) + + do, ok := o.(fs.MimeTyper) + require.True(t, ok) + assert.Equal(t, "text/plain; charset=utf-8", do.MimeType()) +} + +func TestIsAFileRoot(t *testing.T) { + tidy := prepareServer(t) + defer tidy() + + f, err := NewFs(remoteName, "one.txt") + assert.Equal(t, err, fs.ErrorIsFile) + + testListRoot(t, f) +} + +func TestIsAFileSubDir(t *testing.T) { + tidy := prepareServer(t) + defer tidy() + + f, err := NewFs(remoteName, "three/underthree.txt") 
+ assert.Equal(t, err, fs.ErrorIsFile) + + entries, err := f.List("") + require.NoError(t, err) + + sort.Sort(entries) + + assert.Equal(t, 1, len(entries)) + + e := entries[0] + assert.Equal(t, "underthree.txt", e.Remote()) + assert.Equal(t, int64(9), e.Size()) + _, ok := e.(*Object) + assert.True(t, ok) +} + +func TestParseName(t *testing.T) { + for i, test := range []struct { + base string + val string + wantOK bool + want string + }{ + {"http://example.com/", "potato", true, "potato"}, + {"http://example.com/dir/", "potato", true, "potato"}, + {"http://example.com/dir/", "../dir/potato", true, "potato"}, + {"http://example.com/dir/", "..", false, ""}, + {"http://example.com/dir/", "http://example.com/", false, ""}, + {"http://example.com/dir/", "http://example.com/dir/", false, ""}, + {"http://example.com/dir/", "http://example.com/dir/potato", true, "potato"}, + {"http://example.com/dir/", "/dir/", false, ""}, + {"http://example.com/dir/", "/dir/potato", true, "potato"}, + {"http://example.com/dir/", "subdir/potato", false, ""}, + } { + u, err := url.Parse(test.base) + require.NoError(t, err) + got, gotOK := parseName(u, test.val) + what := fmt.Sprintf("test %d base=%q, val=%q", i, test.base, test.val) + assert.Equal(t, test.wantOK, gotOK, what) + assert.Equal(t, test.want, got, what) + } +} + +// Load HTML from the file given and parse it, checking it against the entries passed in +func parseHTML(t *testing.T, name string, base string, want []string) { + in, err := os.Open(filepath.Join(testPath, "index_files", name)) + require.NoError(t, err) + defer func() { + require.NoError(t, in.Close()) + }() + if base == "" { + base = "http://example.com/" + } + u, err := url.Parse(base) + require.NoError(t, err) + entries, err := parse(u, in) + require.NoError(t, err) + assert.Equal(t, want, entries) +} + +func TestParseEmpty(t *testing.T) { + parseHTML(t, "empty.html", "", []string(nil)) +} + +func TestParseApache(t *testing.T) { + parseHTML(t, "apache.html", 
"http://example.com/nick/pub/", []string{ + "SWIG-embed.tar.gz", + "avi2dvd.pl", + "cambert.exe", + "cambert.gz", + "fedora_demo.gz", + "gchq-challenge/", + "mandelterm/", + "pgp-key.txt", + "pymath/", + "rclone", + "readdir.exe", + "rush_hour_solver_cut_down.py", + "snake-puzzle/", + "stressdisk/", + "timer-test", + "words-to-regexp.pl", + }) +} + +func TestParseMemstore(t *testing.T) { + parseHTML(t, "memstore.html", "", []string{ + "test/", + "v1.35/", + "v1.36-01-g503cd84/", + "rclone-beta-latest-freebsd-386.zip", + "rclone-beta-latest-freebsd-amd64.zip", + "rclone-beta-latest-windows-amd64.zip", + }) +} + +func TestParseNginx(t *testing.T) { + parseHTML(t, "nginx.html", "", []string{ + "deltas/", + "objects/", + "refs/", + "state/", + "config", + "summary", + }) +} + +func TestParseCaddy(t *testing.T) { + parseHTML(t, "caddy.html", "", []string{ + "mimetype.zip", + "rclone-delete-empty-dirs.py", + "rclone-show-empty-dirs.py", + "stat-windows-386.zip", + "v1.36-155-gcf29ee8b-team-driveβ/", + "v1.36-156-gca76b3fb-team-driveβ/", + "v1.36-156-ge1f0e0f5-team-driveβ/", + "v1.36-22-g06ea13a-ssh-agentβ/", + }) +} diff --git a/http/http_unsupported.go b/http/http_unsupported.go new file mode 100644 index 000000000..7699b1dcf --- /dev/null +++ b/http/http_unsupported.go @@ -0,0 +1,6 @@ +// Build for mount for unsupported platforms to stop go complaining +// about "no buildable Go source files " + +// +build !go1.7 + +package http diff --git a/http/test/files/four/underfour.txt b/http/test/files/four/underfour.txt new file mode 100644 index 000000000..748393ce8 --- /dev/null +++ b/http/test/files/four/underfour.txt @@ -0,0 +1 @@ +beetroot diff --git a/http/test/files/one.txt b/http/test/files/one.txt new file mode 100644 index 000000000..ce0136250 --- /dev/null +++ b/http/test/files/one.txt @@ -0,0 +1 @@ +hello diff --git a/http/test/files/three/underthree.txt b/http/test/files/three/underthree.txt new file mode 100644 index 000000000..1031dc5b4 --- /dev/null +++ 
b/http/test/files/three/underthree.txt @@ -0,0 +1 @@ +rutabaga diff --git a/http/test/files/two.html b/http/test/files/two.html new file mode 100644 index 000000000..4bc562871 --- /dev/null +++ b/http/test/files/two.html @@ -0,0 +1 @@ +potato diff --git a/http/test/index_files/apache.html b/http/test/index_files/apache.html new file mode 100644 index 000000000..5f1a46ad7 --- /dev/null +++ b/http/test/index_files/apache.html @@ -0,0 +1,28 @@ + + + + Index of /nick/pub + + +

    Index of /nick/pub

    + + + + + + + + + + + + + + + + + + + +
    [ICO]  Name                          Last modified      Size  Description

    [DIR]  Parent Directory                                 -
    [   ]  SWIG-embed.tar.gz             29-Nov-2005 16:27  2.3K
    [TXT]  avi2dvd.pl                    14-Apr-2010 23:07  17K
    [   ]  cambert.exe                   15-Dec-2006 18:07  54K
    [   ]  cambert.gz                    14-Apr-2010 23:07  18K
    [   ]  fedora_demo.gz                08-Jun-2007 11:01  1.0M
    [DIR]  gchq-challenge/               24-Dec-2016 15:24  -
    [DIR]  mandelterm/                   13-Jul-2013 22:22  -
    [TXT]  pgp-key.txt                   14-Apr-2010 23:07  400
    [DIR]  pymath/                       24-Dec-2016 15:24  -
    [   ]  rclone                        09-May-2017 17:15  22M
    [   ]  readdir.exe                   21-Oct-2016 14:47  1.6M
    [TXT]  rush_hour_solver_cut_down.py  23-Jul-2009 11:44  14K
    [DIR]  snake-puzzle/                 25-Sep-2016 20:56  -
    [DIR]  stressdisk/                   08-Nov-2016 14:25  -
    [   ]  timer-test                    09-May-2017 17:05  1.5M
    [TXT]  words-to-regexp.pl            01-Mar-2005 20:43  6.0K

    + diff --git a/http/test/index_files/caddy.html b/http/test/index_files/caddy.html new file mode 100644 index 000000000..bd7250abf --- /dev/null +++ b/http/test/index_files/caddy.html @@ -0,0 +1,378 @@ + + + + / + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +

    + / +

    +
    +
    +
    +
    + 4 directories + 4 files + +
    +
    +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    + Name + + Size + + Modified +
    + + + mimetype.zip + + 765 KiB
    + + + rclone-delete-empty-dirs.py + + 1.2 KiB
    + + + rclone-show-empty-dirs.py + + 868 B
    + + + stat-windows-386.zip + + 688 KiB
    + + + v1.36-155-gcf29ee8b-team-driveβ + +
    + + + v1.36-156-gca76b3fb-team-driveβ + +
    + + + v1.36-156-ge1f0e0f5-team-driveβ + +
    + + + v1.36-22-g06ea13a-ssh-agentβ + +
    +
    +
    + + + + + diff --git a/http/test/index_files/empty.html b/http/test/index_files/empty.html new file mode 100644 index 000000000..e69de29bb diff --git a/http/test/index_files/memstore.html b/http/test/index_files/memstore.html new file mode 100644 index 000000000..7616cad26 --- /dev/null +++ b/http/test/index_files/memstore.html @@ -0,0 +1,77 @@ + + + + + + Index of / + + +
    +

    Index of /

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Name                                  Type                          Size     Last modified        MD5
    test/                                 application/directory         0 bytes  -                    -
    v1.35/                                application/directory         0 bytes  -                    -
    v1.36-01-g503cd84/                    application/directory         0 bytes  -                    -
    rclone-beta-latest-freebsd-386.zip    application/zip               4.6 MB   2017-06-19 14:04:52  e747003c69c81e675f206a715264bfa8
    rclone-beta-latest-freebsd-amd64.zip  application/zip               5.0 MB   2017-06-19 14:04:53  ff30b5e9bf2863a2373069142e6f2b7f
    rclone-beta-latest-windows-amd64.zip  application/x-zip-compressed  4.9 MB   2017-06-19 13:56:02  851a5547a0495cbbd94cbc90a80ed6f5
    +

    Memset Ltd.

    +
    + + diff --git a/http/test/index_files/nginx.html b/http/test/index_files/nginx.html new file mode 100644 index 000000000..850f7f036 --- /dev/null +++ b/http/test/index_files/nginx.html @@ -0,0 +1,12 @@ + +Index of /atomic/fedora/ + +

    Index of /atomic/fedora/


    ../
    +deltas/                                            04-May-2017 21:37                   -
    +objects/                                           04-May-2017 20:44                   -
    +refs/                                              04-May-2017 20:42                   -
    +state/                                             04-May-2017 21:36                   -
    +config                                             04-May-2017 20:42                 118
    +summary                                            04-May-2017 21:36                 806
    +

    +