*********
ChangeLog
*********
23.0.0 (2023-08-17)
===================
- S3 Improvements: support sub-folder being configured as add-on root
22.0.1 (2022-07-05)
===================
- Fix: Don't call `json_api_serialized` on a list when a folder name conflict is detected.
22.0.0 (2022-06-22)
===================
- Fix: Stop double-reporting certain HTTP errors. Some error paths were resulting in two error
bodies being written out by the server. This could be confusing for free-text errors and
unparseable for json-formatted errors.
- Feature: If an upload request is made, but there is already an existing file with that name,
include metadata about the target file in the 409 Conflict response. This will save the caller a
metadata request if they wish to overwrite the existing file.
- Fix: Add missing read-write JSON-API links to OneDrive files and folders. These were overlooked
when OneDrive was converted from read-only to read-write.
- Fix: Update the key used to fetch the download url for OneDrive file revisions.
- Fix: Update the OneDrive intra-copy code to stop sending auth to the monitoring endpoint and to
expect the 200 OK response that is observed in practice rather than the 303 See Other mentioned in
the api documentation.
- Code: Migrate repo CI from TravisCI to GitHub Actions.
21.2.0 (2021-12-06)
===================
- Feature: Give WaterButler the ability to enforce rate-limits. This requires a reachable redis
instance. See `docs/rate-limiting.rst` for details on setting up and configuring.
21.1.0 (2021-09-30)
===================
- Fix: Stop breaking Google Doc (.gdoc, .gsheet, .gslide) downloads. Google Doc files are not
downloadable in their native format. Instead they must be exported to an externally-supported
format (e.g. .gdoc => .docx, .gsheet => .xlsx, .gslide => .pptx). WB had been assuming that a file
with an undefined size was a google doc, but recently GDrive has started reporting that Google Docs
have a size of 1kb. WB now explicitly checks the mime-type of the requested file to determine
whether to download or export.
21.0.1 (2021-09-14)
===================
- Fix: Properly decode OneDrivePaths for small files uploaded under certain provider root
configurations. Some paths were failing to properly decode unicode and url-encoded characters.
21.0.0 (2021-08-12)
===================
- Feature: Add read/write support to OneDrive provider, adapted from the previous work of
Ryan Casey (@caseyrygt) & Oleksandr Melnykov (@alexandr-melnikov-dev-pro).
- Feature: Update OneDrive provider to support both OneDrive Personal and OneDrive for School or
Business. Switch the provider to use the MS Graph API and update documentation to follow.
- Fix: Return a valid empty zip file when downloading a directory with no contents.
- Code: Remove unneeded `async`s from synchronous Box tests.
- Code: Pin pip to v18.1 on TravisCI to fix dependency resolution.
20.1.0 (2020-11-03)
===================
- Feature: Support storage limits on osfstorage. The OSF now enforces storage quotas for OSF
projects. Update WaterButler to check the project's quota status before executing uploads,
moves, and copies.
20.0.3 (2020-10-21)
===================
- Feature: Add flags to toggle logging of WaterButler events to specific Keen collections.
20.0.2 (2020-06-02)
===================
- Fix: Bump sentry-sdk dependency from v0.14.0 to v0.14.4 to avoid potential memory leak.
20.0.1 (2020-03-09)
===================
- Fix: Extend the default timeout for GET, PUT, and POST requests made by aiohttp. A five minute
timeout was added to aiohttp between versions v0.18.2 and v3.6.2. WaterButler needs to support large
uploads and downloads, so this has been changed to an hour, configurable by the `AIOHTTP_TIMEOUT`
envvar. Global request timeouts should be decided by the proxy set up in front of WB.
20.0.0 (2020-02-18)
===================
- Code: Upgrade core library aiohttp from v0.18.2 to v3.6.2. WaterButler uses aiohttp to make
requests to the external providers. Its interface has changed significantly from v0.18 and much
of the internal `BaseProvider.make_request` code was refactored/rewritten. Each provider has been
updated to use the new `make_request` code.
- The newest aiohttp favors context managers for performing requests, which does not mesh well
with WB's tendency to pass around response objects. To work around this, the core provider now
manually manages session setup and teardown. Sessions are closed when the object is
garbage-collected.
- WB is moving away from using request context managers in provider code, where passing response
objects is more common.
- Update WB's stream classes to be compatible with recent asyncio.StreamReader.
- Code: Requirements / dependency upgrades:
- Upgrade Tornado from v4.3 to v6.0.
- Add support for Newrelic monitoring, made possible by the upgrade to Tornado v6.0.
- Update Sentry logging to replace the deprecated `raven` library with `sentry-sdk`.
- Remove hack for `certifi` library and upgrade to the most recent version.
- Replace `agent` library with core async generators.
- Bump testing libraries to newer versions and update testing code to accommodate.
- Pin some unpinned dependencies in dev-requirements.txt.
- Fix: Many miscellaneous test updates:
- Fix tests where copy-n-paste had gone awry.
- Remove `async` from `async def` tests that never `await`-ed anything.
- Stop saving unused return values into variables.
- Replace `assert $foo.called_` with `$foo.assert_called_`.
- Remove no-longer-valid tests.
- Move common move/copy task-testing code into a conftest.py.
- Fix: Fix invalid pre-decrement of retry count in Box provider.
- Fix: Support Rackspace Cloudfiles as an osfstorage backend. During the transition to the Google
Cloud backend, the bucket-identifying key in the osfstorage configuration changed from `container`
to `bucket`. Re-enable support for `container` to support legacy cloudfiles backends.
- Fix: Support Figshare uploads where the content hashing has not completed before Figshare
responds. Previously, uploads less than 70Mb in size always had the checksum embedded in the
response metadata. Now the availability of the hash appears to be subject to some limit on
Figshare's side. If present, verify that it matches our checksum. If not, bypass the check and
set the `hashingInProgress` metadata field to true.
- Code: Upgrade WB's Dockerfile to be based off python3.6 and Debian buster. Remove manual gosu
installation and fetch from apt-get instead. python3.6 is now the only supported python version.
- Code: Remove old GitLab workarounds. As a side-effect, the GitLab provider no longer supports
`Range` headers for downloads. To support this would require slurping the file contents into
memory (which was done by the workarounds), but this opens WB up to DOS attacks via large files.
19.2.0 (2019-10-07)
===================
- Feature: Support rate-limiting in large move/copies for GitHub. GitHub limits the number of
per-user API requests to 5,000 per hour. To avoid running afoul of these limits, slow down the
rate of requests made by WB as we start to approach the rate-limiting threshold. Reserve a
configurable number of requests for regular requests. If the request limit is exhausted by an
external request, WB will throw a `GitHubRateLimitExceededError` error.
- Fix: Consistently handle revision-identifying query parameters in the GitHub provider. Over its
lifetime the GH provider has supported "ref", "sha", "revision", "version", and (sometimes)
"branch" as valid revision-identifying parameters. With this change, all of those params are
handled in a single method and in a defined fashion. Revisions may be commit SHAs or branch
names. A revision is assumed to be a commit SHA if it is a valid 40-digit base-16 number.
Otherwise it is assumed to be a branch name. See method documentation for how multiple conflicting
parameters are handled.
- Fix: Update figshare provider to report the correct url for publicly-viewable files.
- Fix: Update figshare provider to support "dataset" articles as folders. figshare has many
article types. Previously, only the "fileset" article type represented a folder in the figshare WB
provider. figshare now automatically updates "fileset"s to "dataset"s. Promote "dataset"s to be the
default folder-analogous article type.
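The SHA-versus-branch rule described in the GitHub revision fix above can be sketched roughly as follows. This is an illustration only, not the provider's code: the function name `classify_revision` is hypothetical, and the handling of multiple conflicting parameters is deliberately left out.

```python
import re

# A commit SHA is treated as exactly 40 base-16 digits.
_SHA_RE = re.compile(r"[0-9a-fA-F]{40}")


def classify_revision(revision):
    """Return 'sha' for a 40-digit base-16 string, otherwise 'branch'.

    Hypothetical helper, named for illustration only.
    """
    if _SHA_RE.fullmatch(revision):
        return "sha"
    return "branch"
```

Under this rule a branch whose name happens to be 40 hex digits would be read as a commit SHA, which is the trade-off the entry implies.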
19.1.0 (2019-09-04)
===================
- Fix: Correctly annotate HEAD requests as metadata requests and revision requests as revision
requests in the auth payload sent to the OSF. These were being incorrectly marked as download
requests, causing spurious download counts on OSF files.
19.0.1 (2019-08-07)
===================
- Fix: Fix bug in the Box provider that was causing files uploaded to subdirectories to end up in
the project root instead. This would NOT cause an accidental overwrite; if a file with the same
name was already present in the project root, Box would throw a 409 Conflict error and refuse to
proceed.
19.0.0 (2019-06-11)
===================
- Fix: Bitbucket has retired their v1 API. Update the WaterButler provider to use v2 instead.
- Code: Update WaterButler Dockerfile to explicitly use gnupg v2. The default gpg binary on Debian
jessie is v1, but Debian stretch changes this to v2. Adjust for small differences in how they are
called.
18.0.3 (2019-01-31)
===================
- Fix: Send metadata about request to the OSF auth endpoint to help OSF distinguish file views from
downloads. (thanks, @Johnetordoff!)
18.0.2 (2018-12-14)
===================
- Fix: Get CI running again by adding workaround for bad TravisCI/Boto interaction. See:
https://github.com/travis-ci/travis-ci/issues/7940.
18.0.1 (2018-12-12)
===================
- Fix: Restore the `filename` parameter to WaterButler's Content-Disposition headers. It had been
removed in favor of `filename*`, which can correctly encode multibyte filenames. This broke scripts
that expected the `filename` version. Since `filename` cannot support multibyte characters, those
characters are converted to their nearest ascii equivalent or stripped if no equivalent exists. WB
now returns both forms of the parameter in the header. Clients should prefer `filename*` for the
most accurate representation of the filename. (thanks, @jcohenadad!)
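A header carrying both parameters might be built along these lines. This is a minimal sketch, assuming NFKD normalization as the way to find the "nearest ascii equivalent"; the function name `content_disposition` is hypothetical and WB's actual routine may differ.

```python
import unicodedata
from urllib.parse import quote


def content_disposition(filename):
    """Build a Content-Disposition value with both `filename` and `filename*`.

    Hypothetical helper; the ASCII fallback strategy is an assumption.
    """
    # Decompose accented characters, then drop whatever is still non-ASCII.
    ascii_name = (
        unicodedata.normalize("NFKD", filename)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Escape backslashes and double quotes inside the quoted string.
    ascii_name = ascii_name.replace("\\", "\\\\").replace('"', '\\"')
    # Percent-encode the UTF-8 bytes for the extended parameter.
    encoded = quote(filename, safe="")
    return "attachment; filename=\"{}\"; filename*=UTF-8''{}".format(
        ascii_name, encoded
    )
```

For example, `content_disposition("résumé.txt")` yields a `filename` of `resume.txt` and a `filename*` of `UTF-8''r%C3%A9sum%C3%A9.txt`.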
18.0.0 (2018-10-31)
===================
- UPDATE: WaterButler is now following the CalVer (https://calver.org/) versioning scheme to match
the OSF. All new versions will use the `YY.MINOR.MICRO` format.
- Fix: Teach WaterButler to properly encode non-ascii file names for download requests. WB was
constructing Content-Disposition headers in many places throughout the codebase. Some correctly
encoded non-ascii characters as UTF-8, but many did not. These have been consolidated into a
single routine that can build explicitly-declared UTF-8 encoded Content-Disposition headers.
- Fix: Allow users to set the name of a file when downloading directly from a provider. Some
providers (osfstorage, s3) support signed download urls, which permit the user to download directly
from the provider without passing through WB. WB was failing to relay the `displayName` query
parameter to the upstream providers. It now does so, which allows the user to override the default
download file name. Downloads that are proxied through WB already respect the parameter and are
unchanged by this.
0.42.2 (2018-10-24)
===================
- Feature: Relay MFR-identifying header to OSF when requesting auth on MFR's behalf. This enables
more accurate counting of file download and render requests.
0.42.1 (2018-09-17)
===================
- Fix: Fix file updates and overwrites on Dropbox by making sure to set the "overwrite" mode when
replace-on-conflict semantics are requested.
0.42.0 (2018-09-16)
===================
- Feature: Support multi-regional backend storage buckets in osfstorage. WaterButler now includes
the path and version of the file when requesting auth from the OSF. Different versions of a file
may be stored in different regions and will therefore require different credentials and settings.
This change also requires osfstorage to implement custom move() and copy() methods. Cross-region
osfstorage moves and copies are expected to preserve version history and linked guids. This is
normally handled by the intra_move() and intra_copy() methods, but those methods only operate on
metadata and do not copy data from one region bucket to another. The new move() and copy() methods
take care of copying the file data across regions before calling intra_move() and intra_copy() to
update the metadata.
- Fix: Stop logging folder metadata requests in v1. A misunderstanding of the GET endpoint for
folders was causing some folder metadata requests to be logged as download-as-zip requests.
- Code: Remove cold-archiving and parity-generating tasks from osfstorage. These tasks are better
done on the backend storage provider.
- Code: Turn on branch coverage testing. Overall coverage has decreased to 90% with this change.
0.41.3 (2018-08-27)
===================
- Feature: Make download-as-zip compression level a tunable configuration parameter.
0.41.2 (2018-08-27)
===================
- Fix: Brown-paper-bag release: Check in settings file needed for last fix.
0.41.1 (2018-08-27)
===================
- Fix: Expand list of file extensions that don't get deflated during download-as-zip.
0.41.0 (2018-08-15)
===================
- Feature: WaterButler now uses the multipart/chunked upload interfaces provided by Box, Dropbox,
and Amazon S3 when the uploaded file size exceeds provider-defined thresholds. These providers
limit the size of files that can be uploaded in a single request and require multipart uploads for
larger files. WB itself does not provide a multipart upload interface, so uploads via WB will
still appear to happen as a single request. The maximum file sizes for each provider at the time
of this release are:
- Box: 250MB / 2GB / 5GB, depending on account type
https://community.box.com/t5/Upload-and-Download-Files-and/Understand-the-Maximum-File-Size-You-Can-Upload-to-Box/ta-p/50590
- Dropbox: 350GB
https://www.dropbox.com/help/space/upload-limitations
- S3: 5TB
https://aws.amazon.com/s3/faqs/ under "How much data can I store in Amazon S3"
- Fix: Improve Figshare upload reliability, especially for cross-provider move/copies. WaterButler
was trying to slurp an entire file segment into memory before uploading to Figshare. If the source
feed was slow, the connection could time out before the transfer commenced, resulting in broken
uploads on Figshare. WB now streams chunks of the segment as it receives them, allowing the
connection to stay open to completion.
- Fix: Fix minor warning in documentation build.
0.40.2 (2018-08-08)
===================
- Fix: Fix download of public Figshare files.
0.40.1 (2018-07-25)
===================
- Feature: Add `X-CSRFToken` to list of acceptable CORS headers.
- Feature: Tell Keen analytics to strip ip on upload.
- Code: Remove never-implemented anonymous geolocation code.
0.40.0 (2018-06-22)
===================
- Feature: Listen for MFR-originating metadata requests and relay the nature of the request to
the OSF. This will allow the OSF to tally better metrics on file views.
- Feature: Add new "sizeInt" file metadata field. Some providers return file size as a string.
The "size" field in WaterButler's file metadata follows this behavior. The "sizeInt" field will
always be an integer (if file size is available) or `null` (if not).
- Code: Expand tests for WaterButler's APIv1 server.
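The relationship between "size" and "sizeInt" can be pictured with a small sketch. The helper name `size_as_int` is hypothetical, and treating a non-numeric size as unavailable is an assumption; Python's `None` serializes to JSON `null`.

```python
def size_as_int(size):
    """Return the size as an integer, or None when it is missing or not numeric.

    Hypothetical helper, for illustration only.
    """
    if size is None:
        return None
    try:
        return int(size)
    except (TypeError, ValueError):
        # Assumption: an unparseable size is reported as unavailable.
        return None
```

A provider that reports `"1024"` as a string would then produce a "sizeInt" of `1024`, while a missing size stays `null`.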
0.39.2 (2018-06-05)
===================
- Fix: Brown-paper-bag release: actually change version in the code.
0.39.1 (2018-06-05)
===================
- Code: Pin base docker image to python:3.5-slim-jessie. 3.5-slim was recently updated to Debian
stretch, and WaterButler has not yet been verified to work on it.
0.39.0 (2018-05-08)
===================
- Feature: WaterButler now lets osfstorage know who the requesting user is when listing folder
contents, so that it can return metadata about whether the user has seen the most recent version
of the file. (thanks, @erinspace!)
- Feature: WaterButler includes metadata about the request in the logging callback to help the
logger tally more accurate download counts. (thanks, @johnetordoff!)
- Feature: Stop logging revisions metadata requests to logging callback. (thanks, @johnetordoff!)
- Fix: Don't try to decode the body of HEAD requests that error. HEAD requests don't have bodies!
- Code: Clean up existing mypy annotations and add new ones. WaterButler still isn't 100% clean,
but it's getting there!
- Code: Allow passing JSON-encoded objects as config settings via environment variables.
0.38.6 (2018-04-25)
===================
- Fix: `CELERY_RESULT_PERSISTENT` should default to True, now that `amqp` is the default result
backend.
0.38.5 (2018-04-24)
===================
- Fix: Don't overwrite evokable url with generated url inside the make_request retry loop. Third
time's the charm, right?
0.38.4 (2018-04-24)
===================
- Fix: Evoke url-generating functions *inside* the make_request retry loop to make sure signed
urls are as fresh as possible.
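The idea behind this fix, together with the follow-up in 0.38.5, can be sketched as below. This is not `make_request` itself: the function `request_with_retries`, its `send` argument, and the choice of `ConnectionError` as the retryable failure are all hypothetical stand-ins.

```python
def request_with_retries(url, send, retries=3):
    """Call `send` up to `retries` times, rebuilding the url on each attempt.

    `url` may be a plain string or a zero-argument callable that builds one.
    Hypothetical sketch; names and the retried exception are assumptions.
    """
    last_error = None
    for _ in range(retries):
        # Resolve into a separate local name. Assigning the result back to
        # `url` would replace the callable and freeze the first signed url.
        current_url = url() if callable(url) else url
        try:
            return send(current_url)
        except ConnectionError as exc:
            last_error = exc
    raise last_error
```

Because the callable is evaluated inside the loop, each retry gets a newly generated url, so a slow-to-fail first attempt cannot leave later attempts with an expired signature.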
0.38.3 (2018-04-24)
===================
- Fix: Remove temporary files after osfstorage provider has completed its archiving and
parity-generation tasks.
0.38.2 (2018-04-23)
===================
- Fix: Delay construction of Google Cloud signed urls until right before issuing them. The provider
was building them outside of the retry loop, resulting in slow-to-fail requests having their
signatures expire before all subsequent retries could be issued.
0.38.1 (2018-04-10)
===================
- Fix: Don't log 206 Partial requests to the download callback. Most of these are for streaming
video or resumable downloads and shouldn't be tallied as a download.
0.38.0 (2018-03-28)
===================
- Fix: WaterButler now handles Range header requests correctly! There was an off-by-one error where
WB was returning one byte more than requested. This manifested in Safari being unable to load
videos rendered by MFR. Safari requests the first two bytes of a video before issuing a full
request. When WB returns three bytes, Safari refuses to try to render the file. Safari can now
render videos from most providers that support Range headers. Unfortunately, it still struggles
with osfstorage for reasons that are still being investigated. (thanks, @abought and @johnetordoff!)
- Feature: WaterButler now logs download and download-as-zip requests to the callback given by the
auth provider. This is intended to allow the OSF (the default auth provider) to better log download
counts for files. If WB detects that a request originates from MFR, it will relay that on to the
callback, so the auth provider can account for that when compiling statistics.
- Feature: Add a new limited provider for Google Cloud Storage! A limited provider is one that can
be used as a data-storage backend provider for osfstorage. It does NOT have a corresponding OSF
addon interface, nor does it support the full range of WB actions. It has been tested locally, but
has yet to be tested in a staging or production environment, so EARLY ADOPT AT YOUR OWN RISK!
- Fix: Stop installing aiohttpretty in editable mode to avoid containers hanging while awaiting user
input.
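The off-by-one in the Range fix above comes from the fact that both ends of an HTTP byte range are inclusive. A minimal sketch, with a hypothetical helper name and an in-memory `bytes` object standing in for the file:

```python
def slice_for_range(data, start, end):
    """Return the bytes for a `Range: bytes=start-end` request.

    Both positions are inclusive, so the Python slice must stop at end + 1.
    Hypothetical helper, for illustration only.
    """
    return data[start:end + 1]
```

A request for `bytes=0-1` should therefore return exactly two bytes; returning three is the behavior that Safari rejected.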
0.37.1 (2018-02-06)
===================
- Feature: Increase default move/copy task timeout to 20s. WaterButler will now wait 20 seconds for
move/copy tasks to complete before backgrounding the task and returning a 202 Accepted.
0.37.0 (2018-01-26)
===================
- ANNOUNCEMENT! The WaterButler v0 API is once again DEPRECATED! COS has removed the last remnants
of it from the OSF (thanks, @alexschiller!) and has moved entirely to the v1 API. It will be
removed in the first release after *April 1*. If you depend on v0, please get in contact with us
([email protected]) before then.
- Feature: Don't re-compress already-zipped files when downloading a directory as a zip. Re-zipping
wastes CPU time and will tend to result in larger zips overall. (thanks, @johnetordoff!)
- Feature: Add a unique request ID to WaterButler's response headers to help with tracking down
errors. (thanks, @johnetordoff!)
- Fix: Use the correct revision parameter name for fetching Amazon S3 revisions. S3 revisions are
an optional feature that must be turned on via the AWS interface. (thanks, @TomBaxter!)
- Fix: Allow the GitHub provider to delete a folder in the repository root when it's the only remaining
object. (thanks, @TomBaxter!)
- Fix: Purge osfstorage uploads from the pending directory when the upload to the backend storage
provider fails. (thanks, @johnetordoff!)
- Fix: Return a 400 Bad Request when a user tries to copy a root folder to another provider without
setting the `rename` parameter. (thanks, @AddisonSchiller!)
- Fix: Don't send invalid payloads to WaterButler's metrics tracking service. (thanks,
@AddisonSchiller!)
- Fix: Compute correct name for nested folders in the filesystem provider. (thanks,
@AddisonSchiller & @johnetordoff!)
- Fix: Remove obsolete and unused `unsizable` flag from ResponseStreamReader. (thanks,
@AddisonSchiller & @johnetordoff!)
- Code: Upgrade the Dropbox provider to only use Dropbox's v2 endpoints. (thanks, @johnetordoff!)
- Code: Update WaterButler to support setuptools versions greater than v30.4.0. WaterButler's
`__version__` declaration has moved from `waterbutler.__init__` to `waterbutler.version`. (thanks,
@johnetordoff!)
- Code: Distinguish between provider actions and authentication actions in the v1 move/copy code.
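Skipping re-compression for already-zipped files, as described in the download-as-zip feature above, amounts to choosing a per-file compression method. A minimal sketch using the standard library `zipfile` constants; the extension set and the helper name `compression_for` are hypothetical and not taken from WaterButler.

```python
import os
import zipfile

# Hypothetical list of extensions that are already compressed.
ALREADY_COMPRESSED = {".zip", ".gz", ".bz2", ".jpg", ".png", ".mp4"}


def compression_for(filename):
    """Pick a zip compression method based on the file extension."""
    extension = os.path.splitext(filename)[1].lower()
    if extension in ALREADY_COMPRESSED:
        # Store the bytes as-is; a second compression pass wastes CPU.
        return zipfile.ZIP_STORED
    return zipfile.ZIP_DEFLATED
```

The returned constant can be passed as `compress_type` to `zipfile.ZipFile.write` or `writestr` for each archive member.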
0.36.2 (2017-12-20)
===================
- Feature: Log osfstorage parity file metadata back to the OSF after upload.
0.36.1 (2017-12-11)
===================
- Fix: Update OneDrive metadata to report the correct materialized name.
0.36.0 (2017-12-05)
===================
- Feature: WaterButler now supports two new read-only providers, GitLab and Microsoft OneDrive!
Read-only providers support browsing, downloading, downloading-as-zip, getting file revision
history, and copying from connected repositories. Thanks to the following devs for their hard
work!
- GitLab: @danielneis, @luismulinari, @rafaeldelucena
- OneDrive: @caseyrygt, @alexandr-melnikov-dev-pro, @johnetordoff
0.35.0 (2017-11-13)
===================
- Feature: Allow copying from public resources with the OSF provider. WaterButler had been
requiring write permissions on the source resource for both moves and copies, but copy only needs
read. Update the v1 API to distinguish between the two types of requests.
- Docs: Document supported query parameters in the v1 API.
- Code: Improve test coverage for osfstorage and figshare.
- Code: Cleanups for Box, Google Drive, and GitHub providers.
- Code: Don't include test directories in module search paths.
- Code: Don't let query parameters override the HTTP verb in v1.
0.34.1 (2017-10-18)
===================
- Fix: Don't crash when a file on Google Drive is missing an md5 in its metadata. This occurs
for non-exportable files like Google Maps, Google Forms, etc.
0.34.0 (2017-09-29)
===================
- ANNOUNCEMENT! Sadly, the WaterButler v0 API is now *undeprecated*. We've discovered that the
OSF is still using it in a few places, so it's been given a temporary reprieve. Once those are
converted to v1, v0 will be re-deprecated then removed after an appropriate warning period.
- Feature: For providers that return hashes on upload, WaterButler will calculate the same hash
as the file streams and throw an informative error if its hash and the provider's hash differ.
- Fix: Stop throwing exceptions when building exceptions to throw. Pickled exceptions are
resurrected in a peculiar fashion that some of WaterButler's exception classes could not survive.
- Fix: Validate that the move/copy destination path is really a folder.
- Fix: Update the Box and Google Drive intra-{move,copy} actions to include children in the
returned metadata for folders (and document it).
- Fix: Release Box responses on error.
- Code: Update the Postman test suite to include CRUD and move tests.
- Code: Start testing with python-3.6 on Travis.
- Code: Improve test coverage for all providers except osfstorage and figshare (coming soon!).
- Code: Teach WaterButler to listen for a SIGTERM signal and exit immediately upon receiving it.
This bypasses the 10 second wait for shutdown when running it in Docker.
- Code: Fix sphinx syntax errors in the WaterButler docs.
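The hash-on-upload check from the feature entry above can be approximated as follows. This is a sketch under assumptions: the function `verify_upload` is hypothetical, MD5 is only an example algorithm, and a real uploader would forward each chunk to the provider where the comment indicates.

```python
import hashlib


def verify_upload(chunks, provider_hash, algorithm="md5"):
    """Hash chunks as they stream past and compare with the provider's hash.

    Hypothetical helper; raises ValueError when the two hashes differ.
    """
    hasher = hashlib.new(algorithm)
    for chunk in chunks:
        hasher.update(chunk)
        # A real uploader would send `chunk` to the provider here.
    local_hash = hasher.hexdigest()
    if local_hash != provider_hash:
        raise ValueError(
            "hash mismatch: computed {} but provider reported {}".format(
                local_hash, provider_hash
            )
        )
    return local_hash
```

Hashing chunk by chunk avoids holding the whole file in memory just to compute the checksum.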
0.33.1 (2017-09-05)
===================
- Fix: Reject requests for Box IDs if the ID is valid, but the file or folder is outside of the
project root. (thanks, @AddisonSchiller!)
0.33.0 (2017-08-09)
===================
- ANNOUNCEMENT! The WaterButler v0 API is DEPRECATED! COS no longer uses it and has moved
entirely to the v1 API. It will be removed in the first release after October 1. If you depend on
v0, please get in contact with us ([email protected]) before then, and let us know.
- Feature: WaterButler now supports Bitbucket as a read-only provider! As a read-only provider,
you can browse, download, download-as-zip, get file revision history, and copy out of your
connected Bitbucket repository.
- Feature: Sentry errors now include provider and resource as searchable parameters. (thanks,
@abought!)
- Fix: Correctly describe the response of folder intra-move actions in GoogleDrive as folders,
rather than files.
- Fix: WaterButler now correctly throttles multiple parallel requests. The maximum number of
simultaneous requests is set by the waterbutler.settings.OP_CONCURRENCY config variable.
0.32.3 (2017-07-20)
===================
- Fix: Quiet some overly-verbose error logging.
0.32.2 (2017-07-09)
===================
- Fix: Fix hanging figshare uploads by replacing a StreamReader.readexactly call with a
StreamReader.read. The underlying cause of this problem is still unknown.
0.32.1 (2017-07-07)
===================
- Fix: Correctly format the Last-Modified header returned from v1 HEAD requests. WB had been
setting it to the datetime format used by the provider, but we should be following the format laid
out by RFC 7232, S2.2. (thanks, @icereval and @luizirber!)
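An HTTP-date of the kind the Last-Modified header requires can be produced with the Python standard library. A minimal sketch, assuming the provider's timestamp has already been parsed into a timezone-aware `datetime`; the helper name `http_last_modified` is hypothetical.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def http_last_modified(dt):
    """Format an aware datetime as an HTTP-date in GMT."""
    # `usegmt=True` requires a datetime whose tzinfo is UTC.
    return format_datetime(dt.astimezone(timezone.utc), usegmt=True)
```

For example, noon UTC on 2017-07-07 is rendered as `Fri, 07 Jul 2017 12:00:00 GMT`.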
0.32.0 (2017-06-14)
===================
- Fix: Send back the correct modified date when uploading a file to osfstorage. osfstorage had
been sending back the modified date of the stored blob rather than the metadata from the OSF.
- Fix: Support metadata and revisions for files shared on Google Drive with view- or comment-only
permissions. Google Drive forbids access to the version-listing endpoint for these sorts of files,
and WaterButler was not coping with that gracefully.
- Fix: Update WaterButler tests to work on Python >= v3.5.3. A change to coroutine function
detection in Python v3.5.3 and v3.6.0 was causing tests to fail, as the mocked coroutines were not
being properly unwrapped.
- Code: Add type annotations and a mypy test to the core WaterButler provider and the box, dropbox,
googledrive, and figshare providers! And lo, a new era of type-safety, strictness, and peace was
ushered in, and its name was `inv mypy`. (thanks, @abought!)
- Code: Add support for code-coverage checking via coveralls.io. (thanks, @abought!)
0.31.1 (2017-06-01)
===================
- Fix: Fix OwnCloud issue that could result in folder creation outside the base folder. OwnCloud
was making assumptions about the formatting of the base folder that were not necessarily true.
0.31.0 (2017-04-07)
===================
- Feature: Stop creating empty commits on GitHub when the requested action doesn't change the tree
e.g. when updating a file with the exact same content as before.
- Fix: Moving or copying a folder within osfstorage will now return the metadata for the folder's
children in the response.
- Fix: Reject PATCH requests gracefully, instead of 500-ing.
- Fix: Disallow accessing Google Docs (.gdoc, .gsheet, etc.) without the extension.
- Fix: Fix poor error handling in Dropbox provider.
- Fix: Log WaitTimeoutErrors as log level info to Sentry. These are expected and shouldn't be
considered full errors.
0.30.0 (2017-02-02)
===================
- Feature: Support the new London and Central Canada regions in Amazon S3. (thanks, @johnetordoff!)
- Feature: Include provider-specific metrics in metric logging payloads, including number of
requests issued.
- Fix: Don't crash when fetching file revision metadata from Google Drive.
- Fix: WaterButler docs are once again building on readthedocs.org! (thanks, @johnetordoff!)
- Code: Update WaterButler to use invoke 0.13.0. If you have an existing checkout, you will need to
upgrade invoke manually: pip install invoke==0.13.0 (thanks, @johnetordoff!)
- Code: Postman collections to test file and folder copy behavior for providers have been added to
tests/postman/. See docs/testing.rst for instructions on setting up and running them.
- Docs: WaterButler has been verified to work with Python 3.5.3 and 3.6.0. From now on, the docs
will mention which Python versions WB has been verified to work on. (thanks, @johnetordoff!)
0.29.1 (2017-01-04)
===================
- Happy New Year!
- Fix: Be more ruthless about fixing setuptools breakage in Dockerfile. (thanks, @cwisecarver!)
0.29.0 (2016-12-14)
===================
- Feature: WaterButler now uses the V2 APIs for both Dropbox and Figshare.
- Feature: Add a created timestamp to osfstorage file metadata.
- Feature: Support the new Mumbai and Ohio regions in Amazon S3. (thanks, @erinspace!)
- Feature: The server logs a message on startup, instead of just staring blankly at you.
- Fix: Start appending extensions to Google Doc files to disambiguate identically-named
files. e.g. foo.docx vs. foo.gsheet
0.28.1 (2016-12-13)
===================
- Pin setuptools to v30.4.0 to avoid package-namespace-related breakage.
0.28.0 (2016-10-31)
===================
- HALLOWEEN RELEASE! (^ ,–, ^)
- Feature: Download-as-zip now includes empty directories! (thanks, @darioncassel!)
- Feature: WaterButler now lists the full contents of Google Drive directories with more than 1,000
children. (thanks, @TomBaxter!)
- Feature: WaterButler now lists the full contents of Box.com directories with more than 1,000
children. (thanks, @TomBaxter!)
- Fix: Teach ownCloud to be more efficient about moving and copying files within a single provider.
0.27.1 (2016-10-24)
===================
- Fix: Fix broken Download-as-zip for GitHub by propagating the target branch during recursive
traversal.
- Fix: Fix incorrectly detected self-overwrite when copying a file between the root paths of two
separate Box.com accounts.
0.27.0 (2016-10-19)
===================
- Feature: Attempting to move or copy a file over itself will now fail with a 409 Conflict, even
across different resources.
- Fix: Fix bugs in ownCloud provider that were breaking renames.
- Fix: v1 metadata requests now accept the `version` and `revision` query parameters like v0.
0.26.1 (2016-10-18)
===================
- Feature: Dockerized WaterButler can now take a commit sha from the environment to indicate
version to deploy. (thanks, @icereval!)
0.26.0 (2016-10-11)
===================
- Feature: WaterButler now supports ownCloud as a full provider! (thanks, @kwierman!)
- Feature: WaterButler accepts configuration from the environment, overriding any file-based
configuration. This helps WB integrate nicer in a docker-compose environment. (thanks, @icereval!)
- Fix: Gracefully handle branch-related GitHub errors.
- Code: Start labeling user-caused errors as level=info in Sentry (thanks, @TomBaxter!)
- Code: Log redirect-based downloads to analytics.
- Code: Bump dependencies on Raven and cryptography. cryptography v1.5.2 now installs on OSX via
wheel. This should silence scary-sounding cffi warnings!
0.25.0 (2016-09-22)
===================
- Feature: Include user id when requesting files from osfstorage, to allow the OSF to distinguish
contributing users in download counts. (thanks, @darioncassel!)
0.24.0 (2016-09-14)
===================
- Feature: Update the v1 API to passthrough unrecognized query params to the provider.
- Feature: Teach the GitHub provider to accept branch identifiers in the URL and body of v1
operations. You can now do cross-branch move/copies with the v1 API!
- Feature: The GitHub provider now includes the branch operated on in its callback logs.
- Fix: Add `--pty` arguments to invoke install and invoke wheelhouse, to support building WB Docker
images without pseudoterminals. (thanks, @emetsger!)
0.23.3 (2016-08-31)
===================
- Fix: Fix flake error in remote_logging. Not at all embarrassing.
0.23.2 (2016-08-31)
===================
- Fix: For analytics, convert byte sizes into more convenient units.
- Code: in sizes.py, call kilobytes "KBs", not "Bs"
0.23.1 (2016-08-31)
===================
- Fix: Dataverse changed their API file metadata response format. Update provider to handle both
formats.
0.23.0 (2016-08-25)
===================
- Code: Rewrite public file action logging to sync with MFR.
- Docs: Document intra_move and intra_copy in core provider.
0.22.1 (2016-08-19)
===================
- Fix: Don't try to derive modified_utc for osfstorage files that lack a modified date.
0.22.0 (2016-08-19)
===================
- Feature: File metadata now includes a modified_utc field that is the modified timestamp in
standard ISO-8601 format (YYYY-MM-DDTHH:MM:SS+00:00). (thanks, @TomBaxter!)
- Feature: Metadata for osfstorage files will contain the file GUID, once the OSF is updated to
return that information. (thanks, @Johnetordoff!)
- Feature: WaterButler can now log file actions to Keen.io.
- Fix: core.utils.async_retry was always intended to be fire-and-forget, but wasn't when await()-ed.
It will now run until done, instead of executing one retry per await. The tests have been updated
to match this behavior.
- Fix: Update copy and move celery tasks so that a failed logging callback will not make them return
a 500. Now they always return the result and metadata of the move/copy action.
- Fix: Logging callbacks are now retried five times, no matter where they are called from.
- Fix: Update Dockerfile to register plugins.
- Fix: Correct minor typos and links to Python and aiohttp in the docs.
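As an illustration of the `modified_utc` format described in 0.22.0, here is a minimal sketch of rendering a timezone-aware timestamp as UTC in ISO-8601. The helper name `to_modified_utc` is hypothetical and is not taken from the WaterButler codebase; the sketch only shows the target format, not how any provider derives it.

```python
from datetime import datetime, timedelta, timezone


def to_modified_utc(moment: datetime) -> str:
    """Render a timezone-aware datetime as UTC, formatted YYYY-MM-DDTHH:MM:SS+00:00."""
    if moment.tzinfo is None:
        # A naive datetime has no offset, so it cannot be converted reliably.
        raise ValueError("expected a timezone-aware datetime")
    # Convert to UTC and drop sub-second precision before formatting.
    return moment.astimezone(timezone.utc).replace(microsecond=0).isoformat()


# A modified date reported five hours behind UTC.
eastern = datetime(2016, 8, 19, 9, 30, 0, tzinfo=timezone(timedelta(hours=-5)))
print(to_modified_utc(eastern))  # 2016-08-19T14:30:00+00:00
```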
0.21.4 (2016-08-02)
===================
- Fix: Ask Box not to zip our download requests. (thanks, @caseyrygt!)
0.21.3 (2016-08-01)
===================
- Fix: Bump wheel dep to 0.26.0 to fix travis build.
0.21.2 (2016-08-01)
===================
- Fix: Pin cryptography to v1.3.4 to avoid v1.4 incompatibilities with OS X's vendor openssl.
(thanks, @erinspace!)
0.21.1 (2016-07-12)
===================
- Fix: Stop duplicating parent folder when searching Google Drive for Google docs. (thanks,
@TomBaxter!)
0.21.0 (2016-06-16)
===================
- Feature: Allow cross origin requests when an Authorization header is provided without
cookies. (thanks, @samchrisinger!)
- Feature: Don't set any CORS headers if Origin is not provided (e.g. non-browser client)
- Fix: v0's copy action now checks can_intra_copy instead of can_intra_move.
- Fix: Stop sending requests to GitHub with the `application/vnd.github.VERSION.raw` media-type.
Start sending them `application/vnd.github.v3.raw`.
- Fix: Our API docs had typos in the example URLs. For shame.
- Code: Refactor logging callback into one location. Previously, v0 creates, v0 move/copies, v1
actions, and the move/copy celery tasks each had their own bespoke code for logging actions to the
authorizer-provided callback. All of that has been merged into one method with a common interface.
This shouldn't have any user-visible changes, but it will make the developers' lives much easier.
- Code: Remove unused code from googledrive provider
- Code: WaterButlerPath learned `materialized_path()` (a.k.a its __str__ representation).
0.20.4 (2016-06-16)
===================
- Finish support for zipfiles > 4Gb. Large zipfiles should now be extractable by
double-clicking on them in OS X, Windows, and Linux. On OS X, /usr/bin/ditto and
The Unarchiver have been confirmed to work. /usr/bin/unzip will *not* work, as the version
OS X includes does not have Zip64 support.
- Update the install docs to pin invoke to 0.11.1.
0.20.3 (2016-06-13)
===================
- Release fixes for deployment and intermediate DAZ > 4Gb fixes.
0.20.2 (2016-06-13)
===================
- Pin invoke to v0.11.1. Our tasks.py is incompatible with v0.13.
0.20.1 (2016-06-13)
===================
- Try fixing Download-As-Zip for resulting zipfiles > 4Gb. The default zip format only supports
up to 4Gb files, but the zip64 extension can handle much larger sizes. WB hand-rolls its zips,
so the zip64 format has to be constructed manually. Unfortunately, the fixes applied in 0.20.1
were not enough. The remaining updates were added in 0.20.4.
- Add a Dockerfile to simplify running WaterButler in dev environments.
- Pin some dependencies and update our travis config to avoid spurious build failures.
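To make the 4Gb limit mentioned in 0.20.1 concrete: the classic zip format records sizes and offsets in 32-bit fields and the entry count in a 16-bit field, and the all-ones value of each field is reserved as the "see the Zip64 record" marker. The sketch below is only an illustration of when a hand-rolled zip writer would have to switch to Zip64. The function name is hypothetical, and it assumes entries are stored uncompressed so that the payload total is a lower bound on the archive size. It is not WaterButler's actual logic.

```python
ZIP32_LIMIT = 0xFFFFFFFF      # 32-bit size/offset fields; this value is the Zip64 marker
ZIP32_ENTRY_LIMIT = 0xFFFF    # 16-bit entry count; this value is the Zip64 marker


def needs_zip64(entry_sizes):
    """Return True if an archive holding these stored entries needs Zip64 records."""
    sizes = list(entry_sizes)
    # Too many entries for the 16-bit count in the end-of-central-directory record.
    if len(sizes) >= ZIP32_ENTRY_LIMIT:
        return True
    # A single entry too large for the 32-bit size fields.
    if any(size >= ZIP32_LIMIT for size in sizes):
        return True
    # The payload alone is a lower bound on where the central directory starts,
    # and that offset must also fit in 32 bits.
    return sum(sizes) >= ZIP32_LIMIT


GIB = 1024 ** 3
print(needs_zip64([1024, 2048]))         # False
print(needs_zip64([3 * GIB, 2 * GIB]))   # True
```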
0.20.0 (2016-04-29)
===================
- Fix Download-As-Zip for Google Drive to set the correct extensions for exported Google Doc
files.
- Add 'resource' to V1 response.
- Minor doc updates: Add urls to external API documentation and note provider quirks.
0.19.5 (2016-04-18)
===================
- Brown Paper Bag release: **REALLY** fix syntax error in Github's Unsupported Repo exception.
The 0.19.4 release fixed nothing. I (@felliott) am an idiot.
0.19.4 (2016-04-18)
===================
- Fix syntax error in Github's Unsupported Repo exception.
0.19.3 (2016-04-15)
===================
- Increase number of files returned from a Box directory from 50 to 1000. (thanks, @TomBaxter!)
- Exclude Google Maps from Google Drive listing. Google doesn't provide a way to export maps,
so exclude them from listing / downloading for now. This is the same behavior as Google Forms.
0.19.2 (2016-04-14)
===================
- Make v1 move/copy return file metadata in its logging payload, so the OSF can track files
across providers.
0.19.1 (2016-04-13)
===================
- Update boto dependency to get fixes for large file uploads to Amazon Glacier. (thanks,
@caileyfitz, for your patient testing!)
- Increase number of files returned from a GDrive directory listing from 100 to 1000.
0.19.0 (2016-04-04)
===================
- Feature: WaterButler now runs on Python 3.5! A major speed boost should be
noticeable, especially for zipped folder downloads. All hail @chrisseto and
@TomBaxter for seeing this through!
- Feature: Zipped folder downloads now use the folder name as the zip filename. If the folder
is the storage root, the name is `$provider.zip` (e.g. `googledrive.zip`).
- Code: WaterButlerPath has a new `is_folder` method. It's the same as `is_dir`.
- Code: waterbutler.core.BaseProvider learned `path_from_metadata`.
0.18.4 (2016-04-01)
===================
- Fix: Bump PyJWE dependency to 1.0.0 to match with OSF v0.66.0
0.18.3 (2016-03-28)
===================
- Fix: Renaming a Google doc (.gdoc, .gsheet) file no longer truncates the new
filename to just the first letter.
0.18.2 (2016-03-17)
===================
- The "Leprechauns Ate My Blob" emergency release!
- Fix: Don't throw an error if the github file being requested is larger than 1Mb. There will
be a more correct fix in the next minor release.
0.18.1 (2016-03-15)
===================
- Fix: Stop crashing if the `Content-Length` header is not set on a v1 folder create request.
A missing `Content-Length` is fine for folder create requests.
0.18.0 (2016-03-09)
===================
- BREAKING v1 API CHANGE (sortof): Updating a file by issuing a PUT to its parent folder and
passing its name as a query parameter is no longer supported and will now throw a 409 Conflict.
This is still the correct way to create a file, but updating must be done by issuing the PUT to
the file's own endpoint. This was supposed to be fixed back in December, but I (@felliott) did
a **very** poor job of it, meaning some providers still allowed it. The API documentation has
been updated to match. If you use the `/links/upload` action from the JSON-API response, you
**DO NOT** need to update your code. That link is already correct.
- Feature: DELETEing a file or folder in the GDrive provider now sends it to the trash instead
of hard-deleting it.
- Feature: Issuing a DELETE to the storage root of a provider will now clear out its content,
but not delete the storage root itself. This was undefined behavior before. Some providers
would disallow it, some would crash, others would do the right thing. This is now officially
supported. To make sure the file contents are not wiped out by accident, the query parameter
`confirm_delete=1` must be passed when clearing the contents. Otherwise, WB will return a 400.
- Fix: GoogleDrive provider now returns 201 Created when creating and 200 OK when overwriting
during copy operations.
- Fix: Copying / moving empty folders into a directory with a similarly named folder with 'keep
both' semantics no longer overwrites old folder and properly increments the new file name.
- Fix: Always set `kind` parameter for files and folders in v1 logging callback to avoid crashing
the OSF waterbutler logging endpoint.
- Fix: Fix 'keep both' conflict semantics when more than one incremented version already exists.
- Fix: Github move/copy folder requests with 'replace' semantics now correctly overwrite the
target folder, rather than merging the contents.
- Docs: @TomBaxter++ has added more docstrings to the base provider and has started documenting
the quirks of our existing providers.
- Docs: WaterButler v0.18.0 does NOT work with Python 3.5. Python 3.4 is required. Mention this.
- Docs: The Center for Open Science is hiring! (thanks, @andrewsallans!)
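To illustrate the root-clearing DELETE described in 0.18.0, the sketch below builds a URL that carries the required `confirm_delete=1` query parameter. It follows the v1 endpoint pattern `/v1/resources/<>/providers/<>/<path>` noted in 0.12.0. The host, resource id, and helper name are hypothetical placeholders, and treating `/` as the storage root path is an assumption.

```python
from urllib.parse import quote, urlencode


def v1_url(base, resource, provider, path="/", **query):
    """Build a URL shaped like /v1/resources/<>/providers/<>/<path>."""
    url = "{}/v1/resources/{}/providers/{}{}".format(
        base.rstrip("/"),
        quote(resource, safe=""),
        quote(provider, safe=""),
        quote(path),  # keeps "/" separators, escapes everything else
    )
    return url + ("?" + urlencode(query) if query else "")


# Hypothetical host and resource id, for illustration only.
clear_root = v1_url("https://wb.example.org", "abc12", "osfstorage", "/", confirm_delete=1)
print("DELETE", clear_root)
# DELETE https://wb.example.org/v1/resources/abc12/providers/osfstorage/?confirm_delete=1
```

Per the entry above, the same DELETE without `confirm_delete=1` is answered with a 400.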
0.17.0 (2016-02-29)
===================
- LEAP DAY RELEASE!
- Feature: Add throttling to make_request! Some hosts don't like it when we fire off too many
requests at once. Now we limit it to 10 requests / second and a maximum of 5 concurrently. To
adjust the rates, update the REQUEST_LIMIT and OP_CONCURRENCY attributes in settings.py. (thanks
@chrisseto!)
- Fix: Update dropbox API urls.
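The throttling described in 0.17.0 combines two limits: a cap on how often requests start and a cap on how many are in flight. A minimal asyncio sketch of that idea follows. The `Throttle` class and its parameter names are hypothetical and are not WaterButler's `make_request` implementation. The defaults mirror the 10 requests / second and 5 concurrent figures from the entry above.

```python
import asyncio
import time


class Throttle:
    """Cap both the request start rate and the number of requests in flight."""

    def __init__(self, per_second=10, concurrency=5):
        self._interval = 1.0 / per_second
        self._slots = asyncio.Semaphore(concurrency)
        self._gate = asyncio.Lock()
        self._next_start = 0.0

    async def run(self, make_call):
        async with self._slots:          # at most `concurrency` calls in flight
            async with self._gate:       # space call starts `interval` apart
                delay = self._next_start - time.monotonic()
                if delay > 0:
                    await asyncio.sleep(delay)
                self._next_start = time.monotonic() + self._interval
            return await make_call()


async def demo(n=8, per_second=200, concurrency=3):
    """Run n fake requests and report the results and peak concurrency."""
    throttle = Throttle(per_second, concurrency)  # created inside the running loop
    in_flight = peak = 0

    async def fake_request():
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)        # stand-in for network latency
        in_flight -= 1
        return "ok"

    results = await asyncio.gather(*(throttle.run(fake_request) for _ in range(n)))
    return results, peak


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The semaphore alone corresponds to the concurrency-only limit added in 0.16.4. The lock-guarded start time adds the per-second rate limit on top of it.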
0.16.5 (2016-02-22)
===================
- No changes. Applied and reverted throttling patches.
0.16.4 (2016-02-18)
===================
- Fix: Add semaphore to make_request to limit concurrent outgoing requests to 25.
0.16.3 (2016-02-14)
===================
- Fix: Apply certificate fix in proper scope.
0.16.2 (2016-02-14)
===================
- Fix: Patch to prevent certification verification failure w/ rackspace services.
0.16.1 (2016-02-11)
===================
- Fix: Properly freeze time in tests to avoid spurious test failures.
- Update Sentry Raven client now that it has core asyncio support.
0.16.0 (2016-02-03)
===================
- Feature: Support S3 buckets with periods in their name for non-US East
regions.
- Feature: Started filling out and reorganizing the dev docs! The v1 API is
now documented, and waterbutler.core.metadata, waterbutler.core.path have
a lot more docstrings. A skeleton overview of WaterButler is available. More
to come...
- Fix: Make the filesystem provider declare timezones for their modified date.
- Fix: Update FigShare's web view URL for viewing unpublished files.
0.15.1 (2016-01-21)
===================
- Fix incorrect logging of paths for move/copy task and v0 deletes
- Fix incorrect path calculation for cross-resource move/copies via Dropbox
- Properly encode googledrive path in log payloads
0.15.0 (2016-01-15)
===================
- Enforce V1 path semantics. Folders must have trailing slash, files must NOT
have trailing slash. Failing to abide results in a 404
- Allow creating a file or folder in a directory with an identically-named
entity of the opposite type, but only for those providers that allow it.
- Fix recursive delete via s3 when folder name has special characters
- Fix multiple 500s on github provider
- Fix many v1 logging issues
- Clarify install instructions (thanks, @rafaeldelucena!)
- Add `clean` task to remove old .pyc files
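The V1 path semantics enforced in 0.15.0 can be sketched as a small check: a trailing slash means folder, no trailing slash means file, and a mismatch against the entity's real type is treated as not found. The names `kind_of`, `check_kind`, and `NotFound` are hypothetical stand-ins, not WaterButler identifiers.

```python
class NotFound(Exception):
    """Stand-in for an HTTP 404 response (hypothetical)."""

    status = 404


def kind_of(path: str) -> str:
    """Classify a v1 path by its trailing slash."""
    return "folder" if path.endswith("/") else "file"


def check_kind(path: str, actual_kind: str) -> str:
    """Return the path if its form matches the entity's real kind, else raise NotFound."""
    if kind_of(path) != actual_kind:
        raise NotFound("{!r} does not name a {}".format(path, actual_kind))
    return path


print(kind_of("/reports/"))          # folder
print(kind_of("/reports/2016.csv"))  # file
```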
0.14.0 (2015-11-05)
===================
- Update to a python3.5 compliant version of invoke
- Raise proper exceptions for github repos with too many fields
- Updates for OSFStorage file checkout
- Clean up JSON API Responses
0.13.0 (2015-10-08)
===================
- waterbutler API v1 now returns JSON API formatted data
- DEBUG is now an option for waterbutler root settings
- OSF auth handler now authenticates via JWTs
- Moves and copies done via v1 will now return a 409 rather than implicitly overwriting
- Failed log callbacks are now logged
- Various smaller fixes
0.12.0 (2015-09-17)
===================
- waterbutler.server
  - Restructured into API version modules
  - API v1 has been implemented
    - Only one endpoint exists: /v1/resources/<>/providers/<>/<path>
  - API v0 is now deprecated
  - Callbacks will be retried if they do not get a 200 response
- waterbutler.core
  - Invalid providers are now handled properly
  - WaterbutlerPath now has an identifier_path property for id based backends
  - Revision is now an accepted parameter of Provider#metadata and Provider#download
- Github now returns modified dates when available
- Google drive's title queries now only use single quotes (')
- OsfStorage's validate_path function now works properly
- Osfstorage now properly responds created to internal copies
0.11.0 (2015-08-31)
===================
- OsfStorage now returns hashes
0.10.0 (2015-08-10)
===================
- Allow S3 uploads to be encrypted at rest via S3's API
0.9.0 (2015-07-29)
==================
- Web view links are included in the extra field when available
- Add many a test for moving and copying, tasks and endpoints
- Allow OsfStorage tasks to be disabled by including archive: false
0.8.0 (2015-07-14)
==================
- Add support for passing the Range header through
- Exceptions are no longer raised when a client connection cuts off early
- ResponseStreamReader may override file names via .name
- Calls to metadata now return BaseMetadata objects or a list thereof
- Upgrade to tornado 4.2, which increases compatibility with asyncio
- General code clean up
- Add a style/contributing guide
- Uploading files is now implemented with unix sockets and will not buffer the
entire file into memory
- Accept files up to 4.9GBs
- view_url is included with file metadata requests
- Flake8 is now much more aggressive
0.7.0 (2015-06-18)
==================
- Read me updates
- Various fixes for S3
- Fixes to Dataverse's copy and move
- Various fixes for figshare
0.6.0 (2015-06-07)
==================
- Various fixes to Google drive
- Allow response streams to be "unsizable"
- Return an additional "etag" field with file metadata
0.5.0 (2015-05-25)
==================
- Implement moving and expose it via http
- Implement copying and expose it via http
- Implement downloading as zip and expose it via http
0.4.0 (2015-04-28)
==================
- Add folder creation
0.3.0 (2015-04-20)
==================
- Add harvard dataverse as a provider
0.2.4 (2015-03-18)
==================
- Allow ssl certs to be specified in the config