Apache Arrow 0.10.0 (6 August 2018)
This is a major release.
Download
Contributors
$ git shortlog -sn apache-arrow-0.9.0..apache-arrow-0.10.0
70 Antoine Pitrou
49 Kouhei Sutou
40 Korn, Uwe
37 Wes McKinney
32 Krisztián Szűcs
30 Andy Grove
20 Philipp Moritz
13 Phillip Cloud
11 Bryan Cutler
11 yosuke shiro
7 Dimitri Vorona
6 Zhijun Fu
5 Bruce Mitchener
5 Joshua Storck
5 Robert Nishihara
5 ptaylor
4 Maximilian Roos
4 Sebastien Binet
3 Alex
3 Brian Hulette
3 Chao Sun
3 Dominik Moritz
3 Kenji Okimoto
3 Marco Neumann
3 Yuhong Guo
2 Abhi
2 Dhruv Madeka
2 Dmitry Kalinkin
2 Donal Simmie
2 Frank Wessels
2 Julius Neuffer
2 Manabu Ejima
2 Omer Katz
2 Paddy
2 Paddy Horan
2 Robert Gruener
2 Teddy Choi
2 Vanco Buca
2 Venki Korukanti
2 bomeng
2 fjetter
2 liurenjie1024
2 songqing
1 284km
1 Adrian Dorr
1 Albert Shieh
1 Alessandro Andrioni
1 Alok Singh
1 Aneesh Karve
1 Atul Dambalkar
1 Ben Wolfson
1 Brent Kerby
1 Daniel Chalef
1 Daniel Compton
1 Florian Rathgeber
1 Gatis Seja
1 HE, Tao
1 James Lamb
1 Jeff Zhang
1 Juan Paulo Gutierrez
1 Kane
1 Kee Chong Tan
1 Kelsey Jordahl
1 Kendall Willets
1 Li Jin
1 Licht-T
1 Lizhou Gao
1 Louis Potok
1 Markus Klein
1 Matt Topol
1 Matthew Topol
1 Michael Sarahan
1 Paul Taylor
1 Peter Schafhalter
1 Philipp Hoch
1 Renato Marroquin
1 Richard Gowers
1 Robbie Gruener
Patch Committers
The following Apache committers merged contributed patches to the repository.
$ git shortlog -csn apache-arrow-0.9.0..apache-arrow-0.10.0
120 Wes McKinney
119 Korn, Uwe
63 Antoine Pitrou
50 Uwe L. Korn
28 Kouhei Sutou
27 Philipp Moritz
15 Bryan Cutler
15 Phillip Cloud
8 Robert Nishihara
6 Sidd
4 Brian Hulette
2 GitHub
1 Your Name Here
1 ptaylor
Changelog
New Features and Improvements
- ARROW-1018 - [C++] Add option to create FileOutputStream, ReadableFile from OS file descriptor
- ARROW-1163 - [Plasma][Java] Java client for Plasma
- ARROW-1388 - [Python] Add Table.drop method for removing columns
- ARROW-1454 - [Python] More informative error message when attempting to write an unsupported Arrow type to Parquet format
- ARROW-1715 - [Python] Implement pickling for Column, ChunkedArray, RecordBatch, Table
- ARROW-1722 - [C++] Add linting script to look for C++/CLI issues
- ARROW-1731 - [Python] Provide for selecting a subset of columns to convert in RecordBatch/Table.from_pandas
- ARROW-1744 - [Plasma] Provide TensorFlow operator to read tensors from plasma
- ARROW-1780 - [Java] JDBC Adapter for Apache Arrow
- ARROW-1858 - [Python] Add documentation about parquet.write_to_dataset and related methods
- ARROW-1868 - [Java] Change vector getMinorType to use MinorType instead of Types.MinorType
- ARROW-1886 - [Python] Add function to “flatten” structs within tables
- ARROW-1913 - [Java] Fix Javadoc generation bugs with JDK8
- ARROW-1928 - [C++] Add benchmarks comparing performance of internal::BitmapReader/Writer with naive approaches
- ARROW-1954 - [Python] Add metadata accessor to pyarrow.Field
- ARROW-1964 - [Python] Expose Builder classes
- ARROW-2014 - [Python] Document read_pandas method in pyarrow.parquet
- ARROW-2055 - [Java] Upgrade to Java 8
- ARROW-2060 - [Python] Documentation for creating StructArray using from_arrays or a sequence of dicts
- ARROW-2061 - [C++] Run ASAN builds in Travis CI
- ARROW-2074 - [Python] Allow type inference for struct arrays
- ARROW-2097 - [Python] Suppress valgrind stdout/stderr in Travis CI builds when there are no errors
- ARROW-2100 - [Python] Drop Python 3.4 support
- ARROW-2140 - [Python] Conversion from Numpy float16 array unimplemented
- ARROW-2141 - [Python] Conversion from Numpy object array to varsize binary unimplemented
- ARROW-2147 - [Python] Type inference doesn’t work on lists of Numpy arrays
- ARROW-2207 - [GLib] Support decimal type
- ARROW-2222 - [C++] Add option to validate Flatbuffers messages
- ARROW-2224 - [C++] Get rid of boost regex usage
- ARROW-2241 - [Python] Simple script for running all current ASV benchmarks at a commit or tag
- ARROW-2264 - [Python] Efficiently serialize numpy arrays with dtype of unicode fixed length string
- ARROW-2267 - Rust bindings
- ARROW-2276 - [Python] Tensor could implement the buffer protocol
- ARROW-2281 - [Python] Expose MakeArray to construct arrays from buffers
- ARROW-2285 - [Python] Can’t convert Numpy string arrays
- ARROW-2286 - [Python] Allow subscripting pyarrow.lib.StructValue
- ARROW-2287 - [Python] chunked array not iterable, not indexable
- ARROW-2299 - [Go] Go language implementation
- ARROW-2301 - [Python] Add source distribution publishing instructions to package / release management documentation
- ARROW-2302 - [GLib] Run autotools and meson Linux builds in same Travis CI build entry
- ARROW-2308 - Serialized tensor data should be 64-byte aligned.
- ARROW-2315 - [C++/Python] Add method to flatten a struct array
- ARROW-2319 - [C++] Add buffered output class implementing OutputStream interface
- ARROW-2322 - Document requirements to run dev/release/01-perform.sh
- ARROW-2325 - [Python] Update setup.py to use Markdown project description
- ARROW-2330 - [C++] Optimize delta buffer creation with partially finishable array builders
- ARROW-2332 - [Python] Provide API for reading multiple Feather files
- ARROW-2334 - [C++] Update boost to 1.66.0
- ARROW-2335 - [Go] Move Go README one directory higher
- ARROW-2340 - [Website] Add blog post about Go codebase donation
- ARROW-2341 - [Python] pa.union() mode argument unintuitive
- ARROW-2343 - [Java/Packaging] Run mvn clean in API doc builds
- ARROW-2344 - [Go] Run Go unit tests in Travis CI
- ARROW-2345 - [Documentation] Fix bundle exec and set sphinx nosidebar to True
- ARROW-2348 - [GLib] Remove Go example
- ARROW-2350 - Shrink size of spark_integration Docker container
- ARROW-2353 - Test correctness of built wheel on AppVeyor
- ARROW-2361 - [Rust] Start native Rust Implementation
- ARROW-2364 - [Plasma] PlasmaClient::Get() could take vector of object ids
- ARROW-2376 - [Rust] Travis should run tests for Rust library
- ARROW-2378 - [Rust] Use rustfmt to format source code
- ARROW-2381 - [Rust] Buffer should have an Iterator
- ARROW-2384 - Rust: Use Traits rather than defining methods directly
- ARROW-2385 - [Rust] Implement to_json() for Field and DataType
- ARROW-2388 - [C++] Arrow::StringBuilder::Append() uses null_bytes not valid_bytes
- ARROW-2389 - [C++] Add StatusCode::OverflowError
- ARROW-2390 - [C++/Python] CheckPyError() could inspect exception type
- ARROW-2395 - [Python] Correct flake8 errors outside of pyarrow/ directory
- ARROW-2396 - Unify Rust Errors
- ARROW-2397 - Document changes in Tensor encoding in IPC.md.
- ARROW-2398 - [Rust] Provide a zero-copy builder for type-safe Buffer
- ARROW-2400 - [C++] Status destructor is expensive
- ARROW-2401 - Support filters on Hive partitioned Parquet files
- ARROW-2402 - [C++] FixedSizeBinaryBuilder::Append lacks “const char*” overload
- ARROW-2404 - Fix declaration of ‘type_id’ hides class member warning in msvc build
- ARROW-2407 - [GLib] Add garrow_string_array_builder_append_values()
- ARROW-2408 - [Rust] It should be possible to get a &mut[T] from Builder
- ARROW-2411 - [C++] Add method to append batches of null-terminated strings to StringBuilder
- ARROW-2413 - [Rust] Remove useless use of `format!`
- ARROW-2414 - [Documentation] Fix miscellaneous documentation typos
- ARROW-2415 - [Rust] Fix using references in pattern matching
- ARROW-2416 - [C++] Support system libprotobuf
- ARROW-2417 - [Rust] Review APIs for safety
- ARROW-2422 - [Python] Support more filter operators on Hive partitioned Parquet files
- ARROW-2427 - [C++] ReadAt implementations suboptimal
- ARROW-2430 - MVP for branch based packaging automation
- ARROW-2433 - [Rust] Add Builder.push_slice(&[T])
- ARROW-2434 - [Rust] Add windows support
- ARROW-2435 - [Rust] Add memory pool abstraction.
- ARROW-2436 - [Rust] Add windows CI
- ARROW-2440 - [Rust] Implement ListBuilder
- ARROW-2442 - [C++] Disambiguate Builder::Append overloads
- ARROW-2445 - [Rust] Add documentation and make some fields private
- ARROW-2448 - Segfault when plasma client goes out of scope before buffer.
- ARROW-2451 - Handle more dtypes efficiently in custom numpy array serializer.
- ARROW-2453 - [Python] Improve Table column access
- ARROW-2458 - [Plasma] PlasmaClient uses global variable
- ARROW-2463 - [C++] Update flatbuffers to 1.9.0
- ARROW-2464 - [Python] Use a python_version marker instead of a condition
- ARROW-2469 - Make out arguments last in ReadMessage API.
- ARROW-2470 - [C++] FileGetSize() should not seek
- ARROW-2472 - [Rust] The Schema and Fields types should not have public attributes
- ARROW-2477 - [Rust] Set up code coverage in CI
- ARROW-2478 - [C++] Introduce a checked_cast function that performs a dynamic_cast in debug mode
- ARROW-2479 - [C++] Have a global thread pool
- ARROW-2480 - [C++] Enable casting the value of a decimal to int32_t or int64_t
- ARROW-2481 - [Rust] Move calls to free() into memory.rs
- ARROW-2482 - [Rust] support nested types
- ARROW-2484 - [C++] Document ABI compliance checking
- ARROW-2485 - [C++] Output diff when run_clang_format.py reports a change
- ARROW-2486 - [C++/Python] Provide a Docker image that contains all dependencies for development
- ARROW-2488 - [C++] List Boost 1.67 as supported version
- ARROW-2493 - [Python] Add support for pickling to buffers and arrays
- ARROW-2494 - Return status codes from PlasmaClient::Seal
- ARROW-2498 - [Java] Upgrade to JDK 1.8
- ARROW-2499 - [C++] Add iterator facility for Python sequences
- ARROW-2505 - [C++] Disable MSVC warning C4800
- ARROW-2506 - [Plasma] Build error on macOS
- ARROW-2507 - [Rust] Don’t take a reference when not needed
- ARROW-2508 - [Python] pytest API changes make tests fail
- ARROW-2513 - [Python] DictionaryType should give access to index type and dictionary array
- ARROW-2516 - AppVeyor Build Matrix should be specific to the changes made in a PR
- ARROW-2521 - [Rust] Refactor Rust API to use traits and generics
- ARROW-2522 - [C++] Version shared library files
- ARROW-2525 - [GLib] Add garrow_struct_array_flatten()
- ARROW-2526 - [GLib] Update .gitignore
- ARROW-2527 - [GLib] Enable GPU document
- ARROW-2529 - [C++] Update mention of clang-format to 5.0 in the docs
- ARROW-2531 - [C++] Update clang bits to 6.0
- ARROW-2533 - [CI] Fast finish failing AppVeyor builds
- ARROW-2536 - [Rust] ListBuilder uses wrong initial size for offset builder
- ARROW-2537 - [Ruby] Import
- ARROW-2539 - [Plasma] Use unique_ptr instead of raw pointer
- ARROW-2540 - [Plasma] add constructor/destructor to make sure dlfree is called automatically
- ARROW-2541 - [Plasma] Clean up macro usage
- ARROW-2543 - [Rust] CI should cache dependencies for faster builds
- ARROW-2544 - [CI] Run C++ tests with two jobs on Travis-CI
- ARROW-2547 - [Format] Fix off-by-one in List<List> example
- ARROW-2548 - [Format] Clarify `List` Array example
- ARROW-2549 - [GLib] Apply arrow::StatusCodes changes to GArrowError
- ARROW-2550 - [C++] Add missing status codes into arrow::StatusCode::CodeAsString()
- ARROW-2551 - [Plasma] Improve notification logic
- ARROW-2553 - [Python] Set MACOSX_DEPLOYMENT_TARGET in wheel build
- ARROW-2558 - [Plasma] avoid walk through all the objects when a client disconnects
- ARROW-2562 - [C++] Upload coverage data to codecov.io
- ARROW-2563 - [Rust] Poor caching in Travis-CI
- ARROW-2566 - [CI] Add codecov.io badge to README
- ARROW-2567 - [C++/Python] Unit is ignored on comparison of TimestampArrays
- ARROW-2568 - [Python] Expose thread pool size setting to Python, and deprecate “nthreads”
- ARROW-2569 - [C++] Improve thread pool size heuristic
- ARROW-2574 - [CI] Collect and publish Python coverage
- ARROW-2576 - [GLib] Add abs functions for Decimal128.
- ARROW-2577 - [Plasma] Add ASV benchmarks
- ARROW-2580 - [GLib] Fix abs functions for Decimal128
- ARROW-2582 - [GLib] Add negate functions for Decimal128
- ARROW-2585 - [C++] Add Decimal128::FromBigEndian
- ARROW-2586 - [C++] Make child builders of ListBuilder and StructBuilder shared_ptr’s
- ARROW-2595 - [Plasma] operator[] creates entries in map
- ARROW-2596 - [GLib] Use the default value of GTK-Doc
- ARROW-2597 - [Plasma] remove UniqueIDHasher
- ARROW-2604 - [Java] Add method overload for VarCharVector.set(int,String)
- ARROW-2608 - [Java/Python] Add pyarrow.{Array,Field}.from_jvm / jvm_buffer
- ARROW-2611 - [Python] Python 2 integer serialization
- ARROW-2612 - [Plasma] Fix deprecated PLASMA_DEFAULT_RELEASE_DELAY
- ARROW-2613 - [Docs] Update the gen_apidocs docker script
- ARROW-2614 - [CI] Remove ‘group: deprecated’ in Travis
- ARROW-2626 - [Python] pandas ArrowInvalid message should include failing column name
- ARROW-2634 - [Go] Add LICENSE additions for Go subproject
- ARROW-2635 - [Ruby] LICENSE.txt isn’t suitable
- ARROW-2636 - [Ruby] “Unofficial” package note is missing
- ARROW-2638 - [Python] Prevent calling extension class constructors directly
- ARROW-2639 - [Python] Remove unnecessary _check_nullptr methods
- ARROW-2641 - [C++] Investigate spurious memset() calls
- ARROW-2645 - [Java] ArrowStreamWriter accumulates DictionaryBatch ArrowBlocks
- ARROW-2649 - [C++] Add std::generate()-like function for faster bitmap writing
- ARROW-2656 - [Python] Improve ParquetManifest creation time
- ARROW-2660 - [Python] Experiment with zero-copy pickling
- ARROW-2661 - [Python/C++] Allow passing HDFS Config values via map/dict instead of needing an hdfs-site.xml file
- ARROW-2662 - [Python] Add to_pandas / to_numpy to ChunkedArray
- ARROW-2663 - [Python] Make dictionary_encode and unique accessible on Column / ChunkedArray
- ARROW-2664 - [Python] Implement __getitem__ / slicing on Buffer
- ARROW-2666 - [Python] numpy.asarray should trigger to_pandas on Array/ChunkedArray
- ARROW-2672 - [Python] Build ORC extension in manylinux1 wheels
- ARROW-2674 - [Packaging] Start building nightlies
- ARROW-2676 - [Packaging] Deploy build artifacts to github releases
- ARROW-2677 - [Python] Expose Parquet ZSTD compression
- ARROW-2678 - [GLib] Add extra information to common build problems on macOS
- ARROW-2680 - [Python] Add documentation about type inference in Table.from_pandas
- ARROW-2682 - [CI] Notify in Slack about broken builds
- ARROW-2689 - [Python] Remove references to timestamps_to_ms argument from documentation
- ARROW-2692 - [Python] Add test for writing dictionary encoded columns to chunked Parquet files
- ARROW-2695 - [Python] Prevent calling scalar constructors directly
- ARROW-2696 - [Java] enhance AllocationListener with an onFailedAllocation() call
- ARROW-2699 - [C++/Python] Add Table method that replaces a column with a new supplied column
- ARROW-2700 - [Python] Add simple examples to Array.cast docstring
- ARROW-2701 - [C++] Make MemoryMappedFile resizable
- ARROW-2704 - [Java] IPC stream handling should be more friendly to low level processing
- ARROW-2713 - [Packaging] Fix linux package builds
- ARROW-2717 - [Packaging] Postfix conda artifacts with target arch
- ARROW-2718 - [Packaging] GPG sign downloaded artifacts
- ARROW-2724 - [Packaging] Determine whether all the expected artifacts are uploaded
- ARROW-2725 - [Java] make Accountant.AllocationOutcome publicly visible
- ARROW-2729 - [GLib] Add decimal128 array builder
- ARROW-2731 - Allow usage of external ORC library
- ARROW-2732 - Update brew packages for macOS
- ARROW-2733 - [GLib] Cast garrow_decimal128 to gint64
- ARROW-2738 - [GLib] Use Brewfile on installation process
- ARROW-2739 - [GLib] Use G_DECLARE_DERIVABLE_TYPE for GArrowDecimalDataType and GArrowDecimal128ArrayBuilder
- ARROW-2740 - [Python] Add address property to Buffer
- ARROW-2742 - [Python] Allow Table.from_batches to use Iterator of ArrowRecordBatches
- ARROW-2748 - [GLib] Add garrow_decimal_data_type_get_scale() (and _precision())
- ARROW-2749 - [GLib] Rename *garrow_decimal128_array_get_value to *garrow_decimal128_array_format_value
- ARROW-2751 - [GLib] Add garrow_table_replace_column()
- ARROW-2752 - [GLib] Document garrow_decimal_data_type_new()
- ARROW-2753 - [GLib] Add garrow_schema_*_field()
- ARROW-2755 - [Python] Allow using Ninja to build extension
- ARROW-2756 - [Python] Remove redundant imports and minor fixes in parquet tests
- ARROW-2758 - [Plasma] Use Scope enum in Plasma
- ARROW-2760 - [Python] Remove legacy property definition syntax from parquet module and test them
- ARROW-2761 - Support set filter operators on Hive partitioned Parquet files
- ARROW-2763 - [Python] Make parquet _metadata file accessible from ParquetDataset
- ARROW-2780 - [Go] Run code coverage analysis
- ARROW-2784 - [C++] MemoryMappedFile::WriteAt allow writing past the end
- ARROW-2790 - [C++] Buffers contain uninitialized memory
- ARROW-2791 - [Packaging] Build Ubuntu 18.04 packages
- ARROW-2792 - [Packaging] Consider uploading tarballs to avoid naming conflicts
- ARROW-2794 - [Plasma] Add Delete method for multiple objects
- ARROW-2798 - [Plasma] Use hashing function that takes into account all UniqueID bytes
- ARROW-2802 - [Docs] Move release management guide to project wiki
- ARROW-2804 - [Website] Link to Developer wiki (Confluence) from front page
- ARROW-2805 - [Python] TensorFlow import workaround not working with tensorflow-gpu if CUDA is not installed
- ARROW-2809 - [C++] Decrease verbosity of lint checks in Travis CI
- ARROW-2811 - [Python] Test serialization for determinism
- ARROW-2815 - [CI] Suppress DEBUG logging when building Java library in C++ CI entries
- ARROW-2816 - [Python] Add __iter__ method to NativeFile
- ARROW-2821 - [C++] Only zero memory in BooleanBuilder in one place
- ARROW-2822 - [C++] Zero padding bytes in PoolBuffer::Resize
- ARROW-2824 - [GLib] Add garrow_decimal128_array_get_value()
- ARROW-2825 - [C++] Need AllocateBuffer / AllocateResizableBuffer variant with default memory pool
- ARROW-2826 - [C++] Clarification needed between ArrayBuilder::Init(), Resize() and Reserve()
- ARROW-2827 - [C++] LZ4 and Zstd build may fail in parallel builds
- ARROW-2829 - [GLib] Add GArrowORCFileReader
- ARROW-2830 - [Packaging] Enable parallel build for deb package build again
- ARROW-2833 - [Python] Column.__repr__ will lock up Jupyter with large datasets
- ARROW-2834 - [GLib] Remove “enable_” prefix from Meson options
- ARROW-2836 - [Packaging] Expand build matrices to multiple tasks
- ARROW-2837 - [C++] ArrayBuilder::null_bitmap returns PoolBuffer
- ARROW-2838 - [Python] Speed up null testing with Pandas semantics
- ARROW-2844 - [Packaging] Test OSX wheels after build
- ARROW-2845 - [Packaging] Upload additional debian artifacts
- ARROW-2846 - [Packaging] Update nightly build in crossbow as well as the sample configuration
- ARROW-2847 - [Packaging] Fix artifact name matching for conda forge packages
- ARROW-2848 - [Packaging] lib*.deb package name doesn’t match so version
- ARROW-2849 - [Ruby] Arrow::Table#load supports ORC
- ARROW-2855 - [C++] Blog post that outlines the benefits of using jemalloc
- ARROW-2859 - [Python] Handle objects exporting the buffer protocol in open_stream, open_file, and RecordBatch*Reader APIs
- ARROW-2861 - [Python] Add extra tips about using Parquet to store index-less pandas data
- ARROW-2864 - [Plasma] Add deletion cache to delete objects later
- ARROW-2868 - [Packaging] Fix centos-7 build
- ARROW-2869 - [Python] Add documentation for Array.to_numpy
- ARROW-2875 - [Packaging] Don’t attempt to download arrow archive in linux builds
- ARROW-2881 - [Website] Add Community tab to website
- ARROW-2884 - [Packaging] Options to build packages from apache source archive
- ARROW-2886 - [Release] An unused variable exists
- ARROW-2890 - [Plasma] Make Python PlasmaClient.release private
- ARROW-2893 - [C++] Remove PoolBuffer class from public API and hide implementation details behind factory functions
- ARROW-2897 - Organize supported Ubuntu versions
- ARROW-2898 - [Packaging] Setuptools_scm just shipped a new version which fails to parse `apache-arrow-` tag
- ARROW-2906 - [Website] Remove the link to slack channel
- ARROW-2907 - [GitHub] Improve “How to contribute patches”
- ARROW-2908 - [Rust] Update version to 0.10.0
- ARROW-2914 - [Integration] Add WindowPandasUDFTests to Spark Integration
- ARROW-2915 - [Packaging] Remove artifact from ubuntu-trusty build
- ARROW-2918 - [C++] Improve formatting of Struct pretty prints
- ARROW-2921 - [Release] Update .deb/.rpm changelogs in preparation
- ARROW-2922 - [Release] Make python command name customizable
- ARROW-2923 - [Doc] Add instructions for running Spark integration tests
- ARROW-2924 - [Java] mvn release fails when an older maven javadoc plugin is installed
- ARROW-2927 - [Packaging] AppVeyor wheel task is failing on initial checkout
- ARROW-2928 - [Packaging] AppVeyor crossbow conda builds are picking up boost 1.63.0 instead of the installed version
- ARROW-2929 - [C++] ARROW-2826 Breaks parquet-cpp 1.4.0 builds
- ARROW-2934 - [Packaging] Add checksums creation to sign subcommand
- ARROW-2935 - [Packaging] Add verify_binary_artifacts function to verify-release-candidate.sh
- ARROW-2937 - [Java] Follow-up changes to ARROW-2704
- ARROW-2943 - [C++] Implement BufferedOutputStream::Flush
- ARROW-2944 - [Format] Arrow columnar format docs mentions VectorLayout that does not exist anymore
- ARROW-2946 - [Packaging] Stop using PWD in debian/rules
- ARROW-2947 - [Packaging] Remove Ubuntu Artful
- ARROW-2949 - [CI] repo.continuum.io can be flaky in builds
- ARROW-2951 - [CI] Changes in format/ should cause Appveyor builds to run
- ARROW-2953 - [Plasma] Store memory usage
- ARROW-2954 - [Plasma] Store object_id only once in object table
- ARROW-2962 - [Packaging] Bintray descriptor files are no longer needed
- ARROW-2977 - [Packaging] Release verification script should check rust too
- ARROW-2985 - [Ruby] Run unit tests in verify-release-candidate.sh
- ARROW-2988 - [Release] More automated release verification on Windows
- ARROW-2990 - [GLib] Fail to build with rpath-ed Arrow C++ on macOS
- ARROW-530 - C++/Python: Provide subpools for better memory allocation tracking
- ARROW-564 - [Python] Add methods to return vanilla NumPy arrays (plus boolean mask array if there are nulls)
- ARROW-889 - [C++] Implement arrow::PrettyPrint for ChunkedArray
- ARROW-902 - [C++] Build C++ project including thirdparty dependencies from local tarballs
- ARROW-906 - [C++] Serialize Field metadata to IPC metadata
Bug Fixes
- ARROW-2059 - [Python] Possible performance regression in Feather read/write path
- ARROW-2101 - [Python] from_pandas reads ‘str’ type as binary Arrow data with Python 2
- ARROW-2122 - [Python] Pyarrow fails to serialize dataframe with timestamp.
- ARROW-2182 - [Python] ASV benchmark setup does not account for C++ library changing
- ARROW-2193 - [Plasma] plasma_store has runtime dependency on Boost shared libraries when ARROW_BOOST_USE_SHARED=on
- ARROW-2195 - [Plasma] Segfault when retrieving RecordBatch from plasma store
- ARROW-2247 - [Python] Statically-linking boost_regex in both libarrow and libparquet results in segfault
- ARROW-2273 - Cannot deserialize pandas SparseDataFrame
- ARROW-2300 - [Python] python/testing/test_hdfs.sh no longer works
- ARROW-2305 - [Python] Cython 0.25.2 compilation failure
- ARROW-2314 - [Python] Union array slicing is defective
- ARROW-2326 - [Python] cannot import pip installed pyarrow on OS X (10.9)
- ARROW-2328 - Writing a slice with feather ignores the offset
- ARROW-2331 - [Python] Fix indexing implementations
- ARROW-2333 - [Python] boost bundling fails in setup.py
- ARROW-2342 - [Python] Aware timestamp type fails pickling
- ARROW-2346 - [Python] PYARROW_CXXFLAGS doesn’t accept multiple options
- ARROW-2349 - [Python] Boost shared library bundling is broken for MSVC
- ARROW-2351 - [C++] StringBuilder::append(vector...) not implemented
- ARROW-2354 - [C++] PyDecimal_Check() is much too slow
- ARROW-2355 - [Python] Unable to import pyarrow [0.9.0] OSX
- ARROW-2357 - Benchmark PandasObjectIsNull
- ARROW-2368 - DecimalVector#setBigEndian is not padding correctly for negative values
- ARROW-2369 - Large (>~20 GB) files written to Parquet via PyArrow are corrupted
- ARROW-2370 - [GLib] include path is wrong on Meson build
- ARROW-2371 - [GLib] gio-2.0 isn’t required on GNU Autotools build
- ARROW-2372 - [Python] ArrowIOError: Invalid argument when reading Parquet file
- ARROW-2375 - [Rust] Buffer should release memory when dropped
- ARROW-2377 - [GLib] Travis-CI failures
- ARROW-2380 - [Python] Correct issues in numpy_to_arrow conversion routines
- ARROW-2382 - [Rust] List was not using memory safely
- ARROW-2383 - [C++] Debian packages need to depend on libprotobuf
- ARROW-2387 - [Python] negative decimal values get spurious rescaling error
- ARROW-2391 - [Python] Segmentation fault from PyArrow when mapping Pandas datetime column to pyarrow.date64
- ARROW-2393 - [C++] arrow/status.h does not define ARROW_CHECK needed for ARROW_CHECK_OK
- ARROW-2403 - [C++] arrow::CpuInfo::model_name_ destructed twice on exit
- ARROW-2405 - [C++] is missing in plasma/client.h
- ARROW-2418 - [Rust] List builder fails due to memory not being reserved correctly
- ARROW-2419 - [Site] Website generation depends on local timezone
- ARROW-2420 - [Rust] Memory is never released
- ARROW-2423 - [Python] PyArrow datatypes raise ValueError on equality checks against non-PyArrow objects
- ARROW-2424 - [Rust] Missing import causing broken build
- ARROW-2425 - [Rust] Array::from missing mapping for u8 type
- ARROW-2426 - [CI] glib build failure
- ARROW-2432 - [Python] from_pandas fails when converting decimals if have None values
- ARROW-2437 - [C++] Change of arrow::ipc::ReadMessage signature breaks ABI compatibility
- ARROW-2441 - [Rust] Builder::slice_mut assertions are too strict
- ARROW-2443 - [Python] Conversion from pandas of empty categorical fails with ArrowInvalid
- ARROW-2450 - [Python] Saving to parquet fails for empty lists
- ARROW-2452 - [TEST] Spark integration test fails with permission error
- ARROW-2454 - [Python] Empty chunked array slice crashes
- ARROW-2455 - [C++] The bytes_allocated_ in CudaContextImpl isn’t initialized
- ARROW-2457 - garrow_array_builder_append_values() won’t work for large arrays
- ARROW-2459 - pyarrow: Segfault with pyarrow.deserialize_pandas
- ARROW-2462 - [C++] Segfault when writing a parquet table containing a dictionary column from Record Batch Stream
- ARROW-2465 - [Plasma] plasma_store fails to find libarrow_gpu.so
- ARROW-2466 - [C++] misleading “append” flag to FileOutputStream
- ARROW-2468 - [Rust] Builder::slice_mut should take mut self
- ARROW-2471 - [Rust] Assertion when pushing value to Builder/ListBuilder with zero capacity
- ARROW-2473 - [Rust] List assertion error with list of zero length
- ARROW-2474 - [Rust] Add windows support for memory pool abstraction
- ARROW-2489 - [Plasma] test_plasma.py crashes
- ARROW-2491 - [Python] Array.from_buffers does not work for ListArray
- ARROW-2492 - [Python] Prevent segfault on accidental call of pyarrow.Array
- ARROW-2500 - [Java] IPC Writers/readers are not always setting validity bits correctly
- ARROW-2502 - [Rust] Restore Windows Compatibility
- ARROW-2503 - [Python] Trailing space character in RowGroup statistics of pyarrow.parquet.ParquetFile
- ARROW-2509 - [CI] Intermittent npm failures
- ARROW-2511 - BaseVariableWidthVector.allocateNew is not throwing OOM when it can’t allocate memory
- ARROW-2514 - [Python] Inferring / converting nested Numpy array is very slow
- ARROW-2515 - Errors with DictionaryArray inside of ListArray or other DictionaryArray
- ARROW-2518 - [Java] Restore Java unit tests and javadoc test to CI matrix
- ARROW-2530 - [GLib] Out-of-source build fails
- ARROW-2534 - [C++] libarrow.so leaks zlib symbols
- ARROW-2545 - [Python] Arrow fails linking against statically-compiled Python
- ARROW-2554 - pa.array type inference bug when using NS-timestamp
- ARROW-2557 - [Rust] Add badge for code coverage in README
- ARROW-2561 - [C++] Crash in cuda-test shutdown with coverage enabled
- ARROW-2564 - [C++] Rowwise Tutorial is out of date
- ARROW-2565 - [Plasma] new subscriber cannot receive notifications about existing objects
- ARROW-2570 - [Python] Add support for writing parquet files with LZ4 compression
- ARROW-2571 - [C++] Lz4Codec doesn’t properly handle empty data
- ARROW-2575 - [Python] Exclude hidden files when reading Parquet dataset
- ARROW-2578 - [Plasma] Valgrind errors related to std::random_device
- ARROW-2589 - [Python] test_parquet.py regression with Pandas 0.23.0
- ARROW-2593 - [Python] TypeError: data type “mixed-integer” not understood
- ARROW-2594 - [Java] Vector reallocation does not properly clear reused buffers
- ARROW-2601 - [Python] MemoryPool bytes_allocated causes seg
- ARROW-2603 - [Python] from pandas raises ArrowInvalid for date(time) subclasses
- ARROW-2615 - [Rust] Refactor introduced a bug around Arrays of String
- ARROW-2629 - [Plasma] Iterator invalidation for pending_notifications_
- ARROW-2630 - [Java] Typo in the document
- ARROW-2632 - [Java] ArrowStreamWriter accumulates ArrowBlock but does not use them
- ARROW-2640 - JS Writer should serialize schema metadata
- ARROW-2643 - [C++] Travis-CI build failure with cpp toolchain enabled
- ARROW-2644 - [Python] parquet binding fails building on AppVeyor
- ARROW-2655 - [C++] Failure with -Werror=conversion on gcc 7.3.0
- ARROW-2657 - Segfault when importing TensorFlow after Pyarrow
- ARROW-2668 - [C++] -Wnull-pointer-arithmetic warning with dlmalloc.c on clang 6.0, Ubuntu 14.04
- ARROW-2669 - [C++] EP_CXX_FLAGS not passed on when building gbenchmark
- ARROW-2675 - Arrow build error with clang-10 (Apple Clang / LLVM)
- ARROW-2683 - [Python] Resource Warning (Unclosed File) when using pyarrow.parquet.read_table()
- ARROW-2690 - [C++] Plasma does not follow style conventions for variable and function names
- ARROW-2691 - [Rust] Travis fails due to formatting diff
- ARROW-2693 - [Python] pa.chunked_array causes a segmentation fault on empty input
- ARROW-2694 - [Python] ArrayValue string conversion returns the representation instead of the converted python object string
- ARROW-2698 - [Python] Exception when passing a string to Table.column
- ARROW-2711 - [Python/C++] Pandas-Arrow doesn’t roundtrip when column of lists has empty first element
- ARROW-2716 - [Python] Make manylinux1 base image independent of Python patch releases
- ARROW-2721 - [C++] Link error with Arrow C++ build with -DARROW_ORC=ON on CentOS 7
- ARROW-2722 - [Python] ndarray to arrow conversion fails when downcasted from pandas to_numeric
- ARROW-2723 - [C++] arrow-orc.pc is missing
- ARROW-2726 - [C++] The latest Boost version is wrong
- ARROW-2727 - [Java] Unable to build java/adapters module
- ARROW-2741 - [Python] pa.array from np.datetime[D] and type=pa.date64 produces invalid results
- ARROW-2744 - [Python] Writing to parquet crashes when writing a ListArray of empty lists
- ARROW-2745 - [C++] ORC ExternalProject needs to declare dependency on vendored protobuf
- ARROW-2747 - [CI] [Plasma] huge tables test failure on Travis
- ARROW-2754 - [Python] When installing pyarrow via pip, a debug build is created
- ARROW-2770 - [Packaging] Account for conda-forge compiler migration in conda recipes
- ARROW-2773 - [Python] Corrected parquet docs partition_cols parameter name
- ARROW-2781 - [Python] Download boost using curl in manylinux1 image
- ARROW-2787 - [Python] Memory Issue passing table from python to c++ via cython
- ARROW-2795 - [Python] Run TensorFlow import workaround only on Linux
- ARROW-2806 - [Python] Inconsistent handling of np.nan
- ARROW-2810 - [Plasma] Plasma public headers leak flatbuffers.h
- ARROW-2812 - [Ruby] StructArray#[] raises NoMethodError
- ARROW-2820 - [Python] RecordBatch.from_arrays does not validate array lengths are all equal
- ARROW-2823 - [C++] Search for flatbuffers in /lib64
- ARROW-2841 - [Go] Fix recent Go build failures in Travis CI
- ARROW-2850 - [C++/Python] PARQUET_RPATH_ORIGIN=ON missing in manylinux1 build
- ARROW-2851 - [C++] Update RAT excludes for new install file names
- ARROW-2852 - [Rust] Mark Array as Sync and Send
- ARROW-2862 - [C++] Ensure thirdparty download directory has been created in thirdparty/download_thirdparty.sh
- ARROW-2867 - [Python] Incorrect example for Cython usage
- ARROW-2871 - [Python] Array.to_numpy is invalid for boolean arrays
- ARROW-2872 - [Python] Add pytest mark to opt into TensorFlow-related unit tests
- ARROW-2876 - [Packaging] Crossbow builds can hang if you cloned using SSH
- ARROW-2877 - [Packaging] crossbow submit results in duplicate Travis CI build
- ARROW-2878 - [Packaging] README.md does not mention setting GitHub API token in user’s crossbow repo settings
- ARROW-2883 - [Plasma] Compilation warnings
- ARROW-2891 - Preserve schema in write_to_dataset
- ARROW-2894 - [GLib] Format tests broken due to recent refactor
- ARROW-2895 - [Ruby] CI isn’t run when C++ is changed
- ARROW-2896 - [GLib] export are missing
- ARROW-2901 - [Java] Build is failing on Java9
- ARROW-2902 - [Python] HDFS Docker integration tests leave around files created by root
- ARROW-2911 - [Python] Parquet binary statistics that end in ‘\0’ truncate last byte
- ARROW-2917 - [Python] Tensor requiring gradient cannot be serialized with pyarrow.serialize
- ARROW-2920 - [Python] Segfault with pytorch 0.4
- ARROW-2926 - [Python] ParquetWriter segfaults in example where passed schema and table schema do not match
- ARROW-2930 - [C++] Trying to set target properties on not existing CMake target
- ARROW-2940 - [Python] Import error with pytorch 0.3
- ARROW-2945 - [Packaging] Update argument check for 02-source.sh
- ARROW-2955 - [Python] Typo in pyarrow’s HDFS API result
- ARROW-2963 - [Python] Deadlock during fork-join and use_threads=True
- ARROW-2978 - [Rust] Travis CI build is failing
- ARROW-2982 - The “--show-progress” option is only supported in wget 1.16 and higher
- ARROW-640 - [Python] Arrow scalar values should have a sensible __hash__ and comparison