ARROW-5893: [C++][Python][GLib][Ruby][MATLAB][R] Remove arrow::Column class

This completely removes `arrow::Column` from the C++ library and adapts the Python bindings to follow, to help with the mailing list discussion.

There are other places that this would touch:

- [x] GLib
- [x] Ruby
- [x] MATLAB
- [x] R

You can see the evident reduction of boilerplate and simplification in many places -- it is particularly pronounced in parquet/arrow/arrow-reader-writer-test.cc. This refactor also exposed a bug where a Column's data type did not match its Field type.
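
As a rough illustration of the kind of mismatch mentioned above, the sketch below models a table as a schema plus one chunked array per field, with no per-column wrapper in between, and checks that each column's data type agrees with its field. `Field`, `ChunkedArray`, `Table`, and `TypesMatchSchema` are simplified, hypothetical stand-ins written for this note; they are not Arrow's real classes or the check used in this commit.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Simplified, hypothetical stand-ins for illustration; not Arrow's real classes.
struct Field {
  std::string name;
  std::string type;
};

struct ChunkedArray {
  std::string type;                 // logical type shared by all chunks
  std::vector<long> chunk_lengths;  // row count of each chunk
};

// A table modeled as a schema plus one chunked array per field.
class Table {
 public:
  Table(std::vector<Field> schema,
        std::vector<std::shared_ptr<ChunkedArray>> columns)
      : schema_(std::move(schema)), columns_(std::move(columns)) {}

  // Field metadata and column data are looked up separately by index.
  const Field& field(std::size_t i) const {
    assert(i < schema_.size());
    return schema_[i];
  }
  const std::shared_ptr<ChunkedArray>& column(std::size_t i) const {
    assert(i < columns_.size());
    return columns_[i];
  }

  // True only if there is one column per field and every column's data
  // type equals the type declared for that field in the schema.
  bool TypesMatchSchema() const {
    if (schema_.size() != columns_.size()) return false;
    for (std::size_t i = 0; i < columns_.size(); ++i) {
      if (columns_[i]->type != schema_[i].type) return false;
    }
    return true;
  }

 private:
  std::vector<Field> schema_;
  std::vector<std::shared_ptr<ChunkedArray>> columns_;
};
```

With the field type and the data type compared in a single place, a column whose data disagrees with its declared field is reported instead of passing through silently.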

Closes #4841 from wesm/remove-column-class and squashes the following commits:

b389d42 <Wes McKinney> Fix up Python/C++ docs to remove references to arrow::Column
cc34548 <Wes McKinney> Fix non-deterministic Table.from_pydict behavior. Add unicode/utf8 test for pyarrow.table. Fix usages of ChunkedArray.data and add unit test
4be8d5a <Wes McKinney> UTF-8 py2/3 compatibility issues
9b5a61e <Wes McKinney> Try to fix unicode issue
a5ee7b6 <Wes McKinney> Fix MATLAB code (I hope)
2c7c51e <Wes McKinney> code review comments
020f52a <Wes McKinney> Re-render R README
fd4473e <Wes McKinney> Fix up R C++ code and unit tests
244d2a7 <Wes McKinney> Begin refactoring R library. Change Feather to return ChunkedArray from GetColumn
d30a365 <Sutou Kouhei>  Remove unused variable
d89366a <Sutou Kouhei>  Follow API change
983d378 <Sutou Kouhei>  Follow API change
3099328 <Sutou Kouhei>  Follow API change
f092f1b <Sutou Kouhei>  Add missing available annotations
41dd1fe <Sutou Kouhei>  Revert needless style change
f120a7d <Sutou Kouhei>  Suppress a warning with MSVC
62f3649 <Sutou Kouhei>  Remove unused lambda captures
0090051 <Sutou Kouhei>  Add API index for 1.0.0
bd6002a <Sutou Kouhei>  Remove entries
335d7b6 <Sutou Kouhei>  Implement Arrow::Column in Ruby
fb5ad1c <Sutou Kouhei>  Add garrow_chunked_array_get_n_rows() for consistency
60dfb0c <Sutou Kouhei>  Add garrow_schema_get_field_index()
c4ff972 <Sutou Kouhei>  Remove backward compatible column API from GArrowRecordBatch
179aa67 <Sutou Kouhei>  Use "column_data" instead of "column"
795b6c1 <Sutou Kouhei>  Follow arrow::Column remove
ac6f372 <Kevin Gurney> Remove use of arrow::Column from feather_reader.cc. Remove use of deprecated StatusCode enum class values in handle_status.cc
fcaaf8c <Wes McKinney> Add pyarrow.ChunkedArray.flatten method. Remove Column from glib, but haven't fixed unit tests yet
f0d48cc <Wes McKinney> Adapt Python bindings
c161a9a <Wes McKinney> Fix up Parquet, too
93e3cad <Wes McKinney> arrow-tests all passing again
a7344a8 <Wes McKinney> Stage 1 of cutting away Column

Lead-authored-by: Wes McKinney <[email protected]>
Co-authored-by: Sutou Kouhei <[email protected]>
Co-authored-by: Kevin Gurney <[email protected]>
Signed-off-by: Wes McKinney <[email protected]>
3 people committed Jul 17, 2019
1 parent cbaa066 commit c350bba
Showing 128 changed files with 1,630 additions and 3,061 deletions.
9 changes: 5 additions & 4 deletions c_glib/arrow-glib/Makefile.am
@@ -55,7 +55,6 @@ libarrow_glib_la_headers = \
buffer.h \
chunked-array.h \
codec.h \
column.h \
composite-array.h \
composite-data-type.h \
data-type.h \
@@ -107,7 +106,6 @@ libarrow_glib_la_sources = \
buffer.cpp \
chunked-array.cpp \
codec.cpp \
column.cpp \
composite-array.cpp \
composite-data-type.cpp \
decimal128.cpp \
@@ -153,7 +151,6 @@ libarrow_glib_la_cpp_headers = \
buffer.hpp \
chunked-array.hpp \
codec.hpp \
column.hpp \
data-type.hpp \
decimal128.hpp \
error.hpp \
@@ -187,9 +184,13 @@ libarrow_glib_la_cpp_headers += \
orc-file-reader.hpp
endif

libarrow_glib_la_cpp_internal_headers = \
internal-index.hpp

libarrow_glib_la_SOURCES = \
$(libarrow_glib_la_sources) \
$(libarrow_glib_la_cpp_headers)
$(libarrow_glib_la_cpp_headers) \
$(libarrow_glib_la_cpp_internal_headers)

BUILT_SOURCES = \
$(libarrow_glib_la_genearted_headers) \
1 change: 0 additions & 1 deletion c_glib/arrow-glib/arrow-glib.h
@@ -26,7 +26,6 @@
#include <arrow-glib/array-builder.h>
#include <arrow-glib/chunked-array.h>
#include <arrow-glib/codec.h>
#include <arrow-glib/column.h>
#include <arrow-glib/compute.h>
#include <arrow-glib/data-type.h>
#include <arrow-glib/enums.h>
1 change: 0 additions & 1 deletion c_glib/arrow-glib/arrow-glib.hpp
@@ -26,7 +26,6 @@
#include <arrow-glib/buffer.hpp>
#include <arrow-glib/chunked-array.hpp>
#include <arrow-glib/codec.hpp>
#include <arrow-glib/column.hpp>
#include <arrow-glib/data-type.hpp>
#include <arrow-glib/error.hpp>
#include <arrow-glib/field.hpp>
16 changes: 16 additions & 0 deletions c_glib/arrow-glib/chunked-array.cpp
@@ -206,9 +206,25 @@ garrow_chunked_array_get_value_type(GArrowChunkedArray *chunked_array)
* @chunked_array: A #GArrowChunkedArray.
*
* Returns: The total number of rows in the chunked array.
*
* Deprecated: 1.0.0: Use garrow_chunked_array_get_n_rows() instead.
*/
guint64
garrow_chunked_array_get_length(GArrowChunkedArray *chunked_array)
{
return garrow_chunked_array_get_n_rows(chunked_array);
}

/**
* garrow_chunked_array_get_n_rows:
* @chunked_array: A #GArrowChunkedArray.
*
* Returns: The total number of rows in the chunked array.
*
* Since: 1.0.0
*/
guint64
garrow_chunked_array_get_n_rows(GArrowChunkedArray *chunked_array)
{
const auto arrow_chunked_array = garrow_chunked_array_get_raw(chunked_array);
return arrow_chunked_array->length();
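
The hunk above keeps the old getter as a thin wrapper around the new one, so existing callers keep working while being steered to the new name. A minimal, self-contained sketch of that pattern follows. `RawChunkedArray`, `DemoChunkedArray`, and the `demo_*` functions are hypothetical names invented for this note, and the standard `[[deprecated]]` attribute is used here as an assumption in place of the project's own deprecation macros.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical stand-in for the wrapped C++ object; it only tracks chunk lengths.
class RawChunkedArray {
 public:
  explicit RawChunkedArray(std::vector<std::int64_t> chunk_lengths)
      : chunk_lengths_(std::move(chunk_lengths)) {}

  // Total number of rows across all chunks.
  std::int64_t length() const {
    return std::accumulate(chunk_lengths_.begin(), chunk_lengths_.end(),
                           std::int64_t{0});
  }

 private:
  std::vector<std::int64_t> chunk_lengths_;
};

// Hypothetical handle type playing the role of the public wrapper object.
struct DemoChunkedArray {
  std::shared_ptr<RawChunkedArray> raw;
};

// New, preferred entry point: the single place that reads the length.
std::uint64_t demo_chunked_array_get_n_rows(const DemoChunkedArray* array) {
  assert(array != nullptr && array->raw != nullptr);
  return static_cast<std::uint64_t>(array->raw->length());
}

// Old entry point, kept for source compatibility. It only forwards to the
// new function, so both names always return the same value.
[[deprecated("Use demo_chunked_array_get_n_rows() instead")]]
std::uint64_t demo_chunked_array_get_length(const DemoChunkedArray* array) {
  return demo_chunked_array_get_n_rows(array);
}
```

Calling `demo_chunked_array_get_length()` still compiles but typically triggers a compiler deprecation warning that names the replacement. Because the wrapper only forwards, the two functions cannot drift apart.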
3 changes: 3 additions & 0 deletions c_glib/arrow-glib/chunked-array.h
@@ -44,7 +44,10 @@ garrow_chunked_array_get_value_data_type(GArrowChunkedArray *chunked_array);
GArrowType
garrow_chunked_array_get_value_type(GArrowChunkedArray *chunked_array);

GARROW_DEPRECATED_IN_1_0_FOR(garrow_chunked_array_get_n_rows)
guint64 garrow_chunked_array_get_length (GArrowChunkedArray *chunked_array);
GARROW_AVAILABLE_IN_1_0
guint64 garrow_chunked_array_get_n_rows (GArrowChunkedArray *chunked_array);
guint64 garrow_chunked_array_get_n_nulls(GArrowChunkedArray *chunked_array);
guint garrow_chunked_array_get_n_chunks (GArrowChunkedArray *chunked_array);
