Commit ea8940a

romainfrancois authored and wesm committed
ARROW-3282: [R] initial R functionality
* Wrapping C++ pointers to arrow objects as R6 classes holding an R external pointer.
* Factory functions for the metadata types, int32(), ...
* Factory to create schemas and struct
* Create Array, RecordBatch, Table from R vectors and data frames. initially only support integer (int32), numeric (float64) and raw (int8) vectors.
* Reading and Writing record batches and Table to files.

Author: Romain Francois <[email protected]>

Closes #2596 from romainfrancois/r-dev-buffer and squashes the following commits:

9ab1882 <Romain Francois> mark Roxygen and Rcpp generated files
661f370 <Romain Francois> Using FirstTimeBitmapWriter instead of BitmapWriter.
e81b72b <Romain Francois> only set null_bitmap if null_count > 0
bfe853d <Romain Francois> using 0-based indices in the tests.
b391556 <Romain Francois> Also use arrow::internal::BitmapWriter
9e60555 <Romain Francois> name fixes. Using __ consistently
bf814bb <Romain Francois> Using arrow::internal::BitmapReader
c8aa703 <Romain Francois> Also use std::shared_ptr for MemoryPool.
2aa8a5f <Romain Francois> need dev version of `vctrs`
394bd33 <Romain Francois> 🐀 + RecordBatch$Slice
de93a4f <Romain Francois> RecordBatch tests
9d208a4 <Romain Francois> +Array$RangeEquals
f860063 <Romain Francois> Move each class to their own file
a89a9a8 <Romain Francois> Move RecordBatch impl to own file
a2f9f51 <Romain Francois> correctly handling offset()
8263c0d <Romain Francois> + tests for ChunkedArray
e02e24f <Romain Francois> +chunked_array and tests
b20e4b0 <Romain Francois> More tests
d11cda0 <Romain Francois> +R6 class ChunkedArray
29af2ea <Romain Francois> license headers
2f53ebf <Romain Francois> Additional tests for read_arrow / write_arrow
4237c32 <Romain Francois> Clear the bit for non NA.
ede8e44 <Romain Francois> Handle null buffer in R <-> Array conversions
a5b8190 <Romain Francois> update README with example of reading/writing arrow::Table
d951db8 <Romain Francois> "documentation" to quiet check()
908c2ac <Romain Francois> read_arrow and write_arrow now relate to arrow::Table.
110b00d <Romain Francois> resolving conflicts
ae55f8b <Romain Francois> ..
767e9d9 <Romain Francois> more generic print method
8d8cdd1 <Romain Francois> + read_arrow / write_arrow for now
c1385a0 <Romain Francois> export Array_as_vector, +Array$ToString
23fbd01 <Romain Francois> + column names
97659ff <Romain Francois> + as_tibble.arrow::RecordBatch
fa4ee22 <Romain Francois> + read_record_batch
f27eeba <Romain Francois> - MakeArray
4977bb2 <Romain Francois> no need to make ArrayData directly
ef7cda1 <Romain Francois> class constructors only take the external pointers, logic moved to factory functions
81e059a <Romain Francois> rebasing
421e471 <Romain Francois> +macro R_ERROR_NOT_OK similar to RETURN_NOT_OK but that Rcpp::stop()s
f5e3eff <Romain Francois> attempt RecordBatch$to_file
79205fb <Romain Francois> initial stab at arrow::table(data.frame)
f6f1775 <Romain Francois> s/data/.data/
b9c215b <Romain Francois> "document" array and record_batch
edf6098 <Romain Francois> Need to install `vctrs` from github for now
6aecdce <Romain Francois> skip using rpath linker option
b8dac54 <Romain Francois> +RecordBatch$schema
1fc3cc2 <Romain Francois> no longer need this
05da931 <Romain Francois> initial stab at record_batch
f4d0a34 <Romain Francois> must include arrow_types.h first
aee2d0a <Romain Francois> initial stab at arrow::array
a6ae2f3 <Romain Francois> cleanup
e14b546 <Romain Francois> follow up from @wesm comments on #2489
36e9801 <Romain Francois> + installation instructions
108caf9 <Romain Francois> not checking for headers on these files
b829bdf <Romain Francois> initial R 📦 with travis setup and testthat suite, that links to arrow c++ library and calls arrow::int32()
26e712d <Romain Francois> Initial work for type metadata, with tests.
e251299 <Romain Francois> + installation instructions
a9a8bbb <Romain Francois> not checking for headers on these files
e0a7eff <Romain Francois> initial R 📦 with travis setup and testthat suite, that links to arrow c++ library and calls arrow::int32()
b1c1109 <Romain Francois> finished rebasing after initial R patch merged
887df48 <Romain Francois> skip using rpath linker option
a6de975 <Romain Francois> cleanup
8526e51 <Romain Francois> follow up from @wesm comments on #2489
f03a277 <Romain Francois> + installation instructions
0995ca4 <Romain Francois> not checking for headers on these files
1cb547e <Romain Francois> initial R 📦 with travis setup and testthat suite, that links to arrow c++ library and calls arrow::int32()
705c125 <Romain Francois> exclude Rd files 🐀
605e302 <Romain Francois> time32 only handles second and millisecond time64 only handles microsecond and nanosecond
afdbae6 <Romain Francois> + licence header for R6.R file
65563f5 <Romain Francois> minimal documentation for check()
b7135c7 <Romain Francois> stop exporting everything
6aaf192 <Romain Francois> ignoring the .clang-format file
d854f2f <Romain Francois> + license headers for R files 🙊
d992b26 <Romain Francois> Initial work for type metadata, with tests.
614dd07 <Romain Francois> + installation instructions
afce06a <Romain Francois> initial R 📦 with travis setup and testthat suite, that links to arrow c++ library and calls arrow::int32()
1 parent 5167502 commit ea8940a

51 files changed: +4714 -32 lines changed

.gitattributes

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+r/R/RcppExports.R linguist-generated=true
+r/src/RcppExports.cpp linguist-generated=true
+r/man/*.Rd linguist-generated=true
+

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -39,3 +39,5 @@ python/.eggs/
 .pytest_cache/
 pkgs
 .Rproj.user
+arrow.Rcheck/
+

dev/release/rat_exclude_files.txt

Lines changed: 2 additions & 0 deletions
@@ -127,3 +127,5 @@ r/.Rbuildignore
 r/arrow.Rproj
 r/README.md
 r/README.Rmd
+r/man/*.Rd
+.gitattributes

r/.Rbuildignore

Lines changed: 3 additions & 0 deletions
@@ -1,3 +1,6 @@
 ^.*\.Rproj$
 ^\.Rproj\.user$
 ^README\.Rmd$
+src/.clang-format
+LICENSE.md
+^data-raw$

r/DESCRIPTION

Lines changed: 35 additions & 7 deletions
@@ -2,7 +2,7 @@ Package: arrow
 Title: R Integration to 'Apache' 'Arrow'
 Version: 0.0.0.9000
 Authors@R: c(
-    person("Romain", "François", email = "[email protected]", role = c("aut", "cre")),
+    person("Romain", "François", email = "[email protected]", role = c("aut", "cre")),
     person("Apache Arrow", email = "[email protected]", role = c("aut", "cph"))
     )
 Description: R Integration to 'Apache' 'Arrow'.
@@ -11,11 +11,39 @@ License: Apache License (>= 2.0)
 Encoding: UTF-8
 LazyData: true
 SystemRequirements: C++11
-LinkingTo:
-    Rcpp
-Imports:
-    Rcpp
+LinkingTo:
+    Rcpp (>= 0.12.18)
+Imports:
+    Rcpp (>= 0.12.18),
+    rlang,
+    purrr,
+    assertthat,
+    glue,
+    R6,
+    vctrs,
+    fs,
+    tibble,
+    crayon
+Remotes:
+    r-lib/vctrs
 Roxygen: list(markdown = TRUE)
-RoxygenNote: 6.0.1.9000
-Suggests:
+RoxygenNote: 6.1.0.9000
+Suggests:
     testthat
+Collate:
+    'enums.R'
+    'R6.R'
+    'ArrayData.R'
+    'ChunkedArray.R'
+    'Column.R'
+    'Field.R'
+    'List.R'
+    'RcppExports.R'
+    'RecordBatch.R'
+    'Schema.R'
+    'Struct.R'
+    'Table.R'
+    'array.R'
+    'memory_pool.R'
+    'reexports-tibble.R'
+    'zzz.R'

r/NAMESPACE

Lines changed: 57 additions & 0 deletions
@@ -1,4 +1,61 @@
 # Generated by roxygen2: do not edit by hand
 
+S3method("!=","arrow::Object")
+S3method("$","arrow-enum")
+S3method("==","arrow::Array")
+S3method("==","arrow::DataType")
+S3method("==","arrow::Field")
+S3method("==","arrow::RecordBatch")
+S3method(as_tibble,"arrow::RecordBatch")
+S3method(as_tibble,"arrow::Table")
+S3method(length,"arrow::Array")
+S3method(names,"arrow::RecordBatch")
+S3method(print,"arrow-enum")
+export(DateUnit)
+export(StatusCode)
+export(TimeUnit)
+export(Type)
+export(array)
+export(as_tibble)
+export(boolean)
+export(chunked_array)
+export(date32)
+export(date64)
+export(decimal)
+export(float16)
+export(float32)
+export(float64)
+export(int16)
+export(int32)
+export(int64)
+export(int8)
+export(list_of)
+export(null)
+export(read_arrow)
+export(record_batch)
+export(schema)
+export(struct)
+export(table)
+export(time32)
+export(time64)
+export(timestamp)
+export(uint16)
+export(uint32)
+export(uint64)
+export(uint8)
+export(utf8)
+export(write_arrow)
+importFrom(R6,R6Class)
 importFrom(Rcpp,sourceCpp)
+importFrom(assertthat,assert_that)
+importFrom(glue,glue)
+importFrom(purrr,map)
+importFrom(purrr,map2)
+importFrom(purrr,map_chr)
+importFrom(purrr,map_int)
+importFrom(rlang,dots_n)
+importFrom(rlang,quo_name)
+importFrom(rlang,seq2)
+importFrom(rlang,set_names)
+importFrom(tibble,as_tibble)
 useDynLib(arrow, .registration = TRUE)

r/R/ArrayData.R

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+#' @include R6.R
+
+`arrow::ArrayData` <- R6Class("arrow::ArrayData",
+  inherit = `arrow::Object`,
+  active = list(
+    type = function() `arrow::DataType`$dispatch(ArrayData__get_type(self)),
+    length = function() ArrayData__get_length(self),
+    null_count = function() ArrayData__get_null_count(self),
+    offset = function() ArrayData__get_offset(self)
+  )
+)
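The pattern in this file (an R6 class that inherits from a handle-holding base class and exposes read-only properties as active bindings) can be sketched in isolation. This is only an illustration, not the package's code: `DemoObject`, `DemoArrayData`, and `demo_array_data()` are hypothetical names, and an ordinary environment stands in for the external pointer to a C++ object. It assumes the R6 package is installed.

```r
library(R6)

# Hypothetical base class: the constructor only stores the handle it is given.
DemoObject <- R6Class("demo::Object",
  public = list(
    ptr = NULL,
    initialize = function(ptr) {
      self$ptr <- ptr
    }
  )
)

# Hypothetical subclass: active bindings recompute from the handle on each access.
DemoArrayData <- R6Class("demo::ArrayData",
  inherit = DemoObject,
  active = list(
    length = function() length(self$ptr$values),
    null_count = function() sum(is.na(self$ptr$values))
  )
)

# Factory function: builds the handle, then passes it to the constructor.
# An environment is used here as a stand-in for an external pointer.
demo_array_data <- function(x) {
  handle <- new.env()
  handle$values <- x
  DemoArrayData$new(handle)
}

ad <- demo_array_data(c(1L, NA, 3L))
ad$length      # accessed like a field, without parentheses
ad$null_count
```

Keeping the constructor trivial and putting the construction logic in a factory function mirrors the commit note that class constructors only take the external pointers.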

r/R/ChunkedArray.R

Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+#' @include R6.R
+
+`arrow::ChunkedArray` <- R6Class("arrow::ChunkedArray", inherit = `arrow::Object`,
+  public = list(
+    length = function() ChunkedArray__length(self),
+    null_count = function() ChunkedArray__null_count(self),
+    num_chunks = function() ChunkedArray__num_chunks(self),
+    chunk = function(i) `arrow::Array`$new(ChunkedArray__chunk(self, i)),
+    chunks = function() purrr::map(ChunkedArray__chunks(self), `arrow::Array`$new),
+    type = function() `arrow::DataType`$dispatch(ChunkedArray__type(self)),
+    as_vector = function() ChunkedArray__as_vector(self),
+    Slice = function(offset, length = NULL){
+      if (is.null(length)) {
+        `arrow::ChunkedArray`$new(ChunkArray__Slice1(self, offset))
+      } else {
+        `arrow::ChunkedArray`$new(ChunkArray__Slice2(self, offset, length))
+      }
+    }
+  )
+)
+
+#' create an arrow::Array from an R vector
+#'
+#' @param \dots Vectors to coerce
+#'
+#' @export
+chunked_array <- function(...){
+  `arrow::ChunkedArray`$new(ChunkedArray__from_list(rlang::list2(...)))
+}
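To make the `Slice` method's optional `length` argument concrete, here is a base-R sketch that treats a chunked array as a plain list of vectors. The `demo_*` names are hypothetical, the offset is assumed to be 0-based (the commit log mentions 0-based indices in the tests), and the sketch flattens the chunks for simplicity, which the real binding presumably avoids by delegating to the C++ library.

```r
# Hypothetical stand-in: a "chunked array" is just a list of vectors.
demo_chunked <- function(...) list(...)

# Total length is the sum of the chunk lengths.
demo_length <- function(ca) sum(lengths(ca))

# Slice with a 0-based offset; when `len` is NULL, take everything to the end.
demo_slice <- function(ca, offset, len = NULL) {
  flat <- unlist(ca)
  if (is.null(len)) {
    len <- length(flat) - offset
  }
  flat[offset + seq_len(len)]
}

ca <- demo_chunked(1:3, 4:6)
demo_length(ca)        # 6
demo_slice(ca, 2)      # 3 4 5 6
demo_slice(ca, 2, 2)   # 3 4
```

Defaulting the second argument to `NULL` lets one R method cover both the one-argument and two-argument forms, which is the same shape as the `Slice1` / `Slice2` branch in the diff.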

r/R/Column.R

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+#' @include R6.R
+
+`arrow::Column` <- R6Class("arrow::Column", inherit = `arrow::Object`,
+  public = list(
+    length = function() Column__length(self),
+    null_count = function() Column__null_count(self),
+    type = function() `arrow::DataType`$dispatch(Column__type(self)),
+    data = function() `arrow::ChunkedArray`$new(Column__data(self))
+  )
+)

r/R/Field.R

Lines changed: 50 additions & 0 deletions
@@ -0,0 +1,50 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied. See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+#' @include R6.R
+
+`arrow::Field` <- R6Class("arrow::Field",
+  inherit = `arrow::Object`,
+  public = list(
+    ToString = function() {
+      Field__ToString(self)
+    },
+    name = function() {
+      Field__name(self)
+    },
+    nullable = function() {
+      Field__nullable(self)
+    },
+    Equals = function(other) {
+      inherits(other, "arrow::Field") && Field__Equals(self, other)
+    }
+  )
+)
+
+#' @export
+`==.arrow::Field` <- function(lhs, rhs){
+  lhs$Equals(rhs)
+}
+
+field <- function(name, type) {
+  `arrow::Field`$new(Field__initialize(name, type))
+}
+
+.fields <- function(.list){
+  assert_that( !is.null(nms <- names(.list)) )
+  map2(nms, .list, field)
+}
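Two idioms in this file may be easier to follow in a stand-alone form: an S3 `==` method on a class whose name contains `::` (hence the backticks), and a helper that turns a named list into one object per name/value pair. The sketch below uses base R only; `demo_field()` and `demo_fields()` are hypothetical stand-ins, a plain classed list replaces the R6 object, and `Map()` plus `stopifnot()` replace `purrr::map2()` and `assertthat::assert_that()`.

```r
# Hypothetical stand-in for a field: a classed list instead of an R6 object.
demo_field <- function(name, type) {
  structure(list(name = name, type = type), class = "demo::Field")
}

# The method name is not a syntactic identifier, so it must be quoted with backticks.
`==.demo::Field` <- function(lhs, rhs) {
  inherits(rhs, "demo::Field") &&
    identical(lhs$name, rhs$name) &&
    identical(lhs$type, rhs$type)
}

# Build one field per name/value pair; the input list must be named.
demo_fields <- function(.list) {
  nms <- names(.list)
  stopifnot(!is.null(nms))
  unname(Map(demo_field, nms, .list))
}

fs <- demo_fields(list(x = "int32", y = "float64"))
length(fs)                             # 2
fs[[1]] == demo_field("x", "int32")    # TRUE
fs[[1]] == fs[[2]]                     # FALSE
```

Checking the class of the right-hand side first, as `Equals` does in the diff, means a comparison against an unrelated object returns `FALSE` rather than raising an error.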
