Insights: pola-rs/polars
1 Release published by 1 person
-
py-1.14.0 Python Polars 1.14.0
published
Nov 17, 2024
50 Pull requests merged by 14 people
-
fix: Fix scalar object
#19940 merged
Nov 23, 2024 -
perf: Memoize duplicates in rolling-gb-dyn
#19939 merged
Nov 23, 2024 -
perf: More efficient row encoding for `pl.List`
#19907 merged
Nov 23, 2024 -
fix: Raise InvalidOperationError for invalid float to decimal casts (e.g. Inf, NaN)
#19938 merged
Nov 23, 2024 -
docs(rust): Minor doc fixes and cleanup
#19935 merged
Nov 23, 2024 -
perf: Half the size of Booleans in row encoding
#19927 merged
Nov 23, 2024 -
feat(python): Allow Python Enums as dtype inputs
#19926 merged
Nov 22, 2024 -
refactor(python): Minor non-breaking space (`&nbsp;`) tweak for HTML rendering
#19864 merged
Nov 22, 2024 -
feat: Speed up starts_with for small prefixes
#19904 merged
Nov 22, 2024 -
refactor: Implement nested row encoding / decoding
#19874 merged
Nov 22, 2024 -
perf: Rolling 'iter_lookbehind' breeze through duplicates
#19922 merged
Nov 22, 2024 -
fix(python): Address indexing edge-case with `numpy` arrays
#19895 merged
Nov 22, 2024 -
refactor(python): Mark `concat_arr` as unstable
#19909 merged
Nov 21, 2024 -
doc(python): Explicit note section for coalescing note
#19891 merged
Nov 21, 2024 -
fix: Fix panic with combination of hive and parquet prefiltering
#19905 merged
Nov 21, 2024 -
fix: Fix panic when joining with empty frame (debug only)
#19896 merged
Nov 21, 2024 -
feat: Auto-enable hive partitioning if hive_schema was given
#19902 merged
Nov 21, 2024 -
refactor(rust): Remove use of cast in `ArrowArray::new`
#19899 merged
Nov 21, 2024 -
fix: Fix incorrect result from inequality filter after join on LazyFrame
#19898 merged
Nov 21, 2024 -
fix: Misleading `ShapeError` error message on dataframe creation
#19901 merged
Nov 21, 2024 -
fix: Fix panic with empty delta scan, or empty parquet scan with a provided schema
#19884 merged
Nov 21, 2024 -
fix: Ensure type object of inputs for cached any-value conversion functions are kept alive
#19866 merged
Nov 20, 2024 -
feat: Add `pl.concat_arr` to concatenate columns into an Array column
#19881 merged
Nov 20, 2024 -
feat: Support both "iso" and "iso:strict" format options for `dt.to_string`
#19840 merged
Nov 20, 2024 -
docs: Complete parameters description and add an example for `clip()`
#19875 merged
Nov 20, 2024 -
feat: Add rounding for Decimal type
#19760 merged
Nov 19, 2024 -
fix: Fix panic using `scan_parquet().with_row_index()` with hive partitioning enabled
#19865 merged
Nov 19, 2024 -
chore: Switch back to PyO3 0.22
#19851 merged
Nov 19, 2024 -
fix: Improve histogram bin logic
#18761 merged
Nov 19, 2024 -
fix: Raise informative error instead of panicking for list arithmetic on some invalid dtypes
#19841 merged
Nov 19, 2024 -
test(python): Adjust flaky `with_columns` test
#19844 merged
Nov 19, 2024 -
fix: Properly handle Zero-Field Structs in row encoding
#19846 merged
Nov 19, 2024 -
fix: Incorrect explode schema for `LazyFrame.explode()`
#19860 merged
Nov 19, 2024 -
docs: Fix some warnings during docs build
#19848 merged
Nov 18, 2024 -
refactor(rust): Make chunked gathers generic over chunk bit width
#19856 merged
Nov 18, 2024 -
perf: Initially trim leading and trailing filtered rows
#19850 merged
Nov 18, 2024 -
fix(python): DataFrame `rows_by_key` returning key tuples with elements in wrong order
#19486 merged
Nov 18, 2024 -
chore: Add proper tests for row encoding
#19843 merged
Nov 18, 2024 -
feat: Improved array arithmetic support
#19837 merged
Nov 18, 2024 -
fix: Ensure `List` element truncation ellipses respect `ASCII*` table formats
#19835 merged
Nov 18, 2024 -
Python Polars 1.14.0
#19834 merged
Nov 17, 2024 -
refactor: Add ToField context for common args
#19833 merged
Nov 17, 2024 -
fix(python): Fix `read_database(…,iter_batches=True)` type annotations
#19832 merged
Nov 17, 2024 -
feat: Raise informative error on Unknown unnest
#19830 merged
Nov 17, 2024 -
fix: Validate subnodes in validate IR
#19831 merged
Nov 17, 2024 -
perf: Increase default async thread count for low core count systems
#19829 merged
Nov 17, 2024 -
fix: Raise if merge non-global categoricals in unpivot
#19826 merged
Nov 17, 2024 -
perf: Move row group decode off async thread for local streaming parquet scan
#19828 merged
Nov 17, 2024 -
fix: Type hints for window_size incorrectly included timedelta in some rolling functions
#19827 merged
Nov 17, 2024
12 Pull requests opened by 11 people
-
docs: Add example and tests for `pl.concat()` with `Expr` input
#19836 opened
Nov 17, 2024 -
fix(python): Parse uppercase config keys
#19852 opened
Nov 18, 2024 -
refactor(rust): Add equi joins to new streaming engine
#19869 opened
Nov 19, 2024 -
perf: Add fast paths for series.arg_sort and dataframe.sort + bug fix in existing fast path
#19872 opened
Nov 19, 2024 -
fix(python): Fix `row_by_key` typing
#19888 opened
Nov 20, 2024 -
feat: Start of Series.index_of(), for primitive numeric types
#19894 opened
Nov 20, 2024 -
perf: Reduce the size of row encoding UTF-8
#19911 opened
Nov 21, 2024 -
perf: Add a VarInt encoding for the row encoding
#19929 opened
Nov 22, 2024 -
perf: fast path to generate group idxs for vanilla int_range in group_by_dynamic
#19932 opened
Nov 22, 2024 -
refactor(python): Split file of lazyframe methods
#19937 opened
Nov 23, 2024 -
feat(python): Add support for python builtin `round`
#19941 opened
Nov 23, 2024 -
docs(python): Remove duplicate sentence in `Series.bottom_k` docstring
#19947 opened
Nov 23, 2024
44 Issues closed by 10 people
-
sort a 1 row DataFrame with object panics
#19925 closed
Nov 23, 2024 -
Casting `x/0` (`Inf`) to Decimal causes panic (instead of error)
#19934 closed
Nov 23, 2024 -
POLARS_SKIP_CPU_CHECK doesn't prevent illegal hardware instruction even with bare import anymore
#19936 closed
Nov 23, 2024 -
Cannot filter on decimal fields in parquet files using is_in()
#16150 closed
Nov 23, 2024 -
Point numpy crate to official release
#19695 closed
Nov 22, 2024 -
Support StrEnums in Enum Construction
#19724 closed
Nov 22, 2024 -
Strings showing non-breaking space in PyCharm Jupyter Notebooks after 1.14 upgrade
#19859 closed
Nov 22, 2024 -
Support arbitrary nested lists in row-encoding
#10747 closed
Nov 22, 2024 -
rolling is super slow on large dataset if not grouped first
#19912 closed
Nov 22, 2024 -
Multiplication not allowed for bool dtype
#19919 closed
Nov 22, 2024 -
_select_column, getitem misinterprets 0d numpy array as multi-dimensional array
#19882 closed
Nov 22, 2024 -
polars 1.13.0 filtering by date column doesn't work in lazyframe obtained via scan_parquet
#19766 closed
Nov 21, 2024 -
Join with empty dataframe causes a panic
#19863 closed
Nov 21, 2024 -
Lazy/Eager right-join is giving different results
#19772 closed
Nov 21, 2024 -
Misleading ShapeError error message
#19795 closed
Nov 21, 2024 -
max() on empty LazyFrame returned by scan_delta() causes PanicException
#19890 closed
Nov 21, 2024 -
[1.14.0 Regression] scan_delta on empty partitioned tables fails
#19854 closed
Nov 21, 2024 -
Reading empty delta table panics
#19876 closed
Nov 21, 2024 -
Push down `is_between` date filter to pyarrow
#7553 closed
Nov 21, 2024 -
memory does not get freed
#9225 closed
Nov 20, 2024 -
Concatenation of array columns, similar to concat_list
#18090 closed
Nov 20, 2024 -
enh: exponential weighted covariance between two time series
#19883 closed
Nov 20, 2024 -
to_torch doesn't support list/array types
#19092 closed
Nov 20, 2024 -
Parameters in `clip()` parse strings as column names, which is undocumented
#18345 closed
Nov 20, 2024 -
Spurious CI failure `test_lit_datetime_subclass_w_allow_object`
#18630 closed
Nov 19, 2024 -
With Row index not working on scan_parquet with hive partitioned data
#19861 closed
Nov 19, 2024 -
Incorrect `DataFrame` schema produced in some situations when given empty `List` dtype data
#11044 closed
Nov 19, 2024 -
PyCharm docstring Example formatting issue (and proposed fix)
#5513 closed
Nov 19, 2024 -
`PanicException` reading back nested Object (in a list)
#16926 closed
Nov 19, 2024 -
`hist` panics after creating zero bins
#18650 closed
Nov 19, 2024 -
Panic adding `List(Struct)` to `Int64`
#19839 closed
Nov 19, 2024 -
`join` fails because of an uninstructed cast from int to array[int, x] on 1.14.0
#19763 closed
Nov 19, 2024 -
Pandas interoperability: Inconsistencies with lists
#19809 closed
Nov 18, 2024 -
Array broadcasting support
#19356 closed
Nov 18, 2024 -
Incorrect outer validity for array arithmetic result
#19838 closed
Nov 18, 2024 -
List columns uses non-ascii characters in display, even when pl.Config.set_ascii_tables(True)
#19821 closed
Nov 18, 2024 -
Incorrect typing of `polars.read_database` when `iter_batches=True`
#19800 closed
Nov 17, 2024 -
polars.read_delta from a GCP bucket using service account
#17253 closed
Nov 17, 2024 -
Schema issue when writing new delta tables - parquet schema not valid delta lake schema
#9795 closed
Nov 17, 2024 -
`list.to_struct().struct.unnest()` fails
#19812 closed
Nov 17, 2024 -
`.struct.field` + `.filter` PanicException instead of ColumnNotFoundError
#18787 closed
Nov 17, 2024 -
`unpivot` Categorical PanicException
#19770 closed
Nov 17, 2024 -
Rolling Expr claims to support timedelta but doesn't appear to
#19825 closed
Nov 17, 2024
45 Issues opened by 40 people
-
Remove duplicate sentence in `Series.bottom_k` docstring
#19946 opened
Nov 23, 2024 -
nulls_last parameter not honored while sorting an already sorted array
#19945 opened
Nov 23, 2024 -
predicate/projection pushdown causes ShapeError
#19944 opened
Nov 23, 2024 -
Creating frames with `List(Struct(...Categorical))` fails depending on order
#19943 opened
Nov 23, 2024 -
Support python builtin `round`
#19942 opened
Nov 23, 2024 -
Issue with lazy frames reading from s3 in long running process
#19933 opened
Nov 22, 2024 -
`datetime_range` PanicException with nanosecond interval
#19931 opened
Nov 22, 2024 -
Saving parquet to AWS S3 with df.write_parquet() fails with FileNotFound
#19930 opened
Nov 22, 2024 -
err: strptime panics (instead of erroring) for out-of-bounds dates
#19928 opened
Nov 22, 2024 -
Feature request: Use multithreading for simple operations on larger dfs
#19924 opened
Nov 22, 2024 -
Add a .str.find_many with AhoCorasick (similar to extract_many)
#19923 opened
Nov 22, 2024 -
LazyFrame sort in batch only
#19921 opened
Nov 22, 2024 -
Add to clickbench Rust Polars use
#19920 opened
Nov 22, 2024 -
chunked array is not contiguous
#19918 opened
Nov 22, 2024 -
Inconsistent type casting of `datetime` in functions on `List` column
#19917 opened
Nov 22, 2024 -
Imperfect behavior with `scan_csv()` on `zstd` compressed file with `new_columns` param
#19916 opened
Nov 21, 2024 -
Errors in scan_delta and write_delta with nested struct schema evolution (aka adding new field)
#19915 opened
Nov 21, 2024 -
Allow loading lists of variable-length numpy arrays into polars dataframe
#19913 opened
Nov 21, 2024 -
Cannot merge polar df into existing spark deltalake: Unable to convert expression to string
#19910 opened
Nov 21, 2024 -
Enable floordiv on durations: duration // duration
#19908 opened
Nov 21, 2024 -
Use pre-optimized query plan by benchmark LazyFrame for the evaluation of another LazyFrame
#19906 opened
Nov 21, 2024 -
[Tracking] Improvements to `polars-row`
#19903 opened
Nov 21, 2024 -
Appending series of type pl.Categorical("lexical") may incorrectly set the sorted flag
#19900 opened
Nov 21, 2024 -
`Expr.replace` fails when replacement is a column/expression
#19893 opened
Nov 20, 2024 -
replace multiple seeds in hash functions to a single seed
#19892 opened
Nov 20, 2024 -
df.fill_null(0) fails for Decimal Columns
#19889 opened
Nov 20, 2024 -
Wrong `row_by_key` typing
#19887 opened
Nov 20, 2024 -
Add notes on parameter "use_pyarrow" in read_csv and potentially other functions
#19886 opened
Nov 20, 2024 -
`read_csv_batched` encoding parameter does not work as described in the documentation
#19885 opened
Nov 20, 2024 -
Document copy-on-write logic
#19879 opened
Nov 20, 2024 -
Regression: Polars 1.14 doesn't accept AWS_S3_ALLOW_UNSAFE_RENAME anymore
#19878 opened
Nov 20, 2024 -
`pl.concat()` output with Expr input is wrong when using the new streaming engine
#19877 opened
Nov 20, 2024 -
PanicException when combining very specific operations
#19873 opened
Nov 19, 2024 -
LazyFrame `type_coercion` breaks complex query output
#19871 opened
Nov 19, 2024 -
Failing inner join with join_nulls=False on some hardware
#19870 opened
Nov 19, 2024 -
Regression: Incorrect categorical encoding
#19868 opened
Nov 19, 2024 -
Add modular parquet de-/encryption
#19858 opened
Nov 18, 2024 -
DataFrame.sum slow for u8, u16, i8, i16
#19857 opened
Nov 18, 2024 -
Panic after join when dataframe has a column of dtype pl.Object and resulting dataframe has only one row
#19855 opened
Nov 18, 2024 -
Invalid format specified when converting Time type with `to_string` causes panic
#19853 opened
Nov 18, 2024 -
Scan_delta parameters are now lowercase since 1.14.0 for S3
#19849 opened
Nov 18, 2024 -
Allow skipping non-matching schema in json_decode
#19847 opened
Nov 18, 2024 -
Enable "partition_by" for "sink_parquet" function
#19845 opened
Nov 18, 2024 -
Scan_parquet should filter on partitions before evaluating schema
#19842 opened
Nov 18, 2024
47 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
feat: Add `dt.replace`
#19708 commented on
Nov 18, 2024 • 6 new comments -
feat: Support max/min method for Time dtype
#19815 commented on
Nov 23, 2024 • 1 new comment -
Should `pl.concat()` accept `Expr` input?
#19813 commented on
Nov 17, 2024 • 0 new comments -
Can not utilize more than one CPU core running under docker
#19807 commented on
Nov 21, 2024 • 0 new comments -
Roadmap for `Decimal` Type? (missing features like `round`; currently duckdb/spark)
#19630 commented on
Nov 21, 2024 • 0 new comments -
Bug with join and selectors in LazyFrames (equivalent code works for eager DataFrames)
#19822 commented on
Nov 21, 2024 • 0 new comments -
Sampling with groupby
#16725 commented on
Nov 22, 2024 • 0 new comments -
`is_in` operation not supported for list types
#14830 commented on
Nov 22, 2024 • 0 new comments -
Unable to write polars DataFrame to parquet file with UUID in column data
#17486 commented on
Nov 22, 2024 • 0 new comments -
Polars cannot read native pyarrow parquet tables that use dictionary encodings (e.g., categorical types)
#17945 commented on
Nov 22, 2024 • 0 new comments -
`write_excel(formulas={})` appears non-functional in latest version
#18782 commented on
Nov 22, 2024 • 0 new comments -
Implement scan_avro
#6903 commented on
Nov 22, 2024 • 0 new comments -
Groupby using lazy mode on a csv throw an memory allocation error when running on AWS lambda
#17946 commented on
Nov 22, 2024 • 0 new comments -
Inconsistency dealing with "file://..." paths
#19749 commented on
Nov 22, 2024 • 0 new comments -
`polars` and `polars-lts-cpu` as package dependencies
#12880 commented on
Nov 23, 2024 • 0 new comments -
Filter-and-aggregate error with Object column
#19085 commented on
Nov 23, 2024 • 0 new comments -
return integer indicators of the bins in `qcut`
#13278 commented on
Nov 23, 2024 • 0 new comments -
Left join large memory usage regression
#18106 commented on
Nov 23, 2024 • 0 new comments -
Saving parquet to Google Cloud Storage with `df.write_parquet()`
#14630 commented on
Nov 23, 2024 • 0 new comments -
write_parquet encoding no longer recognized by PBI Service parquet connector after Polars 1.5.0 onwards
#18819 commented on
Nov 23, 2024 • 0 new comments -
feat: Add `keep_column(s)` params to `to_dummies`
#14844 commented on
Nov 18, 2024 • 0 new comments -
WASM group by 32bit/64bit conversion bugfix
#17793 commented on
Nov 21, 2024 • 0 new comments -
refactor: Turn `Expr.append` into DSL `Expr::Function(Append)`
#19536 commented on
Nov 21, 2024 • 0 new comments -
feat(python): Add show methods to DataFrame and LazyFrame
#19634 commented on
Nov 19, 2024 • 0 new comments -
`cum_fold_exprs` does not broadcast a `lit()` initial `acc` to the correct length
#19793 commented on
Nov 17, 2024 • 0 new comments -
Re-work API for `list.to_struct`
#19525 commented on
Nov 17, 2024 • 0 new comments -
List/Array: `sum` operation not supported
#19808 commented on
Nov 17, 2024 • 0 new comments -
Support additional join types with join_where
#18669 commented on
Nov 18, 2024 • 0 new comments -
Quadratic scaling in expression depth when collecting expressions
#16224 commented on
Nov 18, 2024 • 0 new comments -
(binding/scala) Request to use the official group ID for `scala-polars`
#11202 commented on
Nov 18, 2024 • 0 new comments -
Pack all optimization flags into a single dictionary
#19595 commented on
Nov 18, 2024 • 0 new comments -
Support `Float16` data type
#7288 commented on
Nov 18, 2024 • 0 new comments -
Support for Arrow Extension types
#9112 commented on
Nov 18, 2024 • 0 new comments -
Expr.format(fmt) - convert column to string with custom format string
#7133 commented on
Nov 18, 2024 • 0 new comments -
Panic when melting since 0.19.14
#12813 commented on
Nov 19, 2024 • 0 new comments -
Allow creating a dataframe with integer categorical values
#10430 commented on
Nov 19, 2024 • 0 new comments -
Polars Hudi support
#16946 commented on
Nov 19, 2024 • 0 new comments -
[FEA/Discussion]: allow specification of default engine via config options
#19797 commented on
Nov 19, 2024 • 0 new comments -
Adding “Rounding half to even”
#17798 commented on
Nov 19, 2024 • 0 new comments -
Fixed-width text file reader
#3151 commented on
Nov 19, 2024 • 0 new comments -
pl.concat_list(pl.col("struct_of_structs")) out of memory and crashes
#19805 commented on
Nov 20, 2024 • 0 new comments -
Creating a DataFrame from Pydantic models fails if there is missing data/Nones
#19761 commented on
Nov 20, 2024 • 0 new comments -
Future direction for Decimal, embracing fixed point?
#19784 commented on
Nov 20, 2024 • 0 new comments -
Get column dtype in expression
#4982 commented on
Nov 20, 2024 • 0 new comments -
read_csv_batched should return an instance of an iterator
#13885 commented on
Nov 20, 2024 • 0 new comments -
Incorrect description on read_csv_batched function
#14632 commented on
Nov 20, 2024 • 0 new comments -
Cannot `sink_parquet` when using `.is_in()` inside of `pl.when()/then()` in polars > 0.20.19
#15767 commented on
Nov 21, 2024 • 0 new comments