Implement a better sorting API #146

Open
@olekscode

Description

At the moment, we have three methods in DataFrame's sorting protocol:

DataFrame >> sortBy: aColumnName.
DataFrame >> sortBy: aColumnName using: aBlock.
DataFrame >> sortDescendingBy: aColumnName.

We need to add more methods, e.g.

DataFrame >> sortDescendingBy: aColumnName using: aBlock.

We also need the sorted* methods, which do not modify the data frame but return a sorted copy instead:

DataFrame >> sortedBy: aColumnName.
DataFrame >> sortedBy: aColumnName using: aBlock.
DataFrame >> sortedDescendingBy: aColumnName.
DataFrame >> sortedDescendingBy: aColumnName using: aBlock.
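The sort*/sorted* split proposed here mirrors the convention already used by standard Pharo collections, where sort: reorders the receiver and sorted: answers a sorted copy. A minimal Playground sketch of that convention, using a plain Array rather than a DataFrame (the variable names are only illustrative):

```smalltalk
| numbers sortedCopy |
numbers := #(3 1 2) copy.

"sorted: answers a new collection; the receiver keeps its original order."
sortedCopy := numbers sorted: [ :a :b | a <= b ].

"sort: reorders the receiver itself."
numbers sort: [ :a :b | a >= b ].
```

After running this, sortedCopy holds the elements in ascending order while numbers has been reordered in place to descending order. The DataFrame methods listed above would presumably follow the same rule.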

We also need a way to sort by multiple columns (e.g. first by columnA, then by columnB, so that columnB only breaks ties in columnA).
Perhaps we should also support sorting by column index (e.g. DataFrame >> sortByColumnAt: aNumber).
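To make the multi-column case concrete, here is a rough sketch of the comparison logic it implies: compare on the first column and consult the second column only when the first one ties. It uses plain Arrays as stand-ins for rows, and the data and the name byCityThenAge are hypothetical. It is not a proposal for the final DataFrame API.

```smalltalk
| rows byCityThenAge sortedRows |
"Each row is a hypothetical (city age) pair standing in for a data frame row."
rows := {
	#('Lviv' 30).
	#('Kyiv' 25).
	#('Lviv' 22).
	#('Kyiv' 41) }.

"Compare on the first column; fall back to the second only on ties."
byCityThenAge := [ :a :b |
	a first = b first
		ifTrue: [ a second <= b second ]
		ifFalse: [ a first < b first ] ].

sortedRows := rows sorted: byCityThenAge.
```

Writing such a block by hand gets awkward as the number of columns grows, which is one argument for a dedicated multi-column selector that takes a list of column names.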

That being said, we need to discuss a consistent and flexible API that would allow us to cover all of those cases.

Finally, since DataFrame is a collection, we must support sort, sorted, sort:, and sorted:.
Normally, those methods should already work if we implement do: or add:, but as Myroslava pointed out in issue #127, sorted: returns an Array instead of a DataFrame.
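The behaviour reported in #127 is consistent with how the generic sorted: is usually implemented in Pharo: the receiver is converted to an Array and that array is sorted. A minimal sketch of the effect, assuming the default Collection >> sorted: and using a Set in place of a DataFrame:

```smalltalk
| numbers result |
numbers := Set withAll: #(3 1 2).

"The inherited sorted: sorts an Array built from the receiver."
result := numbers sorted: [ :a :b | a <= b ].

"Expected to be Array rather than Set, assuming the default implementation."
result class.
```

If that is indeed the cause, one possible fix would be for DataFrame to override sorted: so that it sorts a copy of itself and answers that copy.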
