Feature request: Way to convert Pandas dataframe to Table object #68
Comments
I posted this question on Stack Overflow here:
I'm wondering if there would be potential for Pandas DataFrames (or a subclass) to be used as the internal data object for Orange? If there is a move towards numpy anyway, they work very nicely together, and you get a lot of functionality for free.
We have seriously considered using Pandas instead of pure numpy, but we would have had to add too many things on our own, so we decided it would be easier to start from numpy. However, Orange is no longer limited to data in a single format. It can already use data that is stored in SQL and only move it to local memory when needed (e.g. it would compute a naive Bayesian classifier on the database, without moving the data to the client). Adding Pandas DataFrames should be even simpler. It is not on any short-term list, since our priority now is to port most of Orange 2 first, but we can do it in the future.
@janezd That's great to hear. I can understand the thinking: Pandas does bring its own restrictions as a base format (not least only allowing up to 2D data, which is limiting for image data, etc., depending on your plans for Orange). The restriction of data formats in Orange has previously prevented me from implementing some tools... time to take another look!
Check this: http://docs.orange.biolab.si/3/modules/data.table.html. The data is still 2D, but it can be, for instance, a view into a 3D array. It is much more flexible now.
Can you create a pull request with this? This makes it easier to follow the code and its potential changes.
There's a new add-on available, Orange3-spark. This add-on includes widgets to convert between Spark<--->Pandas<-->Orange. Perhaps the Pandas<--->Orange widgets could be included in the default Orange distribution?
@jamartinh Could be, but can you please explain how they would be used?
@kernc Can we close the issue, since implementing the Pandas DataFrame is our GSoC project? Or should we do it after the project is done?
Is the issue fixed? 😸
It's a feature request, not an issue per se. So yes, the feature request is accepted and is in the process of being implemented.
With the new pandas implementation (#1347), there's a
Duplicate of #2932.
Hi, Pandas is quickly becoming a standard in data analysis in the scientific Python community. So I think you should consider adding support for converting data in a pandas DataFrame to an Orange Table object.
The other day I had a very complex CSV file which I wanted to read into Orange. The available CSV importer couldn't read it, but pandas, or a straight Python script, loaded it without problems. So I added a Python Script block to my project. But then I got stuck, because I couldn't find a way to transform the imported table into an Orange Table object.
Perhaps better documentation on how to build a Table from raw data would already help a lot.
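
For readers who land here with the same problem, below is a minimal sketch of building a Table from raw data. It assumes Orange 3 and numpy are installed and that every column is numeric. The helper names `rows_to_matrix` and `to_orange_table`, the sample data, and the `;` delimiter are made up for illustration. Only `Domain`, `ContinuousVariable`, and `Table.from_numpy` are Orange calls.

```python
import csv
import io


def rows_to_matrix(text, delimiter=";"):
    """Parse delimited text into (column_names, rows_of_floats).

    Hypothetical helper: stands in for whatever custom parsing the
    awkward CSV file actually needs.
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)
    # Skip blank lines; assume every remaining cell is numeric.
    rows = [[float(cell) for cell in row] for row in reader if row]
    return header, rows


def to_orange_table(names, rows):
    """Wrap parsed values in an Orange Table (needs Orange 3 and numpy).

    Imports are local so the parsing step above works without Orange.
    """
    import numpy as np
    from Orange.data import ContinuousVariable, Domain, Table

    # One continuous (numeric) variable per column, no class variable.
    domain = Domain([ContinuousVariable(name) for name in names])
    return Table.from_numpy(domain, np.array(rows, dtype=float))


# Made-up sample data for illustration only.
sample = "height;weight\n1.80;75\n1.65;60\n"
names, rows = rows_to_matrix(sample)

# Inside a Python Script widget, the result would be handed on with:
# out_data = to_orange_table(names, rows)
```

If the data is already in a pandas DataFrame `df` with only numeric columns, `list(df.columns)` and `df.to_numpy()` could play the roles of `names` and `rows`. Categorical or text columns would need `DiscreteVariable` or `StringVariable` handling, which this sketch does not cover.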