-[Web Crawling & Web Scraping](#web-crawling--web-scraping)
 - [Web Frameworks](#web-frameworks)
 - [WebSocket](#websocket)
 - [WSGI Servers](#wsgi-servers)
@@ -342,6 +342,7 @@ Inspired by [awesome-php](https://github.com/ziadoz/awesome-php).
 * [Open Mining](https://github.com/mining/mining) - Business Intelligence (BI) in Pandas interface.
 * [Orange](https://orange.biolab.si/) - Data mining, data visualization, analysis and machine learning through visual programming or scripts.
 * [Pandas](http://pandas.pydata.org/) - A library providing high-performance, easy-to-use data structures and data analysis tools.
+* [Optimus](https://github.com/ironmussa/Optimus) - Cleansing, pre-processing, feature engineering, exploratory data analysis and easy Machine Learning with a PySpark backend.

 ## Data Validation

@@ -729,6 +730,7 @@ Inspired by [awesome-php](https://github.com/ziadoz/awesome-php).

 * [bpython](https://github.com/bpython/bpython) - A fancy interface to the Python interpreter.
 * [Jupyter Notebook (IPython)](https://jupyter.org) - A rich toolkit to help you make the most out of using Python interactively.
 * [ptpython](https://github.com/jonathanslenders/ptpython) - Advanced Python REPL built on top of the [python-prompt-toolkit](https://github.com/jonathanslenders/python-prompt-toolkit).

 ## Internationalization
@@ -815,6 +817,7 @@ Inspired by [awesome-php](https://github.com/ziadoz/awesome-php).
 * [SnowNLP](https://github.com/isnowfy/snownlp) - A library for processing Chinese text.
 * [spaCy](https://spacy.io/) - A library for industrial-strength natural language processing in Python and Cython.
 * [TextBlob](https://github.com/sloria/TextBlob) - Providing a consistent API for diving into common NLP tasks.
+* [PyTorch-NLP](https://github.com/PetrochukM/PyTorch-NLP) - A toolkit enabling rapid deep learning NLP prototyping for research.

 ## Network Virtualization

@@ -1200,9 +1203,9 @@ Inspired by [awesome-php](https://github.com/ziadoz/awesome-php).
 * [textract](https://github.com/deanmalmgren/textract) - Extract text from any document, Word, PowerPoint, PDFs, etc.
 * [toapi](https://github.com/gaojiuli/toapi) - Every web site provides APIs.

-## Web Crawling
+## Web Crawling & Web Scraping

-*Libraries for scraping websites.*
+*Libraries to automate data extraction from websites.*

 * [cola](https://github.com/chineking/cola) - A distributed crawling framework.