- Salesforce Research
- Bay Area, CA
- http://memray.me
- in/memray
- https://scholar.google.com/citations?user=s6h8L_UAAAAJ&hl=en
Stars
A community hub for collecting and sharing real-world issues with LLMs and other models to help improve their capabilities.
This repo contains the code and data for "VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks"
Salesforce open-source LLMs with 8k sequence length.
Unified Controllable Visual Generation Model
A deep learning library for identifying keyphrases from text
CodeGen is a family of open-source models for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex.
ACTER is a manually annotated dataset for term extraction, covering 3 languages (English, French, and Dutch), and 4 domains (corruption, dressage, heart failure, and wind energy).
Automatically generate your résumé and various cover letters from YAML files.
Code to obtain PMC-SA, a dataset for the summarization of scientific articles.
Everything you need to know for a Software Engineering interview
Large, curated set of benchmark datasets for evaluating automatic keyphrase extraction algorithms.
🎨 ML Visuals contains figures and templates which you can reuse and customize to improve your scientific writing.
AutoPhrase: Automated Phrase Mining from Massive Text Corpora
Plot the vector graph of attention-based text visualisation
An open-source NLP research library, built on PyTorch.
A collection of corpora for named entity recognition (NER) and entity recognition tasks. These annotated datasets cover a variety of languages, domains and entity types.
Full Python implementation of the ROUGE metric, producing the same results as the official Perl implementation.
A Python wrapper for the ROUGE summarization evaluation package
Unsupervised Language Modeling at scale for robust sentiment classification
PyTorch original implementation of Cross-lingual Language Model Pretraining.
Facebook AI Research Sequence-to-Sequence Toolkit