- Salesforce Research
- Bay Area, CA
- http://memray.me
- in/memray
- https://scholar.google.com/citations?user=s6h8L_UAAAAJ&hl=en
- llm_phrase_semantics Public
  Code for EMNLP2024 paper: Traffic Light or Light Traffic? Investigating Phrasal Semantics in Large Language Models
  Python Apache License 2.0 Updated Nov 1, 2024
- open-llm-problems Public
  A community hub for collecting and sharing real-world issues with LLMs and other models to help improve their capabilities.
- mteb-official Public
  Forked from embeddings-benchmark/mteb
  MTEB: Massive Text Embedding Benchmark
  Python Apache License 2.0 Updated Aug 12, 2024
- paperscraper Public
  Forked from jannisborn/paperscraper
  Tools to scrape publication metadata from pubmed, arxiv, medrxiv and chemrxiv.
  Python MIT License Updated May 22, 2024
- beir Public
  Forked from beir-cellar/beir
  A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
  Python Apache License 2.0 Updated Sep 25, 2023
- OpenNMT-kpg-release Public
  Keyphrase Generation
- fairscale Public
  Forked from facebookresearch/fairscale
  PyTorch extensions for high performance and large scale training.
  Python Other Updated Mar 10, 2023
- SimCSE Public
  Forked from princeton-nlp/SimCSE
  EMNLP'2021: SimCSE: Simple Contrastive Learning of Sentence Embeddings
  Python MIT License Updated Feb 28, 2023
- apex Public
  Forked from NVIDIA/apex
  A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch
  Python BSD 3-Clause "New" or "Revised" License Updated Aug 29, 2022
- contriever Public
  Forked from facebookresearch/contriever
  Contriever: Towards Unsupervised Dense Information Retrieval with Contrastive Learning
  Python Other Updated Aug 29, 2022
- emdr2 Public
  Forked from DevSinghSachan/emdr2
  Code and Models for the paper "End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering" (NeurIPS 2021)
  Python Other Updated Apr 19, 2022
- pyserini Public
  Forked from castorini/pyserini
  Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.
  Python Apache License 2.0 Updated Apr 15, 2022
- ColBERT Public
  Forked from stanford-futuredata/ColBERT
  ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21)
  Python MIT License Updated Mar 30, 2022
- wikiextractor Public
  Forked from attardi/wikiextractor
  A tool for extracting plain text from Wikipedia dumps
- sentence-transformers Public
  Forked from UKPLab/sentence-transformers
  Multilingual Sentence & Image Embeddings with BERT
  Python Apache License 2.0 Updated Mar 17, 2022
- hands-on-with-pke Public
  Forked from keyphrasification/hands-on-with-pke
  Jupyter Notebook Updated Mar 5, 2022
- DHR Public
  Forked from yeliu918/DHR
  This is the repository of the Dense Hierarchical Retrieval for Open-Domain Question Answering
  Python Updated Dec 23, 2021
- datasets Public
  Forked from huggingface/datasets
  🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
  Python Apache License 2.0 Updated Dec 8, 2021
- academic-budget-bert Public
  Forked from IntelLabs/academic-budget-bert
  Repository containing code for "How to Train BERT with an Academic Budget" paper
  Python Apache License 2.0 Updated Nov 21, 2021
- ANCE Public
  Forked from microsoft/ANCE
  A novel embedding training algorithm that leverages ANN search and achieved SOTA retrieval on TREC DL 2019 and OpenQA benchmarks
  Python MIT License Updated Sep 22, 2021
- SentEval Public
  Forked from facebookresearch/SentEval
  A python tool for evaluating the quality of sentence embeddings.
  Python Other Updated Aug 30, 2021
- fairseq Public
  Forked from facebookresearch/fairseq
  Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
  Python MIT License Updated Jan 19, 2021
- yttm_transformers_tokenizer Public
  Forked from king-menin/yttm_transformers_tokenizer
  Implementation of youtokentome tokenizer for transformers
  Python Updated Aug 20, 2020
- gensim Public
  Forked from piskvorky/gensim
  Topic Modelling for Humans
  Python GNU Lesser General Public License v2.1 Updated Nov 8, 2019
- pytorchviz Public
  Forked from szagoruyko/pytorchviz
  A small package to create visualizations of PyTorch execution graphs
  Jupyter Notebook MIT License Updated Jul 12, 2019
- pyrouge Public
  Forked from bheinzerling/pyrouge
  A Python wrapper for the ROUGE summarization evaluation package
  Python MIT License Updated May 1, 2019
- ARAE Public
  Forked from jakezhaojb/ARAE
  Code for the paper "Adversarially Regularized Autoencoders (ICML 2018)" by Zhao, Kim, Zhang, Rush and LeCun
  Python BSD 3-Clause "New" or "Revised" License Updated Apr 26, 2019