Trust and Safety Models

We decided to open source the training code of the following models:

  • pNSFWMedia: Model to detect tweets with NSFW images. This includes adult and pornographic content.
  • pNSFWText: Model to detect tweets with NSFW text, such as adult and sexual topics.
  • pToxicity: Model to detect toxic tweets. Toxicity includes marginal content such as insults and certain types of harassment. Toxic content does not violate Twitter's terms of service.
  • pAbuse: Model to detect abusive content. This includes violations of Twitter's terms of service, such as hate speech, targeted harassment, and abusive behavior.
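Each of these models emits a per-tweet probability score (hence the "p" prefix). As a purely hypothetical illustration, and not this repository's actual API, the sketch below shows how downstream code might threshold such scores to decide which labels apply to a tweet; the field names and threshold values are assumptions made for the example only.

```python
# Hypothetical sketch: thresholding per-tweet probability scores from models
# like pNSFWMedia, pNSFWText, pToxicity, and pAbuse. This is NOT the
# repository's API; all names and threshold values here are assumptions.
from dataclasses import dataclass


@dataclass
class SafetyScores:
    """Per-tweet probabilities produced by the four models."""
    p_nsfw_media: float
    p_nsfw_text: float
    p_toxicity: float
    p_abuse: float


# Example thresholds; real values would be tuned per model and are not
# published in this repository.
THRESHOLDS = {
    "p_nsfw_media": 0.9,
    "p_nsfw_text": 0.9,
    "p_toxicity": 0.8,
    "p_abuse": 0.7,
}


def flagged_labels(scores: SafetyScores) -> list[str]:
    """Return the labels whose probability meets or exceeds its threshold."""
    return [
        name
        for name, threshold in THRESHOLDS.items()
        if getattr(scores, name) >= threshold
    ]


if __name__ == "__main__":
    example = SafetyScores(
        p_nsfw_media=0.05, p_nsfw_text=0.20, p_toxicity=0.85, p_abuse=0.10
    )
    print(flagged_labels(example))  # -> ['p_toxicity']
```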

We have several more models and rules that we are not open sourcing at this time because of the adversarial nature of this area. The team is considering open sourcing additional models in the future and will keep the community posted.