Load the stopwords that you need in Pharo. To install the library, evaluate the following Metacello script:
Metacello new
	baseline: 'AIStopwords';
	repository: 'github://pharo-ai/stopwords/src';
	load.
If you want to add a dependency on stopwords to your project, add the following lines to your baseline method:
spec
	baseline: 'AIStopwords'
	with: [ spec repository: 'github://pharo-ai/stopwords/src' ].
If you are new to baselines and Metacello, check out the Baselines tutorial on Pharo Wiki.
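For context, a complete baseline method that declares this dependency might look like the sketch below. The class name BaselineOfMyProject and the package name 'MyProject' are hypothetical placeholders; only the AIStopwords baseline name and repository URL come from this README.

```smalltalk
baseline: spec
	"Defined on a hypothetical BaselineOfMyProject class."
	<baseline>
	spec for: #common do: [
		"Declare the external dependency on stopwords."
		spec
			baseline: 'AIStopwords'
			with: [ spec repository: 'github://pharo-ai/stopwords/src' ].
		"Hypothetical project package that requires the dependency."
		spec package: 'MyProject' with: [ spec requires: #('AIStopwords') ] ]
```

Listing 'AIStopwords' under requires: makes Metacello load it before the package that uses it.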
Use the AIStopwords class façade to quickly obtain a collection of stop words. It supports multiple stopword repositories (implemented as subclasses), and a default list is configured automatically. To get the list of stop words for a language, use the pattern:
AIStopwords for<Language>.
For example:
AIStopwords forEnglish.
AIStopwords forSpanish.
AIStopwords forFrench.
To change the default stopword class for a language:
AIStopwordsEnglish defaultStopwordClass: aClass.
The stopword lists were collected from https://github.com/igorbrigadir/stopwords
Example of usage:
'This is Ground Control to Major Tom' removeStopwordsUsing: AIStopwords forEnglish
will answer a Collection without the stopwords:
#('Ground' 'Control' 'Major' 'Tom')
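As a rough mental model of the example above, the same kind of filtering can be written with standard Pharo collection messages. This is only a sketch under assumptions, not necessarily how removeStopwordsUsing: is implemented: it assumes the collection answered by AIStopwords forEnglish holds lowercase strings and responds to includes:, and it splits the text on whitespace only.

```smalltalk
| stopwords words |
"Assumption: forEnglish answers a collection of lowercase strings."
stopwords := AIStopwords forEnglish.
"substrings splits the receiver on whitespace."
words := 'This is Ground Control to Major Tom' substrings.
"Keep only the words whose lowercase form is not a stop word."
words reject: [ :each | stopwords includes: each asLowercase ]
```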
Stopword lists can now be extended with #addStopword: and #addStopwords:, for example:
AIStopwordsEngCoreNLP new addStopword: 'myStopword'.
AIStopwordsEngLuceneSolr new addStopwords: #('stopword1' 'stopword2').