Tweep is an advanced Twitter scraping tool written in Python that allows for scraping Tweets from Twitter profiles without using Twitter's API.
Tweep uses Twitter's search operators to let you scrape Tweets from specific users, scrape Tweets relating to certain topics, hashtags, and trends, or pick out sensitive information in Tweets, such as e-mail addresses and phone numbers. I find this very useful, and you can get really creative with it too.
Some of the benefits of using Tweep vs. the Twitter API:
- Can fetch almost all Tweets (the Twitter API limits you to the last 3200 Tweets)
- Fast initial setup
- Can be used anonymously and without sign up
- No rate limitations
Requirements:
- Python 3.5/3.6
- The dependencies listed in requirements.txt, installed with:
pip3 install -r requirements.txt
- -u: The user's Tweets you want to scrape.
- -s: Search for Tweets containing this word or phrase.
- -g: Retrieve Tweets by geolocation. The argument format is lat,lon,range (in km), e.g. 48.01009,36.09876,0.5km. Please note that km has to be included.
- -o: Save output to a file.
- --year: Filter Tweets before the specified year.
- --fruit: Display Tweets with "low-hanging-fruit".
- --tweets: Display Tweets only.
- --verified: Display Tweets only from verified users (use with -s).
- --users: Display users only (use with -s).
- --csv: Write as a .csv file.
- --hashtags: Extract hashtags.
- --userid: Search from a Twitter user's ID.
- --limit: Number of Tweets to pull (increments of 20).
- --count: Display the number of Tweets scraped at the end of the session.
- --stats: Show the number of replies, retweets, and likes.
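The -g argument format (lat,lon,range with a mandatory km suffix) is easy to get wrong, so here is a minimal sketch of how such a value could be checked before it is passed on the command line. The names GeoQuery and parse_geo_argument are hypothetical and are not part of Tweep; the latitude, longitude, and positive-range checks are assumptions added for illustration.

```python
from collections import namedtuple

# Hypothetical container for the three parts of a -g value; not a Tweep type.
GeoQuery = namedtuple("GeoQuery", ["latitude", "longitude", "radius_km"])


def parse_geo_argument(value: str) -> GeoQuery:
    """Parse a 'lat,lon,range' string such as '48.01009,36.09876,0.5km'."""
    parts = [part.strip() for part in value.split(",")]
    if len(parts) != 3:
        raise ValueError("expected lat,lon,range, e.g. 48.01009,36.09876,0.5km")
    lat_text, lon_text, range_text = parts
    # The README states that the km suffix has to be included.
    if not range_text.endswith("km"):
        raise ValueError("range must include the km suffix, e.g. 0.5km")
    latitude = float(lat_text)
    longitude = float(lon_text)
    radius_km = float(range_text[:-2])
    # Assumed sanity checks; Tweep itself may or may not enforce these.
    if not -90 <= latitude <= 90:
        raise ValueError("latitude must be between -90 and 90")
    if not -180 <= longitude <= 180:
        raise ValueError("longitude must be between -180 and 180")
    if radius_km <= 0:
        raise ValueError("range must be positive")
    return GeoQuery(latitude, longitude, radius_km)
```

Calling parse_geo_argument("48.01009,36.09876,0.5km") yields the three numeric parts, while a value without the km suffix raises ValueError.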
The --fruit feature will display Tweets that might contain sensitive info such as:
- Profiles from leaked databases (Myspace or LastFM)
- Email addresses
- Phone numbers
- Keybase.io profiles
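To give a feel for what this kind of filtering involves, here is a rough sketch that flags e-mail addresses, phone numbers, and Keybase.io profile links in a piece of text. It is not Tweep's implementation: the regular expressions and the function name find_low_hanging_fruit are assumptions made for illustration, and the phone pattern in particular is crude and will also match other long digit runs.

```python
import re

# Illustrative patterns only; Tweep's own matching rules are not shown here.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_PATTERN = re.compile(r"\+?\d[\d\s().-]{7,}\d")
KEYBASE_PATTERN = re.compile(r"keybase\.io/[A-Za-z0-9_]+", re.IGNORECASE)


def find_low_hanging_fruit(text: str) -> dict:
    """Return the substrings of `text` that match each illustrative pattern."""
    return {
        "emails": EMAIL_PATTERN.findall(text),
        "phones": PHONE_PATTERN.findall(text),
        "keybase": KEYBASE_PATTERN.findall(text),
    }


if __name__ == "__main__":
    sample = "mail me at jane.doe@example.com, proof at keybase.io/janedoe"
    print(find_low_hanging_fruit(sample))
```

A Tweet with no matches produces three empty lists, so a caller can keep only the Tweets where at least one list is non-empty.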
A few simple examples to help you understand the basics:
- python3 tweep.py -u username
  Scrape all the Tweets from the user's timeline.
- python3 tweep.py -u username -s pineapple
  Scrape all Tweets from the user's timeline containing pineapple.
- python3 tweep.py -s pineapple
  Collect every Tweet containing pineapple from everyone's Tweets.
- python3 tweep.py -u username --year 2014
  Collect Tweets that were tweeted before 2014.
- python3 tweep.py -u username --since 2015-12-20
  Collect Tweets that were tweeted since 2015-12-20.
- python3 tweep.py -u username -o file.txt
  Scrape Tweets and save them to file.txt.
- python3 tweep.py -u username -o file.csv --csv
  Scrape Tweets and save them as a csv file.
- python3 tweep.py -u username --fruit
  Show Tweets with low-hanging fruit.
- python3 tweep.py -s "Donald Trump" --verified --users
  List verified users that Tweet about Donald Trump.
- python3 tweep.py -g 48.880048,2.385939,1km -o file.csv --csv
  Scrape Tweets in a radius of 1km around a place in Paris and export them to a csv file.
955511208597184512 2018-01-22 18:43:19 GMT <now> pineapples are the best fruit
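If you save plain-text output and want to process it later, a line like the one above can be split back into fields. The sketch below assumes every line follows the same layout as that single sample (Tweet ID, date, time, timezone, username in angle brackets, then the text); that layout is inferred from the sample only, and the names ScrapedTweet and parse_output_line are hypothetical.

```python
from collections import namedtuple
from datetime import datetime

# Hypothetical record type for one output line; not part of Tweep.
ScrapedTweet = namedtuple(
    "ScrapedTweet", ["tweet_id", "posted_at", "timezone", "username", "text"]
)


def parse_output_line(line: str) -> ScrapedTweet:
    """Split one line of the assumed '<id> <date> <time> <tz> <user> text' layout."""
    # The first four space-separated fields are fixed; the rest is user + text.
    tweet_id, date_text, time_text, timezone, rest = line.strip().split(" ", 4)
    user_part, _, text = rest.partition(" ")
    if not (user_part.startswith("<") and user_part.endswith(">")):
        raise ValueError("expected the username wrapped in angle brackets")
    posted_at = datetime.strptime(
        "{} {}".format(date_text, time_text), "%Y-%m-%d %H:%M:%S"
    )
    return ScrapedTweet(int(tweet_id), posted_at, timezone, user_part[1:-1], text)


if __name__ == "__main__":
    sample = "955511208597184512 2018-01-22 18:43:19 GMT <now> pineapples are the best fruit"
    print(parse_output_line(sample))
```

Lines that do not have at least five space-separated fields raise ValueError, which makes malformed or wrapped lines easy to spot.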
- Added new features:
  - --userid feature allowing a user to search Tweets from a Twitter user's user ID.
  - --limit feature allowing a user to specify how many Tweets get scraped (increments of 20).
  - --count feature to display the total number of Tweets collected at the end of a Tweep session.
  - --stats feature to display the number of replies, retweets, and likes.
  - -g feature to scrape Tweets within a radius of a GPS location.
- Fixed:
  - Error handling: moved to a separate function and better organized.
- Added:
  - Python3 update, rewritten using asyncio. Fetching Tweets should naturally be a lot faster.
  - Output can be saved.
  - Replies are now visible in the scrapes.
- Removed:
  - Pics feature. I'll re-add this at a later date.
Shout me out on Twitter: @now
