bburscher/twitterStreams

twitterStreams

Distributed analytics of a live tweet stream.

authors: Bjorn + Martijn

Installation

Compile librdkafka and install requirements

git clone https://github.com/edenhill/librdkafka.git
cd librdkafka
./configure
make
sudo make install

pip3 install -r requirements.txt

Usage

Download Kafka, then start ZooKeeper and the Kafka broker in separate terminal tabs, from the Kafka directory:

bin/zookeeper-server-start.sh config/zookeeper.properties
bin/kafka-server-start.sh config/server.properties

Run once to create the topic

TOPIC='election'
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic $TOPIC
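
The same topic creation can be scripted. The following is a minimal sketch, assuming it is run from the Kafka directory. The helper names `topic_create_command` and `create_topic` are hypothetical and not part of this repository; only the flags from the command above are used.

```python
import subprocess


def topic_create_command(topic, zookeeper="localhost:2181",
                         partitions=1, replication_factor=1):
    """Build the argument list for the kafka-topics.sh call shown above."""
    return [
        "bin/kafka-topics.sh", "--create",
        "--zookeeper", zookeeper,
        "--replication-factor", str(replication_factor),
        "--partitions", str(partitions),
        "--topic", topic,
    ]


def create_topic(topic):
    # Assumes the current working directory is the Kafka installation root.
    # check=True raises CalledProcessError if the script exits non-zero.
    subprocess.run(topic_create_command(topic), check=True)
```

Passing the arguments as a list avoids shell variable expansion altogether, so the topic name is taken exactly as given.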

Run producer (twitter -> kafka)

python3 tweePyTest.py
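
The contents of tweePyTest.py are not reproduced here. As a rough sketch of what the twitter -> kafka step involves, the code below assumes a client object with a confluent-kafka style `produce(topic, key=..., value=...)` method (confluent-kafka is a Python binding built on librdkafka) and tweets arriving as dicts with an `id_str` field. `tweet_to_message` and `publish` are hypothetical names, not functions from this repository.

```python
import json


def tweet_to_message(tweet):
    """Turn a tweet dict into a (key, value) pair of bytes for Kafka."""
    # Assumption: the tweet id is used as the message key.
    key = str(tweet.get("id_str", "")).encode("utf-8")
    value = json.dumps(tweet, ensure_ascii=False).encode("utf-8")
    return key, value


def publish(producer, topic, tweet):
    """Send one tweet to the given topic."""
    key, value = tweet_to_message(tweet)
    producer.produce(topic, key=key, value=value)
    # poll(0) serves pending delivery callbacks without blocking.
    producer.poll(0)
```

If confluent-kafka is the installed client, `producer` could be `confluent_kafka.Producer({'bootstrap.servers': 'localhost:9092'})`, where the address is an assumption about a default local broker. Calling `producer.flush()` before exit lets queued messages be delivered.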

Run consumers (kafka -> print); this may be started multiple times, on multiple hosts

GROUPID='yourgroup' # see comments inside consumer_test.py
SERVERS=$IP_OR_HOSTNAME GROUPID=$GROUPID python3 consumer_test.py
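
The kafka -> print step could look roughly like the sketch below. It is not the actual consumer_test.py: it assumes a confluent-kafka style consumer (with `poll`, and messages exposing `error()` and `value()`), and it reads the same `SERVERS` and `GROUPID` environment variables as the command above. `consumer_config` and `print_messages` are hypothetical names.

```python
import os


def consumer_config(env=None):
    """Build a librdkafka-style config dict from SERVERS and GROUPID."""
    env = os.environ if env is None else env
    return {
        "bootstrap.servers": env["SERVERS"],
        "group.id": env["GROUPID"],
        # Assumption: start from the oldest message when the group
        # has no committed offset yet.
        "auto.offset.reset": "earliest",
    }


def print_messages(consumer, max_messages=None, timeout=1.0):
    """Poll the consumer and print each message value; return the count."""
    seen = 0
    while max_messages is None or seen < max_messages:
        msg = consumer.poll(timeout)
        if msg is None:      # nothing arrived within the timeout
            continue
        if msg.error():      # error event instead of a regular message
            print("error:", msg.error())
            continue
        print(msg.value().decode("utf-8"))
        seen += 1
    return seen
```

With confluent-kafka installed, this could be wired up as `consumer = confluent_kafka.Consumer(consumer_config())` followed by `consumer.subscribe(['election'])`. Kafka assigns each partition to at most one consumer per group, so with the single partition created above, consumers sharing a GROUPID do not split the stream, while consumers with distinct group IDs each receive every message.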
