Density-based data stream clustering for data of arbitrary dimension, written in Clojure.
Reference paper.
Sample clustering for time-series data with a moving hotspot:
lein test
Note: If you install ImageMagick for your platform, which supplies the command convert, the tests will generate animated GIFs (like the ones in this README) in addition to heatmaps of clusters and grid density data.
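To see in advance whether the optional GIF output is likely to be produced, you can check whether convert is on your PATH before running the tests. The following is a minimal sketch assuming a POSIX shell; `check_convert` is a hypothetical helper name and is not part of this project. Newer ImageMagick releases may also install the tool under a different command name, so treat the result as a hint rather than a guarantee.

```shell
#!/bin/sh
# Report whether ImageMagick's `convert` command is available on PATH.
# check_convert is a hypothetical helper, not something this project ships.
check_convert() {
  if command -v convert >/dev/null 2>&1; then
    echo "convert found: the tests should also generate animated GIFs"
  else
    echo "convert not found: expect heatmaps only"
  fi
}

check_convert
```

After the check, run the test suite as usual with lein test.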
In addition to the unit tests, there are tests that use generated data. Each of these creates an output directory containing a time series of heatmaps showing the clusters and the grid densities at each time step.
You can also run integration tests against the RPC server, which runs a DStream clustering server instance:
./run_docker_compose_stack.sh
Consider data shaped like a crater, which can look like this:
Clustering results, where each cluster has a distinct color:
When dimensionality is greater than 2, we use t-SNE to reduce the dimensionality of the clusters at a given time to 2. We can then create plots like the one below, which shows 3 different clusters in 5-dimensional space staying static through time. It demonstrates the non-deterministic nature of t-SNE, but also how valuable visualizing its results can be:
Copyright © 2017 FIXME
Distributed under the Eclipse Public License, either version 1.0 or (at your option) any later version.