Skip to content

PyGraphistry is a Python library to quickly load, shape, embed, and explore big graphs with the GPU-accelerated Graphistry visual graph analyzer

License

Notifications You must be signed in to change notification settings

graphistry/pygraphistry

Repository files navigation

PyGraphistry: Leverage the power of graphs & GPUs to visualize, analyze, and scale your data

Build Status CodeQL Documentation Status Latest Version Latest Version License PyPI - Downloads

Uptime Robot status Twitter Follow

Demo: Interactive visualization of 80,000+ Facebook friendships (source data)

PyGraphistry is an open source Python library for data scientists and developers to leverage the power of graph visualization, analytics, AI, including with native GPU acceleration:

From global 10 banks, manufacturers, news agencies, and government agencies, to startups, game companies, scientists, biotechs, and NGOs, many teams are tackling their graph workloads with Graphistry.

Gallery

The notebook demo gallery shares many more live visualizations, demos, and integration examples

Twitter Botnet
Edit Wars on Wikipedia
(data)
100,000 Bitcoin Transactions
Port Scan Attack
Protein Interactions
(data)
Programming Languages
(data)

Install

Common configurations:

  • Minimal core

    Includes: The GFQL dataframe-native graph query language, built-in layouts, Graphistry visualization server client

    pip install graphistry

    Does not include graphistry[ai], plugins

  • No dependencies and user-level

    pip install --no-deps --user graphistry
  • GPU acceleration - Optional

    Local GPU: Install RAPIDS and/or deploy a GPU-ready Graphistry server

    Remote GPU: Use the remote endpoints.

For further options, see the installation guides

Visualization quickstart

Quickly go from raw data to a styled and interactive Graphistry graph visualization:

import graphistry
import pandas as pd

# Raw data as Pandas CPU dataframes, cuDF GPU dataframes, Spark, ...
df = pd.DataFrame({
    'src': ['Alice', 'Bob', 'Carol'],
    'dst': ['Bob', 'Carol', 'Alice'],
    'friendship': [0.3, 0.95, 0.8]
})

# Bind
g1 = graphistry.edges(df, 'src', 'dst')

# Override styling defaults
g1_styled = g1.encode_edge_color('friendship', ['blue', 'red'], as_continuous=True)

# Connect: Free GPU accounts and self-hosting @ graphistry.com/get-started
graphistry.register(api=3, username='your_username', password='your_password')

# Upload for GPU server visualization session
g1_styled.plot()

Explore 10 Minutes to Graphistry Visualization for more visualization examples and options

PyGraphistry[AI] & GFQL quickstart - CPU & GPU

CPU graph pipeline combining graph ML, AI, mining, and visualization:

from graphistry import n, e, e_forward, e_reverse

# Graph analytics
g2 = g1.compute_igraph('pagerank')
assert 'pagerank' in g2._nodes.columns

# Graph ML/AI
g3 = g2.umap()
assert ('x' in g3._nodes.columns) and ('y' in g3._nodes.columns)

# Graph querying with GFQL
g4 = g3.chain([
    n(query='pagerank > 0.1'), e_forward(), n(query='pagerank > 0.1')
])
assert (g4._nodes.pagerank > 0.1).all()

# Upload for GPU server visualization session
g4.plot()

The automatic GPU modes require almost no code changes:

import cudf
from graphistry import n, e, e_forward, e_reverse

# Modified -- Rebind data as a GPU dataframe and swap in a GPU plugin call
g1_gpu = g1.edges(cudf.from_pandas(df))
g2 = g1_gpu.compute_cugraph('pagerank')

# Unmodified -- Automatic GPU mode for all ML, AI, GFQL queries, & visualization APIs
g3 = g2.umap()
g4 = g3.chain([
    n(query='pagerank > 0.1'), e_forward(), n(query='pagerank > 0.1')
])
g4.plot()

Explore 10 Minutes to PyGraphistry for a wider variety of graph processing.

PyGraphistry documentation

Graphistry ecosystem

Community and support

Contribute

See CONTRIBUTE and DEVELOP for participating in PyGraphistry development, or reach out to our team