Accelerated, Production-Ready Graph Analytics for NetworkX Users

jwitsoe · September 4, 2024, 7:40pm

Originally published at: https://developer.nvidia.com/blog/accelerated-production-ready-graph-analytics-for-networkx-users/

NetworkX is a popular, easy-to-use Python library for graph analytics. However, its performance and scalability may be unsatisfactory for medium-to-large-sized networks, which can significantly hinder user productivity. NVIDIA and ArangoDB have collectively addressed these performance and scaling issues with a solution that requires zero code changes to NetworkX. This solution integrates three main components: The…

doberoi · September 4, 2024, 9:37pm

The NVIDIA and ArangoDB teams put in a lot of effort to make this blog as clear and useful to you as possible. Please let us know if you have any questions, comments, or feedback.

haroondsai · September 6, 2024, 7:23pm

Hi Team,

Following the directions, I ran the code from the "Accelerated, Production-Ready Graph Analytics for NetworkX Users" blog, but I ran into a problem at the second code block.

# Median Time: 90 seconds

import pandas as pd
import networkx as nx

# Read into Pandas
pandas_edgelist = pd.read_csv(
    "cit-Patents.txt",
    skiprows=4,
    delimiter="\t",
    names=["src", "dst"],
    dtype={"src": "int32", "dst": "int32"},
)

# Create NetworkX Graph from Edgelist
G_nx = nx.from_pandas_edgelist(
    pandas_edgelist, source="src", target="dst", create_using=nx.DiGraph
)

Got an error.

ValueError Traceback (most recent call last)
in <cell line: 7>()
5
6 # Read into Pandas
----> 7 pandas_edgelist = pd.read_csv(
8 "cit-Patents.txt",
9 skiprows=4,

3 frames
/usr/local/lib/python3.10/dist-packages/pandas/io/parsers/c_parser_wrapper.py in read(self, nrows)
232 try:
233 if self.low_memory:
--> 234 chunks = self._reader.read_low_memory(nrows)
235 # destructive to chunks
236 data = _concatenate_chunks(chunks)

parsers.pyx in pandas._libs.parsers.TextReader.read_low_memory()

parsers.pyx in pandas._libs.parsers.TextReader._read_rows()

parsers.pyx in pandas._libs.parsers.TextReader._convert_column_data()

parsers.pyx in pandas._libs.parsers.TextReader._convert_tokens()

parsers.pyx in pandas._libs.parsers.TextReader._convert_with_dtype()

ValueError: Integer column has NA values in column 1

I downloaded the cit-Patents.txt file correctly, but the code still didn't work. Could you please help me out?

Regards,
Haroon

anthony.mahanna · September 6, 2024, 8:02pm

Hi Haroon!

Thanks for reaching out.

Interesting find. The stack trace you've shared shows that we're hitting the following lines in pandas: pandas/io/parsers/c_parser_wrapper.py in the pandas-dev/pandas repository on GitHub.

Some starter questions for you:

Which versions of networkx and pandas are you using?
How much memory do you have on your machine?
What is the OS of your machine?
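
The three questions above can be answered in one go. What follows is a minimal sketch that uses only the Python standard library; the helper names (`package_version`, `total_memory_gib`, `environment_report`) are hypothetical, and the memory lookup assumes a Unix-like system where `os.sysconf` is available (it returns `None` otherwise).

```python
import os
import platform
import sys
from importlib import metadata


def package_version(name):
    """Return the installed version of a package, or a marker if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


def total_memory_gib():
    """Return total physical memory in GiB, or None if it cannot be determined."""
    try:
        pages = os.sysconf("SC_PHYS_PAGES")
        page_size = os.sysconf("SC_PAGE_SIZE")
    except (AttributeError, ValueError, OSError):
        # os.sysconf is missing (e.g. Windows) or the names are unsupported.
        return None
    if pages <= 0 or page_size <= 0:
        return None
    return pages * page_size / 2**30


def environment_report():
    """Collect the details asked for above into one dictionary."""
    return {
        "python": sys.version.split()[0],
        "os": platform.platform(),
        "networkx": package_version("networkx"),
        "pandas": package_version("pandas"),
        "memory_gib": total_memory_gib(),
    }


for key, value in environment_report().items():
    print(f"{key}: {value}")
```

Pasting the printed lines into a reply covers the library versions, the available memory, and the operating system.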

I've just confirmed that these lines run successfully on a Google Colab CPU instance. You can check it out here: Google Colab

Happy to investigate further, just let me know.
Anthony

haroondsai · September 7, 2024, 3:20am

Hi Anthony,
Thanks for your detailed reply and for the Google Colab notebook. My issue has been resolved. Going forward, I will use a GPU and Colab Enterprise to run the code.
Best Regards,
Haroon
