What tools do you use to visualize and analyse very large complex networks?

Alexandre Nóbrega Duarte

02 July 2013

I'm currently using Gephi to try to visualize a network with 50,000 nodes and 84,000 edges, but it is taking ages just to lay out the network.

Do you have any better suggestions of tools for this?

Fabio Lorenzo Traversa

Writing your own code in C++ is the best way. Of course, it requires that you are an expert C++ user. Actually, I use MATLAB, but it is very sensitive to how you implement your routines; even in this case you must be quite expert to get good results. With MATLAB I can handle complex networks of up to 100,000 nodes and 300,000 edges without any problem.

Rebecca Cunningham

I use UCINET, which is excellent for analytics and visualisation: http://www.analytictech.com/ucinet/

However, it may struggle with your large dataset.

You could try SIROSOM, developed by CSIRO; contact Stephen Fraser at CSIRO QCAT.

However, SIROSOM focuses on showing the underlying patterns rather than relationships.

Finally I would recommend Pajek

http://pajek.imfm.si/doku.php?id=download

Good luck

Arun Kumar

You can use the SolarWinds network management tool.

Johannes Hercher

Try Cytoscape (http://www.cytoscape.org/). It is open source and suited to the visualization and analysis of very large networks.

Alexandru Topirceanu

I work quite a lot with Gephi and have found that OpenOrd layout does the job.

https://marketplace.gephi.org/plugin/openord-layout/

Vladimir Gonzalez Gamboa

Pajek can work with millions of nodes; igraph and statnet in the R environment are my recommendations!

Stuart R Borrett

Like Vladimir, I recommend Pajek for a network as big as yours, but I also find the R toolboxes useful.

Muddsair Sharif

Hi, I have recent experience with this and have some suggestions for you.

If you are good at programming, then try Visiblox for data representation.

If you do not have that much programming experience, then try RapidMiner and Weka for representing your social dataset under certain criteria.

If you run into any difficulty with RapidMiner, I can give you some hints on how to do it.

You are welcome to ask if you have any further questions.

Regards

Muddsair

Khalil Abuosba

Here are some good resources that shall help you in your selection:

1. Top-20 Tools

http://www.netmagazine.com/features/top-20-data-visualisation-tools

(No. 20 is Gephi)

2. Data Visualization Tools

http://en.wikipedia.org/wiki/Data_visualization

3. Pajek - Program for Large Network Analysis

http://pajek.imfm.si/doku.php?id=pajek

4. Graphviz - Graph Visualization Software

http://www.graphviz.org/

5. WikiViz

http://www.wikiviz.org/wiki/Tools

Good Luck.

Renaud Lambiotte

There are efficient libraries (NetworkX in Python), but among user-friendly programs, I would recommend Gephi.

J. Andres Dominguez-Gómez

Good point Zoltan. Clustering is the first step for "discrete" solutions.

Aurélien Mazurie

Visualizing such a huge network, while satisfying for the eye, may not be the best approach to learn anything about this network. What you usually get, unless your network has a very peculiar topology, is a 'hairball effect', with no discernible structure.

What you should do instead is either

(1) to describe your network using various topological descriptors of its size, density, complexity, etc. and compare it with other networks to get a sense of your network's peculiarities. For example, what does the degree distribution look like?

(2) to reduce the complexity of your network by condensing nodes into supernodes, based on various clustering techniques (Girvan-Newman is a good start)
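As a rough illustration of point (1), here is a minimal sketch in plain Python that computes a few of these descriptors (size, density, mean degree, degree distribution) from an edge list. The function name `describe` and the toy edge list are hypothetical, and the sketch assumes an undirected graph with no self-loops or duplicate edges; libraries such as NetworkX or igraph provide equivalent, more complete measures.

```python
from collections import Counter

def describe(edges):
    """Basic descriptors of an undirected simple graph given as a list of (u, v) pairs."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    n = len(degree)   # only nodes that appear in at least one edge are counted
    m = len(edges)
    # density = fraction of possible node pairs that are actually connected
    density = 2 * m / (n * (n - 1)) if n > 1 else 0.0
    # degree distribution: degree value -> number of nodes with that degree
    distribution = dict(sorted(Counter(degree.values()).items()))
    return {
        "nodes": n,
        "edges": m,
        "density": density,
        "mean_degree": 2 * m / n if n else 0.0,
        "degree_distribution": distribution,
    }

# Hypothetical toy edge list, for illustration only
toy_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("c", "d")]
summary = describe(toy_edges)
print(summary)
```

On a real network, a heavy-tailed degree distribution (many low-degree nodes, a few hubs) is usually more informative than any full-network picture.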

But I'm human too, so I understand the thrill of actually visualizing a huge amount of data in search of structure. All modern network visualization packages (Gephi and Cytoscape are good leading names) offer various layout algorithms that are difficult not to play with. Be aware that they are not always that useful from a scientific standpoint.

Elizabeth Rasnick

I have used Pajek and was very happy with the results. It was not difficult to learn and allows you to graph complex social networks.

Jain Nitin

I came to know about a new view for budding researchers like me. Social networking gives a challenging start, with practical examples that common people can use.

Konrad Fuks

I've got a similar problem... I tried PajekXXL and Gephi, but neither of them can visualize a network with >20k vertices and >700k edges. Well, they can, but the visualization is unreadable :( Big thanks to Aurélien and Zoltan, I'll try the clustering techniques you've mentioned.

Paulo Milheiro Mendes

Besides Gephi (v0.8.2) I already tested Cytoscape v2.8.3 (http://www.cytoscape.org/).

Alexandre Nóbrega Duarte

Thank you for all your answers.

Gephi does a pretty decent job of calculating most of the required metrics, and it works very well with the independent clusters in my network.

The only real problem was laying out the entire network. I tried it with Cytoscape, and it could handle it appropriately.

Thanks again.

Jain Nitin

Gephi is the best for large number of nodes...

Jain Nitin

Have a look at the site for the tool...

http://www.cytoscape.org/

Jain Nitin

Cytoscape is an open source software platform for visualizing molecular interaction networks and biological pathways and integrating these networks with annotations, gene expression profiles and other state data. Although Cytoscape was originally designed for biological research, now it is a general platform for complex network analysis and visualization.

William J.R. Longabaugh

You could take a look at BioFabric (www.BioFabric.org). It represents nodes as horizontal lines instead of as points, so you can get unambiguous visualizations of networks such as yours (50K nodes, 84K edges). A quick demo is available at: http://www.biofabric.org/gallery/pages/SuperQuickBioFabric.html There is also a simplified version in R: https://github.com/wjrl/RBioFabric Hope you find it useful!

Aurélien Mazurie

BioFabric is a fascinating idea; probably one of the best I've seen in the past years in terms of network visualization. I am still wrapping my mind around it, but it seems like it allows for a very quick identification of hubs (nodes with high degree) and communities (set of nodes with few connections with the rest of the network). This is pretty much what biologists look for when visualizing networks. Excellent work!

William J.R. Longabaugh

Thanks very, very much! I'm continuing to develop it, and I'm trying to post regularly about BioFabric examples and features on my blog at http://biofabric.blogspot.com

Dr. A. Abdul Rasheed

Hi. The tools listed by Mr. Khalil are some of the best choices for visualizing large complex networks. You can also try Walrus. It can be used to visualize very large complex networks, and it works on a client/server basis. Try it. All the best.

Jesús M. Siqueiros-García

Most of the time I use Cytoscape 2.8/3.0.2 (and for some purposes I use NetworkX). With Cytoscape I was able to visualize a network of 13,000 nodes and 215,000 edges on a machine with 32 GB of RAM. It didn't take that long to display the graph, although basic analyses took a while (several hours).

Frederic Andres

I am wondering whether there is any available benchmark, including a data set, for comparing how these tools perform.

D. Kent Arrell

While I typically use Cytoscape for network visualization and analysis (with Network Analyzer, BiNGO, etc.), and have tried Pajek and several others already mentioned, I thought I would bring up an additional format that nobody here has yet touched on. They are called hive plots, which are designed to reduce visual complexity and may simplify network visualization. Hive plots were created by Martin Krzywinski at UBC, the same person who developed Circos (http://www.hiveplot.net/). By William's description, it sounds like BioFabric may be conceptually similar to hive plots. I'm not aware of associated tools to enable network analysis with hive plots, however, but my feeling is that hive plots will be most useful as a concise visual tool for comparing and contrasting multiple large networks.

William J.R. Longabaugh

I know there is an R package to create hive plots (see http://cran.r-project.org/web/packages/HiveR/index.html) and there is the RCytoscape project (see http://www.biomedcentral.com/1471-2105/14/217/abstract) that allows R to talk to Cytoscape, so perhaps some combination of these could get you the ability to do network analysis while using a hive plot visualization.

But as for the comparison of BioFabric and hive plots, I want to emphasize that there is actually a big conceptual difference between the two approaches. Hive plots provide a way of compactly organizing the nodes within the framework of a traditional node-link diagram, and thus are fundamentally a new layout technique for this traditional framework. BioFabric abandons the traditional representation of "nodes as points" and instead represents nodes as lines, which allows a scalable and unambiguous presentation of the network edges.

D. Kent Arrell

Thanks for the response William. What I meant by "conceptually similar" is from a visual standpoint, not a structural one. I'm referring to the generation of recognizable network patterns, not to underlying concepts related to network assembly.

I'm not implying that Hive Plots and BioFabric are assembled in a common structural fashion. Instead, what I mean is that regardless of network size, both BioFabric and Hive Plot generate readily discernible visual patterns, which theoretically may convey similarities and/or differences when comparing large networks to one another. That is their shared conceptual element at which typical large network layouts, i.e. hairballs, fail miserably.

I'm in need of a useful tool to compare/contrast 3 networks of up to 3000 nodes each, so I will definitely give BioFabric a look, William, especially if it can provide clear visual evidence to support or emphasize network similarity.

William J.R. Longabaugh

Ah, I got it! Yes, in that regard they are similar.

For comparing networks using BioFabric, I would highly recommend using the Layout Using Node Attributes feature to specify a common order between the networks. A little description of this is here in my blog:

http://biofabric.blogspot.com/2013/05/banish-bipartite-blues.html

Another thing to think about is using the Link Groups feature, in combination with the Shadow Links feature, which are discussed here:

http://biofabric.blogspot.com/2013/07/i-guess-caltech-students-do-have-social.html

With link groups and shadow links, you could create a single network that combines the three networks you are studying, then divide them up into three link groups. Then you could compare the networks on a node-by-node basis.

Thanks for your interest in BioFabric!

Bill

D. Kent Arrell

Thanks Bill. I was going to ask about organizing node layouts the same across the multiple networks, critical for any visual comparison. I've done that previously using the Cytoscape ReOrient Plug-in for inter-network comparisons.

Since your org. is already part of the Cytoscape consortium, and BioFabric also uses the .sif file format, have you given any thought to including BioFabric as a component of Cytoscape?

William J.R. Longabaugh

With the new Cytoscape 3 OSGi architecture, it should be possible to make BioFabric a Cytoscape App (formerly a "plug-in"). It's on the wish list, but there are no resources at the moment to make it happen. At the moment, BioFabric can communicate with older versions of Cytoscape using the Cytoscape Gaggle plug-in and the BioFabric Gaggle-enabled version. So you can analyze networks in Cytoscape while viewing them in BioFabric, though I cannot speak to how Gaggle would scale with really large networks.

Any tighter integration of BioFabric with Cytoscape (e.g. as an alternate renderer) would be more difficult, so the separate App approach is favored at the moment.

Thanks for asking.

Bill

Robert E. Ulanowicz

Dear Alexandre, We have developed a suite of methods to analyze complex ecological networks. They were developed for weighted digraphs, but some of the methods are also applicable to large, simple graphs. Good luck! Bob Ulanowicz

Max Duckwitz

I'd use R and find me a package that visualises whatever I am looking for.

Hamid Darvish

Pajek for large network analysis should be fine. It is very fast and accurate.

Spyros Angelopoulos

I personally prefer R and iGraph for visualisation of networks! They have served me well!

Rob Christley

For a network of this size, I think Pajek is your best bet. The book "Exploratory Social Network Analysis with Pajek" could be helpful for getting started.

Rainer Simon

Hi, if a bit of programming (rather than using a GUI tool) is ok for you, you might give this a try: https://github.com/rsimon/scala-force-layout (disclaimer: it's my own project). Should handle a graph of that size quite reasonably.

Cheers,

Rainer

Ajith Abraham

See the implementation of DBLP data: http://www.forcoa.net

Vladimir Gonzalez Gamboa

I can also recommend Pajek, and depending on the analysis, statnet may be suitable for you. I am not sure whether RSiena also works with very large networks!

Carlo Drago

Pajek is specifically oriented to the visualization and analysis of large networks. You can find useful materials, such as documentation and examples, here:

http://vlado.fmf.uni-lj.si/pub/networks/pajek/

R and the igraph package can be used as well, for example to detect communities in large networks. Documentation is available here:

http://igraph.sourceforge.net/doc/R/00Index.html

Kind Regards.

Konstantinos Antonakopoulos

Dear Alexandre,

I used to work with Pajek, but it has various configuration problems, and you don't really have the opportunity to understand many of the built-in features.

I have started working with Cytoscape, which is used for complex (biological) network analysis and is also open source. That means you can dig into the code and do things the way you wish.

Another great improvement in Cytoscape 3.0 is the DynNetwork plugin, with which you can build dynamic networks as XGMML files. This gives you the possibility of visualizing the network as it evolves over time.

What can be cooler than that?

Best regards,

Konstantinos

Andrej Mrvar

...Pajek but has various configuration problems

Can you please be more explicit about what 'various configuration problems' stands for?

It sounds like a very general statement...

...dont really have the opportunity to understand many of the built-in features.

There is a book: "Exploratory Social Network Analysis with Pajek"

published by the Cambridge University Press explaining most of the built in features:

http://www.cambridge.org/us/academic/subjects/sociology/research-methods-sociology-and-criminology/exploratory-social-network-analysis-pajek-2nd-edition

It is available also in Japanese:

http://www.tdupress.jp/books/isbn978-4-501-54710-3.html

and Chinese:

http://product.dangdang.com/22927985.html

A Pajek manual listing the commands with short explanations also exists:

http://mrvar.fdv.uni-lj.si/pajek/pajekman.pdf

Konstantinos Antonakopoulos

Thanks for your reply Andrej.

It is not my intention to disparage the product, as it does a great job :)

When I started using it in 2008, it was very difficult to get support information for the product, so after using it for a while, I switched to Cytoscape, which is open source and works under Ubuntu.

My other problem was how to visualize the evolution of a network over time. As I explained above, Cytoscape does this nicely using the DynNetwork plugin.

Kind regards

Vincenzo Nicosia

What is the point of "visualizing" (i.e., drawing on a screen) a network with 10^5 nodes? It would certainly look like a meaningless mass of points and lines. The best "visualization" tool for such networks I can suggest is the quantitative analysis of their structural properties, community structure and hierarchical organisation.

Andrej Mrvar

Agree. For such networks, analysis to find some interesting parts of the network (e.g. communities, dense subnetworks, fragments...) should be done first. Then visualization of these parts separately, or visualization of the connections among 'shrunken' nodes (e.g. representing communities), can be useful.
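A minimal sketch of the 'shrunken nodes' idea in plain Python, assuming a community assignment has already been computed by some other tool. The function name `shrink`, the toy edge list and the toy partition are all hypothetical; Pajek and other packages have their own built-in operations for this.

```python
from collections import Counter

def shrink(edges, community_of):
    """Collapse each community into one supernode.

    Returns (between, within): edge counts between pairs of communities
    and edge counts inside each community.
    """
    between = Counter()
    within = Counter()
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        if cu == cv:
            within[cu] += 1
        else:
            # sort the pair so (A, B) and (B, A) count as the same undirected link
            between[tuple(sorted((cu, cv)))] += 1
    return dict(between), dict(within)

# Hypothetical toy data: two triangles joined by two edges
toy_edges = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5), (5, 6), (4, 6), (1, 6)]
toy_partition = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}
between, within = shrink(toy_edges, toy_partition)
print(between, within)
```

The shrunken network has one node per community and weighted links between them, which is usually small enough to draw and read directly.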

Sakshi Pahwa

Hi Alexandre,

I have also used Kinemage for some visualization. However, Kinemage requires input files to be in Pajek format so both need to be used together.

Besides, I've found Cytoscape also a very useful tool for visualization (as discussed above).
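For anyone who needs to produce the Pajek format mentioned above from their own data, here is a minimal sketch that writes a basic undirected Pajek .net file from an edge list. The function name `to_pajek` and the sample edges are hypothetical, and the sketch only covers the simplest case (a `*Vertices` section with quoted labels and an unweighted `*Edges` section); labels containing double quotes are not handled.

```python
def to_pajek(edges):
    """Return the text of a minimal Pajek .net file for an undirected edge list."""
    labels = []   # vertex labels in order of first appearance
    index = {}    # label -> 1-based Pajek vertex number
    for u, v in edges:
        for node in (u, v):
            if node not in index:
                index[node] = len(labels) + 1   # Pajek vertex numbers start at 1
                labels.append(node)
    lines = [f"*Vertices {len(labels)}"]
    lines += [f'{i} "{label}"' for i, label in enumerate(labels, start=1)]
    lines.append("*Edges")
    lines += [f"{index[u]} {index[v]}" for u, v in edges]
    return "\n".join(lines) + "\n"

# Hypothetical sample data
net_text = to_pajek([("alice", "bob"), ("bob", "carol")])
print(net_text)
```

Writing `net_text` to a file with a `.net` extension gives something that Pajek-compatible tools should be able to open.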

Roger Achkar

I totally agree with Nikos. A python package where the core is implemented in C++ will do the work.

Susmita Bag

Hello everyone..

I was just going through the discussion here regarding gene networks and visualization. I have a small query: how do I carry out topological analysis with Cytoscape 3.0.2? I have a network with 20 genes, but I could not carry out the topological analysis of the network.

Please suggest a plugin that I can use in Cytoscape.

William J.R. Longabaugh

Susmita:

Have you looked at the Network Analyzer that is now part of the Cytoscape core? Select Tools->NetworkAnalyzer->Network Analysis. That might get you started.

William J.R. Longabaugh

Vincenzo:

Sorry I missed your posting from a few weeks ago, re: "What is the point of "visualizing" (i.e., drawing on a screen) a network with 10^5 nodes? It would certainly look like a meaningless mass of points and lines."

My original response to this question, from July 2013, has long since scrolled off the top of this thread. Please consider taking a look at BioFabric (www.BioFabric.org). It represents nodes as horizontal lines instead of as points, so you can get unambiguous visualizations of networks with 10^5 nodes and more. A quick demo is available at: http://www.biofabric.org/gallery/pages/SuperQuickBioFabric.html There is also a simplified version in R: https://github.com/wjrl/RBioFabric

Bill

Gauthier Vandemoortele

Maybe this helps, I've just found a recent article about that:

Recent Large Graph Visualization Tools : A Review

https://www.jstage.jst.go.jp/article/imt/8/4/8_944/_article

«The tools being reviewed in this paper are igraph, Gephi, Cytoscape, Tulip, WiGis, CGV, VisANT, Pajek, In Situ Framework, Honeycomb and two visualization toolkits which are JavaScript InfoVis Toolkit and GraphGL.»

Regards

Simon S. Li

SNAP or igraph; these are C/C++ projects.

Hamid Darvish

Although Gephi creates graphically nice maps, I would suggest Pajek for large networks; with an optimized PC it could be a good choice.

AbdulKhalique Shaikh

Hi Khalil Abuosba,

The following link is currently not available:

1. Top-20 Tools

http://www.netmagazine.com/features/top-20-data-visualisation-tools

It redirects to the following website:

http://www.creativebloq.com/net-magazine

Any suggestions?

I need an open-source data analysis and visualization tool that supports web deployment.

Thanks

Ezequiel Tacsir

Pajek is quite useful

William J.R. Longabaugh

If by "very large" we're talking about the original questioner's target of 50,000 nodes and 84,000 edges, BioFabric (www.BioFabric.org; which I mentioned above) can actually handle that quite well. Note that since BioFabric uses a breadth-first search for layout, a network of that size does not take long to display.
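To give a feel for why a breadth-first layout is cheap, here is a minimal sketch of a breadth-first node ordering in plain Python. It is not BioFabric's actual code; the function name `bfs_order`, the toy adjacency dictionary, and the choices to start from the highest-degree node and visit higher-degree neighbours first are all assumptions made for illustration.

```python
from collections import deque

def bfs_order(adjacency, start):
    """Return nodes in breadth-first order from start (one row per node, top to bottom)."""
    order = []
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        # assumption for illustration: visit higher-degree neighbours first,
        # breaking ties by name so the result is deterministic
        for nb in sorted(adjacency[node], key=lambda x: (-len(adjacency[x]), x)):
            if nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return order

# Hypothetical toy network as an adjacency dictionary
toy = {"hub": ["a", "b", "c"], "a": ["hub", "b"], "b": ["hub", "a"], "c": ["hub"]}
start = max(toy, key=lambda n: len(toy[n]))   # start from the highest-degree node
rows = bfs_order(toy, start)
print(rows)
```

Each node and edge is touched once apart from the neighbour sorting, so the cost grows roughly linearly with network size, unlike iterative force-directed layouts. Note that this sketch only reaches the connected component containing the start node.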

Bin Jiang

We used head/tail breaks in order to see the underlying scaling pattern of far more less-connected nodes than well-connected ones:

Jiang B. (2013), Head/tail breaks: A new classification scheme for data with a heavy-tailed distribution, The Professional Geographer, 65 (3), 482 – 494.

Ma D., Sandberg M., and Jiang B. (2015, accepted), Characterizing the heterogeneity of the OpenStreetMap data and community, ISPRS International Journal of Geo-Information, xx(x), xx-xx, Preprint: http://arxiv.org/abs/1503.06091

Jiang B., Duan Y., Lu F., Yang T. and Zhao J. (2014), Topological structure of urban street networks from the perspective of degree correlations, Environment and Planning B: Planning and Design, 41(5), 813-828.

Jiang B. and Ma D. (2015), Defining least community as a homogeneous group in complex networks, Physica A: Statistical Mechanics and its Applications, 428, 154-160.
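As a rough illustration of the head/tail breaks idea applied to node degrees, here is a minimal sketch in plain Python. It repeatedly splits the data at the mean and keeps recursing into the values above the mean (the head) for as long as the head remains a minority. The function name `head_tail_breaks`, the 40% limit used as the default `head_limit`, the exact stopping rule and the toy degree list are assumptions for illustration; check the first paper above for the authoritative definition.

```python
def head_tail_breaks(values, head_limit=0.4):
    """Return the list of break values (successive means) for heavy-tailed data."""
    breaks = []
    data = list(values)
    while len(data) > 1:
        mean = sum(data) / len(data)
        head = [x for x in data if x > mean]
        # stop when nothing lies above the mean or the head is no longer a minority
        if not head or len(head) / len(data) > head_limit:
            break
        breaks.append(mean)
        data = head   # recurse into the head only
    return breaks

# Hypothetical toy degree sequence: many low-degree nodes, a few hubs
toy_degrees = [1] * 20 + [10] * 6 + [50] * 2 + [200]
breaks = head_tail_breaks(toy_degrees)
print(breaks)
```

With two breaks, the toy degrees fall into three classes, with most nodes in the lowest class and only the largest hub in the highest.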

Rui Sarmento

Dear Alexandre,

We've been researching streaming algorithms and visualization methods or tools for online social networks visualization. Feel free to check our research. If you need any help I'll be glad to provide further assistance.

Regards,

Rui

Federico Giorgi

Cytoscape can handle any network size, with thousands of extra functions for coloring nodes, rendering, automatic layout, and so on.

https://cytoscape.org/

Leo Meyerovich

Graphistry handles this fine, as it is the only end-to-end GPU system: https://github.com/graphistry/pygraphistry (or graphistry.com for work)

Robert Alexander

Just out of curiosity: I really like Gephi but some of my graphs and layouts make it grind to a crawl. What are the underlying HW features that might speed it up? More cores, more threads, faster clock, more RAM, GPU's ... ???? No idea as to how Java can be sped up on a workstation. Thanks

Chinenye Matthew-Emmanuel Ezeh

Python packages such as NetworkX and Matplotlib (via its pyplot module) will do the job.

The visualization will certainly be cluttered but you can reduce the node sizes to the barest minimum.

Amir Reza Mohammadi

I use GraphStream in Java. It's a powerful tool: you can modify the graph live to make animations and spread flows, and it uses CSS to style the graph.

Neo4j Bloom is also a great tool for visualizing large graphs. You can import your graph from a CSV edge list and visualize it in Bloom.

Pablo Enrique Roman

Actually, you can write a paper if you build a really useful large graph visualization tool for a given area. Therefore it depends on the choice of what to highlight to the user. When a graph is very large, a large number of complex structures arise at different scales. Never underestimate the visualization field, I suggest building your own tool.

Tim Angus

https://graphia.app/

https://www.biorxiv.org/content/10.1101/2020.09.02.279349v1

Anton Vrdoljak

https://www.researchgate.net/post/Is_there_a_visualization_tool_for_a_very_large_graph_15_million_nodes_and_6_millions_edges2
