Neo4j vs PostgreSQL

Created: November 2019

Description

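Neo4j can be substantially faster than PostgreSQL for relationship-heavy queries because of index-free adjacency: each node stores direct references to its neighbouring nodes, so traversing a relationship does not require the index lookups and joins that a relational database needs.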
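A toy sketch of the idea behind index-free adjacency, in plain Python. It is not how Neo4j or PostgreSQL store data internally; the airport codes and both data structures are made-up illustrations. The first function follows references held by the node itself, while the second has to search a list of rows, which is what an unindexed relational lookup amounts to.

```python
# Toy illustration only: made-up sample data, not real storage engines.

# Graph style: each origin airport holds direct references to its destinations.
adjacency = {
    "JAC": ["DEN", "SLC"],
    "CPR": ["DEN"],
    "DEN": ["JAC"],
}

# Relational style: flights are rows of (origin, destination).
flights = [
    ("JAC", "DEN"),
    ("JAC", "SLC"),
    ("CPR", "DEN"),
    ("DEN", "JAC"),
]


def destinations_graph(origin):
    """Follow stored references; the work depends on this node's degree."""
    return sorted(set(adjacency.get(origin, [])))


def destinations_table(origin):
    """Scan every row for a match; the work grows with the whole table."""
    return sorted({dest for orig, dest in flights if orig == origin})


print(destinations_graph("JAC"))
print(destinations_table("JAC"))
```

Both calls return the same destinations; the difference is how much data each one has to touch to find them.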

Example: Get the names of destination airports from all flights originating in Wyoming

Cypher (Neo4j query language):

MATCH (hi:Airport {state: 'WY'})-[:HAS_DEPARTURE]->(fl:Flight)-[:FLIES_TO]->(ap:Airport)
RETURN DISTINCT ap.name

SQL (PostgreSQL):

SELECT name FROM airports
JOIN flights ON (airports.iata = flights.destination_airport)
WHERE flights.origin_airport IN
    (SELECT iata FROM airports WHERE airports.state = 'WY')
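To see what the SQL query returns, here is a minimal sketch that runs it against an in-memory SQLite database rather than PostgreSQL. The table and column names are taken from the query; the column types and the sample rows are assumptions made up for illustration.

```python
import sqlite3

# In-memory stand-in for the real database; schema and rows are assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE airports (iata TEXT PRIMARY KEY, name TEXT, state TEXT);
    CREATE TABLE flights (origin_airport TEXT, destination_airport TEXT);
    INSERT INTO airports VALUES
        ('JAC', 'Jackson Hole', 'WY'),
        ('DEN', 'Denver International', 'CO'),
        ('SLC', 'Salt Lake City International', 'UT');
    INSERT INTO flights VALUES
        ('JAC', 'DEN'), ('JAC', 'SLC'), ('JAC', 'DEN'), ('DEN', 'SLC');
""")

query = """
    SELECT name FROM airports
    JOIN flights ON (airports.iata = flights.destination_airport)
    WHERE flights.origin_airport IN
        (SELECT iata FROM airports WHERE airports.state = 'WY')
"""

# One row comes back per matching flight, so names can repeat.
rows = [row[0] for row in conn.execute(query)]
destination_names = sorted(set(rows))

print(len(rows), destination_names)
```

With this sample data the query returns one row per flight leaving Wyoming, so a destination served by several flights appears more than once. The Cypher version removes such repeats with DISTINCT; the sketch does the same in Python with set().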

Query execution time (seconds):

Database     Attempt 1     Attempt 2     Attempt 3     Average
Neo4j        0.29222047    0.34756797    0.30003158    0.32918392
PostgreSQL   2.32982001    2.14832110    2.39080437    2.29873976
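Per-attempt timings like these can be collected with a small harness. The sketch below is one possible approach, not necessarily the method used in the notebook; time_attempts and placeholder_query are hypothetical names, and the placeholder workload stands in for a function that would execute the real Cypher or SQL query.

```python
import time
from statistics import mean


def time_attempts(run_query, attempts=3):
    """Return the wall-clock duration in seconds of each call to run_query."""
    durations = []
    for _ in range(attempts):
        start = time.perf_counter()
        run_query()
        durations.append(time.perf_counter() - start)
    return durations


def placeholder_query():
    """Stand-in workload; swap in a function that runs the actual query."""
    return sum(range(100_000))


durations = time_attempts(placeholder_query)
print([f"{d:.8f}s" for d in durations], f"average {mean(durations):.8f}s")
```

time.perf_counter is used because it is a monotonic, high-resolution clock intended for measuring short durations.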

Context

This performance comparison was completed for the final project of my SCS 3252:017 Big Data Management Systems & Tools course (University of Toronto).

The accompanying PDF and Jupyter notebook investigate how Neo4j can outperform PostgreSQL on large, interconnected datasets.

Technology Stack

  • Neo4j
  • PostgreSQL
  • Jupyter Notebook
  • Google Colab