MongoDB and PostgreSQL have several architectural differences.
Basic unit of storage
In MongoDB, the basic unit of storage is a document. A document is a JSON-like data structure made up of key-value pairs, which MongoDB serializes in a binary format called BSON. In these pairs, keys are strings and values can be of various data types. MongoDB supports data types including nested documents, arrays, strings, dates, Boolean values, and numbers.
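To make the shape of a document concrete, here is a minimal sketch that uses a plain Python dictionary as a stand-in. The `customer` document, its field names, and the `to_json` helper are all hypothetical, and the JSON text is produced only for display; it is not how MongoDB stores the data internally.

```python
import json
from datetime import datetime, timezone

# Hypothetical "customer" document; every field name here is illustrative.
customer = {
    "_id": 1001,
    "name": "Ada Lovelace",                                    # string
    "active": True,                                            # Boolean
    "orders_placed": 3,                                        # number
    "signed_up": datetime(2023, 5, 17, tzinfo=timezone.utc),   # date
    "tags": ["premium", "newsletter"],                         # array
    "address": {"city": "London", "postcode": "N1 9GU"},       # nested document
}


def to_json(document: dict) -> str:
    """Serialize the document to JSON text, writing dates as ISO 8601 strings."""
    return json.dumps(document, default=lambda value: value.isoformat(), sort_keys=True)


print(to_json(customer))
```

The nested `address` value and the `tags` array live inside the same document as the scalar fields, which is the main structural difference from a flat row.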
In contrast, PostgreSQL’s basic storage unit is a row, called a tuple. Each tuple holds a single record, and each value in it has the data type that its column defines. Columns can store integers, strings, dates, Booleans, and more. A primary key identifies each tuple within a table, and alongside the data values each tuple also carries internal metadata, such as the transaction information used for concurrency control.
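The row model can be sketched in the same way. The example below is a toy model under stated assumptions, not PostgreSQL code: `CustomerRow`, the `customers` mapping, and `insert_row` are hypothetical names, and a Python dictionary keyed by `id` stands in for a table with a primary key.

```python
from datetime import date
from typing import NamedTuple


class CustomerRow(NamedTuple):
    """One row of a hypothetical customers table; each field acts as a typed column."""
    id: int            # primary key column
    name: str
    signed_up: date
    active: bool


# Toy table: maps primary key -> row. A real table is stored very differently.
customers: dict[int, CustomerRow] = {}


def insert_row(row: CustomerRow) -> None:
    """Insert a row, rejecting duplicate primary keys as a relational table would."""
    if row.id in customers:
        raise ValueError(f"duplicate primary key: {row.id}")
    customers[row.id] = row


insert_row(CustomerRow(1, "Ada Lovelace", date(2023, 5, 17), True))
insert_row(CustomerRow(2, "Alan Turing", date(2024, 1, 9), False))
```

Every row has the same fixed set of columns, and the primary key is what makes a single row addressable.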
Query language
MongoDB uses MongoDB Query Language (MQL), which lets you interact with MongoDB’s document-oriented structure. MQL is rich in features and supports document querying, projection, aggregation pipelines, geospatial queries, and text searches.
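MQL expresses filters and projections as documents. The sketch below shows that shape with Python dictionaries and evaluates it with a toy in-memory matcher. The `matches` and `find` functions are hypothetical helpers written for this illustration, they support only equality, `$gt`, and `$in`, and they do not reflect how MongoDB evaluates queries.

```python
from typing import Any

# Filter and projection written the way MQL expresses them: as documents.
query_filter = {"status": "active", "age": {"$gt": 30}}
projection = {"name": 1, "age": 1, "_id": 0}


def matches(document: dict[str, Any], flt: dict[str, Any]) -> bool:
    """Toy matcher supporting equality and the $gt / $in operators only."""
    for field, condition in flt.items():
        value = document.get(field)
        if isinstance(condition, dict):
            for op, operand in condition.items():
                if op == "$gt":
                    if value is None or not (value > operand):
                        return False
                elif op == "$in":
                    if value not in operand:
                        return False
                else:
                    raise ValueError(f"unsupported operator: {op}")
        elif value != condition:
            return False
    return True


def find(collection: list[dict], flt: dict, fields: dict) -> list[dict]:
    """Return the projected fields of every document that matches the filter."""
    return [
        {name: doc[name] for name, keep in fields.items() if keep and name in doc}
        for doc in collection
        if matches(doc, flt)
    ]


people = [
    {"_id": 1, "name": "Ada", "age": 28, "status": "active"},
    {"_id": 2, "name": "Grace", "age": 45, "status": "active"},
    {"_id": 3, "name": "Linus", "age": 52, "status": "inactive"},
]
print(find(people, query_filter, projection))  # [{'name': 'Grace', 'age': 45}]
```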
PostgreSQL uses SQL as its query language, in its own dialect that extends the SQL standard. The extensions include an extensible type system, user-defined functions, and table inheritance. PostgreSQL conforms to much of the SQL standard, so most standard SQL queries run unchanged.
Indexing
An index is a data structure that maps the values of one or more fields or columns to the physical location of the corresponding data on disk. It makes data retrieval more efficient because the database can find matching records without scanning every one.
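The idea can be sketched with a sorted list and binary search. This is a deliberately simplified stand-in for a real index structure such as a B-tree: `rows`, `build_index`, and `lookup` are hypothetical names, and list positions play the role of on-disk locations.

```python
import bisect

# Toy "table": list position stands in for the on-disk location of each record.
rows = [
    {"id": 7, "city": "Oslo"},
    {"id": 3, "city": "Lima"},
    {"id": 9, "city": "Cairo"},
    {"id": 1, "city": "Lima"},
]


def build_index(table, column):
    """Return a sorted list of (value, row_position) pairs for one column."""
    return sorted((record[column], position) for position, record in enumerate(table))


def lookup(index, table, value):
    """Binary-search the index, then fetch the matching rows by position."""
    start = bisect.bisect_left(index, (value, -1))
    found = []
    for key, position in index[start:]:
        if key != value:
            break
        found.append(table[position])
    return found


city_index = build_index(rows, "city")
print(lookup(city_index, rows, "Lima"))
```

The lookup touches only the index entries for the requested value instead of reading every row, which is where the efficiency gain comes from.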
MongoDB uses indexes to optimize query performance. Its indexes use a B-tree data structure, and you can define them on any field or embedded field of the documents in a collection. It offers several index types, including single-field, compound, multikey, text, geospatial, hashed, and clustered indexes.
PostgreSQL also provides various index types, including B-tree, hash, GIN, GiST, SP-GiST, and BRIN. The CREATE INDEX command creates a B-tree index by default.
Concurrency
Concurrency is the ability of a database system to manage multiple transactions at the same time. Concurrency allows multiple users to access and modify data without causing inconsistency issues or conflicts.
MongoDB has concurrency control mechanisms that use document-level atomicity and optimistic concurrency control. It assumes that most concurrent write operations don’t conflict, which allows clients to modify different documents at the same time without blocking one another. When two writes do conflict on the same document, one of them is retried. Every modification to a single document is atomic, which means that it’s either fully applied or not applied at all. The storage engine also keeps multiple versions of a document, so readers can see a consistent snapshot while a write is in progress.
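The general pattern behind optimistic concurrency control is "read, modify, then write only if nothing changed in the meantime". The following sketch illustrates that pattern with a version number per document. It is a generic illustration, not MongoDB’s storage engine: `VersionedStore`, `update_if_unchanged`, and `increment_stock` are hypothetical names.

```python
import threading


class VersionedStore:
    """Toy in-memory document store with a version number per document."""

    def __init__(self):
        self._docs = {}                  # doc_id -> (version, document)
        self._guard = threading.Lock()   # makes each check-and-write step atomic

    def insert(self, doc_id, document):
        with self._guard:
            self._docs[doc_id] = (1, dict(document))

    def read(self, doc_id):
        with self._guard:
            version, document = self._docs[doc_id]
            return version, dict(document)

    def update_if_unchanged(self, doc_id, expected_version, new_document):
        """Apply the write only if nobody changed the document since it was read."""
        with self._guard:
            current_version, _ = self._docs[doc_id]
            if current_version != expected_version:
                return False             # conflict: the caller should retry
            self._docs[doc_id] = (current_version + 1, dict(new_document))
            return True


def increment_stock(store, doc_id, amount, max_retries=5):
    """Read-modify-write loop that retries when another writer got there first."""
    for _ in range(max_retries):
        version, doc = store.read(doc_id)
        doc["stock"] += amount
        if store.update_if_unchanged(doc_id, version, doc):
            return doc["stock"]
    raise RuntimeError("too many conflicting writers")
```

A writer never waits for a lock while it prepares its change. It only finds out at write time whether it lost a race, and in that case it rereads the document and tries again.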
PostgreSQL uses multi-version concurrency control (MVCC) to manage concurrent transactions. When a transaction changes data, MVCC creates a new version of the affected row instead of overwriting it, so readers don’t block writers and writers don’t block readers. The SQL standard defines four isolation levels: read uncommitted, read committed, repeatable read, and serializable. PostgreSQL accepts all four but treats read uncommitted as read committed. PostgreSQL also uses write-ahead logging (WAL), which records every change in a log before writing it to the data files so that the database can recover after a crash.
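The core idea of MVCC, that each write adds a new row version and each reader sees only the versions committed before its snapshot, can be sketched in a few lines. This toy model is an assumption-laden simplification: `MVCCTable` and its methods are hypothetical, every write commits immediately, and old versions are never cleaned up.

```python
import itertools


class MVCCTable:
    """Toy multi-version store: each write appends a new row version."""

    def __init__(self):
        self._versions = {}                     # key -> list of (commit_id, value)
        self._commit_ids = itertools.count(1)
        self._last_commit = 0

    def write(self, key, value):
        """Commit a new version of the row and return its commit id."""
        commit_id = next(self._commit_ids)
        self._versions.setdefault(key, []).append((commit_id, value))
        self._last_commit = commit_id
        return commit_id

    def snapshot(self):
        """Return the commit id that a new reader should treat as 'now'."""
        return self._last_commit

    def read(self, key, snapshot_id):
        """Return the newest version committed at or before snapshot_id."""
        visible = [
            value
            for commit_id, value in self._versions.get(key, [])
            if commit_id <= snapshot_id
        ]
        return visible[-1] if visible else None


table = MVCCTable()
table.write("balance", 100)
reader_snapshot = table.snapshot()      # a reader starts here
table.write("balance", 80)              # a later write adds a second version

print(table.read("balance", reader_snapshot))    # 100: the reader's view is stable
print(table.read("balance", table.snapshot()))   # 80: a new reader sees the update
```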
Availability
Availability means that the database stays accessible even during a server outage. MongoDB uses primary-secondary replication within replica sets, which are groups of nodes that hold copies of the same data. A single primary node receives the writes, and secondary nodes then replicate this data. If the primary node becomes unavailable, MongoDB automatically triggers a failover in which the remaining nodes elect a new primary. These processes minimize MongoDB’s downtime.
In contrast, PostgreSQL uses logical and streaming replication to support high availability. Logical replication selectively replicates specific tables or subsets of data. Streaming replication creates standby replicas that continuously receive changes from the primary database. PostgreSQL doesn’t elect a new primary on its own. Instead, an external tool such as PostgreSQL Automatic Failover (PAF) can promote a standby to primary if there’s a failure event.
Scalability
Both PostgreSQL and MongoDB can spread read operations across multiple replicas, which helps them scale read-heavy workloads. Both also distribute data across nodes to improve performance: PostgreSQL copies data between replicas, and MongoDB additionally moves data between shards.
MongoDB also uses sharding and read scaling to achieve a high level of horizontal scalability. Sharding distributes data across multiple servers, called shards, and each shard holds a subset of the data. This spreads the workload for high-traffic data sets across multiple servers. Secondary replicas can handle read operations, which helps to distribute the read workload and increase performance.
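As a rough illustration of how a shard key can decide where a document lives, the sketch below hashes the key and maps it to one of a fixed set of shards. The shard names and the `shard_for` and `distribute` helpers are hypothetical, and the modulo mapping is a simplification: MongoDB assigns ranges of shard key values or hashed values to shards rather than taking a plain modulo.

```python
import hashlib

SHARDS = ["shard-a", "shard-b", "shard-c"]   # hypothetical shard names


def shard_for(shard_key: str, shards=SHARDS) -> str:
    """Pick a shard from a stable hash of the shard key."""
    digest = hashlib.sha256(shard_key.encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


def distribute(keys):
    """Group keys by the shard that would store them."""
    placement = {name: [] for name in SHARDS}
    for key in keys:
        placement[shard_for(key)].append(key)
    return placement


user_ids = [f"user-{n}" for n in range(12)]
for shard, members in distribute(user_ids).items():
    print(shard, members)
```

Because the hash is stable, the same key always routes to the same shard, so a query that includes the shard key only needs to contact one server.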
PostgreSQL also offers partitioning, which splits large tables into smaller, more manageable parts. Its declarative partitioning supports range, list, and hash partitioning.
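Range partitioning can be pictured as routing each row to the partition whose bounds contain its key. The sketch below models that routing in Python for a hypothetical `orders` table partitioned by year. The partition names, `BOUNDS`, and `partition_for` are assumptions made for illustration, and PostgreSQL performs this routing internally.

```python
import bisect
from datetime import date

# Boundaries of hypothetical yearly partitions: lower bound inclusive, upper exclusive.
BOUNDS = [date(2022, 1, 1), date(2023, 1, 1), date(2024, 1, 1), date(2025, 1, 1)]
PARTITIONS = ["orders_2022", "orders_2023", "orders_2024"]


def partition_for(order_date: date) -> str:
    """Return the partition whose [lower, upper) range contains order_date."""
    index = bisect.bisect_right(BOUNDS, order_date) - 1
    if index < 0 or index >= len(PARTITIONS):
        raise ValueError(f"no partition covers {order_date}")
    return PARTITIONS[index]


print(partition_for(date(2023, 6, 30)))   # orders_2023
print(partition_for(date(2024, 1, 1)))    # orders_2024
```

A query that filters on the partition key only has to look at the partitions whose ranges overlap the filter, which is what makes large tables more manageable.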