PostgreSQL’s JSONB type is one of the most powerful and most misused features in the database. This post covers the practical details: storage differences, operators, GIN indexes, mutation functions, and the patterns where JSONB earns its place.
json vs jsonb: Always Use jsonb#
PostgreSQL has two JSON types: json and jsonb. The difference matters in production.
json | jsonb | |
|---|---|---|
| Storage | Stores the raw text, preserves whitespace and key order | Stores a decomposed binary representation |
| Read performance | Must re-parse on every read | Binary format, no re-parse needed |
| Write performance | Slightly faster (no decomposition) | Slightly slower (decomposes at write time) |
| Indexing | No index support | GIN and btree indexes supported |
| Key deduplication | Keeps duplicates | Last-write wins on duplicate keys |
Never use json in a new schema. The only reason json exists is for cases where you need to preserve exact key order or duplicate keys (rare auditing use cases). For everything else, jsonb is strictly better because it can be indexed and queried efficiently.
Querying: ->, ->>, #>, #>>#
The four navigation operators:
->returns a JSON sub-object or array element (type:jsonb)->>returns the value as text (type:text)#>navigates a path array, returnsjsonb#>>navigates a path array, returnstext
CREATE TABLE events (
id bigserial PRIMARY KEY,
payload jsonb NOT NULL,
created_at timestamptz NOT NULL DEFAULT now()
);
INSERT INTO events (payload) VALUES
('{"type": "order.created", "user": {"id": 42, "email": "alice@example.com"}, "items": [{"sku": "A1", "qty": 2}]}'),
('{"type": "order.shipped", "user": {"id": 42, "email": "alice@example.com"}, "tracking": "1Z999"}');
-- Get the event type as jsonb (quoted string in output)
SELECT payload -> 'type' FROM events;
-- Get the event type as plain text
SELECT payload ->> 'type' FROM events;
-- Navigate nested path: user.email
SELECT payload #>> '{user, email}' FROM events;
-- Filter by nested value
SELECT * FROM events
WHERE payload ->> 'type' = 'order.created'
AND (payload #>> '{user, id}')::int = 42;Containment Operators: @> and <@#
The containment operators are the most powerful JSONB querying tools, and they work with GIN indexes.
A @> B means “does A contain B as a subset of keys/values?”
A <@ B means “is A contained within B?”
-- Find all events where the user id is 42
SELECT * FROM events
WHERE payload @> '{"user": {"id": 42}}';
-- Find events of specific types
SELECT * FROM events
WHERE payload @> '{"type": "order.created"}';
-- Find events that have a tracking field at all
SELECT * FROM events
WHERE payload ? 'tracking';
-- Find events where type is one of a set
SELECT * FROM events
WHERE payload ->> 'type' = ANY(ARRAY['order.created', 'order.shipped']);GIN Indexes: What They Accelerate#
A GIN (Generalized Inverted Index) on a JSONB column indexes all keys and values within the document, enabling fast containment and existence queries.
-- Default GIN with jsonb_ops: supports @>, ?, ?|, ?&
CREATE INDEX idx_events_payload_gin ON events USING gin (payload);
-- GIN with jsonb_path_ops: smaller index, only supports @>
-- Better when you only need containment queries
CREATE INDEX idx_events_payload_path_ops ON events USING gin (payload jsonb_path_ops);
-- B-tree index on a specific extracted value (for equality + range on one field)
CREATE INDEX idx_events_type ON events ((payload ->> 'type'));
CREATE INDEX idx_events_user_id ON events (((payload #>> '{user, id}')::int));Use jsonb_path_ops when all your queries use @> containment. It produces a smaller, faster index because it only indexes values, not keys. Use the default jsonb_ops when you also need ? (key existence), ?|, or ?& operators.
The B-tree index on an extracted expression is the right choice when you have a specific field that is queried frequently with equality or range conditions. It is significantly faster than GIN for single-field lookups.
Updating JSONB: jsonb_set, ||, and Key Removal#
Updating a JSONB field requires constructing a new value. PostgreSQL provides several functions for this.
-- jsonb_set(target, path, new_value, create_missing)
-- Update user.email in place
UPDATE events
SET payload = jsonb_set(payload, '{user, email}', '"bob@example.com"', false)
WHERE id = 1;
-- jsonb_insert(target, path, new_value, insert_after)
-- Insert a new element into the items array at position 0
UPDATE events
SET payload = jsonb_insert(payload, '{items, 0}', '{"sku": "B2", "qty": 1}', false)
WHERE id = 1;
-- Merge / shallow merge with the || operator
UPDATE events
SET payload = payload || '{"status": "processed", "processed_at": "2025-01-01"}'
WHERE id = 1;
-- Remove a key with the - operator
UPDATE events
SET payload = payload - 'tracking'
WHERE id = 2;
-- Remove a nested path with the #- operator
UPDATE events
SET payload = payload #- '{user, email}'
WHERE id = 1;The || operator performs a shallow merge, not a deep merge. If both the left and right side have the same nested object key, the right side replaces the left entirely, rather than merging recursively. Use jsonb_set for targeted nested updates.
Unnesting: jsonb_array_elements and jsonb_each#
-- Expand a JSONB array into rows
SELECT
e.id,
item ->> 'sku' AS sku,
(item ->> 'qty')::int AS qty
FROM events e,
jsonb_array_elements(e.payload -> 'items') AS item
WHERE e.payload @> '{"type": "order.created"}';
-- Expand all key-value pairs of a JSONB object into rows
SELECT key, value
FROM events,
jsonb_each(payload)
WHERE id = 1;Real Use Case: Event Sourcing with JSONB#
Event sourcing is one of the best fits for JSONB. The event type determines the payload shape, so a single events table with a JSONB payload column handles heterogeneous event schemas naturally.
CREATE TABLE domain_events (
id bigserial PRIMARY KEY,
aggregate text NOT NULL,
agg_id bigint NOT NULL,
event_type text NOT NULL, -- extracted for indexing
version int NOT NULL,
payload jsonb NOT NULL,
created_at timestamptz NOT NULL DEFAULT now()
);
-- Partial index: find all shipment events cheaply
CREATE INDEX idx_domain_events_shipment
ON domain_events (agg_id, version)
WHERE event_type = 'order.shipped';
-- GIN for attribute queries inside payload
CREATE INDEX idx_domain_events_payload
ON domain_events USING gin (payload jsonb_path_ops);
-- Replay all events for a specific order
SELECT event_type, version, payload
FROM domain_events
WHERE aggregate = 'order' AND agg_id = 42
ORDER BY version;
-- Find all orders shipped to a specific address
SELECT agg_id, payload
FROM domain_events
WHERE event_type = 'order.shipped'
AND payload @> '{"address": {"city": "Milan"}}';ORM Integration#
from sqlalchemy import Column, BigInteger, Text, DateTime
from sqlalchemy.dialects.postgresql import JSONB
from sqlalchemy.orm import DeclarativeBase
import datetime
class Base(DeclarativeBase):
pass
class DomainEvent(Base):
__tablename__ = "domain_events"
id = Column(BigInteger, primary_key=True)
aggregate = Column(Text, nullable=False)
agg_id = Column(BigInteger, nullable=False)
event_type = Column(Text, nullable=False)
version = Column(BigInteger, nullable=False)
payload = Column(JSONB, nullable=False)
created_at = Column(DateTime(timezone=True), default=datetime.datetime.utcnow)
# Query using containment (uses GIN index)
from sqlalchemy import cast
from sqlalchemy.dialects.postgresql import JSONB as JSONB_type
session.query(DomainEvent).filter(
DomainEvent.event_type == "order.shipped",
DomainEvent.payload.contains({"address": {"city": "Milan"}})
).all()from django.db import models
from django.contrib.postgres.fields import JSONField # Django < 3.1
# from django.db.models import JSONField # Django >= 3.1 (auto-uses jsonb on Postgres)
class DomainEvent(models.Model):
aggregate = models.TextField()
agg_id = models.BigIntegerField()
event_type = models.TextField(db_index=True)
version = models.IntegerField()
payload = models.JSONField() # maps to jsonb in PostgreSQL
created_at = models.DateTimeField(auto_now_add=True)
class Meta:
indexes = [
models.Index(fields=["aggregate", "agg_id", "version"]),
]
# Query using key lookup (translated to ->> operator)
DomainEvent.objects.filter(
event_type="order.shipped",
payload__address__city="Milan"
)package models
import (
"database/sql/driver"
"encoding/json"
"fmt"
"time"
)
// JSONB is a custom type wrapping map[string]interface{}
// to satisfy GORM's Scanner/Valuer interface for jsonb columns.
type JSONB map[string]interface{}
func (j JSONB) Value() (driver.Value, error) {
b, err := json.Marshal(j)
return string(b), err
}
func (j *JSONB) Scan(value interface{}) error {
bytes, ok := value.([]byte)
if !ok {
return fmt.Errorf("failed to unmarshal JSONB: %v", value)
}
return json.Unmarshal(bytes, j)
}
type DomainEvent struct {
ID uint `gorm:"primarykey"`
Aggregate string `gorm:"not null;index:idx_agg"`
AggID int64 `gorm:"not null;index:idx_agg"`
EventType string `gorm:"not null;index"`
Version int `gorm:"not null"`
Payload JSONB `gorm:"type:jsonb;not null"`
CreatedAt time.Time
}
// Raw query for GIN containment
var results []DomainEvent
db.Raw(
`SELECT * FROM domain_events WHERE event_type = ? AND payload @> ?::jsonb`,
"order.shipped", `{"address": {"city": "Milan"}}`,
).Scan(&results)When NOT to Use JSONB#
JSONB is the right choice when the structure genuinely varies per row. It is the wrong choice when:
- The field is relational data (user, product, order relationships belong in normalized tables)
- The field is queried frequently by a specific attribute (make it a column, add a B-tree index)
- You need referential integrity (JSONB values are not foreign keys)
- The nested structure is deep and would be better expressed as a join
If you find yourself consistently extracting the same path (payload ->> 'status') in WHERE clauses and ORDER BY, that field should be a column. JSONB flexibility costs you query planner statistics, referential integrity, and type safety.
Common Mistakes#
Using json instead of jsonb
json type stores raw text and cannot be indexed. Every query that filters or extracts from a json column re-parses the entire text. In any realistic workload this is a performance regression compared to jsonb. There is no valid reason to use json for new tables.No GIN index on a queried JSONB column
@>) and key existence queries (?) perform a full sequential scan. On tables with millions of rows, this is the difference between milliseconds and minutes. Add a GIN index on any JSONB column you filter on.Updating JSONB with string concatenation
jsonb_set for targeted updates and the || merge operator for shallow merges. Never concatenate strings to build JSON for SQL queries.Storing deeply nested structures that should be columns
If you want to go deeper on any of this, I offer 1:1 coaching sessions for engineers working on AI integration, cloud architecture, and platform engineering. Book a session (50 EUR / 60 min) or reach out at manuel.fedele+website@gmail.com.