Working With Shopify Data Using Postgres JSONB
My focus this week is on the best way to store and manage data for my upcoming Best-Sellers Shopify app.
Shopify uses JSON to transport data to and from the Shopify API. My go to database these days is Postgres, which also has some great tools for storing and working with JSON data. This means Shopify and Postgres effectively "speak the same language".
In this post, I'll walk you through how I will be using Postgres's built-in JSONB data type to store and manage data for my upcoming Best-Sellers Shopify app.
This post will include:
- Shopify and JSON
- Postgres and JSON
- Shopify JSON and Postgres JSONB - the perfect match
- An Example Shopify Order schema using JSONB
- Querying Shopify Order data with JSONB
Shopify and JSON
You can see some example Order JSON in last week's A Day in the Life of a Shopify Order post.
Postgres and JSON
The Postgres database has support for working with JSON via two different data types:
json- stores and exact copy of the document, preserving whitespace. This is faster to save than jsonb, but much more inefficient at queries.
jsonb- stores a compressed binary copy of the document. This makes it slower to update than json, and whitespace is not preserved, but queries should run much faster.
Shopify JSON and Postgres JSONB - the perfect match
Shopify sends all of its API and Webhook data via JSON. Postgres supports tools for storing and working with JSON data. This means we can pass Shopify JSON data directly to Postgres without modification.
For my Best-Sellers app, I opted to choose JSONB over JSON since I really need some of the advanced querying capabilities provided by JSONB and I'm ok with the drawbacks such as slower updates and lack of whitespace.
An Example Shopify Order schema using JSONB
To demonstrate how to work with Shopify data using Postgres's JSONB data type, I'm going to show you a simple database schema that is very similar to the one I will be using for my app. Then in the next section of this post, I'll demonstrate a few queries for working with the JSONB data.
Here is an example Rails migration for a Shopify Orders table:
create_table :orders do |t|
t.integer :shop_id, null: false
t.bigint :shopify_order_id, null: false # the shopify order id
add_index :orders, :shop_id
add_index :orders, :shopify_order_id, unique: true
add_index :orders, :order_data, using: :gin
add_foreign_key :orders, :shops, on_delete: :cascade
And here is the same Orders table in SQL:
Column | Type | Modifiers
id | integer | not null default nextval('orders_id_seq'::regclass)
shop_id | integer | not null
shopify_order_id | bigint | not null
order_data | jsonb |
created_at | timestamp without time zone |
updated_at | timestamp without time zone |
"orders_pkey" PRIMARY KEY, btree (id)
"index_orders_on_shopify_order_id" UNIQUE, btree (shopify_order_id)
"index_orders_on_order_data" gin (order_data)
"index_orders_on_shop_id" btree (shop_id)
"fk_rails_7e761c2e1b" FOREIGN KEY (shop_id) REFERENCES shops(id) ON DELETE CASCADE
A few notes about the schema:
shopify_order_idis pulled out of the order data so I can add a unique contraint. This will prevent duplicate orders from creeping into my database
order_datais the JSONB column which stores the entire Order JSON for an order.
order_datauses a GIN index to make querying the JSONB data easier.
Querying Shopify Order data with JSONB
Now let's walk through a few examples so you can see how you can query this Order data stored as JSONB.
Select a count of all orders by financial_status
SELECT order_data->'financial_status' AS financial_status, COUNT(*)
FROM orders GROUP BY financial_status;
financial_status | count
"refunded" | 1
"pending" | 1
"paid" | 1
"partially_refunded" | 2
- This example uses the
->operator to extract the
financial_statusis then used to group the results.
Select a count of all orders between two dates
SELECT COUNT(*) FROM orders o
BETWEEN '2016-01-01' AND '2016-02-01';
processed_attimestamp is used as the time the order was actually placed. We can't trust the
created_attimestamp because orders can be created after the fact by the Shopify API.
- We must convert the
processed_atto a timestamp with timezone before we can use it in a date query like so:
Select a sum of all line item quantities and prices, by product_id
SELECT li->'product_id' AS product_id, SUM((li->>'quantity')::int), SUM((li->>'price')::numeric)
FROM orders o, jsonb_array_elements(o.order_data->'line_items') li GROUP BY li->'product_id';
product_id | sum | sum
null | 5 | 49.95
2938468037 | 5 | 210.00
3461132549 | 5 | 99.95
- We use the
jsonb_array_elementsPostgres function to extract each
->>JSONB operator is used to convert the value of
priceto text so we can convert them to their proper format as discussed in the following bullet point.
- Then we convert the
(li->>'price')::numericto allow us to SUM them.
Shopify and Postgres make a powerful combo. Since both Shopify and Postgres are able to work directly with JSON, there is no need to modify the data. This means nothing should get lost in translation when you are storing data from Shopify for an app.
Postgres provides some powerful tools for storing and querying JSON data. In particular, the JSONB data type provides both power and flexibility to manage JSON data within your app.