Working With Shopify Data Using Postgres JSONB
My focus this week is on the best way to store and manage data for my upcoming Best-Sellers Shopify app.
Shopify uses JSON to transport data to and from the Shopify API. My go to database these days is Postgres, which also has some great tools for storing and working with JSON data. This means Shopify and Postgres effectively "speak the same language".
In this post, I'll walk you through how I will be using Postgres's built-in JSONB data type to store and manage data for my upcoming Best-Sellers Shopify app.
This post will include:
- Shopify and JSON
- Postgres and JSON
- Shopify JSON and Postgres JSONB - the perfect match
- An Example Shopify Order schema using JSONB
- Querying Shopify Order data with JSONB
Shopify and JSON
Shopify uses JSON (Javascript Object Notation) as the means of sending and receiving data throughout the Shopify API.
You can see some example Order JSON in last week's A Day in the Life of a Shopify Order post.
Postgres and JSON
The Postgres database has support for working with JSON via two different data types: json
and jsonb
:
json
- stores and exact copy of the document, preserving whitespace. This is faster to save than jsonb, but much more inefficient at queries.jsonb
- stores a compressed binary copy of the document. This makes it slower to update than json, and whitespace is not preserved, but queries should run much faster.
Shopify JSON and Postgres JSONB - the perfect match
Shopify sends all of its API and Webhook data via JSON. Postgres supports tools for storing and working with JSON data. This means we can pass Shopify JSON data directly to Postgres without modification.
For my Best-Sellers app, I opted to choose JSONB over JSON since I really need some of the advanced querying capabilities provided by JSONB and I'm ok with the drawbacks such as slower updates and lack of whitespace.
An Example Shopify Order schema using JSONB
To demonstrate how to work with Shopify data using Postgres's JSONB data type, I'm going to show you a simple database schema that is very similar to the one I will be using for my app. Then in the next section of this post, I'll demonstrate a few queries for working with the JSONB data.
Here is an example Rails migration for a Shopify Orders table:
create_table :orders do |t|
t.integer :shop_id, null: false
t.bigint :shopify_order_id, null: false # the shopify order id
t.jsonb :order_data
t.timestamps
end
add_index :orders, :shop_id
add_index :orders, :shopify_order_id, unique: true
add_index :orders, :order_data, using: :gin
add_foreign_key :orders, :shops, on_delete: :cascade
And here is the same Orders table in SQL:
Column | Type | Modifiers
------------------+-----------------------------+-----------------------------------------------------
id | integer | not null default nextval('orders_id_seq'::regclass)
shop_id | integer | not null
shopify_order_id | bigint | not null
order_data | jsonb |
created_at | timestamp without time zone |
updated_at | timestamp without time zone |
Indexes:
"orders_pkey" PRIMARY KEY, btree (id)
"index_orders_on_shopify_order_id" UNIQUE, btree (shopify_order_id)
"index_orders_on_order_data" gin (order_data)
"index_orders_on_shop_id" btree (shop_id)
Foreign-key constraints:
"fk_rails_7e761c2e1b" FOREIGN KEY (shop_id) REFERENCES shops(id) ON DELETE CASCADE
A few notes about the schema:
shopify_order_id
is pulled out of the order data so I can add a unique contraint. This will prevent duplicate orders from creeping into my databaseorder_data
is the JSONB column which stores the entire Order JSON for an order.- The
order_data
uses a GIN index to make querying the JSONB data easier.
Querying Shopify Order data with JSONB
Now let's walk through a few examples so you can see how you can query this Order data stored as JSONB.
Select a count of all orders by financial_status
SELECT order_data->'financial_status' AS financial_status, COUNT(*)
FROM orders GROUP BY financial_status;
financial_status | count
----------------------+-------
"refunded" | 1
"pending" | 1
"paid" | 1
"partially_refunded" | 2
(4 rows)
- This example uses the
->
operator to extract thefinancial_status
from theorder_data
JSONB. - The
financial_status
is then used to group the results.
Select a count of all orders between two dates
SELECT COUNT(*) FROM orders o
WHERE (o.order_data->>'processed_at')::timestamptz
BETWEEN '2016-01-01' AND '2016-02-01';
count
-------
5
(1 row)
- The
processed_at
timestamp is used as the time the order was actually placed. We can't trust thecreated_at
timestamp because orders can be created after the fact by the Shopify API. - We must convert the
processed_at
to a timestamp with timezone before we can use it in a date query like so:(o.order_data->>'processed_at')::timestamptz
Select a sum of all line item quantities and prices, by product_id
SELECT li->'product_id' AS product_id, SUM((li->>'quantity')::int), SUM((li->>'price')::numeric)
FROM orders o, jsonb_array_elements(o.order_data->'line_items') li GROUP BY li->'product_id';
product_id | sum | sum
------------+-----+--------
null | 5 | 49.95
2938468037 | 5 | 210.00
3461132549 | 5 | 99.95
- We use the
jsonb_array_elements
Postgres function to extract eachline_item
from theline_items
array. - The
->>
JSONB operator is used to convert the value ofquantity
andprice
to text so we can convert them to their proper format as discussed in the following bullet point. - Then we convert the
quantity
to int:(li->>'quantity')::int
, andprice
to numeric:(li->>'price')::numeric
to allow us to SUM them.
Summary
Shopify and Postgres make a powerful combo. Since both Shopify and Postgres are able to work directly with JSON, there is no need to modify the data. This means nothing should get lost in translation when you are storing data from Shopify for an app.
Postgres provides some powerful tools for storing and querying JSON data. In particular, the JSONB data type provides both power and flexibility to manage JSON data within your app.