My focus this week is on the best way to store and manage data for my upcoming Best-Sellers Shopify app.
Shopify uses JSON to transport data to and from the Shopify API. My go to database these days is Postgres, which also has some great tools for storing and working with JSON data. This means Shopify and Postgres effectively “speak the same language”.
In this post, I’ll walk you through how I will be using Postgres’s built-in JSONB data type to store and manage data for my upcoming Best-Sellers Shopify app.
This post will include:
You can see some example Order JSON in last week’s A Day in the Life of a Shopify Order post.
The Postgres database has support for working with JSON via two different data types:
json- stores and exact copy of the document, preserving whitespace. This is faster to save than jsonb, but much more inefficient at queries.
jsonb- stores a compressed binary copy of the document. This makes it slower to update than json, and whitespace is not preserved, but queries should run much faster.
Shopify sends all of its API and Webhook data via JSON. Postgres supports tools for storing and working with JSON data. This means we can pass Shopify JSON data directly to Postgres without modification.
For my Best-Sellers app, I opted to choose JSONB over JSON since I really need some of the advanced querying capabilities provided by JSONB and I’m ok with the drawbacks such as slower updates and lack of whitespace.
To demonstrate how to work with Shopify data using Postgres’s JSONB data type, I’m going to show you a simple database schema that is very similar to the one I will be using for my app. Then in the next section of this post, I’ll demonstrate a few queries for working with the JSONB data.
Here is an example Rails migration for a Shopify Orders table:
1 2 3 4 5 6 7 8 9 10 11 12 create_table :orders do |t| t.integer :shop_id, null: false t.bigint :shopify_order_id, null: false # the shopify order id t.jsonb :order_data t.timestamps end add_index :orders, :shop_id add_index :orders, :shopify_order_id, unique: true add_index :orders, :order_data, using: :gin add_foreign_key :orders, :shops, on_delete: :cascade
And here is the same Orders table in SQL:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 Column | Type | Modifiers ------------------+-----------------------------+----------------------------------------------------- id | integer | not null default nextval('orders_id_seq'::regclass) shop_id | integer | not null shopify_order_id | bigint | not null order_data | jsonb | created_at | timestamp without time zone | updated_at | timestamp without time zone | Indexes: "orders_pkey" PRIMARY KEY, btree (id) "index_orders_on_shopify_order_id" UNIQUE, btree (shopify_order_id) "index_orders_on_order_data" gin (order_data) "index_orders_on_shop_id" btree (shop_id) Foreign-key constraints: "fk_rails_7e761c2e1b" FOREIGN KEY (shop_id) REFERENCES shops(id) ON DELETE CASCADE
A few notes about the schema:
shopify_order_idis pulled out of the order data so I can add a unique contraint. This will prevent duplicate orders from creeping into my database
order_datais the JSONB column which stores the entire Order JSON for an order.
order_datauses a GIN index to make querying the JSONB data easier.
Now let’s walk through a few examples so you can see how you can query this Order data stored as JSONB.
1 2 3 4 5 6 7 8 9 10 SELECT order_data->'financial_status' AS financial_status, COUNT(*) FROM orders GROUP BY financial_status; financial_status | count ----------------------+------- "refunded" | 1 "pending" | 1 "paid" | 1 "partially_refunded" | 2 (4 rows)
->operator to extract the
financial_statusis then used to group the results.
1 2 3 4 5 6 7 8 SELECT COUNT(*) FROM orders o WHERE (o.order_data->>'processed_at')::timestamptz BETWEEN '2016-01-01' AND '2016-02-01'; count ------- 5 (1 row)
processed_attimestamp is used as the time the order was actually placed. We can’t trust the
created_attimestamp because orders can be created after the fact by the Shopify API.
processed_atto a timestamp with timezone before we can use it in a date query like so:
1 2 3 4 5 6 7 8 SELECT li->'product_id' AS product_id, SUM((li->>'quantity')::int), SUM((li->>'price')::numeric) FROM orders o, jsonb_array_elements(o.order_data->'line_items') li GROUP BY li->'product_id'; product_id | sum | sum ------------+-----+-------- null | 5 | 49.95 2938468037 | 5 | 210.00 3461132549 | 5 | 99.95
jsonb_array_elementsPostgres function to extract each
->>JSONB operator is used to convert the value of
priceto text so we can convert them to their proper format as discussed in the following bullet point.
(li->>'price')::numericto allow us to SUM them.
Shopify and Postgres make a powerful combo. Since both Shopify and Postgres are able to work directly with JSON, there is no need to modify the data. This means nothing should get lost in translation when you are storing data from Shopify for an app.
Postgres provides some powerful tools for storing and querying JSON data. In particular, the JSONB data type provides both power and flexibility to manage JSON data within your app.
If you have any specific questions or insights about working with Shopify, Postgres and JSON let me know in the comments or via the contact form.