Below are some examples with sample data to try out the BitYota DWS. The data sets are available in BitYota's publicly accessible S3 bucket
S3 bucket details
Data sets available
| Name | Base path | Sample path |
|---|---|---|
| Customers | s3://bityota-demo-data/tpch/customers | s3://bityota-demo-data/tpch/customers/customers.tbl.gz |
| Orders | s3://bityota-demo-data/tpch/orders | s3://bityota-demo-data/tpch/orders/orders.tbl.gz |
| Line Items | s3://bityota-demo-data/tpch/lineitems | s3://bityota-demo-data/tpch/lineitems/lineitems.tbl.gz |
| Regions | s3://bityota-demo-data/tpch/regions | s3://bityota-demo-data/tpch/regions/regions.tbl.gz |
| Nations | s3://bityota-demo-data/tpch/nations | s3://bityota-demo-data/tpch/nations/nations.tbl.gz |
| Suppliers | s3://bityota-demo-data/tpch/suppliers | s3://bityota-demo-data/tpch/suppliers/suppliers.tbl.gz |
| Parts | s3://bityota-demo-data/tpch/parts | s3://bityota-demo-data/tpch/parts/parts.tbl.gz |
| Part Suppliers | s3://bityota-demo-data/tpch/partsuppliers | s3://bityota-demo-data/tpch/partsuppliers/partsuppliers.tbl.gz |
Try out these sample queries
Example 1
SELECT COUNT(*) FROM sample_orders;
Example 2
SELECT
l_returnflag, l_linestatus,
SUM(l_quantity) AS sum_qty,
SUM(l_extendedprice) AS sum_base_price,
SUM(l_extendedprice * (1 - l_discount)) AS sum_disc_price,
SUM(l_extendedprice * (1 - l_discount) * (1 + l_tax)) AS sum_charge,
AVG(l_quantity) AS avg_qty,
AVG(l_extendedprice) AS avg_price,
AVG(l_discount) AS avg_disc,
COUNT(*) AS count_order
FROM sample_line_items
WHERE
l_shipdate
GROUP BY
l_returnflag, l_linestatus
ORDER BY
l_returnflag, l_linestatus;
Example 3
SELECT
l_orderkey,
SUM(l_extendedprice * (1 - l_discount)) AS revenue,
o_orderdate, o_shippriority
FROM
sample_customers, sample_orders, sample_line_items
WHERE
c_mktsegmentsize = 'BUILDING' AND
c_custkey = o_custkey AND
l_orderkey = o_orderkey AND
o_orderdate < '1995-03-15' AND
l_shipdate > '1995-03-15'
GROUP BY
l_orderkey, o_orderdate, o_shippriority
ORDER BY
revenue DESC, o_orderdate
LIMIT 10;
Examples of BitYota SQL Syntax for JSON documents
You can load your JSON documents without the needing to do any specific data modeling with BitYota’s DWS. BitYota provides a JSON datatype that can be used as part of the regular SQL syntax for create table, and you can directly access the attributes within that JSON document as if they were regular table columns.
Example 1
To store the entire JSON document as-is in a single BitYota DWS table, you can write a create statement like this:
CREATE TABLE events (
event JSON // where event is the name of the column and json is the type
)...
{ _id: { $oid: “fbc2780b06”}, "blogpost" : { "key" : "5b8b43653b" }, "rating" : 64, "remote_addr" :"14.212.148.62"}
{ _id: { $oid: “db9e421400”}, "blogpost" : { "key" : "4477d65f4b" }, "rating" : 0, "remote_addr" :".84.46.125.36"}
{ _id: { $oid: “c3fca616f8”}, "blogpost" : { "key" : "172e33f27d" }, "rating" : 102, "remote_addr" :"167.237.118.51"}
...........
{ _id: { $oid: “cd44fa9059”}, "blogpost" : { "key" : "d5b7413d31" }, "rating" : 19, "remote_addr" :"68.143.177.50"}
The data can be queried just as-is directly in SQL, or you can transform a subset of the data into columnar format for higher performance.
Example 2
To directly query the JSON document, simply point to each attribute like it was a column in a table. For example,
SELECT event->’rating’, event->’blogpost.key’ FROM events;
Example 3
To get a report of the number of posts by rating, simply write this query:
SELECT COUNT(*), event->'rating' AS rating
FROM
events
GROUP BY
event->'rating'
ORDER BY
1 desc;
Example 4
Dealing with Nested Objects The above examples all use simple JSON data, but what about arrays with nested objects such as a sales invoice with embedded line items? Arrays can be accessed as well using SQL. So an object like
{ _id: { $oid: “fbc2780b06”}, "invoice_number" : 12345, "item" : [
{ “sku”: “134255-az”, “price”: 123.98, “status”:”closed”},
{ “sku”: “434235-oz”, “price”: 55.98, “status”:”pending”}, { “sku”: “643387-ad”, “price”: 101.98, “status”:”closed”} ]}
SELECT
getitem(jsondoc,’item’,0)
FROM
invoice;
SELECT getitems(jsondoc,’item’) FROM invoice;
CREATE VIEW normalizedInvoices AS
SELECT jsondoc->’invoice_number’, lineitem->’price’, lineitem->’sku’
FROM (select jsondoc, getitems(jsondoc,’item’) FROM invoices) invoice_items;
Example 4
Creating Custom FunctionsIn some circumstances you may want to write actual code to deal with the JSON data structure rather than relying on SQL constructs. This is easily done with your own user defined functions, that can be called as part of your SQL processing. Writing functions is supported in a number of languages (perl,python,plsql,javascript) . Say we want to create a Javascript function which just returns the sum of the line items from the above invoice. We would define it as
CREATE OR REPLACE FUNCTION sumofitems(invoicejson text) RETURNS int AS $$
var invoice = JSON.parse(invoicejson); var sumofprice = 0.0;
for (var ix in invoice.item)
sumofprice += invoice.item[ix].price; return sumofprice;
$$ LANGUAGE plv8 IMMUTABLE STRICT;
SELECT
jsondoc->’invoice_number’, sumofitems(jsondoc) AS “total_amount”
FROM
invoices;