Examples

Below are some examples with sample data to try out the BitYota DWS. The data sets are available in BitYota's publicly accessible S3 bucket

S3 bucket details

AKIAJPIUUQZ5BBLMGY3A
fTH0oj0ckC8hQx6kCnm3zGOMUewUGMK0ka2HGYh0
s3://bityota-demo-data/

Data sets available

Name Base path Sample path
Customers s3://bityota-demo-data/tpch/customers s3://bityota-demo-data/tpch/customers/customers.tbl.gz
Orders s3://bityota-demo-data/tpch/orders s3://bityota-demo-data/tpch/orders/orders.tbl.gz
Line Items s3://bityota-demo-data/tpch/lineitems s3://bityota-demo-data/tpch/lineitems/lineitems.tbl.gz
Regions s3://bityota-demo-data/tpch/regions s3://bityota-demo-data/tpch/regions/regions.tbl.gz
Nations s3://bityota-demo-data/tpch/nations s3://bityota-demo-data/tpch/nations/nations.tbl.gz
Suppliers s3://bityota-demo-data/tpch/suppliers s3://bityota-demo-data/tpch/suppliers/suppliers.tbl.gz
Parts s3://bityota-demo-data/tpch/parts s3://bityota-demo-data/tpch/parts/parts.tbl.gz
Part Suppliers s3://bityota-demo-data/tpch/partsuppliers s3://bityota-demo-data/tpch/partsuppliers/partsuppliers.tbl.gz
UGC Json Data s3://bityota-demo-data/jsondata/ s3://bityota-demo-data/jsondata/ugc.json.gz

Try out these sample queries

Example 1

SELECT COUNT(*) FROM sample_orders;

Example 2

SELECT
     l_returnflag, l_linestatus,
     SUM(l_quantity) AS sum_qty,
     SUM(l_extendedprice) AS sum_base_price,
     SUM(l_extendedprice * (1 - l_discount)) AS sum_disc_price,
     SUM(l_extendedprice * (1 - l_discount) * (1 + l_tax)) AS sum_charge,
     AVG(l_quantity) AS avg_qty,
     AVG(l_extendedprice) AS avg_price,
     AVG(l_discount) AS avg_disc,
     COUNT(*) AS count_order
FROM sample_line_items
WHERE
     l_shipdate
GROUP BY
     l_returnflag, l_linestatus
ORDER BY
     l_returnflag, l_linestatus;

Example 3

SELECT
     l_orderkey,
     SUM(l_extendedprice * (1 - l_discount)) AS revenue,
     o_orderdate, o_shippriority
FROM
     sample_customers, sample_orders, sample_line_items
WHERE
     c_mktsegmentsize = 'BUILDING' AND
     c_custkey = o_custkey AND
     l_orderkey = o_orderkey AND
     o_orderdate < '1995-03-15' AND
     l_shipdate > '1995-03-15'
GROUP BY
     l_orderkey, o_orderdate, o_shippriority
ORDER BY
     revenue DESC, o_orderdate
LIMIT 10;


Examples of BitYota SQL Syntax for JSON documents

You can load your JSON documents without needing to do any specific data modeling with BitYota’s DWS. BitYota provides a JSON datatype that can be used as part of the regular SQL syntax for create table, and you can directly access the attributes within that JSON document as if they were regular table columns.

Example 1

To store the entire JSON document as-­is in a single BitYota DWS table, you can write a create statement like this:

CREATE TABLE events (
event JSON // where event is the name of the column and json is the type
)...

The data could look something like:

{ _id: { $oid: “fbc2780b06”}, "blogpost" : { "key" : "5b8b43653b" }, "rating" : 64, "remote_addr" :"14.212.148.62"}
{ _id: { $oid: “db9e421400”}, "blogpost" : { "key" : "4477d65f4b" }, "rating" : 0, "remote_addr" :".84.46.125.36"}
{ _id: { $oid: “c3fca616f8”}, "blogpost" : { "key" : "172e33f27d" }, "rating" : 102, "remote_addr" :"167.237.118.51"}
...........
{ _id: { $oid: “cd44fa9059”}, "blogpost" : { "key" : "d5b7413d31" }, "rating" : 19, "remote_addr" :"68.143.177.50"}

The data can be queried just as­-is directly in SQL, or you can transform a subset of the data into columnar format for higher performance.

Example 2

To directly query the JSON document, simply point to each attribute like it was a column in a table. For example,

SELECT event->'rating' AS rating , event->'blogpost.key' AS key FROM events;

would get you a list of the rating for each blogpost key. The attribute access can be used in all places where SQL would usually allow a column name. This includes operations such as joins and group­by, order­by and so on.

Example 3

To get a report of the number of posts by rating, simply write this query:

SELECT COUNT(*), event->'rating' AS rating
FROM
        events
GROUP BY
         event->'rating'
ORDER BY
         1 desc;

Example 4

Dealing with Nested Objects The above examples all use simple JSON data, but what about arrays with nested objects such as a sales invoice with embedded line items? Arrays can be accessed as well using SQL. So an object like

{ _id: { $oid: "fbc2780b06"}, "invoice_number" : 12345, "item" : [
{ "sku": "134255-az", "price": 123.98, "status":"closed"},
{ "sku": "434235-oz", "price": 55.98, "status":"pending"}, { "sku": "643387-ad", "price": 101.98, "status":"closed"} ]}

Can be accessed in SQL using a helper function like

SELECT
         getitem(jsondoc,'item',0)
FROM
         invoice;

The above statement fetches the first element in the line item. Accessing elements in order may not always be practical since arrays may not always be of the same length (e.g. different invoices may have a different number of line items). A better way would be to extract array elements in a normalized way with:

SELECT getitems(jsondoc,'item') FROM invoice;

which will return all the items as individual records of a set. We can then create a normalized view of the line items like this:

CREATE VIEW normalizedInvoices AS
SELECT jsondoc->'invoice_number', lineitem->'price', lineitem->'sku'
FROM (select jsondoc, getitems(jsondoc,'item') as lineitem FROM invoice) invoice_items;

Example 4

Creating Custom Functions
In some circumstances you may want to write actual code to deal with the JSON data structure rather than relying on SQL constructs. ­­ This is easily done with your own user defined functions, that can be called as part of your SQL processing. Writing functions is supported in a number of languages (perl,python,plsql,javascript) . Say we want to create a Javascript function which just returns the sum of the line items from the above invoice. We ­ would define it as

CREATE OR REPLACE FUNCTION sumofitems(invoicejson text) RETURNS int AS $$
var invoice = JSON.parse(invoicejson); var sumofprice = 0.0;
for (var ix in invoice.item)
sumofprice += invoice.item[ix].price; return sumofprice;
$$ LANGUAGE plv8 IMMUTABLE STRICT;

We can now use the above function to get the value of each invoice, like

SELECT
         jsondoc->'invoice_number', sumofitems(jsondoc) AS "total_amount"
FROM
         invoice;

The simplicity of the above function shows the power of being able to use functions to do complex queries. The function above returns a simple INT value, but you can also have the function return a complex user defined type as well as sets of values and also extend it for exception error handling.