Examples

Below are some examples with sample data to try out the BitYota DWS. The data sets are available in BitYota’s publicly accessible S3 bucket

S3 bucket details

Bucket Name:  s3://bityota-demo-data/

Data sets available

NameBase pathSample path
Blog Posts (json)s3://bityota-demo-data/jsondata/s3://bityota-demo-data/jsondata/ugc.json.gz
Customerss3://bityota-demo-data/tpch/customerss3://bityota-demo-data/tpch/customers/customers.tbl.gz
Orderss3://bityota-demo-data/tpch/orderss3://bityota-demo-data/tpch/orders/orders.tbl.gz
Line Itemss3://bityota-demo-data/tpch/lineitemss3://bityota-demo-data/tpch/lineitems/lineitems.tbl.gz
Regionss3://bityota-demo-data/tpch/regionss3://bityota-demo-data/tpch/regions/regions.tbl.gz
Nationss3://bityota-demo-data/tpch/nationss3://bityota-demo-data/tpch/nations/nations.tbl.gz
Supplierss3://bityota-demo-data/tpch/supplierss3://bityota-demo-data/tpch/suppliers/suppliers.tbl.gz
Partss3://bityota-demo-data/tpch/partss3://bityota-demo-data/tpch/parts/parts.tbl.gz
Part Supplierss3://bityota-demo-data/tpch/partsupplierss3://bityota-demo-data/tpch/partsuppliers/partsuppliers.tbl.gz

Try out these sample queries

Example 1

SELECT COUNT(*) FROM sample_orders;

Example 2

SELECT

l_returnflag, l_linestatus,

SUM(l_quantity) AS sum_qty,

SUM(l_extendedprice) AS sum_base_price,

SUM(l_extendedprice * (1 – l_discount)) AS sum_disc_price,

SUM(l_extendedprice * (1 – l_discount) * (1 + l_tax)) AS sum_charge,

AVG(l_quantity) AS avg_qty,

AVG(l_extendedprice) AS avg_price,

AVG(l_discount) AS avg_disc,

COUNT(*) AS count_order

FROM sample_line_items

WHERE

l_shipdate

GROUP BY

l_returnflag, l_linestatus

ORDER BY

l_returnflag, l_linestatus;

Example 3

SELECT

l_orderkey,

SUM(l_extendedprice * (1 – l_discount)) AS revenue,

o_orderdate, o_shippriority

FROM

sample_customers, sample_orders, sample_line_items

WHERE

c_mktsegmentsize = ‘BUILDING’ AND

c_custkey = o_custkey AND

l_orderkey = o_orderkey AND

o_orderdate < ‘1995-03-15′ AND      l_shipdate > ‘1995-03-15′

GROUP BY

l_orderkey, o_orderdate, o_shippriority

ORDER BY

revenue DESC, o_orderdate

LIMIT 10;

 

Examples of BitYota SQL Syntax for JSON documents

You can load your JSON documents without needing to do any specific data modeling with BitYota’s DWS. BitYota provides a JSON datatype that can be used as part of the regular SQL syntax for create table, and you can directly access the attributes within that JSON document as if they were regular table columns.

Example 1

To store the entire JSON document as-­is in a single BitYota DWS table, you can write a create statement like this:

CREATE TABLE events (

event JSON // where event is the name of the column and json is the type
)

The data could look something like:

{ _id: { $oid: “fbc2780b06”}, “blogpost” : { “key” : “5b8b43653b” }, “rating” : 64, “remote_addr” :”14.212.148.62″}

{ _id: { $oid: “db9e421400”}, “blogpost” : { “key” : “4477d65f4b” }, “rating” : 0, “remote_addr” :”.84.46.125.36″}

{ _id: { $oid: “c3fca616f8”}, “blogpost” : { “key” : “172e33f27d” }, “rating” : 102, “remote_addr” :”167.237.118.51″}

………..

{ _id: { $oid: “cd44fa9059”}, “blogpost” : { “key” : “d5b7413d31″ }, “rating” : 19, “remote_addr” :”68.143.177.50″}

The data can be queried just as­-is directly in SQL, or you can transform a subset of the data into columnar format for higher performance.

Example 2

To directly query the JSON document, simply point to each attribute like it was a column in a table. For example,

SELECT event->’rating’, event->’blogpost.key’ FROM events;

would get you a list of the rating for each blogpost key. The attribute access can be used in all places where SQL would usually allow a column name. This includes operations such as joins and group­by, order­by and so on.

Example 3

To get a report of the number of posts by rating, simply write this query:

SELECT COUNT(*), event->’rating’ AS rating
FROM
events
GROUP BY
event->’rating’
ORDER BY
1 desc;

Example 4

Dealing with Nested Objects
The above examples all use simple JSON data, but what about arrays with nested objects such as a sales invoice with embedded line items? Arrays can be accessed as well using SQL. So an object like

{ _id: { $oid: “fbc2780b06″}, “invoice_number” : 12345, “item” : [

{ “sku”: “134255-az”, “price”: 123.98, “status”:”closed”},

{ “sku”: “434235-oz”, “price”: 55.98, “status”:”pending”}, { “sku”: “643387-ad”, “price”: 101.98, “status”:”closed”} ]}

Can be accessed in SQL using a helper function like

SELECT
getitem(jsondoc,’item’,0)
FROM
invoice;

The above statement fetches the first element in the line item. Accessing elements in order may not always be practical since arrays may not always be of the same length (e.g. different invoices may have a different number of line items). A better way would be to extract array elements in a normalized way with:

SELECT getitems(jsondoc,’item’) FROM invoice;

which will return all the items as individual records of a set. We can then create a normalized view of the line items like this:

CREATE VIEW normalizedInvoices AS

SELECT jsondoc->’invoice_number’, lineitem->’price’, lineitem->’sku’

FROM (select jsondoc, getitems(jsondoc,’item’) as lineitem FROM invoice) invoice_items;

Example 4

Creating Custom Functions
In some circumstances you may want to write actual code to deal with the JSON data structure rather than relying on SQL constructs. ­­ This is easily done with your own user defined functions, that can be called as part of your SQL processing. Writing functions is supported in a number of languages (perl,python,plsql,javascript) .
Say we want to create a Javascript function which just returns the sum of the line items from the above invoice. We ­ would define it as

CREATE OR REPLACE FUNCTION sumofitems(invoicejson text) RETURNS int AS $$

var invoice = JSON.parse(invoicejson); var sumofprice = 0.0;

for (var ix in invoice.item)

sumofprice += invoice.item[ix].price; return sumofprice;

$$ LANGUAGE plv8 IMMUTABLE STRICT;

We can now use the above function to get the value of each invoice, like

SELECT
jsondoc->’invoice_number’, sumofitems(jsondoc) AS “total_amount”

FROM
invoice;

The simplicity of the above function shows the power of being able to use functions to do complex queries. The function above returns a simple INT value, but you can also have the function return a complex user defined type as well as sets of values and also extend it for exception error handling.