Blog

Orchestrating a Real-time Data Pipeline with BitYota

In his post “Solving the Challenge of Data Integration in Big Data Analytics”, Nilesh talked about using BitYota’s built-in Data Integration components to bring recurring data into your BitYota Data Warehouse on a schedule. But data loads are just one piece of building a data pipeline for Analytics. Frequently, you want to manipulate the data in some way either before or after every load, say to check incoming data quality or to build aggregates that can be consumed by your BI tools. And it all has to be in real-time, on fresh data. In the old model, this …

Solving the Challenge of Data Integration in Big Data Analytics

Anyone who is in the business of big data analytics will tell you that significant effort goes into setting up and managing the data pipelines to extract and integrate data from disparate sources before analysis. A. Data Pipeline setup: You just got access to a new source. Now the pressure is on to understand its value … Integrating a new data source or data set can be daunting. First, an in-depth knowledge of the data is required to load it. Data format, schema, frequency, delimiters, and layout are some of the many attributes which need to be understood. Second, the …

"Kobayashi Maru" a.k.a. The no-win scenario for Complex Analytics on MongoDB

MongoDB is great… MongoDB’s horizontal scalability and flexibility in handling changing data structures make it an ideal choice for agile application development. Additionally, the ability to quickly create read-only copies of the data via sharding/replication makes it possible to run real-time dashboards for simple operational metrics such as counts of unique users directly on top of an operational MongoDB instance. … But for Complex Analytics? Beyond simple metrics, businesses also need to gain deeper insights by analyzing, slicing and dicing data in various ways; for example, correlating user visits with purchases, identifying your most popular products and your most important …

The War of the Roses – MongoDB Data Structures and Complex Analytics

When you created your MongoDB database, you created a set of collections which likely made sense for your specific web serving needs. They may have evolved over time, but your data is somewhat settled into what you believe is the current format. Let’s take the example of a blogging site. The majority of the data needed to run the site is in 2 collections, “sites” and “authors”. Let’s look at the “sites” collection. The Sites document has all the information needed to render a blog page, organized like this:     { _id: "site42", sitename: "Hartley's Fly Fishing", url: "/hartleysflyfishing", posts: …

NoSQL Stores – Scale & Performance for Transactions Yes, But For Analytics?

In the past few years, NoSQL document-based stores like MongoDB have made great inroads in new applications for many good reasons. Unlike traditional SQL databases, you don’t have to design your information schemas upfront to the n-th degree of detail. Instead you can rely on schema-less JSON structures to keep all your data together for your application. The content of the JSON document can evolve over time, without impacting your downstream database (no need to schedule database changes to add a column or modify the schema). This is no small matter: a good programmer’s freedom to choose a data …

How to use Standard SQL over JSON with BitYota’s DWS

Document-oriented NoSQL databases have gained a lot of traction in the past few years, adopted by many companies for agile, scalable application development. Their general-purpose document-oriented design makes them appropriate for a large number of use cases such as content management systems, mobile apps, gaming, e-commerce, analytics, archiving, and logging. NoSQL databases such as MongoDB are also good for real-time operational dashboards for BI, because of their ability to support indexes as well as agility in handling changing data structures. On top of dashboards, businesses also need to gain deeper insights by analyzing, slicing and dicing data in …

Even an improbability drive needs coordinates…

Big Data is nothing new – we know credit card companies, retailers, research labs, etc. have spent the last few decades collecting, modeling and analyzing large data sets, at great cost. So what’s changed? Well, with billions of people using multiple devices and apps to access the Web, the very nature of this data has changed: it is no longer static, well structured or predictable in advance. Rather it is “polymorphic”, shape-shifting in format over time. This polymorphic data breaks our well-defined workflows of bringing data into a data mart for analytics. The old way used to involve collecting structured …

About BitYota

The Hitchhiker’s Guide to Big Data Analytics: Welcome to the company blog from BitYota, the next-gen Data Warehouse-as-a-Service for Big Data Analytics, accessible by anyone, anywhere. We are an expert and well-funded team with 35+ years of cumulative experience in building data platforms from Oracle, Informix, Yahoo!, Veritas, Tibco, etc. Our vision is to make data and analytics accessible to all. We know our vision is audacious and requires a complete rethinking of traditional data warehousing concepts and technologies. We believe we have the team, the tenacity and the talent to do this and we are excited to take this journey with …

Get insights in minutes

Find the value you can add for your customers and your business today. Spin up a node, load your data and start running analysis.