What is Big Data?

Bryan Smith

Co-founder & big data specialist

In this video, Bryan explains what big data means and how it affects the way data is generated today.

Now free to watch

This video is now available for free. It is also part of a premium, accredited video course. Speak to an expert today to watch more.

What is Big Data?

18 mins 28 secs

Overview

Big data is changing the way companies plan and strategise. Companies that prioritise data, and that solve the issues plaguing data scientists, will have an edge over their competition.

Key learning objectives:

  • Define "Data-First"

  • Understand the different types and uses of data

  • Understand the problems and applications of big data

Summary

How can a company become “Data-First”?

The largest, fastest-growing companies in the world are those that have taken a “data-first” approach to growing their business. Regardless of sector, they understand the value of the data they collect and analyse. Now that data is widely being “captured”, organisations can begin to invest in turning it into new products, services, and insight. Companies with a strategy not only to capture and store data, but also to ship and analyse it, are the ones on track to becoming “data-first”.

What is the difference between internal and external data?

Traditionally, discussions of big data have been limited to the problem of accessing data generated by the business itself. This is internal data: data that a company has complete control over, from creation all the way to analysis. As companies become more data literate and the systems for managing big data become more sophisticated, organisations are beginning to see the need for a strategy to handle external data: data that is valuable to your business, but is created, owned, and controlled by a third party.

How is external data used as a source for alternative information?

Data is an immensely valuable, constantly renewing resource of information that can add immeasurable value to your business. With nearly every app and device generating data, new opportunities have been created to marry new, external data feeds with existing internal data to generate net-new insight.

How has the problem of dealing with a variety of data developed and what is needed to manage it?

Working with data sourced from both inside and outside our companies introduces the problem of data variety. In the traditional world of big data, volume and velocity were the major issues that had to be solved before work with data could begin. Data variety, on the other hand, requires the ability to take similar data from different sources and merge it into a single, standard data product that can be turned into useful information and can grow and evolve over time.
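
As a rough illustration of what merging similar data into a single, standard data product can involve, the sketch below maps one internal and one external record format onto a shared schema. The source names, field names, and date formats are hypothetical and chosen only for the example; a real pipeline would have its own mappings and matching rules.

```python
from datetime import datetime, date

# Hypothetical mappings from each source's field names to a shared schema.
FIELD_MAPS = {
    "internal_crm": {"cust_id": "customer_id", "signup": "signup_date", "country_code": "country"},
    "external_vendor": {"CustomerID": "customer_id", "SignupDate": "signup_date", "Country": "country"},
}

# Hypothetical date formats used by each source.
DATE_FORMATS = {"internal_crm": "%Y-%m-%d", "external_vendor": "%d/%m/%Y"}


def standardise(record, source):
    """Rename fields and normalise values so every source looks the same."""
    out = {std: record[src] for src, std in FIELD_MAPS[source].items()}
    out["customer_id"] = str(out["customer_id"]).strip()
    out["signup_date"] = datetime.strptime(out["signup_date"], DATE_FORMATS[source]).date()
    out["country"] = out["country"].strip().upper()
    out["source"] = source  # keep provenance for later auditing
    return out


def merge_sources(batches):
    """Merge all sources into one list, keeping the first record seen per customer."""
    merged = {}
    for source, records in batches.items():
        for record in records:
            row = standardise(record, source)
            merged.setdefault(row["customer_id"], row)
    return list(merged.values())


internal = [{"cust_id": 101, "signup": "2021-03-04", "country_code": "ca"}]
external = [
    {"CustomerID": "101 ", "SignupDate": "04/03/2021", "Country": "CA"},
    {"CustomerID": "202", "SignupDate": "15/07/2020", "Country": "gb"},
]
merged = merge_sources({"internal_crm": internal, "external_vendor": external})
```

Here the internal record takes precedence when both sources describe the same customer. That is one possible design choice, not the only reasonable one.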

What is the big problem plaguing data scientists?

As Forbes has reported on multiple occasions, roughly 80% of a data scientist’s time is taken up by “data prep”, leaving only 20% for experimentation and building new solutions. Ultimately, this is why, according to Technology Review, only 0.5% of the data available to data science teams is ever truly analysed. Think about the opportunity cost of data prep.
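
To make that opportunity cost concrete, the short calculation below applies the 80/20 split to a working week. The 40-hour week and the team of five are assumptions made for the example, not figures from the video.

```python
def weekly_split(hours_per_week, prep_percent=80):
    """Return (prep_hours, remaining_hours) for a given share of time spent on data prep."""
    prep = hours_per_week * prep_percent / 100
    return prep, hours_per_week - prep


# Assumed for illustration: a 40-hour week and a team of five data scientists.
one_person = weekly_split(40)
team_of_five = weekly_split(5 * 40)

print(f"One person: {one_person[0]:.0f}h prep, {one_person[1]:.0f}h building")
print(f"Team of five: {team_of_five[0]:.0f}h prep, {team_of_five[1]:.0f}h building")
```

Under those assumptions, a five-person team has about as much time for experimentation as one person working without any data prep.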

How does an organisation build an efficient & scalable data pipeline?

In the past, most organisations dumped raw data into a “lake” and transformed it only when it was called upon to power an app or a solution. Instead, we have to build a pipeline that transforms data to match internal data customs on the way into the lake, in order to give our data scientists any chance of finding a use for it. The good news is that there are solutions that let us start small and scale up as more and more data is “turned on” for use. A good place to begin your inquiry is with the data science teams themselves: build portfolios of data around teams or projects, and add new layers of data over time.
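
A minimal sketch of transforming data on the way into the lake is shown below. The in-memory dictionary standing in for the lake, the portfolio name, and the specific cleaning rules (snake_case keys, trimmed strings, empty values dropped) are all hypothetical stand-ins for whatever storage and internal data customs an organisation actually has.

```python
def transform(raw):
    """Apply illustrative 'internal data customs' to one raw record."""
    clean = {}
    for key, value in raw.items():
        key = key.strip().lower().replace(" ", "_")  # snake_case field names
        if isinstance(value, str):
            value = value.strip()  # trim stray whitespace
        if value in ("", None):
            continue  # drop empty fields
        clean[key] = value
    return clean


def ingest(raw_records, lake, portfolio):
    """Transform each record first, then store it under a team or project portfolio."""
    stored = lake.setdefault(portfolio, [])
    for raw in raw_records:
        stored.append(transform(raw))
    return len(raw_records)


# The lake is a plain dict here; a real one would be a storage system.
lake = {}
count = ingest(
    [{" Store Name ": " Main St ", "Daily Sales": 1200, "Notes": ""}],
    lake,
    "retail_team",
)
```

Because the transformation runs at ingestion time, anything read back from the lake is already in the standard shape, and new portfolios can be added one at a time as more data is switched on.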

Why is it important to have an AI strategy?

The more data we can throw at a problem, the better and more accurately we can train machines to make predictions about that problem for us, at a scale previously thought impossible. This is why having a modern data strategy is key to having an AI strategy: data is fuel, and without it there is no real pathway forward for learning or training.

What is the significance of Business Intelligence Solutions?

In the world of big data, choosing and rolling out business intelligence software goes hand in hand with our data strategies, because it is the necessary layer that translates raw data into usable insight. It is therefore key to consider how data will flow into these solutions and power them over time. In an ideal scenario, that flow of data is just as automated as the functions performed by the business intelligence software itself. This involves setting up data pipelines, transformation and enrichment refineries, distribution systems, and access management layers.

Bryan Smith

Entrepreneur and CEO of ThinkData Works. Bryan has prior experience working as a policy advisor to a former Canadian Prime Minister.
