Module 4: Data Collection

What is data collection?
How do you collect data from numerous sources?

Let’s move into some practical areas of Big Data.
How would you collect data if you were to do data analysis?

Introduction

  • If you were to do a data analysis yourself, how would you proceed?
  • The first step is to collect the data, because every analysis is performed on a dataset.
  • In this module, you will learn about the first phase of analysis: the data collection step.

Objectives

  1. Know how to find appropriate data for different types of projects.
  2. Distinguish between different data types.
  3. Evaluate different data sets to judge their quality.

    What is Data Collection?

    Definition

    Data collection is the procedure of collecting, measuring, and analyzing accurate information for research using standard, validated techniques or tools.

    Application

    – Data collection is very important in the Big Data world and is often described as the most important step in Big Data analysis.

    – The Internet and some prominent search engines provide almost unlimited sources of data for a variety of topics.

    – From these sources, it’s important that you find appropriate and good-quality data sets.
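One way to make "good-quality" concrete is to run a few simple checks on a data set before using it. The sketch below is only an illustration: the function name `quality_report`, the sample records, and the choice of checks (missing values and exact duplicate rows) are assumptions made for this example, not a definition taken from the module.

```python
from collections import Counter


def quality_report(rows):
    """Summarise simple quality signals for a list of dict records."""
    missing = Counter()
    seen = set()
    duplicates = 0
    for row in rows:
        # Count empty or absent values per column.
        for column, value in row.items():
            if value is None or str(value).strip() == "":
                missing[column] += 1
        # An exact repeat of an earlier row counts as a duplicate.
        key = tuple(sorted(row.items()))
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return {"rows": len(rows), "missing": dict(missing), "duplicates": duplicates}


# Hypothetical sample records, invented for illustration only.
sample = [
    {"city": "Oslo", "temp_c": "4.5"},
    {"city": "Lima", "temp_c": ""},
    {"city": "Oslo", "temp_c": "4.5"},
]
report = quality_report(sample)
print(report)
```

A report with many missing values or duplicate rows does not automatically rule a data set out, but it suggests looking more closely at how the data was gathered before relying on it.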

    Let’s Dive Deeper

    If you are a reader…

    Explore the concepts of data collection by reading the articles below.

      Infographics

      Databases that can start you off

      Here are some links to well-known databases that are good for beginners.

      Assignment

      Let’s write an article about data collection!

      • Examine why data collection is so important for big data projects.
      • Discuss the situations in which you would use secondary data and those in which you would use primary data.
      • Discuss how you would assess the quality or reliability of data.