Design a web scraping program in Python to collect weather forecast report data

Reference no: EM131963781

SIT742 Modern Data Science 2018 T1

Assignment: General data processing and using big data

This is an individual assignment on understanding data science, big data and their applications. It contains written answers and some programming tasks based on topics presented during Weeks 1 to 3.

The assignment is broken down into the three tasks below. You can use Google to find the data sources (i.e., websites). After completing the practical work, write down your executable Python code and present the collected data in tables to demonstrate your results. You also need to write several paragraphs explaining your comparison and stating a conclusion.

Task 1. Data Acquisition

Design a web scraping program in Python to collect weather forecast report data for a city (e.g., Melbourne) from a website, such as temperature, humidity and weather status (cloudy, sunny, etc.), and store the data in a CSV file. Complete this task in both of the following ways:

(1) Collecting data by regular interval sampling. You need to find the best sampling interval in terms of space efficiency and demonstrate with numeric results why it is the best solution.

(2) Collecting data by change detection. Store a data object only when any of the weather forecast report data has changed on the website.

In both cases, record the weather data together with timestamps. Then compare the two collection methods, conclude which one is optimal and demonstrate this with numeric results. (Please refer to Lecture 2.)
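The two collection strategies can be compared on a small example. The sketch below is only an illustration under stated assumptions: the readings in `OBSERVED` are made-up values standing in for what a scraper would see when polling a page every 10 minutes, and the function names are hypothetical. It counts how many rows each strategy stores and how many distinct weather states each one captures.

```python
import csv
import io
from datetime import datetime, timedelta

# Hypothetical readings, as if a page had been polled every 10 minutes.
# Each tuple is (timestamp, temperature in C, humidity in %, status).
START = datetime(2018, 4, 1, 9, 0)
VALUES = [
    (15, 60, "cloudy"), (15, 60, "cloudy"), (15, 60, "cloudy"),
    (16, 58, "cloudy"), (16, 58, "cloudy"), (16, 58, "sunny"),
    (16, 58, "sunny"), (17, 55, "sunny"), (17, 55, "sunny"),
]
OBSERVED = [
    (START + timedelta(minutes=10 * i), temp, hum, status)
    for i, (temp, hum, status) in enumerate(VALUES)
]


def sample_by_interval(readings, every_n):
    """Keep every n-th reading, whether or not anything changed."""
    return readings[::every_n]


def sample_by_change(readings):
    """Keep a reading only if a non-timestamp field differs from the last kept one."""
    kept = []
    for reading in readings:
        if not kept or reading[1:] != kept[-1][1:]:
            kept.append(reading)
    return kept


def distinct_states(readings):
    """Number of different (temperature, humidity, status) combinations stored."""
    return len({reading[1:] for reading in readings})


def to_csv(readings):
    """Serialise readings as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["timestamp", "temperature_c", "humidity_pct", "status"])
    for ts, temp, hum, status in readings:
        writer.writerow([ts.isoformat(), temp, hum, status])
    return buffer.getvalue()


# Compare storage cost (rows, bytes) against information kept (distinct states).
for every_n in (1, 2, 3):
    rows = sample_by_interval(OBSERVED, every_n)
    print("interval", every_n, len(rows), len(to_csv(rows)), distinct_states(rows))

change_rows = sample_by_change(OBSERVED)
print("change", len(change_rows), len(to_csv(change_rows)), distinct_states(change_rows))
```

In a real run, `OBSERVED` would be replaced by values parsed from the chosen website, and `to_csv` would write to a file instead of a string. The same row and byte counts can then serve as the numeric results the task asks for.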

Task 2. Data Integration

Use the optimal method you demonstrated above to collect weather report data from more than one website, integrate the data from the different sources (websites) and write the integrated data into a CSV file. Please demonstrate:

(1) how to do schema alignment and

(2) how to determine which value is correct if data from different sources do not agree with each other.

(Please do a survey of the existing techniques and use one to resolve the problem. Lecture 2 provides some basic concepts, and you may do a broader search yourselves.)
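One possible way to approach both points is sketched below. It is an assumption-laden illustration, not the method prescribed by the lecture: the three source records, their field names and the mapping table `SCHEMA_MAPS` are all hypothetical. Schema alignment is done by renaming fields and converting units into one target schema. Conflicts are then resolved with two simple data fusion rules, the median for numeric fields and a majority vote for text fields.

```python
from collections import Counter
from statistics import median

# Hypothetical records from three sites with different field names and units.
RAW = {
    "site_a": {"time": "2018-04-01T11:00", "temp_c": 15.0,
               "humidity": 60, "conditions": "Cloudy"},
    "site_b": {"timestamp": "2018-04-01T11:00", "temperature_f": 60.8,
               "rh_percent": 62, "status": "cloudy"},
    "site_c": {"observed_at": "2018-04-01T11:00", "temp": 15.0,
               "hum": 61, "weather": "Sunny"},
}


def f_to_c(value):
    """Convert Fahrenheit to Celsius, rounded to one decimal place."""
    return round((value - 32) * 5 / 9, 1)


# Schema alignment: source field -> (target field, converter).
SCHEMA_MAPS = {
    "site_a": {"time": ("timestamp", str), "temp_c": ("temperature_c", float),
               "humidity": ("humidity_pct", float), "conditions": ("status", str.lower)},
    "site_b": {"timestamp": ("timestamp", str), "temperature_f": ("temperature_c", f_to_c),
               "rh_percent": ("humidity_pct", float), "status": ("status", str.lower)},
    "site_c": {"observed_at": ("timestamp", str), "temp": ("temperature_c", float),
               "hum": ("humidity_pct", float), "weather": ("status", str.lower)},
}


def align(source, record):
    """Rename fields and convert units so every source shares one schema."""
    mapping = SCHEMA_MAPS[source]
    return {target: convert(record[field])
            for field, (target, convert) in mapping.items()}


def resolve(records):
    """Fuse aligned records: median for numbers, majority vote for text."""
    fused = {}
    for key in records[0]:
        values = [record[key] for record in records]
        if all(isinstance(value, (int, float)) for value in values):
            fused[key] = median(values)
        else:
            fused[key] = Counter(values).most_common(1)[0][0]
    return fused


aligned = [align(source, record) for source, record in RAW.items()]
integrated = resolve(aligned)
print(integrated)
```

Median and majority vote treat all sources as equally reliable. Techniques that weight sources by estimated accuracy are an alternative worth covering in the survey.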

Task 3. Missing Data Prediction

Using the data you collected in Tasks 1 and 2, design a method to predict a missing data object, for example one that falls between two consecutive data objects (time, temperature) in your CSV file, as below:

11:00AM, 15
12:00PM, 17

The user wants to query the temperature at 11:30AM.

(Please do a survey of the existing techniques and use one to resolve the problem. Lecture 3 provides some basic concepts, and you may do a broader search yourselves.)
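As a baseline for the example above, linear interpolation is one of the simplest candidate techniques. The sketch below assumes the temperature changes at a constant rate between the two recorded points. The function names are hypothetical, and other methods found in the survey (for example spline or regression-based prediction) may fit real data better.

```python
from datetime import datetime


def parse_time(text):
    """Parse a clock time such as '11:30AM' (assumes an English locale)."""
    return datetime.strptime(text, "%I:%M%p")


def interpolate(t0, v0, t1, v1, query):
    """Estimate the value at `query` on the straight line through (t0, v0) and (t1, v1)."""
    start, end, when = parse_time(t0), parse_time(t1), parse_time(query)
    if not start < end:
        raise ValueError("t0 must be earlier than t1")
    if not start <= when <= end:
        raise ValueError("query time must lie between t0 and t1")
    fraction = (when - start) / (end - start)  # share of the interval elapsed
    return v0 + fraction * (v1 - v0)


# The two data objects from the example: (11:00AM, 15) and (12:00PM, 17).
estimate = interpolate("11:00AM", 15, "12:00PM", 17, "11:30AM")
print(estimate)  # 16.0
```

Because 11:30AM is halfway between the two timestamps, the estimate is halfway between 15 and 17.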

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd