Create a document that directly addresses the overview of the data

Reference no: EM131672504

Project details:

This project is connected to the Data Wrangling course. You have the choice between two databases for this project: SQL and MongoDB. For an explanation of the differences between these two databases, see this node. There are separate instructions where relevant below for each database choice.

Here's what you should do:

Step One - Complete Programming Exercises

Make sure all programming exercises are solved correctly in the "Case Study: OpenStreetMap Data" Lesson in the course you have chosen (MongoDB or SQL). This is the last lesson in that section.

Step Two - Review the Rubric and Sample Project

The Project Rubric will be used to evaluate your project. Your project will need to Meet Specifications for all the criteria listed. Here are examples of what your final report could look like:

SQL Sample Project

MongoDB Sample Project

Step Three - Choose Your Map Area

Choose any area of the world and download an XML OSM dataset. The dataset should be at least 50MB in size (uncompressed). We recommend using one of the following methods of downloading a dataset:

Download a preselected metro area from Mapzen.

Use the Overpass API to download a custom square area. An explanation of the syntax can be found in the wiki. In general, you will want to use the following query:

(node(minimum_latitude, minimum_longitude, maximum_latitude, maximum_longitude);<;);out meta;

For example:

(node(51.249,7.148,51.251,7.152);<;);out meta;

The meta option is included so that the elements contain timestamp and user information. You can use the OpenStreetMap Export Tool to find the coordinates of your bounding box. Note: You will not be able to use the Export Tool to actually download the data, because the area required for this project is too large.
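The query pattern above can be assembled from the four bounding-box coordinates. The following is a minimal sketch; the function name build_overpass_query is an illustrative assumption and is not part of the course material or of the Overpass API itself.

```python
def build_overpass_query(min_lat, min_lon, max_lat, max_lon):
    """Return an Overpass QL query for all nodes in a bounding box.

    The recurse-up operator (<) also pulls in the ways and relations
    that reference those nodes; "out meta" adds timestamp and user data.
    """
    if min_lat >= max_lat or min_lon >= max_lon:
        raise ValueError("minimum coordinates must be smaller than maximum ones")
    bbox = f"{min_lat},{min_lon},{max_lat},{max_lon}"
    return f"(node({bbox});<;);out meta;"


# Reproduces the example query from the project description.
query = build_overpass_query(51.249, 7.148, 51.251, 7.152)
print(query)
```

The resulting string is what you would submit to the Overpass API; how you send it (browser form, command-line tool, or script) is left open here.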

Step Four - Process your Dataset

It is recommended that you start with the problem sets in your chosen course and modify them to suit your chosen dataset. As you work through the data, take note of problems encountered along the way, as well as issues with the dataset. You will need these when you write your project report.

SQL

Thoroughly audit and clean your dataset, converting it from XML to CSV format. Then import the cleaned .csv files into a SQL database using this schema or a custom schema of your choice.
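As a rough illustration of the import step, the sketch below loads a cleaned CSV into SQLite using only the Python standard library. The table name nodes_tags, its columns, and the sample rows are assumptions made up for this example; your own files and the schema you choose will differ.

```python
import csv
import io
import sqlite3

# Hypothetical stand-in for a cleaned CSV file with made-up rows; a real
# project would open the file written by its own conversion script.
CLEANED_CSV = io.StringIO(
    "id,key,value,type\n"
    "1,street,Main Street,addr\n"
    "1,city,Exampletown,addr\n"
    "2,amenity,cafe,regular\n"
)


def load_tags(conn, csv_file):
    """Create an illustrative nodes_tags table and fill it from csv_file."""
    conn.execute(
        "CREATE TABLE nodes_tags (id INTEGER, key TEXT, value TEXT, type TEXT)"
    )
    reader = csv.DictReader(csv_file)
    rows = [(int(r["id"]), r["key"], r["value"], r["type"]) for r in reader]
    conn.executemany("INSERT INTO nodes_tags VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")  # use a file path for a persistent database
inserted = load_tags(conn, CLEANED_CSV)
print(inserted, "rows imported")
```

Parameterized inserts (the ? placeholders) avoid quoting problems when tag values contain commas or apostrophes.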

MongoDB

Thoroughly audit and clean your dataset, converting it from XML to JSON format. Then import the cleaned .json file into a MongoDB database.
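One possible shape for the XML-to-JSON conversion is sketched below. The function name shape_element, the document layout (type, id, pos, tags), and the sample XML are assumptions chosen for illustration; the Case Study lesson may prescribe a different structure.

```python
import json
import xml.etree.ElementTree as ET


def shape_element(element):
    """Turn a node or way element into a plain dict; skip anything else."""
    if element.tag not in ("node", "way"):
        return None
    doc = {"type": element.tag, "id": element.get("id")}
    if element.get("lat") is not None and element.get("lon") is not None:
        doc["pos"] = [float(element.get("lat")), float(element.get("lon"))]
    # Collect the k/v pairs of all child <tag> elements.
    tags = {t.get("k"): t.get("v") for t in element.iter("tag")}
    if tags:
        doc["tags"] = tags
    return doc


# Tiny made-up OSM fragment used only to exercise the function.
SAMPLE_XML = """<osm>
  <node id="1" lat="51.25" lon="7.15">
    <tag k="amenity" v="cafe"/>
  </node>
  <way id="2"><nd ref="1"/></way>
</osm>"""

docs = [d for d in map(shape_element, ET.fromstring(SAMPLE_XML)) if d]
json_lines = "\n".join(json.dumps(d, sort_keys=True) for d in docs)
print(json_lines)
```

Writing one JSON document per line keeps the output easy to stream into a database import tool, though the exact import command depends on your MongoDB setup.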

Hints and Tips - Feel free to adapt the code from the Case Study lesson to help you approach the auditing of your data. Creating a new script for each aspect of your dataset that you audit will help you stay organized. For each field that you audit, also include a function that will help you update your dataset.
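The pairing of an audit function with an update function might look like the sketch below, which uses street names as the audited field. The expected street types, the abbreviation mapping, and the sample names are all hypothetical; the problems you find in your own map area will determine the real values.

```python
import re
from collections import defaultdict

# Matches the last whitespace-separated token, e.g. "St." in "Main St."
STREET_TYPE_RE = re.compile(r"\S+$")

# Hypothetical values; adjust them to what the audit of your area reveals.
EXPECTED = {"Street", "Avenue", "Road", "Drive"}
MAPPING = {"St": "Street", "St.": "Street", "Ave": "Avenue", "Rd.": "Road"}


def audit_street_types(street_names):
    """Group street names by any street type that is not in EXPECTED."""
    unexpected = defaultdict(set)
    for name in street_names:
        match = STREET_TYPE_RE.search(name.strip())
        if match and match.group() not in EXPECTED:
            unexpected[match.group()].add(name)
    return unexpected


def update_street_name(name):
    """Replace a known abbreviation at the end of a street name."""
    name = name.strip()
    match = STREET_TYPE_RE.search(name)
    if match and match.group() in MAPPING:
        return name[:match.start()] + MAPPING[match.group()]
    return name


names = ["Main St.", "Oak Avenue", "Elm Ave", "Loop 410"]
problems = audit_street_types(names)
cleaned = [update_street_name(n) for n in names]
print(dict(problems))
print(cleaned)
```

Note that the audit also flags "Loop 410", which is not an abbreviation at all. Cases like this are worth recording for the problems section of your report, even when the update function leaves them unchanged.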

Step Five - Explore your Database

After building your local database, you'll explore your data by running queries. Make sure to document these queries and their results in the submission document described below. See the Project Rubric for more information about query expectations.
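As an example of the kind of query you might document, the sketch below counts contributions per user across two tables in SQLite. The tables nodes and ways, their user column, and the inserted rows are assumptions for illustration only; substitute the tables from the schema you actually used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, heavily simplified tables with made-up rows.
conn.execute("CREATE TABLE nodes (id INTEGER PRIMARY KEY, user TEXT)")
conn.execute("CREATE TABLE ways (id INTEGER PRIMARY KEY, user TEXT)")
conn.executemany(
    "INSERT INTO nodes VALUES (?, ?)", [(1, "alice"), (2, "bob"), (3, "alice")]
)
conn.executemany("INSERT INTO ways VALUES (?, ?)", [(10, "alice"), (11, "carol")])

# Top contributors across nodes and ways combined.
TOP_USERS = """
SELECT user, COUNT(*) AS contributions
FROM (SELECT user FROM nodes UNION ALL SELECT user FROM ways)
GROUP BY user
ORDER BY contributions DESC, user
LIMIT 10;
"""

# Number of distinct contributors.
UNIQUE_USERS = """
SELECT COUNT(DISTINCT user)
FROM (SELECT user FROM nodes UNION ALL SELECT user FROM ways);
"""

top_users = conn.execute(TOP_USERS).fetchall()
unique_users = conn.execute(UNIQUE_USERS).fetchone()[0]
print(top_users)
print(unique_users)
```

Recording both the query text and its result, as done here with the two printed values, makes the queries straightforward to copy into the report.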

Step Six - Document your Work

Create a document (pdf, html) that directly addresses the following sections from the Project Rubric.

Problems encountered in your map

Overview of the Data

Other ideas about the datasets

Try to include snippets of code and problematic tags (see MongoDB Sample Project or SQL Sample Project) and visualizations in your report if they are applicable.

Use the following code to take a systematic sample of elements from your original OSM region. Try changing the value of k so that your resulting SAMPLE_FILE ends up at different sizes. When starting out, try using a larger k, then move on to an intermediate k before processing your whole dataset.
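A minimal sketch of such a systematic sample is given here, written from scratch with the Python standard library; it is not the course-supplied code. The function names get_elements and write_sample and the in-memory demo data are assumptions. In practice you would pass opened files (binary mode) for your OSM file and SAMPLE_FILE.

```python
import io
import xml.etree.ElementTree as ET


def get_elements(source, tags=("node", "way", "relation")):
    """Yield top-level OSM elements one at a time to keep memory use low."""
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # first event is the start of the <osm> root
    for event, elem in context:
        if event == "end" and elem.tag in tags:
            yield elem
            root.clear()  # drop elements that have already been handled


def write_sample(source, output, k):
    """Write every k-th top-level element of source to output (binary)."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    output.write(b'<?xml version="1.0" encoding="UTF-8"?>\n<osm>\n')
    kept = 0
    for i, element in enumerate(get_elements(source)):
        if i % k == 0:
            # Default serialization is ASCII-safe bytes without a declaration.
            output.write(ET.tostring(element))
            kept += 1
    output.write(b"</osm>\n")
    return kept


# Made-up input with ten nodes, standing in for a real OSM file.
SOURCE = io.BytesIO(
    b"<osm>"
    + b"".join(b'<node id="%d" lat="0" lon="0"/>' % i for i in range(10))
    + b"</osm>"
)
sample = io.BytesIO()
kept = write_sample(SOURCE, sample, k=3)
print(kept, "elements kept")
```

A larger k keeps fewer elements and so produces a smaller sample, which is why starting with a large k and then decreasing it gives progressively bigger files.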

Attachment:- Assignment Files.rar


Reviews

len1672504

10/9/2017 2:08:39 AM

This assignment is part of a Udacity Nanodegree, for which I need to submit this project using the map of San Antonio, TX. They have some kind of Turnitin program to check whether I used someone else's code; if they find that I did, they will kick me out of the program and the scholarship. So please do the project from scratch, and if you can't do it, please inform me. Do not use any other code from the internet. I need this project done by Tuesday, using Python 3 in a Jupyter notebook. If you do it using SQL, that would be better.

len1672504

10/9/2017 2:08:27 AM

You may want to start out by looking at a smaller sample of your region first when auditing it to make it easier to iterate on your investigation. See code in the notes below for how to do this. You can use a small (1-10MB) sample to make sure that your code works, and then an intermediate sample to check for the most common problems to clean. Remember to perform data cleaning when you convert the XML into CSV or JSON format. You won't change the original data file, only the data that you plan on inserting into your database. This is where your earlier organization will pay off, since you can just import the update functions from your auditing scripts into the cleaning and conversion script.


All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd