Create a new table called custsum that you also write to xyz

Assignment Help Python Programming
Reference no: EM131564517

Assignment

In this exercise you will create some data "assets" for use by the XYZ company in direct marketing campaigns. You will download data from the SSCC server to your local computer, create relational database tables that you'll store locally, create a "flat" file with selected customers and variables, and report on purchasing by gender. You'll document your work by providing your commented code. You'll save the new assets you created for future use and for sharing with others.

To do this assignment you'll be using the SSCC and a Postgres server running on it, Python and the pandas package, and the sqlite database. You'll use Python and pandas for data manipulation and reporting.

Getting Your Data from the SSCC

XYZ's data are in a Postgres DB server on the SSCC. To get them, log in to the dornick server. You'll need to do this using a VPN client if you are off campus. Then, connect to the Postgres database server.

The Working with the SSCC pdf document and the SSCC Cheat Sheet pdf on Canvas provide some information about how to connect to the SSCC and to get your data from the Postgres DB.

You'll find three (3) tables in the Postgres DB pilot schema that are named item, mail, and customer. Export each table as a csv file with a header record, using a temporary view to do each one. Using psql is the easiest way to do this. After you are done, delete each temporary view.

Download your three csv files to your computer so that you can work with the data in them using Python. Don't forget to log off from dornick.

You'll find documentation about XYZ's data (attached). Note that like most real world documentation, it's not "perfect." But it is the real thing.

Do These Things

Once you have your csv files on your computer, do the following five (5) things, most of which have "subthings:"

#1) Import each of the csv files you downloaded from the SSCC into a pandas DataFrame.

(a) Provide the code you used to do this.
(b) Print out the column names of your item DataFrame and the first four (4) records in it.
(c) Decribe the data types of the columns in the DataFrame.

Include your commented code for each of the above.

#2) Write each of you pandas DataFrames to a local SQLite DB named xyz.db. Include only data for active buyers in these tables. Verify that you have written the tables to your SQLite DB correctly.

(Commented code, of course.)

#3) Now, using the same data as you used for 2, above, create a new table called custSum that you also write to xyz.db, and that has the following characteristics. This table should have one row per customer record.

(a) Include on each customer's record a binary, Y or N, indicator of whether the customer is a 'heavy buyer,' where the definition of a 'heavy buyer' is a customer whose YTD purchasing in 2009 is greater than 90% of the 2009 YTD purchasing of all customers who are active buyers. Verify your coding of this new variable by crosstabulating it with an indicator of whether their 2009 YTD purchasing is at at least the 90th percentile of all 2009 YTD purchasing.

(b) Add to each customer's record whether the customer has the following credit cards: AMEX, Discover, VISA, and Mastercard, with each credit card variable codes as a Y or a N for yes or no, respectively. Document your creation of these codes by showing how they are related to the code values in the data

(c) Add to each customer's record their estimated HH income, and the genders of adults "1" and "2."

(d) Add to each customer's record their ZIP code and ZIP+4 code.

(e) Be sure to include the account number on each record in the SQL tables you create so that the tables can be related to each other, later.

(f) Provide a count of the number of records in each table.

(g) Verify that you have written this table to your SQLite DB correctly.

(Don't forget to comment your code so that a reader can understand what it is supposed to do.)

#4) Create a new pandas DataFrame of data that will be used for target maketing and write it out to a headered csv file.

(a) This DataFrame should have one row per customer. The customers included should be active buyers or lapsed buyers.

(b) The row for each customer should include the customer's account identifier, and an indicator variable (Y/N, or 1/0) for each product category the customer has made at least one purchase in.

(c) Include for each customer their buyer status, and the total dollar amount of the purchases they have made from XYZ using all data available for him or her.

(d) Write your DataFrame to a csv file, and also store it in a shelve database.

(e) Verify that the files you wrote your customer DataFrame to were written correctly.

(Commented code, of course.)

#5) Report the six (6) most frequently purchased product categories by the gender of "adult 1" using the data for the active customers. Include for these categories the total spend in dollars on each category, the total number of products purchased in these categories, and the number of adults in each gender category.

(Be sure to comment your code.)

Your Deliverables

Provide the above in up to six pages, but in no more than 7 pages, in a pdf file. Be sure that everything is readable. Address each of the five above parts in turn. Do 1 by providing your commented code and results. Then do 2 providing code + results, and so on.

Do not provide a list of code for all of the above items in a block, followed by the results of your code in a block. An assignment organized in this way will be returned ungraded. Be sure all of your code is syntactically correct, and that it approximates good Python coding style.

Reference no: EM131564517

Questions Cloud

Market failures and externalities : Prepare Three to Five page economic paper on Market Failures and Externalities. You have the opportunity to include a minimum.
Leasing is very common practice with expensive : Leasing is a very common practice with expensive, high-tech equipment.
Entering a newly established perfectly competitive market : A businessman is trying to discourage his partner from entering a newly established perfectly competitive market arguing that such a market even if profitable.
Steadily increasing cost of college tuition : Due in part to the economy and in part to the steadily increasing cost of college tuition, some schools are considering dropping classes.
Create a new table called custsum that you also write to xyz : create a new table called custSum that you also write to xyz.db, and that has the following characteristics. This table should have one row per customer record.
Anticipate paying taxes for the next several years : Assume that your company does not anticipate paying taxes for the next several years. What is the NAL of the lease?
What was the maximum height of helium-filled balloon rose : A helium-filled balloon rose 120 ft in 1.0 min. Each minute after that, it rose 75% as much as in the previous minute. What was its maximum height?
Dylan enjoys ramen noodles and music downloads : Dylan enjoys ramen noodles and music downloads. His monthly budget for these two items is 16 dollars. If the price of a package of ramen noodles is $1.42.
How far does the bicycle travel in coasting to a stop : A bicyclist traveling at 10m/s then coasts to a stop as the bicycle travels 0.90 as far each second as in the previous second.

Reviews

Write a Review

Python Programming Questions & Answers

  Write a python script called passmerger that merge contents

CSCI 3351: Write a Python script called passmerger that will merge the contents of two simplified user account tables in a database as described below. The script uses one command line argument that represents the name of the database file.

  User to do basic trigonometry functions

Create a program that allows the user to do basic trigonometry functions. First, ask the user if he or she would like to do sine, cosine, or tangent (sin, cos, tan).

  Calculate the accuracy of your linear classifier

Plot X,Y and the decision boundary. Make sure that you use a good plotting technique so that it is easy to distinguish which datapoint is X and which is Y. Calculate the accuracy of your linear classifier.

  Write python code that takes word and converts to pig latin

Write Python code that takes a word and converts it to Pig Latin. If the input consists of multiple words or contains punctuation, your code should print a suitable error message

  Calculate the problem and stop after the condition

Write a program that will calculate the problem and stop after the condition has been met.

  Write a program to receive a series of numbers

Write a program to receive a series of numbers (including decimal) from the user until enter key is pressed. Process the input data and display number count, sum and average. Use proper data type and format.

  Write program that is capable of generating a set words

You will write a Python program that is capable of generating a likely set of completion words given the start of a word as input to the program.

  Determines number of occurrences of a given letter in string

In this last checkpoint, you will write a new program that determines the number of occurrences of a given letter in a string. To do so, you are only allowed to use the replace() and len() functions.

  Write a program that produces a comparison of the balance

Write a program that produces a comparison of the balance of a bank account over time between simple, monthly and daily methods of compounding interest.

  Develop and test a python program

develop and test a python program to track student marks in the various assignments in a given unit of study. The time allocated for this project is 10 weeks.

  Email spam filter

Analyze the emails and predict whether the mail is a spam or not a spam - Create a training file and copy the text of several mails and spams in to it And create a test set identical to the training set but with different examples.

  Displays the charges for a patients hospital stay

Write a program that computes and displays the charges for a patient's hospital stay. First, the program should ask patient name and if the patient was admitted as an in-patient or an out-patient.

Free Assignment Quote

Assured A++ Grade

Get guaranteed satisfaction & time on delivery in every assignment order you paid with us! We ensure premium quality solution document along with free turntin report!

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd