Create a new pandas dataframe of data that will be used

Assignment Help Python Programming
Reference no: EM131555254

Assignment: Python and postgres

In this exercise you will create some data "assets" for use by the XYZ company in direct marketing campaigns. You will download data from the SSCC server to your local computer, create relational database tables that you'll store locally, create a "flat" file with selected customers and variables, and report on purchasing by gender. You'll document your work by providing your commented code. You'll save the new assets you created for future use and for sharing with others.

To do this assignment you'll be using the SSCC and a Postgres server running on it, Python and the pandas package, and the sqlite database. You'll use Python and pandas for data manipulation and reporting.

Getting Your Data from the SSCC

XYZ's data are in a Postgres DB server on the SSCC. To get them, log in to the dornick server. You'll need to do this using a VPN client if you are off campus. Then, connect to the Postgres database server.

The Working with the SSCC pdf document and the SSCC Cheat Sheet pdf on Canvas provide some information about how to connect to the SSCC and to get your data from the Postgres DB.

You'll find three tables in the Postgres DB pilot schema that are named item, mail, and customer. Export each table as a csv file with a header record, using a temporary view to do each one. Using psql is the easiest way to do this. After you are done, delete each temporary view.

Download your three csv files to your computer so that you can work with the data in them using Python. Don't forget to log off from dornick.

You'll find documentation about XYZ's data (attached). Note that like most real world documentation, it's not "perfect." But it is the real thing.

Do These Things

Once you have your csv files on your computer, do the following five (5) things, most of which have "subthings:"

#1) Import each of the csv files you downloaded from the SSCC into a pandas DataFrame.

(a) Provide the code you used to do this.
(b) Print out the column names of your item DataFrame and the first four (4) records in it.
(c) Decribe the data types of the columns in the DataFrame.

Include your commented code for each of the above.

#2) Write each of you pandas DataFrames to a local SQLite DB named xyz.db. Include only data for active buyers in these tables. Verify that you have written the tables to your SQLite DB correctly.

(Commented code, of course.)

#3) Now, using the same data as you used for 2, above, create a new table called custSum that you also write to xyz.db, and that has the following characteristics. This table should have one row per customer record.

(a) Include on each customer's record a binary, Y or N, indicator of whether the customer is a 'heavy buyer,' where the definition of a 'heavy buyer' is a customer whose YTD purchasing in 2009 is greater than 90% of the 2009 YTD purchasing of all customers who are active buyers. Verify your coding of this new variable by crosstabulating it with an indicator of whether their 2009 YTD purchasing is at at least the 90th percentile of all 2009 YTD purchasing.

(b) Add to each customer's record whether the customer has the following credit cards: AMEX, Discover, VISA, and Mastercard, with each credit card variable codes as a Y or a N for yes or no, respectively. Document your creation of these codes by showing how they are related to the code values in the data

(c) Add to each customer's record their estimated HH income, and the genders of adults "1" and "2."

(d) Add to each customer's record their ZIP code and ZIP+4 code.

(e) Be sure to include the account number on each record in the SQL tables you create so that the tables can be related to each other, later.

(f) Provide a count of the number of records in each table.

(g) Verify that you have written this table to your SQLite DB correctly.

(Don't forget to comment your code so that a reader can understand what it is supposed to do.)

#4) Create a new pandas DataFrame of data that will be used for target maketing and write it out to a headered csv file.

(a) This DataFrame should have one row per customer. The customers included should be active buyers or lapsed buyers.

(b) The row for each customer should include the customer's account identifier, and an indicator variable (Y/N, or 1/0) for each product category the customer has made at least one purchase in.

(c) Include for each customer their buyer status, and the total dollar amount of the purchases they have made from XYZ using all data available for him or her.

(d) Write your DataFrame to a csv file, and also store it in a shelve database.

(e) Verify that the files you wrote your customer DataFrame to were written correctly.

(Commented code, of course.)

#5) Report the six most frequently purchased product categories by the gender of "adult 1" using the data for the active customers. Include for these categories the total spend in dollars on each category, the total number of products purchased in these categories, and the number of adults in each gender category.

(Be sure to comment your code.)

Your Deliverables

Provide the above in up to six (6) pages, but in no more than 7 pages, in a pdf file. Be sure that everything is readable. Address each of the five above parts in turn. Do 1 by providing your commented code and results. Then do 2 providing code + results, and so on.

Do not provide a list of code for all of the above items in a block, followed by the results of your code in a block. An assignment organized in this way will be returned ungraded. Be sure all of your code is syntactically correct, and that it approximates good Python coding style.

Reference no: EM131555254

Questions Cloud

Create a table or matrix to perform evaluation comparison : Create a table or matrix to perform your evaluation comparison. describe in detail the evaluation method that you plan to use to compare and contrast 3 options.
What computations with q and r will produce the solution : Provide a glimpse of some widely used matrix factorizations, some of which are discussed later in the text.
Testing the advertising claims of a tire manufacturer : You work for a consumer watchdog publication and are testing the advertising claims of a tire manufacturer.
How the issue might affect the role or duty of the police : Identify how the issue might affect the role or duty of the police, judiciary, or corrections.
Create a new pandas dataframe of data that will be used : Provide a count of the number of records. Create a new pandas DataFrame of data that will be used for target maketing and write it out to a headered csv file.
Describe the functions or commands of a matrix program : For block operations, it may be necessary to access or enter submatrices of a large matrix.
How a single court ruling can have systemic effects : Select a Supreme Court case from the list provided below, outline details of the case and its deposition, and trace the ruling's repercussions .
Describe the commands or operations of the matrix program : Suppose memory or size restrictions prevent a matrix program from working with matrices having more than 32 rows and 32 columns.
What are the most important features that you would look : Explain why is this critical. What are the most important features (at least 5) that you would look for in a tool to capture evidence correctly?

Reviews

Write a Review

Python Programming Questions & Answers

  Write a python program to implement the diff command

Without using the system() function to call any bash commands, write a python program that will implement a simple version of the diff command.

  Write a program for checking a circle

Write a program for checking a circle program must either print "is a circle: YES" or "is a circle: NO", appropriately.

  Prepare a python program

Prepare a Python program which evaluates how many stuck numbers there are in a range of integers. The range will be input as two command-line arguments.

  Python atm program to enter account number

Write a simple Python ATM program. Ask user to enter their account number, and print their initail balance. (Just make one up). Ask them if they wish to make deposit or withdrawal.

  Python function to calculate two roots

Write a Python function main() to calculate two roots. You must input a,b and c from keyboard, and then print two roots. Suppose the discriminant D= b2-4ac is positive.

  Design program that asks user to enter amount in python

IN Python Design a program that asks the user to enter the amount that he or she has budget in a month. A loop should then prompt the user to enter his or her expenses for the month.

  Write python program which imports three dictionaries

Write a Python program called hours.py which imports three dictionaries, and uses the data in them to calculate how many hours each person has spent in the lab.

  Write python program to create factors of numbers

Write down a python program which takes two numbers and creates the factors of both numbers and displays the greatest common factor.

  Email spam filter

Analyze the emails and predict whether the mail is a spam or not a spam - Create a training file and copy the text of several mails and spams in to it And create a test set identical to the training set but with different examples.

  Improve the readability and structural design of the code

Improve the readability and structural design of the code by improving the function names, variables, and loops, as well as whitespace. Move functions close to related functions or blocks of code related to your organised code.

  Create a simple and responsive gui

Please use primarily PHP or Python to solve the exercise and create a simple and responsive GUI, using HTML, CSS and JavaScript.Do not use a database.

  The program is to print the time

The program is to print the time in seconds that the iterative version takes, the time in seconds that the recursive version takes, and the difference between the times.

Free Assignment Quote

Assured A++ Grade

Get guaranteed satisfaction & time on delivery in every assignment order you paid with us! We ensure premium quality solution document along with free turntin report!

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd