Reference no: EM13968124
This project addresses trend prediction for time-series data in an e-commerce setting. You are given a real dataset from a real-world e-commerce application with 1000 products and 31490 customers (i.e., buyers) who bought them. Of these 1000 products, 100 are key (popular) products, and the 1000 products fall into 15 categories. The data are provided in the seven tables described below. The dataset covers 119 days, with records for each day. The time unit is therefore one day, and the timeline runs from the 0-th day to the 118-th day (17 weeks in total). Your task is to predict the sale quantity of the 100 key products for each day from the 119-th day to the 146-th day (four weeks).
• buyer_basic_info.txt: the basic attribute information of the buyers; in particular, the column names of this table are "buyer_id", "registration_time", "seller_level", "buyer_level", "age", and "gender". If we do not know the gender of a buyer, we set this buyer's gender attribute as -1.
• buyer_historical_category15_quantity.txt: the consumption quantities in the 15 categories for the buyers; in particular, the column names of this table are "buyer_id", "consumption quantity in the 1st category", ..., and "consumption quantity in the 15th category". The 15 categories are the ones of the products the customers bought in this dataset.
• buyer_historical_category15_money.txt: the consumption amounts in the 15 categories for the buyers; in particular, the column names of this table are "buyer_id", "consumption amount in the 1st category", ..., and "consumption amount in the 15th category".
• product_features.txt: the basic attribute information of the products; in particular, the column names of this table are "product_id", "attribute_1", "attribute_2", and "original price".
• Key_product_IDs.txt: the key product IDs
• trade_info_training.txt: the trade information between the key products and the buyers from the 0-th day to the 118-th day (17 weeks); in particular, the column names of this table are "product_id", "buyer_id", "trade_time", "trade_quantity", and "trade_price".
• product_distribution_training_set.txt: this table has 120 columns; the 1-st column holds the "product_id" and the 2-nd to the 120-th columns hold the daily "quantities" of the key products from the 0-th day to the 118-th day. For example, the element at the 5-th row and the 10-th column is the quantity of the 5-th product on the 8-th day (column k corresponds to day k - 2).
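The column-to-day mapping of product_distribution_training_set.txt can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the file is assumed to be whitespace-separated with no header row and integer quantities, none of which the description confirms, and the function names `parse_distribution` and `quantity_at` are hypothetical.

```python
from typing import Dict, Iterable, List

NUM_DAYS = 119  # days 0..118, as stated in the table description


def parse_distribution(lines: Iterable[str]) -> Dict[int, List[int]]:
    """Map each product_id to its daily quantities, indexed by day."""
    table: Dict[int, List[int]] = {}
    for raw in lines:
        fields = raw.split()  # assumption: whitespace- or tab-separated
        if not fields:
            continue  # skip blank lines
        if len(fields) != NUM_DAYS + 1:
            raise ValueError(
                f"expected {NUM_DAYS + 1} columns, got {len(fields)}"
            )
        product_id = int(fields[0])
        # Column k (1-based) holds day k - 2, so fields[1:] is days 0..118.
        table[product_id] = [int(v) for v in fields[1:]]  # assumption: integers
    return table


def quantity_at(table: Dict[int, List[int]], product_id: int, day: int) -> int:
    """Return the quantity sold of one product on one day (0-based)."""
    return table[product_id][day]
```

With a real file, `parse_distribution(open("product_distribution_training_set.txt"))` would return one 119-element list per key product.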
For graduate students, you are asked to predict the overall sale quantity of the 100 key products for each day of the four weeks (i.e., for each day in the time window from the 119-th day to the 146-th day), and also the sale quantity of each key product for each day of the four weeks.
This phase is the coding part of the project and concerns the implementation of a time-series prediction method that you either take from the literature or have developed yourself as a result of your research in the first phase.
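As a point of reference only, a seasonal-naive baseline is one of the simplest methods from the forecasting literature that fits this setup. The sketch below is not the method the project prescribes; it assumes a weekly cycle (`period=7`) and averages the last four weeks, both of which are illustrative choices, and the function name `seasonal_naive_forecast` is hypothetical.

```python
from typing import List, Sequence


def seasonal_naive_forecast(
    history: Sequence[float],
    horizon: int = 28,   # four weeks: days 119..146
    period: int = 7,     # assumption: weekly seasonality
    weeks: int = 4,      # assumption: average over the last four weeks
) -> List[float]:
    """Forecast each future day as the mean of the same weekday
    over the most recent `weeks` weeks of the history."""
    window = period * weeks
    if len(history) < window:
        raise ValueError("history is shorter than the averaging window")
    recent = list(history[-window:])
    # recent[p::period] collects the `weeks` observations at weekday phase p.
    profile = [sum(recent[p::period]) / weeks for p in range(period)]
    # The first forecast day follows the last observed day, so it has the
    # same phase as recent[0] because the window is a whole number of weeks.
    return [profile[h % period] for h in range(horizon)]
```

Applied to one product's 119 daily quantities, this returns 28 values for the days from the 119-th to the 146-th. A stronger method would replace the averaging step, but the input and output shapes would stay the same.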
Please follow the output file format specified here. The text output file puts each prediction on one line; the first line is the overall prediction and each subsequent line is the prediction for one key product. Each line begins with the key product id, where the id for the overall prediction is 0. A space separates the product id from the first prediction, and a space separates the predictions of neighboring days. The lines are ordered as follows: the overall prediction (product id 0) first, then the key product with the smallest product id (i.e., 1), and so on through the last key product (i.e., id = 964). Note that for undergraduate students the output file has only one line, the overall prediction beginning with product id = 0.
What you need to turn in: a zipped package containing the source code of your implementation of the prediction method, with appropriate comments and documentation in the code; a README file explaining how to compile and run your code and under what specific environment; and a text file containing the output matrix in exactly the format stated above.
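The output format can be sketched in Python as follows. This is a minimal sketch, assuming a 28-day horizon per line and that values are written with Python's default string conversion; the text does not say whether predictions must be integers, and the function name `format_prediction_lines` is hypothetical.

```python
from typing import Dict, List, Sequence

HORIZON = 28  # days 119..146


def format_prediction_lines(
    overall: Sequence[float],
    per_product: Dict[int, Sequence[float]],
) -> List[str]:
    """Build the output lines: id 0 (overall) first, then key products
    in ascending product id, with single spaces between all fields."""
    rows = [(0, overall)] + [(pid, per_product[pid]) for pid in sorted(per_product)]
    lines = []
    for pid, values in rows:
        if len(values) != HORIZON:
            raise ValueError(f"product {pid}: expected {HORIZON} values")
        lines.append(" ".join([str(pid)] + [str(v) for v in values]))
    return lines
```

Passing an empty dictionary for `per_product` yields the single-line file described for undergraduate students. The lines can then be written out with `"\n".join(lines)`.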
Attachment:- project.zip