Reference no: EM132930217
Part 1
Tasks for LOMA [Review LOMA Tutorial 6, 7]
Total Hotel Charges
1. You are assigned to work out the total charges (inclusive of GST and service charge) for one booking in the dataset.
The booking number you are assigned to is based on the last three digits of your matriculation number + 1000 (e.g. if your matric number is 2112345A, then booking number assigned = 345 + 1000 = 1345).
Based on this booking (row) number, look up the average daily rate (adr), stay_in_weekend_nights and stay_in_week_nights for that row in the data file.
Using a calculator, compute the total charges inclusive of GST and service charges as follows:
(i) Length of stay (los) = stay_in_weekend_nights + stay_in_week_nights;
(ii) Total before GST and service charges = average daily rate (adr) * length of stay (los);
(iii) Multiply (ii) above by 1.07 to get Total charges inclusive of GST;
(iv) Multiply (iii) above by 1.1 to get Total charges inclusive of GST and service charges. Truncate your answer to an integer.
Be sure to show your step-by-step working.
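Steps (i) to (iv) above can be cross-checked with a short script. This is a minimal sketch, not a substitute for the calculator working: the function name `total_charges`, its parameters, and the sample booking values are hypothetical and should be replaced with the values from your assigned row.

```python
import math

def total_charges(adr, weekend_nights, week_nights,
                  gst_percent=7.0, service_percent=10.0):
    """Return the truncated total inclusive of GST and service charge."""
    los = weekend_nights + week_nights                      # (i) length of stay
    subtotal = adr * los                                    # (ii) before GST and service charge
    with_gst = subtotal * (1 + gst_percent / 100)           # (iii) multiply by 1.07 for 7% GST
    with_service = with_gst * (1 + service_percent / 100)   # (iv) multiply by 1.1 for service charge
    return math.trunc(with_service)                         # truncate, do not round

# Hypothetical booking row: adr = 75.50, 1 weekend night, 3 week nights.
print(total_charges(75.50, 1, 3))  # 355
```

Passing a different `gst_percent` reuses the same steps for a revised GST rate. Floating-point arithmetic can be off by a tiny amount, so a result that lands extremely close to a whole number is worth confirming by hand.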
2. The government is planning to hike the GST to k% by 2025. The value of k is based on the following formula :
$A(x) = \lceil e^{(3x+10)/10} \rceil \bmod 8$
$B(y) = \log_{200}(2y^2 + 8)$
$k = \lfloor A(x) + B(y) + 7 \rfloor$
Using the last two digits of your matriculation number to represent x and y respectively, find the value of k using a calculator (E.g. if your matric number is 2112345A, then use x = 4 and y = 5). Be sure to show your step-by-step working.
Re-compute the total charges inclusive of GST and service charges based on k (new GST rate). Truncate your answer to an integer.
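The calculation of k can be checked with the sketch below. It assumes the exponent in A is (3x + 10)/10, that the outer brackets in A are a ceiling, that B is a logarithm to base 200, and that k is the floor of the sum. The function names are illustrative only.

```python
import math

def A(x):
    # ceil(e^((3x + 10) / 10)) mod 8
    return math.ceil(math.exp((3 * x + 10) / 10)) % 8

def B(y):
    # log base 200 of (2y^2 + 8)
    return math.log(2 * y**2 + 8, 200)

def gst_rate_k(x, y):
    # floor(A(x) + B(y) + 7)
    return math.floor(A(x) + B(y) + 7)

# Example from the brief: matric number 2112345A gives x = 4, y = 5.
print(A(4), gst_rate_k(4, 5))  # 2 9
```

For this example, e^2.2 is about 9.025, so the ceiling is 10 and 10 mod 8 = 2. B(5) = log200(58) is about 0.766, giving k = floor(2 + 0.766 + 7) = 9.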
3. Do you think :
a. Function A : Z → Z defined by $A(x) = \lceil e^{(3x+10)/10} \rceil \bmod 8$ is a one-to-one function?
b. Function B : Z → R defined by $B(y) = \log_{200}(2y^2 + 8)$ is an onto function?
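One way to explore question (a) is to tabulate A over a small range of integers and look for two inputs that share an output. The sketch below assumes the same reading of A as a ceiling of e^((3x + 10)/10) taken mod 8. The helper `find_collisions` is a hypothetical name, and a brute-force search like this can only find counterexamples, not prove that none exist.

```python
import math
from collections import defaultdict

def A(x):
    # ceil(e^((3x + 10) / 10)) mod 8
    return math.ceil(math.exp((3 * x + 10) / 10)) % 8

def find_collisions(xs):
    """Group inputs by output; a group with two or more inputs shows A is not one-to-one."""
    groups = defaultdict(list)
    for x in xs:
        groups[A(x)].append(x)
    return {value: inputs for value, inputs in groups.items() if len(inputs) > 1}

print(find_collisions(range(0, 6)))  # {5: [2, 5]}
```

Here A(2) and A(5) both equal 5, which is the kind of evidence a written answer can cite alongside the general argument.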
4. Explain whether x or y has a greater influence on this function:
$k = \lfloor (\lceil e^{(3x+10)/10} \rceil \bmod 8) + \log_{200}(2y^2 + 8) + 7 \rfloor$
Tasks for DAVA [Review DAVA Practical 1 to 4]
Truth or Myth?
Listed below are some conventional beliefs (or are they just myths?) about hotel bookings. They may not be true, and it is up to you to prove / disprove them, using the data that we have.
1. The longer the lead time, the cheaper the average daily rate (adr).
2. Bookings with a non-refundable deposit type usually have a short lead time, as people are less willing to commit to non-refundable deposits.
3. The data for average daily rate (adr) are normally distributed.
4. In terms of market segment, "Online TA/TO" has the highest average lead time as well as the highest proportion of bookings.
5. The higher the number of children, the higher the average for total_of_special_requests. This applies to all customer types.
6. Resort hotels are cheaper than city hotels.
7. The best time of the year to get the cheapest average daily rate (adr) is in January.
8. The longer the stay for weekday nights, the cheaper the average daily rate (adr).
Select any THREE of the beliefs listed above. Using KNIME Analytics Platform and what you have learnt in DAVA Practical 1 - 4, support or counter (not support) your selected beliefs.
- State any assumptions made.
- Provide your observations of the chart/table.
- State your conclusion clearly (support / not support the beliefs).
- Provide screenshots to substantiate your point.
PART 2
Tasks for LOMA [Review LOMA Tutorial]
It's Staycation Time
COVID-19 has had a significant impact on the hotel industry. With the decline in international air travel, hotels have been gearing up to entice locals to take staycations instead. Besides attractive staycation packages that offer value-added experiences such as cooking classes, local tours and spa retreats, some hotels are also rolling out free upgrades, complimentary parking and dining credits.
1. You are assigned to work out the dining credits for one booking in the dataset. The booking number you are assigned to is based on the last three digits of your matriculation number + 1000 (E.g. if your matric number is 2112345A, then booking number assigned = 345 + 1000 = 1345).
Based on this booking (row) number, look up the average daily rate (adr), stay_in_weekend_nights and stay_in_week_nights for that row in the data file.
The length of stay for that booking = stay_in_weekend_nights + stay_in_week_nights. Let p = length of stay, subject to a maximum of 5 days. Compute p.
Compute the dining credits using the formula below:
$\text{Dining Credits} = \text{adr} \times \left(0.5 + \sum_{k=1}^{p} 1/5k\right)$
You can use the calculator. However, be sure to show your step-by-step working.
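The dining credits calculation can be sketched as below. The summand in the brief is ambiguous as printed: this sketch assumes it reads 1/5^k by default, and the `term` parameter lets you switch to 1/(5k) if that is the intended reading. The function name and the sample row are hypothetical.

```python
def dining_credits(adr, weekend_nights, week_nights, term=lambda k: 1 / 5**k):
    """Return (credits, p, series) for adr * (0.5 + sum of term(k), k = 1..p)."""
    p = min(weekend_nights + week_nights, 5)       # length of stay, capped at 5
    series = [term(k) for k in range(1, p + 1)]    # individual terms, useful for showing working
    credits = adr * (0.5 + sum(series))
    return credits, p, series

# Hypothetical row: adr = 100, 1 weekend night, 2 week nights, so p = 3.
credits, p, series = dining_credits(100.0, 1, 2)
print(p, series)          # 3 [0.2, 0.04, 0.008]
print(round(credits, 2))  # 74.8
```

To use the other reading, call `dining_credits(adr, weekend, week, term=lambda k: 1 / (5 * k))`.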
2. Make the change in variables by writing the function for dining credits to be in terms of j, where j = k + 3.
Recycling Savings
A hotel produces 1500 tonnes of waste per year, and each tonne of waste incurs a refuse disposal fee of $400. This means (1500 * $400) = $600,000 is spent on refuse disposal fees per year.
To ramp up its environmental and sustainability drive, the hotel is embarking on a waste reduction exercise by segregating common recyclables such as carton boxes, plastic packaging, and newspapers, and operating food waste digesters. By doing so, the hotel hopes to achieve a recycling rate of 6% every year. This means a saving of 6% * 1500 * $400 = $36,000 every year.
If the recycling savings are deposited into a bank account every year, the hotel will earn interest of 3% per year on the accumulated savings from the previous year. On top of that, there will be fresh recycling savings generated in each subsequent year. We will assume that the fresh savings for each year remain flat at $36,000.
3. What would be the accumulated savings at the end of Year 3? Write down your answer without any rounding.
4. Express the accumulated savings as a recursive formula.
5. Using the above recursive formula, find out how long it will take the hotel to hit $250,000 in accumulated savings.
6. Express the accumulated savings as an explicit formula.
7. Using the above explicit formula, find the accumulated savings at the end of 34 years.
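The recursive and explicit forms can be compared numerically with the sketch below. It assumes the first $36,000 is deposited at the end of Year 1 and earns no interest in that year, so S_1 = 36000 and S_n = 1.03 * S_(n-1) + 36000. A different deposit timing would change the formulas. All function names are illustrative.

```python
ANNUAL_SAVINGS = 36_000   # fresh recycling savings per year (from the brief)
INTEREST = 0.03           # interest on the previous year's accumulated savings

def accumulated_recursive(n):
    """S_1 = 36000; S_n = 1.03 * S_(n-1) + 36000, for n >= 1."""
    total = ANNUAL_SAVINGS
    for _ in range(2, n + 1):
        total = total * (1 + INTEREST) + ANNUAL_SAVINGS
    return total

def accumulated_explicit(n):
    """Closed form of the same recurrence: 36000 * (1.03^n - 1) / 0.03."""
    return ANNUAL_SAVINGS * ((1 + INTEREST) ** n - 1) / INTEREST

def years_to_reach(target):
    """Smallest n with S_n >= target, found by stepping the recurrence."""
    n, total = 1, ANNUAL_SAVINGS
    while total < target:
        total = total * (1 + INTEREST) + ANNUAL_SAVINGS
        n += 1
    return n

print(round(accumulated_recursive(3), 2))  # 111272.4
print(years_to_reach(250_000))             # 7
```

Under these assumptions the explicit form is the sum of a geometric series, so `accumulated_explicit(n)` and `accumulated_recursive(n)` should agree for every n up to floating-point error.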
Tasks for DAVA [Review DAVA Practical 5 - 9]
It's Data Prep Time
Perform the following data preparation work:
1. Derive New Columns
a) Derive a new column called length_of_stay using Math Formula node : length_of_stay = stay_in_weekend_nights + stay_in_week_nights
b) Derive a new column (field) called kids using Math Formula node : kids = children + babies
c) Derive a new column called guest_type as follows:
d) Derive a new column called guest_type_num as follows:
Note
• Left columns : When typing, do be careful about the capital and small letters.
• Right columns : Ensure values are numeric. Output columns should not have any zeros.
• Append each new variable as a new column. Do not replace the original columns.
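The two arithmetic columns in (a) and (b) can be sanity-checked outside KNIME with a few lines of plain Python. This is only a cross-check under the assumption that each booking is held as a dictionary keyed by the column names in the brief. It is not the Math Formula node itself, and the sample row is hypothetical.

```python
def add_derived_columns(row):
    """Return a copy of one booking row with length_of_stay and kids appended."""
    out = dict(row)  # copy, so the original columns are kept and not replaced
    out["length_of_stay"] = row["stay_in_weekend_nights"] + row["stay_in_week_nights"]
    out["kids"] = row["children"] + row["babies"]
    return out

# Hypothetical row using the column names from the brief.
row = {"stay_in_weekend_nights": 2, "stay_in_week_nights": 3, "children": 1, "babies": 0}
print(add_derived_columns(row))
```

Comparing a handful of rows against the KNIME output table is a quick way to confirm the node expressions were typed correctly.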
2. Data Cleaning
a) Convert the columns agent and company to string type as they represent IDs.
b) There are some missing values in the data. Decide what you want to do for each case. Execute the handling of missing values.
Justify why you chose the method for each case.
c) Remove the following columns (fields) to reduce redundancy and noise in the data :
• arrival_date_year
• stay_in_weekend_nights
• stay_in_week_nights
• adults
• children
• babies
• previous_cancellations
• previous_bookings_not_canceled
• assigned_room_type
• booking_changes
• day_in_waiting_list
• reservation_status
• reservation_status_date
• kids
d) Are there any other opportunities for improving the quality of the data?
Note
• Provide screenshots of the configuration of the nodes used.
• Provide screenshots of the before and after effects of your data preparation work.
Hotel Booking Analytics - Clustering or Regression?
3. Select ONE of the following options :
Option 1 (Clustering)
Find out if the features in the hotel booking data can be used to group the listings into different groups, each corresponding roughly to total_of_special_requests.
a) Perform K-means clustering to create 5 clusters based on their features (except total_of_special_requests column). Document your steps.
Determine if there is a high degree of correspondence between the clusters and the total_of_special_requests. Explain your findings.
b) Select any TWO of the 5 clusters and bring out the profile characteristics of each cluster. Highlight any difference between the 2 selected clusters.
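One possible way to quantify the "degree of correspondence" in (a) is to cross-tabulate cluster labels against total_of_special_requests and compute cluster purity. Purity is a measure chosen here for illustration. The brief does not prescribe it, and the labels below are hypothetical stand-ins for values exported from KNIME.

```python
from collections import Counter, defaultdict

def cluster_purity(cluster_labels, targets):
    """Return (purity, table): purity is the share of rows matching their cluster's majority target."""
    table = defaultdict(Counter)              # table[cluster][target] = row count
    for cluster, target in zip(cluster_labels, targets):
        table[cluster][target] += 1
    majority_total = sum(max(counts.values()) for counts in table.values())
    return majority_total / len(cluster_labels), table

# Hypothetical labels: six rows, two clusters.
purity, table = cluster_purity([0, 0, 0, 1, 1, 1], [0, 0, 1, 2, 2, 2])
print(round(purity, 3))  # 0.833
```

A purity near 1 suggests each cluster is dominated by one value of total_of_special_requests. A value near the share of the most common target overall suggests little correspondence.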
Option 2 (Regression)
Find out if the features in the hotel booking data can be used to predict total_of_special_requests.
a) Perform linear regression with total_of_special_requests as the target, using all numeric features. Do not use features of string type.
Document your steps.
Write down the regression equation.
b) Which are the important features that can determine the predicted values for total_of_special_requests?
Comment on the accuracy of your regression model. What did you use to justify that? Do you have any other findings using Power BI?
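Two common measures for commenting on regression accuracy are R-squared and root mean squared error (RMSE). The brief does not say which measure to use, so the sketch below is only one option. It assumes the actual and predicted values have been exported as two lists, and the numbers shown are made up.

```python
import math

def r_squared(actual, predicted):
    """1 - SS_res / SS_tot: share of variance in the target explained by the model."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

def rmse(actual, predicted):
    """Root mean squared error, in the same units as the target."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))

# Made-up values for illustration only.
actual = [0, 1, 2, 3]
predicted = [0.5, 1.0, 2.0, 2.5]
print(r_squared(actual, predicted))       # 0.9
print(round(rmse(actual, predicted), 4))  # 0.3536
```

KNIME's own scoring output reports similar statistics, so these helpers are mainly useful for understanding what the reported numbers mean.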