Determine a possible investment portfolio

Reference no: EM133096418

Analysis of Financial Indicators Using Clustering

Objective: The objective of this exercise is to segment stocks by various factors to determine a possible investment portfolio.

Activities:
• Import and prepare data
• Apply data mining algorithms
• Configure predictive models
• Create data visualizations
• Analyze and interpret output from models
• Publish results

Scenario

You have been investing in savings and stocks for several years. You have invested wisely, and the return on your investment is slightly above the average investor's. However, you believe that by using analytical techniques that you developed in school, you should be able to create an investment portfolio that will provide better than average returns. Because you have limited funds to invest, you want to target the stocks that will help you create a strong portfolio that meets your investment goals. You scraped some performance indicators from a highly reliable financial website and downloaded them into a .csv file, which you have since converted to Excel.

The data you acquired includes several attributes and measures on which to base your analysis. Because there are 7,112 rows of data, you will need to segment the data based on various attributes to narrow down your search for the ideal portfolio given your constraints. At least initially, you would like to focus on stock price, invested capital, and total debt.

Cluster Analysis

Given a dataset, organizing it into meaningful groups is a basic and useful approach to data mining and data analysis. Clustering classifies samples into groups using a measure of association so that data points within a group are similar to each other and data points from different groups are not. Data points are multidimensional; that is, they consist of several variables. Visualization is not practical for humans when datasets consist of more than three dimensions.

The input to a clustering exercise is a dataset and the number of clusters. The result of the analysis is a set of clusters. K-means clustering is a method of finding clusters and their centers (R) given a choice in the number of clusters (K). It is often used for market segmentation. The goal is to make the inter-cluster difference (distance) high and the intra-cluster difference (distance) low.
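The assign-and-update loop behind K-means can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the R-K-Means implementation used by the tool: the function name `kmeans`, the sample points, and the choice to start from the first K points (so the run is repeatable) are all hypothetical.

```python
import math

def kmeans(points, k, iterations=100):
    """Plain Lloyd's algorithm; returns (centers, labels)."""
    # Assumption: the first k points serve as starting centers (deterministic).
    centers = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[c])  # empty cluster: keep old center
        if new_centers == centers:
            break  # centers stopped moving, so the labels are stable
        centers = new_centers
    return centers, labels

# Hypothetical two-dimensional points forming two obvious groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 0.5), (9.5, 8.0)]
centers, labels = kmeans(points, k=2)

# Intra-cluster distance: average distance of each point to its own center.
intra = sum(math.dist(p, centers[lab]) for p, lab in zip(points, labels)) / len(points)
# Inter-cluster distance: distance between the two centers.
inter = math.dist(centers[0], centers[1])
print(labels, centers, round(intra, 2), round(inter, 2))
```

In this toy data the intra-cluster distance comes out much smaller than the inter-cluster distance, which is the outcome the clustering goal describes.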
To build a segmentation analysis, proceed as follows:

1. Open the file FinancialIndicators.xlsx (Hands-on_3_FinancialIndicator.xlsx) and explore its contents. Notice that there are plenty of variables to choose from for segmentation.

2. Close Excel.

3. Launch SAP Predictive Analytics.

4. Click Expert Analytics → Expert Analytics.

5. From the menu, choose File → New.

6. In the New Dataset window, choose Excel, then click Next.

7. Search for the FinancialIndicators.xlsx file provided to you and open.

8. The first row is the header data. Check to see that 7,112 rows of data have been acquired, then click Create.

9. Switch to the Prepare panel.

10. Notice that some of the columns of the spreadsheet have come through as measures and the default aggregation is SUM. It does not make sense to add up Stock Price. Click on the cog next to the Stock Price measure and change the aggregation method to Average.

11. Switch to the Predict panel.

12. From the Algorithms tab (on the right side, within the Components panel), drag and drop or double-click the R-K-Means algorithm into your analysis.

13. The algorithm component is automatically connected to the data source component.

14. Hover over the R-K-Means algorithm and either click on the cog or choose Configure Settings (on the right).

15. In the R-K-Means properties dialog box, provide the necessary details:
a. In the Number of Clusters field, enter 12.
b. Select stock price, invested capital, and total debt to be used for the cluster analysis.
c. Retain the default values for the advanced properties.
d. Choose Done.

16. From the Data Writers tab, drag and drop or double click on the CSV Writer component.

17. Configure Settings of the CSV data writer.
a. In the CSV Writer Configure Settings, select a CSV file to store the result (use Browse and give the file a name).
b. Choose Done.

18. Click Run to run the analysis.

19. You should receive a message that the run succeeded. Click OK.

20. You are now in the Results Grid view.

21. Switch to the Summary view to see the results in Figure 3.

22. You can see the center coordinates of the clusters, as well as the size of each cluster, which is the number of stocks in that cluster.

23. Results visualization and interpretation...
a. In the Cluster Representations pane, select Cluster Distribution.
i. You see a chart of cluster size vs. cluster number (Figure 4). Each bar shows the number of stocks in that cluster. You can roll over the bars to see the exact number.
ii. Stocks within a cluster are similar to each other and dissimilar to all other stocks in other clusters.
b. In the Cluster Representations pane, select Cluster Density and Distance.

i. You see that cluster 3 in Figure 5 has the lowest/weakest density and cluster 12 in the same figure has the highest. Low density clusters imply clusters of noise, outliers, or other loosely associated data. The distance shows how dissimilar the clusters are.

c. In the Cluster Representations pane, select Cluster Center Representation.
i. You see a radar chart of the cluster centers (radar axes are the variables); you can change the cluster number in the Data panel. Notice in Figure 7 that the average stock price in cluster 6 is much higher than that of other clusters.

d. In the Cluster Representations pane, select Parallel Coordinate Chart.
i. The axes are all normalized. Parallel lines between the axes imply a positive relationship between the two dimensions. Intersecting lines imply a negative relationship.

e. In the Cluster Representations pane, select Scatter Matrix Charts.
i. You see the scatter charts of stock clusters plotted between various pairs of dimensions.

24. The fitted results are stored in the CSV file. You can open the saved CSV file and explore the 12 clusters that have been generated, or you can explore further with visualizations.

25. From the File menu, select Save.

26. Enter a name for the document.

27. Choose Save.
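What the Summary view reports (cluster sizes and center coordinates) and what the CSV writer stores can also be reproduced outside the tool. The sketch below assumes a handful of made-up rows that already carry a cluster number. The column names, the values, and the `summarize` helper are hypothetical and do not reflect the exact layout SAP Predictive Analytics writes.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows: (stock price, invested capital, total debt, cluster number).
rows = [
    (10.0, 200.0, 50.0, 1),
    (12.0, 220.0, 70.0, 1),
    (300.0, 5000.0, 900.0, 2),
    (14.0, 180.0, 60.0, 1),
    (320.0, 5400.0, 1100.0, 2),
]

def summarize(rows):
    """Return {cluster: (size, center)}; the center is the per-variable mean."""
    members = defaultdict(list)
    for *values, cluster in rows:
        members[cluster].append(values)
    summary = {}
    for cluster, vals in sorted(members.items()):
        center = tuple(sum(col) / len(vals) for col in zip(*vals))
        summary[cluster] = (len(vals), center)
    return summary

summary = summarize(rows)

# Write the labeled rows as CSV (to an in-memory buffer here, not a file).
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["StockPrice", "InvestedCapital", "TotalDebt", "ClusterNumber"])
writer.writerows(rows)

for cluster, (size, center) in summary.items():
    print(f"cluster {cluster}: size={size}, center={center}")
```

Replacing `io.StringIO()` with an opened file (using `newline=""`) would produce a file on disk similar in spirit to the one saved in step 17.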

Analysis

28. Switch to the Visualize panel.

29. From the Select Analysis dropdown shown in Figure 10, choose Analysis 1.

30. Select Component R-K Means.

31. Create a column chart with Stock Price on the Y Axis and ClusterNumber as the dimension. Notice that Cluster 6's stock price is much higher than others. Since we have limited funds available, we will not want to purchase any stocks in cluster 6. Filter cluster 6 from your column chart. Sort by stock price.

32. Create appropriate visualizations to answer the following questions.

Question 1: Which cluster contains those stocks with the lowest average stock price? What is the average stock price in this cluster?

Question 2: Excluding the largest cluster (cluster 6 in this example), which is the cluster with the highest average stock price? Of that cluster, what company has the highest average stock price and what is that price? What company has the lowest?

Question 3: Continue to examine the cluster from question 2. What do your observations tell you regarding total debt and invested capital for the companies in this cluster?

Question 4: What other observations can you make about the clusters in this analysis? (Provide at least 3 with their associated visualizations and justifications for your observations. Why are these observed relationships important?)
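The comparisons behind steps 10 and 31 and Questions 1 and 2 amount to grouping by cluster, averaging stock price, filtering out the excluded cluster, and sorting. The following is a minimal sketch with invented tickers and prices. The only detail taken from the exercise is that cluster 6 is filtered out, and the numbers are not results from the real dataset.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (ticker, cluster number, stock price) records.
records = [
    ("AAA", 1, 10.0), ("BBB", 1, 14.0), ("CCC", 1, 12.0),
    ("DDD", 1, 11.0), ("EEE", 1, 13.0),
    ("FFF", 2, 20.0), ("GGG", 2, 30.0),
    ("HHH", 6, 900.0),
]

prices = defaultdict(list)
for ticker, cluster, price in records:
    prices[cluster].append(price)

# Average is the meaningful aggregation; SUM mostly reflects cluster size.
avg_price = {c: mean(p) for c, p in prices.items()}
total_price = {c: sum(p) for c, p in prices.items()}

EXCLUDED = {6}  # the high-priced cluster filtered out in step 31
ranked = sorted(
    ((c, a) for c, a in avg_price.items() if c not in EXCLUDED),
    key=lambda item: item[1],
)
cheapest_cluster, cheapest_avg = ranked[0]
priciest_cluster, priciest_avg = ranked[-1]

# Within the highest-priced remaining cluster, find the extreme companies.
in_cluster = [r for r in records if r[1] == priciest_cluster]
top = max(in_cluster, key=lambda r: r[2])
bottom = min(in_cluster, key=lambda r: r[2])

print("lowest average:", cheapest_cluster, cheapest_avg)
print("highest average (after filter):", priciest_cluster, priciest_avg)
print("highest and lowest company:", top[0], bottom[0])
```

In this toy data, cluster 1 has the larger summed price but the smaller average price, which is why the aggregation for Stock Price is changed from SUM to Average before comparing clusters.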

Attachment:- Workshop Clustering Analysis.rar
