Explain how you evaluated the classifier

Reference no: EM133003268

MITS5509 Intelligent Systems for Analytics

Carefully read the following two questions and provide the appropriate answer.

Question 1:
The bankruptcy-prediction problem can be viewed as a classification problem. The data set you will use for this problem includes one ratio that has been computed from the financial statements of real-world firms. Ratios of this kind have been used in studies involving bankruptcy prediction. The first sample (training set) includes 68 data values on firms that went bankrupt and firms that did not. This will be your training sample. The second sample (testing set) of 68 firms also consists of some bankrupt firms and some non-bankrupt firms. Your goal is to use different classifiers to build a model from the training set, by randomly selecting 40 data points (20 points from category 1 and 20 points from category 0), and then to test its performance by randomly selecting 40 data points from the testing set. (Try to analyze the new cases manually before you run the neural network and see how well you do.)

Students must use the following classifiers. The number of classifiers depends on the number of group members; e.g., a group of four members will use four of the following five classifiers.
1. Neural network
2. Support vector machine
3. Nearest neighbor algorithm
4. Decision tree
5. Naive Bayes
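As one illustration of the list above, the nearest neighbor algorithm can be sketched in plain Python on the single WC feature. This is a minimal sketch, not a prescribed solution: the function name `knn_predict`, the choice of `k=3`, and the use of absolute difference as the distance are assumptions, and the six training rows are a small subset copied from the training table for brevity.

```python
from collections import Counter

def knn_predict(train, x, k=3):
    """Predict the category of x by majority vote among the k nearest WC values."""
    # Sort training rows by distance to x on the single WC feature.
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# A few (WC, Category) rows copied from the training table (illustrative subset).
train = [
    (309.577, 1), (363.79, 1), (341.399, 1),
    (651.65, 0), (487.494, 0), (575.646, 0),
]

# Two WC values taken from the testing table (firms 3 and 61).
print(knn_predict(train, 330.226))  # three closest rows are all category 1
print(knn_predict(train, 637.604))  # three closest rows are all category 0
```

With two categories, an odd `k` avoids tied votes. In practice the full 40-point training sample would replace the six rows shown here.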

The following tables show the training sample and test data you should use for this major assignment.

Training Sample Data

Firm	WC	Category
1	309.577	1
2	363.79	1
3	341.399	1
4	363.616	1
5	323.673	1
6	323.353	1
7	350.371	1
8	240.602	1
9	220.057	1
10	287.837	1
11	274.6	1
12	278.494	1
13	234.267	1
14	284.923	1
15	190.62	1
16	327.76	1
17	211.94	1
18	373.571	1
19	219.891	1
20	193.489	1
21	204.333	1
22	205.657	1
23	362.361	1
24	285.562	1
25	352.649	1
26	400.44	1
27	307.301	1
28	240.314	1
29	322.995	1
30	408.197	1
31	209.027	1
32	198.979	1
33	340.418	1
34	320.154	1
35	189.826	0
36	651.65	0
37	487.494	0
38	254.899	0
39	575.646	0
40	160.712	0
41	269.729	0
42	513.301	0
43	1996.866	0
44	683.512	0
45	377.246	0
46	289.579	0
47	171.851	0
48	205.39	0
49	203.593	0
50	365.159	0
51	266.962	0
52	461.943	0
53	215.392	0
54	235.794	0
55	881.477	0
56	463.897	0
57	475.693	0
58	540.01	0
59	612.817	0
60	140.277	0
61	396.541	0
62	271.185	0
63	507.039	0
64	733.641	0
65	612.455	0
66	499.495	0
67	290.715	0
68	171.447	0
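The balanced draw described in Question 1 (20 points from category 1 and 20 from category 0) could be sketched as follows. The values are transcribed from the training table above; the helper name `balanced_sample` and the fixed seed are assumptions made only so the draw is repeatable.

```python
import random

# WC values transcribed from the Training Sample Data table.
CATEGORY_1 = [
    309.577, 363.79, 341.399, 363.616, 323.673, 323.353, 350.371, 240.602,
    220.057, 287.837, 274.6, 278.494, 234.267, 284.923, 190.62, 327.76,
    211.94, 373.571, 219.891, 193.489, 204.333, 205.657, 362.361, 285.562,
    352.649, 400.44, 307.301, 240.314, 322.995, 408.197, 209.027, 198.979,
    340.418, 320.154,
]
CATEGORY_0 = [
    189.826, 651.65, 487.494, 254.899, 575.646, 160.712, 269.729, 513.301,
    1996.866, 683.512, 377.246, 289.579, 171.851, 205.39, 203.593, 365.159,
    266.962, 461.943, 215.392, 235.794, 881.477, 463.897, 475.693, 540.01,
    612.817, 140.277, 396.541, 271.185, 507.039, 733.641, 612.455, 499.495,
    290.715, 171.447,
]

def balanced_sample(cat1, cat0, per_class=20, seed=0):
    """Return shuffled (WC, Category) pairs with per_class rows from each category."""
    rng = random.Random(seed)  # assumed fixed seed, for a repeatable draw
    rows = [(wc, 1) for wc in rng.sample(cat1, per_class)]
    rows += [(wc, 0) for wc in rng.sample(cat0, per_class)]
    rng.shuffle(rows)
    return rows

train = balanced_sample(CATEGORY_1, CATEGORY_0)
print(len(train), sum(label for _, label in train))  # 40 rows, 20 of them category 1
```

Sampling without replacement (`random.sample`) keeps every selected firm distinct; changing `seed` gives a different 40-point training sample.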

Testing Sample Data

Firm	WC
1	367.325
2	347.513
3	330.226
4	178.106
5	378.899
6	257.212
7	333.088
8	182.324
9	238.099
10	329.643
11	294.644
12	281.666
13	308.086
14	317.079
15	245.139
16	354.662
17	292.256
18	306.79
19	222.396
20	367.628
21	342.115
22	353.326
23	336.39
24	298.008
25	266.396
26	243.554
27	172.184
28	362.479
29	249.981
30	327.877
31	286.696
32	182.762
33	338.347
34	302.57
35	299.651
36	247.595
37	339.311
38	366.139
39	398.295
40	205.129
41	371.419
42	175.406
43	476.159
44	359.144
45	315.97
46	329.629
47	399.552
48	442.799
49	255.405
50	408.036
51	497.195
52	249.674
53	292.026
54	481.193
55	394.76
56	273.175
57	311.517
58	238.067
59	292.459
60	2010.227
61	637.604
62	379.869
63	268.318
64	416.08
65	377.011
66	355.757
67	319.223
68	240.423
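Predicting the category of 40 randomly chosen testing values could look roughly like the sketch below. It uses a one-split threshold rule as a deliberately simplified stand-in for a decision tree; that choice, the helper names `fit_stump` and `predict_stump`, the seed, and the eight-row training subset are all assumptions for illustration. The testing values are transcribed from the table above.

```python
import random

# WC values transcribed from the Testing Sample Data table (firms 1-68).
TEST_WC = [
    367.325, 347.513, 330.226, 178.106, 378.899, 257.212, 333.088, 182.324,
    238.099, 329.643, 294.644, 281.666, 308.086, 317.079, 245.139, 354.662,
    292.256, 306.79, 222.396, 367.628, 342.115, 353.326, 336.39, 298.008,
    266.396, 243.554, 172.184, 362.479, 249.981, 327.877, 286.696, 182.762,
    338.347, 302.57, 299.651, 247.595, 339.311, 366.139, 398.295, 205.129,
    371.419, 175.406, 476.159, 359.144, 315.97, 329.629, 399.552, 442.799,
    255.405, 408.036, 497.195, 249.674, 292.026, 481.193, 394.76, 273.175,
    311.517, 238.067, 292.459, 2010.227, 637.604, 379.869, 268.318, 416.08,
    377.011, 355.757, 319.223, 240.423,
]

# Illustrative subset of training rows (WC, Category); use the full sample in practice.
TRAIN = [
    (309.577, 1), (363.79, 1), (341.399, 1), (240.602, 1),
    (651.65, 0), (487.494, 0), (575.646, 0), (513.301, 0),
]

def predict_stump(threshold, below_label, wc):
    """Assign below_label when wc <= threshold, otherwise the other category."""
    return below_label if wc <= threshold else 1 - below_label

def fit_stump(rows):
    """Pick the threshold and orientation with the fewest training errors."""
    best = None
    for t in sorted({wc for wc, _ in rows}):
        for below_label in (0, 1):
            errors = sum(predict_stump(t, below_label, wc) != y for wc, y in rows)
            if best is None or errors < best[0]:
                best = (errors, t, below_label)
    return best[1], best[2]

threshold, below_label = fit_stump(TRAIN)
rng = random.Random(0)            # assumed fixed seed, for a repeatable draw
chosen = rng.sample(TEST_WC, 40)  # any 40 of the 68 testing values
predictions = [(wc, predict_stump(threshold, below_label, wc)) for wc in chosen]
for wc, label in predictions[:5]:
    print(wc, label)
```

The eight training rows were picked only to keep the example short; a rule fitted on the full training sample will generally differ, because category 0 in the training table contains both low and high WC values.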

From the above data set, the group has to prepare a report that includes the following:

1. Explain the process of building each classifier using the training set (add the screenshots).

2. Explain how you evaluated the classifier.

3. Create the confusion matrix based on a 70% (training) / 30% (testing) split.

4. Predict the category of any 40 randomly chosen values from the Testing Sample Data table.

5. Compare the results between the different classifiers and discuss which one is the best and why.

Note: Students can use any open source or free data mining software, such as Python, Statistica Data Miner, Weka, RapidMiner, KNIME, or MATLAB.
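The 70% / 30% evaluation asked for in the list above might be sketched in plain Python as follows. This is only one possible reading: treating category 1 as the "positive" class, using a single nearest neighbor as the classifier, the helper names, and the seed are all assumptions, and the twenty rows are the first ten firms of each category from the training table.

```python
import random

# First ten firms of each category from the training table: (WC, Category).
ROWS = [
    (309.577, 1), (363.79, 1), (341.399, 1), (363.616, 1), (323.673, 1),
    (323.353, 1), (350.371, 1), (240.602, 1), (220.057, 1), (287.837, 1),
    (189.826, 0), (651.65, 0), (487.494, 0), (254.899, 0), (575.646, 0),
    (160.712, 0), (269.729, 0), (513.301, 0), (1996.866, 0), (683.512, 0),
]

def split_70_30(rows, seed=0):
    """Shuffle a copy of rows and return (70% part, 30% part)."""
    rng = random.Random(seed)  # assumed fixed seed, for a repeatable split
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

def nearest_neighbor(train, x):
    """Return the category of the training row whose WC is closest to x."""
    return min(train, key=lambda row: abs(row[0] - x))[1]

def confusion_matrix(actual, predicted):
    """Count TP, FP, FN, TN, treating category 1 as the positive class."""
    m = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for a, p in zip(actual, predicted):
        if p == 1:
            m["TP" if a == 1 else "FP"] += 1
        else:
            m["FN" if a == 1 else "TN"] += 1
    return m

def accuracy(m):
    """Share of correct predictions in a confusion matrix."""
    total = sum(m.values())
    return (m["TP"] + m["TN"]) / total if total else 0.0

train_part, test_part = split_70_30(ROWS)
actual = [label for _, label in test_part]
predicted = [nearest_neighbor(train_part, wc) for wc, _ in test_part]
matrix = confusion_matrix(actual, predicted)
print(matrix, accuracy(matrix))
```

Running the same split and the same `confusion_matrix` for each chosen classifier gives directly comparable counts, which is one way to support the comparison between classifiers.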

Question 2:

Create a DASHBOARD. For creating a dashboard, the group can use the above data set or any other data set. The group has to prepare a report that includes the following:

1. Write an introduction about the dataset used and add the reference (link).

2. Create at least four figures (different graphs) and add them to the dashboard.

3. Add a screenshot of each step.

4. Describe the figures in the dashboard.

Students can use any software to create the dashboard, such as Microsoft Excel, Power BI, or Tableau.

The above list of items is not necessarily in any order. The chronological order in which we cover these topics in lectures is not meant to dictate the order in which you collate them into one coherent document for your assignment.

Your report must include a Title Page with the title of the Assignment and the names and ID numbers of all group members, and a contents page showing page numbers and titles of all major sections of the report. All Figures included must have captions and Figure numbers and be referenced within the document.

Captions for figures must be placed below the figure; captions for tables must be placed above the table. Include a footer with the page number. Your report should use 1.5 spacing with a 12 point Times New Roman font.

Include references where appropriate. Citation of sources (if using any) is mandatory and must be in the APA style.

Attachment:- Intelligent Systems for Analytics.rar
