Example calculation of entropy, Computer Engineering


Example Calculation:

Suppose we are working with a set of examples S = {s1, s2, s3, s4}, given a binary categorisation into positives and negatives such that s1 is positive and the rest are negative. Suppose further that we want to calculate the information gain of an attribute A, which can take the values {v1, v2, v3}. Finally, assume that the values of A for the examples are as follows:

[Figure: Example Calculation of Entropy - the value of attribute A for each example: A(s1) = v2, A(s2) = v2, A(s3) = v3, A(s4) = v1]

To work out the information gain for A relative to S, we first need to calculate the entropy of S. To use our formula for binary categorisations, we need to know the proportion of positives and the proportion of negatives in S. These are p+ = 1/4 and p- = 3/4, so we can calculate:

Entropy(S) = -(1/4)log2(1/4) - (3/4)log2(3/4)
           = -(1/4)(-2) - (3/4)(-0.415)
           = 0.5 + 0.311
           = 0.811
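As a quick check of this arithmetic, here is a minimal sketch in Python (the helper name binary_entropy is our own choice, not part of the original text; it uses only the standard math module):

import math

def binary_entropy(p_pos, p_neg):
    # Entropy of a binary categorisation; 0*log2(0) is taken to be 0
    return sum(0.0 if p == 0 else -p * math.log2(p) for p in (p_pos, p_neg))

# S has one positive (s1) and three negatives, so p+ = 1/4 and p- = 3/4
print(binary_entropy(1/4, 3/4))   # about 0.811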

Note that to do this calculation on your calculator you may need to remember that log2(x) = ln(x)/ln(2), where ln(2) is the natural log of 2; for example, log2(3/4) = ln(0.75)/ln(2) ≈ -0.415, the value used above. Next, we need to calculate the weighted entropy Entropy(Sv) for each value v = v1, v2, v3, noting that the weighting involves multiplying by (|Sv|/|S|). Remember also that Sv is the set of examples from S which have value v for attribute A. This means that Sv1 = {s4}, Sv2 = {s1, s2} and Sv3 = {s3}.

We now need to carry out the following calculations:

(|Sv1|/|S|) * Entropy(Sv1) = (1/4) * (-(0/1)log2(0/1) - (1/1)log2(1/1))
                           = (1/4) * (-0 - (1)log2(1)) = (1/4) * (-0 - 0) = 0

(|Sv2|/|S|) * Entropy(Sv2) = (2/4) * (-(1/2)log2(1/2) - (1/2)log2(1/2))
                           = (1/2) * (-(1/2)(-1) - (1/2)(-1)) = (1/2) * (1) = 1/2

(|Sv3|/|S|) * Entropy(Sv3) = (1/4) * (-(0/1)log2(0/1) - (1/1)log2(1/1))
                           = (1/4) * (-0 - (1)log2(1)) = (1/4) * (-0 - 0) = 0
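The same per-value calculations can be sketched in Python, reusing the binary_entropy helper above. The examples dictionary here is a hypothetical encoding of the figure: each example is mapped to its value of A and its category ('+' or '-').

examples = {
    "s1": ("v2", "+"),
    "s2": ("v2", "-"),
    "s3": ("v3", "-"),
    "s4": ("v1", "-"),
}

def weighted_entropy(value):
    # Entropy of the subset Sv, weighted by |Sv| / |S|
    subset = [cat for (v, cat) in examples.values() if v == value]
    p_pos = subset.count("+") / len(subset)
    p_neg = subset.count("-") / len(subset)
    return (len(subset) / len(examples)) * binary_entropy(p_pos, p_neg)

for v in ("v1", "v2", "v3"):
    print(v, weighted_entropy(v))   # v1 0.0, v2 0.5, v3 0.0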

Note that we have taken 0 log2(0) to be zero, which is standard. In our calculation we only required log2(1) = 0 and log2(1/2) = -1. We now add these three weighted entropies together and subtract the total from our value of Entropy(S) to give the final result:

Gain(S,A) = 0.811 - (0 + 1/2 + 0) = 0.311 
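As a sketch, the same result falls out of the helpers assumed above:

gain = binary_entropy(1/4, 3/4) - sum(weighted_entropy(v) for v in ("v1", "v2", "v3"))
print(round(gain, 3))   # 0.311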

Next, we look at how information gain can be used in practice in an algorithm to construct decision trees.

