Implement 5-fold cross-validation to choose τ

Reference no: EM131082524

In Section 3.6.3 we used the test set that we had put aside both to select τ, the threshold for the log odds, and to evaluate the Type I and II errors incurred when we use this threshold. Ideally, we would choose τ from another set of messages that is independent of both our training data and our test data. The method of cross-validation is designed to use the training set for both training and validating the model. Implement 5-fold cross-validation to choose τ and assess the error rate with our training data. To do this, follow these steps:

(a) Use the sample() function to permute the indices of the training set, and organize these permuted indices into 5 equal-size sets, called folds.

(b) For each fold, take the corresponding subset from the training data to use as a 'test' set. Use the remaining messages in the training data as the training set. Apply the functions developed in Section 3.6 to estimate the probabilities that a word occurs in a message given it is spam or ham, and use these probabilities to compute the log likelihood ratio (LLR) for the messages in the held-out 'test' set.

(c) Pool all of the LLR values from the messages in all of the folds, i.e., from all of the training data, and use these values and the typeIErrorRate() function to select a threshold that achieves a 1% Type I error.

(d) Apply this threshold to our original/real test set and find its Type I and Type II errors.
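
Steps (a) through (d) can be sketched as follows. This is a minimal illustration in Python, not the Section 3.6 code: the exercise itself refers to R functions such as sample(), and every function name below (make_folds, compute_freqs, compute_msg_llr, cross_validated_llrs, threshold_for_type1, error_rates) is hypothetical. It assumes each message is already reduced to a set of words, that a message is classified as spam when its LLR exceeds τ, and that word probabilities are smoothed so they stay strictly between 0 and 1.

```python
import math
import random


def make_folds(n, k=5, seed=None):
    """Step (a): permute the indices 0..n-1 and deal them into k folds."""
    perm = random.Random(seed).sample(range(n), n)  # analogue of R's sample(n)
    return [perm[i::k] for i in range(k)]  # fold sizes differ by at most 1


def compute_freqs(msgs, is_spam):
    """Hypothetical stand-in for the Section 3.6 estimator.

    msgs is a list of word sets; returns {word: (present, absent)} log
    ratios. The +0.5 / +1 smoothing is an assumption that keeps every
    probability strictly between 0 and 1.
    """
    n_spam = sum(is_spam)
    n_ham = len(is_spam) - n_spam
    table = {}
    for word in set().union(*msgs):
        in_spam = sum(1 for m, s in zip(msgs, is_spam) if s and word in m)
        in_ham = sum(1 for m, s in zip(msgs, is_spam) if not s and word in m)
        p_spam = (in_spam + 0.5) / (n_spam + 1)
        p_ham = (in_ham + 0.5) / (n_ham + 1)
        table[word] = (math.log(p_spam / p_ham),
                       math.log((1 - p_spam) / (1 - p_ham)))
    return table


def compute_msg_llr(words, table):
    """Sum 'present' ratios for words in the message, 'absent' for the rest."""
    return sum(present if word in words else absent
               for word, (present, absent) in table.items())


def cross_validated_llrs(msgs, is_spam, k=5, seed=None):
    """Step (b): score each training message with a model fit on the other folds."""
    llrs = [None] * len(msgs)
    for fold in make_folds(len(msgs), k, seed):
        held_out = set(fold)
        keep = [i for i in range(len(msgs)) if i not in held_out]
        table = compute_freqs([msgs[i] for i in keep],
                              [is_spam[i] for i in keep])
        for i in fold:
            llrs[i] = compute_msg_llr(msgs[i], table)
    return llrs


def threshold_for_type1(llrs, is_spam, rate=0.01):
    """Step (c): smallest observed ham LLR with at most `rate` of ham above it."""
    ham = sorted(x for x, s in zip(llrs, is_spam) if not s)
    allowed = int(rate * len(ham))  # ham messages allowed above tau
    return ham[len(ham) - allowed - 1]


def error_rates(llrs, is_spam, tau):
    """Step (d): Type I = ham flagged as spam, Type II = spam passed as ham."""
    ham = [x for x, s in zip(llrs, is_spam) if not s]
    spam = [x for x, s in zip(llrs, is_spam) if s]
    return (sum(x > tau for x in ham) / len(ham),
            sum(x <= tau for x in spam) / len(spam))
```

For step (d) under these assumptions, fit compute_freqs on the full training set, score each message in the real test set with compute_msg_llr, and pass those scores together with the τ returned by threshold_for_type1 to error_rates. Because τ is chosen from pooled held-out scores, the test set is used only once, for the final error estimate.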

All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd