Describing the final network configuration and WEKA settings

Reference no: EM131451469

Program - Neural Networks

For the last programming project of the semester, you will use off-the-shelf neural network software to investigate airline lateness statistics. You are given a data file (.csv format) containing data on delays from various causes from the 29 largest airports in the United States. We are interested in finding out if the pattern of causes of delays is sufficient to identify the airport.

THE SOFTWARE

The WEKA package is available in all Flarsheim labs (and is also a free download if you want to install it on your own system). It has modules to compute many types of AI functions including Bayesian networks and neural networks, the focus of this assignment. WEKA allows you to build and train a network by specifying the configuration and the data file; there is no need to write your own artificial-neuron or backpropagation code.

THE DATA FILE

The file, airlines.csv, is taken from https://think.cs.vt.edu/corgis/csv/airlines/airlines.html. The data dictionary, describing the format and meaning of each field, is also on that page. In short, the file contains, for each airport, the number of delays:

  • due to the airline;
  • due to late aircraft;
  • due to issues with the aviation system itself (congestion, air traffic control, etc.);
  • due to security concerns; and
  • due to weather.

In addition, it lists the number of flights canceled, delayed, or diverted. For delays, it lists the total minutes of delay for each cause.

The file also contains the number of carriers, total number of flights, and number of on-time flights per airport. The 3-letter airport code is also given; this is the output (dependent) variable.

The data on number of carriers per airport should be screened out of your input data and not used as input for your network. This is because in some cases this is enough to uniquely identify the airport; we do not want our network to bypass the bulk of the data. Likewise, the name of the airport should not be used as input.

INPUT TO YOUR NETWORK

Use the numeric data for number and amount of delays, diversions, etc., for each cause. You will want to normalize this data, either by number of delays or number of flights.

Scaling the data: You may need to adjust the scale of your data (e.g. record delays in hours rather than minutes) so that all inputs are of approximately the same magnitude. If inputs vary over several orders of magnitude (as this data does), the network requires much more training, and we only have so much data. Therefore, adjusting data so that numbers are proportions (floats in [0.0, 1.0)) rather than raw counts can provide more efficient learning from the same data. Another option is to code each variable separately as a z-score: the number of standard deviations above or below the mean that item is. (z-scores below the mean are negative and those above the mean are positive; thus a z-score of -0.27 means an item is 0.27 standard deviations below the average for that variable, and a z-score of 1.12 means it is 1.12 standard deviations above the mean.) The advantage of this is that all data items are on the same scale (mean of 0, standard deviation of 1), even if some variables have a characteristic range of 0.01 to 0.10 and others have a range of 1,000 to 100,000.
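The two scaling options can be sketched in a few lines of Python. This is only an illustration, not part of the assignment: the function names and the sample numbers are made up, and the population standard deviation is an assumed choice (the sample standard deviation would work as well).

```python
from statistics import mean, pstdev


def proportions(counts, totals):
    """Divide each count by its row total (e.g. delays / flights)."""
    return [c / t if t else 0.0 for c, t in zip(counts, totals)]


def z_scores(values):
    """Express each value as standard deviations from the column mean."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical numbers, for illustration only.
weather_delays = [10, 20, 30, 40, 50]
total_flights = [1000, 1000, 2000, 2000, 5000]

print(proportions(weather_delays, total_flights))
print(z_scores(weather_delays))
```

After z-scoring, a column has mean 0 and standard deviation 1 regardless of its original units, which is what puts every input on the same scale.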

Exclude from input: Name of airport, month, year, month name, year/month code, airport code.
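As a rough sketch of this screening step outside WEKA, the snippet below separates the label (airport code) from the numeric inputs and drops the excluded fields, including the carrier count. The column names and sample values are hypothetical; consult the data dictionary for the real field names. Inside WEKA, the Remove attribute filter serves the same purpose.

```python
import csv
import io

# Hypothetical column names; the real file uses the names in its data dictionary.
EXCLUDED = {"airport_name", "month", "year", "month_name", "year_month", "carriers"}
LABEL = "airport_code"


def split_inputs_and_label(rows, excluded=EXCLUDED, label=LABEL):
    """Return (inputs, labels): numeric input dicts and the airport codes."""
    inputs, labels = [], []
    for row in rows:
        labels.append(row[label])
        inputs.append(
            {k: float(v) for k, v in row.items() if k not in excluded and k != label}
        )
    return inputs, labels


# Made-up sample rows, for illustration only.
SAMPLE = """airport_code,airport_name,year,month,carriers,weather_delays,flights_total
ATL,Atlanta,2003,6,11,328,30060
BOS,Boston,2003,6,14,129,9639
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
inputs, labels = split_inputs_and_label(rows)
print(labels)
print(inputs[0])
```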

OUTPUT FROM YOUR NETWORK

Your network should have 29 output neurons, 1 for each airport. Select the maximum value from the output neurons as the network's response.
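Selecting the network's response amounts to taking the index of the largest output value. A minimal sketch, using three made-up airports and activations instead of the full 29 (the function name is hypothetical):

```python
def predict_airport(outputs, airport_codes):
    """Return the airport code whose output neuron has the largest value."""
    if len(outputs) != len(airport_codes):
        raise ValueError("need exactly one output value per airport code")
    best = max(range(len(outputs)), key=lambda i: outputs[i])
    return airport_codes[best]


# Made-up activations, for illustration only.
codes = ["ATL", "BOS", "ORD"]
activations = [0.12, 0.71, 0.33]
print(predict_airport(activations, codes))  # prints BOS
```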

NEURAL NETWORK CONFIGURATION

This is your playground! The general approach is to start with the input neurons and a single neuron in the hidden layer. Randomly select a subset of the data (say, 5% of it) to withhold for testing (WEKA can do this automatically) and train the network, then test it on the withheld data. At first, it'll probably be terrible. Then add a second neuron to the hidden layer, select a new subset of the data, retrain the network from scratch, and check results. Continue adding neurons to the hidden layer until the network can consistently predict all withheld data, or until adding more neurons leads to a decrease in performance on the test set.
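The search procedure can be sketched as a loop. This is an outline under stated assumptions, not WEKA code: train_and_score is a hypothetical callback that must build a fresh network with the given number of hidden neurons, train it on the training rows, and return its accuracy on the withheld rows. The patience parameter is also an assumption, added so that a single unlucky split does not end the search.

```python
import random


def holdout_split(rows, test_fraction=0.05, seed=None):
    """Shuffle a copy of rows and withhold a fraction of it for testing."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)


def grow_hidden_layer(rows, train_and_score, max_hidden=20, patience=2):
    """Add hidden neurons one at a time until accuracy stops improving.

    train_and_score(train, test, n_hidden) is a hypothetical callback that
    trains a new network from scratch and returns accuracy in [0, 1].
    """
    best_acc, best_n, worse = -1.0, 0, 0
    for n_hidden in range(1, max_hidden + 1):
        train, test = holdout_split(rows)  # new random subset each round
        acc = train_and_score(train, test, n_hidden)
        if acc > best_acc:
            best_acc, best_n, worse = acc, n_hidden, 0
        else:
            worse += 1
        if best_acc == 1.0 or worse >= patience:
            break
    return best_n, best_acc
```

Because each round draws a different test subset, accuracy will fluctuate somewhat from round to round even for the same configuration; repeating each configuration a few times gives a steadier picture.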

That uses one hidden layer. Each hidden neuron computes a weighted combination of the inputs and passes it through an activation function, and each output neuron does the same with the values of the hidden neurons. You can have multiple layers of hidden neurons. It's probably best not to get too carried away; this data probably won't support more than 2 hidden layers. (The more hidden layers, the more training data needed.) And there's no requirement of multiple hidden layers; in general, a network should be as complex as needed to perform well, and no more.
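For intuition, here is a minimal sketch of what a one-hidden-layer network computes on a single input row. It assumes sigmoid activations on every neuron, which is an assumption for illustration; the weights are made up, and WEKA handles all of this internally.

```python
import math


def sigmoid(x):
    """Squash a weighted sum into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer(inputs, weights, biases):
    """Compute one layer; weights[j][i] connects input i to neuron j."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(inputs, hidden_w, hidden_b, output_w, output_b):
    """Run inputs through the hidden layer, then the output layer."""
    hidden = layer(inputs, hidden_w, hidden_b)
    return layer(hidden, output_w, output_b)


# Toy network: 3 inputs, 2 hidden neurons, 2 outputs (made-up weights).
hidden_w = [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]]
hidden_b = [0.0, 0.1]
output_w = [[1.0, -1.0], [-0.5, 0.5]]
output_b = [0.0, 0.0]
print(forward([0.2, 0.7, 0.1], hidden_w, hidden_b, output_w, output_b))
```

Adding a second hidden layer would mean one more call to layer between the two shown in forward, with its own weights to train.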

So try some different configurations if you like. The key point is that anytime the network configuration is changed, the entire network must be re-initialized and trained from the beginning, particularly if different data items are selected for testing. (Otherwise the network is partly trained on test data, which invalidates any test results.)

Submit a short report describing the final network configuration and the WEKA settings needed to produce it. Also include a short description of how you designed your network, how it was tested and validated, and how well it was able to classify the results.

Attachment:- Assignment.zip




All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd