You will implement an end-to-end document classification system

Reference no: EM13372337

You will implement an end-to-end document classification system that predicts which category pages belong to, using the classification scheme.

Your system will use the averaged perceptron machine learning algorithm, which you will implement. You will test your implementation of the learning algorithm on a pre-computed dataset, so that you can see whether your learner performs as expected.
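
As a rough illustration of the learner described above, here is a minimal multiclass averaged perceptron sketch in Python. The class name `AveragedPerceptron`, the `train` helper, and the sparse feature-dictionary input format are assumptions made for this example, not part of the assignment. The sketch sums every weight after each training example, which is easy to verify but slow; lazy timestamp-based averaging is a common optimisation that produces the same averages.

```python
from collections import defaultdict


class AveragedPerceptron:
    """Multiclass averaged perceptron over sparse {feature: value} dicts."""

    def __init__(self, classes):
        self.classes = sorted(classes)      # fixed order gives stable tie-breaks
        self.weights = defaultdict(float)   # (feature, label) -> current weight
        self._totals = defaultdict(float)   # running sum of each weight over steps
        self._steps = 0                     # number of training examples seen

    def score(self, features, label):
        return sum(self.weights.get((f, label), 0.0) * v
                   for f, v in features.items())

    def predict(self, features):
        # On ties, max() returns the first class in sorted order.
        return max(self.classes, key=lambda c: self.score(features, c))

    def update(self, features, gold):
        guess = self.predict(features)
        if guess != gold:
            # Standard perceptron step: reward the gold class, penalise the guess.
            for f, v in features.items():
                self.weights[(f, gold)] += v
                self.weights[(f, guess)] -= v
        # Accumulate the post-update weights so they can be averaged later.
        for key, w in self.weights.items():
            self._totals[key] += w
        self._steps += 1
        return guess

    def average(self):
        """Replace each weight with its mean over all training steps."""
        if self._steps:
            for key in list(self.weights):
                self.weights[key] = self._totals[key] / self._steps


def train(model, data, epochs=5):
    """data is a list of (features, label) pairs; shuffle between epochs in practice."""
    for _ in range(epochs):
        for features, label in data:
            model.update(features, label)
    model.average()
    return model
```

Call `average()` only once, after training has finished: the sketch overwrites the raw weights, so further updates after averaging would no longer be meaningful.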

Once you have your learner, you will apply it to the article classification task, using features you design and extract yourself. You will evaluate your classifier using the n-fold cross validation technique, which you will also implement.
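
The n-fold cross validation procedure mentioned above can be sketched as follows. The function name `k_fold_splits`, the fixed shuffle seed, and the round-robin assignment of items to folds are assumptions chosen for this example; other partitioning schemes (for instance, stratifying by category) are equally valid.

```python
import random


def k_fold_splits(items, k=10, seed=0):
    """Yield (train, test) lists so that each item is tested exactly once."""
    if not 2 <= k <= len(items):
        raise ValueError("k must be between 2 and the number of items")
    shuffled = list(items)
    # A fixed seed keeps the folds identical across runs, so results are comparable.
    random.Random(seed).shuffle(shuffled)
    # Round-robin slicing gives folds whose sizes differ by at most one.
    folds = [shuffled[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test
```

A fresh model should be trained on each `train` list and evaluated only on the matching `test` list, so that no test document ever influences the weights used to classify it.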

Finally, you will describe your experiments in a three-page report, which you will submit alongside a tarball/zip file including your code and instructions to run your system.

You are free to use a programming language of your choice to implement the assignment.

The assessment of this assignment is not about the quality of your code. Rather, it is about how well you can set up, evaluate and analyse a typical statistical natural language processing experiment.

However, the correctness of your code will prove critical in producing intelligible results: if you do not implement your learner, extractor and evaluator correctly, you will produce results that are impossible to explain.

An important part of this assignment is learning to identify and describe relevant details. There is an almost limitless combination of measures you can use or experiments you can do to analyse how your system performs. However, space is limited, so you must be selective. Once you have a correct implementation, asking the right questions and using statistics that answer them concisely is the key to good marks.

You will be assessed on a 3-page report (not including tables and/or diagrams) that describes and analyses your results. You are not required to describe your implementation in the report.

The analysis of the results of the first machine learning problem should be brief. This experiment is to help you verify the correctness of your implementation.

Most of your report should describe your article classification experiment. Describe which features you included, and identify which types of features were most important for your classifier's accuracy. Characterise the kinds of errors the system made, using some combination of qualitative and quantitative analysis.

Although in general the choice of how to present your results is up to you, you must include micro-averaged Precision, Recall and F-Measure statistics using 10-fold cross validation for the article classification task. You are encouraged to evaluate a baseline configuration using only the most obvious features (such as bag-of-words), and analyse the contribution of more innovative features individually.
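
One possible way to compute the micro-averaged statistics is sketched below. The function name `micro_prf` is hypothetical, F-Measure is assumed to mean the balanced F1 score, and `None` is used here as an assumed convention for a document the system declined to label. Micro-averaging pools true positives, false positives and false negatives over all categories before dividing; under cross validation, one option is to pool the gold and predicted labels from all ten test folds and call the function once.

```python
def micro_prf(gold, predicted):
    """Return micro-averaged (precision, recall, F1) for single-label data.

    gold and predicted are equal-length sequences of category labels.
    A predicted label of None means the system made no prediction.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    tp = fp = fn = 0
    for g, p in zip(gold, predicted):
        if p == g:
            tp += 1          # correct label for category g
        else:
            fn += 1          # category g missed this document
            if p is not None:
                fp += 1      # category p wrongly claimed this document
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

When every document receives exactly one predicted label, each error counts as both one false positive and one false negative, so micro-averaged precision, recall and F1 all equal plain accuracy. Per-category figures are therefore usually more informative for error analysis.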



All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd