Programming Assignment
Named Entity Recognition (NER) with Recurrent Neural Network (RNN)
Build a recurrent neural network for Named Entity Recognition (NER) on the CoNLL 2003 dataset. Your task is to classify words into 10 different classes: <pad>, O, B-ORG, B-PER, B-LOC, B-MISC, I-ORG, I-PER, I-LOC, I-MISC. We are identifying whether words are part of a phrase referring to an organization, person, location, or miscellaneous entity. B indicates that the word is at the beginning of the phrase, I indicates that the word is inside the phrase but not the first word, and O indicates that it is outside any phrase (does not belong to one).
Data: You can find the training, validation, and test sets on Blackboard. You will build the model and tune parameters using the training and validation data, and evaluate the final model (after all development and tuning) on the test data.
Pre-processing: Read the complete data. The first column contains the words to be classified, and the last column shows the gold-standard tag for each word. Lower-case capitalized words (i.e., words that start with a capital letter) but not all-caps words (e.g., USA). Do not remove stopwords. The data is already separated by sentence and tokenized, so do not use other tools to tokenize it for this task. Separate the data by sentence. Once you know the maximum sentence length in the data, append 0s at the end of shorter sentences to make them match this max length. Set the tag for the 0s to <pad>.
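The pre-processing steps above can be sketched as follows. This is a minimal illustration, not the required implementation: the inline `sample` stands in for the Blackboard files, and the padding token/tag names (`"0"`, `"<pad>"`) follow the conventions described above.

```python
def read_conll(lines):
    """Split CoNLL-format lines into sentences of (word, tag) pairs."""
    sentences, current = [], []
    for line in lines:
        line = line.strip()
        if not line:                      # blank line ends a sentence
            if current:
                sentences.append(current)
                current = []
            continue
        cols = line.split()
        word, tag = cols[0], cols[-1]     # first column = word, last = gold tag
        # Lower-case capitalized words (e.g., "Soccer") but keep
        # all-caps words (e.g., "USA", "MEXICO") unchanged.
        if word[0].isupper() and not word.isupper():
            word = word.lower()
        current.append((word, tag))
    if current:
        sentences.append(current)
    return sentences

def pad_sentences(sentences, pad_word="0", pad_tag="<pad>"):
    """Append padding so every sentence matches the maximum length."""
    max_len = max(len(s) for s in sentences)
    return [s + [(pad_word, pad_tag)] * (max_len - len(s)) for s in sentences]

# Tiny inline sample in CoNLL column format (two sentences).
sample = [
    "Soccer NN B-NP O",
    "- : O O",
    "MEXICO NNP B-NP B-LOC",
    "",
    "GET VB B-VP O",
]
sents = pad_sentences(read_conll(sample))
```

In a real run you would replace `sample` with the lines of the training, validation, and test files, and additionally map words and tags to integer indices before feeding them to the network.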
Training: Build an RNN. Start with a vanilla RNN with one layer of 256 hidden units and a fully connected output layer with softmax as the activation function. Use the Adam optimizer and cross-entropy as the loss function, with a learning rate of 0.0001. Then try a bidirectional RNN with the same settings. Train with 2000 mini-batches per epoch. You may see convergence around 5000 epochs. You can change the RNN unit to LSTM or GRU in both the unidirectional and bidirectional architectures, and experiment with different learning rates and batch sizes. Select the system architecture and tune hyperparameters and parameters using the training and validation data. Finally, for the best architecture among the 6 above (RNN, bi-RNN, LSTM, bi-LSTM, GRU, bi-GRU — pick one!), make the necessary modifications to update the embeddings along with the rest of the network. This is your 7th and final system. Save your trained systems (i.e., models) using libraries such as callbacks.ModelCheckpoint(...) or model.save_weights(...).
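A Keras sketch of the architecture described above, under stated assumptions: `MAX_LEN`, `VOCAB_SIZE`, and the embedding dimension (100) are placeholder values you would determine from your own data, and the checkpoint file name is arbitrary. The embedding layer is trainable by default; for systems 1–6 you would freeze it (e.g., `trainable=False` with pre-trained weights), and for the 7th system set it trainable so the embeddings update with the rest of the network.

```python
import numpy as np
import tensorflow as tf

MAX_LEN = 113        # assumption: maximum sentence length found in the data
VOCAB_SIZE = 20000   # assumption: size of the word index
NUM_CLASSES = 10     # the 10 tags listed above, including the padding tag

def build_model(bidirectional=False, lr=1e-4):
    """Tagger: embeddings -> RNN(256) -> per-token softmax over the tags."""
    rnn = tf.keras.layers.SimpleRNN(256, return_sequences=True)
    if bidirectional:
        rnn = tf.keras.layers.Bidirectional(rnn)
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(VOCAB_SIZE, 100),  # trainable by default
        rnn,
        tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=lr),
                  loss="categorical_crossentropy", metrics=["accuracy"])
    return model

model = build_model(bidirectional=True)

# Save the best weights during training, as the assignment suggests:
checkpoint = tf.keras.callbacks.ModelCheckpoint("best_model.keras",
                                                save_best_only=True)
# model.fit(X_train, y_train, validation_data=(X_val, y_val),
#           batch_size=..., epochs=..., callbacks=[checkpoint])

# Sanity check: per-token class distribution for each position.
out = model(np.zeros((2, MAX_LEN), dtype="int32"))
```

Swapping `SimpleRNN` for `tf.keras.layers.LSTM` or `tf.keras.layers.GRU` gives the other four architectures.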
Testing: Apply your trained models (7 in total) to the test data. Save your output and results in a .txt or a .log file. Results should be in the following format (Word Gold_Standard Prediction):

SOCCER O O
- O O
MEXICO B-LOC B-LOC
GET O O
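Writing the output file can be sketched as below. The helper name `write_results` and the file name `results.txt` are illustrative; `words`, `gold`, and `pred` are assumed to be aligned per-token lists with padding tokens already removed.

```python
def write_results(words, gold, pred, path="results.txt"):
    """Write one 'Word Gold_Standard Prediction' line per token."""
    with open(path, "w") as f:
        for w, g, p in zip(words, gold, pred):
            f.write(f"{w} {g} {p}\n")

# Example with the tokens from the format shown above:
write_results(["SOCCER", "-", "MEXICO", "GET"],
              ["O", "O", "B-LOC", "O"],
              ["O", "O", "B-LOC", "O"])
```

You would call this once per model (7 output files), then run the evaluation script on each file.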
Evaluation: Run conlleval.py on your output. Use the get_result function to print your accuracy in the log file.