Analyse methods and algorithms

Reference no: EM132989202

COSC 2637 Big Data Processing - RMIT University

Overview
Write advanced MapReduce programs. This gives you the chance to develop an in-depth understanding of the principles involved in solving complex problems on the Hadoop execution platform, and to analyze solutions by applying the knowledge learned in this course to achieve the optimal outcome.

Learning Outcome 1: model and implement efficient big data solutions for various application areas using appropriately selected algorithms and data structures.

Learning Outcome 2: analyse methods and algorithms, to compare and evaluate them with respect to time and space requirements and make appropriate design choices when solving real-world problems.

Learning Outcome 3: motivate and explain trade-offs in big data processing technique design and analysis in written and oral form.

Learning Outcome 4: explain Big Data fundamentals, including the evolution of Big Data, the characteristics of Big Data and the challenges introduced.

Learning Outcome 6: apply the novel architectures and platforms introduced for Big Data, i.e. Hadoop, MapReduce and Spark.

Assessment details

Task 1 - Count word co-occurrence frequency
Write a MapReduce program that uses the pairs approach and outputs the frequency of each word pair.
- The word pairs "(a, b)" and "(b, a)" are considered different word pairs.
- Do not count or output pairs of identical words, e.g., "(a, a)".
- Two words are considered to co-occur if they are in the same line and the number of words between them is at most 3 (<=3).
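
The pairs approach can be illustrated without a cluster. The following is only a minimal plain-Java sketch of the map and reduce logic, not a Hadoop job: the class and method names are hypothetical, lines are assumed to be tokenized on whitespace, and a pair "(a, b)" is assumed to be emitted only when a appears before b in the line. A symmetric reading of the task would also emit the reversed key for every co-occurrence.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class CooccurrencePairs {
    // Assumption: "number of words between them <= 3" means the two
    // positions differ by at most MAX_GAP + 1.
    static final int MAX_GAP = 3;

    // Map step: emit one "(a, b)" key for every co-occurring pair in a line,
    // where a precedes b (assumed reading of "different word pairs").
    static List<String> mapLine(String line) {
        List<String> pairs = new ArrayList<>();
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return pairs;
        }
        String[] words = trimmed.split("\\s+");
        for (int i = 0; i < words.length; i++) {
            int last = Math.min(words.length - 1, i + MAX_GAP + 1);
            for (int j = i + 1; j <= last; j++) {
                if (words[i].equals(words[j])) {
                    continue; // skip pairs of identical words such as (a, a)
                }
                pairs.add("(" + words[i] + ", " + words[j] + ")");
            }
        }
        return pairs;
    }

    // Reduce step: sum the occurrences of each emitted key.
    static Map<String, Integer> reduce(List<String> emitted) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String key : emitted) {
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> emitted = new ArrayList<>();
        for (String line : new String[] {"a b c", "b a"}) {
            emitted.addAll(mapLine(line));
        }
        reduce(emitted).forEach((k, v) -> System.out.println(k + "\t" + v));
    }
}
```

In an actual Hadoop job, the body of `mapLine` would sit inside a mapper's map method and `reduce` inside a reducer; the counting logic itself would stay the same.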

Task 2 - Count word pair relative frequency
Write a MapReduce program that uses the pairs approach and outputs the relative frequency of each word pair.
- The word pairs "(a, b)" and "(b, a)" are considered different word pairs.
- Do not count or output pairs of identical words, e.g., "(a, a)".
- Two words are considered to co-occur if they are in the same line and the number of words between them is at most 3 (<=3).
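
The task does not define relative frequency. A common definition, assumed here, is f(b | a) = count(a, b) / count(a, *), where count(a, *) is the total count of all pairs whose first word is a. The plain-Java sketch below applies that definition in two passes over a list of emitted pairs; the class and method names are hypothetical, and the tab-separated key format is chosen only for illustration. A MapReduce job would typically obtain the marginal count(a, *) inside the job itself, for example with the order-inversion pattern, rather than with an in-memory map.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class RelativeFrequency {
    // Each element of pairs is {left, right}: one emitted co-occurrence.
    // Returns f(right | left) keyed by "left<TAB>right".
    static Map<String, Double> relativeFrequencies(List<String[]> pairs) {
        Map<String, Integer> marginal = new HashMap<>(); // count(a, *)
        Map<String, Integer> joint = new TreeMap<>();    // count(a, b)
        for (String[] p : pairs) {
            marginal.merge(p[0], 1, Integer::sum);
            joint.merge(p[0] + "\t" + p[1], 1, Integer::sum);
        }
        Map<String, Double> result = new TreeMap<>();
        for (Map.Entry<String, Integer> e : joint.entrySet()) {
            String left = e.getKey().substring(0, e.getKey().indexOf('\t'));
            double total = marginal.get(left).doubleValue();
            result.put(e.getKey(), e.getValue() / total);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> pairs = List.of(
                new String[] {"a", "b"},
                new String[] {"a", "b"},
                new String[] {"a", "c"},
                new String[] {"b", "a"});
        relativeFrequencies(pairs)
                .forEach((k, v) -> System.out.println(k + "\t" + v));
    }
}
```

With the sample input, the pair (a, b) accounts for two of the three pairs that start with a, so its relative frequency is 2/3, while (b, a) is the only pair starting with b and gets 1.0.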

Task 3 - Implement PAM algorithm with a MapReduce Program
The most common realization of k-medoid clustering is the Partitioning Around Medoids (PAM) algorithm which is described below:
Step 1. Initialize: randomly select k of the n data points as the medoids.
Step 2. Assignment step: Associate each data point to the closest medoid.
Step 3. Update step: For each medoid m and each data point o associated to m, swap m and o, and compute the total cost of the configuration (that is, the average dissimilarity of o to all the data points associated to m). Select the medoid o with the lowest cost of the configuration.
Step 4. Repeat alternating steps 2 and 3 until there is no change in the assignments.
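
The alternation between the assignment and update steps can be sketched in plain Java on an in-memory array, as below. This is an illustration under stated assumptions, not a MapReduce implementation: the class and method names are hypothetical, dissimilarity is assumed to be Euclidean distance on the two coordinates, the initial medoids are passed in as indices rather than chosen randomly, and an iteration cap (`maxIter`) is added as a safeguard against degenerate ties. Because every candidate in a cluster is compared against the same set of members, minimizing the summed dissimilarity selects the same point as minimizing the average.

```java
import java.util.Arrays;

class PamSketch {
    // Assumed dissimilarity: Euclidean distance between 2-D points.
    static double dist(double[] p, double[] q) {
        return Math.hypot(p[0] - q[0], p[1] - q[1]);
    }

    // Step 2: label each point with the position of its closest medoid.
    static int[] assign(double[][] pts, int[] medoids) {
        int[] label = new int[pts.length];
        for (int i = 0; i < pts.length; i++) {
            for (int c = 1; c < medoids.length; c++) {
                double current = dist(pts[i], pts[medoids[label[i]]]);
                if (dist(pts[i], pts[medoids[c]]) < current) {
                    label[i] = c;
                }
            }
        }
        return label;
    }

    // Step 3: in each cluster, try every member as the medoid and keep
    // the one with the lowest summed dissimilarity to the cluster members.
    static int[] update(double[][] pts, int[] label, int[] previous) {
        int[] medoids = previous.clone(); // an empty cluster keeps its medoid
        double[] best = new double[previous.length];
        Arrays.fill(best, Double.POSITIVE_INFINITY);
        for (int o = 0; o < pts.length; o++) {
            double cost = 0;
            for (int j = 0; j < pts.length; j++) {
                if (label[j] == label[o]) {
                    cost += dist(pts[o], pts[j]);
                }
            }
            if (cost < best[label[o]]) {
                best[label[o]] = cost;
                medoids[label[o]] = o;
            }
        }
        return medoids;
    }

    // Step 4: alternate steps 2 and 3 until the assignments stop changing.
    static int[] pam(double[][] pts, int[] initialMedoids, int maxIter) {
        int[] medoids = initialMedoids.clone();
        int[] label = assign(pts, medoids);
        for (int iter = 0; iter < maxIter; iter++) {
            medoids = update(pts, label, medoids);
            int[] next = assign(pts, medoids);
            if (Arrays.equals(next, label)) {
                break;
            }
            label = next;
        }
        return medoids;
    }

    public static void main(String[] args) {
        double[][] pts = {{0, 0}, {1, 0}, {2, 0}, {10, 0}, {11, 0}, {12, 0}};
        System.out.println(Arrays.toString(pam(pts, new int[] {0, 1}, 100)));
    }
}
```

In a MapReduce setting, one natural split is to run the assignment step in the mapper (keyed by closest medoid) and the update step in the reducer (one cluster per key), with a driver loop repeating the job until the assignments stabilize; that split is a design suggestion, not something the task prescribes.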

(a) Your program must correctly implement PAM. In your code, provide detailed comments to specify where each step is implemented. For example:
//Step 2 start.
...
Block of code;
...
//Step 2 end.

Run your PAM MapReduce program to cluster the point dataset NYTaxiLC1000 1 (1000 points given as longitude and latitude, on lines 1 to 1000) for each k where 1 ≤ k ≤ 6. Note that the initial medoids are always the points at lines 100, 200, 300, 400, 500 and 600 (i.e., for k = 1, the initial medoid is the point at line 100; for k = 2, the initial medoids are the points at lines 100 and 200; and so on for k = 3, 4, 5 and 6).
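
The fixed initialization rule maps each value of k to the first k multiples of 100. The short Java sketch below makes that mapping explicit; the class and method names are hypothetical, and line numbers are assumed to be 1-based as in the task description, so a zero-based array index would be the line number minus one.

```java
import java.util.Arrays;

class InitialMedoids {
    // Returns the 1-based line numbers of the initial medoids for a given k:
    // 100, 200, ..., 100 * k.
    static int[] lines(int k) {
        if (k < 1 || k > 6) {
            throw new IllegalArgumentException("k must be between 1 and 6");
        }
        int[] result = new int[k];
        for (int i = 0; i < k; i++) {
            result[i] = 100 * (i + 1);
        }
        return result;
    }

    public static void main(String[] args) {
        for (int k = 1; k <= 6; k++) {
            System.out.println("k = " + k + ": " + Arrays.toString(lines(k)));
        }
    }
}
```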

(b) Visualize the clustering results. Points belonging to the same cluster should be drawn in the same color, and the medoid of each cluster should be highlighted.

(c) Determine the best setting of k (3 ≤ k ≤ 6) and explain why.

Attachment:- Big Data Processing.rar






All rights reserved! Copyrights ©2019-2020 ExpertsMind IT Educational Pvt Ltd