Comparing software development workloads

Reference no: EM131341253

Estimating the cost of developing software in terms of work load is difficult since it is a challenge to quantify the size and complexity of a software system. The article Analysis of Size Metrics and Effort Performance Criterion in Software Cost Estimation provides an overview of different metrics used to assess size and complexity (Malathi & Sridhar, 2012). The metrics include counts of lines of code, function point counts, and operation counts. Function point counts are often utilized because they can be estimated based on project design specifications.

The dataset pointworkload.csv contains data collected from 104 programming projects at AT&T between 1986 and 1991 (Matson & Huguenard, 2005). The dataset includes the number of work hours for each project, the function point count for each project, and identifiers for the operating system, data management system, and programming language used. In this application, you will investigate whether operating system, data management system, and programming language affect the number of work hours per function point for a project.

Open the dataset pointworkload.csv in Excel. Create a new column that calculates the number of work hours per function point for each project. Save the file with this new data column.
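For readers who want to check the Excel column outside the spreadsheet, the same calculation can be sketched in Python. This is only an illustration: the column names WorkHours, FunctionPoints, and HoursPerFP are assumptions (the real headers in pointworkload.csv may differ), and the two sample rows are made up.

```python
import csv
import io

# Hypothetical column names; the real headers in pointworkload.csv may differ.
HOURS_COL = "WorkHours"
FP_COL = "FunctionPoints"
RATIO_COL = "HoursPerFP"


def add_hours_per_fp(rows):
    """Return new row dicts with work hours per function point appended."""
    out = []
    for row in rows:
        hours = float(row[HOURS_COL])
        points = float(row[FP_COL])
        new_row = dict(row)
        # Leave the cell empty rather than divide by zero.
        new_row[RATIO_COL] = hours / points if points else ""
        out.append(new_row)
    return out


# Tiny made-up sample standing in for the real file.
sample = io.StringIO("WorkHours,FunctionPoints,OS\n1200,100,0\n900,60,1\n")
rows = add_hours_per_fp(list(csv.DictReader(sample)))
for row in rows:
    print(row)
```

To process the real file, replace the sample with an opened file handle and write the result back out with csv.DictWriter.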

Next, you would want to look at the distribution of work hours per function point in a frequency diagram (histogram). Doing so in Excel requires either binning and counting the data yourself or installing the Analysis ToolPak add-in. However, even with the add-in, simply getting a histogram requires multiple steps. Excel is designed for data presentation, not for significant statistical analysis; it is capable of such analysis, but only with add-ins, macros, or programming. Instead of taking these steps, you will now switch to SPSS, a software tool designed for statistical analysis.
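To make "binning and counting the data yourself" concrete, here is a minimal Python sketch of equal-width binning. The bin width of 5 and the sample values are assumptions chosen for illustration, not values from the dataset.

```python
import math
from collections import Counter


def bin_counts(values, width):
    """Count values in equal-width bins; keys are each bin's lower edge."""
    counts = Counter(math.floor(v / width) * width for v in values)
    return dict(sorted(counts.items()))


# Made-up hours-per-function-point values, for illustration only.
values = [3.2, 7.9, 8.1, 12.5, 14.0, 14.9, 22.3, 41.0]
hist = bin_counts(values, width=5)
for lower, n in hist.items():
    # One '#' per observation gives a rough text histogram.
    print(f"{lower:>3} to {lower + 5:<3} {'#' * n}")
```

Empty bins are simply omitted from the result, so gaps in the printed rows indicate ranges with no observations. An isolated bin far from the rest is a hint that an outlier may be present.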

Go to the Resources section for Unit 4, and download the document IBM_SPSS_Installation_and_Registration_Instructions. This will guide you through the process of installing the statistical analysis platform SPSS, which you will use for the remainder of this assignment.

  1. Import the file you revised in Excel, which now includes work hours per function point, into SPSS. When prompted, confirm that variable names are included at the top of your file. Take a screenshot showing your successful installation and import, and paste it into your overall document.
  2. In the top toolbar, select Analyze, Descriptive Statistics, Frequencies. Put the work hours per function point variable you created in the Variable(s) column. Click Charts and select Histogram. Then, click Continue and OK. SPSS will now run the requested analysis. In the Output, scroll down to the histogram and copy and paste it into your overall document. Describe the distribution of the data. Does it appear to be normally distributed? What are the average and standard deviation? Are there any outliers?
    Now, you are ready to determine whether operating system, data management system, or language affects the work hours per function point. To do this, you will use two different statistical tools: the t-test for a difference in means between two independent samples, and the analysis of variance.
  3. There are two different operating systems utilized. A 0 indicates UNIX, and a 1 indicates MVS. The t-test will allow you to assess the null hypothesis that the two operating systems give the same average work load per function point. Select Analyze, Compare Means, Independent-Samples T-Test. Your test variable is work hours per function point. Your grouping variable is OS. You will need to click Define Groups and make Group 1 = 0 (UNIX) and Group 2 = 1 (MVS). With these defined, click Continue and OK to get both the group statistics and the t-test results. Use the group statistics to calculate the t-value. Show all of your work for the calculation. For α=0.05, what is the p-value for the hypothesis? Based on this result, draw a conclusion as to whether or not the different operating systems result in a significant difference in work load per function point.
  4. By examining the t-test results from the previous question, you can see that both the t-statistic and the p-value are calculated there. You will be running several tests to determine if programming language affects work load per function point, and you should draw your data from these tables rather than calculating by hand. Go back to your Independent-Samples T-Test and change the Grouping Variable to Language. Define the groups as 1 (Cobol) and 2 (PLI). Copy the t-test results to your overall document. Repeat this process for groups 1 (Cobol) and 3 (C), groups 1 (Cobol) and 4 (Other), groups 2 (PLI) and 3 (C), groups 2 (PLI) and 4 (Other), and groups 3 (C) and 4 (Other). Copy all six t-test results to your overall document. Based on these results, draw a conclusion as to whether or not the different programming languages result in a significant difference in work load per function point. Be sure to state the different null hypotheses considered, and which are rejected and which are not rejected at α=0.05.
  5. Running six different t-tests certainly answers the question of whether or not programming language affects work load per function point, but it is relatively time-consuming to run and assess each of these results separately. Analysis of variance (ANOVA) allows this multiple-group comparison in a single test. Go to Analyze, Compare Means, One-Way ANOVA. Select work hours per function point as your dependent variable and Language as the factor, then click OK. Copy the ANOVA table to your overall document. Explain what the ANOVA table tells you and what conclusions can be drawn.
  6. ANOVA has the downside that it only tells you whether some group is significantly different from some other group; it does not identify those groups. You can obtain that information by adding a post hoc test to compare means. Go back to the One-Way ANOVA and click on Post Hoc. You will see numerous options. These are all different methods for comparing the groups, and each approaches the comparison differently. You will use the Tukey comparison here. Select Tukey, then click Continue and OK. You will see both a comparison table and a table of homogeneous subsets. From this data you should be able to conclude that there is a significant difference between 1 (Cobol) and 2 (PLI). Copy these tables to your overall document and explain how that conclusion may be drawn. How does this compare to your t-test conclusions?
  7. Use the t-test and/or ANOVA to determine the impact of database management system on work load per function point. The values are 1 (IDMS), 2 (IMS), 3 (INFORMIX), 4 (INGRES), and 5 (Other). You should present your data, draw conclusions, and explain those conclusions.
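The hand calculation in step 3 and the ANOVA table in step 5 rest on two short formulas, sketched below in Python. This is a sketch under stated assumptions, not a description of how SPSS computes its output: pooled_t uses the equal-variance form of the t statistic (SPSS also reports a row that does not assume equal variances), and all numbers in the example are made up rather than taken from the dataset. Turning the statistics into p-values needs the t and F distributions, which are not covered here; read those from the SPSS output or a table.

```python
import math
from statistics import mean


def pooled_t(n1, m1, s1, n2, m2, s2):
    """Equal-variance t statistic and degrees of freedom from group summaries.

    n = group size, m = group mean, s = group standard deviation.
    """
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df  # pooled variance
    t = (m1 - m2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, df


def one_way_anova_f(groups):
    """F statistic with its between- and within-group degrees of freedom."""
    all_values = [x for g in groups for x in g]
    grand = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Made-up group summaries and observations, for illustration only.
t, df = pooled_t(10, 12.0, 3.0, 10, 9.0, 3.0)
f, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f"t = {t:.3f} on {df} degrees of freedom")
print(f"F = {f:.3f} on ({df_b}, {df_w}) degrees of freedom")
```

With exactly two groups, the one-way ANOVA F statistic equals the square of the pooled t statistic, which is one way to see why ANOVA generalizes the two-sample t-test to several groups.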

Malathi, S. & Sridhar, S. (2012). Analysis of size metrics and effort performance criterion in software cost estimation. Indian Journal of Computer Science and Engineering, 3(1), pp. 24-31. Retrieved from https://www.ijcse.com/docs/INDJCSE12-03-01-101.pdf

Matson, J. E. & Huguenard, B. R. (2005). Evaluating aptness of a regression model. Journal of Statistics Education Data Archive. Retrieved from https://www.amstat.org/publications/jse/jse_data_archive.htm
