Mathematics, 19.03.2021 21:10 brandon1748

Computer software is commonly used to translate text from one language to another. As part of his Ph.D. thesis, Philipp Koehn developed a phrase-based translation program called Pharaoh. The quality of the translation can vary. A good translation system should match a professional human translation. It is important to be able to quantify how good the translations produced by Pharaoh are. The IBM T. J. Watson Research Center developed methods to measure the quality of a translation from one language to another. One of these is the BiLingual Evaluation Understudy (BLEU). BLEU is a score ranging from 0 to 1 that indicates how well a computer translation matches a professional human translation of the same text. Higher scores indicate a better match. BLEU helps companies that develop translation software "to monitor the effect of daily changes to their systems in order to weed out bad ideas from good ideas."

To compare Pharaoh's ability to translate with similar computer translation software, Koehn took a random sample of 100 blocks of Spanish text, each of which contained 300 sentences, and used Pharaoh to translate each of these to English. The BLEU score was calculated for each of the 100 blocks. He wants to use these data to see whether Pharaoh's mean BLEU score differs from that of another leading translation program, which has a population mean score of 0.295. Open the data file BLEU-Scores:

0.294
0.284
0.241
0.249
0.257
0.245
0.291
0.287
0.319
0.313
0.295
0.311
0.291
0.281
0.28
0.3
0.313
0.264
0.272
0.257
0.297
0.279
0.262
0.265
0.211
0.276
0.278
0.267
0.304
0.264
0.281
0.266
0.282
0.324
0.242
0.232
0.31
0.285
0.309
0.284
0.286
0.289
0.308
0.27
0.32
0.284
0.307
0.257
0.266
0.297
0.282
0.251
0.299
0.237
0.287
0.315
0.285
0.284
0.303
0.313
0.307
0.294
0.298
0.312
0.266
0.274
0.273
0.284
0.301
0.286
0.294
0.33
0.292
0.297
0.293
0.29
0.307
0.268
0.284
0.312
0.274
0.302
0.306
0.319
0.281
0.264
0.373
0.343
0.309
0.29
0.297
0.262
0.305
0.348
0.261
0.279
0.322
0.343
0.286
0.233
Use this information to answer questions 6 through 10.

Assuming the requirements are satisfied, calculate a 95% confidence interval for the mean BLEU score. Round your answers to three decimal places, and be sure to put the lower bound in the first box and the upper bound in the second. [Example: (42.335, 54.859)]
( , )
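One way to set up the interval is sketched below in plain Python. It assumes the standard one-sample t interval (mean ± t* × s/√n with n − 1 degrees of freedom); the helper names `t_pdf`, `t_cdf`, and `t_quantile` are made up for this sketch, and the critical value is found numerically rather than read from a table, so treat it as an illustration and not an official solution.

```python
import math
import statistics

# The 100 BLEU scores from the data file, in the order listed above.
scores = [
    0.294, 0.284, 0.241, 0.249, 0.257, 0.245, 0.291, 0.287, 0.319, 0.313,
    0.295, 0.311, 0.291, 0.281, 0.280, 0.300, 0.313, 0.264, 0.272, 0.257,
    0.297, 0.279, 0.262, 0.265, 0.211, 0.276, 0.278, 0.267, 0.304, 0.264,
    0.281, 0.266, 0.282, 0.324, 0.242, 0.232, 0.310, 0.285, 0.309, 0.284,
    0.286, 0.289, 0.308, 0.270, 0.320, 0.284, 0.307, 0.257, 0.266, 0.297,
    0.282, 0.251, 0.299, 0.237, 0.287, 0.315, 0.285, 0.284, 0.303, 0.313,
    0.307, 0.294, 0.298, 0.312, 0.266, 0.274, 0.273, 0.284, 0.301, 0.286,
    0.294, 0.330, 0.292, 0.297, 0.293, 0.290, 0.307, 0.268, 0.284, 0.312,
    0.274, 0.302, 0.306, 0.319, 0.281, 0.264, 0.373, 0.343, 0.309, 0.290,
    0.297, 0.262, 0.305, 0.348, 0.261, 0.279, 0.322, 0.343, 0.286, 0.233,
]


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """P(T <= x): Simpson's rule on [0, |x|], then symmetry about 0."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def t_quantile(p, df):
    """Value t with P(T <= t) = p, by bisection (assumes 0.5 < p < 1)."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


n = len(scores)
mean = statistics.mean(scores)
sd = statistics.stdev(scores)        # sample standard deviation (n - 1 divisor)
se = sd / math.sqrt(n)               # standard error of the mean
t_star = t_quantile(0.975, n - 1)    # critical value for 95% confidence
lower, upper = mean - t_star * se, mean + t_star * se
print(f"({lower:.3f}, {upper:.3f})")
```

If SciPy is available, a library routine for the t distribution could replace the hand-rolled quantile; the numerical version is used here only to keep the sketch dependency-free.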
Calculate the degrees of freedom and the test statistic for a test of $H_0: \mu = 0.295$ against $H_a: \mu \neq 0.295$. Assume the requirements are satisfied. Round the t-statistic to three decimal places (Example: 2.345) and the degrees of freedom to the nearest whole number (Example: 23).
df =
t =
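For reference, the usual one-sample t statistic can be written as follows. The symbols $\bar{x}$ (sample mean), $s$ (sample standard deviation), $n$ (sample size), and $\mu_0$ (hypothesized mean) are notation introduced here, not taken from the question.

```latex
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}, \qquad \mathrm{df} = n - 1
```

Here $\mu_0 = 0.295$ and $n = 100$ blocks, so the test has 99 degrees of freedom.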
Calculate the P-value for a test of $H_0: \mu = 0.295$ against $H_a: \mu \neq 0.295$. Assume the requirements are satisfied. Round your answer to three decimal places (Example: 0.009).
P-value =
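A two-sided P-value is the probability, under the null hypothesis, of a t statistic at least as far from zero as the one observed. The sketch below computes it by numerically integrating the t density; `t_pdf` and `two_sided_p_value` are hypothetical helper names, and the inputs in the usage line are illustrative table values, not the answer to this problem.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p_value(t_stat, df, steps=2000):
    """P(|T| >= |t_stat|) for Student's t with df degrees of freedom."""
    a = abs(t_stat)
    if a == 0:
        return 1.0
    # Simpson's rule for the area under the density between 0 and |t_stat|.
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    # Both tails together are whatever is left outside [-|t|, |t|].
    return max(0.0, 1.0 - 2.0 * area)


# Illustrative check: 2.228 is the tabulated 97.5th percentile for df = 10,
# so the two-sided P-value should come out at about 0.05.
print(round(two_sided_p_value(2.228, 10), 3))
```

To use it here, pass the t statistic and degrees of freedom from the previous question.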
Based on the results of this test, is there enough evidence to say that Pharaoh's ability to translate into English is different from that of the other leading translation software? Use a level of significance of $\alpha = 0.05$.
Yes, because the P-value was greater than the level of significance.
Yes, because the P-value was lower than the level of significance.
No, because the results of the test were statistically insignificant.
No, because the P-value was greater than the level of significance.
Suppose the alternative hypothesis of this test had been lower-tailed instead of two-tailed. How would this affect the conclusions of this test?
Unlike the two-tailed test, we would conclude that there is a difference between Pharaoh's translation and the translation of the other software.
Unlike the two-tailed test, we would conclude that there is no difference between Pharaoh's translation and the translation of the other software.
The results of a lower-tailed test are always opposite the results of a two-tailed test, so we would fail to reject the null hypothesis.
The conclusion would be the same as the two-tailed test. Although the p-value for the lower-tailed test is different, it is still less than alpha.
The conclusion would be the same as the two-tailed test. Although the p-value for the lower-tailed test is the same as the p-value for the two-sided test.
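The relationship between the two kinds of test can be sketched in a few lines. Because the t distribution is symmetric, half of the two-sided P-value sits in each tail, so the lower-tailed P-value depends only on the two-sided value and the sign of the t statistic. The function names and the numbers below are hypothetical, chosen for illustration, and are not the answers to this problem.

```python
def lower_tailed_p(two_sided_p, t_stat):
    """Convert a two-sided t-test P-value into the lower-tailed one.

    By symmetry, half of the two-sided P-value lies in each tail. A negative
    t statistic falls in the lower tail; a positive one falls in the upper.
    """
    half = two_sided_p / 2
    return half if t_stat < 0 else 1 - half


def reject_null(p_value, alpha=0.05):
    """Standard decision rule: reject H0 when the P-value is below alpha."""
    return p_value < alpha


# Hypothetical inputs for illustration only.
p_two = 0.04
for t in (-2.1, 2.1):
    p_low = lower_tailed_p(p_two, t)
    print(t, p_low, reject_null(p_two), reject_null(p_low))
```

With a negative t statistic the lower-tailed P-value is smaller than the two-sided one, so a two-sided rejection carries over; with a positive t statistic the lower-tailed test would not reject.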

