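Comparison and Optimization of Common CAT Ability Estimation Methods: Development of a Combined Ability Estimation Method
Topic: computerized adaptive testing + ability estimation. Source: master's thesis, Jiangxi Normal University, 2014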
【Abstract】: In recent years, advances in measurement theory and computer technology have drawn growing attention to computerized adaptive testing (CAT). Ability estimation plays a central role in CAT: the accuracy of the interim estimates determines how well item selection adapts to the examinee, and this in turn affects the accuracy of the final ability estimate, which is what CAT cares about most. CAT still relies on the main estimation methods inherited from item response theory (IRT), most commonly maximum likelihood estimation (MLE), maximum a posteriori (MAP), expected a posteriori (EAP), and weighted likelihood estimation (WLE).
This thesis reports two studies on comparing and developing ability estimation methods for CAT. Study 1 uses Monte Carlo simulation to compare the four common methods systematically in terms of bias, root mean square error (RMSE), uniformity of item pool usage, and test efficiency. Study 2 builds on Study 1: drawing on the strengths and weaknesses of each method, it develops a new CAT ability estimation method, the combined ability estimation method, which applies the most suitable estimator at each stage of the test so that the existing methods offset one another's weaknesses, with the aim of improving both estimation accuracy and test efficiency. The results show that:
1) MLE has small bias but large RMSE. Its item exposure rates are better than those of the other methods, but its test efficiency is the worst, and it cannot give a valid estimate for special response patterns.
2) WLE has the smallest bias, and its RMSE is better than that of MLE in most conditions. Its exposure rates are the best under a-stratified item selection with uniformly distributed b, and its test efficiency is the highest under maximum-information item selection.
3) MAP has the largest bias and a relatively small RMSE. Its exposure rates do not differ from those of WLE and EAP in most conditions, and its test efficiency is the highest under a-stratified item selection.
4) EAP has the second-largest bias, after MAP, but the smallest RMSE; its test efficiency is slightly lower than that of MAP.
5) The proposed combined method, which uses EAP in the early and middle stages of the test and WLE in the late stage, effectively reduces the bias of EAP while largely preserving its RMSE.
6) The main benefit of the combined method is that it reduces the bias of EAP while keeping RMSE under control: the bias improves by 30% to 40% relative to EAP, while the RMSE is worse than that of EAP by less than 5%.
7) The combined method reduces the bias of EAP at all test lengths studied, and the improvement is larger for short tests.
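The staged idea in result 5 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the thesis's implementation: it assumes a 2PL model, a standard normal prior for EAP, a grid search over [-4, 4], and a hypothetical switch point `late_start` (the abstract does not say where the late stage begins). All function names are illustrative. For the 2PL, Warm's weighted likelihood estimate coincides with the maximizer of the likelihood times the square root of the test information, which is what `wle` uses. The `bias` and `rmse` helpers use the usual definitions, which are assumed to match the criteria compared in Study 1.

```python
import math

# Ability grid from -4 to 4 in steps of 0.01 (an assumption of this sketch).
GRID = [-4.0 + 0.01 * k for k in range(801)]

def p2pl(theta, a, b):
    """Probability of a correct response under the 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def log_lik(theta, items, resp):
    """Log-likelihood of 0/1 responses; items is a list of (a, b) pairs."""
    total = 0.0
    for (a, b), u in zip(items, resp):
        z = a * (theta - b)
        # log P = -log(1 + e^-z), log(1 - P) = -log(1 + e^z)
        total -= math.log1p(math.exp(-z if u == 1 else z))
    return total

def info(theta, items):
    """Test information for the 2PL: sum of a^2 * P * (1 - P)."""
    return sum(a * a * p2pl(theta, a, b) * (1.0 - p2pl(theta, a, b))
               for a, b in items)

def eap(items, resp):
    """Posterior mean on the grid with a standard normal prior."""
    w = [math.exp(log_lik(t, items, resp) - 0.5 * t * t) for t in GRID]
    return sum(t * wt for t, wt in zip(GRID, w)) / sum(w)

def wle(items, resp):
    """Grid maximizer of log L + 0.5 * log I (Warm's WLE for the 2PL)."""
    return max(GRID, key=lambda t: log_lik(t, items, resp)
               + 0.5 * math.log(info(t, items)))

def combined(items, resp, late_start=20):
    """EAP until late_start items are answered, then WLE (hypothetical rule)."""
    return wle(items, resp) if len(resp) >= late_start else eap(items, resp)

def bias(estimates, truths):
    """Mean signed error across simulated examinees."""
    return sum(e - t for e, t in zip(estimates, truths)) / len(truths)

def rmse(estimates, truths):
    """Root mean square error across simulated examinees."""
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimates, truths))
                     / len(truths))
```

In a simulated CAT, `combined(items, resp)` would be called after each response with the items administered so far. Choosing `late_start` relative to test length is the design decision that trades the low RMSE of EAP against the low bias of WLE.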
【Degree-granting institution】: Jiangxi Normal University
【Degree level】: Master's
【Year degree conferred】: 2014
【Classification number】: B841