Journal of Modern Applied Statistical Methods
Volume 6, Issue 1, Article 25
5-1-2007

Fano-Huffman Based Statistical Coding Method

Aladdin Shamilov, Anadolu University, Turkey
Senay Asma, Anadolu University, Turkey

Follow this and additional works at: http://digitalcommons.wayne.edu/jmasm
Part of the Applied Statistics Commons, Social and Behavioral Sciences Commons, and the Statistical Theory Commons

Recommended Citation
Shamilov, Aladdin and Asma, Senay (2007) "Fano-Huffman Based Statistical Coding Method," Journal of Modern Applied Statistical Methods: Vol. 6: Iss. 1, Article 25.
Available at: http://digitalcommons.wayne.edu/jmasm/vol6/iss1/25

This Regular Article is brought to you for free and open access by the Open Access Journals at DigitalCommons@WayneState. It has been accepted for inclusion in Journal of Modern Applied Statistical Methods by an authorized administrator of DigitalCommons@WayneState.
Journal of Modern Applied Statistical Methods
May, 2007, Vol. 6, No. 1, 265-278
Copyright © 2007 JMASM, Inc.
1538-9472/07/$95.00

Fano-Huffman Based Statistical Coding Method

Aladdin Shamilov, Senay Asma
Anadolu University, Turkey

Statistical coding techniques have been used for lossless statistical data compression, applying methods such as the Ordinary, Shannon, Fano, Enhanced Fano, Huffman and Shannon-Fano-Elias coding methods. A new and improved coding method is presented, the Fano-Huffman Based Statistical Coding Method. It holds the advantages of both the Fano and Huffman coding methods: it is more easily applicable than the Huffman coding method and it is closer to optimal than the Fano coding method. The optimality with respect to the other methods is demonstrated on the basis of English, German, Turkish, French, Russian and Spanish.

Key words: Fano-Huffman based statistical coding method, probability distribution of language, entropy, information, optimal code.

Aladdin Shamilov is a Professor in the Department of Statistics. Email: asamilov@anadolu.edu.tr. Senay Asma is a Research Assistant in the Department of Statistics. Email: senayyolacan@anadolu.edu.tr.

Introduction

Problem Statement

Huffman's algorithm is a well-known encoding method that generates an optimal prefix encoding scheme, in the sense that the average code word length is minimum. As opposed to this, Fano's method has not been used as much because it generates prefix encoding schemes that can be sub-optimal (Rueda & Oommen, 2004).

In this article, an improved coding method is presented, which has been named the Fano-Huffman Based Statistical Coding Method, together with applications of this method. The method holds the advantages of both the Fano and the Huffman coding methods. So, it is more easily applicable than the Huffman coding method and is closer to optimal than the Fano coding method. The optimality of the mentioned coding method with respect to the other coding methods is demonstrated on the basis of English, German, Turkish, French, Russian and Spanish.

The classical coding methods and the concept of optimality are described in the section titled Classical Coding Methods and Optimality. The improved coding method, the Fano-Huffman Based Coding Method, by which encoding schemes that are arbitrarily close to the optimum can be easily constructed, is introduced in the section called Fano-Huffman Based Statistical Coding Method.
In the following section, the tables of constructed binary codes are given and the considered methods are compared in the sense of optimality. In the conclusion, the optimality of these results is interpreted with respect to the classical coding methods and suggestions are given.

Overview

Assume that a source alphabet, $S = \{s_1, s_2, \dots, s_n\}$, whose probabilities of occurrence are $P = \{p_1, p_2, \dots, p_n\}$, and a code alphabet, $A = \{a_1, a_2, \dots, a_r\}$, are given. The purpose of this study is the generation of an encoding scheme, $\{s_i \to w_i\}$, in such a way
that $L = \sum_{i=1}^{n} p_i l_i$ is minimized, where $l_i$ is the length of $w_i$.

Information theory has important applications in probability theory, statistics and communication systems. Lossless encoding methods used to solve this problem include Huffman's algorithm (Huffman, 1952), Shannon's method (Shannon & Weaver, 1949), arithmetic coding (Sayood, 2000), Fano's method (Hankerson, Harris, & Johnson, 1998), the enhanced Fano-based coding algorithm (Rueda & Oommen, 2004), etc. Adaptive versions of these methods have been proposed, and can be found in (Faller, 1973; Gallager, 1978; Hankerson et al., 1998; Knuth, 1985; Rueda, 2002; Sayood, 2000). The survey is necessarily brief as this is a well-reputed field.

Also, assume that the source is memoryless or zeroth-order, which means that the occurrence of the next symbol is independent of any other symbol that has occurred previously. Higher-order models include Markov models (Hankerson et al., 1998), dictionary techniques (Ziv & Lempel, 1977; Ziv & Lempel, 1978), prediction with partial matching (Witten, Moffat, & Bell, 1999), grammar based compression (Kieffer & Yang, 2000), etc., and the techniques introduced here are also readily applicable to such structured models.

Classical Coding Methods and Optimality

In this section, the fundamental steps of the classical coding methods are described and the concept of optimality of codes is expounded.

Classical coding methods

Suppose that the source alphabet (alphabet of a language) $S = \{s_1, s_2, \dots, s_n\}$ and its probability distribution $P = \{p_1, p_2, \dots, p_n\}$ are given.

Ordinary Coding Method

This method requires the following steps:
(a) Determine the number $N$ satisfying the inequality $N \ge \log_2 n$, where $N$ is the length of a codeword and $n$ is the number of symbols in the source alphabet;
(b) Enumerate the letters, ignoring the frequency;
(c) Convert the numbers determined by (b) from base 10 to base 2 such that $N$ is the length of the converted number (Roman, 1997).

Shannon Coding Method

The construction of Shannon codes is provided by the steps:
(a) Put $\{p_1, \dots, p_n\}$ in non-increasing order $p_1 \ge p_2 \ge \dots \ge p_n$;
(b) Calculate $l_i = \lceil -\log_2 p_i \rceil$, the length of the codeword, $i = 1, 2, \dots, n$;
(c) Define the dyadic fractions as $P_1 = 0$ and $P_k = \sum_{i=1}^{k-1} p_i$, $k > 1$. Then calculate $P_i$, $i = 1, 2, \dots, n$;
(d) Convert the dyadic fraction $P_i$ to binary form by using Koblitz's trick, then select the first $l_i$ bits as the code corresponding to $s_i$ (Hankerson et al., 2003).
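The Shannon construction described in steps (a)-(d) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `shannon_code` is hypothetical, the probabilities are processed in non-increasing order, floating-point arithmetic is assumed to be accurate enough for the example, and the binary expansion is obtained by repeated doubling, which is one common way to read off the bits of a fraction.

```python
from math import ceil, log2


def shannon_code(probs):
    """Return {symbol: codeword} for a dict {symbol: probability}.

    Hypothetical helper for illustration only.
    """
    # (a) order the symbols by non-increasing probability
    items = sorted(probs.items(), key=lambda kv: -kv[1])
    code = {}
    cumulative = 0.0  # P_k: sum of the probabilities of all earlier symbols
    for symbol, p in items:
        length = ceil(-log2(p))  # (b) l_i = ceil(-log2 p_i)
        # (c)-(d) take the first l_i bits of the binary expansion of P_k
        fraction, bits = cumulative, []
        for _ in range(length):
            fraction *= 2
            bit = int(fraction)
            bits.append(str(bit))
            fraction -= bit
        code[symbol] = "".join(bits)
        cumulative += p
    return code


if __name__ == "__main__":
    print(shannon_code({"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}))
```

For probabilities that are exact powers of 1/2, as in the usage example, the codeword lengths equal $-\log_2 p_i$ exactly, so the average length coincides with the entropy.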
Fano Coding Method

This method involves the steps:
(a) Arrange the probabilities of the symbols of the source alphabet in non-increasing order $p_1 \ge p_2 \ge \dots \ge p_n$;
(b) Divide the set of symbols into two subsets such that the sums of the probabilities of occurrence of the symbols in each subset are equal or almost equal. Then, assign a 0 to the first subset and a 1 to the second;
(c) Repeat step (b) on each subset until all subsets have a single element (Венцель, 1969).
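The recursive splitting in steps (a)-(c) can be sketched as below. This is an illustrative sketch, not the authors' code: the name `fano_code` is hypothetical, symbols are sorted by non-increasing probability, and ties between equally good cut points are resolved by taking the first one, which is an assumption the text does not specify.

```python
def fano_code(probs):
    """Return {symbol: codeword} built by Fano's top-down splitting.

    Hypothetical helper for illustration only.
    """
    # (a) sort symbols by non-increasing probability
    items = sorted(probs.items(), key=lambda kv: -kv[1])
    code = {symbol: "" for symbol, _ in items}

    def split(group):
        if len(group) < 2:
            return  # a single symbol needs no further bits
        total = sum(p for _, p in group)
        # (b) find the cut that makes the two partial sums as equal as possible
        running, best_k, best_diff = 0.0, 1, float("inf")
        for k in range(1, len(group)):
            running += group[k - 1][1]
            diff = abs(running - (total - running))
            if diff < best_diff:
                best_k, best_diff = k, diff
        left, right = group[:best_k], group[best_k:]
        for symbol, _ in left:
            code[symbol] += "0"
        for symbol, _ in right:
            code[symbol] += "1"
        # (c) repeat on each subset
        split(left)
        split(right)

    split(items)
    return code


if __name__ == "__main__":
    print(fano_code({"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}))
```

Only partial sums and comparisons are needed at each step, which is the ease of application that the text attributes to Fano's method.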
Enhanced Fano Coding Method

This method proposes the following steps:
(a) Consider the source alphabet $S = \{s_1, s_2, \dots, s_n\}$ whose probability distribution of occurrences is $P = \{p_1, p_2, \dots, p_n\}$, where $p_1 \ge p_2 \ge \dots \ge p_n$;
(b) Obtain the encoding scheme $\varphi: s_1 \to w_1, \dots, s_n \to w_n$ by Fano's method;
(c) Rearrange $w_1, w_2, \dots, w_n$ into $w'_1, w'_2, \dots, w'_n$ such that $l'_i \le l'_j$ for all $i < j$, and simultaneously maintain $s_1, s_2, \dots, s_n$ in the same order, to yield the encoding scheme $\varphi': s_1 \to w'_1, \dots, s_n \to w'_n$ (Rueda & Oommen, 2004).

Huffman Coding Method

This method is bottom-up while the others are top-down. It can be explained more clearly as follows:
(a) Sort the symbols of the source alphabet in decreasing order of their probabilities;
(b) Merge the two least-probable letters into a single output whose probability is the sum of the corresponding probabilities;
(c) Go to step (a) if the number of remaining outputs is more than 2;
(d) Assign a 0 and a 1 arbitrarily as code words for the two remaining outputs;
(e) Append a 0 and a 1 to the current codeword to obtain the codewords of the preceding outputs, and repeat step (e) if an output is the result of the merger of two outputs in a preceding step. Stop if no output is preceded by another output in a preceding step (Aazhang, 2004).

Shannon-Fano-Elias Coding Method

This method can be explained by the steps:
(a) Consider the source alphabet $S = \{s_1, s_2, \dots, s_n\}$ whose probability distribution of occurrences is $P = \{p_1, p_2, \dots, p_n\}$; the order of the probabilities is not important;
(b) Obtain the cumulative distribution by the function $F(s) = \sum_{a \le s} p(a)$;
(c) Consider the modified cumulative distribution function $\bar{F}(s) = \sum_{a < s} p(a) + \tfrac{1}{2} p(s)$, where $\bar{F}(s)$ denotes the sum of the probabilities of all symbols less than $s$ plus half the probability of the symbol $s$;
(d) Obtain the length of the codeword by the formula $l(s) = \lceil \log_2 \frac{1}{p(s)} \rceil + 1$, where $\lceil \cdot \rceil$ denotes rounding up;
(e) Convert the dyadic fraction $\bar{F}(s)$ to binary form by using Koblitz's trick such that the codeword has $l(s)$ bits (Cover & Thomas, 1991).

The concept of optimality of codes

There exists a uniquely decodable code whose codeword lengths are given by the sequence $\{l_i\}$ if the Kraft inequality $\sum_{i=1}^{n} D^{-l_i} \le 1$ holds, where $D$ is the size of the code alphabet ($D = 2$ for binary codes). Due to the Kraft inequality (Cover & Thomas, 1991), the conditions for optimal codes are as follows:
(a) The average codeword length $L = \sum_{i=1}^{n} p_i l_i$ of an optimal code for a source $S$ is greater than or equal to its entropy $H(S) = -\sum_{i=1}^{n} p_i \log_D p_i$;
(b) The average codeword length $L$ of an optimal code for a source $S$ is strictly less than $H(S) + 1$.

For the source alphabet $S = \{s_1, s_2, \dots, s_n\}$ whose probability distribution of occurrences is $P = \{p_1, p_2, \dots, p_n\}$, the average codeword length is given by $L$, and the entropy of the source alphabet is given by $H(S)$. Under these conditions, it is required to transmit the information as well as possible by using codes that consist of fewer bits. So, this problem can be considered as an optimization problem which consists of minimizing $L = \sum_{i=1}^{n} p_i l_i$ subject to the constraint $\sum_{i=1}^{n} D^{-l_i} \le 1$, where $D$ is the dimension of the codebook, i.e. if the codebook is $\{0, 1\}$ then $D = 2$, etc. This problem is solved by using Lagrange multipliers, and the following result is obtained:

$$l_i^* = -\log_D p_i; \tag{2.1}$$

$$L^* = \sum_{i=1}^{n} p_i l_i^* = -\sum_{i=1}^{n} p_i \log_D p_i; \tag{2.2}$$

$$L^* = H(S). \tag{2.3}$$

But it is not always possible to find an integer codeword length that satisfies (2.1). For this reason, it is necessary to use the entropy lower bound (Cover & Thomas, 1991; Roman, 1997), which satisfies the following inequality:

$$L = \sum_{i=1}^{n} p_i l_i \ge H(S). \tag{2.4}$$

Moreover, if the source is a stationary stochastic process,

$$L \to H(\mathcal{S}), \tag{2.5}$$

where $H(\mathcal{S})$ is the entropy rate of the process.

Given the above, the information per symbol (letter) is given by $I_{\text{inf/letter}} = H(S)/L$, and the optimality criterion for codes is considered to be $I_{\text{inf/letter}} \to 1$ (Венцель, 1969). Moreover, optimality means that if the text is coded by an optimal coding method, the number of 1s and the number of 0s are nearly equal, in the sense of maximum entropy. Hence, optimal codes transmit nearly maximum information, since 1s and 0s are not always equally probable.

Fano-Huffman Based Statistical Coding Method

In this section, a new and improved coding method is proposed, which can be considered as a hybrid method that holds the advantages of both the Fano and the Huffman coding methods.

It is well known that the Fano coding method is a suboptimal procedure for constructing a source code (Rueda & Oommen, 2004). In this method, the source symbols and their probabilities are sorted in non-increasing order of the probabilities, and then the set of symbols is divided into two subsets such that the sums of the probabilities of occurrence of the symbols in each subset are equal or almost equal. The main advantage of this method is the division of the set of symbols.
This is because it requires only simple computations. Hence, the first goal of the improved coding method is to retain this advantage.

The Huffman coding method is an optimal procedure (Cover & Thomas, 1991). In this method, the source symbols and their probabilities are also sorted in decreasing order, and then the two least-probable symbols are merged into a single output whose probability is the sum of the corresponding probabilities. Thus, by this recursive procedure, the optimal Huffman codes are constructed. The advantage of this coding method is that the procedure works from bottom to top. In this way, the short code
words are assigned to the symbols that occur frequently and the long code words are assigned to the symbols that occur rarely. This advantage of the Huffman coding method constitutes the second goal of the improved coding method.

Considering the advantages of these two coding procedures, a hybrid coding method is presented. So, the coding method is more easily applicable than the Huffman coding method and is closer to optimal than the Fano coding method. The codes produced by this coding method are prefix codes and satisfy the sibling property.

The Fano-Huffman based statistical coding method is now proposed in the following form:
(a) Arrange the probabilities of the symbols of the source alphabet in non-increasing order $p_1 \ge p_2 \ge \dots \ge p_n$;
(b) Choose $k$ such that $\left| \sum_{i=1}^{k} p_i - \sum_{i=k+1}^{n} p_i \right|$ is minimized. This number $k$ divides the source symbols into two sets of almost equal probability;
(c) Merge the two least-probable letters in each set into a single output whose probability is the sum of the corresponding probabilities;
(d) Go to step (c) if the number of remaining outputs is more than 2;
(e) Assign a 0 and a 1 arbitrarily as codewords for the two remaining outputs;
(f) Append a 0 and a 1 to the current codeword to obtain the codewords of the preceding outputs and repeat step (e); if no output is preceded by another output in a preceding step, merge the two least-probable subsets into a single output whose probability is the sum of the corresponding probabilities;
(g) Stop if no output is preceded by another output in a preceding step.

Note that, according to step (b) and depending on the size of the source alphabet, the set of symbols can be divided into more subsets ($2^j$, $j = 1, 2, \dots$) of equal or almost equal probabilities.

The advantages of the proposed method arise from the comparison of this method with the other aforesaid coding methods. The applications of this method and the comparisons are given in the following section.

Tables, Computational Details and Comparisons

In this section, in order to indicate the advantages of the proposed Fano-Huffman Based statistical coding method, it is compared with the traditional coding methods. Various binary codes for English, German, Turkish, French, Russian and Spanish symbols are constructed in the sense of optimality.
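One possible reading of the hybrid procedure in steps (a)-(g) is sketched below for the simplest case of two subsets. This is an interpretation made for illustration, not the authors' implementation: the names `huffman_code`, `fano_split` and `fano_huffman_code` are hypothetical, a single Fano split is assumed (the text also allows $2^j$ subsets), and the two subset trees are joined by prefixing 0 and 1, which is what merging the two remaining subsets amounts to when there are only two of them.

```python
import heapq
from itertools import count


def huffman_code(probs):
    """Bottom-up Huffman merging; a one-symbol input gets the empty codeword."""
    tie = count()  # unique tie-breaker so the heap never compares dicts
    heap = [(p, next(tie), {symbol: ""}) for symbol, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)  # the two least-probable outputs
        p1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c0.items()}
        merged.update({s: "1" + w for s, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, next(tie), merged))
    return heap[0][2]


def fano_split(items):
    """Index k minimising |sum(items[:k]) - sum(items[k:])|, as in step (b)."""
    total = sum(p for _, p in items)
    running, best_k, best_diff = 0.0, 1, float("inf")
    for k in range(1, len(items)):
        running += items[k - 1][1]
        diff = abs(2 * running - total)
        if diff < best_diff:
            best_k, best_diff = k, diff
    return best_k


def fano_huffman_code(probs):
    """Fano split into two subsets, then Huffman merging inside each subset."""
    items = sorted(probs.items(), key=lambda kv: -kv[1])  # step (a)
    if len(items) < 2:
        return {symbol: "0" for symbol, _ in items}
    k = fano_split(items)  # step (b)
    code = {}
    for bit, part in (("0", items[:k]), ("1", items[k:])):
        # steps (c)-(f): Huffman merging restricted to one subset,
        # then the subset's bit is prepended to every codeword in it
        for symbol, word in huffman_code(dict(part)).items():
            code[symbol] = bit + word
    return code


if __name__ == "__main__":
    print(fano_huffman_code({"a": 0.4, "b": 0.3, "c": 0.2, "d": 0.1}))
```

Under this reading the first bit of every codeword is fixed by a single partial-sum comparison, and the Huffman merging only runs on the smaller subsets.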
French, German, Spanish and English symbols (letters) are the Latin characters, consisting of 26 letters, which are given in Table 1a. The probabilities of the French, German and Spanish symbols (letters) were established in 1939 by Fletcher Pratt (Stephens, 00; Pratt, 1939); the probabilities of the English symbols (letters) were established by Nam Phamdo (2000). They are given in Table 1b.

Table 1a. French, German, Spanish and English Symbols
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
a b c d e f g h i j k l m n o p q r s t u v w x y z
Table 1b. Probabilities of French, German, Spanish and English Symbols
Symbols English French German Spanish
0.06574 0.045 0.0734 0.034984 0.0444 0.09788 0.0586 0.04989 0.055809 0.903 0.005053 0.03349 0.00 0.05645 0.059630 0.03765 0.86 0.049756 0.05576 0.07936 0.053 0.00890 0.077 0.369 0.04598 0.784 0.988 0.0847 0.00876 0.03063 0.045 0.7564 0.00959 0.005 0.007 0.07559 0.00598 0.4 0.05783 0.0990 0.073 0.0589 0.0980 0.036 0.069 0.0803 0.07353 0.0599 0.0557 0.0 0.00350 0.6 0.7-0.06506 0.0566 0.0837 0.0544 0.6693 0.0044 0.03647 0.04064 0.078 0.9 0.0879 0.085 0.03005 0.09905 0.085 0.00944 0.55 0.06539 0.06765 0.0674 0.03703 0.0069 0.0396 0. 0.3 0.000-0.59 0.040 0.04679 0.05856 0.3676 0.00694 0.0006 0.00704 0.0649 0.00443 0.04 0.0497 0.0350 0.067 0.08684 0.0505 0.00875 0.06873 0.07980 0.0469 0.03934 0.00895 0.3 0.00 0.00895 0.0053 -

The Turkish source consists of 29 symbols (letters). The capital and small letters of Turkish are given in Table 2a. The probabilities of occurrence of the Turkish symbols (letters) are given in Table 2b (Shamilov & Yolacan, 2005; Dalkilic & Dalkilic, 00). The considered probabilities were obtained from a corpus consisting of words from a wide variety of fields, i.e. scientific articles, newspapers, poetry etc., .5 million characters in total.

Russian uses the Cyrillic alphabet, consisting of 31 symbols (letters), which are given in Table 3a. The probabilities of the Russian symbols are given in Table 3b, where − denotes the space symbol (Венцель, 1969; Yaglom & Yaglom, 1966).
Table 2a. Turkish Source
A B C Ç D E F G Ğ H I İ J K L M N O Ö P R S Ş T U Ü V Y Z
a b c ç d e f g ğ h ı i j k l m n o ö p r s ş t u ü v y z

Table 2b. Probabilities of Turkish Symbols
Letter Frequency Letter Frequency Letter Frequency
Ç Ğ 0.06 0.037 0.0084 0.00 0.0400 0.078 0.0038 0.04 0.009 0.0096
İ Ö 0.0444 0.073 0.3 0.0407 0.0530 0.030 0.0633 0.04 0.0074 0.0073
Ş Ü 0.0604 0.064 0.057 0.087 0.084 0.07 0.0087 0.095 0.030 0.39

Table 3a. Russian Symbols (Cyrillic alphabet)
А Б В Г Д Е Ж З И Й К Л М Н О
а б в г д е ж з и й к л м н о
П Р С Т У Ф Х Ц Ч Ш Щ Ъ(Ь) Ы Э Ю Я
п р с т у ф х ц ч ш щ ъ(ь) ы э ю я

Table 3b. Probabilities of Russian Symbols
Symbol  Probability    Symbol  Probability
А       0.064          Р       0.041
Б       0.015          С       0.047
В       0.039          Т       0.056
Г       0.014          У       0.021
Д       0.026          Ф       0.002
Е       0.074          Х       0.009
Ж       0.008          Ц       0.004
З       0.015          Ч       0.013
И       0.064          Ш       0.006
Й       0.010          Щ       0.003
К       0.029          Ъ(Ь)    0.015
Л       0.036          Ы       0.016
М       0.026          Э       0.003
Н       0.056          Ю       0.007
О       0.095          Я       0.019
П       0.024          −       0.145
In order to construct binary codes for English, German, Turkish, French, Russian and Spanish, the classical coding methods are applied to the considered source alphabets. The constructed binary codes are given in Tables 4-9, respectively. Moreover, the Fano-Huffman Based statistical coding method is also applied to the considered languages. The binary codes constructed by the Fano-Huffman based statistical coding method are given in Table 10.

Table 4. Binary Codes for Probability Distribution of English Symbols
[Columns: English Symbols, No (1-26), Ordinary, Ordered Symbols, Shannon, Fano, Enhanced Fano, Huffman]
Table 5. Binary Codes for Probability Distribution of German Symbols
[Columns: German Symbols, No (1-26), Ordinary, Ordered Symbols, Shannon, Fano, Enhanced Fano, Huffman]

Table 6. Binary Codes for Probability Distribution of Turkish Symbols
[Columns: Turkish Symbols, No (1-29), Ordinary, Ordered Symbols, Shannon, Fano, Enhanced Fano, Huffman]
Table 7. Binary Codes for Probability Distribution of French Symbols
[Columns: French Symbols, No (1-26), Ordinary, Ordered Symbols, Shannon, Fano, Enhanced Fano, Huffman]

Table 8. Binary Codes for Probability Distribution of Russian Symbols
[Columns: Russian Symbols, No (1-31), Ordinary, Ordered Symbols, Shannon, Fano, Enhanced Fano, Huffman]
Ordered Symbols: О Е А И Т Н С Р В Л К М Д П У Я Ы З Ъ(Ь) Б Г Ч Й Х Ж Ю Ш Ц Щ Э Ф
Table 9. Binary Codes for Probability Distribution of Spanish Symbols
[Columns: Spanish Symbols, No (1-26), Ordinary, Ordered Symbols, Shannon, Fano, Enhanced Fano, Huffman]

Table 10. Binary Codes Constructed by Fano-Huffman Based Statistical Coding Method
[Columns: Turkish symbols, Fano-Huffman based codes for Turkish symbols; Russian symbols, Fano-Huffman based codes for Russian symbols; English, French, German, Spanish symbols, Fano-Huffman based codes for English symbols, for French symbols, for German symbols, for Spanish symbols]
In order to determine the information per letter for the considered alphabets under the mentioned coding methods, the following stages are carried out:
1) The entropy $H(S)$ of each of the mentioned languages is calculated.
2) The codeword lengths of the codes shown in Tables 4-10 are obtained by counting the bits of the code words, and thus the average codeword length $L$ is computed for each coding method.
3) The information per letter $I_{\text{inf/letter}} = H(S)/L$ is obtained for the interpretation of the optimality of the codes.

The results of these stages are given in Table 11. As previously presented, the optimality criterion for codes is $I_{\text{inf/letter}} \to 1$. It is seen from Table 11 that the binary codes constructed for the symbols of each alphabet by the Fano-Huffman based statistical coding method are closer to optimal than those of the Fano coding method and are as optimal as those constructed by the Huffman coding method, while the method is more easily applicable than the Huffman coding method. Also, the improved coding method is closer to optimal than the others. Moreover, if a file is coded by Fano-Huffman based codes, then the size of the file will be less than that of the files coded by the other considered coding methods. Hence, this means faster communication.

Table 11. Information per letter set by constructed binary codes
Source: English, Turkish, French, German, Spanish, Russian
Shannon (bits) Fano (bits) Huffman (bits) Ordinary (bits) 0.845 0.873 0.797 0.890 0.803 0.8839 0.880 0.9075 0.8885 0.900 0.950 0.9085 0.9834 0.9937 0.9854 0.990 0.9909 0.995
Improved Fano (bits) 0.9839 0.9937 0.9854 0.990 0.9909 0.9936
Shannon Fano Elias (bits) .079 .0955 .09 .083 .6 .4 0.9905 0.9939 0.9899 0.995 0.994 0.9936
Fano-Huffman based (bits) 0.9888 0.9939 0.9899 0.990 0.996 0.9936
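The three stages above reduce to a few lines of arithmetic. The following sketch, with hypothetical function names and a toy four-symbol source that is not taken from the paper's tables, computes the entropy, the average codeword length and the information per letter $H(S)/L$ for a given binary code, and also evaluates the Kraft sum as a sanity check.

```python
from math import log2


def entropy(probs):
    """Stage 1: H(S) = -sum p_i log2 p_i, in bits."""
    return -sum(p * log2(p) for p in probs.values() if p > 0)


def average_length(probs, code):
    """Stage 2: L = sum p_i l_i, with l_i counted from the codewords."""
    return sum(p * len(code[symbol]) for symbol, p in probs.items())


def information_per_letter(probs, code):
    """Stage 3: H(S) / L; values close to 1 indicate a near-optimal code."""
    return entropy(probs) / average_length(probs, code)


def kraft_sum(code):
    """Sum of 2^(-l_i); at most 1 for any uniquely decodable binary code."""
    return sum(2 ** -len(word) for word in code.values())


if __name__ == "__main__":
    # Toy source (assumed for illustration only).
    probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
    prefix_code = {"a": "0", "b": "10", "c": "110", "d": "111"}
    fixed_code = {"a": "00", "b": "01", "c": "10", "d": "11"}
    print(information_per_letter(probs, prefix_code))  # variable-length code
    print(information_per_letter(probs, fixed_code))   # ordinary 2-bit code
    print(kraft_sum(prefix_code))
```

On this toy source the variable-length code reaches the criterion exactly, while the fixed-length (ordinary) code falls short of it, which mirrors the ordering of the Ordinary column against the other columns in Table 11.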
Conclusion

It is seen that the binary codes constructed by the Fano-Huffman based statistical coding method carry as much information per letter as the codes constructed by the Huffman coding method. However, with this coding method, the fewer subsets the alphabet is divided into, the closer to optimal the obtained codes are. Thus, this result makes the Fano-Huffman based statistical coding method as preferable as the Huffman coding method for each of the considered languages. The Fano-Huffman based statistical coding method takes less time than the Huffman coding method to construct binary codes. However, it requires more simple computation than the Huffman coding method, by means of dividing the source alphabet into subsets, and this means faster coding.

As is commonly known, the operating systems of computers are based on the American Standard Code for Information Interchange (ASCII), which is an ordinary binary code. Therefore, another main result of this study is the advantage of Fano-Huffman based codes over ASCII. It can be concluded from this study that ordinary codes are not optimal, because they have the highest average codeword length and the least information per letter. Hence, since ASCII codes are ordinary codes, the text coded by them will be larger in size, in contrast to Fano-Huffman based codes. So, ASCII codes are not the preferred codes.

Consequently, Fano-Huffman based codes can be used in computer systems for data compression, rather than ASCII, for faster communication: if a file is coded by Fano-Huffman based codes, then the size of the file will be less than that of the file coded by ASCII, but it will transmit the same information by using codes that consist of fewer bits.

References

Aazhang, B. (2004). http://cnx.rice.edu/content/m10176/latest/, Creative Commons.
Cover, T. M. & Thomas, J. A. (1991). Elements of information theory. NY: John Wiley & Sons, Inc.
Faller, N. (1973). An adaptive system for data compression. 7th Asilomar conference on circuits, systems, and computers, 593-597.
Gallager, R. (1978). Variations on a theme by Huffman. IEEE Transactions on Information Theory, 24(6), 668-674.
Hankerson, D., Harris, G. A. & Johnson, P. D. (2003). Introduction to information theory and data compression (2nd ed.). Boca Raton, FL: Chapman & Hall/CRC Press.
Hankerson, D., Harris, G. & Johnson, P. (1998). Introduction to information theory and data compression. CRC Press.
Huffman, D. (1952).
A method for the construction of minimum redundancy codes. Proceedings of IRE, 40(9), 1098-1101.
Kieffer, J. C., & Yang, E. (2000). Grammar-based codes: a new class of universal lossless source codes. IEEE Transactions on Information Theory, 46(3), 737-754.
Knuth, D. (1985). Dynamic Huffman coding. Journal of Algorithms, 6, 163-180.
Roman, S. (1997). Introduction to Coding and Information Theory. New York: Springer-Verlag.
Rueda, L. (2002). Advances in data compression and pattern recognition. PhD thesis, School of Computer Science, Carleton University, Ottawa, Canada.
Rueda, L. G. & Oommen, B. J. (2004). A Nearly-Optimal Fano-Based Coding Algorithm. Information Processing and Management, 40, 257-268.
Sayood, K. (2000). Introduction to data compression (2nd ed.). Morgan Kaufmann.
Венцель, Е. С. (1969). Теория Вероятностей, Москва.
Pratt, F. (1939). Secret and urgent: The story of codes and ciphers. Blue Ribbon Books.
Phamdo, N. (2000). http://dwww.epfl.ch/matra/ours /eb/compresso/eglsh.html, State University of New York.
Stephens. (00). http://www.santacruzpl.org/readyref/files/gl/ltfrqsp.shtml, Santa Cruz Public Libraries, California.
Shamilov, A. and Yolacan, S. (2005). Various binary codes for probability distribution of Turkish letters. International Conference on Ordered Statistical Data: Approximations, Bounds and Characterizations, pp.70, Izmir, Turkey.
Shannon, C. E., & Weaver, W. (1949). The mathematical theory of communication. University of Illinois Press.
Witten, I., Moffat, A. & Bell, T. (1999). Managing gigabytes: Compressing and indexing documents and images (2nd ed.). Morgan Kaufmann.
Yolacan, S. (2005). Statistical properties of different languages based on entropy and information theory. Anadolu University Graduate School of Sciences, Master of Science Thesis (in Turkish), Eskisehir.
Ziv, J. & Lempel, A. (1977). A universal algorithm for sequential data compression. IEEE Transactions on Information Theory, 23(3), 337-343.
Ziv, J. & Lempel, A. (1978). Compression of individual sequences via variable-rate coding. IEEE Transactions on Information Theory, 24(5), 530-536.