Interval Algorithm for Homophonic Coding

IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 3, MARCH 2001

Mamoru Hoshi, Member, IEEE, and Te Sun Han, Fellow, IEEE

Manuscript received August 30, 1999; revised September 7, 2000. The material in this paper was presented in part at the IEEE Information Theory and Communication Workshop, Kruger National Park, South Africa, June 20-25, 1999. The authors are with the Graduate School of Information Systems, The University of Electro-Communications, Chofu, Tokyo, Japan (e-mail: hoshi@is.uec.ac.jp; han@is.uec.ac.jp). Communicated by M. Weinberger, Associate Editor for Source Coding.

Abstract: It is shown that the idea of the successive refinement of interval partitions, which plays the key role in the interval algorithm for random number generation by Han and Hoshi, is also applicable to homophonic coding. An interval algorithm for homophonic coding is introduced which produces an independent and identically distributed (i.i.d.) sequence with probability distribution p. Lower and upper bounds for the expected codeword length are given. Based on this, an interval algorithm for fixed-to-variable homophonic coding is established. The expected codeword length per source letter converges to H(X)/H(p) in probability as the block length tends to infinity, where H(X) is the entropy rate of the source and H(p) is the entropy of the distribution p. The algorithm is asymptotically optimal. An algorithm for fixed-to-fixed homophonic coding is also established; its decoding error probability tends to zero as the block length tends to infinity. Homophonic coding with cost is considered in full generality. The expected cost of the codeword per source letter converges to c̄(p)H(X)/H(p) in probability as the block length tends to infinity, where c̄(p) denotes the average cost of a code letter. The main contribution of this paper can be regarded as a novel application of the Elias coding technique to homophonic coding. Intrinsic relations among these algorithms, the interval algorithm for random number generation, and the arithmetic code are also discussed.

Index Terms: Arithmetic coding, coding with cost, Elias coding, homophonic coding, interval algorithm, random number generation.

I. INTRODUCTION

HOMOPHONIC coding (or substitution) is a coding technique that transforms a stream of source letters with an arbitrary distribution into an invertible random stream of code symbols (a random codeword) in which all symbols have the same frequency; the aim is to guarantee as much security in source coding as possible. The key idea of homophonic coding for randomization is to introduce a suitable number of representations (codewords) for each letter of the source alphabet and to choose one of these representations at random at each step. The codewords assigned to a source letter k are called the homophones for k. Simple examples help to explain what homophonic coding is.

Example 1: Let X be a source taking values in the source alphabet {1, 2}.

Case 1 (Günther [1, p. 407]): Let X have the probabilities Pr{X = 1} = 3/4 and Pr{X = 2} = 1/4. The following mapping defines a homophonic code:

1 -> 0 with probability 2/3
1 -> 10 with probability 1/3
2 -> 11 with probability 1. (1)

The letter 1 is encoded at random into 0 or 10, with probabilities 2/3 and 1/3, respectively. The codewords 0 and 10 are called the homophones for the source letter 1. The probability of the codeword 0 (respectively, 10) being produced is 1/2 (respectively, 1/4). It is evident that this code is prefix-free and, hence, uniquely decodable. The code is represented by the tree of Fig. 1, which we call the homophonic coding tree. As is obvious from the figure, a leaf (denoted by a box) uniquely specifies a codeword (i.e., the path from the root to the leaf) and vice versa. Therefore, we also call the leaves with label k the homophones for the source letter k. The expected length of the codewords (i.e., the expected path length of the tree) is 1.5 bits, whereas the entropy of the source is about 0.81 bits.

Case 2: Let X have the probabilities Pr{X = 1} = 2/3 and Pr{X = 2} = 1/3. A homophonic code can be defined in the same spirit; its homophonic coding tree is shown in Fig. 2. In this case, the set of homophones (i.e., codewords; leaves) is a countably infinite set. The expected length of the codeword is 2 bits, whereas the entropy of the source is about 0.92 bits.

Let us now formulate homophonic coding. Let X be a random variable taking values in the source alphabet {1, ..., N} with probabilities q_k = Pr{X = k}. Let Z be a finite set called the code alphabet and W a prefix-free subset of Z*, where Z* is the set of all nonnull finite sequences of elements of Z. Let W be a random variable taking values in W. We consider the conditional distribution of W given X and define a stochastic multivalued mapping f such that

for each w in W there exists exactly one k with Pr{W = w | X = k} > 0 (2)

where w is assigned to k with probability Pr{W = w | X = k} and called a homophone for k. We refer to this mapping f as the (prefix-free) homophonic coding. The homophones (i.e., codewords) for k are those w for which Pr{W = w | X = k} > 0. It is obvious from the above condition (2) that the inverse mapping of f is unique (uniquely decodable).
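To make Example 1 concrete, the following minimal Python simulation encodes a stream with the Case 1 code and checks empirically that the code stream looks like fair coin flips. This is an illustrative sketch only: the homophone probabilities 2/3 and 1/3 are the values reconstructed above from the stated codeword probabilities, and all names are ours.

import random

# Example 1, Case 1 (a sketch): each source letter is replaced at random
# by one of its homophones.
HOMOPHONES = {1: [("0", 2/3), ("10", 1/3)], 2: [("11", 1.0)]}
SOURCE = [(1, 3/4), (2, 1/4)]        # source letters with their probabilities

def draw(pairs):
    # Draw an item from a list of (item, probability) pairs.
    r, acc = random.random(), 0.0
    for item, prob in pairs:
        acc += prob
        if r < acc:
            return item
    return pairs[-1][0]

def encode(letter):
    # Choose one homophone for the letter at random.
    return draw(HOMOPHONES[letter])

# The concatenated code stream should look like fair coin flips.
bits = "".join(encode(draw(SOURCE)) for _ in range(100_000))
print("fraction of ones:", bits.count("1") / len(bits))   # close to 0.5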

Fig. 1. Homophonic coding tree for the source X with probabilities p_1 = 3/4 and p_2 = 1/4. The code alphabet is {0, 1}.

Fig. 2. Homophonic coding tree for the source X with probabilities p_1 = 2/3 and p_2 = 1/3. The code alphabet is {0, 1}.

Note that homophonic coding is, essentially, interpreted as a channel with input set {1, ..., N} and output set W. Let W = Z_1 Z_2 ... Z_T be any variable-length random variable, where T is a positive integer-valued random variable such that the value of Z_1 ... Z_n uniquely determines whether T = n or not (i.e., T is a stopping time). The variable-length random sequence W is said to be completely p-random if each component variable Z_i is statistically independent of the preceding ones and obeys the same probability distribution p. In a word, a completely p-random sequence may be rephrased as a variable-length independent and identically distributed (i.i.d.) sequence subject to p. The homophonic coding is said to be p-perfect if the homophone W is completely p-random. When p is uniform, we call the homophone and the homophonic coding simply completely random and perfect, respectively [2]. Perfect homophonic coding enables us to build what Shannon called the strongly ideal cipher system [2], [3].

Günther [1] showed an algorithm for i.i.d. sources, called variable-length homophonic substitution, that makes the resulting M-ary homophone completely random. However, the required memory grows exponentially. Jendal et al. [2] have shown that for a binary homophonic coding which is perfect and minimizes the expected codeword length among perfect homophonic codings, the expected codeword length exceeds the entropy of the source by at most a small constant number of bits (see also da Rocha et al. [4]). Ryabko [5] suggested a perfect homophonic coding algorithm whose expected codeword length is likewise upper bounded in terms of the source entropy; its required memory size depends on the redundancy, defined as the difference between the average codeword length and the source entropy.

Homophonic coding proposed so far is required to be: 1) perfect, i.e., 1a) i.i.d. and 1b) equiprobable; 2) error-free; 3) of low redundancy; and 4) of equal symbol cost. We consider homophonic coding in more general situations, i.e., we generalize requirement 1b), 2), or 4). In this paper, we shall show that the idea of the successive refinement of interval partitions, which was used in the interval algorithm for random number generation (cf. Han and Hoshi [6]), is directly applicable also to p-perfect homophonic coding.

In Section II, we propose an interval algorithm for M-ary homophonic coding which produces an i.i.d. sequence with probability distribution p, i.e., a completely p-random sequence. The expected length E[L] of the codeword generated by the algorithm is bounded as H(X)/H(p) <= E[L] <= H(X)/H(p) + C(p), where the constant C(p) depends only on p. In particular, when p is uniform with M = 2, the produced sequence is completely random and E[L] is upper bounded by H(X) + 3 bits. In Section III, we consider block coding for general stationary (and ergodic) source processes. In Section III-A, we give an interval algorithm for fixed-to-variable p-perfect homophonic coding which has complexity linear in the block length and is asymptotically optimal. Furthermore, we show that the length of the codeword per source letter converges to H(X)/H(p) in probability as the block length tends to infinity, where H(X) is the entropy rate of the stationary ergodic source process (Theorem 4). Section III-B gives an asymptotically optimal linear-complexity interval algorithm for fixed-to-fixed p-perfect homophonic coding of stationary ergodic source processes and shows that, if the coding rate is larger than H(X)/H(p), the decoding error probability approaches zero as the block length tends to infinity. Section IV discusses p-perfect homophonic coding by the interval algorithm with cost and shows that the expected cost of the codeword per source letter converges to c̄(p)H(X)/H(p) as the block length tends to infinity, where c̄(p) denotes the average cost of a code letter. The minimization of this limit over all possible p's is attained by choosing p_j = α^{-c_j}, where α is the root of Σ_j α^{-c_j} = 1 (cf. Section IV). In this case, it is also shown that the cost of the codeword per source letter converges in probability to this optimal value

if the source is stationary ergodic. This property is used in the proof of the key theorem (Theorem 4). In Section V-A we discuss the relationship between the interval algorithms for random number generation and for homophonic coding. In Section V-B we discuss the intrinsic relationship between the arithmetic coding algorithm and the interval algorithms. In the Appendix, the proof of the convergence is given.

Finally, in closing this introduction, we would like to emphasize the following points. First of all, the key idea of the present paper, i.e., the idea underlying the interval algorithm for homophonic coding, is the successive refinement of interval partitions on the unit interval. This key idea was discovered first by Elias (cf. Jelinek [7]), and its origin may date back even to Shannon-Fano coding (cf. Shannon [8]). Fixed-to-variable source codes based on this successive refinement of interval partitions are sometimes called the Elias code. In other words, the standpoint of the Elias code is to look at source sequences as the corresponding subintervals of the unit interval. This standpoint becomes much clearer if we consider source coding with cost (cf. Savari and Gallager [9]). Since, however, the construction of the Elias code requires interval partitions with infinite precision, it was not implementable from the practical point of view, although the idea itself remained very intriguing. Subsequently, this implementation problem was solved by Rissanen [10], Pasco [11], and others by introducing floating-point (instead of fixed-point) computations of interval partitions with finite precision, at the expense of as small an increase of the coding rate as desired. Such codes have been called arithmetic codes. Since there are many ways of modifying the Elias code into practically implementable arithmetic codes by finite-precision computations, the term arithmetic code sometimes denotes a family of codes constructed on the basis of finite-precision computations of interval partitions. In a word, an arithmetic code may reasonably be regarded as a kind of modification of the Elias code, but not vice versa.

On the other hand, the interval algorithm for homophonic coding (IAHC) to be presented in this paper is constructed on the basis of the successive refinement of interval partitions with infinite precision, which is exactly along the original idea of the Elias code rather than that of the arithmetic code with finite precision. This is because the interval algorithm for homophonic coding is required to be exactly (not approximately) p-perfect. In this sense, it is not right at all to say that the interval algorithm for homophonic coding may be viewed as a modification or direct application of the arithmetic code: the consequence of any direct application of the arithmetic code is far from p-perfect homophonic coding. It should be noted also that an auxiliary uniform random number (cf. Section II), the use of which implies an intrinsic partition of a symbol's coding space among its homophones in the Elias coding procedure, intervenes and plays a crucial role in the interval algorithm for homophonic coding but plays no role at all in the arithmetic code. Nevertheless, the deep conceptual connection of the interval algorithm for homophonic coding, as well as of the interval algorithm for random number generation (cf. Han and Hoshi [6]), to the arithmetic code should not be inappropriately downplayed. The main contribution of this paper can be regarded as a novel application of the Elias coding technique to homophonic coding, where it replaces the usual Huffman coding technique. It is rather interesting also to observe that there is indeed an intrinsic common structure underlying these algorithms, the Elias code, and arithmetic codes, although they of course differ at the detailed technical level.

II. INTERVAL ALGORITHM FOR HOMOPHONIC CODING

In the previous paper [6], we established an algorithm for random number generation. The algorithm is based on the successive refinement of interval partitions and is called the interval algorithm. In the present paper, we propose the interval algorithm for p-perfect homophonic coding. Suppose that we are given a source X taking values in the source alphabet {1, ..., N} with probabilities q_k = Pr{X = k}, k = 1, ..., N. Hereafter, we call q = (q_1, ..., q_N) the source distribution. Let Z = {1, ..., M} be the code alphabet.

First, let us review the basic notation of the interval algorithm. Partition the unit interval [0, 1) into M subintervals I(1), ..., I(M) according to a probability distribution p = (p_1, ..., p_M); that is,

I(j) = [P_{j-1}, P_j), P_0 = 0, P_j = p_1 + ... + p_j (j = 1, ..., M). (3)

Hereafter, we sometimes call this distribution p the partition ratio. It is obvious that I(1), ..., I(M) are mutually disjoint and that their union is [0, 1). Any interval I = [a, b) may be partitioned into M subintervals I(1), ..., I(M) according to p as follows:

I(j) = [a + (b - a)P_{j-1}, a + (b - a)P_j) (j = 1, ..., M). (4)

With this notation, the subinterval I(s) for any sequence s = s_1 ... s_n over Z is defined in the recursive manner

I(s_1 ... s_n j) = (I(s_1 ... s_n))(j), I(λ) = [0, 1) (5)

for the null string λ. It is then obvious that the size of the interval I(s) is given by

|I(s_1 ... s_n)| = p_{s_1} p_{s_2} ... p_{s_n}. (6)

Similarly to the above, we partition the unit interval [0, 1) into N subintervals J(1), ..., J(N) according to the source distribution q; that is,

J(k) = [Q_{k-1}, Q_k) (k = 1, ..., N) (7)

where

Q_0 = 0, Q_k = q_1 + ... + q_k (k = 1, ..., N). (8)
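The partition operations (3)-(8) are mechanical, and the following small Python sketch implements them (the function names are ours, not the paper's); the later sketches in this paper reuse these helpers.

from itertools import accumulate

def partition(interval, ratio):
    # Split [a, b) into len(ratio) subintervals proportional to ratio, as in (4).
    a, b = interval
    cuts = [a] + [a + (b - a) * c for c in accumulate(ratio)]
    return [(cuts[i], cuts[i + 1]) for i in range(len(ratio))]

def I(s, p):
    # Interval I(s) of a code string s (a tuple of letters 1..M), as in (5).
    interval = (0.0, 1.0)               # I(null string) = [0, 1)
    for j in s:
        interval = partition(interval, p)[j - 1]
    return interval

def J(k, q):
    # Source interval J(k) = [Q_{k-1}, Q_k), as in (7) and (8).
    return partition((0.0, 1.0), q)[k - 1]

# |I(s)| = p_{s_1} ... p_{s_n}, cf. (6); e.g., with p = (1/2, 1/2):
print(I((1, 2, 1), (0.5, 0.5)))         # (0.25, 0.375), of size 1/8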

Fig. 3. Homophone 121 for X = 1 obtained by the algorithm IAHC. (The source distribution is q = (5/12, 1/3, 1/4), the code alphabet is Z = {1, 2}, and the partition ratio is p = (1/2, 1/2).)

The interval algorithm for homophonic coding (denoted by IAHC) for the source X is now described as follows.

Interval Algorithm for Homophonic Coding (IAHC)
0) Read a source output X = k in {1, ..., N}; compute the interval J(k) and choose a random number r uniformly distributed on J(k);
1) Set s := λ (null string), I(s) := [0, 1);
2) If I(s) is included in J(k), then output the string s as a codeword for k and terminate the algorithm;
3) Partition I(s) into M subintervals I(sj) (j = 1, ..., M); find the a such that r is in I(sa); set s := sa, and go to step 2).

This algorithm is summarized as follows: given a source output k, take a uniform random number r on the interval J(k), and find the largest interval I(s) (i.e., the shortest string s) that contains r and is contained in J(k). It is easy to check that the set of codewords generated in this manner forms a prefix-free code.

Example 2 (Günther [1, p. 409]): Consider the case with q = (5/12, 1/3, 1/4), Z = {1, 2}, and p = (1/2, 1/2). Fig. 3 illustrates homophonic coding by the interval algorithm. Suppose that the source outputs the symbol 1. We compute the interval J(1) = [0, 5/12) and get a uniform random number r on it, say r = 0.3. Since I(λ) = [0, 1) is not included in J(1), it is partitioned with ratio p to give the intervals I(1) = [0, 1/2) and I(2) = [1/2, 1). Since I(1) contains r, set s := 1. However, I(1) is not included in J(1), and so it is further partitioned to give the intervals I(11) = [0, 1/4) and I(12) = [1/4, 1/2). The interval I(12) contains r but is not included in J(1); therefore, set s := 12. I(12) is further partitioned to give the intervals I(121) = [1/4, 3/8) and I(122) = [3/8, 1/2). Since I(121) contains r and is included in J(1), the algorithm outputs 121 as a codeword for the source symbol 1 and terminates. In this manner, depending on the value of the uniform random number r, we obtain one of the codewords 11, 121, ... for the symbol 1; one of the codewords 21, 1222, ... for the symbol 2; and the only codeword 22 for the symbol 3.

Fig. 4. Homophonic coding tree generated by the algorithm IAHC for the source with distribution q = (5/12, 1/3, 1/4), code alphabet Z = {1, 2}, and partition ratio p = (1/2, 1/2).

The homophonic coding tree of this code is shown in Fig. 4. Noting that the probability that the algorithm reaches a given node at level l is 2^{-l}, we can see that the code produced is perfect. The expected length of the codeword is 2.5, whereas the entropy of the source is about 1.55 bits. When we use a nonuniform partition ratio p instead of the uniform one, we obtain a p-perfect homophonic code in the same way as above. The p-perfect homophonic code with such a nonuniform p is closely related to the problem of homophonic coding with cost (cf. Section IV).
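A direct Python transcription of steps 0)-3) of IAHC, reusing the helpers partition and J sketched above, is given next; the choice r = 0.3 reproduces the homophone 121 of Fig. 3. This is an illustrative sketch, not the authors' implementation.

import random

def contains(outer, inner):
    # Whether the half-open interval `inner` is included in `outer`.
    return outer[0] <= inner[0] and inner[1] <= outer[1]

def iahc(k, q, p, r=None):
    # Encode the source letter k (1-based) under source distribution q and
    # partition ratio p; r is the auxiliary uniform random number on J(k).
    a, b = J(k, q)                          # step 0)
    if r is None:
        r = random.uniform(a, b)
    s, interval = (), (0.0, 1.0)            # step 1)
    while not contains((a, b), interval):   # step 2)
        children = partition(interval, p)   # step 3)
        j = next(i for i, c in enumerate(children) if c[0] <= r < c[1])
        s, interval = s + (j + 1,), children[j]
    return "".join(map(str, s))

# Example 2: r = 0.3 in J(1) = [0, 5/12) yields the homophone 121 (Fig. 3).
print(iahc(1, (5/12, 1/3, 1/4), (1/2, 1/2), r=0.3))   # -> 121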

Just as in the case of the interval algorithm for random number generation, we can equivalently describe the above algorithm IAHC in terms of an M-ary tree (possibly of infinite size) as follows:
1) every internal node has M children; the branches connecting the node to its children are labeled 1, ..., M in order from left to right;
2) each node is associated with an interval as follows:
a) the interval [0, 1) is associated with the root;
b) the interval I(s) is associated in a one-to-one manner with the node whose path from the root spells out the sequence of branch labels s; thus, we can use the associated sequence s to denote the node;
3) a node s is a leaf labeled k if I(s) is included in J(k) while the interval of its parent is not (the same label may be assigned to several leaves).

We call this tree the homophonic coding tree corresponding to the interval algorithm IAHC. The tree thus defined works in the following way. Starting from the root, we traverse the tree according to the source output k and the uniform random number r on J(k): we go down from a node to the child node whose interval contains r. When we reach a leaf s with label k, we output s as a codeword (homophone) for k and terminate the algorithm. The leaves s (or, equivalently, the sequences s) with label k are the homophones for k. Hereafter, we use the node, the associated interval, and the sequence interchangeably.

Let E[L] be the expected length of the codeword generated by the homophonic coding tree. Note that E[L] is equal to the expected level of the leaves, where the level of the root is zero by convention. Since the algorithm traverses the homophonic coding tree according to the source output and the uniform random number, the resulting variable-length output sequence (i.e., the codeword) of the algorithm is also a random sequence W = Z_1 Z_2 ... Z_T, where T is the random variable denoting the length of the sequence. The key property of this sequence is that it is actually a variable-length i.i.d. sequence, which is stated as follows.

Theorem 1: When the algorithm IAHC with partition ratio p is applied to the output of the source X with probability distribution q, the resulting M-ary sequence Z_1 Z_2 ... Z_T is a variable-length i.i.d. sequence subject to the probability distribution p with stopping time T; that is, it is completely p-random. Thus, the homophonic coding by IAHC is p-perfect.

Proof: Suppose that the source output X = k took place with probability q_k. Let all the codewords (i.e., all the homophones) for k be listed as s^(1), s^(2), .... Then it is obvious from the way of interval partition in the algorithm IAHC that

J(k) = I(s^(1)) ∪ I(s^(2)) ∪ ... (disjoint union).

Since |J(k)| = q_k and the random number r is uniformly distributed on J(k), we see that

Pr{W = s^(i) | X = k} = |I(s^(i))| / q_k.

On the other hand, from (6) it follows that |I(s^(i))| = p(s^(i)), where we have put p(s_1 ... s_n) = p_{s_1} ... p_{s_n}. Hence, since each homophone belongs to exactly one source letter,

Pr{W = s^(i)} = q_k |I(s^(i))| / q_k = p(s^(i))

which implies that Z_1 Z_2 ... Z_T is a variable-length i.i.d. sequence subject to the probability distribution p with stopping time T.

Corollary 1: Let p be uniform; then the homophonic coding by IAHC is perfect, i.e., the resulting M-ary sequence is completely random.

Now that the performance of the homophonic coding tree is identical with that of the generating tree for random number generation, the following theorem on homophonic coding is a simple restatement of the theorem already established for the interval algorithm for random number generation (cf. [6, Theorem 3]).

Theorem 2: For any q and any p, the expected length E[L] of the codeword generated by the interval algorithm IAHC is bounded as

H(q)/H(p) <= E[L] <= H(q)/H(p) + log(2(M - 1))/H(p) + h(p_max)/((1 - p_max)H(p)) (9)

where the lower bound must be satisfied by any tree algorithm for homophonic coding, h(·) is the binary entropy function, and p_max = max_j p_j.

Corollary 2: For uniform p and any q, the expected length E[L] of the codeword generated by the interval algorithm IAHC is bounded as

H(q)/log M <= E[L] <= H(q)/log M + log(2(M - 1))/log M + h(1/M) M/((M - 1) log M). (10)

In particular, if M = 2, the bound (10) reduces to

H(q) <= E[L] <= H(q) + 3 (11)

where the logarithm of the entropy is to the base 2. The bound (11) is the same as that given by Ryabko [5] for this particular case.
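As an illustrative check of Theorems 1 and 2, the following Monte Carlo sketch (reusing iahc and draw from the earlier snippets) verifies on Example 2 that the code stream is empirically completely random and that the average codeword length (exactly 2.5 here, as computed above) respects the M = 2 bound.

import math

q, p = (5/12, 1/3, 1/4), (1/2, 1/2)
H_q = -sum(x * math.log2(x) for x in q)        # H(q), about 1.555 bits

codewords = [iahc(draw(list(zip((1, 2, 3), q))), q, p) for _ in range(50_000)]
stream = "".join(codewords)
avg = sum(map(len, codewords)) / len(codewords)

print("fraction of 2s:", stream.count("2") / len(stream))   # close to 0.5
print("average length:", avg)                  # close to the exact value 2.5
print("bound (11):", H_q, "<= E[L] <=", H_q + 3)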

III. INTERVAL ALGORITHM FOR BLOCK HOMOPHONIC CODING

In this section, let us consider the situation in which we want to encode a source block of length n, instead of a one-shot source. Let X = X_1 X_2 ... be any source process with component variables X_i taking values in the source alphabet {1, ..., N}, and put X^n = X_1 ... X_n. Then we can apply the interval algorithm IAHC of Section II with the distribution of X^n in place of q. In order to do this, we first need to successively partition the unit interval [0, 1) according to the source distribution. Let J(λ) = [0, 1) for the null string λ. For any sequence x_1 ... x_n, the subinterval J(x_1 ... x_n) of J(x_1 ... x_{n-1}) is defined in the recursive manner

J(x_1 ... x_{n-1} k) = (J(x_1 ... x_{n-1}))(k) (k = 1, ..., N) (12)

where J(x_1 ... x_{n-1}) is partitioned according to the conditional distribution

Pr{X_n = k | X_1 = x_1, ..., X_{n-1} = x_{n-1}} (k = 1, ..., N). (13)

It is here easy to see that

|J(x_1 ... x_n)| = Pr{X_1 = x_1, ..., X_n = x_n} (16)

for any sequence x_1 ... x_n, and also that J(x_1 ... x_n) and J(y_1 ... y_n) are disjoint for x_1 ... x_n ≠ y_1 ... y_n. In exactly the same manner as for I(s), we associate a string s over the alphabet Z with the interval I(s).

A. Fixed-to-Variable Homophonic Coding

An interval algorithm for fixed-to-variable homophonic coding is obtained by slightly modifying the interval algorithm IAHC, simply by replacing the variable k in steps 0) and 2) of IAHC by x_1 ... x_n.

Interval Algorithm for Fixed-to-Variable Homophonic Coding (v-IAHC)
0) Read a source sequence x_1 ... x_n; compute the interval J(x_1 ... x_n) and choose a random number r uniformly distributed on J(x_1 ... x_n);
1) Set s := λ (null string), I(s) := [0, 1);
2) If I(s) is included in J(x_1 ... x_n), then output the string s as a codeword for x_1 ... x_n and terminate the algorithm;
3) Partition I(s) into M subintervals I(sj) (j = 1, ..., M); find the a such that r is in I(sa); set s := sa, and go to step 2).

This algorithm is called the interval algorithm for fixed-to-variable homophonic coding (denoted by v-IAHC). It is easy to see that this algorithm has complexity linear in the block length n. Since the source block X^n can be regarded as a one-shot source, it is evident from Theorem 2 that the algorithm thus defined satisfies the bound

H(X^n)/H(p) <= E[L_n(X^n)] <= H(X^n)/H(p) + C(p) (17)

where L_n(x^n) denotes the length of the codeword for x^n, E denotes the expectation, and C(p) is the constant of Theorem 2. Therefore, we have the following theorem.

Theorem 3: Let the source X be stationary. Then the sequence obtained by applying the algorithm v-IAHC to the output X^n of the source is completely p-random and, hence, the homophonic coding by v-IAHC is p-perfect. Furthermore, the algorithm v-IAHC asymptotically minimizes the expected length of the codeword per source letter in the sense that

lim_{n→∞} (1/n) E[L_n(X^n)] = H(X)/H(p) (18)

where

H(X) = lim_{n→∞} (1/n) H(X^n) (19)

is the entropy rate of the source. This means that the interval algorithm v-IAHC is asymptotically optimal with respect to the average code length per source letter (cf. [6]). In particular, when p is uniform, i.e., p = (1/M, ..., 1/M), (18) becomes

lim_{n→∞} (1/n) E[L_n(X^n)] = H(X)/log M. (20)

Thus, the algorithm v-IAHC with uniform p is not only perfect but also asymptotically optimal in terms of the average codeword length per source letter.

We are now in a position to state the following key theorem, telling that the normalized codeword length is actually asymptotically constant.

Theorem 4: Let the source X be stationary ergodic and let L_n(X^n) denote the length of the codeword for X^n generated by v-IAHC with partition ratio p. Then the length per source letter converges in probability as n → ∞:

(1/n) L_n(X^n) → H(X)/H(p) in prob. (21)

The proof of this theorem is given later in the Appendix because, for this proof, we need Theorem 8 to be stated in the next section.
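A Python sketch of v-IAHC for the special case of a memoryless source follows, where the recursion (12) amounts to refining [0, 1) by the same letter distribution q at every step (the general stationary case would use the conditional probabilities (13) instead); partition and contains are the helpers sketched earlier.

import random

def J_block(xs, q):
    # J(x_1 ... x_n) for a memoryless source: refine [0, 1) by q at each step.
    interval = (0.0, 1.0)
    for x in xs:
        interval = partition(interval, q)[x - 1]
    return interval

def v_iahc(xs, q, p):
    a, b = J_block(xs, q)                   # step 0)
    r = random.uniform(a, b)
    s, interval = (), (0.0, 1.0)            # step 1)
    while not contains((a, b), interval):   # step 2)
        children = partition(interval, p)   # step 3)
        j = next(i for i, c in enumerate(children) if c[0] <= r < c[1])
        s, interval = s + (j + 1,), children[j]
    return "".join(map(str, s))

# Floating point limits this sketch to short blocks; the paper's algorithm
# works with exact, infinite-precision interval arithmetic.
print(v_iahc((1, 2, 2, 3), (5/12, 1/3, 1/4), (1/2, 1/2)))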

B. Fixed-to-Fixed Homophonic Coding

So far we have discussed algorithms with a random stopping time that terminate with probability one. However, these algorithms may not terminate within a prescribed finite time. To ensure finite-time termination, one way is to restrict the number of interval partitions (i.e., the codeword length) to a prescribed fixed integer L_max(n) = ceil(nR), with a positive number R called the rate, at the cost of possible decoding error. This is tantamount to considering fixed-to-fixed coding. It is easy to modify the algorithm v-IAHC for this purpose. To do so, we simply replace step 2) of v-IAHC by: if |s| < L_max(n), then go to step 3); otherwise, output the string s as a codeword for x_1 ... x_n and terminate the algorithm. We call such an algorithm the interval algorithm for fixed-to-fixed homophonic coding (denoted by f-IAHC), which is described as follows.

Interval Algorithm for Fixed-to-Fixed Homophonic Coding (f-IAHC)
0) Read a source sequence x_1 ... x_n; compute the interval J(x_1 ... x_n) and choose a random number r uniformly distributed on J(x_1 ... x_n);
1) Set s := λ (null string), I(s) := [0, 1);
2) If |s| < L_max(n), then go to step 3); otherwise, output the string s for x_1 ... x_n and terminate the algorithm;
3) Partition I(s) into M subintervals I(sj) (j = 1, ..., M); find the a such that r is in I(sa); set s := sa, and go to step 2).

Fig. 5. Homophone 121 for X = 1 obtained by the algorithm f-IAHC. (The source distribution is q = (5/12, 1/3, 1/4), the code alphabet is Z = {1, 2}, and the partition ratio is p = (1/2, 1/2).)

Fig. 6. Homophonic coding tree generated by the algorithm f-IAHC for the source with distribution q = (5/12, 1/3, 1/4), code alphabet Z = {1, 2}, and partition ratio p = (1/2, 1/2).

Like the algorithm v-IAHC, the complexity of the algorithm f-IAHC is linear in the block length n. However, with the code generated by the algorithm f-IAHC, decoding error is in general inevitable (i.e., the code is not uniquely decodable), in contrast to the case of the algorithms IAHC and v-IAHC.

Example 3: Consider Example 2 again (q = (5/12, 1/3, 1/4), Z = {1, 2}, p = (1/2, 1/2)). Fig. 5 illustrates the coding by f-IAHC with L_max = 3, and Fig. 6 shows the corresponding homophonic coding tree (see also the tree for v-IAHC in Fig. 4). In this example, the codewords for 1 are 111, 112, 121, and 122; the codewords for 2 are 211, 212, and 122; the codewords for 3 are 221 and 222. These codewords are uniquely decodable except for the codeword 122.

Let ε_n be the decoding error probability of f-IAHC with L_max(n) = ceil(nR). Let s be a codeword generated by f-IAHC and T_v the homophonic coding tree of v-IAHC. If a prefix s' (|s'| <= L_max(n)) of the codeword s corresponds to a leaf of the tree T_v, then s can be uniquely decoded; in this case, the prefix s' is sufficient for correct decoding. On the other hand, if s corresponds to an internal node of T_v, then a decoding error occurs. Therefore, the decoding error probability ε_n is equal to the probability that the algorithm v-IAHC terminates at a leaf at level larger than L_max(n) in the tree T_v. This observation together with Theorem 4 leads to the following theorem.

Theorem 5: Let the source X be stationary ergodic. Then the homophonic coding by f-IAHC with partition ratio p is p-perfect. Furthermore, with a fixed rate R, the decoding error probability ε_n of the homophonic coding by f-IAHC satisfies

ε_n → 0 if R > H(X)/H(p)
ε_n → 1 if R < H(X)/H(p). (22)

An immediate consequence of Theorem 5 is as follows.

Corollary 3: Let the source X be stationary ergodic. Then the homophonic coding by f-IAHC with uniform partition ratio p is perfect. Furthermore, with a fixed rate R, the decoding error probability ε_n of f-IAHC satisfies

ε_n → 0 if R > H(X)/log M
ε_n → 1 if R < H(X)/log M. (23)

Since the homophonic coding by f-IAHC with uniform p is perfect and all the codewords are of the same length, we can use f-IAHC itself also as what Shannon called the strongly ideal cipher [3].
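The modification from v-IAHC to f-IAHC is a one-line change of the stopping rule, as the following sketch (reusing J_block and partition from above) shows; every codeword now has exactly L_max symbols.

import random

def f_iahc(xs, q, p, L_max):
    # Same traversal as v-IAHC, but stop after exactly L_max code symbols.
    a, b = J_block(xs, q)
    r = random.uniform(a, b)
    s, interval = (), (0.0, 1.0)
    while len(s) < L_max:                   # the modified step 2)
        children = partition(interval, p)
        j = next(i for i, c in enumerate(children) if c[0] <= r < c[1])
        s, interval = s + (j + 1,), children[j]
    return "".join(map(str, s))

# Example 3: with L_max = 3 every codeword has exactly three symbols, and
# the codeword 122 is ambiguous between the source symbols 1 and 2.
print(f_iahc((1,), (5/12, 1/3, 1/4), (1/2, 1/2), L_max=3))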

IV. HOMOPHONIC CODING WITH COST

Suppose that each code symbol j in Z is assigned a cost c_j > 0, and let c = (c_1, ..., c_M). The problem of source coding with such a general additive unequal cost was first studied by Krause [12] and subsequently by Csiszár [13], among others. The paper by Savari and Gallager [9] is one of the few papers treating the problem of arithmetic coding with unequal costs. As for some different but related properties of coding with cost, see, e.g., Csiszár et al. [14].

In this section, we first study the problem of homophonic coding with unequal cost. The cost of a codeword w = z_1 ... z_l is defined by c(w) = c_{z_1} + ... + c_{z_l}. The expected cost of the codeword for a one-shot source X that is generated by a homophonic coding tree T is defined by

E[c(W)] = Σ_w P(w) c(w) (24)

where p is the partition ratio and P(w) is the probability that the codeword w is generated by the tree T. We want here to minimize the expected cost by varying the partition ratio p. This is carried out as follows. The weight of a node of T is defined as the sum of the probabilities of the leaves reachable from the node. Then we have the following relations (cf. Ahlswede and Wegener [15]): the expected length E[L] equals the sum of the weights of the internal nodes of T and, since by Theorem 1 the symbol emitted at each internal node is subject to p, each internal node contributes its weight times the average symbol cost to the expected cost. These relations immediately yield the key formula (e.g., cf. Ahlswede and Wegener [15])

E[c(W)] = c̄(p) E[L] (28)

where we have put c̄(p) = Σ_j p_j c_j.

Let us now consider homophonic coding with cost when the one-shot source X subject to the distribution q is replaced by the source block X^n. Denote by L_n(X^n) and c_n(X^n) the length and the cost of the codeword for X^n, respectively; then, in this case, (28) is written as

E[c_n(X^n)] = c̄(p) E[L_n(X^n)]. (30)

In p-perfect homophonic coding with cost, we want to make the average cost [but not the average length] of codewords as small as possible under the requirement that the output code sequence is completely p-random. Now, (30) together with Theorem 3 immediately yields (cf. Hoshi and Han [16, Theorem 1]) the following.

Theorem 6: Let the source X be stationary. Then, for any partition ratio p and cost function c, the expected cost of the codeword per source letter generated by the algorithm v-IAHC satisfies

lim_{n→∞} (1/n) E[c_n(X^n)] = c̄(p) H(X)/H(p). (31)

Here the ratio H(p)/c̄(p) can be interpreted as the local information gain per unit cost. It is straightforward to check that H(p)/c̄(p) attains its maximum over all possible p's when p_j = α^{-c_j} (j = 1, ..., M), where α is the root of the equation Σ_j α^{-c_j} = 1, so that we have the following theorem (cf. [16, Theorem 2]).

Theorem 7: Consider the same source as in Theorem 6. For any p and c, we have, under the algorithm v-IAHC, the inequality

lim_{n→∞} (1/n) E[c_n(X^n)] >= H_α(X) (32)

where the logarithm of the entropy H_α(X) is to the base α, and the equality holds when and only when p_j = α^{-c_j} (j = 1, ..., M).

Theorems 6 and 7 mean that the algorithm v-IAHC is asymptotically optimal (from the viewpoint of cost) for a given cost c when p_j = α^{-c_j}. In this connection, we notice that the uniform partition ratio is the best one only when all the costs c_j are equal.

Remark 1: The quantity H_α(X) also appears in the problem of minimizing the average cost for a stationary source using an optimal prefix code with unequal symbol costs (see [17], etc.).
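The optimizing partition ratio of Theorem 7 is easy to compute numerically: α is the unique root greater than 1 of Σ_j α^{-c_j} = 1, and a short bisection sketch (function names are ours) suffices. For equal costs it recovers the uniform ratio; for costs (1, 2) it yields the golden ratio α ≈ 1.618.

def optimal_ratio(costs, tol=1e-12):
    # Find alpha > 1 with sum_j alpha^(-c_j) = 1 by bisection, and return
    # the cost-optimal partition ratio p_j = alpha^(-c_j) of Theorem 7.
    f = lambda a: sum(a ** (-c) for c in costs) - 1.0   # decreasing in a
    lo, hi = 1.0 + 1e-9, 2.0
    while f(hi) > 0:                  # enlarge hi until the root is bracketed
        hi *= 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if f(mid) > 0 else (lo, mid)
    alpha = (lo + hi) / 2
    return alpha, [alpha ** (-c) for c in costs]

print(optimal_ratio([1, 1]))   # alpha = 2, p = (1/2, 1/2): the uniform ratio
print(optimal_ratio([1, 2]))   # alpha = 1.618..., p = (0.618..., 0.381...)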

A variant of Theorem 6 is the following theorem on convergence in probability, which is used to prove Theorem 4 in Section III-A.

Theorem 8: Let the source X be stationary ergodic, and let the cost function c be given. Let c_n(X^n) denote the cost of the codeword for X^n generated by the algorithm v-IAHC with partition ratio p. Obviously, this homophonic coding is p-perfect. Then the cost of the codeword per source letter satisfies, as n → ∞,

(1/n) c_n(X^n) → c̄(p) H(X)/H(p) in prob. (33)

Proof: Fix n and consider any codeword w = w_1 ... w_l. Then, by Theorem 1, the probability of w is given by

P(w) = p(w) = p_{w_1} ... p_{w_l}. (34)

Since w is a codeword, w must be a leaf. Hence the interval I(w) is included in some J(x^n), which is uniquely determined by w. Therefore, in view of |I(w)| = p(w) and |J(x^n)| = Pr{X^n = x^n}, we have

p(w) <= Pr{X^n = x^n}. (36)

Since X is stationary ergodic, we have the almost sure convergence -(1/n) log Pr{X^n} → H(X) (cf. Barron [18]). This together with (36) implies that, for any δ > 0,

Pr{-(1/n) log p(W) >= H(X) - δ} → 1 as n → ∞. (39)

On the other hand, the codeword W is an i.i.d. sequence subject to p, so the law of large numbers relates its length, its cost, and -log p(W): a simple calculation bounds the cost c_n(X^n) from below in terms of -log p(W) and hence, by (39), the cost per source letter is asymptotically lower bounded by c̄(p)(H(X) - δ)/H(p) in probability. By virtue of Theorems 6 and 7, the matching upper bound holds, so that, δ > 0 being arbitrary, the normalized cost converges in probability to c̄(p)H(X)/H(p). This completes the proof.

V. CONCLUDING REMARKS

A. Relation to Random Number Generation

So far, we have shown that the idea of the successive refinement of interval partitions, which plays the key role in the interval algorithm for random number generation (Han and Hoshi [6]), is directly applicable also to homophonic coding. Thus, as may have already been understood, p-perfect homophonic coding with source distribution q is identical with generating a random number with target distribution q by repeatedly and independently tossing an M-coin with probability p. Therefore, the length (cost) of the generated codeword exactly coincides with the number (cost) of coin tosses needed to generate a random number. To help the reader's understanding of the intrinsic relationship between the interval algorithm for homophonic coding and that for random number generation, let us review from [6] the interval algorithm for random number generation.

Interval Algorithm for Random Number Generation
1) Set s := λ (null string), I(s) := [0, 1);
2) If I(s) is included in J(k) for some k in {1, ..., N}, then set X = k and terminate the algorithm;
3) Toss the M-coin with distribution p to obtain a value a in {1, ..., M}; set s := sa, and go to step 2).

The difference between the algorithms for homophonic coding and random number generation lies only in the ways of using the same tree. As was stated, in homophonic coding the input to the tree is an output from the source with distribution q: we traverse the tree with partition ratio p according to the source output k and the uniform random number r on J(k) to obtain a codeword for k. On the contrary, in random number generation, the input to the tree is an output sequence of independent M-coin tosses subject to the distribution p: we traverse the tree according to this sequence to reach a leaf and get a random number with distribution q. Actually, on the basis of this observation, we have thus far derived Theorem 2 (expected code length), Theorem 3 (expected code length per source letter), Theorem 6 (expected cost per source letter), and Theorem 7 (lower bound on the expected cost per source letter) in correspondence with the theorems on the performance of the interval algorithm for random number generation ([6, Theorem 3, eq. (52)], [16, Theorems 1, 2]). Conversely, we can easily see that Theorems 4 and 8 (the convergence in probability of the codeword length and of the cost per source letter) are valid also for evaluating the performance of iterative random number generation, where L_n and c_n are then understood to denote the number of coin tosses and the cost required to generate a random sequence of length n, respectively.

As an illustrative case, let us consider the counterpart of Theorem 5, i.e., the situation where we want to produce a random sequence of length n by tossing the M-coin ceil(nR) times (i.e., fixed-to-fixed random number generation). In this case, we cannot always produce a random sequence of length n subject exactly to the prescribed target distribution. In the fixed-to-fixed algorithm for random number generation by the M-coin with probability p, we output a special sequence, say 11...1, when we still stay at an internal node of the generating tree after the final coin toss. This causes a difference between the target distribution and the generated distribution. It is straightforward to check by using Theorem 5 that the variational distance d_n (48) between the target distribution and the generated distribution satisfies

d_n → 0 if R > H(X)/H(p)
d_n → 2 if R < H(X)/H(p) (49)

where X is any stationary ergodic source. It should be noted here that the lower equality of (49) remains valid even if we invoke whatever random number generation algorithms, not necessarily the interval algorithm used here (cf. Han [19, Lemma 2.2]).
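For comparison with IAHC, here is a Python sketch of the interval algorithm for random number generation reviewed above (reusing the helpers contains, J, partition, and draw from the earlier snippets); the tree is the same, and only the driving randomness differs.

import random

def interval_rng(q, p):
    # Generate one sample with target distribution q by tossing the p-coin.
    interval = (0.0, 1.0)                   # step 1)
    while True:
        for k in range(1, len(q) + 1):      # step 2)
            if contains(J(k, q), interval):
                return k
        toss = draw(list(zip(range(1, len(p) + 1), p)))   # step 3)
        interval = partition(interval, p)[toss - 1]

samples = [interval_rng((5/12, 1/3, 1/4), (1/2, 1/2)) for _ in range(60_000)]
print([samples.count(k) / len(samples) for k in (1, 2, 3)])  # near (5/12, 1/3, 1/4)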

B. Relation to Arithmetic Coding

Now, we should point out that there exists a remarkable conceptual analogy between the interval algorithm and the arithmetic code. The encoding process of the algorithm IAHC corresponds to the encoding process of the arithmetic coding algorithm (cf. [20], [9]), just as the process of generating a random number by the interval algorithm in [6] corresponds to the decoding process of the arithmetic coding algorithm. In the case of homophonic coding, as was described in Section II, we encode each source symbol k into one of several codewords over the code alphabet Z. The codewords for k are, depending on the uniform random number r, defined as the sequences s of the smallest length such that the interval I(s) is entirely contained, with infinite precision, in the interval J(k). In contrast, in the case of arithmetic coding, we assign only one (not several) codeword to each source symbol; the codeword for k is defined as the sequence s of the smallest length for which the interval I(s) is approximately contained, with finite precision, in the interval J(k).

The idea of the successive refinement of interval partitions was discovered first by Elias (cf. [7]) and then applied, with due modifications, by many people, for example, in the fields of arithmetic coding, random number generation, homophonic coding, and so on. It is rather surprising that the intrinsic relation among these applications remained not quite clear until recently. This would be mainly because arithmetic coding, random number generation, and homophonic coding have been studied separately.

APPENDIX
PROOF OF THEOREM 4

Let the v-IAHC with partition ratio p be as given in Theorem 4. Note that p is given here and, in accordance with this, we introduce the cost function c with c_j = -log p_j (j = 1, ..., M), so that, in terms of Section IV, clearly c̄(p) = H(p) and the cost of a codeword w is c(w) = -log p(w). In this appendix, the logarithm is to the base 2.

Let B_n be the set of all codewords w such that

|(1/n) c(w) - H(X)| <= γ (50)

where γ > 0 is an arbitrarily small constant (note that c̄(p)H(X)/H(p) = H(X) for this cost function). Then, from Theorem 8, under the algorithm v-IAHC with partition ratio p we have

Pr{W in B_n} → 1 as n → ∞. (51)

Step 1) Consider the algorithm f-IAHC with L_max(n) = ceil(nR) such that

R H(p) > H(X). (52)

Since the codeword Z_1 Z_2 ... is an i.i.d. sequence subject to p, the cost sequence c(Z_1), c(Z_2), ... is also an i.i.d. sequence. Therefore, by virtue of Chebyshev's inequality, we have the following.

Lemma 1: For any δ > 0,

Pr{|(1/L_max(n)) c(Z_1 ... Z_{L_max(n)}) - H(p)| > δ} → 0 as n → ∞. (54)

Let D_n be the set of all sequences s of length L_max(n) such that

|(1/|s|) c(s) - H(p)| <= δ (55)

where |s| denotes the length of s. Consider a codeword w of v-IAHC such that |w| > L_max(n) and the L_max(n)-prefix of w belongs to D_n. Then c(w) >= L_max(n)(H(p) - δ) >= nR(H(p) - δ), which, for δ and γ chosen small enough, exceeds n(H(X) + γ) by (52); this contradicts (50), so such a w cannot belong to B_n. This together with (51) and Lemma 1 yields

Pr{L_n(X^n) > ceil(nR)} → 0 as n → ∞ (59)

where L_n(X^n) denotes the length of the codeword for X^n generated by v-IAHC.

Step 2) Consider the algorithm f-IAHC with L_max(n) = ceil(nR') such that

R' H(p) < H(X). (60)

Similarly to Step 1), since the codeword Z_1 Z_2 ... is an i.i.d. sequence subject to p, again by means of Chebyshev's inequality we have the following.

Lemma 2: For any δ > 0,

Pr{|(1/L_max(n)) c(Z_1 ... Z_{L_max(n)}) - H(p)| > δ} → 0 as n → ∞. (61)

Let D'_n be the set of all codewords s such that

|(1/|s|) c(s) - H(p)| <= δ. (62)

Consider a codeword w of v-IAHC such that w is in B_n and w is in D'_n. If |w| <= ceil(nR'), then

c(w) <= |w| (H(p) + δ) <= ceil(nR') (H(p) + δ) (65)

and, for δ and γ chosen small enough, the right-hand side is smaller than n(H(X) - γ) by (60), which contradicts the fact that w is in B_n, owing to (50). Thus |w| > ceil(nR') must hold, which together with (51) and Lemma 2 concludes that

Pr{L_n(X^n) <= ceil(nR')} → 0 as n → ∞. (66)

In the light of (52) and (60), in which R and R' can be taken arbitrarily close to H(X)/H(p), (59) together with (66) completes the proof.

ACKNOWLEDGMENT

The authors would like to thank Dr. J. Abrahams for having brought homophonic coding to their attention. They would also like to thank Prof. K. Kobayashi and Prof. H. Morita for their discussions.

REFERENCES

[1] C. G. Günther, "A universal algorithm for homophonic coding," in Advances in Cryptology, Eurocrypt '88 (Lecture Notes in Computer Science). Berlin, Germany: Springer-Verlag, 1988.
[2] H. N. Jendal, Y. J. B. Kuhn, and J. L. Massey, "An information-theoretic treatment of homophonic substitution," in Advances in Cryptology, Eurocrypt '89 (Lecture Notes in Computer Science). Berlin, Germany: Springer-Verlag, 1990.
[3] C. E. Shannon, "Communication theory of secrecy systems," Bell Syst. Tech. J., vol. 28, pp. 656-715, 1949.
[4] V. C. da Rocha and J. L. Massey, "On the entropy bound for optimum homophonic substitution," in Proc. IEEE Int. Symp. Information Theory, Ulm, Germany, Sept. 1997, p. 93.
[5] B. Ryabko and A. Fionov, "A fast and efficient homophonic coding algorithm," in Algorithms and Computation, T. Asano et al., Eds. Berlin, Germany: Springer-Verlag, 1996.
[6] T. S. Han and M. Hoshi, "Interval algorithm for random number generation," IEEE Trans. Inform. Theory, vol. 43, pp. 599-611, Mar. 1997.
[7] F. Jelinek, Probabilistic Information Theory. New York: McGraw-Hill, 1968.
[8] C. E. Shannon, "A mathematical theory of communication," Bell Syst. Tech. J., vol. 27, pp. 379-423, 623-656, 1948.
[9] S. A. Savari and R. G. Gallager, "Arithmetic coding for finite-state noiseless channels," IEEE Trans. Inform. Theory, vol. 40, pp. 100-107, Jan. 1994.
[10] J. Rissanen, "Generalized Kraft inequality and arithmetic coding," IBM J. Res. Develop., vol. 20, pp. 198-203, 1976.
[11] R. C. Pasco, "Some coding algorithms for fast data compression," Ph.D. dissertation, Dept. Elect. Eng., Stanford Univ., Stanford, CA, 1976.
[12] R. M. Krause, "Channels which transmit letters of unequal duration," Inform. Contr., vol. 5, pp. 13-24, 1962.
[13] I. Csiszár, "Simple proofs of some theorems on noiseless channels," Inform. Contr., vol. 14, 1969.
[14] I. Csiszár, G. Katona, and G. Tusnády, "Information sources with different cost scales and the principle of the conservation of entropy," Z. Wahrscheinlichkeitstheorie verw. Gebiete, vol. 12, 1969.
[15] R. Ahlswede and I. Wegener, Search Problems. New York: Wiley, 1987.
[16] M. Hoshi and T. S. Han, "Random number generation with cost by an interval algorithm," in Proc. IEEE Int. Symp. Information Theory, Ulm, Germany, Sept. 1997, p. 158.
[17] I. Csiszár and J. Körner, Information Theory: Coding Theorems for Discrete Memoryless Systems. New York: Academic, 1981.
[18] A. R. Barron, "The strong ergodic theorem for densities: Generalized Shannon-McMillan-Breiman theorem," Ann. Probab., vol. 13, no. 4, pp. 1292-1303, 1985.
[19] T. S. Han, Information-Spectrum Methods in Information Theory (in Japanese). Tokyo, Japan: Baifu-kan, 1998.
[20] J. Rissanen and G. G. Langdon, "Arithmetic coding," IBM J. Res. Develop., vol. 23, pp. 149-162, 1979.
[21] M. Hoshi and T. S. Han, "Interval algorithms for homophonic coding," in Proc. IEEE Information Theory Workshop, Kruger National Park, South Africa, June 1999.
[22] B. Ryabko and A. Fionov, "Decreasing redundancy of homophonic coding," in Proc. IEEE Int. Symp. Information Theory, Ulm, Germany, Sept. 1997, p. 94.


More information

6. Finding Efficient Compressions; Huffman and Hu-Tucker

6. Finding Efficient Compressions; Huffman and Hu-Tucker 6. Finding Efficient Compressions; Huffman and Hu-Tucker We now address the question: how do we find a code that uses the frequency information about k length patterns efficiently to shorten our message?

More information

Extremal Graph Theory: Turán s Theorem

Extremal Graph Theory: Turán s Theorem Bridgewater State University Virtual Commons - Bridgewater State University Honors Program Theses and Projects Undergraduate Honors Program 5-9-07 Extremal Graph Theory: Turán s Theorem Vincent Vascimini

More information

looking ahead to see the optimum

looking ahead to see the optimum ! Make choice based on immediate rewards rather than looking ahead to see the optimum! In many cases this is effective as the look ahead variation can require exponential time as the number of possible

More information

Greedy Algorithms CHAPTER 16

Greedy Algorithms CHAPTER 16 CHAPTER 16 Greedy Algorithms In dynamic programming, the optimal solution is described in a recursive manner, and then is computed ``bottom up''. Dynamic programming is a powerful technique, but it often

More information

II (Sorting and) Order Statistics

II (Sorting and) Order Statistics II (Sorting and) Order Statistics Heapsort Quicksort Sorting in Linear Time Medians and Order Statistics 8 Sorting in Linear Time The sorting algorithms introduced thus far are comparison sorts Any comparison

More information

Linear-Programming Decoding of Nonbinary Linear Codes Mark F. Flanagan, Member, IEEE, Vitaly Skachek, Member, IEEE, Eimear Byrne, and Marcus Greferath

Linear-Programming Decoding of Nonbinary Linear Codes Mark F. Flanagan, Member, IEEE, Vitaly Skachek, Member, IEEE, Eimear Byrne, and Marcus Greferath 4134 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 55, NO. 9, SEPTEMBER 2009 Linear-Programming Decoding of Nonbinary Linear Codes Mark F. Flanagan, Member, IEEE, Vitaly Skachek, Member, IEEE, Eimear Byrne,

More information

Acyclic Edge Colorings of Graphs

Acyclic Edge Colorings of Graphs Acyclic Edge Colorings of Graphs Noga Alon Benny Sudaov Ayal Zas Abstract A proper coloring of the edges of a graph G is called acyclic if there is no 2-colored cycle in G. The acyclic edge chromatic number

More information

Figure-2.1. Information system with encoder/decoders.

Figure-2.1. Information system with encoder/decoders. 2. Entropy Coding In the section on Information Theory, information system is modeled as the generationtransmission-user triplet, as depicted in fig-1.1, to emphasize the information aspect of the system.

More information

Integer Programming ISE 418. Lecture 7. Dr. Ted Ralphs

Integer Programming ISE 418. Lecture 7. Dr. Ted Ralphs Integer Programming ISE 418 Lecture 7 Dr. Ted Ralphs ISE 418 Lecture 7 1 Reading for This Lecture Nemhauser and Wolsey Sections II.3.1, II.3.6, II.4.1, II.4.2, II.5.4 Wolsey Chapter 7 CCZ Chapter 1 Constraint

More information

On the Finiteness of the Recursive Chromatic Number

On the Finiteness of the Recursive Chromatic Number On the Finiteness of the Recursive Chromatic Number William I Gasarch Andrew C.Y. Lee Abstract A recursive graph is a graph whose vertex and edges sets are recursive. A highly recursive graph is a recursive

More information

Data Compression Techniques

Data Compression Techniques Data Compression Techniques Part 1: Entropy Coding Lecture 1: Introduction and Huffman Coding Juha Kärkkäinen 31.10.2017 1 / 21 Introduction Data compression deals with encoding information in as few bits

More information

EDAA40 At home exercises 1

EDAA40 At home exercises 1 EDAA40 At home exercises 1 1. Given, with as always the natural numbers starting at 1, let us define the following sets (with iff ): Give the number of elements in these sets as follows: 1. 23 2. 6 3.

More information

On the other hand, the main disadvantage of the amortized approach is that it cannot be applied in real-time programs, where the worst-case bound on t

On the other hand, the main disadvantage of the amortized approach is that it cannot be applied in real-time programs, where the worst-case bound on t Randomized Meldable Priority Queues Anna Gambin and Adam Malinowski Instytut Informatyki, Uniwersytet Warszawski, Banacha 2, Warszawa 02-097, Poland, faniag,amalg@mimuw.edu.pl Abstract. We present a practical

More information

Heuristic Algorithms for Multiconstrained Quality-of-Service Routing

Heuristic Algorithms for Multiconstrained Quality-of-Service Routing 244 IEEE/ACM TRANSACTIONS ON NETWORKING, VOL 10, NO 2, APRIL 2002 Heuristic Algorithms for Multiconstrained Quality-of-Service Routing Xin Yuan, Member, IEEE Abstract Multiconstrained quality-of-service

More information

The self-minor conjecture for infinite trees

The self-minor conjecture for infinite trees The self-minor conjecture for infinite trees Julian Pott Abstract We prove Seymour s self-minor conjecture for infinite trees. 1. Introduction P. D. Seymour conjectured that every infinite graph is a proper

More information

On Soft Topological Linear Spaces

On Soft Topological Linear Spaces Republic of Iraq Ministry of Higher Education and Scientific Research University of AL-Qadisiyah College of Computer Science and Formation Technology Department of Mathematics On Soft Topological Linear

More information

Optimal Region for Binary Search Tree, Rotation and Polytope

Optimal Region for Binary Search Tree, Rotation and Polytope Optimal Region for Binary Search Tree, Rotation and Polytope Kensuke Onishi Mamoru Hoshi 2 Department of Mathematical Sciences, School of Science Tokai University, 7 Kitakaname, Hiratsuka, Kanagawa, 259-292,

More information

Fast algorithm for generating ascending compositions

Fast algorithm for generating ascending compositions manuscript No. (will be inserted by the editor) Fast algorithm for generating ascending compositions Mircea Merca Received: date / Accepted: date Abstract In this paper we give a fast algorithm to generate

More information

16 Greedy Algorithms

16 Greedy Algorithms 16 Greedy Algorithms Optimization algorithms typically go through a sequence of steps, with a set of choices at each For many optimization problems, using dynamic programming to determine the best choices

More information

Secret sharing on the d-dimensional cube

Secret sharing on the d-dimensional cube Secret sharing on the d-dimensional cube László Csirmaz Central European University June 10, 2005 Abstract We prove that for d > 1 the information rate of the perfect secret sharing scheme based on the

More information

Reflection in the Chomsky Hierarchy

Reflection in the Chomsky Hierarchy Reflection in the Chomsky Hierarchy Henk Barendregt Venanzio Capretta Dexter Kozen 1 Introduction We investigate which classes of formal languages in the Chomsky hierarchy are reflexive, that is, contain

More information

Chordal deletion is fixed-parameter tractable

Chordal deletion is fixed-parameter tractable Chordal deletion is fixed-parameter tractable Dániel Marx Institut für Informatik, Humboldt-Universität zu Berlin, Unter den Linden 6, 10099 Berlin, Germany. dmarx@informatik.hu-berlin.de Abstract. It

More information

Universal Cycles for Permutations

Universal Cycles for Permutations arxiv:0710.5611v1 [math.co] 30 Oct 2007 Universal Cycles for Permutations J Robert Johnson School of Mathematical Sciences Queen Mary, University of London Mile End Road, London E1 4NS, UK Email: r.johnson@qmul.ac.uk

More information

MA651 Topology. Lecture 4. Topological spaces 2

MA651 Topology. Lecture 4. Topological spaces 2 MA651 Topology. Lecture 4. Topological spaces 2 This text is based on the following books: Linear Algebra and Analysis by Marc Zamansky Topology by James Dugundgji Fundamental concepts of topology by Peter

More information

Review of Sets. Review. Philippe B. Laval. Current Semester. Kennesaw State University. Philippe B. Laval (KSU) Sets Current Semester 1 / 16

Review of Sets. Review. Philippe B. Laval. Current Semester. Kennesaw State University. Philippe B. Laval (KSU) Sets Current Semester 1 / 16 Review of Sets Review Philippe B. Laval Kennesaw State University Current Semester Philippe B. Laval (KSU) Sets Current Semester 1 / 16 Outline 1 Introduction 2 Definitions, Notations and Examples 3 Special

More information

An Efficient Decoding Technique for Huffman Codes Abstract 1. Introduction

An Efficient Decoding Technique for Huffman Codes Abstract 1. Introduction An Efficient Decoding Technique for Huffman Codes Rezaul Alam Chowdhury and M. Kaykobad Department of Computer Science and Engineering Bangladesh University of Engineering and Technology Dhaka-1000, Bangladesh,

More information

ON SWELL COLORED COMPLETE GRAPHS

ON SWELL COLORED COMPLETE GRAPHS Acta Math. Univ. Comenianae Vol. LXIII, (1994), pp. 303 308 303 ON SWELL COLORED COMPLETE GRAPHS C. WARD and S. SZABÓ Abstract. An edge-colored graph is said to be swell-colored if each triangle contains

More information

Matchings in Graphs. Definition 1 Let G = (V, E) be a graph. M E is called as a matching of G if v V we have {e M : v is incident on e E} 1.

Matchings in Graphs. Definition 1 Let G = (V, E) be a graph. M E is called as a matching of G if v V we have {e M : v is incident on e E} 1. Lecturer: Scribe: Meena Mahajan Rajesh Chitnis Matchings in Graphs Meeting: 1 6th Jan 010 Most of the material in this lecture is taken from the book Fast Parallel Algorithms for Graph Matching Problems

More information

A Connection between Network Coding and. Convolutional Codes

A Connection between Network Coding and. Convolutional Codes A Connection between Network Coding and 1 Convolutional Codes Christina Fragouli, Emina Soljanin christina.fragouli@epfl.ch, emina@lucent.com Abstract The min-cut, max-flow theorem states that a source

More information

8 Integer encoding. scritto da: Tiziano De Matteis

8 Integer encoding. scritto da: Tiziano De Matteis 8 Integer encoding scritto da: Tiziano De Matteis 8.1 Unary code... 8-2 8.2 Elias codes: γ andδ... 8-2 8.3 Rice code... 8-3 8.4 Interpolative coding... 8-4 8.5 Variable-byte codes and (s,c)-dense codes...

More information

IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 6, JUNE Jie Luo, Member, IEEE, and Anthony Ephremides, Fellow, IEEE

IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 6, JUNE Jie Luo, Member, IEEE, and Anthony Ephremides, Fellow, IEEE IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 6, JUNE 2006 2593 On the Throughput, Capacity, and Stability Regions of Random Multiple Access Jie Luo, Member, IEEE, and Anthony Ephremides, Fellow,

More information

The Rainbow Connection of a Graph Is (at Most) Reciprocal to Its Minimum Degree

The Rainbow Connection of a Graph Is (at Most) Reciprocal to Its Minimum Degree The Rainbow Connection of a Graph Is (at Most) Reciprocal to Its Minimum Degree Michael Krivelevich 1 and Raphael Yuster 2 1 SCHOOL OF MATHEMATICS, TEL AVIV UNIVERSITY TEL AVIV, ISRAEL E-mail: krivelev@post.tau.ac.il

More information

Importance Sampling Simulation for Evaluating Lower-Bound Symbol Error Rate of the Bayesian DFE With Multilevel Signaling Schemes

Importance Sampling Simulation for Evaluating Lower-Bound Symbol Error Rate of the Bayesian DFE With Multilevel Signaling Schemes IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL 50, NO 5, MAY 2002 1229 Importance Sampling Simulation for Evaluating Lower-Bound Symbol Error Rate of the Bayesian DFE With Multilevel Signaling Schemes Sheng

More information

SETS of bipolar codewords that have equal numbers of s

SETS of bipolar codewords that have equal numbers of s IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 56, NO. 4, APRIL 2010 1673 Knuth s Balanced Codes Revisited Jos H. Weber, Senior Member, IEEE, and Kees A. Schouhamer Immink, Fellow, IEEE Abstract In 1986,

More information

New Constructions of Non-Adaptive and Error-Tolerance Pooling Designs

New Constructions of Non-Adaptive and Error-Tolerance Pooling Designs New Constructions of Non-Adaptive and Error-Tolerance Pooling Designs Hung Q Ngo Ding-Zhu Du Abstract We propose two new classes of non-adaptive pooling designs The first one is guaranteed to be -error-detecting

More information

THE TRANSITIVE REDUCTION OF A DIRECTED GRAPH*

THE TRANSITIVE REDUCTION OF A DIRECTED GRAPH* SIAM J. COMPUT. Vol. 1, No. 2, June 1972 THE TRANSITIVE REDUCTION OF A DIRECTED GRAPH* A. V. AHO, M. R. GAREY" AND J. D. ULLMAN Abstract. We consider economical representations for the path information

More information

More on weighted servers

More on weighted servers More on weighted servers or FIFO is better than LRU Leah Epstein Csanád Imreh Rob van Stee Abstract We consider a generalized 2-server problem on the uniform space in which servers have different costs

More information

Trees. 3. (Minimally Connected) G is connected and deleting any of its edges gives rise to a disconnected graph.

Trees. 3. (Minimally Connected) G is connected and deleting any of its edges gives rise to a disconnected graph. Trees 1 Introduction Trees are very special kind of (undirected) graphs. Formally speaking, a tree is a connected graph that is acyclic. 1 This definition has some drawbacks: given a graph it is not trivial

More information

arxiv:cs/ v1 [cs.ds] 20 Feb 2003

arxiv:cs/ v1 [cs.ds] 20 Feb 2003 The Traveling Salesman Problem for Cubic Graphs David Eppstein School of Information & Computer Science University of California, Irvine Irvine, CA 92697-3425, USA eppstein@ics.uci.edu arxiv:cs/0302030v1

More information

Math 5593 Linear Programming Lecture Notes

Math 5593 Linear Programming Lecture Notes Math 5593 Linear Programming Lecture Notes Unit II: Theory & Foundations (Convex Analysis) University of Colorado Denver, Fall 2013 Topics 1 Convex Sets 1 1.1 Basic Properties (Luenberger-Ye Appendix B.1).........................

More information

PANCYCLICITY WHEN EACH CYCLE CONTAINS k CHORDS

PANCYCLICITY WHEN EACH CYCLE CONTAINS k CHORDS Discussiones Mathematicae Graph Theory xx (xxxx) 1 13 doi:10.7151/dmgt.2106 PANCYCLICITY WHEN EACH CYCLE CONTAINS k CHORDS Vladislav Taranchuk Department of Mathematics and Statistics California State

More information

Treewidth and graph minors

Treewidth and graph minors Treewidth and graph minors Lectures 9 and 10, December 29, 2011, January 5, 2012 We shall touch upon the theory of Graph Minors by Robertson and Seymour. This theory gives a very general condition under

More information

A greedy, partially optimal proof of the Heine-Borel Theorem

A greedy, partially optimal proof of the Heine-Borel Theorem A greedy, partially optimal proof of the Heine-Borel Theorem James Fennell September 28, 2017 Abstract The Heine-Borel theorem states that any open cover of the closed interval Œa; b contains a finite

More information

A Comparative Study of Entropy Encoding Techniques for Lossless Text Data Compression

A Comparative Study of Entropy Encoding Techniques for Lossless Text Data Compression A Comparative Study of Entropy Encoding Techniques for Lossless Text Data Compression P. RATNA TEJASWI 1 P. DEEPTHI 2 V.PALLAVI 3 D. GOLDIE VAL DIVYA 4 Abstract: Data compression is the art of reducing

More information

Some Applications of Graph Bandwidth to Constraint Satisfaction Problems

Some Applications of Graph Bandwidth to Constraint Satisfaction Problems Some Applications of Graph Bandwidth to Constraint Satisfaction Problems Ramin Zabih Computer Science Department Stanford University Stanford, California 94305 Abstract Bandwidth is a fundamental concept

More information

2.2 Set Operations. Introduction DEFINITION 1. EXAMPLE 1 The union of the sets {1, 3, 5} and {1, 2, 3} is the set {1, 2, 3, 5}; that is, EXAMPLE 2

2.2 Set Operations. Introduction DEFINITION 1. EXAMPLE 1 The union of the sets {1, 3, 5} and {1, 2, 3} is the set {1, 2, 3, 5}; that is, EXAMPLE 2 2.2 Set Operations 127 2.2 Set Operations Introduction Two, or more, sets can be combined in many different ways. For instance, starting with the set of mathematics majors at your school and the set of

More information