OPTIMAL PREFIX CODES AND HUFFMAN CODES


Intern. J. Computer Math., Vol. 80, June 2003, pp. 727-742

OPTIMAL PREFIX CODES AND HUFFMAN CODES

DONGYANG LONG a,b,*, WEIJIA JIA c,† and MING LI d,‡

a Department of Computer Science, Zhongshan University, Guangzhou, Guangdong, P.R.C.; b The State Key Laboratory of Information Security, Chinese Academy of Sciences, Beijing, P.R.C.; c Department of Computer Science, City University of Hong Kong, 83 Tat Chee Avenue, Kowloon, Hong Kong, P.R.C.; d School of Computing, National University of Singapore, Singapore

(Received 13 June 2002; In final form 23 September 2002)

The existence of optimal prefix codes is shown in this paper, and the relationship between optimal prefix codes and Huffman codes is discussed. We prove that all Huffman codes are optimal prefix codes and that, conversely, optimal prefix codes need not be Huffman codes. In particular, the problem of whether an optimal prefix code has to be maximal is presented: for information source alphabets of not more than four letters we show that an optimal prefix code must be maximal, but in general this remains an open problem. As with Huffman codes, optimal prefix codes are used not only for statistical modeling but also for dictionary methods. Moreover, from the viewpoint of computational difficulty, it is shown that the complexity of breaking an optimal prefix code is NP-complete.

Keywords: Data transmission and compression; Huffman code; Optimal prefix code; Maximal prefix code

C.R. Categories: F.4.3, E.4, H.1.1, I

1 INTRODUCTION

Huffman codes have been widely used in data, image, and video compression [1, 3-9, 11-14, 17-22]. For instance, Huffman coding is used to compress the result of the quantization stage in JPEG [18]. Huffman published a method for constructing a highly efficient coding for a given finite information source in 1952 [8]. This method is known as Huffman coding (or Huffman's algorithm), and the code generated by Huffman's algorithm is said to be the Huffman code (or canonical Huffman code). Huffman coding schemes are optimal prefix coding schemes, that is, they have the smallest average code word length among all prefix coding schemes. Two other algorithmic methods for approximately solving the above problem of finding an optimal

This work was partially sponsored by HK UGC grants CityU 1055/01E and CityU 1039/02E, by CityU grants, by the 863 Program (Project No. AA144060), and by the National Natural Science Foundation of China.
* Corresponding author. E-mail: issldy@zsu.edu.cn
† E-mail: wjia@cs.cityu.edu.hk
‡ E-mail: lim@comp.nus.edu.sg
© 2003 Taylor & Francis Ltd

coding scheme are Shannon's and Fano's methods [19, 21]. Both of them are prefix coding schemes [21]. Although Shannon's and Fano's methods are not optimal prefix coding schemes in general, a mistaken impression seems to have arisen: much of the literature states that Huffman coding schemes are optimal prefix coding schemes, but the question of whether optimal prefix codes have to be Huffman codes is either left ambiguous or not addressed at all [1, 3-9, 11-14, 17-22]. In Ref. [14] Huffman codes and optimal prefix codes are not distinguished. Motivated by this problem, this paper mainly discusses the difference between Huffman codes and optimal prefix codes. As is well known, for finite source alphabets all Huffman codes are optimal prefix codes [4, 8, 9, 21]. For infinite source alphabets, several approaches have been taken to construct Huffman codes [3, 4, 8, 19, 21], and the existence of Huffman codes for infinite source alphabets was shown in Refs. [1, 6, 14, 17]. To the best of our knowledge there is no discussion in the literature of whether or not optimal prefix codes are Huffman codes. This work is also a continuation of our previous papers [15, 16].

This paper is organized as follows. For simplicity, we first list some definitions and basic notions in Section 2. Besides the existence of optimal prefix codes for finite or infinite information source alphabets, we show in Section 3 that optimal prefix codes need not be Huffman codes. Section 4 concerns the relationship between optimal prefix codes and optimal maximal prefix codes; an open problem is presented, namely whether or not an optimal prefix code must be maximal. It is proven that, for finite information source alphabets with not more than four letters, the class of optimal prefix codes coincides with the class of optimal maximal prefix codes. Section 5 applies optimal prefix codes to data compression; optimal prefix codes are used not only for traditional statistical modeling but also for dictionary methods. Data compression also yields good encryption [12]. From the viewpoint of computational complexity, the problem of breaking a file encoded by a prefix code or an optimal prefix code is dealt with in Section 6, where it is shown that the complexity of breaking an optimal prefix code is NP-complete. Finally, Section 7 concludes with some remarks.

2 DEFINITIONS AND BASIC NOTIONS

Some basic concepts and notations are first given [2-4, 10, 19, 23]. An alphabet S is a finite set, S* is the set of all finite-length words formed from the letters of S (including the empty word λ), and S⁺ = S* \ {λ}. A subset C ⊆ S⁺ is called a code [2, 10] (or uniquely decipherable code, or uniquely decodable code [4, 19]) if, for all words x_{i1}, x_{i2}, ..., x_{in}, x_{j1}, x_{j2}, ..., x_{jm} ∈ C, the equality x_{i1} x_{i2} ··· x_{in} = x_{j1} x_{j2} ··· x_{jm} implies m = n and x_{ik} = x_{jk} for k = 1, ..., n. A code C ⊆ S⁺ is called a maximal code if, for any x ∈ S⁺ \ C, C ∪ {x} is not a code. For each word w ∈ S*, let l(w) denote the length of w. A code C ⊆ S⁺ is called a prefix (or instantaneous) code [2, 4, 10, 19, 23] if C ∩ CS⁺ = ∅, that is, if no code word is a prefix of any other code word. A prefix code C ⊆ S⁺ is called a maximal prefix code if, for any x ∈ S⁺ \ C, C ∪ {x} is not a prefix code.
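A minimal sketch of this prefix-condition check (illustrative Python; the function name is our own choice, not from the paper):

def is_prefix_code(words):
    """Return True if no word in `words` is a proper prefix of another."""
    ws = sorted(set(words))        # after sorting, a prefix appears immediately before one of its extensions
    for a, b in zip(ws, ws[1:]):
        if b.startswith(a):        # the prefix condition C ∩ CS⁺ = ∅ fails
            return False
    return True

print(is_prefix_code(["0", "10", "110", "111"]))   # True: a prefix code
print(is_prefix_code(["0", "01", "11"]))           # False: "0" is a prefix of "01"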
DEFINITION 2.1 An information source [19] is an ordered pair I = (S, P), where S = {s₁, s₂, ..., s_q} is a source alphabet and P is a probability law that assigns to each element s_i ∈ S a probability P(s_i).

DEFINITION 2.2 Let S = {s₁, s₂, ..., s_m} be a source alphabet and Σ = {a₁, a₂, ..., a_n} an input alphabet. An ordered pair (C, f) is said to be a coding (encoding) if C ⊆ Σ⁺ is a code (uniquely decodable code, uniquely decipherable code [19]) and f: S⁺ → Σ⁺ is a one-to-one mapping.

DEFINITION 2.3 Let I = (S = {s₁, s₂, ..., s_m}, P = {p₁, p₂, ..., p_m}) be an information source and Σ = {a₁, a₂, ..., a_n} an input alphabet, and let (C, f) be a coding such that f: S → Σ⁺ is a one-to-one mapping. The average code word length of (C, f) is Σ_{i=1}^{m} l(f(s_i)) P(s_i), where l(f(s_i)) denotes the length of the code word f(s_i) ∈ C.

Note that this average code word length is only defined for the special codings f: s_i (∈ S) → c_i ∈ Σ⁺. Clearly, Definition 2.3 is easily extended to a general information source.

DEFINITION 2.4 Let I' = (S' = {w₁, w₂, ..., w_k}, P' = {p₁, p₂, ..., p_k}) be an information source, Σ = {a₁, a₂, ..., a_n} an input alphabet, and (C, f) a coding. Then the average code word length of (C, f) is Σ_{i=1}^{k} l(f(w_i)) P(w_i), where l(f(w_i)) denotes the length of the code word f(w_i) ∈ C.

Suppose that f: w_i (∈ S⁺) → c_i (∈ Σ⁺), i = 1, ..., k, is a coding for a source alphabet S = {s₁, s₂, ..., s_m}, and suppose it is known that the source words w₁, ..., w_k occur with relative frequencies p₁, ..., p_k, respectively. (For example, a minimal set of generating elements of S⁺ may be taken as the set of source words; of course, for different types of texts there may be distinct sets of source words.) That is, p_i is regarded as the probability that a word selected at random from the source text is w_i. Huffman coding is a coding method based on the source letters: for a given information source I = (S = {s₁, ..., s_m}, P = {p₁, ..., p_m}) and an input alphabet Σ = {a₁, ..., a_n}, a Huffman coding h is a one-to-one mapping from the source alphabet S into the set Σ⁺ of input words, h: S → Σ⁺, such that {h(s₁), h(s₂), ..., h(s_m)} is a Huffman code over Σ. Clearly, the notion of Huffman coding can be generalized; we therefore define a g-Huffman coding below.

DEFINITION 2.5 Let I' = (S' = {w₁, w₂, ..., w_m}, P' = {p₁, p₂, ..., p_m}) be an information source, Σ = {a₁, a₂, ..., a_n} an input alphabet, and (C, f) a coding. The coding f is called a g-Huffman coding if and only if C is a Huffman code for I' = (S' = {w₁, ..., w_m}, P').

Note that for a given information source I = (S = {s₁, ..., s_m}, P = {p₁, ..., p_m}) it is easy to construct a Huffman coding, but it is more difficult to obtain a g-Huffman coding. First, a minimal set S' = {w₁, ..., w_k} of generating elements of S⁺ = {s₁, s₂, ..., s_m}⁺ is given. Then the relative frequencies p₁, ..., p_k with which the source words w₁, ..., w_k occur are calculated; that is, p_i is regarded as the probability that a word selected at random from the source text is w_i. This yields the distribution P' = {p₁, p₂, ..., p_k}. From the information source I' = (S' = {w₁, ..., w_k}, P' = {p₁, ..., p_k}), Huffman's algorithm then easily gives a g-Huffman coding. Clearly, a Huffman coding for I = (S, P) is always a g-Huffman coding for I, but a g-Huffman coding for I' = (S', P') is not in general a Huffman coding for I = (S, P). The g-Huffman coding is a coding method based on source words instead of letters.
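As a small worked illustration of the average code word length of Definitions 2.3 and 2.4 (a sketch; names are of our choosing):

def average_code_word_length(code_words, probabilities):
    # code_words[i] encodes the ith source symbol (or source word), which occurs with probabilities[i]
    assert abs(sum(probabilities) - 1.0) < 1e-9
    return sum(len(c) * p for c, p in zip(code_words, probabilities))

# Source alphabet {s1, s2, s3} with P = {0.5, 0.3, 0.2} encoded by the prefix code {0, 10, 11}:
print(average_code_word_length(["0", "10", "11"], [0.5, 0.3, 0.2]))   # 1.5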
DEFINITION 2.6 Let S = {s₁, s₂, ..., s_m} be a source alphabet, Σ = {a₁, a₂, ..., a_n} an input alphabet, and (C, f) a coding. Then

(a) A coding is said to be prefix if C is a prefix code.
(b) A coding is said to be maximal if C is a maximal code.
(c) A coding is said to be maximal prefix if C is a maximal prefix code.

DEFINITION 2.7 Let I = (S = {s₁, ..., s_m}, P = {p₁, ..., p_m}) be an information source, Σ = {a₁, ..., a_n} an input alphabet, and (C = {c₁, c₂, ..., c_m}, f) a coding such that f: S → Σ⁺ is a one-to-one mapping.

(a) A code C is called optimal for the information source I = (S, P) if no other code has a smaller average code word length. That is, if (D = {d₁, ..., d_m}, f) is any coding such that f: S → Σ⁺ is a one-to-one mapping and D is a code over Σ, then Σ_{i=1}^{m} l(c_i)P(s_i) ≤ Σ_{i=1}^{m} l(d_i)P(s_i). We also call C an optimal code for the given finite information source I = (S, P).
(b) A maximal code C is called optimal for I = (S, P) if no other maximal code has a smaller average code word length. That is, if (D = {d₁, ..., d_m}, f) is any maximal coding such that f: S → Σ⁺ is one-to-one and D is a maximal code over Σ, then Σ_{i=1}^{m} l(c_i)P(s_i) ≤ Σ_{i=1}^{m} l(d_i)P(s_i). We also call C an optimal maximal code for I = (S, P).
(c) A prefix code C is called optimal for I = (S, P) if no other prefix code has a smaller average code word length. That is, if (D = {d₁, ..., d_m}, f) is any prefix coding such that f: S → Σ⁺ is one-to-one and D is a prefix code over Σ, then Σ_{i=1}^{m} l(c_i)P(s_i) ≤ Σ_{i=1}^{m} l(d_i)P(s_i). We also call C an optimal prefix code for I = (S, P).
(d) A maximal prefix code C is called optimal for I = (S, P) if no other maximal prefix code has a smaller average code word length. That is, if (D = {d₁, ..., d_m}, f) is any maximal prefix coding such that f: S → Σ⁺ is one-to-one and D is a maximal prefix code over Σ, then Σ_{i=1}^{m} l(c_i)P(s_i) ≤ Σ_{i=1}^{m} l(d_i)P(s_i). We also call C an optimal maximal prefix code for I = (S, P).

3 EXISTENCE OF OPTIMAL PREFIX CODES

First, the existence of optimal prefix coding schemes for finite information source alphabets is proven.

THEOREM 3.1 Let I = (S, P) be a finite information source. Then all Huffman coding schemes for I are optimal prefix coding schemes. Conversely, optimal prefix coding schemes need not be Huffman coding schemes.

Proof The proof of the first part of Theorem 3.1 can be found in Refs. [15, 16]. Conversely, that optimal prefix codes need not be Huffman codes is verified by the following Example 3.1.

Example 3.1 Let I = (S = {s₁, s₂, s₃, s₄, s₅, s₆}, P = {0.26, 0.24, 0.14, 0.13, 0.12, 0.11}) be an information source and Σ = {0, 1} the input alphabet. Table I shows two Huffman codes for this source, and Table II gives a non-Huffman prefix coding C₃ (a quick computational check of the Huffman code word lengths for this source is given below).
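The code word lengths that Huffman's algorithm assigns to this source can be checked with a short script (a sketch: the merging routine below is a standard binary Huffman merge, not code taken from the paper):

import heapq

def huffman_lengths(probs):
    # repeatedly merge the two least probable groups; every symbol in a merged group gets one bit deeper
    heap = [(p, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    while len(heap) > 1:
        p1, ids1 = heapq.heappop(heap)
        p2, ids2 = heapq.heappop(heap)
        for i in ids1 + ids2:
            lengths[i] += 1
        heapq.heappush(heap, (p1 + p2, ids1 + ids2))
    return lengths

P = [0.26, 0.24, 0.14, 0.13, 0.12, 0.11]
L = huffman_lengths(P)
print(L, sum(p * l for p, l in zip(P, L)))   # lengths 2, 2, 3, 3, 3, 3; average 2.5 bits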

TABLE I Two Huffman coding schemes.

Source letter   Probability   Huffman code C₁   Huffman code C₂
s₁              0.26
s₂              0.24
s₃              0.14
s₄              0.13
s₅              0.12
s₆              0.11

TABLE II A non-Huffman coding.

Source letter   Probability   Code C₃
s₁              0.26
s₂              0.24
s₃              0.14
s₄              0.13
s₅              0.12
s₆              0.11

According to Huffman's algorithm, the code words of the source letters s₁ and s₂ must start with different bits, but in the code of Table II they both start with 0. The code C₃ is therefore impossible to generate by any re-labeling of the nodes of the Huffman trees of Table I; that is, C₃ cannot be generated by the Huffman method. Clearly, C₃ is a prefix code.

Example 3.1 immediately proves the second half of Theorem 3.1. Therefore, the class of Huffman codes is a proper subclass of the class of optimal prefix codes. Theorem 3.1 illustrates the difference between Huffman codes and optimal prefix codes and shows that the two concepts are distinct.

For infinite information source alphabets, the existence of optimal prefix codes is given below.

THEOREM 3.2 Let X be a random variable with a countably infinite set of possible outcomes and with finite entropy. Then for every r > 1 the following hold:
(1) There exists a sequence of r-ary truncated optimal prefix codes for X which converges to an optimal prefix code for X.
(2) The average code word lengths in any sequence of r-ary truncated optimal prefix codes converge to the shortest average code word length for X.
(3) Any r-ary optimal prefix code for X must satisfy the Kraft inequality with equality.

To facilitate the proof, basic notations and definitions [4, 14, 19] are first given. A prefix code over a finite alphabet Σ with d letters is called a d-ary code (prefix code) over Σ. Let Z⁺ denote the positive integers. A sequence of d-ary prefix codes C₁, C₂, ... converges to an infinite prefix code C if, for every i ≥ 1, the ith code word of Cₙ is eventually constant (as n grows) and equals the ith code word of C. d-ary prefix codes are known to satisfy Kraft's inequality Σ_{w∈C} d^{−l(w)} ≤ 1. Conversely, any collection of positive integers that satisfies Kraft's inequality corresponds to the code word lengths of a prefix code [4, 19]. Let X be a source random variable whose countably infinite range is (without loss of generality) Z⁺, with respective probabilities p₁ ≥ p₂ ≥ p₃ ≥ ..., where p_i > 0 for all i. The average code word length of a code C = {w₁, w₂, ...} used to encode X is Σ_{i=1}^{∞} p_i l(w_i).
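The two Kraft facts just quoted can be illustrated by a short sketch (function names are our own): kraft_sum evaluates the inequality for a given code, and prefix_code_from_lengths builds a binary prefix code from any lengths that satisfy it.

from fractions import Fraction

def kraft_sum(code_words, d=2):
    return sum(Fraction(1, d ** len(w)) for w in code_words)

def prefix_code_from_lengths(lengths):
    # canonical construction: assign code words of the given lengths greedily from left to right
    lengths = sorted(lengths)
    assert sum(Fraction(1, 2 ** l) for l in lengths) <= 1, "Kraft inequality violated"
    words, value, prev = [], 0, 0
    for l in lengths:
        value <<= (l - prev)                        # extend the current position to length l
        words.append(format(value, "0{}b".format(l)))
        value += 1                                  # skip the whole subtree below this code word
        prev = l
    return words

print(kraft_sum(["0", "10", "110", "111"]))         # 1
print(prefix_code_from_lengths([1, 2, 3, 3]))       # ['0', '10', '110', '111']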

The entropy of the random variable X is defined as H(X) = −Σ_{i=1}^{∞} p_i log p_i. It is well known that the average code word length of a Huffman code is no smaller than H(X) and is smaller than H(X) + 1 [4]. By a lemma in Ref. [4], we easily obtain:

LEMMA 3.1 The average code word length of an optimal prefix code is no smaller than H(X) and is smaller than H(X) + 1.

Lemma 3.1 plays a crucial role in establishing the existence of optimal codes for infinite information sources. By Huffman's algorithm, Huffman coding gives a method for constructing optimal prefix codes for finite source ranges. Next, for each n ≥ 1, let X_n be a random variable with the finite range {1, ..., n} and probabilities p_i^(n) = p_i / S_n, where S_n = Σ_{i=1}^{n} p_i. By analogy with the truncated Huffman code [14], we define a d-ary truncated optimal prefix code of size n for X as a d-ary optimal prefix code for X_n.

Proof of Theorem 3.2 Using a minor modification of Theorem 1 in Ref. [14], we easily obtain the proof of Theorem 3.2. For each n ≥ 1, let C_n be a d-ary truncated optimal prefix code of size n for X, and denote the sequence of code word lengths of C_n (followed by zeros) by

l^(n) = {l_1^(n), l_2^(n), ..., l_n^(n), 0, 0, ...}.

Let J denote the set of all such sequences of non-negative integers. For each n, the average length Σ_{i=1}^{n} l_i^(n) p_i^(n) of the optimal prefix code C_n is not larger than H(X_n) + 1 (by Lemma 3.1), where the entropy of X_n is

H(X_n) = −Σ_{i=1}^{n} p_i^(n) log p_i^(n) = −(1/S_n) Σ_{i=1}^{n} p_i (log p_i − log S_n) → H(X) as n → ∞,

since S_n = Σ_{i=1}^{n} p_i → 1 as n → ∞. Hence

H(X_n) + 1 ≤ H(X) + 2 for n sufficiently large.

For each positive integer n we have Σ_{i=1}^{n} p_i l_i^(n) ≤ (H(X_n) + 1) S_n and, therefore,

p_i l_i^(n) ≤ (H(X_n) + 1) S_n ≤ H(X_n) + 1 for all i.

This implies that

l_i^(n) ≤ (H(X) + 2) / p_i

for n sufficiently large. Thus for each i the sequence of code word lengths {l_i^(1), l_i^(2), ...} is bounded, and therefore the corresponding sequence of code words can take on only a finite set of possible values. Hence, for each i, there is a convergent subsequence of code words; in fact, every infinite indexed subset of this sequence of code words has a convergent subsequence of code words.

We conclude (using a minor modification of Ref. [14]) that there exists a subsequence of codes C_{n₁}, C_{n₂}, ... that converges to an infinite code Ĉ. Clearly, Ĉ is a prefix code since it is a limit of finite optimal prefix codes. Furthermore, the subsequence {l^(n_k)} of elements of J converges to a sequence l̂ = {l̂₁, l̂₂, ...} ∈ J, in the sense that for each i ∈ Z⁺ the sequence l_i^(n_k) converges to l̂_i.

To show the optimality of Ĉ, let l₁, l₂, ... be the code word lengths of an arbitrary prefix code for X. For every k there exists a j_k such that l̂_i = l_i^(n_m) for every i ≤ k provided that m ≥ j_k. Thus, for all m ≥ j_k, the optimality of the truncated optimal prefix codes implies

Σ_{i=1}^{k} p_i l̂_i = Σ_{i=1}^{k} p_i l_i^(n_m) ≤ Σ_{i=1}^{n_m} p_i l_i^(n_m) = S_{n_m} Σ_{i=1}^{n_m} p_i^(n_m) l_i^(n_m) ≤ S_{n_m} Σ_{i=1}^{n_m} p_i^(n_m) l_i = Σ_{i=1}^{n_m} p_i l_i ≤ Σ_{i=1}^{∞} p_i l_i.

Letting k → ∞, we obtain

Σ_{i=1}^{∞} p_i l̂_i ≤ Σ_{i=1}^{∞} p_i l_i.   (1)

This implies that the infinite prefix code Ĉ is optimal.

To prove Part (2) of the theorem, notice that by the optimality of the truncated optimal prefix codes

Σ_{i=1}^{n} p_i l_i^(n) = S_n Σ_{i=1}^{n} p_i^(n) l_i^(n) ≤ S_n Σ_{i=1}^{n} p_i^(n) l_i^(n+1) = Σ_{i=1}^{n} p_i l_i^(n+1) ≤ Σ_{i=1}^{n+1} p_i l_i^(n+1).

The sequence Σ_{i=1}^{n} p_i l_i^(n) is thus an increasing sequence that is bounded above by H(X) + 2 and therefore has a limit. It follows from the derivation of (1) that

Σ_{i=1}^{∞} p_i l̂_i ≤ lim_{m→∞} Σ_{i=1}^{n_m} p_i l_i^(n_m) = lim_{n→∞} Σ_{i=1}^{n} p_i l_i^(n).

Next, by the optimality of the truncated optimal prefix codes,

lim_{n→∞} Σ_{i=1}^{n} p_i l_i^(n) = lim_{n→∞} S_n Σ_{i=1}^{n} p_i^(n) l_i^(n) ≤ lim_{n→∞} S_n Σ_{i=1}^{n} p_i^(n) l̂_i = Σ_{i=1}^{∞} p_i l̂_i.

Thus lim_{n→∞} Σ_{i=1}^{n} p_i l_i^(n) = Σ_{i=1}^{∞} p_i l̂_i, which by Part (1) is the shortest average code word length for X. This proves the second part of the theorem.

Finally, we prove Part (3) of the theorem. Let the code word lengths of an optimal prefix code be denoted l₁ ≤ l₂ ≤ ..., and assume to the contrary that the Kraft inequality is strict, i.e., Σ_i d^{−l_i} < 1. Let δ = 1 − Σ_i d^{−l_i} > 0.

Then there exists a positive integer k such that d^{−l_i} < δ for all i ≥ k. Let j be an integer such that l_j > l_k. Define a collection of integers l̂₁, l̂₂, l̂₃, ... by l̂_i = l_i for all i ≠ j and l̂_j = l_k. Then

Σ_{i=1}^{∞} d^{−l̂_i} = Σ_{i=1}^{∞} d^{−l_i} − d^{−l_j} + d^{−l_k} < Σ_{i=1}^{∞} d^{−l_i} + δ = 1.

Thus the integers l̂₁, l̂₂, l̂₃, ... satisfy Kraft's inequality, so there exists a prefix code having them as code word lengths. Since l̂_j < l_j, such a prefix code has a strictly smaller average code word length for X than the optimal prefix code whose code word lengths are l₁, l₂, l₃, .... This is a contradiction.

Similarly, we easily obtain the following Theorems 3.3 and 3.4; the details of their proofs are omitted here.

THEOREM 3.3 Let I = (S, P) be a finite information source. Then all Huffman coding schemes for I are optimal maximal prefix coding schemes. Conversely, optimal maximal prefix coding schemes need not be Huffman coding schemes.

THEOREM 3.4 Let X be a random variable with a countably infinite set of possible outcomes and with finite entropy. Then for every r > 1 the following hold:
(1) There exists a sequence of r-ary truncated optimal maximal prefix codes for X which converges to an optimal maximal prefix code for X.
(2) The average code word lengths in any sequence of r-ary truncated optimal maximal prefix codes converge to the shortest average code word length for X.
(3) Any r-ary optimal maximal prefix code for X must satisfy the Kraft inequality with equality.

4 OPTIMAL PREFIX AND OPTIMAL MAXIMAL PREFIX CODES

Now let us come to the relationship between optimal prefix codes and optimal maximal prefix codes. First, we give an example which shows that the connection between optimal prefix codes and optimal maximal prefix codes is quite complicated. Let I = (S = {A, B, C, D}, P = {0.9, 0.04, 0.03, 0.03}) be a given information source. Consider a maximal prefix coding scheme f: S → C₁ = {00, 01, 11, 10} and two prefix coding schemes g: S → C₂ = {1, 001, 011, 010} and h: S → C₃ = {1, c₂, c₃, c₄} over the alphabet {0, 1}, where c₂, c₃, c₄ are three longer words beginning with 0. By the definitions it is easy to verify that C₁ is a maximal prefix code and that C₂ and C₃ are both prefix codes but not maximal. The average code word lengths of C₁, C₂, and C₃ are 2, 1.2, and 2, respectively, while the average code word length of a Huffman code for this information source is 1.16. Therefore C₁ is not an optimal maximal prefix code, and neither C₂ nor C₃ is an optimal prefix code. The average code word length of a prefix code may be less than or equal to that of a maximal prefix code; it can also be greater. We aim to discover the relationship between optimal prefix codes and optimal maximal prefix codes. In general, we have the following problem.

Problem 4.1 Does the class of optimal prefix codes coincide with the class of optimal maximal prefix codes?
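The average code word lengths quoted in the example above are easily rechecked (a sketch):

P = [0.9, 0.04, 0.03, 0.03]

def avg_len(lengths, probs):
    return sum(l * p for l, p in zip(lengths, probs))

print(avg_len([2, 2, 2, 2], P))   # C1 = {00, 01, 11, 10}: 2.0
print(avg_len([1, 3, 3, 3], P))   # C2: 1.2
print(avg_len([1, 2, 3, 3], P))   # Huffman code word lengths for this source: 1.16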

Firstly, from the definitions Theorem 4.1 follows immediately; that is, every optimal maximal prefix code is an optimal prefix code.

THEOREM 4.1 Optimal maximal prefix codes have to be optimal prefix codes.

Proof Let (S, P) be a given finite information source. Assume that C is an optimal maximal prefix code and that S → D is any prefix coding scheme. By Theorem 3.1, there is a Huffman code H such that the average code word length of C is the same as that of H (a Huffman code is itself a maximal prefix code [16] and is optimal among all prefix codes, so its average code word length can be neither larger nor smaller than that of C). Therefore the average code word length of the prefix code D is greater than or equal to that of H, namely that of C. This shows that C is an optimal prefix code, and consequently optimal maximal prefix codes are contained in optimal prefix codes.

Some results related to the above Problem 4.1 are given below.

THEOREM 4.2 Let (S, P) be a given finite information source such that the number of letters in the alphabet S is 2. Then optimal prefix codes coincide with optimal maximal prefix codes.

Proof For simplicity, we only consider optimal prefix codes over the alphabet {0, 1}. Assume that C = {c₁, c₂} is an optimal prefix code with l(c₁) = l₁, l(c₂) = l₂, and l₁ ≤ l₂, and suppose that P = {p₁, p₂} with p₁ ≥ p₂. By Theorem 3.1, there exists a Huffman code D = {d₁, d₂} with l(d₁) = r₁, l(d₂) = r₂, and r₁ ≤ r₂ which is also an optimal prefix code. Therefore the average code word length of C is equal to that of D, i.e., p₁l₁ + p₂l₂ = p₁r₁ + p₂r₂. Since a Huffman code is a maximal prefix code [16], by Proposition 3.8 of Chapter II in Ref. [2] (p. 102) we have 1/2^{r₁} + 1/2^{r₂} = 1. As r₁ ≤ r₂ are positive integers, r₁ = r₂ = 1. But p₁l₁ + p₂l₂ = p₁r₁ + p₂r₂ = p₁ + p₂ = 1, so l₁ = l₂ = 1, and consequently 1/2^{l₁} + 1/2^{l₂} = 1. Again by Proposition 3.8 of Chapter II in Ref. [2] (p. 102), C is a maximal prefix code, and clearly C is an optimal maximal prefix code.

Conversely, assume that C = {c₁, c₂} is an optimal maximal prefix code over the alphabet {0, 1}. It is easy to verify that C = {c₁, c₂} is a maximal prefix code if and only if C = {0, 1}. Therefore C = {0, 1} is a Huffman code and consequently an optimal prefix code. That is, an optimal maximal prefix code has to be an optimal prefix code.

Similarly, we have Theorem 4.3 below.

THEOREM 4.3 Let (S, P) be a given finite information source such that the number of letters in the alphabet S is 3. Then optimal prefix codes coincide with optimal maximal prefix codes.

Proof It suffices to verify that Theorem 4.3 is true for the alphabet {0, 1}. Let C = {c₁, c₂, c₃} be an optimal prefix code with l(c₁) = l₁, l(c₂) = l₂, l(c₃) = l₃, and l₁ ≤ l₂ ≤ l₃, and suppose that P = {p₁, p₂, p₃} with p₁ ≥ p₂ ≥ p₃. Making use of Theorem 3.1, there exists a Huffman code D = {d₁, d₂, d₃} with l(d₁) = r₁, l(d₂) = r₂, l(d₃) = r₃, and r₁ ≤ r₂ ≤ r₃ which is also an optimal prefix code. Hence the average code word length of C equals that of D, i.e., p₁l₁ + p₂l₂ + p₃l₃ = p₁r₁ + p₂r₂ + p₃r₃. Since a Huffman code is a maximal prefix code [15], by Proposition 3.8 of Chapter II in Ref. [2] (p. 102), 1/2^{r₁} + 1/2^{r₂} + 1/2^{r₃} = 1. Since r₁ ≤ r₂ ≤ r₃ are positive integers, r₁ = 1 and r₂ = r₃ = 2; otherwise, if r₁ > 1, this would contradict 1/2^{r₁} + 1/2^{r₂} + 1/2^{r₃} = 1.
But p₁l₁ + p₂l₂ + p₃l₃ = p₁r₁ + p₂r₂ + p₃r₃ and l₁ ≤ l₂ ≤ l₃, so if l₁ > 1 then p₁l₁ + p₂l₂ + p₃l₃ ≥ 2(p₁ + p₂ + p₃) = 2 > p₁r₁ + p₂r₂ + p₃r₃, which is impossible. Thus we get l₁ = 1.

Similarly, we have l₂ = l₃ = 2. Clearly, 1/2^{l₁} + 1/2^{l₂} + 1/2^{l₃} = 1. Again by Proposition 3.8 of Chapter II in Ref. [2] (p. 102), C is a maximal prefix code, and clearly C is an optimal maximal prefix code.

Conversely, suppose that C = {c₁, c₂, c₃} is an optimal maximal prefix code over the alphabet {0, 1}. By Lemma 1.2 in Ref. [18], there exist two positive integers p and q such that {0^p, 1^q} ⊆ C = {c₁, c₂, c₃}. (i) If p ≥ 2 and q ≥ 2, then at least one of the words 01 and 10 can be added to C while keeping it a prefix code; this contradicts C = {c₁, c₂, c₃} being a maximal prefix code. (ii) If p = 1 and q > 2, or p > 2 and q = 1, then {0, 10, 110, 1^q} or {1, 01, 001, 0^p}, respectively, is a prefix code, and since C has only three code words one of the listed words can again be added to C; this is also impossible. Therefore p = 1 and q = 2, or p = 2 and q = 1. It is easy to verify that there are exactly two such maximal prefix codes, C = {0, 10, 11} and C = {1, 01, 00}, and that they are Huffman codes for the information source. Clearly, both of them are optimal prefix codes. Thus, an optimal maximal prefix code has to be an optimal prefix code.

In general, for information source alphabets with not more than four letters, we further get the following result.

THEOREM 4.4 Let (S, P) be a given finite information source such that the number of letters in the alphabet S is 4. Then every optimal prefix code has to be maximal.

Proof For simplicity, we only show that Theorem 4.4 is true for the alphabet {0, 1}. Let C = {c₁, c₂, c₃, c₄} be an optimal prefix code with l(c_i) = l_i and l₁ ≤ l₂ ≤ l₃ ≤ l₄, and suppose that P = {p₁, p₂, p₃, p₄} with p₁ ≥ p₂ ≥ p₃ ≥ p₄. By Theorem 3.1, there exists a Huffman code D = {d₁, d₂, d₃, d₄} with l(d_i) = r_i and r₁ ≤ r₂ ≤ r₃ ≤ r₄ which is also an optimal prefix code. Hence the average code word length of C equals that of D, i.e., p₁l₁ + p₂l₂ + p₃l₃ + p₄l₄ = p₁r₁ + p₂r₂ + p₃r₃ + p₄r₄. Since a Huffman code is a maximal prefix code [16], by Proposition 3.8 of Chapter II in Ref. [2] (p. 102), 1/2^{r₁} + 1/2^{r₂} + 1/2^{r₃} + 1/2^{r₄} = 1. From this equality it easily follows that either r₁ = 1, r₂ = 2, r₃ = r₄ = 3, or r₁ = r₂ = r₃ = r₄ = 2. We treat these two cases separately.

When r₁ = 1, r₂ = 2, and r₃ = r₄ = 3, since D is a Huffman code we have p₁ ≥ p₃ + p₄. From p₁l₁ + p₂l₂ + p₃l₃ + p₄l₄ = p₁ + 2p₂ + 3p₃ + 3p₄, p₁ ≥ p₃ + p₄, and 1/2^{l₁} + 1/2^{l₂} + 1/2^{l₃} + 1/2^{l₄} ≤ 1, it follows that either l₁ = 1, l₂ = 2, l₃ = l₄ = 3, or l₁ = l₂ = l₃ = l₄ = 2. In either case, by Proposition 3.8 of Chapter II in Ref. [2] (p. 102), we easily obtain 1/2^{l₁} + 1/2^{l₂} + 1/2^{l₃} + 1/2^{l₄} = 1 and consequently that C is a maximal prefix code.

When r₁ = r₂ = r₃ = r₄ = 2, since D is a Huffman code we have p₁ ≤ p₃ + p₄, and similarly either l₁ = 1, l₂ = 2, l₃ = l₄ = 3, or l₁ = l₂ = l₃ = l₄ = 2. Although the verification of this statement is somewhat long, the method is simple and completely analogous to the case l₁ = 1, l₂ = 2, l₃ = l₄ = 3. In fact, if l₁ = 1 and l₂ = l₃ = 2, then for any value of l₄ we would have 1/2^{l₁} + 1/2^{l₂} + 1/2^{l₃} + 1/2^{l₄} > 1, a contradiction.
If l₁ = 1, l₂ = 2, l₃ = 3 + i, l₄ = 3 + j with i + j > 0, then from 2(p₁ + p₂ + p₃ + p₄) = p₁ + 2p₂ + (3 + i)p₃ + (3 + j)p₄ we get p₁ = (1 + i)p₃ + (1 + j)p₄ > p₃ + p₄, which is impossible because p₁ ≤ p₃ + p₄. If l₁ = 1, l₂ = 3 + k, l₃ = 3 + i, l₄ = 3 + j, then from 2(p₁ + p₂ + p₃ + p₄) = p₁ + (3 + k)p₂ + (3 + i)p₃ + (3 + j)p₄ we get p₁ = (1 + k)p₂ + (1 + i)p₃ + (1 + j)p₄ > p₃ + p₄, which again contradicts p₁ ≤ p₃ + p₄. Combining the above discussion, if l₁ = 1 then l₂ = 2 and l₃ = l₄ = 3. Similarly, if l₁ = 2 + h, l₂ = 2 + k, l₃ = 2 + i, l₄ = 2 + j with 0 ≤ h ≤ k ≤ i ≤ j and h + k + i + j ≠ 0, then from 2(p₁ + p₂ + p₃ + p₄) = (2 + h)p₁ + (2 + k)p₂ + (2 + i)p₃ + (2 + j)p₄ we get hp₁ + kp₂ + ip₃ + jp₄ = 0, which is also impossible. This shows that if l₁ ≥ 2 then l₁ = l₂ = l₃ = l₄ = 2.

When l₁ = 1, l₂ = 2, l₃ = l₄ = 3 or l₁ = l₂ = l₃ = l₄ = 2, we clearly have 1/2^{l₁} + 1/2^{l₂} + 1/2^{l₃} + 1/2^{l₄} = 1. Again by Proposition 3.8 of Chapter II in Ref. [2] (p. 102), C is a maximal prefix code.

It is very interesting that the word "optimal" concerns the economy of a prefix code. As seen in Ref. [10], if C is a maximal prefix code then every code word occurs as part of a message, hence no part of the set of all words over the alphabet is wasted. For every optimal prefix coding there is an optimal maximal prefix coding with the same average code word length, a property that common prefix coding schemes do not have. Therefore, we conjecture that Problem 4.1 has an affirmative answer in general, that is, that the optimal prefix codes coincide with the optimal maximal prefix codes. Combining the above discussion, Figure 1 illustrates the relationships among optimal prefix codes, optimal maximal prefix codes, and Huffman codes.

FIGURE 1 Optimal prefix codes and Huffman codes.
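The proofs of Theorems 4.2-4.4 repeatedly use the criterion that a finite prefix code over a d-letter alphabet is maximal exactly when its Kraft sum equals 1 (Proposition 3.8 of Ref. [2]). A sketch of that test (the input is assumed to already be a prefix code; the function name is our own):

from fractions import Fraction

def is_maximal_prefix_code(prefix_code, d=2):
    return sum(Fraction(1, d ** len(w)) for w in prefix_code) == 1

print(is_maximal_prefix_code(["00", "01", "10", "11"]))    # True: C1 of the example in Section 4
print(is_maximal_prefix_code(["1", "001", "011", "010"]))  # False: C2 is a prefix code but not maximal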

5 APPLICATION TO DATA COMPRESSION

As the simplest example, consider the special file A³B⁴A⁹⁰B³ over the alphabet {A, B}. Regardless of the probabilities, Huffman coding will assign a single bit to each of the letters A and B, giving no compression, so the encoded file is 100 bits. But if we take a prefix coding such that A³B⁴A⁸⁹ → 1, AB → 01, and BB → 00, where {1, 01, 00} is clearly a prefix code, then the encoded file is 5 bits, a compression ratio of 100/5 = 20:1. In particular, prefix coding on a two-letter alphabet can yield compression. Therefore, prefix coding can be used for data compression, and prefix coding and Huffman coding achieve different compression ratios.

For example, we will encode the file M: STATUS REPORT ON THE FIRST ROUND OF THE DEVELOPMENT OF THE ADVANCED ENCRYPTION STANDARD. By traditional statistical modeling, the average code word length of the block code is 5 bits/symbol and the average code word length of a Huffman code is 342/87 bits/symbol. The file encoded by the block code and by the Huffman code therefore takes up 87 × 5 = 435 bits and 342 bits, respectively, so the compression ratio is 435/342 = 1.27:1.

Additionally, using a dictionary method we can encode the file M by a prefix code different from the Huffman code. First a prefix code is generated, for instance C = {0, 101, 1000, 1001, 11000, 11001, 11010, 11011, 11100, 11101, 11110, 11111}. Using Table III, we easily calculate that the encoded file takes up 1 × 13 + 3 × 3 + 4 × 2 + 4 × 1 + 5 × 8 = 74 bits, so the compression ratio is 435/74 = 5.87:1.

TABLE III A prefix coding for the file M.

Words of the file M   Prefix code
(space)               0
THE                   101
OF                    1000
DEVELOPMENT           1001
ENCRYPTION
STANDARD
ADVANCED
STATUS
REPORT
FIRST
ROUND
ON                    11111

Furthermore, it easily follows that there exist many different prefix codes with completely different compression ratios. For instance, with the coding of Table IV the encoded file takes up 92 bits, so the compression ratio is 435/92 = 4.73:1.

TABLE IV Another prefix coding for the file M.

Words of the file M   Prefix code
(space)               000
DEVELOPMENT           001
ENCRYPTION            100
THE                   110
STANDARD              011
ADVANCED              0100
STATUS                0101
REPORT                1011
FIRST                 1110
ROUND                 1111
OF
ON

In general, compression methods based on strings of symbols can be more efficient than methods that compress individual symbols [21].
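The bit counts quoted above for the file M can be reproduced directly from the code word lengths of Table III (a sketch; the helper names are our own):

FILE_M = ("STATUS REPORT ON THE FIRST ROUND OF THE DEVELOPMENT "
          "OF THE ADVANCED ENCRYPTION STANDARD")

def encoded_bits(code_lengths):
    # every word is replaced by its code word, and every space by the code word for (space)
    words = FILE_M.split(" ")
    return (len(words) - 1) * code_lengths["(space)"] + sum(code_lengths[w] for w in words)

table_iii = {"(space)": 1, "THE": 3, "OF": 4, "DEVELOPMENT": 4}
for w in ["ENCRYPTION", "STANDARD", "ADVANCED", "STATUS", "REPORT", "FIRST", "ROUND", "ON"]:
    table_iii[w] = 5

print(encoded_bits(table_iii))                     # 74 bits, giving the ratio 435/74 quoted above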

Compared with Huffman coding, the code words in the above prefix codes do not depend on the frequencies of occurrence of all the symbols in the alphabet. When applied to dictionary methods, this prefix coding is a compression algorithm that is remarkably simple as well as fast, although it does not achieve optimal compression. In particular, it is very good at improving the security of Chinese or Japanese text-storage and retrieval systems [12]. On the other hand, one of the most difficult problems of the above solution is how to generate a prefix code. Fortunately, an efficient algorithm for generating a prefix code is given below.

Generating a prefix code C
Input: An alphabet S.
Output: A prefix code C.
Step 1: Set C = ∅, the number of code words in C, m = 0, and the code-checking counter t = 0.
Step 2: Repeat Steps 3 through 6 until m > total_code or t > total_code.
Step 3: Randomly generate an integer n as the length of a word, 1 ≤ n ≤ max_length; set t = t + 1.
Step 4: Randomly select a word w of length n from S*.
Step 5: Compare w with each word in C; if there is a prefix relation between w and a word in C, go to Step 3. Otherwise continue.
Step 6: Set C = C ∪ {w} (add the word w to C), m = m + 1, t = t + 1, and go to Step 3.

A maximal prefix code can be generated in a completely similar way [23]. In fact, we have

THEOREM 5.1 Let S be a finite alphabet and n any positive integer. The above algorithm generates a prefix code C containing n code words in time polynomial in n (i.e., O(n³)).

Proof Consider a word w = a₁a₂···aₙ of length n. Its proper prefixes are w₀ = λ, w₁ = a₁, w₂ = a₁a₂, ..., w_{n−1} = a₁a₂···a_{n−1}, that is, n words in total, so checking whether another word is in a prefix relation with w takes at most n comparisons. At first let C = {w}; then we obtain a two-word prefix code from C = {w} in at most n steps (comparing the new candidate with w, we add it to C if there is no prefix relation between the two). Similarly, we obtain a three-word prefix code from the two-word one in at most 2n steps. Continuing in this way, we obtain a prefix code of n code words from one of n − 1 code words in at most (n − 1)n steps. Therefore a prefix code containing n code words is generated in at most n + 2n + 3n + ··· + (n − 1)n = (n³ − n²)/2 steps. This shows that the above algorithm generates a prefix code C containing n code words in polynomial time.

COROLLARY 5.1 There is an efficient algorithm generating a maximal prefix code.

We also have the following Theorem 5.2 and Corollary 5.2.

THEOREM 5.2 Let S be an alphabet and L any finite language over S. Then we can efficiently decide whether or not L is a prefix code over S.

COROLLARY 5.2 Let S be an alphabet and L any finite language over S. Then we can efficiently decide whether or not L is a maximal prefix code over S.
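A minimal Python sketch of the generation procedure above (the parameters total_code and max_length follow the pseudocode; the attempt bound and all other names are our own choices):

import random

def generate_prefix_code(alphabet, total_code, max_length, seed=None):
    rng = random.Random(seed)
    C, attempts = [], 0
    while len(C) < total_code and attempts < 1000 * total_code:
        attempts += 1
        n = rng.randint(1, max_length)                          # Step 3: random word length
        w = "".join(rng.choice(alphabet) for _ in range(n))     # Step 4: random word of length n
        if any(w.startswith(c) or c.startswith(w) for c in C):  # Step 5: prefix relation found, reject
            continue
        C.append(w)                                             # Step 6: add w to C
    return C

print(generate_prefix_code("01", total_code=6, max_length=5, seed=1))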

6 BREAKING AN OPTIMAL PREFIX CODE

Rubin [20] and Jones [11] discuss ways in which data compression algorithms such as Huffman's algorithm may be used as encryption techniques. Klein et al. [12] have considered the aspect of using Huffman codes also as an encryption method, motivated by an application to storing a large textual database on a CD-ROM: the text of the database had not only to be compressed, but also to be encrypted to prevent illegal use of copyrighted material. Fraenkel and Klein [5] have shown that the problem of finding the encoding rule, given both a sample of the source stream (or original file) and the corresponding sample of the encoded file, is NP-complete. Gillman et al. [7] recently examined the problem of deciphering a file that has been Huffman coded, but not otherwise encrypted, and found that a Huffman code can be surprisingly difficult to cryptanalyze. Motivated by the same problem, in this section we investigate the problem of cryptanalyzing a message that has been compressed using optimal prefix coding but not otherwise encrypted. Making use of the results of Ref. [5], we easily have

THEOREM 6.1 Given an original file and a corresponding file encoded by an optimal prefix coding, the complexity of guessing the optimal prefix code is NP-complete.

Proof Since an optimal prefix code has to be a prefix code, one should first examine whether a candidate code is a prefix code in order to decide whether it is an optimal prefix code. Therefore, given an original file and a corresponding file encoded by an optimal prefix coding, guessing the optimal prefix code is at least as hard as guessing the prefix code. According to Fraenkel and Klein's results [5], the complexity of the former is therefore at least NP-complete.

As we have seen from Huffman coding, the complexity of the above problem is based on traditional statistical modeling, i.e., the coding scheme views the source file as consisting of a sequence of letters selected from an alphabet. When dictionary methods are applied, however, the problem of breaking an optimal prefix code becomes much more difficult. Now, in a different way, we consider the problem of breaking an optimal prefix code. First, we consider it for a model where every word or run of identical words is encoded individually. Given an original file or plaintext M = w₁w₂···wₙ, where the w_i are the elements to be encoded, and a ciphertext or encoded file M', the opponent knows that M' is a maximal prefix encoding of M; that is, he knows M and M', and there is a partition of M' into code words c(w₁), c(w₂), ..., c(wₙ) satisfying the following two conditions:

(i) The set of different code words in the sequence {c(w₁), c(w₂), ..., c(wₙ)} is a maximal prefix code.
(ii) The encoding defined by the sequence is consistent, that is, c(w_i) = c(w_j) if and only if w_i = w_j, for all 1 ≤ i, j ≤ n.

The opponent's objective is to find the maximal prefix code, i.e., the function c(·). Again, by Fraenkel and Klein's results [5], we easily show that the complexity of this problem of guessing a maximal prefix code is also NP-complete. The following is a simpler and more direct argument for the above problem; furthermore, from the proof of Theorem 6.2 it immediately follows that the complexity of guessing a prefix code is NP-complete.

THEOREM 6.2 Let M' be a file encoded by an optimal prefix coding and let the length of M' be m.
Then M' is encoded by at most 2^{m−1} optimal prefix codes.
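Before the proof, the quantity bounded here, the number D_m of ways to split an encoded file of length m into consecutive code words, can be checked by brute force for small m (a sketch):

def count_segmentations(m):
    if m == 0:
        return 1
    # D_m = D_{m-1} + D_{m-2} + ... + D_1 + 1, as derived in the proof below
    return sum(count_segmentations(m - k) for k in range(1, m)) + 1

for m in range(1, 8):
    print(m, count_segmentations(m), 2 ** (m - 1))   # the two counts agree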

Proof Let M' = a₁a₂a₃···a_m, a_i ∈ Σ, i = 1, ..., m. Suppose that we divide M' into c₁c₂···c_k, i.e., M' = c₁c₂···c_k, such that {c₁, c₂, ..., c_k} is an optimal prefix code. Since the length l(c_j) of c_j satisfies 1 ≤ l(c_j) ≤ m, the c_j (j = 1, ..., k) can be any sub-words of the word M' [2, 10, 23]. Suppose that the number of sets {c₁, c₂, ..., c_k} satisfying M' = c₁c₂···c_k is D_m. Then the number of sets {c₁, c₂, ..., c_k} such that a₂a₃···a_m = c₁c₂···c_k is D_{m−1}. Repeating the above argument, the number of sets {c₁, c₂, ..., c_k} with a_i a_{i+1}···a_m = c₁c₂···c_k is D_{m−i}. According to the choice of c₁, we easily obtain that D_m = D_{m−1} + D_{m−2} + ··· + D_{m−(m−1)} + 1. Therefore,

D_m = D_{m−1} + D_{m−2} + ··· + D_1 + 1
    = (D_{m−2} + ··· + D_1 + 1) + (D_{m−2} + ··· + D_1 + 1)
    = 2(D_{m−2} + ··· + D_1 + 1)
    = 2²(D_{m−3} + ··· + D_1 + 1)
    = ··· = 2^{m−2}(D_1 + 1) = 2^{m−1},

where it is easy to verify that D_1 = 1. According to Theorem 6.2, seeking all possible optimal prefix codes that could have been used to encode the original message is an NP-complete problem.

7 CONCLUSION

One disadvantage of Huffman coding [22] is that it makes two passes over the data: one pass to collect frequency counts of the letters in the file, followed by the construction of a Huffman tree and transmission of the tree to the receiver, and a second pass to encode and transmit the symbols themselves, based on the Huffman tree. This causes delay when used for network communication, and in file compression applications the extra disk accesses can slow down the scheme. Compared with Huffman coding, prefix coding need not collect frequency counts of the letters in the file, and an algorithm efficiently generating a prefix code was given in Section 5; this makes the prefix coding scheme easy to implement. On the other hand, as we have seen from Ref. [12], Huffman codes are well suited for use in a large information retrieval system. Important for a large information retrieval system is the issue of the cryptographic security of storing the text in compressed form, as might be required for copyrighted material. In the usual approach to full-text retrieval, the processing of queries does not directly involve the original text files (in which key words may be located using some pattern-matching technique), but rather the auxiliary dictionary and concordance files. The dictionary is the list of all the different words appearing in the text and is usually ordered alphabetically. The concordance contains, for every word of the dictionary, the lexicographically ordered list of references to all its occurrences in the text; it is accessed via the dictionary, which contains for every word a pointer to the corresponding list in the concordance. In particular, a prefix code based on the words of the original file is suitable for storing these auxiliary dictionary and concordance files. An optimal prefix code is used not only for statistical modeling but also for dictionary methods. Although theoretical results on optimal prefix codes have been given, much research still needs to be done; in particular, the implementation of the optimal prefix coding scheme will be one of our future topics.

References

[1] Abrahams, J. (1994). Huffman-type codes for infinite source distributions. J. Franklin Inst., 331B(3).
[2] Berstel, J. and Perrin, D. (1985). Theory of Codes. Academic Press, Orlando.
[3] Bell, T. C., Cleary, J. G. and Witten, I. H. (1990). Text Compression. Prentice Hall, Englewood Cliffs, NJ.
[4] Cover, T. and Thomas, J. (1991). Elements of Information Theory. Wiley, New York.

[5] Fraenkel, A. S. and Klein, S. T. (1994). Complexity aspects of guessing prefix codes. Algorithmica, 12.
[6] Gallager, R. G. and Van Voorhis, D. C. (1975). Optimal source coding for geometrically distributed integer alphabets. IEEE Trans. Inform. Theory, IT-21(3).
[7] Gillman, D. W., Mohtashemi, M. and Rivest, R. L. (1996). On breaking a Huffman code. IEEE Trans. Inform. Theory, IT-42(3).
[8] Huffman, D. A. (1952). A method for the construction of minimum redundancy codes. Proc. IRE, 40.
[9] Hankerson, D., Harris, G. A. and Johnson, P. D., Jr. (1997). Introduction to Information Theory and Data Compression. CRC Press.
[10] Jürgensen, H. and Konstantinidis, S. (1997). Codes. In: Rozenberg, G. and Salomaa, A. (Eds.), Handbook of Formal Languages, Vol. 1. Springer-Verlag, Berlin, Heidelberg.
[11] Jones, D. W. (1988). Applications of splay trees to data compression. Communications of the ACM, 31.
[12] Klein, S. T., Bookstein, A. and Deerwester, S. (1989). Storing text-retrieval systems on CD-ROM: Compression and encryption considerations. ACM Trans. Inform. Syst., 7.
[13] Lei, S. M. and Sun, M. T. (1991). An entropy coding system for digital HDTV applications. IEEE Trans. Circuits and Systems for Video Technology, 1.
[14] Linder, T., Tarokh, V. and Zeger, K. (1997). Existence of optimal prefix codes for infinite source alphabets. IEEE Trans. Inform. Theory, 43(6).
[15] Long, D. and Jia, W. (2001). Optimal maximal encoding different from Huffman encoding. Proc. International Conference on Information Technology: Coding and Computing, 2-4 April 2001, Las Vegas, IEEE Computer Society.
[16] Long, D. and Jia, W. (2000). The optimal encoding schemes. Proceedings of the 16th World Computer Congress, August 2000, Beijing, China, IFIP/SEC2000: Information Security, International Academic Publishers.
[17] Montgomery, B. and Abrahams, J. (1987). On the redundancy of optimal binary prefix-condition codes for finite and infinite sources. IEEE Trans. Inform. Theory, IT-33(1).
[18] Pennebaker, W. B. and Mitchell, J. L. (1993). JPEG: Still Image Data Compression Standard. Van Nostrand Reinhold, New York.
[19] Roman, S. (1996). Introduction to Coding and Information Theory. Springer-Verlag, New York.
[20] Rubin, F. (1979). Cryptographic aspects of data compression codes. Cryptologia, 3.
[21] Salomon, D. (1998). Data Compression: The Complete Reference. Springer.
[22] Vitter, J. S. (1987). Design and analysis of dynamic Huffman codes. Journal of the Association for Computing Machinery, 34(4).
[23] Shyr, H. J. (1991). Free Monoids and Languages. Hon Min Book Company, Taichung, Taiwan, R.O.C.


More information

On competition numbers of complete multipartite graphs with partite sets of equal size. Boram PARK, Suh-Ryung KIM, and Yoshio SANO.

On competition numbers of complete multipartite graphs with partite sets of equal size. Boram PARK, Suh-Ryung KIM, and Yoshio SANO. RIMS-1644 On competition numbers of complete multipartite graphs with partite sets of equal size By Boram PARK, Suh-Ryung KIM, and Yoshio SANO October 2008 RESEARCH INSTITUTE FOR MATHEMATICAL SCIENCES

More information

Constructing arbitrarily large graphs with a specified number of Hamiltonian cycles

Constructing arbitrarily large graphs with a specified number of Hamiltonian cycles Electronic Journal of Graph Theory and Applications 4 (1) (2016), 18 25 Constructing arbitrarily large graphs with a specified number of Hamiltonian cycles Michael School of Computer Science, Engineering

More information

Infinite locally random graphs

Infinite locally random graphs Infinite locally random graphs Pierre Charbit and Alex D. Scott Abstract Motivated by copying models of the web graph, Bonato and Janssen [3] introduced the following simple construction: given a graph

More information

AXIOMS FOR THE INTEGERS

AXIOMS FOR THE INTEGERS AXIOMS FOR THE INTEGERS BRIAN OSSERMAN We describe the set of axioms for the integers which we will use in the class. The axioms are almost the same as what is presented in Appendix A of the textbook,

More information

A step towards the Bermond-Thomassen conjecture about disjoint cycles in digraphs

A step towards the Bermond-Thomassen conjecture about disjoint cycles in digraphs A step towards the Bermond-Thomassen conjecture about disjoint cycles in digraphs Nicolas Lichiardopol Attila Pór Jean-Sébastien Sereni Abstract In 1981, Bermond and Thomassen conjectured that every digraph

More information

Lecture 15. Error-free variable length schemes: Shannon-Fano code

Lecture 15. Error-free variable length schemes: Shannon-Fano code Lecture 15 Agenda for the lecture Bounds for L(X) Error-free variable length schemes: Shannon-Fano code 15.1 Optimal length nonsingular code While we do not know L(X), it is easy to specify a nonsingular

More information

Digital Communication Prof. Bikash Kumar Dey Department of Electrical Engineering Indian Institute of Technology, Bombay

Digital Communication Prof. Bikash Kumar Dey Department of Electrical Engineering Indian Institute of Technology, Bombay Digital Communication Prof. Bikash Kumar Dey Department of Electrical Engineering Indian Institute of Technology, Bombay Lecture - 26 Source Coding (Part 1) Hello everyone, we will start a new module today

More information

Compressing Data. Konstantin Tretyakov

Compressing Data. Konstantin Tretyakov Compressing Data Konstantin Tretyakov (kt@ut.ee) MTAT.03.238 Advanced April 26, 2012 Claude Elwood Shannon (1916-2001) C. E. Shannon. A mathematical theory of communication. 1948 C. E. Shannon. The mathematical

More information

Algorithmic Aspects of Acyclic Edge Colorings

Algorithmic Aspects of Acyclic Edge Colorings Algorithmic Aspects of Acyclic Edge Colorings Noga Alon Ayal Zaks Abstract A proper coloring of the edges of a graph G is called acyclic if there is no -colored cycle in G. The acyclic edge chromatic number

More information

Notes for Lecture 24

Notes for Lecture 24 U.C. Berkeley CS170: Intro to CS Theory Handout N24 Professor Luca Trevisan December 4, 2001 Notes for Lecture 24 1 Some NP-complete Numerical Problems 1.1 Subset Sum The Subset Sum problem is defined

More information

T consists of finding an efficient implementation of access,

T consists of finding an efficient implementation of access, 968 IEEE TRANSACTIONS ON COMPUTERS, VOL. 38, NO. 7, JULY 1989 Multidimensional Balanced Binary Trees VIJAY K. VAISHNAVI A bstract-a new balanced multidimensional tree structure called a k-dimensional balanced

More information

Limitations of Algorithmic Solvability In this Chapter we investigate the power of algorithms to solve problems Some can be solved algorithmically and

Limitations of Algorithmic Solvability In this Chapter we investigate the power of algorithms to solve problems Some can be solved algorithmically and Computer Language Theory Chapter 4: Decidability 1 Limitations of Algorithmic Solvability In this Chapter we investigate the power of algorithms to solve problems Some can be solved algorithmically and

More information

COLORING EDGES AND VERTICES OF GRAPHS WITHOUT SHORT OR LONG CYCLES

COLORING EDGES AND VERTICES OF GRAPHS WITHOUT SHORT OR LONG CYCLES Volume 2, Number 1, Pages 61 66 ISSN 1715-0868 COLORING EDGES AND VERTICES OF GRAPHS WITHOUT SHORT OR LONG CYCLES MARCIN KAMIŃSKI AND VADIM LOZIN Abstract. Vertex and edge colorability are two graph problems

More information

The Structure of Bull-Free Perfect Graphs

The Structure of Bull-Free Perfect Graphs The Structure of Bull-Free Perfect Graphs Maria Chudnovsky and Irena Penev Columbia University, New York, NY 10027 USA May 18, 2012 Abstract The bull is a graph consisting of a triangle and two vertex-disjoint

More information

Lecture 17. Lower bound for variable-length source codes with error. Coding a sequence of symbols: Rates and scheme (Arithmetic code)

Lecture 17. Lower bound for variable-length source codes with error. Coding a sequence of symbols: Rates and scheme (Arithmetic code) Lecture 17 Agenda for the lecture Lower bound for variable-length source codes with error Coding a sequence of symbols: Rates and scheme (Arithmetic code) Introduction to universal codes 17.1 variable-length

More information

On vertex types of graphs

On vertex types of graphs On vertex types of graphs arxiv:1705.09540v1 [math.co] 26 May 2017 Pu Qiao, Xingzhi Zhan Department of Mathematics, East China Normal University, Shanghai 200241, China Abstract The vertices of a graph

More information

Crossing Families. Abstract

Crossing Families. Abstract Crossing Families Boris Aronov 1, Paul Erdős 2, Wayne Goddard 3, Daniel J. Kleitman 3, Michael Klugerman 3, János Pach 2,4, Leonard J. Schulman 3 Abstract Given a set of points in the plane, a crossing

More information

Math 302 Introduction to Proofs via Number Theory. Robert Jewett (with small modifications by B. Ćurgus)

Math 302 Introduction to Proofs via Number Theory. Robert Jewett (with small modifications by B. Ćurgus) Math 30 Introduction to Proofs via Number Theory Robert Jewett (with small modifications by B. Ćurgus) March 30, 009 Contents 1 The Integers 3 1.1 Axioms of Z...................................... 3 1.

More information

COMPSCI 650 Applied Information Theory Feb 2, Lecture 5. Recall the example of Huffman Coding on a binary string from last class:

COMPSCI 650 Applied Information Theory Feb 2, Lecture 5. Recall the example of Huffman Coding on a binary string from last class: COMPSCI 650 Applied Information Theory Feb, 016 Lecture 5 Instructor: Arya Mazumdar Scribe: Larkin Flodin, John Lalor 1 Huffman Coding 1.1 Last Class s Example Recall the example of Huffman Coding on a

More information

Fundamental Properties of Graphs

Fundamental Properties of Graphs Chapter three In many real-life situations we need to know how robust a graph that represents a certain network is, how edges or vertices can be removed without completely destroying the overall connectivity,

More information

Interval Algorithm for Homophonic Coding

Interval Algorithm for Homophonic Coding IEEE TRANSACTIONS ON INFORMATION THEORY, VOL 47, NO 3, MARCH 2001 1021 Interval Algorithm for Homophonic Coding Mamoru Hoshi, Member, IEEE, and Te Sun Han, Fellow, IEEE Abstract It is shown that the idea

More information

Introduction to Sets and Logic (MATH 1190)

Introduction to Sets and Logic (MATH 1190) Introduction to Sets and Logic () Instructor: Email: shenlili@yorku.ca Department of Mathematics and Statistics York University Dec 4, 2014 Outline 1 2 3 4 Definition A relation R from a set A to a set

More information

A study on the Primitive Holes of Certain Graphs

A study on the Primitive Holes of Certain Graphs A study on the Primitive Holes of Certain Graphs Johan Kok arxiv:150304526v1 [mathco] 16 Mar 2015 Tshwane Metropolitan Police Department City of Tshwane, Republic of South Africa E-mail: kokkiek2@tshwanegovza

More information

Lecture 4: September 11, 2003

Lecture 4: September 11, 2003 Algorithmic Modeling and Complexity Fall 2003 Lecturer: J. van Leeuwen Lecture 4: September 11, 2003 Scribe: B. de Boer 4.1 Overview This lecture introduced Fixed Parameter Tractable (FPT) problems. An

More information

Edge Colorings of Complete Multipartite Graphs Forbidding Rainbow Cycles

Edge Colorings of Complete Multipartite Graphs Forbidding Rainbow Cycles Theory and Applications of Graphs Volume 4 Issue 2 Article 2 November 2017 Edge Colorings of Complete Multipartite Graphs Forbidding Rainbow Cycles Peter Johnson johnspd@auburn.edu Andrew Owens Auburn

More information

Parameterized Complexity of Independence and Domination on Geometric Graphs

Parameterized Complexity of Independence and Domination on Geometric Graphs Parameterized Complexity of Independence and Domination on Geometric Graphs Dániel Marx Institut für Informatik, Humboldt-Universität zu Berlin, Unter den Linden 6, 10099 Berlin, Germany. dmarx@informatik.hu-berlin.de

More information

Preemptive Scheduling of Equal-Length Jobs in Polynomial Time

Preemptive Scheduling of Equal-Length Jobs in Polynomial Time Preemptive Scheduling of Equal-Length Jobs in Polynomial Time George B. Mertzios and Walter Unger Abstract. We study the preemptive scheduling problem of a set of n jobs with release times and equal processing

More information

FOUR EDGE-INDEPENDENT SPANNING TREES 1

FOUR EDGE-INDEPENDENT SPANNING TREES 1 FOUR EDGE-INDEPENDENT SPANNING TREES 1 Alexander Hoyer and Robin Thomas School of Mathematics Georgia Institute of Technology Atlanta, Georgia 30332-0160, USA ABSTRACT We prove an ear-decomposition theorem

More information

arxiv:cs/ v2 [cs.cr] 27 Aug 2006

arxiv:cs/ v2 [cs.cr] 27 Aug 2006 On the security of the Yen-Guo s domino signal encryption algorithm (DSEA) arxiv:cs/0501013v2 [cs.cr] 27 Aug 2006 Chengqing Li a, Shujun Li b, Der-Chyuan Lou c and Dan Zhang d a Department of Mathematics,

More information

The Fibonacci hypercube

The Fibonacci hypercube AUSTRALASIAN JOURNAL OF COMBINATORICS Volume 40 (2008), Pages 187 196 The Fibonacci hypercube Fred J. Rispoli Department of Mathematics and Computer Science Dowling College, Oakdale, NY 11769 U.S.A. Steven

More information

Extremal Graph Theory: Turán s Theorem

Extremal Graph Theory: Turán s Theorem Bridgewater State University Virtual Commons - Bridgewater State University Honors Program Theses and Projects Undergraduate Honors Program 5-9-07 Extremal Graph Theory: Turán s Theorem Vincent Vascimini

More information

Restricted edge connectivity and restricted connectivity of graphs

Restricted edge connectivity and restricted connectivity of graphs Restricted edge connectivity and restricted connectivity of graphs Litao Guo School of Applied Mathematics Xiamen University of Technology Xiamen Fujian 361024 P.R.China ltguo2012@126.com Xiaofeng Guo

More information

Module 6 NP-Complete Problems and Heuristics

Module 6 NP-Complete Problems and Heuristics Module 6 NP-Complete Problems and Heuristics Dr. Natarajan Meghanathan Professor of Computer Science Jackson State University Jackson, MS 39217 E-mail: natarajan.meghanathan@jsums.edu P, NP-Problems Class

More information

Matching Algorithms. Proof. If a bipartite graph has a perfect matching, then it is easy to see that the right hand side is a necessary condition.

Matching Algorithms. Proof. If a bipartite graph has a perfect matching, then it is easy to see that the right hand side is a necessary condition. 18.433 Combinatorial Optimization Matching Algorithms September 9,14,16 Lecturer: Santosh Vempala Given a graph G = (V, E), a matching M is a set of edges with the property that no two of the edges have

More information

COMP260 Spring 2014 Notes: February 4th

COMP260 Spring 2014 Notes: February 4th COMP260 Spring 2014 Notes: February 4th Andrew Winslow In these notes, all graphs are undirected. We consider matching, covering, and packing in bipartite graphs, general graphs, and hypergraphs. We also

More information

On minimum m-connected k-dominating set problem in unit disc graphs

On minimum m-connected k-dominating set problem in unit disc graphs J Comb Optim (2008) 16: 99 106 DOI 10.1007/s10878-007-9124-y On minimum m-connected k-dominating set problem in unit disc graphs Weiping Shang Frances Yao Pengjun Wan Xiaodong Hu Published online: 5 December

More information

2. Sets. 2.1&2.2: Sets and Subsets. Combining Sets. c Dr Oksana Shatalov, Fall

2. Sets. 2.1&2.2: Sets and Subsets. Combining Sets. c Dr Oksana Shatalov, Fall c Dr Oksana Shatalov, Fall 2014 1 2. Sets 2.1&2.2: Sets and Subsets. Combining Sets. Set Terminology and Notation DEFINITIONS: Set is well-defined collection of objects. Elements are objects or members

More information

Topology and Topological Spaces

Topology and Topological Spaces Topology and Topological Spaces Mathematical spaces such as vector spaces, normed vector spaces (Banach spaces), and metric spaces are generalizations of ideas that are familiar in R or in R n. For example,

More information

Scan Scheduling Specification and Analysis

Scan Scheduling Specification and Analysis Scan Scheduling Specification and Analysis Bruno Dutertre System Design Laboratory SRI International Menlo Park, CA 94025 May 24, 2000 This work was partially funded by DARPA/AFRL under BAE System subcontract

More information

Distributed minimum spanning tree problem

Distributed minimum spanning tree problem Distributed minimum spanning tree problem Juho-Kustaa Kangas 24th November 2012 Abstract Given a connected weighted undirected graph, the minimum spanning tree problem asks for a spanning subtree with

More information

A RADIO COLORING OF A HYPERCUBE

A RADIO COLORING OF A HYPERCUBE Intern. J. Computer Math., 2002, Vol. 79(6), pp. 665 670 A RADIO COLORING OF A HYPERCUBE OPHIR FRIEDER a, *, FRANK HARARY b and PENG-JUN WAN a a Illinois Institute of Technology, Chicago, IL 60616; b New

More information

arxiv: v2 [math.co] 13 Aug 2013

arxiv: v2 [math.co] 13 Aug 2013 Orthogonality and minimality in the homology of locally finite graphs Reinhard Diestel Julian Pott arxiv:1307.0728v2 [math.co] 13 Aug 2013 August 14, 2013 Abstract Given a finite set E, a subset D E (viewed

More information

On vertex-coloring edge-weighting of graphs

On vertex-coloring edge-weighting of graphs Front. Math. China DOI 10.1007/s11464-009-0014-8 On vertex-coloring edge-weighting of graphs Hongliang LU 1, Xu YANG 1, Qinglin YU 1,2 1 Center for Combinatorics, Key Laboratory of Pure Mathematics and

More information

A General Class of Heuristics for Minimum Weight Perfect Matching and Fast Special Cases with Doubly and Triply Logarithmic Errors 1

A General Class of Heuristics for Minimum Weight Perfect Matching and Fast Special Cases with Doubly and Triply Logarithmic Errors 1 Algorithmica (1997) 18: 544 559 Algorithmica 1997 Springer-Verlag New York Inc. A General Class of Heuristics for Minimum Weight Perfect Matching and Fast Special Cases with Doubly and Triply Logarithmic

More information

Interleaving Schemes on Circulant Graphs with Two Offsets

Interleaving Schemes on Circulant Graphs with Two Offsets Interleaving Schemes on Circulant raphs with Two Offsets Aleksandrs Slivkins Department of Computer Science Cornell University Ithaca, NY 14853 slivkins@cs.cornell.edu Jehoshua Bruck Department of Electrical

More information

Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay

Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay Lecture - 11 Coding Strategies and Introduction to Huffman Coding The Fundamental

More information

Data Compression - Seminar 4

Data Compression - Seminar 4 Data Compression - Seminar 4 October 29, 2013 Problem 1 (Uniquely decodable and instantaneous codes) Let L = p i l 100 i be the expected value of the 100th power of the word lengths associated with an

More information

6. Advanced Topics in Computability

6. Advanced Topics in Computability 227 6. Advanced Topics in Computability The Church-Turing thesis gives a universally acceptable definition of algorithm Another fundamental concept in computer science is information No equally comprehensive

More information

THE TRANSITIVE REDUCTION OF A DIRECTED GRAPH*

THE TRANSITIVE REDUCTION OF A DIRECTED GRAPH* SIAM J. COMPUT. Vol. 1, No. 2, June 1972 THE TRANSITIVE REDUCTION OF A DIRECTED GRAPH* A. V. AHO, M. R. GAREY" AND J. D. ULLMAN Abstract. We consider economical representations for the path information

More information

Maximizing edge-ratio is NP-complete

Maximizing edge-ratio is NP-complete Maximizing edge-ratio is NP-complete Steven D Noble, Pierre Hansen and Nenad Mladenović February 7, 01 Abstract Given a graph G and a bipartition of its vertices, the edge-ratio is the minimum for both

More information

Report on article The Travelling Salesman Problem: A Linear Programming Formulation

Report on article The Travelling Salesman Problem: A Linear Programming Formulation Report on article The Travelling Salesman Problem: A Linear Programming Formulation Radosław Hofman, Poznań 2008 Abstract This article describes counter example prepared in order to prove that linear formulation

More information

Acyclic Edge Colorings of Graphs

Acyclic Edge Colorings of Graphs Acyclic Edge Colorings of Graphs Noga Alon Ayal Zaks Abstract A proper coloring of the edges of a graph G is called acyclic if there is no 2-colored cycle in G. The acyclic edge chromatic number of G,

More information

New Constructions of Non-Adaptive and Error-Tolerance Pooling Designs

New Constructions of Non-Adaptive and Error-Tolerance Pooling Designs New Constructions of Non-Adaptive and Error-Tolerance Pooling Designs Hung Q Ngo Ding-Zhu Du Abstract We propose two new classes of non-adaptive pooling designs The first one is guaranteed to be -error-detecting

More information

On 2-Subcolourings of Chordal Graphs

On 2-Subcolourings of Chordal Graphs On 2-Subcolourings of Chordal Graphs Juraj Stacho School of Computing Science, Simon Fraser University 8888 University Drive, Burnaby, B.C., Canada V5A 1S6 jstacho@cs.sfu.ca Abstract. A 2-subcolouring

More information

Module 11. Directed Graphs. Contents

Module 11. Directed Graphs. Contents Module 11 Directed Graphs Contents 11.1 Basic concepts......................... 256 Underlying graph of a digraph................ 257 Out-degrees and in-degrees.................. 258 Isomorphism..........................

More information

On the Max Coloring Problem

On the Max Coloring Problem On the Max Coloring Problem Leah Epstein Asaf Levin May 22, 2010 Abstract We consider max coloring on hereditary graph classes. The problem is defined as follows. Given a graph G = (V, E) and positive

More information

THREE LECTURES ON BASIC TOPOLOGY. 1. Basic notions.

THREE LECTURES ON BASIC TOPOLOGY. 1. Basic notions. THREE LECTURES ON BASIC TOPOLOGY PHILIP FOTH 1. Basic notions. Let X be a set. To make a topological space out of X, one must specify a collection T of subsets of X, which are said to be open subsets of

More information

Construction C : an inter-level coded version of Construction C

Construction C : an inter-level coded version of Construction C Construction C : an inter-level coded version of Construction C arxiv:1709.06640v2 [cs.it] 27 Dec 2017 Abstract Besides all the attention given to lattice constructions, it is common to find some very

More information

A Partition Method for Graph Isomorphism

A Partition Method for Graph Isomorphism Available online at www.sciencedirect.com Physics Procedia ( ) 6 68 International Conference on Solid State Devices and Materials Science A Partition Method for Graph Isomorphism Lijun Tian, Chaoqun Liu

More information