Usenet.com


Comp Thread Archive from Usenet.com


Re: Missed opportunities with BWT



"Jase" <[EMAIL PROTECTED]> writes:

> Hi everyone,
>     I've been looking at the debug output from my bwt stage, and it seems to
> me that there is a HUGE opportunity for compression that I am missing. My

huge opportunity for compression, eh?

> algorithm works on the doubling principle (use the sort order from the
> previous iteration.. blah blah blah), and outputs the sort depth to the

What's the "depth" of a sort?

> debugger. If a file had to be sorted to a depth of 8192, then there are at
> least 2 strings in the file which are equal for at least the first 4096
> chars (could be up to 8192). 

You don't think that the 4096-character repeated string itself could be
the reason for the huge opportunity for compression, rather than your
algorithm?
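
For readers following along, here is a minimal sketch of what "sort depth" could
mean under the doubling principle described in the quoted text. It is an
assumption about the general technique, not Jase's actual code: the function
name doubling_sort and the choice to sort cyclic rotations are made up for
illustration. Each round doubles the number of leading characters compared, and
the returned depth is the comparison length at which the loop stopped. If the
result is 2h with h >= 1, the ranks at depth h still contained a tie, so at
least two rotations share their first h characters.

```python
def doubling_sort(text):
    """Sort the cyclic rotations of text by prefix doubling.

    Returns (order, depth): order lists rotation start offsets in sorted
    order; depth is the prefix length compared when the loop stopped.
    Illustrative sketch only, not the original poster's implementation.
    """
    n = len(text)
    if n == 0:
        return [], 0
    # Depth 1: rank each rotation by its first character alone.
    index = {c: r for r, c in enumerate(sorted(set(text)))}
    rank = [index[c] for c in text]
    order = sorted(range(n), key=lambda i: rank[i])
    depth = 1
    # Keep doubling while some rotations are still tied. The depth < n
    # guard stops periodic inputs whose rotations are truly identical.
    while len(set(rank)) < n and depth < n:
        # Rank at 2*depth = (rank of first half, rank of second half).
        keys = [(rank[i], rank[(i + depth) % n]) for i in range(n)]
        order = sorted(range(n), key=lambda i: keys[i])
        new_rank = [0] * n
        for pos in range(1, n):
            prev, cur = order[pos - 1], order[pos]
            new_rank[cur] = new_rank[prev] + (keys[cur] != keys[prev])
        rank = new_rank
        depth *= 2
    return order, depth
```

For example, doubling_sort("abcdefghabcdefgh!") stops at depth 16 because the
rotations starting at offsets 0 and 8 agree on their first 8 characters. That
tie comes from the repeated string in the input, whichever sort is used.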

> In fact, if I were to keep track of the strings
> eliminated from the sort at each iteration, I would have a list of strings

Why's the sort "eliminating" anything?

> which are equal to a very large context. Has anyone else investigated this,
> and if so, have there been any papers published? If not, I will start now
> ;-)


Phil

-- 
Unpatched IE vulnerability: file-protocol proxy
Description: cross-domain scripting, cookie/data/identity theft, 
             command execution
Reference: http://safecenter.net/liudieyu/WsOpenFileJPU/WsOpenFileJPU-Content.HTM
Exploit: http://safecenter.net/liudieyu/WsOpenFileJPU/WsOpenFileJPU-MyPage.HTM


