blob: 9028d0cf5c955a9fdce4f70894dbecbe9cc02264 [file] [log] [blame]
The Android Open Source Project990c1eb2008-12-17 18:05:30 -08001/* xdelta 3 - delta compression tools and library
2 * Copyright (C) 2001, 2003, 2004, 2005, 2006, 2007. Joshua P. MacDonald
3 *
4 * This program is free software; you can redistribute it and/or modify
5 * it under the terms of the GNU General Public License as published by
6 * the Free Software Foundation; either version 2 of the License, or
7 * (at your option) any later version.
8 *
9 * This program is distributed in the hope that it will be useful,
10 * but WITHOUT ANY WARRANTY; without even the implied warranty of
11 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
12 * GNU General Public License for more details.
13 *
14 * You should have received a copy of the GNU General Public License
15 * along with this program; if not, write to the Free Software
16 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
17
18 -------------------------------------------------------------------
19
20 Xdelta 3
21
22 The goal of this library is to to implement both the (stand-alone)
23 data-compression and delta-compression aspects of VCDIFF encoding, and
24 to support a programming interface that works like Zlib
25 (http://www.gzip.org/zlib.html). See RFC3284: The VCDIFF Generic
26 Differencing and Compression Data Format.
27
28 VCDIFF is a unified encoding that combines data-compression and
29 delta-encoding ("differencing").
30
31 VCDIFF has a detailed byte-code instruction set with many features.
32 The instruction format supports an immediate size operand for small
33 COPYs and ADDs (e.g., under 18 bytes). There are also instruction
34 "modes", which are used to compress COPY addresses by using two
35 address caches. An instruction mode refers to slots in the NEAR
36 and SAME caches for recent addresses. NEAR remembers the
37 previous 4 (by default) COPY addresses, and SAME catches
38 frequent re-uses of the same address using a 3-way (by default)
39 256-entry associative cache of [ADDR mod 256], the encoded byte.
40 A hit in the NEAR/SAME cache requires 0/1 ADDR bytes.
41
42 VCDIFF has a default instruction table, but an alternate
43 instruction tables may themselves be be delta-compressed and
44 included in the encoding header. This allows even more freedom.
45 There are 9 instruction modes in the default code table, 4 near, 3
46 same, VCD_SELF (absolute encoding) and VCD_HERE (relative to the
47 current position).
48
49 ----------------------------------------------------------------------
50
51 Algorithms
52
53 Aside from the details of encoding and decoding, there are a bunch
54 of algorithms needed.
55
56 1. STRING-MATCH. A two-level fingerprinting approach is used. A
57 single loop computes the two checksums -- small and large -- at
58 successive offsets in the TARGET file. The large checksum is more
59 accurate and is used to discover SOURCE matches, which are
60 potentially very long. The small checksum is used to discover
61 copies within the TARGET. Small matching, which is more expensive,
62 usually dominates the large STRING-MATCH costs in this code - the
63 more exhaustive the search, the better the results. Either of the
64 two string-matching mechanisms may be disabled.
65
66 2. INSTRUCTION SELECTION. The IOPT buffer here represents a queue
67 used to store overlapping copy instructions. There are two possible
68 optimizations that go beyond a greedy search. Both of these fall
69 into the category of "non-greedy matching" optimizations.
70
71 The first optimization stems from backward SOURCE-COPY matching.
72 When a new SOURCE-COPY instruction covers a previous instruction in
73 the target completely, it is erased from the queue. Randal Burns
74 originally analyzed these algorithms and did a lot of related work
75 (\cite the 1.5-pass algorithm).
76
77 The second optimization comes by the encoding of common very-small
78 COPY and ADD instructions, for which there are special DOUBLE-code
79 instructions, which code two instructions in a single byte.
80
81 The cost of bad instruction-selection overhead is relatively high
82 for data-compression, relative to delta-compression, so this second
83 optimization is fairly important. With "lazy" matching (the name
84 used in Zlib for a similar optimization), the string-match
85 algorithm searches after a match for potential overlapping copy
86 instructions. In Xdelta and by default, VCDIFF, the minimum match
87 size is 4 bytes, whereas Zlib searches with a 3-byte minimum. This
88 feature, combined with double instructions, provides a nice
89 challenge. Search in this file for "black magic", a heuristic.
90
91 3. STREAM ALIGNMENT. Stream alignment is needed to compress large
92 inputs in constant space. See xd3_srcwin_move_point().
93
94 4. WINDOW SELECTION. When the IOPT buffer flushes, in the first call
95 to xd3_iopt_finish_encoding containing any kind of copy instruction,
96 the parameters of the source window must be decided: the offset into
97 the source and the length of the window. Since the IOPT buffer is
98 finite, the program may be forced to fix these values before knowing
99 the best offset/length.
100
101 5. SECONDARY COMPRESSION. VCDIFF supports a secondary encoding to
102 be applied to the individual sections of the data format, which are
103 ADDRess, INSTruction, and DATA. Several secondary compressor
104 variations are implemented here, although none is standardized yet.
105
106 One is an adaptive huffman algorithm -- the FGK algorithm (Faller,
107 Gallager, and Knuth, 1985). This compressor is extremely slow.
108
109 The other is a simple static Huffman routine, which is the base
110 case of a semi-adaptive scheme published by D.J. Wheeler and first
111 widely used in bzip2 (by Julian Seward). This is a very
112 interesting algorithm, originally published in nearly cryptic form
113 by D.J. Wheeler. !!!NOTE!!! Because these are not standardized,
114 secondary compression remains off by default.
115 ftp://ftp.cl.cam.ac.uk/users/djw3/bred3.{c,ps}
116 --------------------------------------------------------------------
117
118 Other Features
119
120 1. USER CONVENIENCE
121
122 For user convenience, it is essential to recognize Gzip-compressed
123 files and automatically Gzip-decompress them prior to
124 delta-compression (or else no delta-compression will be achieved
125 unless the user manually decompresses the inputs). The compressed
126 represention competes with Xdelta, and this must be hidden from the
127 command-line user interface. The Xdelta-1.x encoding was simple, not
128 compressed itself, so Xdelta-1.x uses Zlib internally to compress the
129 representation.
130
131 This implementation supports external compression, which implements
132 the necessary fork() and pipe() mechanics. There is a tricky step
133 involved to support automatic detection of a compressed input in a
134 non-seekable input. First you read a bit of the input to detect
135 magic headers. When a compressed format is recognized, exec() the
136 external compression program and create a second child process to
137 copy the original input stream. [Footnote: There is a difficulty
138 related to using Gzip externally. It is not possible to decompress
139 and recompress a Gzip file transparently. If FILE.GZ had a
140 cryptographic signature, then, after: (1) Gzip-decompression, (2)
141 Xdelta-encoding, (3) Gzip-compression the signature could be
142 broken. The only way to solve this problem is to guess at Gzip's
143 compression level or control it by other means. I recommend that
144 specific implementations of any compression scheme store
145 information needed to exactly re-compress the input, that way
146 external compression is transparent - however, this won't happen
147 here until it has stabilized.]
148
149 2. APPLICATION-HEADER
150
151 This feature was introduced in RFC3284. It allows any application
152 to include a header within the VCDIFF file format. This allows
153 general inter-application data exchange with support for
154 application-specific extensions to communicate metadata.
155
156 3. VCDIFF CHECKSUM
157
158 An optional checksum value is included with each window, which can
159 be used to validate the final result. This verifies the correct source
160 file was used for decompression as well as the obvious advantage:
161 checking the implementation (and underlying) correctness.
162
163 4. LIGHT WEIGHT
164
165 The code makes efforts to avoid copying data more than necessary.
166 The code delays many initialization tasks until the first use, it
167 optimizes for identical (perfectly matching) inputs. It does not
168 compute any checksums until the first lookup misses. Memory usage
169 is reduced. String-matching is templatized (by slightly gross use
170 of CPP) to hard-code alternative compile-time defaults. The code
171 has few outside dependencies.
172 ----------------------------------------------------------------------
173
174 The default rfc3284 instruction table:
175 (see RFC for the explanation)
176
177 TYPE SIZE MODE TYPE SIZE MODE INDEX
178 --------------------------------------------------------------------
179 1. Run 0 0 Noop 0 0 0
180 2. Add 0, [1,17] 0 Noop 0 0 [1,18]
181 3. Copy 0, [4,18] 0 Noop 0 0 [19,34]
182 4. Copy 0, [4,18] 1 Noop 0 0 [35,50]
183 5. Copy 0, [4,18] 2 Noop 0 0 [51,66]
184 6. Copy 0, [4,18] 3 Noop 0 0 [67,82]
185 7. Copy 0, [4,18] 4 Noop 0 0 [83,98]
186 8. Copy 0, [4,18] 5 Noop 0 0 [99,114]
187 9. Copy 0, [4,18] 6 Noop 0 0 [115,130]
188 10. Copy 0, [4,18] 7 Noop 0 0 [131,146]
189 11. Copy 0, [4,18] 8 Noop 0 0 [147,162]
190 12. Add [1,4] 0 Copy [4,6] 0 [163,174]
191 13. Add [1,4] 0 Copy [4,6] 1 [175,186]
192 14. Add [1,4] 0 Copy [4,6] 2 [187,198]
193 15. Add [1,4] 0 Copy [4,6] 3 [199,210]
194 16. Add [1,4] 0 Copy [4,6] 4 [211,222]
195 17. Add [1,4] 0 Copy [4,6] 5 [223,234]
196 18. Add [1,4] 0 Copy 4 6 [235,238]
197 19. Add [1,4] 0 Copy 4 7 [239,242]
198 20. Add [1,4] 0 Copy 4 8 [243,246]
199 21. Copy 4 [0,8] Add 1 0 [247,255]
200 --------------------------------------------------------------------
201
202 Reading the source: Overview
203
204 This file includes itself in several passes to macro-expand certain
205 sections with variable forms. Just read ahead, there's only a
206 little confusion. I know this sounds ugly, but hard-coding some of
207 the string-matching parameters results in a 10-15% increase in
208 string-match performance. The only time this hurts is when you have
209 unbalanced #if/endifs.
210
211 A single compilation unit tames the Makefile. In short, this is to
212 allow the above-described hack without an explodingMakefile. The
213 single compilation unit includes the core library features,
214 configurable string-match templates, optional main() command-line
215 tool, misc optional features, and a regression test. Features are
216 controled with CPP #defines, see Makefile.am.
217
218 The initial __XDELTA3_C_HEADER_PASS__ starts first, the _INLINE_ and
219 _TEMPLATE_ sections follow. Easy stuff first, hard stuff last.
220
221 Optional features include:
222
223 xdelta3-main.h The command-line interface, external compression
224 support, POSIX-specific, info & VCDIFF-debug tools.
225 xdelta3-second.h The common secondary compression routines.
226 xdelta3-decoder.h All decoding routines.
227 xdelta3-djw.h The semi-adaptive huffman secondary encoder.
228 xdelta3-fgk.h The adaptive huffman secondary encoder.
229 xdelta3-test.h The unit test covers major algorithms,
230 encoding and decoding. There are single-bit
231 error decoding tests. There are 32/64-bit file size
232 boundary tests. There are command-line tests.
233 There are compression tests. There are external
234 compression tests. There are string-matching tests.
235 There should be more tests...
236
237 Additional headers include:
238
239 xdelta3.h The public header file.
240 xdelta3-cfgs.h The default settings for default, built-in
241 encoders. These are hard-coded at
242 compile-time. There is also a single
243 soft-coded string matcher for experimenting
244 with arbitrary values.
245 xdelta3-list.h A cyclic list template
246
247 Misc little debug utilities:
248
249 badcopy.c Randomly modifies an input file based on two
250 parameters: (1) the probability that a byte in
251 the file is replaced with a pseudo-random value,
252 and (2) the mean change size. Changes are
253 generated using an expoential distribution
254 which approximates the expected error_prob
255 distribution.
256 --------------------------------------------------------------------
257
258 This file itself is unusually large. I hope to defend this layout
259 with lots of comments. Everything in this file is related to
260 encoding and decoding. I like it all together - the template stuff
261 is just a hack. */
262
263#ifndef __XDELTA3_C_HEADER_PASS__
264#define __XDELTA3_C_HEADER_PASS__
265
266#include <errno.h>
267#include <string.h>
268
269#include "xdelta3.h"
270
271/***********************************************************************
272 STATIC CONFIGURATION
273 ***********************************************************************/
274
275#ifndef XD3_MAIN /* the main application */
276#define XD3_MAIN 0
277#endif
278
279#ifndef VCDIFF_TOOLS
280#define VCDIFF_TOOLS XD3_MAIN
281#endif
282
283#ifndef SECONDARY_FGK /* one from the algorithm preservation department: */
284#define SECONDARY_FGK 0 /* adaptive Huffman routines */
285#endif
286
287#ifndef SECONDARY_DJW /* semi-adaptive/static Huffman for the eventual */
288#define SECONDARY_DJW 0 /* standardization, off by default until such time. */
289#endif
290
291#ifndef GENERIC_ENCODE_TABLES /* These three are the RFC-spec'd app-specific */
292#define GENERIC_ENCODE_TABLES 0 /* code features. This is tested but not recommended */
293#endif /* unless there's a real application. */
294#ifndef GENERIC_ENCODE_TABLES_COMPUTE
295#define GENERIC_ENCODE_TABLES_COMPUTE 0
296#endif
297#ifndef GENERIC_ENCODE_TABLES_COMPUTE_PRINT
298#define GENERIC_ENCODE_TABLES_COMPUTE_PRINT 0
299#endif
300
301#if XD3_ENCODER
302#define IF_ENCODER(x) x
303#else
304#define IF_ENCODER(x)
305#endif
306
307/***********************************************************************/
308
309typedef enum {
310
311 /* header indicator bits */
312 VCD_SECONDARY = (1 << 0), /* uses secondary compressor */
313 VCD_CODETABLE = (1 << 1), /* supplies code table data */
314 VCD_APPHEADER = (1 << 2), /* supplies application data */
315 VCD_INVHDR = ~7U,
316
317 /* window indicator bits */
318 VCD_SOURCE = (1 << 0), /* copy window in source file */
319 VCD_TARGET = (1 << 1), /* copy window in target file */
320 VCD_ADLER32 = (1 << 2), /* has adler32 checksum */
321 VCD_INVWIN = ~7U,
322
323 VCD_SRCORTGT = VCD_SOURCE | VCD_TARGET,
324
325 /* delta indicator bits */
326 VCD_DATACOMP = (1 << 0),
327 VCD_INSTCOMP = (1 << 1),
328 VCD_ADDRCOMP = (1 << 2),
329 VCD_INVDEL = ~0x7U,
330
331} xd3_indicator;
332
333typedef enum {
334 VCD_DJW_ID = 1,
335 VCD_FGK_ID = 16, /* Note: these are not standard IANA-allocated IDs! */
336} xd3_secondary_ids;
337
338typedef enum {
339 SEC_NOFLAGS = 0,
340
341 /* Note: SEC_COUNT_FREQS Not implemented (to eliminate 1st Huffman pass) */
342 SEC_COUNT_FREQS = (1 << 0),
343} xd3_secondary_flags;
344
345typedef enum {
346 DATA_SECTION, /* These indicate which section to the secondary
347 * compressor. */
348 INST_SECTION, /* The header section is not compressed, therefore not
349 * listed here. */
350 ADDR_SECTION,
351} xd3_section_type;
352
353typedef enum
354{
355 XD3_NOOP = 0,
356 XD3_ADD = 1,
357 XD3_RUN = 2,
358 XD3_CPY = 3, /* XD3_CPY rtypes are represented as (XD3_CPY +
359 * copy-mode value) */
360} xd3_rtype;
361
362/***********************************************************************/
363
364#include "xdelta3-list.h"
365
366XD3_MAKELIST(xd3_rlist, xd3_rinst, link);
367
368/***********************************************************************/
369
370#define SECONDARY_MIN_SAVINGS 2 /* Secondary compression has to save
371 at least this many bytes. */
372#define SECONDARY_MIN_INPUT 10 /* Secondary compression needs at
373 least this many bytes. */
374
375#define VCDIFF_MAGIC1 0xd6 /* 1st file byte */
376#define VCDIFF_MAGIC2 0xc3 /* 2nd file byte */
377#define VCDIFF_MAGIC3 0xc4 /* 3rd file byte */
378#define VCDIFF_VERSION 0x00 /* 4th file byte */
379
380#define VCD_SELF 0 /* 1st address mode */
381#define VCD_HERE 1 /* 2nd address mode */
382
383#define CODE_TABLE_STRING_SIZE (6 * 256) /* Should fit a code table string. */
384#define CODE_TABLE_VCDIFF_SIZE (6 * 256) /* Should fit a compressed code
385 * table string */
386
387#define SECONDARY_ANY (SECONDARY_DJW || SECONDARY_FGK)
388
389#define ALPHABET_SIZE 256 /* Used in test code--size of the secondary
390 * compressor alphabet. */
391
392#define HASH_PERMUTE 1 /* The input is permuted by random nums */
393#define ADLER_LARGE_CKSUM 1 /* Adler checksum vs. RK checksum */
394
395#define HASH_CKOFFSET 1U /* Table entries distinguish "no-entry" from
396 * offset 0 using this offset. */
397
398#define MIN_SMALL_LOOK 2U /* Match-optimization stuff. */
399#define MIN_LARGE_LOOK 2U
400#define MIN_MATCH_OFFSET 1U
401#define MAX_MATCH_SPLIT 18U /* VCDIFF code table: 18 is the default limit
402 * for direct-coded ADD sizes */
403
404#define LEAST_MATCH_INCR 0 /* The least number of bytes an overlapping
405 * match must beat the preceding match by. This
406 * is a bias for the lazy match optimization. A
407 * non-zero value means that an adjacent match
408 * has to be better by more than the step
409 * between them. 0. */
410
411#define MIN_MATCH 4U /* VCDIFF code table: MIN_MATCH=4 */
412#define MIN_ADD 1U /* 1 */
413#define MIN_RUN 8U /* The shortest run, if it is shorter than this
414 * an immediate add/copy will be just as good.
415 * ADD1/COPY6 = 1I+1D+1A bytes, RUN18 =
416 * 1I+1D+1A. */
417
418#define MAX_MODES 9 /* Maximum number of nodes used for
419 * compression--does not limit decompression. */
420
421#define ENC_SECTS 4 /* Number of separate output sections. */
422
423#define HDR_TAIL(s) ((s)->enc_tails[0])
424#define DATA_TAIL(s) ((s)->enc_tails[1])
425#define INST_TAIL(s) ((s)->enc_tails[2])
426#define ADDR_TAIL(s) ((s)->enc_tails[3])
427
428#define HDR_HEAD(s) ((s)->enc_heads[0])
429#define DATA_HEAD(s) ((s)->enc_heads[1])
430#define INST_HEAD(s) ((s)->enc_heads[2])
431#define ADDR_HEAD(s) ((s)->enc_heads[3])
432
433#define SIZEOF_ARRAY(x) (sizeof(x) / sizeof(x[0]))
434
435#define TOTAL_MODES(x) (2+(x)->acache.s_same+(x)->acache.s_near)
436
437/* Template instances. */
438#if XD3_BUILD_SLOW
439#define IF_BUILD_SLOW(x) x
440#else
441#define IF_BUILD_SLOW(x)
442#endif
443#if XD3_BUILD_FAST
444#define IF_BUILD_FAST(x) x
445#else
446#define IF_BUILD_FAST(x)
447#endif
448#if XD3_BUILD_FASTER
449#define IF_BUILD_FASTER(x) x
450#else
451#define IF_BUILD_FASTER(x)
452#endif
453#if XD3_BUILD_FASTEST
454#define IF_BUILD_FASTEST(x) x
455#else
456#define IF_BUILD_FASTEST(x)
457#endif
458#if XD3_BUILD_SOFT
459#define IF_BUILD_SOFT(x) x
460#else
461#define IF_BUILD_SOFT(x)
462#endif
463#if XD3_BUILD_DEFAULT
464#define IF_BUILD_DEFAULT(x) x
465#else
466#define IF_BUILD_DEFAULT(x)
467#endif
468
469/* Consume N bytes of input, only used by the decoder. */
470#define DECODE_INPUT(n) \
471 do { \
472 stream->total_in += (xoff_t) (n); \
473 stream->avail_in -= (n); \
474 stream->next_in += (n); \
475 } while (0)
476
477/* Update the run-length state */
478#define NEXTRUN(c) do { if ((c) == run_c) { run_l += 1; } \
479 else { run_c = (c); run_l = 1; } } while (0)
480
481/* This CPP-conditional stuff can be cleaned up... */
482#if XD3_DEBUG
483#define IF_DEBUG(x) x
484#else
485#define IF_DEBUG(x)
486#endif
487#if XD3_DEBUG > 1
488#define IF_DEBUG1(x) x
489#else
490#define IF_DEBUG1(x)
491#endif
492#if XD3_DEBUG > 2
493#define IF_DEBUG2(x) x
494#else
495#define IF_DEBUG2(x)
496#endif
497#if REGRESSION_TEST
498#define IF_REGRESSION(x) x
499#else
500#define IF_REGRESSION(x)
501#endif
502
503/***********************************************************************/
504
505#if XD3_ENCODER
506static void* xd3_alloc0 (xd3_stream *stream,
507 usize_t elts,
508 usize_t size);
509
510
511static xd3_output* xd3_alloc_output (xd3_stream *stream,
512 xd3_output *old_output);
513
514static int xd3_alloc_iopt (xd3_stream *stream, int elts);
515
516static void xd3_free_output (xd3_stream *stream,
517 xd3_output *output);
518
519static int xd3_emit_byte (xd3_stream *stream,
520 xd3_output **outputp,
521 uint8_t code);
522
523static int xd3_emit_bytes (xd3_stream *stream,
524 xd3_output **outputp,
525 const uint8_t *base,
526 usize_t size);
527
528static int xd3_emit_double (xd3_stream *stream, xd3_rinst *first,
529 xd3_rinst *second, usize_t code);
530static int xd3_emit_single (xd3_stream *stream, xd3_rinst *single,
531 usize_t code);
532
533static usize_t xd3_sizeof_output (xd3_output *output);
534static void xd3_encode_reset (xd3_stream *stream);
535
536static int xd3_source_match_setup (xd3_stream *stream, xoff_t srcpos);
537static int xd3_source_extend_match (xd3_stream *stream);
538static int xd3_srcwin_setup (xd3_stream *stream);
539static usize_t xd3_iopt_last_matched (xd3_stream *stream);
540static int xd3_emit_uint32_t (xd3_stream *stream, xd3_output **output,
541 uint32_t num);
542
543static usize_t xd3_smatch (xd3_stream *stream,
544 usize_t base,
545 usize_t scksum,
546 usize_t *match_offset);
547static int xd3_string_match_init (xd3_stream *stream);
548static uint32_t xd3_scksum (uint32_t *state, const uint8_t *seg, const int ln);
549static int xd3_comprun (const uint8_t *seg, int slook, uint8_t *run_cp);
550static int xd3_srcwin_move_point (xd3_stream *stream,
551 usize_t *next_move_point);
552
553static int xd3_emit_run (xd3_stream *stream, usize_t pos,
554 usize_t size, uint8_t run_c);
555static usize_t xd3_checksum_hash (const xd3_hash_cfg *cfg,
556 const usize_t cksum);
557static xoff_t xd3_source_cksum_offset(xd3_stream *stream, usize_t low);
558static void xd3_scksum_insert (xd3_stream *stream,
559 usize_t inx,
560 usize_t scksum,
561 usize_t pos);
562
563
564#if XD3_DEBUG
565static void xd3_verify_run_state (xd3_stream *stream,
566 const uint8_t *inp,
567 int x_run_l,
568 uint8_t x_run_c);
569static void xd3_verify_large_state (xd3_stream *stream,
570 const uint8_t *inp,
571 uint32_t x_cksum);
572static void xd3_verify_small_state (xd3_stream *stream,
573 const uint8_t *inp,
574 uint32_t x_cksum);
575
576#endif /* XD3_DEBUG */
577#endif /* XD3_ENCODER */
578
579static int xd3_decode_allocate (xd3_stream *stream, usize_t size,
580 uint8_t **copied1, usize_t *alloc1);
581
582static void xd3_compute_code_table_string (const xd3_dinst *code_table,
583 uint8_t *str);
584static void* xd3_alloc (xd3_stream *stream, usize_t elts, usize_t size);
585static void xd3_free (xd3_stream *stream, void *ptr);
586
587static int xd3_read_uint32_t (xd3_stream *stream, const uint8_t **inpp,
588 const uint8_t *max, uint32_t *valp);
589
590#if REGRESSION_TEST
591static int xd3_selftest (void);
592#endif
593
594/***********************************************************************/
595
596#define UINT32_OFLOW_MASK 0xfe000000U
597#define UINT64_OFLOW_MASK 0xfe00000000000000ULL
598
599#ifndef UINT32_MAX
600#define UINT32_MAX 4294967295U
601#endif
602
603#ifndef UINT64_MAX
604#define UINT64_MAX 18446744073709551615ULL
605#endif
606
607#if SIZEOF_USIZE_T == 4
608#define USIZE_T_MAX UINT32_MAX
609#define xd3_decode_size xd3_decode_uint32_t
610#define xd3_emit_size xd3_emit_uint32_t
611#define xd3_sizeof_size xd3_sizeof_uint32_t
612#define xd3_read_size xd3_read_uint32_t
613#elif SIZEOF_USIZE_T == 8
614#define USIZE_T_MAX UINT64_MAX
615#define xd3_decode_size xd3_decode_uint64_t
616#define xd3_emit_size xd3_emit_uint64_t
617#define xd3_sizeof_size xd3_sizeof_uint64_t
618#define xd3_read_size xd3_read_uint64_t
619#endif
620
621#if SIZEOF_XOFF_T == 4
622#define XOFF_T_MAX UINT32_MAX
623#define xd3_decode_offset xd3_decode_uint32_t
624#define xd3_emit_offset xd3_emit_uint32_t
625#elif SIZEOF_XOFF_T == 8
626#define XOFF_T_MAX UINT64_MAX
627#define xd3_decode_offset xd3_decode_uint64_t
628#define xd3_emit_offset xd3_emit_uint64_t
629#endif
630
631#define USIZE_T_OVERFLOW(a,b) ((USIZE_T_MAX - (usize_t) (a)) < (usize_t) (b))
632#define XOFF_T_OVERFLOW(a,b) ((XOFF_T_MAX - (xoff_t) (a)) < (xoff_t) (b))
633
634const char* xd3_strerror (int ret)
635{
636 switch (ret)
637 {
638 case XD3_INPUT: return "XD3_INPUT";
639 case XD3_OUTPUT: return "XD3_OUTPUT";
640 case XD3_GETSRCBLK: return "XD3_GETSRCBLK";
641 case XD3_GOTHEADER: return "XD3_GOTHEADER";
642 case XD3_WINSTART: return "XD3_WINSTART";
643 case XD3_WINFINISH: return "XD3_WINFINISH";
644 case XD3_TOOFARBACK: return "XD3_TOOFARBACK";
645 case XD3_INTERNAL: return "XD3_INTERNAL";
646 case XD3_INVALID_INPUT: return "XD3_INVALID_INPUT";
647 }
648 return NULL;
649}
650
651/***********************************************************************/
652
653#define xd3_sec_data(s) ((s)->sec_stream_d)
654#define xd3_sec_inst(s) ((s)->sec_stream_i)
655#define xd3_sec_addr(s) ((s)->sec_stream_a)
656
657struct _xd3_sec_type
658{
659 int id;
660 const char *name;
661 xd3_secondary_flags flags;
662
663 /* xd3_sec_stream is opaque to the generic code */
664 xd3_sec_stream* (*alloc) (xd3_stream *stream);
665 void (*destroy) (xd3_stream *stream,
666 xd3_sec_stream *sec);
667 void (*init) (xd3_sec_stream *sec);
668 int (*decode) (xd3_stream *stream,
669 xd3_sec_stream *sec_stream,
670 const uint8_t **input,
671 const uint8_t *input_end,
672 uint8_t **output,
673 const uint8_t *output_end);
674#if XD3_ENCODER
675 int (*encode) (xd3_stream *stream,
676 xd3_sec_stream *sec_stream,
677 xd3_output *input,
678 xd3_output *output,
679 xd3_sec_cfg *cfg);
680#endif
681};
682
683#define BIT_STATE_ENCODE_INIT { 0, 1 }
684#define BIT_STATE_DECODE_INIT { 0, 0x100 }
685
686typedef struct _bit_state bit_state;
687struct _bit_state
688{
689 usize_t cur_byte;
690 usize_t cur_mask;
691};
692
693#if SECONDARY_ANY == 0
694#define IF_SEC(x)
695#define IF_NSEC(x) x
696#else /* yuck */
697#define IF_SEC(x) x
698#define IF_NSEC(x)
699static int
700xd3_decode_secondary (xd3_stream *stream,
701 xd3_desect *sect,
702 xd3_sec_stream **sec_streamp);
703#if XD3_ENCODER
704static int
705xd3_encode_secondary (xd3_stream *stream,
706 xd3_output **head,
707 xd3_output **tail,
708 xd3_sec_stream **sec_streamp,
709 xd3_sec_cfg *cfg,
710 int *did_it);
711#endif
712#endif /* SECONDARY_ANY */
713
714#if SECONDARY_FGK
715static const xd3_sec_type fgk_sec_type;
716#define IF_FGK(x) x
717#define FGK_CASE(s) \
718 s->sec_type = & fgk_sec_type; \
719 break;
720#else
721#define IF_FGK(x)
722#define FGK_CASE(s) \
723 s->msg = "unavailable secondary compressor: FGK Adaptive Huffman"; \
724 return XD3_INTERNAL;
725#endif
726
727#if SECONDARY_DJW
728static const xd3_sec_type djw_sec_type;
729#define IF_DJW(x) x
730#define DJW_CASE(s) \
731 s->sec_type = & djw_sec_type; \
732 break;
733#else
734#define IF_DJW(x)
735#define DJW_CASE(s) \
736 s->msg = "unavailable secondary compressor: DJW Static Huffman"; \
737 return XD3_INTERNAL;
738#endif
739
740/***********************************************************************/
741
742#include "xdelta3-hash.h"
743
744/* Process template passes - this includes xdelta3.c several times. */
745#define __XDELTA3_C_TEMPLATE_PASS__
746#include "xdelta3-cfgs.h"
747#undef __XDELTA3_C_TEMPLATE_PASS__
748
749/* Process the inline pass. */
750#define __XDELTA3_C_INLINE_PASS__
751#include "xdelta3.c"
752#undef __XDELTA3_C_INLINE_PASS__
753
754/* Secondary compression */
755#if SECONDARY_ANY
756#include "xdelta3-second.h"
757#endif
758
759#if SECONDARY_FGK
760#include "xdelta3-fgk.h"
761static const xd3_sec_type fgk_sec_type =
762{
763 VCD_FGK_ID,
764 "FGK Adaptive Huffman",
765 SEC_NOFLAGS,
766 (xd3_sec_stream* (*)()) fgk_alloc,
767 (void (*)()) fgk_destroy,
768 (void (*)()) fgk_init,
769 (int (*)()) xd3_decode_fgk,
770 IF_ENCODER((int (*)()) xd3_encode_fgk)
771};
772#endif
773
774#if SECONDARY_DJW
775#include "xdelta3-djw.h"
776static const xd3_sec_type djw_sec_type =
777{
778 VCD_DJW_ID,
779 "Static Huffman",
780 SEC_COUNT_FREQS,
781 (xd3_sec_stream* (*)()) djw_alloc,
782 (void (*)()) djw_destroy,
783 (void (*)()) djw_init,
784 (int (*)()) xd3_decode_huff,
785 IF_ENCODER((int (*)()) xd3_encode_huff)
786};
787#endif
788
789#if XD3_MAIN || PYTHON_MODULE || SWIG_MODULE || NOT_MAIN
790#include "xdelta3-main.h"
791#endif
792
793#if REGRESSION_TEST
794#include "xdelta3-test.h"
795#endif
796
797#if PYTHON_MODULE
798#include "xdelta3-python.h"
799#endif
800
801#endif /* __XDELTA3_C_HEADER_PASS__ */
802#ifdef __XDELTA3_C_INLINE_PASS__
803
804/****************************************************************
805 Instruction tables
806 *****************************************************************/
807
808/* The following code implements a parametrized description of the
809 * code table given above for a few reasons. It is not necessary for
810 * implementing the standard, to support compression with variable
811 * tables, so an implementation is only required to know the default
812 * code table to begin decompression. (If the encoder uses an
813 * alternate table, the table is included in compressed form inside
814 * the VCDIFF file.)
815 *
816 * Before adding variable-table support there were two functions which
817 * were hard-coded to the default table above.
818 * xd3_compute_default_table() would create the default table by
819 * filling a 256-elt array of xd3_dinst values. The corresponding
820 * function, xd3_choose_instruction(), would choose an instruction
821 * based on the hard-coded parameters of the default code table.
822 *
823 * Notes: The parametrized code table description here only generates
824 * tables of a certain regularity similar to the default table by
825 * allowing to vary the distribution of single- and
826 * double-instructions and change the number of near and same copy
827 * modes. More exotic tables are only possible by extending this
828 * code.
829 *
830 * For performance reasons, both the parametrized and non-parametrized
831 * versions of xd3_choose_instruction remain. The parametrized
832 * version is only needed for testing multi-table decoding support.
833 * If ever multi-table encoding is required, this can be optimized by
834 * compiling static functions for each table.
835 */
836
837/* The XD3_CHOOSE_INSTRUCTION calls xd3_choose_instruction with the
838 * table description when GENERIC_ENCODE_TABLES are in use. The
839 * IF_GENCODETBL macro enables generic-code-table specific code. */
840#if GENERIC_ENCODE_TABLES
841#define XD3_CHOOSE_INSTRUCTION(stream,prev,inst) xd3_choose_instruction (stream->code_table_desc, prev, inst)
842#define IF_GENCODETBL(x) x
843#else
844#define XD3_CHOOSE_INSTRUCTION(stream,prev,inst) xd3_choose_instruction (prev, inst)
845#define IF_GENCODETBL(x)
846#endif
847
848/* This structure maintains information needed by
849 * xd3_choose_instruction to compute the code for a double instruction
850 * by first indexing an array of code_table_sizes by copy mode, then
851 * using (offset + (muliplier * X)) */
852struct _xd3_code_table_sizes {
853 uint8_t cpy_max;
854 uint8_t offset;
855 uint8_t mult;
856};
857
858/* This contains a complete description of a code table. */
859struct _xd3_code_table_desc
860{
861 /* Assumes a single RUN instruction */
862 /* Assumes that MIN_MATCH is 4 */
863
864 uint8_t add_sizes; /* Number of immediate-size single adds (default 17) */
865 uint8_t near_modes; /* Number of near copy modes (default 4) */
866 uint8_t same_modes; /* Number of same copy modes (default 3) */
867 uint8_t cpy_sizes; /* Number of immediate-size single copies (default 15) */
868
869 uint8_t addcopy_add_max; /* Maximum add size for an add-copy double instruction,
870 all modes (default 4) */
871 uint8_t addcopy_near_cpy_max; /* Maximum cpy size for an add-copy double instruction,
872 up through VCD_NEAR modes (default 6) */
873 uint8_t addcopy_same_cpy_max; /* Maximum cpy size for an add-copy double instruction,
874 VCD_SAME modes (default 4) */
875
876 uint8_t copyadd_add_max; /* Maximum add size for a copy-add double instruction,
877 all modes (default 1) */
878 uint8_t copyadd_near_cpy_max; /* Maximum cpy size for a copy-add double instruction,
879 up through VCD_NEAR modes (default 4) */
880 uint8_t copyadd_same_cpy_max; /* Maximum cpy size for a copy-add double instruction,
881 VCD_SAME modes (default 4) */
882
883 xd3_code_table_sizes addcopy_max_sizes[MAX_MODES];
884 xd3_code_table_sizes copyadd_max_sizes[MAX_MODES];
885};
886
887/* The rfc3284 code table is represented: */
888static const xd3_code_table_desc __rfc3284_code_table_desc = {
889 17, /* add sizes */
890 4, /* near modes */
891 3, /* same modes */
892 15, /* copy sizes */
893
894 4, /* add-copy max add */
895 6, /* add-copy max cpy, near */
896 4, /* add-copy max cpy, same */
897
898 1, /* copy-add max add */
899 4, /* copy-add max cpy, near */
900 4, /* copy-add max cpy, same */
901
902 /* addcopy */
903 { {6,163,3},{6,175,3},{6,187,3},{6,199,3},{6,211,3},{6,223,3},{4,235,1},{4,239,1},{4,243,1} },
904 /* copyadd */
905 { {4,247,1},{4,248,1},{4,249,1},{4,250,1},{4,251,1},{4,252,1},{4,253,1},{4,254,1},{4,255,1} },
906};
907
908#if GENERIC_ENCODE_TABLES
909/* An alternate code table for testing (5 near, 0 same):
910 *
911 * TYPE SIZE MODE TYPE SIZE MODE INDEX
912 * ---------------------------------------------------------------
913 * 1. Run 0 0 Noop 0 0 0
914 * 2. Add 0, [1,23] 0 Noop 0 0 [1,24]
915 * 3. Copy 0, [4,20] 0 Noop 0 0 [25,42]
916 * 4. Copy 0, [4,20] 1 Noop 0 0 [43,60]
917 * 5. Copy 0, [4,20] 2 Noop 0 0 [61,78]
918 * 6. Copy 0, [4,20] 3 Noop 0 0 [79,96]
919 * 7. Copy 0, [4,20] 4 Noop 0 0 [97,114]
920 * 8. Copy 0, [4,20] 5 Noop 0 0 [115,132]
921 * 9. Copy 0, [4,20] 6 Noop 0 0 [133,150]
922 * 10. Add [1,4] 0 Copy [4,6] 0 [151,162]
923 * 11. Add [1,4] 0 Copy [4,6] 1 [163,174]
924 * 12. Add [1,4] 0 Copy [4,6] 2 [175,186]
925 * 13. Add [1,4] 0 Copy [4,6] 3 [187,198]
926 * 14. Add [1,4] 0 Copy [4,6] 4 [199,210]
927 * 15. Add [1,4] 0 Copy [4,6] 5 [211,222]
928 * 16. Add [1,4] 0 Copy [4,6] 6 [223,234]
929 * 17. Copy 4 [0,6] Add [1,3] 0 [235,255]
930 * --------------------------------------------------------------- */
931static const xd3_code_table_desc __alternate_code_table_desc = {
932 23, /* add sizes */
933 5, /* near modes */
934 0, /* same modes */
935 17, /* copy sizes */
936
937 4, /* add-copy max add */
938 6, /* add-copy max cpy, near */
939 0, /* add-copy max cpy, same */
940
941 3, /* copy-add max add */
942 4, /* copy-add max cpy, near */
943 0, /* copy-add max cpy, same */
944
945 /* addcopy */
946 { {6,151,3},{6,163,3},{6,175,3},{6,187,3},{6,199,3},{6,211,3},{6,223,3},{0,0,0},{0,0,0} },
947 /* copyadd */
948 { {4,235,1},{4,238,1},{4,241,1},{4,244,1},{4,247,1},{4,250,1},{4,253,1},{0,0,0},{0,0,0} },
949};
950#endif
951
952/* Computes code table entries of TBL using the specified description. */
953static void
954xd3_build_code_table (const xd3_code_table_desc *desc, xd3_dinst *tbl)
955{
956 usize_t size1, size2, mode;
957 usize_t cpy_modes = 2 + desc->near_modes + desc->same_modes;
958 xd3_dinst *d = tbl;
959
960 (d++)->type1 = XD3_RUN;
961 (d++)->type1 = XD3_ADD;
962
963 for (size1 = 1; size1 <= desc->add_sizes; size1 += 1, d += 1)
964 {
965 d->type1 = XD3_ADD;
966 d->size1 = size1;
967 }
968
969 for (mode = 0; mode < cpy_modes; mode += 1)
970 {
971 (d++)->type1 = XD3_CPY + mode;
972
973 for (size1 = MIN_MATCH; size1 < MIN_MATCH + desc->cpy_sizes; size1 += 1, d += 1)
974 {
975 d->type1 = XD3_CPY + mode;
976 d->size1 = size1;
977 }
978 }
979
980 for (mode = 0; mode < cpy_modes; mode += 1)
981 {
982 for (size1 = 1; size1 <= desc->addcopy_add_max; size1 += 1)
983 {
984 usize_t max = (mode < 2U + desc->near_modes) ?
985 desc->addcopy_near_cpy_max :
986 desc->addcopy_same_cpy_max;
987
988 for (size2 = MIN_MATCH; size2 <= max; size2 += 1, d += 1)
989 {
990 d->type1 = XD3_ADD;
991 d->size1 = size1;
992 d->type2 = XD3_CPY + mode;
993 d->size2 = size2;
994 }
995 }
996 }
997
998 for (mode = 0; mode < cpy_modes; mode += 1)
999 {
1000 usize_t max = (mode < 2U + desc->near_modes) ?
1001 desc->copyadd_near_cpy_max :
1002 desc->copyadd_same_cpy_max;
1003
1004 for (size1 = MIN_MATCH; size1 <= max; size1 += 1)
1005 {
1006 for (size2 = 1; size2 <= desc->copyadd_add_max; size2 += 1, d += 1)
1007 {
1008 d->type1 = XD3_CPY + mode;
1009 d->size1 = size1;
1010 d->type2 = XD3_ADD;
1011 d->size2 = size2;
1012 }
1013 }
1014 }
1015
1016 XD3_ASSERT (d - tbl == 256);
1017}
1018
1019/* This function generates the static default code table. */
1020static const xd3_dinst*
1021xd3_rfc3284_code_table (void)
1022{
1023 static xd3_dinst __rfc3284_code_table[256];
1024
1025 if (__rfc3284_code_table[0].type1 != XD3_RUN)
1026 {
1027 xd3_build_code_table (& __rfc3284_code_table_desc, __rfc3284_code_table);
1028 }
1029
1030 return __rfc3284_code_table;
1031}
1032
1033#if XD3_ENCODER
1034#if GENERIC_ENCODE_TABLES
1035/* This function generates the alternate code table. */
1036static const xd3_dinst*
1037xd3_alternate_code_table (void)
1038{
1039 static xd3_dinst __alternate_code_table[256];
1040
1041 if (__alternate_code_table[0].type1 != XD3_RUN)
1042 {
1043 xd3_build_code_table (& __alternate_code_table_desc, __alternate_code_table);
1044 }
1045
1046 return __alternate_code_table;
1047}
1048
1049/* This function computes the ideal second instruction INST based on
1050 * preceding instruction PREV. If it is possible to issue a double
1051 * instruction based on this pair it sets PREV->code2, otherwise it
1052 * sets INST->code1. */
1053static void
1054xd3_choose_instruction (const xd3_code_table_desc *desc, xd3_rinst *prev, xd3_rinst *inst)
1055{
1056 switch (inst->type)
1057 {
1058 case XD3_RUN:
1059 /* The 0th instruction is RUN */
1060 inst->code1 = 0;
1061 break;
1062
1063 case XD3_ADD:
1064
1065 if (inst->size > desc->add_sizes)
1066 {
1067 /* The first instruction is non-immediate ADD */
1068 inst->code1 = 1;
1069 }
1070 else
1071 {
1072 /* The following ADD_SIZES instructions are immediate ADDs */
1073 inst->code1 = 1 + inst->size;
1074
1075 /* Now check for a possible COPY-ADD double instruction */
1076 if (prev != NULL)
1077 {
1078 int prev_mode = prev->type - XD3_CPY;
1079
1080 /* If previous is a copy. Note: as long as the previous
1081 * is not a RUN instruction, it should be a copy because
1082 * it cannot be an add. This check is more clear. */
1083 if (prev_mode >= 0 && inst->size <= desc->copyadd_add_max)
1084 {
1085 const xd3_code_table_sizes *sizes = & desc->copyadd_max_sizes[prev_mode];
1086
1087 /* This check and the inst->size-<= above are == in
1088 the default table. */
1089 if (prev->size <= sizes->cpy_max)
1090 {
1091 /* The second and third exprs are 0 in the
1092 default table. */
1093 prev->code2 = sizes->offset +
1094 (sizes->mult * (prev->size - MIN_MATCH)) +
1095 (inst->size - MIN_ADD);
1096 }
1097 }
1098 }
1099 }
1100 break;
1101
1102 default:
1103 {
1104 int mode = inst->type - XD3_CPY;
1105
1106 /* The large copy instruction is offset by the run, large add,
1107 * and immediate adds, then multipled by the number of
1108 * immediate copies plus one (the large copy) (i.e., if there
1109 * are 15 immediate copy instructions then there are 16 copy
1110 * instructions per mode). */
1111 inst->code1 = 2 + desc->add_sizes + (1 + desc->cpy_sizes) * mode;
1112
1113 /* Now if the copy is short enough for an immediate instruction. */
1114 if (inst->size < MIN_MATCH + desc->cpy_sizes &&
1115 /* TODO: there needs to be a more comprehensive test for this
1116 * boundary condition, merge is now exercising code in which
1117 * size < MIN_MATCH is possible and it's unclear if the above
1118 * size < (MIN_MATCH + cpy_sizes) should be a <= from inspection
1119 * of the default table version below. */
1120 inst->size >= MIN_MATCH)
1121 {
1122 inst->code1 += inst->size + 1 - MIN_MATCH;
1123
1124 /* Now check for a possible ADD-COPY double instruction. */
1125 if ( (prev != NULL) &&
1126 (prev->type == XD3_ADD) &&
1127 (prev->size <= desc->addcopy_add_max) )
1128 {
1129 const xd3_code_table_sizes *sizes = & desc->addcopy_max_sizes[mode];
1130
1131 if (inst->size <= sizes->cpy_max)
1132 {
1133 prev->code2 = sizes->offset +
1134 (sizes->mult * (prev->size - MIN_ADD)) +
1135 (inst->size - MIN_MATCH);
1136 }
1137 }
1138 }
1139 }
1140 }
1141}
1142#else /* GENERIC_ENCODE_TABLES */
1143
1144/* This version of xd3_choose_instruction is hard-coded for the default
1145 table. */
1146static void
1147xd3_choose_instruction (xd3_rinst *prev, xd3_rinst *inst)
1148{
1149 switch (inst->type)
1150 {
1151 case XD3_RUN:
1152 inst->code1 = 0;
1153 break;
1154
1155 case XD3_ADD:
1156 inst->code1 = 1;
1157
1158 if (inst->size <= 17)
1159 {
1160 inst->code1 += inst->size;
1161
1162 if ( (inst->size == 1) &&
1163 (prev != NULL) &&
1164 (prev->size == 4) &&
1165 (prev->type >= XD3_CPY) )
1166 {
1167 prev->code2 = 247 + (prev->type - XD3_CPY);
1168 }
1169 }
1170
1171 break;
1172
1173 default:
1174 {
1175 int mode = inst->type - XD3_CPY;
1176
1177 XD3_ASSERT (inst->type >= XD3_CPY && inst->type < 12);
1178
1179 inst->code1 = 19 + 16 * mode;
1180
1181 if (inst->size <= 18 && inst->size >= 4)
1182 {
1183 inst->code1 += inst->size - 3;
1184
1185 if ( (prev != NULL) &&
1186 (prev->type == XD3_ADD) &&
1187 (prev->size <= 4) )
1188 {
1189 if ( (inst->size <= 6) &&
1190 (mode <= 5) )
1191 {
1192 prev->code2 = 163 + (mode * 12) + (3 * (prev->size - 1)) + (inst->size - 4);
1193
1194 XD3_ASSERT (prev->code2 <= 234);
1195 }
1196 else if ( (inst->size == 4) &&
1197 (mode >= 6) )
1198 {
1199 prev->code2 = 235 + ((mode - 6) * 4) + (prev->size - 1);
1200
1201 XD3_ASSERT (prev->code2 <= 246);
1202 }
1203 }
1204 }
1205
1206 XD3_ASSERT (inst->code1 <= 162);
1207 }
1208 break;
1209 }
1210}
1211#endif /* GENERIC_ENCODE_TABLES */
1212
1213/***********************************************************************
1214 Instruction table encoder/decoder
1215 ***********************************************************************/
1216
1217#if GENERIC_ENCODE_TABLES
1218#if GENERIC_ENCODE_TABLES_COMPUTE == 0
1219
1220/* In this case, we hard-code the result of
1221 * compute_code_table_encoding for each alternate code table,
1222 * presuming that saves time/space. This has been 131 bytes, but
1223 * secondary compression was turned off. */
1224static const uint8_t __alternate_code_table_compressed[178] =
1225{0xd6,0xc3,0xc4,0x00,0x00,0x01,0x8a,0x6f,0x40,0x81,0x27,0x8c,0x00,0x00,0x4a,0x4a,0x0d,0x02,0x01,0x03,
12260x01,0x03,0x00,0x01,0x00,0x00,0x01,0x02,0x03,0x04,0x05,0x06,0x07,0x08,0x09,0x0a,0x0b,0x0c,0x0d,0x0e,
12270x0f,0x10,0x11,0x12,0x13,0x14,0x15,0x16,0x17,0x00,0x01,0x01,0x01,0x02,0x02,0x02,0x03,0x03,0x03,0x04,
12280x04,0x04,0x04,0x00,0x04,0x05,0x06,0x01,0x02,0x03,0x00,0x01,0x02,0x03,0x04,0x05,0x06,0x05,0x05,0x05,
12290x06,0x06,0x06,0x00,0x01,0x02,0x03,0x04,0x05,0x06,0x00,0x02,0x00,0x18,0x13,0x63,0x00,0x1b,0x00,0x54,
12300x00,0x15,0x23,0x6f,0x00,0x28,0x13,0x54,0x00,0x15,0x01,0x1a,0x31,0x23,0x6c,0x0d,0x23,0x48,0x00,0x15,
12310x93,0x6f,0x00,0x28,0x04,0x23,0x51,0x04,0x32,0x00,0x2b,0x00,0x12,0x00,0x12,0x00,0x12,0x00,0x12,0x00,
12320x12,0x00,0x12,0x53,0x57,0x9c,0x07,0x43,0x6f,0x00,0x34,0x00,0x0c,0x00,0x0c,0x00,0x0c,0x00,0x0c,0x00,
12330x0c,0x00,0x0c,0x00,0x15,0x00,0x82,0x6f,0x00,0x15,0x12,0x0c,0x00,0x03,0x03,0x00,0x06,0x00,};
1234
1235static int
1236xd3_compute_alternate_table_encoding (xd3_stream *stream, const uint8_t **data, usize_t *size)
1237{
1238 (*data) = __alternate_code_table_compressed;
1239 (*size) = sizeof (__alternate_code_table_compressed);
1240 return 0;
1241}
1242
1243#else
1244
1245/* The alternate code table will be computed and stored here. */
1246static uint8_t __alternate_code_table_compressed[CODE_TABLE_VCDIFF_SIZE];
1247static usize_t __alternate_code_table_compressed_size;
1248
1249/* This function generates a delta describing the code table for
1250 * encoding within a VCDIFF file. This function is NOT thread safe
1251 * because it is only intended that this function is used to generate
1252 * statically-compiled strings. */
1253int xd3_compute_code_table_encoding (xd3_stream *in_stream,
1254 const xd3_dinst *code_table,
1255 uint8_t *comp_string,
1256 usize_t *comp_string_size)
1257{
1258 /* TODO: use xd3_encode_memory() */
1259 uint8_t dflt_string[CODE_TABLE_STRING_SIZE];
1260 uint8_t code_string[CODE_TABLE_STRING_SIZE];
1261 xd3_stream stream;
1262 xd3_source source;
1263 xd3_config config;
1264 int ret;
1265
1266 memset (& source, 0, sizeof (source));
1267
1268 xd3_compute_code_table_string (xd3_rfc3284_code_table (), dflt_string);
1269 xd3_compute_code_table_string (code_table, code_string);
1270
1271 /* Use DJW secondary compression if it is on by default. This saves
1272 * about 20 bytes. */
1273 xd3_init_config (& config, XD3_FLUSH | (SECONDARY_DJW ? XD3_SEC_DJW : 0));
1274
1275 /* Be exhaustive. */
1276 config.sprevsz = 1<<11;
1277 config.srcwin_maxsz = CODE_TABLE_STRING_SIZE;
1278
1279 config.smatch_cfg = XD3_SMATCH_SOFT;
1280 config.smatcher_soft.large_look = 4;
1281 config.smatcher_soft.large_step = 1;
1282 config.smatcher_soft.small_look = 4;
1283 config.smatcher_soft.small_chain = CODE_TABLE_STRING_SIZE;
1284 config.smatcher_soft.small_lchain = CODE_TABLE_STRING_SIZE;
1285 config.smatcher_soft.max_lazy = CODE_TABLE_STRING_SIZE;
1286 config.smatcher_soft.long_enough = CODE_TABLE_STRING_SIZE;
1287
1288 if ((ret = xd3_config_stream (& stream, & config))) { goto fail; }
1289
1290 source.size = CODE_TABLE_STRING_SIZE;
1291 source.blksize = CODE_TABLE_STRING_SIZE;
1292 source.onblk = CODE_TABLE_STRING_SIZE;
1293 source.name = "";
1294 source.curblk = dflt_string;
1295 source.curblkno = 0;
1296
1297 if ((ret = xd3_set_source (& stream, & source))) { goto fail; }
1298
1299 if ((ret = xd3_encode_stream (& stream, code_string, CODE_TABLE_STRING_SIZE,
1300 comp_string, comp_string_size, CODE_TABLE_VCDIFF_SIZE))) { goto fail; }
1301
1302 fail:
1303
1304 in_stream->msg = stream.msg;
1305 xd3_free_stream (& stream);
1306 return ret;
1307}
1308
1309/* Compute a delta between alternate and rfc3284 tables. As soon as
1310 * another alternate table is added, this code should become generic.
1311 * For now there is only one alternate table for testing. */
1312static int
1313xd3_compute_alternate_table_encoding (xd3_stream *stream, const uint8_t **data, usize_t *size)
1314{
1315 int ret;
1316
1317 if (__alternate_code_table_compressed[0] == 0)
1318 {
1319 if ((ret = xd3_compute_code_table_encoding (stream, xd3_alternate_code_table (),
1320 __alternate_code_table_compressed,
1321 & __alternate_code_table_compressed_size)))
1322 {
1323 return ret;
1324 }
1325
1326 /* During development of a new code table, enable this variable to print
1327 * the new static contents and determine its size. At run time the
1328 * table will be filled in appropriately, but at least it should have
1329 * the proper size beforehand. */
1330#if GENERIC_ENCODE_TABLES_COMPUTE_PRINT
1331 {
1332 int i;
1333
1334 DP(RINT, "\nstatic const usize_t __alternate_code_table_compressed_size = %u;\n",
1335 __alternate_code_table_compressed_size);
1336
1337 DP(RINT, "static const uint8_t __alternate_code_table_compressed[%u] =\n{",
1338 __alternate_code_table_compressed_size);
1339
1340 for (i = 0; i < __alternate_code_table_compressed_size; i += 1)
1341 {
1342 DP(RINT, "0x%02x,", __alternate_code_table_compressed[i]);
1343 if ((i % 20) == 19) { DP(RINT, "\n"); }
1344 }
1345
1346 DP(RINT, "};\n");
1347 }
1348#endif
1349 }
1350
1351 (*data) = __alternate_code_table_compressed;
1352 (*size) = __alternate_code_table_compressed_size;
1353
1354 return 0;
1355}
1356#endif /* GENERIC_ENCODE_TABLES_COMPUTE != 0 */
1357#endif /* GENERIC_ENCODE_TABLES */
1358
1359#endif /* XD3_ENCODER */
1360
1361/* This function generates the 1536-byte string specified in sections 5.4 and
1362 * 7 of rfc3284, which is used to represent a code table within a VCDIFF
1363 * file. */
1364void xd3_compute_code_table_string (const xd3_dinst *code_table, uint8_t *str)
1365{
1366 int i, s;
1367
1368 XD3_ASSERT (CODE_TABLE_STRING_SIZE == 6 * 256);
1369
1370 for (s = 0; s < 6; s += 1)
1371 {
1372 for (i = 0; i < 256; i += 1)
1373 {
1374 switch (s)
1375 {
1376 case 0: *str++ = (code_table[i].type1 >= XD3_CPY ? XD3_CPY : code_table[i].type1); break;
1377 case 1: *str++ = (code_table[i].type2 >= XD3_CPY ? XD3_CPY : code_table[i].type2); break;
1378 case 2: *str++ = (code_table[i].size1); break;
1379 case 3: *str++ = (code_table[i].size2); break;
1380 case 4: *str++ = (code_table[i].type1 >= XD3_CPY ? code_table[i].type1 - XD3_CPY : 0); break;
1381 case 5: *str++ = (code_table[i].type2 >= XD3_CPY ? code_table[i].type2 - XD3_CPY : 0); break;
1382 }
1383 }
1384 }
1385}
1386
1387/* This function translates the code table string into the internal representation. The
1388 * stream's near and same-modes should already be set. */
1389static int
1390xd3_apply_table_string (xd3_stream *stream, const uint8_t *code_string)
1391{
1392 int i, s;
1393 int modes = TOTAL_MODES (stream);
1394 xd3_dinst *code_table;
1395
1396 if ((code_table = stream->code_table_alloc =
1397 (xd3_dinst*) xd3_alloc (stream, sizeof (xd3_dinst), 256)) == NULL)
1398 {
1399 return ENOMEM;
1400 }
1401
1402 for (s = 0; s < 6; s += 1)
1403 {
1404 for (i = 0; i < 256; i += 1)
1405 {
1406 switch (s)
1407 {
1408 case 0:
1409 if (*code_string > XD3_CPY)
1410 {
1411 stream->msg = "invalid code-table opcode";
1412 return XD3_INTERNAL;
1413 }
1414 code_table[i].type1 = *code_string++;
1415 break;
1416 case 1:
1417 if (*code_string > XD3_CPY)
1418 {
1419 stream->msg = "invalid code-table opcode";
1420 return XD3_INTERNAL;
1421 }
1422 code_table[i].type2 = *code_string++;
1423 break;
1424 case 2:
1425 if (*code_string != 0 && code_table[i].type1 == XD3_NOOP)
1426 {
1427 stream->msg = "invalid code-table size";
1428 return XD3_INTERNAL;
1429 }
1430 code_table[i].size1 = *code_string++;
1431 break;
1432 case 3:
1433 if (*code_string != 0 && code_table[i].type2 == XD3_NOOP)
1434 {
1435 stream->msg = "invalid code-table size";
1436 return XD3_INTERNAL;
1437 }
1438 code_table[i].size2 = *code_string++;
1439 break;
1440 case 4:
1441 if (*code_string >= modes)
1442 {
1443 stream->msg = "invalid code-table mode";
1444 return XD3_INTERNAL;
1445 }
1446 if (*code_string != 0 && code_table[i].type1 != XD3_CPY)
1447 {
1448 stream->msg = "invalid code-table mode";
1449 return XD3_INTERNAL;
1450 }
1451 code_table[i].type1 += *code_string++;
1452 break;
1453 case 5:
1454 if (*code_string >= modes)
1455 {
1456 stream->msg = "invalid code-table mode";
1457 return XD3_INTERNAL;
1458 }
1459 if (*code_string != 0 && code_table[i].type2 != XD3_CPY)
1460 {
1461 stream->msg = "invalid code-table mode";
1462 return XD3_INTERNAL;
1463 }
1464 code_table[i].type2 += *code_string++;
1465 break;
1466 }
1467 }
1468 }
1469
1470 stream->code_table = code_table;
1471 return 0;
1472}
1473
1474/* This function applies a code table delta and returns an actual code table. */
1475static int
1476xd3_apply_table_encoding (xd3_stream *in_stream, const uint8_t *data, usize_t size)
1477{
1478 uint8_t dflt_string[CODE_TABLE_STRING_SIZE];
1479 uint8_t code_string[CODE_TABLE_STRING_SIZE];
1480 usize_t code_size;
1481 xd3_stream stream;
1482 xd3_source source;
1483 int ret;
1484
1485 /* The default code table string can be cached if alternate code tables ever become
1486 * popular. */
1487 xd3_compute_code_table_string (xd3_rfc3284_code_table (), dflt_string);
1488
1489 source.size = CODE_TABLE_STRING_SIZE;
1490 source.blksize = CODE_TABLE_STRING_SIZE;
1491 source.onblk = CODE_TABLE_STRING_SIZE;
1492 source.name = "rfc3284 code table";
1493 source.curblk = dflt_string;
1494 source.curblkno = 0;
1495
1496 if ((ret = xd3_config_stream (& stream, NULL)) ||
1497 (ret = xd3_set_source (& stream, & source)) ||
1498 (ret = xd3_decode_stream (& stream, data, size, code_string, & code_size, sizeof (code_string))))
1499 {
1500 in_stream->msg = stream.msg;
1501 goto fail;
1502 }
1503
1504 if (code_size != sizeof (code_string))
1505 {
1506 stream.msg = "corrupt code-table encoding";
1507 ret = XD3_INTERNAL;
1508 goto fail;
1509 }
1510
1511 if ((ret = xd3_apply_table_string (in_stream, code_string))) { goto fail; }
1512
1513 fail:
1514
1515 xd3_free_stream (& stream);
1516 return ret;
1517}
1518
1519/***********************************************************************/
1520
1521static inline void
1522xd3_swap_uint8p (uint8_t** p1, uint8_t** p2)
1523{
1524 uint8_t *t = (*p1);
1525 (*p1) = (*p2);
1526 (*p2) = t;
1527}
1528
1529static inline void
1530xd3_swap_usize_t (usize_t* p1, usize_t* p2)
1531{
1532 usize_t t = (*p1);
1533 (*p1) = (*p2);
1534 (*p2) = t;
1535}
1536
1537/* It's not constant time, but it computes the log. */
1538static int
1539xd3_check_pow2 (usize_t value, usize_t *logof)
1540{
1541 usize_t x = 1;
1542 usize_t nolog;
1543 if (logof == NULL) {
1544 logof = &nolog;
1545 }
1546
1547 *logof = 0;
1548
1549 for (; x != 0; x <<= 1, *logof += 1)
1550 {
1551 if (x == value)
1552 {
1553 return 0;
1554 }
1555 }
1556
1557 return XD3_INTERNAL;
1558}
1559
1560static usize_t
1561xd3_pow2_roundup (usize_t x)
1562{
1563 usize_t i = 1;
1564 while (x > i) {
1565 i <<= 1;
1566 }
1567 return i;
1568}
1569
1570static usize_t
1571xd3_round_blksize (usize_t sz, usize_t blksz)
1572{
1573 usize_t mod = sz & (blksz-1);
1574
1575 XD3_ASSERT (xd3_check_pow2 (blksz, NULL) == 0);
1576
1577 return mod ? (sz + (blksz - mod)) : sz;
1578}
1579
1580/***********************************************************************
1581 Adler32 stream function: code copied from Zlib, defined in RFC1950
1582 ***********************************************************************/
1583
1584#define A32_BASE 65521L /* Largest prime smaller than 2^16 */
1585#define A32_NMAX 5552 /* NMAX is the largest n such that 255n(n+1)/2 + (n+1)(BASE-1) <= 2^32-1 */
1586
1587#define A32_DO1(buf,i) {s1 += buf[i]; s2 += s1;}
1588#define A32_DO2(buf,i) A32_DO1(buf,i); A32_DO1(buf,i+1);
1589#define A32_DO4(buf,i) A32_DO2(buf,i); A32_DO2(buf,i+2);
1590#define A32_DO8(buf,i) A32_DO4(buf,i); A32_DO4(buf,i+4);
1591#define A32_DO16(buf) A32_DO8(buf,0); A32_DO8(buf,8);
1592
1593static unsigned long adler32 (unsigned long adler, const uint8_t *buf, usize_t len)
1594{
1595 unsigned long s1 = adler & 0xffff;
1596 unsigned long s2 = (adler >> 16) & 0xffff;
1597 int k;
1598
1599 while (len > 0)
1600 {
1601 k = (len < A32_NMAX) ? len : A32_NMAX;
1602 len -= k;
1603
1604 while (k >= 16)
1605 {
1606 A32_DO16(buf);
1607 buf += 16;
1608 k -= 16;
1609 }
1610
1611 if (k != 0)
1612 {
1613 do
1614 {
1615 s1 += *buf++;
1616 s2 += s1;
1617 }
1618 while (--k);
1619 }
1620
1621 s1 %= A32_BASE;
1622 s2 %= A32_BASE;
1623 }
1624
1625 return (s2 << 16) | s1;
1626}
1627
1628/***********************************************************************
1629 Run-length function
1630 ***********************************************************************/
1631
1632#if XD3_ENCODER
1633static int
1634xd3_comprun (const uint8_t *seg, int slook, uint8_t *run_cp)
1635{
1636 int i;
1637 int run_l = 0;
1638 uint8_t run_c = 0;
1639
1640 for (i = 0; i < slook; i += 1)
1641 {
1642 NEXTRUN(seg[i]);
1643 }
1644
1645 (*run_cp) = run_c;
1646
1647 return run_l;
1648}
1649#endif
1650
1651/***********************************************************************
1652 Basic encoder/decoder functions
1653 ***********************************************************************/
1654
1655static inline int
1656xd3_decode_byte (xd3_stream *stream, usize_t *val)
1657{
1658 if (stream->avail_in == 0)
1659 {
1660 stream->msg = "further input required";
1661 return XD3_INPUT;
1662 }
1663
1664 (*val) = stream->next_in[0];
1665
1666 DECODE_INPUT (1);
1667 return 0;
1668}
1669
1670static inline int
1671xd3_decode_bytes (xd3_stream *stream, uint8_t *buf, usize_t *pos, usize_t size)
1672{
1673 usize_t want;
1674 usize_t take;
1675
1676 /* Note: The case where (*pos == size) happens when a zero-length appheader or code
1677 * table is transmitted, but there is nothing in the standard against that. */
1678
1679 while (*pos < size)
1680 {
1681 if (stream->avail_in == 0)
1682 {
1683 stream->msg = "further input required";
1684 return XD3_INPUT;
1685 }
1686
1687 want = size - *pos;
1688 take = min (want, stream->avail_in);
1689
1690 memcpy (buf + *pos, stream->next_in, take);
1691
1692 DECODE_INPUT (take);
1693 (*pos) += take;
1694 }
1695
1696 return 0;
1697}
1698
1699#if XD3_ENCODER
1700static inline int
1701xd3_emit_byte (xd3_stream *stream,
1702 xd3_output **outputp,
1703 uint8_t code)
1704{
1705 xd3_output *output = (*outputp);
1706
1707 if (output->next == output->avail)
1708 {
1709 xd3_output *aoutput;
1710
1711 if ((aoutput = xd3_alloc_output (stream, output)) == NULL)
1712 {
1713 return ENOMEM;
1714 }
1715
1716 output = (*outputp) = aoutput;
1717 }
1718
1719 output->base[output->next++] = code;
1720
1721 return 0;
1722}
1723
1724static inline int
1725xd3_emit_bytes (xd3_stream *stream,
1726 xd3_output **outputp,
1727 const uint8_t *base,
1728 usize_t size)
1729{
1730 xd3_output *output = (*outputp);
1731
1732 do
1733 {
1734 usize_t take;
1735
1736 if (output->next == output->avail)
1737 {
1738 xd3_output *aoutput;
1739
1740 if ((aoutput = xd3_alloc_output (stream, output)) == NULL)
1741 {
1742 return ENOMEM;
1743 }
1744
1745 output = (*outputp) = aoutput;
1746 }
1747
1748 take = min (output->avail - output->next, size);
1749
1750 memcpy (output->base + output->next, base, take);
1751
1752 output->next += take;
1753 size -= take;
1754 base += take;
1755 }
1756 while (size > 0);
1757
1758 return 0;
1759}
1760#endif /* XD3_ENCODER */
1761
1762/*********************************************************************
1763 Integer encoder/decoder functions
1764 **********************************************************************/
1765
1766#define DECODE_INTEGER_TYPE(PART,OFLOW) \
1767 while (stream->avail_in != 0) \
1768 { \
1769 usize_t next = stream->next_in[0]; \
1770 \
1771 DECODE_INPUT(1); \
1772 \
1773 if (PART & OFLOW) \
1774 { \
1775 stream->msg = "overflow in decode_integer"; \
1776 return XD3_INVALID_INPUT; \
1777 } \
1778 \
1779 PART = (PART << 7) | (next & 127); \
1780 \
1781 if ((next & 128) == 0) \
1782 { \
1783 (*val) = PART; \
1784 PART = 0; \
1785 return 0; \
1786 } \
1787 } \
1788 \
1789 stream->msg = "further input required"; \
1790 return XD3_INPUT
1791
1792#define READ_INTEGER_TYPE(TYPE, OFLOW) \
1793 TYPE val = 0; \
1794 const uint8_t *inp = (*inpp); \
1795 usize_t next; \
1796 \
1797 do \
1798 { \
1799 if (inp == max) \
1800 { \
1801 stream->msg = "end-of-input in read_integer"; \
1802 return XD3_INVALID_INPUT; \
1803 } \
1804 \
1805 if (val & OFLOW) \
1806 { \
1807 stream->msg = "overflow in read_intger"; \
1808 return XD3_INVALID_INPUT; \
1809 } \
1810 \
1811 next = (*inp++); \
1812 val = (val << 7) | (next & 127); \
1813 } \
1814 while (next & 128); \
1815 \
1816 (*valp) = val; \
1817 (*inpp) = inp; \
1818 \
1819 return 0
1820
1821#define EMIT_INTEGER_TYPE() \
1822 /* max 64-bit value in base-7 encoding is 9.1 bytes */ \
1823 uint8_t buf[10]; \
1824 usize_t bufi = 10; \
1825 \
1826 XD3_ASSERT (num >= 0); \
1827 \
1828 /* This loop performs division and turns on all MSBs. */ \
1829 do \
1830 { \
1831 buf[--bufi] = (num & 127) | 128; \
1832 num >>= 7; \
1833 } \
1834 while (num != 0); \
1835 \
1836 /* Turn off MSB of the last byte. */ \
1837 buf[9] &= 127; \
1838 \
1839 XD3_ASSERT (bufi >= 0); \
1840 \
1841 return xd3_emit_bytes (stream, output, buf + bufi, 10 - bufi)
1842
1843#define IF_SIZEOF32(x) if (num < (1U << (7 * (x)))) return (x);
1844#define IF_SIZEOF64(x) if (num < (1ULL << (7 * (x)))) return (x);
1845
1846#if USE_UINT32
1847static inline uint32_t
1848xd3_sizeof_uint32_t (uint32_t num)
1849{
1850 IF_SIZEOF32(1);
1851 IF_SIZEOF32(2);
1852 IF_SIZEOF32(3);
1853 IF_SIZEOF32(4);
1854 return 5;
1855}
1856
1857static inline int
1858xd3_decode_uint32_t (xd3_stream *stream, uint32_t *val)
1859{ DECODE_INTEGER_TYPE (stream->dec_32part, UINT32_OFLOW_MASK); }
1860
1861static inline int
1862xd3_read_uint32_t (xd3_stream *stream, const uint8_t **inpp,
1863 const uint8_t *max, uint32_t *valp)
1864{ READ_INTEGER_TYPE (uint32_t, UINT32_OFLOW_MASK); }
1865
1866#if XD3_ENCODER
1867static inline int
1868xd3_emit_uint32_t (xd3_stream *stream, xd3_output **output, uint32_t num)
1869{ EMIT_INTEGER_TYPE (); }
1870#endif
1871#endif
1872
1873#if USE_UINT64
1874static inline int
1875xd3_decode_uint64_t (xd3_stream *stream, uint64_t *val)
1876{ DECODE_INTEGER_TYPE (stream->dec_64part, UINT64_OFLOW_MASK); }
1877
1878#if XD3_ENCODER
1879static inline int
1880xd3_emit_uint64_t (xd3_stream *stream, xd3_output **output, uint64_t num)
1881{ EMIT_INTEGER_TYPE (); }
1882#endif
1883
1884/* These are tested but not used */
1885#if REGRESSION_TEST
1886static int
1887xd3_read_uint64_t (xd3_stream *stream, const uint8_t **inpp,
1888 const uint8_t *max, uint64_t *valp)
1889{ READ_INTEGER_TYPE (uint64_t, UINT64_OFLOW_MASK); }
1890
1891static uint32_t
1892xd3_sizeof_uint64_t (uint64_t num)
1893{
1894 IF_SIZEOF64(1);
1895 IF_SIZEOF64(2);
1896 IF_SIZEOF64(3);
1897 IF_SIZEOF64(4);
1898 IF_SIZEOF64(5);
1899 IF_SIZEOF64(6);
1900 IF_SIZEOF64(7);
1901 IF_SIZEOF64(8);
1902 IF_SIZEOF64(9);
1903
1904 return 10;
1905}
1906#endif
1907
1908#endif
1909
1910/***********************************************************************
1911 Address cache stuff
1912 ***********************************************************************/
1913
1914static int
1915xd3_alloc_cache (xd3_stream *stream)
1916{
1917 if (stream->acache.near_array != NULL)
1918 {
1919 xd3_free (stream, stream->acache.near_array);
1920 }
1921
1922 if (stream->acache.same_array != NULL)
1923 {
1924 xd3_free (stream, stream->acache.same_array);
1925 }
1926
1927 if (((stream->acache.s_near > 0) &&
1928 (stream->acache.near_array = (usize_t*)
1929 xd3_alloc (stream, stream->acache.s_near, sizeof (usize_t)))
1930 == NULL) ||
1931 ((stream->acache.s_same > 0) &&
1932 (stream->acache.same_array = (usize_t*)
1933 xd3_alloc (stream, stream->acache.s_same * 256, sizeof (usize_t)))
1934 == NULL))
1935 {
1936 return ENOMEM;
1937 }
1938
1939 return 0;
1940}
1941
1942void
1943xd3_init_cache (xd3_addr_cache* acache)
1944{
1945 if (acache->s_near > 0)
1946 {
1947 memset (acache->near_array, 0, acache->s_near * sizeof (usize_t));
1948 acache->next_slot = 0;
1949 }
1950
1951 if (acache->s_same > 0)
1952 {
1953 memset (acache->same_array, 0, acache->s_same * 256 * sizeof (usize_t));
1954 }
1955}
1956
1957static void
1958xd3_update_cache (xd3_addr_cache* acache, usize_t addr)
1959{
1960 if (acache->s_near > 0)
1961 {
1962 acache->near_array[acache->next_slot] = addr;
1963 acache->next_slot = (acache->next_slot + 1) % acache->s_near;
1964 }
1965
1966 if (acache->s_same > 0)
1967 {
1968 acache->same_array[addr % (acache->s_same*256)] = addr;
1969 }
1970}
1971
1972#if XD3_ENCODER
1973/* OPT: this gets called a lot, can it be optimized? */
1974static int
1975xd3_encode_address (xd3_stream *stream, usize_t addr, usize_t here, uint8_t* mode)
1976{
1977 usize_t d, bestd;
1978 usize_t i, bestm, ret;
1979 xd3_addr_cache* acache = & stream->acache;
1980
1981#define SMALLEST_INT(x) do { if (((x) & ~127) == 0) { goto good; } } while (0)
1982
1983 /* Attempt to find the address mode that yields the smallest integer value
1984 * for "d", the encoded address value, thereby minimizing the encoded size
1985 * of the address. */
1986 bestd = addr;
1987 bestm = VCD_SELF;
1988
1989 XD3_ASSERT (addr < here);
1990
1991 SMALLEST_INT (bestd);
1992
1993 if ((d = here-addr) < bestd)
1994 {
1995 bestd = d;
1996 bestm = VCD_HERE;
1997
1998 SMALLEST_INT (bestd);
1999 }
2000
2001 for (i = 0; i < acache->s_near; i += 1)
2002 {
2003 d = addr - acache->near_array[i];
2004
2005 if (d >= 0 && d < bestd)
2006 {
2007 bestd = d;
2008 bestm = i+2; /* 2 counts the VCD_SELF, VCD_HERE modes */
2009
2010 SMALLEST_INT (bestd);
2011 }
2012 }
2013
2014 if (acache->s_same > 0 && acache->same_array[d = addr%(acache->s_same*256)] == addr)
2015 {
2016 bestd = d%256;
2017 bestm = acache->s_near + 2 + d/256; /* 2 + s_near offsets past the VCD_NEAR modes */
2018
2019 if ((ret = xd3_emit_byte (stream, & ADDR_TAIL (stream), bestd))) { return ret; }
2020 }
2021 else
2022 {
2023 good:
2024
2025 if ((ret = xd3_emit_size (stream, & ADDR_TAIL (stream), bestd))) { return ret; }
2026 }
2027
2028 xd3_update_cache (acache, addr);
2029
2030 (*mode) += bestm;
2031
2032 return 0;
2033}
2034#endif
2035
2036static int
2037xd3_decode_address (xd3_stream *stream, usize_t here,
2038 usize_t mode, const uint8_t **inpp,
2039 const uint8_t *max, uint32_t *valp)
2040{
2041 int ret;
2042 usize_t same_start = 2 + stream->acache.s_near;
2043
2044 if (mode < same_start)
2045 {
2046 if ((ret = xd3_read_size (stream, inpp, max, valp))) { return ret; }
2047
2048 switch (mode)
2049 {
2050 case VCD_SELF:
2051 break;
2052 case VCD_HERE:
2053 (*valp) = here - (*valp);
2054 break;
2055 default:
2056 (*valp) += stream->acache.near_array[mode - 2];
2057 break;
2058 }
2059 }
2060 else
2061 {
2062 if (*inpp == max)
2063 {
2064 stream->msg = "address underflow";
2065 return XD3_INVALID_INPUT;
2066 }
2067
2068 mode -= same_start;
2069
2070 (*valp) = stream->acache.same_array[mode*256 + (**inpp)];
2071
2072 (*inpp) += 1;
2073 }
2074
2075 xd3_update_cache (& stream->acache, *valp);
2076
2077 return 0;
2078}
2079
2080/***********************************************************************
2081 Alloc/free
2082***********************************************************************/
2083
2084static void*
2085__xd3_alloc_func (void* opaque, usize_t items, usize_t size)
2086{
2087 return malloc (items * size);
2088}
2089
2090static void
2091__xd3_free_func (void* opaque, void* address)
2092{
2093 free (address);
2094}
2095
2096static void*
2097xd3_alloc (xd3_stream *stream,
2098 usize_t elts,
2099 usize_t size)
2100{
2101 void *a = stream->alloc (stream->opaque, elts, size);
2102
2103 if (a != NULL)
2104 {
2105 IF_DEBUG (stream->alloc_cnt += 1);
2106 IF_DEBUG2 (DP(RINT "[stream %p malloc] size %u ptr %p\n",
2107 stream, elts * size, a));
2108 }
2109 else
2110 {
2111 stream->msg = "out of memory";
2112 }
2113
2114 return a;
2115}
2116
2117static void
2118xd3_free (xd3_stream *stream,
2119 void *ptr)
2120{
2121 if (ptr != NULL)
2122 {
2123 IF_DEBUG (stream->free_cnt += 1);
2124 XD3_ASSERT (stream->free_cnt <= stream->alloc_cnt);
2125 IF_DEBUG2 (DP(RINT "[stream %p free] %p\n",
2126 stream, ptr));
2127 stream->free (stream->opaque, ptr);
2128 }
2129}
2130
2131#if XD3_ENCODER
2132static void*
2133xd3_alloc0 (xd3_stream *stream,
2134 usize_t elts,
2135 usize_t size)
2136{
2137 void *a = xd3_alloc (stream, elts, size);
2138
2139 if (a != NULL)
2140 {
2141 memset (a, 0, elts * size);
2142 }
2143
2144 return a;
2145}
2146
2147static xd3_output*
2148xd3_alloc_output (xd3_stream *stream,
2149 xd3_output *old_output)
2150{
2151 xd3_output *output;
2152 uint8_t *base;
2153
2154 if (stream->enc_free != NULL)
2155 {
2156 output = stream->enc_free;
2157 stream->enc_free = output->next_page;
2158 }
2159 else
2160 {
2161 if ((output = (xd3_output*) xd3_alloc (stream, 1, sizeof (xd3_output))) == NULL)
2162 {
2163 return NULL;
2164 }
2165
2166 if ((base = (uint8_t*) xd3_alloc (stream, XD3_ALLOCSIZE, sizeof (uint8_t))) == NULL)
2167 {
2168 xd3_free (stream, output);
2169 return NULL;
2170 }
2171
2172 output->base = base;
2173 output->avail = XD3_ALLOCSIZE;
2174 }
2175
2176 output->next = 0;
2177
2178 if (old_output)
2179 {
2180 old_output->next_page = output;
2181 }
2182
2183 output->next_page = NULL;
2184
2185 return output;
2186}
2187
2188static usize_t
2189xd3_sizeof_output (xd3_output *output)
2190{
2191 usize_t s = 0;
2192
2193 for (; output; output = output->next_page)
2194 {
2195 s += output->next;
2196 }
2197
2198 return s;
2199}
2200
2201static void
2202xd3_freelist_output (xd3_stream *stream,
2203 xd3_output *output)
2204{
2205 xd3_output *tmp;
2206
2207 while (output)
2208 {
2209 tmp = output;
2210 output = output->next_page;
2211
2212 tmp->next = 0;
2213 tmp->next_page = stream->enc_free;
2214 stream->enc_free = tmp;
2215 }
2216}
2217
2218static void
2219xd3_free_output (xd3_stream *stream,
2220 xd3_output *output)
2221{
2222 xd3_output *next;
2223
2224 again:
2225 if (output == NULL)
2226 {
2227 return;
2228 }
2229
2230 next = output->next_page;
2231
2232 xd3_free (stream, output->base);
2233 xd3_free (stream, output);
2234
2235 output = next;
2236 goto again;
2237}
2238#endif /* XD3_ENCODER */
2239
2240void
2241xd3_free_stream (xd3_stream *stream)
2242{
2243 xd3_iopt_buflist *blist = stream->iopt_alloc;
2244
2245 while (blist != NULL)
2246 {
2247 xd3_iopt_buflist *tmp = blist;
2248 blist = blist->next;
2249 xd3_free (stream, tmp->buffer);
2250 xd3_free (stream, tmp);
2251 }
2252
2253 xd3_free (stream, stream->large_table);
2254 xd3_free (stream, stream->small_table);
2255 xd3_free (stream, stream->small_prev);
2256
2257#if XD3_ENCODER
2258 {
2259 int i;
2260 for (i = 0; i < ENC_SECTS; i += 1)
2261 {
2262 xd3_free_output (stream, stream->enc_heads[i]);
2263 }
2264 xd3_free_output (stream, stream->enc_free);
2265 }
2266#endif
2267
2268 xd3_free (stream, stream->acache.near_array);
2269 xd3_free (stream, stream->acache.same_array);
2270
2271 xd3_free (stream, stream->inst_sect.copied1);
2272 xd3_free (stream, stream->addr_sect.copied1);
2273 xd3_free (stream, stream->data_sect.copied1);
2274
2275 xd3_free (stream, stream->dec_buffer);
2276 xd3_free (stream, (uint8_t*) stream->dec_lastwin);
2277
2278 xd3_free (stream, stream->buf_in);
2279 xd3_free (stream, stream->dec_appheader);
2280 xd3_free (stream, stream->dec_codetbl);
2281 xd3_free (stream, stream->code_table_alloc);
2282
2283#if SECONDARY_ANY
2284 xd3_free (stream, stream->inst_sect.copied2);
2285 xd3_free (stream, stream->addr_sect.copied2);
2286 xd3_free (stream, stream->data_sect.copied2);
2287
2288 if (stream->sec_type != NULL)
2289 {
2290 stream->sec_type->destroy (stream, stream->sec_stream_d);
2291 stream->sec_type->destroy (stream, stream->sec_stream_i);
2292 stream->sec_type->destroy (stream, stream->sec_stream_a);
2293 }
2294#endif
2295
2296 xd3_free (stream, stream->whole_target.adds);
2297 xd3_free (stream, stream->whole_target.inst);
2298 xd3_free (stream, stream->whole_target.wininfo);
2299
2300 XD3_ASSERT (stream->alloc_cnt == stream->free_cnt);
2301
2302 memset (stream, 0, sizeof (xd3_stream));
2303}
2304
2305#if (XD3_DEBUG > 1 || VCDIFF_TOOLS)
2306static const char*
2307xd3_rtype_to_string (xd3_rtype type, int print_mode)
2308{
2309 switch (type)
2310 {
2311 case XD3_NOOP:
2312 return "NOOP ";
2313 case XD3_RUN:
2314 return "RUN ";
2315 case XD3_ADD:
2316 return "ADD ";
2317 default: break;
2318 }
2319 if (! print_mode)
2320 {
2321 return "CPY ";
2322 }
2323 switch (type)
2324 {
2325 case XD3_CPY + 0: return "CPY_0";
2326 case XD3_CPY + 1: return "CPY_1";
2327 case XD3_CPY + 2: return "CPY_2";
2328 case XD3_CPY + 3: return "CPY_3";
2329 case XD3_CPY + 4: return "CPY_4";
2330 case XD3_CPY + 5: return "CPY_5";
2331 case XD3_CPY + 6: return "CPY_6";
2332 case XD3_CPY + 7: return "CPY_7";
2333 case XD3_CPY + 8: return "CPY_8";
2334 case XD3_CPY + 9: return "CPY_9";
2335 default: return "CPY>9";
2336 }
2337}
2338#endif
2339
2340/****************************************************************
2341 Stream configuration
2342 ******************************************************************/
2343
2344int
2345xd3_config_stream(xd3_stream *stream,
2346 xd3_config *config)
2347{
2348 int ret;
2349 xd3_config defcfg;
2350 xd3_smatcher *smatcher = &stream->smatcher;
2351
2352 if (config == NULL)
2353 {
2354 config = & defcfg;
2355 memset (config, 0, sizeof (*config));
2356 }
2357
2358 /* Initial setup: no error checks yet */
2359 memset (stream, 0, sizeof (*stream));
2360
2361 stream->winsize = config->winsize ? config->winsize : XD3_DEFAULT_WINSIZE;
2362 stream->sprevsz = config->sprevsz ? config->sprevsz : XD3_DEFAULT_SPREVSZ;
2363 stream->srcwin_maxsz = config->srcwin_maxsz ?
2364 config->srcwin_maxsz : XD3_DEFAULT_SRCWINSZ;
2365
2366 if (config->iopt_size == 0)
2367 {
2368 stream->iopt_size = XD3_ALLOCSIZE / sizeof(xd3_rinst);
2369 stream->iopt_unlimited = 1;
2370 }
2371 else
2372 {
2373 stream->iopt_size = config->iopt_size;
2374 }
2375
2376 stream->getblk = config->getblk;
2377 stream->alloc = config->alloc ? config->alloc : __xd3_alloc_func;
2378 stream->free = config->freef ? config->freef : __xd3_free_func;
2379 stream->opaque = config->opaque;
2380 stream->flags = config->flags;
2381
2382 /* Secondary setup. */
2383 stream->sec_data = config->sec_data;
2384 stream->sec_inst = config->sec_inst;
2385 stream->sec_addr = config->sec_addr;
2386
2387 stream->sec_data.data_type = DATA_SECTION;
2388 stream->sec_inst.data_type = INST_SECTION;
2389 stream->sec_addr.data_type = ADDR_SECTION;
2390
2391 /* Check static sizes. */
2392 if (sizeof (usize_t) != SIZEOF_USIZE_T ||
2393 sizeof (xoff_t) != SIZEOF_XOFF_T ||
2394 (ret = xd3_check_pow2(XD3_ALLOCSIZE, NULL)))
2395 {
2396 stream->msg = "incorrect compilation: wrong integer sizes";
2397 return XD3_INTERNAL;
2398 }
2399
2400 /* Check/set secondary compressor. */
2401 switch (stream->flags & XD3_SEC_TYPE)
2402 {
2403 case 0:
2404 if (stream->flags & XD3_SEC_NOALL)
2405 {
2406 stream->msg = "XD3_SEC flags require a secondary compressor type";
2407 return XD3_INTERNAL;
2408 }
2409 break;
2410 case XD3_SEC_FGK:
2411 FGK_CASE (stream);
2412 case XD3_SEC_DJW:
2413 DJW_CASE (stream);
2414 default:
2415 stream->msg = "too many secondary compressor types set";
2416 return XD3_INTERNAL;
2417 }
2418
2419 /* Check/set encoder code table. */
2420 switch (stream->flags & XD3_ALT_CODE_TABLE) {
2421 case 0:
2422 stream->code_table_desc = & __rfc3284_code_table_desc;
2423 stream->code_table_func = xd3_rfc3284_code_table;
2424 break;
2425#if GENERIC_ENCODE_TABLES
2426 case XD3_ALT_CODE_TABLE:
2427 stream->code_table_desc = & __alternate_code_table_desc;
2428 stream->code_table_func = xd3_alternate_code_table;
2429 stream->comp_table_func = xd3_compute_alternate_table_encoding;
2430 break;
2431#endif
2432 default:
2433 stream->msg = "alternate code table support was not compiled";
2434 return XD3_INTERNAL;
2435 }
2436
2437 /* Check sprevsz */
2438 if (smatcher->small_chain == 1 &&
2439 smatcher->small_lchain == 1)
2440 {
2441 stream->sprevsz = 0;
2442 }
2443 else
2444 {
2445 if ((ret = xd3_check_pow2 (stream->sprevsz, NULL)))
2446 {
2447 stream->msg = "sprevsz is required to be a power of two";
2448 return XD3_INTERNAL;
2449 }
2450
2451 stream->sprevmask = stream->sprevsz - 1;
2452 }
2453
2454 /* Default scanner settings. */
2455#if XD3_ENCODER
2456 switch (config->smatch_cfg)
2457 {
2458 IF_BUILD_SOFT(case XD3_SMATCH_SOFT:
2459 {
2460 *smatcher = config->smatcher_soft;
2461 smatcher->string_match = __smatcher_soft.string_match;
2462 smatcher->name = __smatcher_soft.name;
2463 if (smatcher->large_look < MIN_MATCH ||
2464 smatcher->large_step < 1 ||
2465 smatcher->small_look < MIN_MATCH)
2466 {
2467 stream->msg = "invalid soft string-match config";
2468 return XD3_INVALID;
2469 }
2470 break;
2471 })
2472
2473 IF_BUILD_DEFAULT(case XD3_SMATCH_DEFAULT:
2474 *smatcher = __smatcher_default;
2475 break;)
2476 IF_BUILD_SLOW(case XD3_SMATCH_SLOW:
2477 *smatcher = __smatcher_slow;
2478 break;)
2479 IF_BUILD_FASTEST(case XD3_SMATCH_FASTEST:
2480 *smatcher = __smatcher_fastest;
2481 break;)
2482 IF_BUILD_FASTER(case XD3_SMATCH_FASTER:
2483 *smatcher = __smatcher_faster;
2484 break;)
2485 IF_BUILD_FAST(case XD3_SMATCH_FAST:
2486 *smatcher = __smatcher_fast;
2487 break;)
2488 default:
2489 stream->msg = "invalid string match config type";
2490 return XD3_INTERNAL;
2491 }
2492
2493 if (config->smatch_cfg == XD3_SMATCH_DEFAULT &&
2494 (stream->flags & XD3_COMPLEVEL_MASK) != 0)
2495 {
2496 int level = (stream->flags & XD3_COMPLEVEL_MASK) >> XD3_COMPLEVEL_SHIFT;
2497
2498 switch (level)
2499 {
2500 case 1:
2501 IF_BUILD_FASTEST(*smatcher = __smatcher_fastest;
2502 break;)
2503 case 2:
2504 IF_BUILD_FASTER(*smatcher = __smatcher_faster;
2505 break;)
2506 case 3: case 4: case 5:
2507 IF_BUILD_FAST(*smatcher = __smatcher_fast;
2508 break;)
2509 case 6:
2510 IF_BUILD_DEFAULT(*smatcher = __smatcher_default;
2511 break;)
2512 default:
2513 IF_BUILD_SLOW(*smatcher = __smatcher_slow;
2514 break;)
2515 IF_BUILD_DEFAULT(*smatcher = __smatcher_default;
2516 break;)
2517 IF_BUILD_FAST(*smatcher = __smatcher_fast;
2518 break;)
2519 IF_BUILD_FASTER(*smatcher = __smatcher_faster;
2520 break;)
2521 IF_BUILD_FASTEST(*smatcher = __smatcher_fastest;
2522 break;)
2523 }
2524 }
2525#endif
2526
2527 return 0;
2528}
2529
2530/*****************************************************************
2531 Getblk interface
2532 ***********************************************************/
2533
2534/* This function interfaces with the client getblk function, checks
2535 * its results, etc. */
2536static int
2537xd3_getblk (xd3_stream *stream, xoff_t blkno)
2538{
2539 int ret;
2540 xd3_source *source = stream->src;
2541
2542 if (source->curblk == NULL ||
2543 blkno != source->curblkno)
2544 {
2545 if (blkno >= source->blocks)
2546 {
2547 stream->msg = "source file too short";
2548 return XD3_INTERNAL;
2549 }
2550
2551 XD3_ASSERT (source->curblk != NULL || blkno != source->curblkno);
2552
2553 source->getblkno = blkno;
2554
2555 if (stream->getblk == NULL)
2556 {
2557 stream->msg = "getblk source input";
2558 return XD3_GETSRCBLK;
2559 }
2560 else if ((ret = stream->getblk (stream, source, blkno)) != 0)
2561 {
2562 stream->msg = "getblk failed";
2563 return ret;
2564 }
2565
2566 XD3_ASSERT (source->curblk != NULL);
2567 }
2568
2569 if (source->onblk != (blkno == source->blocks - 1 ?
2570 source->onlastblk : source->blksize))
2571 {
2572 stream->msg = "getblk returned short block";
2573 return XD3_INTERNAL;
2574 }
2575
2576 return 0;
2577}
2578
2579/***********************************************************
2580 Stream open/close
2581 ***************************************************************/
2582
2583int
2584xd3_set_source (xd3_stream *stream,
2585 xd3_source *src)
2586{
2587 xoff_t blk_num;
2588 usize_t tail_size, shiftby;
2589
2590 IF_DEBUG1 (DP(RINT "[set source] size %"Q"u\n", src->size));
2591
2592 if (src == NULL || src->size < stream->smatcher.large_look) { return 0; }
2593
2594 stream->src = src;
2595
2596 // If src->blksize is a power-of-two, xd3_blksize_div() will use
2597 // shift and mask rather than divide. Check that here.
2598 if (xd3_check_pow2 (src->blksize, &shiftby) == 0)
2599 {
2600 src->shiftby = shiftby;
2601 src->maskby = (1 << shiftby) - 1;
2602 }
2603 else if (src->size <= src->blksize)
2604 {
2605 int x = xd3_pow2_roundup (src->blksize);
2606 xd3_check_pow2 (x, &shiftby);
2607 src->shiftby = shiftby;
2608 src->maskby = (1 << shiftby) - 1;
2609 }
2610 else
2611 {
2612 src->shiftby = 0;
2613 src->maskby = 0;
2614 }
2615
2616 xd3_blksize_div (src->size, src, &blk_num, &tail_size);
2617 src->blocks = blk_num + (tail_size > 0);
2618 src->onlastblk = xd3_bytes_on_srcblk (src, src->blocks - 1);
2619 src->srclen = 0;
2620 src->srcbase = 0;
2621
2622 return 0;
2623}
2624
2625void
2626xd3_abort_stream (xd3_stream *stream)
2627{
2628 stream->dec_state = DEC_ABORTED;
2629 stream->enc_state = ENC_ABORTED;
2630}
2631
2632int
2633xd3_close_stream (xd3_stream *stream)
2634{
2635 if (stream->enc_state != 0 && stream->enc_state != ENC_ABORTED)
2636 {
2637 if (stream->buf_leftover != NULL)
2638 {
2639 stream->msg = "encoding is incomplete";
2640 return XD3_INTERNAL;
2641 }
2642
2643 if (stream->enc_state == ENC_POSTWIN)
2644 {
2645#if XD3_ENCODER
2646 xd3_encode_reset (stream);
2647#endif
2648 stream->current_window += 1;
2649 stream->enc_state = ENC_INPUT;
2650 }
2651
2652 /* If encoding, should be ready for more input but not actually
2653 have any. */
2654 if (stream->enc_state != ENC_INPUT || stream->avail_in != 0)
2655 {
2656 stream->msg = "encoding is incomplete";
2657 return XD3_INTERNAL;
2658 }
2659 }
2660 else
2661 {
2662 switch (stream->dec_state)
2663 {
2664 case DEC_VCHEAD:
2665 case DEC_WININD:
2666 /* TODO: Address the zero-byte ambiguity. Does the encoder
2667 * emit a window or not? If so, then catch an error here.
2668 * If not, need another routine to say
2669 * decode_at_least_one_if_empty. */
2670 case DEC_ABORTED:
2671 break;
2672 default:
2673 /* If decoding, should be ready for the next window. */
2674 stream->msg = "EOF in decode";
2675 return XD3_INTERNAL;
2676 }
2677 }
2678
2679 return 0;
2680}
2681
2682/**************************************************************
2683 Application header
2684 ****************************************************************/
2685
2686int
2687xd3_get_appheader (xd3_stream *stream,
2688 uint8_t **data,
2689 usize_t *size)
2690{
2691 if (stream->dec_state < DEC_WININD)
2692 {
2693 stream->msg = "application header not available";
2694 return XD3_INTERNAL;
2695 }
2696
2697 (*data) = stream->dec_appheader;
2698 (*size) = stream->dec_appheadsz;
2699 return 0;
2700}
2701
2702/**********************************************************
2703 Decoder stuff
2704 *************************************************/
2705
2706#include "xdelta3-decode.h"
2707
2708/****************************************************************
2709 Encoder stuff
2710 *****************************************************************/
2711
2712#if XD3_ENCODER
2713void
2714xd3_set_appheader (xd3_stream *stream,
2715 const uint8_t *data,
2716 usize_t size)
2717{
2718 stream->enc_appheader = data;
2719 stream->enc_appheadsz = size;
2720}
2721
2722#if XD3_DEBUG
2723static int
2724xd3_iopt_check (xd3_stream *stream)
2725{
2726 usize_t ul = xd3_rlist_length (& stream->iopt_used);
2727 usize_t fl = xd3_rlist_length (& stream->iopt_free);
2728
2729 return (ul + fl + (stream->iout ? 1 : 0)) == stream->iopt_size;
2730}
2731#endif
2732
2733static xd3_rinst*
2734xd3_iopt_free (xd3_stream *stream, xd3_rinst *i)
2735{
2736 xd3_rinst *n = xd3_rlist_remove (i);
2737 xd3_rlist_push_back (& stream->iopt_free, i);
2738 return n;
2739}
2740
2741static void
2742xd3_iopt_free_nonadd (xd3_stream *stream, xd3_rinst *i)
2743{
2744 if (i->type != XD3_ADD)
2745 {
2746 xd3_rlist_push_back (& stream->iopt_free, i);
2747 }
2748}
2749
2750/* When an instruction is ready to flush from the iopt buffer, this
2751 * function is called to produce an encoding. It writes the
2752 * instruction plus size, address, and data to the various encoding
2753 * sections. */
2754static int
2755xd3_iopt_finish_encoding (xd3_stream *stream, xd3_rinst *inst)
2756{
2757 int ret;
2758
2759 /* Check for input overflow. */
2760 XD3_ASSERT (inst->pos + inst->size <= stream->avail_in);
2761
2762 switch (inst->type)
2763 {
2764 case XD3_CPY:
2765 {
2766 /* the address may have an offset if there is a source window. */
2767 usize_t addr;
2768 xd3_source *src = stream->src;
2769
2770 if (src != NULL)
2771 {
2772 /* If there is a source copy, the source must have its
2773 * source window decided before we can encode. This can
2774 * be bad -- we have to make this decision even if no
2775 * source matches have been found. */
2776 if (stream->srcwin_decided == 0)
2777 {
2778 if ((ret = xd3_srcwin_setup (stream))) { return ret; }
2779 }
2780
2781 /* xtra field indicates the copy is from the source */
2782 if (inst->xtra)
2783 {
2784 XD3_ASSERT (inst->addr >= src->srcbase);
2785 XD3_ASSERT (inst->addr + inst->size <= src->srcbase + src->srclen);
2786 addr = (inst->addr - src->srcbase);
2787 stream->n_scpy += 1;
2788 stream->l_scpy += inst->size;
2789 }
2790 else
2791 {
2792 /* with source window: target copy address is offset by taroff. */
2793 addr = stream->taroff + (usize_t) inst->addr;
2794 stream->n_tcpy += 1;
2795 stream->l_tcpy += inst->size;
2796 }
2797 }
2798 else
2799 {
2800 addr = (usize_t) inst->addr;
2801 stream->n_tcpy += 1;
2802 stream->l_tcpy += inst->size;
2803 }
2804
2805 /* Note: used to assert inst->size >= MIN_MATCH, but not true
2806 * for merge operations & identical match heuristics. */
2807 /* the "here" position is always offset by taroff */
2808 if ((ret = xd3_encode_address (stream, addr, inst->pos + stream->taroff,
2809 & inst->type)))
2810 {
2811 return ret;
2812 }
2813
2814 IF_DEBUG1 ({
2815 static int cnt;
2816 DP(RINT "[iopt copy:%d] pos %"Q"u-%"Q"u addr %"Q"u-%"Q"u size %u\n",
2817 cnt++,
2818 stream->total_in + inst->pos,
2819 stream->total_in + inst->pos + inst->size,
2820 inst->addr, inst->addr + inst->size, inst->size);
2821 });
2822 break;
2823 }
2824 case XD3_RUN:
2825 {
2826 XD3_ASSERT (inst->size >= MIN_MATCH);
2827
2828 if ((ret = xd3_emit_byte (stream, & DATA_TAIL (stream), inst->xtra))) { return ret; }
2829
2830 stream->n_run += 1;
2831 stream->l_run += inst->size;
2832
2833 IF_DEBUG1 ({
2834 static int cnt;
2835 DP(RINT "[iopt run:%d] pos %"Q"u size %u\n", cnt++, stream->total_in + inst->pos, inst->size);
2836 });
2837 break;
2838 }
2839 case XD3_ADD:
2840 {
2841 if ((ret = xd3_emit_bytes (stream, & DATA_TAIL (stream),
2842 stream->next_in + inst->pos, inst->size))) { return ret; }
2843
2844 stream->n_add += 1;
2845 stream->l_add += inst->size;
2846
2847 IF_DEBUG1 ({
2848 static int cnt;
2849 DP(RINT "[iopt add:%d] pos %"Q"u size %u\n", cnt++, stream->total_in + inst->pos, inst->size);
2850 });
2851
2852 break;
2853 }
2854 }
2855
2856 /* This is the only place stream->unencoded_offset is incremented. */
2857 XD3_ASSERT (stream->unencoded_offset == inst->pos);
2858 stream->unencoded_offset += inst->size;
2859
2860 inst->code2 = 0;
2861
2862 XD3_CHOOSE_INSTRUCTION (stream, stream->iout, inst);
2863
2864 if (stream->iout != NULL)
2865 {
2866 if (stream->iout->code2 != 0)
2867 {
2868 if ((ret = xd3_emit_double (stream, stream->iout, inst, stream->iout->code2))) { return ret; }
2869
2870 xd3_iopt_free_nonadd (stream, stream->iout);
2871 xd3_iopt_free_nonadd (stream, inst);
2872 stream->iout = NULL;
2873 return 0;
2874 }
2875 else
2876 {
2877 if ((ret = xd3_emit_single (stream, stream->iout, stream->iout->code1))) { return ret; }
2878
2879 xd3_iopt_free_nonadd (stream, stream->iout);
2880 }
2881 }
2882
2883 stream->iout = inst;
2884
2885 return 0;
2886}
2887
2888/* This possibly encodes an add instruction, iadd, which must remain
2889 * on the stack until the following call to
2890 * xd3_iopt_finish_encoding. */
2891static int
2892xd3_iopt_add (xd3_stream *stream, usize_t pos, xd3_rinst *iadd)
2893{
2894 int ret;
2895 usize_t off = stream->unencoded_offset;
2896
2897 if (pos > off)
2898 {
2899 iadd->type = XD3_ADD;
2900 iadd->pos = off;
2901 iadd->size = pos - off;
2902
2903 if ((ret = xd3_iopt_finish_encoding (stream, iadd))) { return ret; }
2904 }
2905
2906 return 0;
2907}
2908
2909/* This function calls xd3_iopt_finish_encoding to finish encoding an
2910 * instruction, and it may also produce an add instruction for an
2911 * unmatched region. */
2912static int
2913xd3_iopt_add_encoding (xd3_stream *stream, xd3_rinst *inst)
2914{
2915 int ret;
2916 xd3_rinst iadd;
2917
2918 if ((ret = xd3_iopt_add (stream, inst->pos, & iadd))) { return ret; }
2919
2920 if ((ret = xd3_iopt_finish_encoding (stream, inst))) { return ret; }
2921
2922 return 0;
2923}
2924
2925/* Generates a final add instruction to encode the remaining input. */
2926static int
2927xd3_iopt_add_finalize (xd3_stream *stream)
2928{
2929 int ret;
2930 xd3_rinst iadd;
2931
2932 if ((ret = xd3_iopt_add (stream, stream->avail_in, & iadd))) { return ret; }
2933
2934 if (stream->iout)
2935 {
2936 if ((ret = xd3_emit_single (stream, stream->iout, stream->iout->code1))) { return ret; }
2937
2938 xd3_iopt_free_nonadd (stream, stream->iout);
2939 stream->iout = NULL;
2940 }
2941
2942 return 0;
2943}
2944
2945/* Compact the instruction buffer by choosing the best non-overlapping
2946 * instructions when lazy string-matching. There are no ADDs in the
2947 * iopt buffer because those are synthesized in xd3_iopt_add_encoding
2948 * and during xd3_iopt_add_finalize. */
2949static int
2950xd3_iopt_flush_instructions (xd3_stream *stream, int force)
2951{
2952 xd3_rinst *r1 = xd3_rlist_front (& stream->iopt_used);
2953 xd3_rinst *r2;
2954 xd3_rinst *r3;
2955 usize_t r1end;
2956 usize_t r2end;
2957 usize_t r2off;
2958 usize_t r2moff;
2959 usize_t gap;
2960 usize_t flushed;
2961 int ret;
2962
2963 XD3_ASSERT (xd3_iopt_check (stream));
2964
2965 /* Note: once tried to skip this step if it's possible to assert
2966 * there are no overlapping instructions. Doesn't work because
2967 * xd3_opt_erase leaves overlapping instructions. */
2968 while (! xd3_rlist_end (& stream->iopt_used, r1) &&
2969 ! xd3_rlist_end (& stream->iopt_used, r2 = xd3_rlist_next (r1)))
2970 {
2971 r1end = r1->pos + r1->size;
2972
2973 /* If the instructions do not overlap, continue. */
2974 if (r1end <= r2->pos)
2975 {
2976 r1 = r2;
2977 continue;
2978 }
2979
2980 r2end = r2->pos + r2->size;
2981
2982 /* The min_match adjustments prevent this. */
2983 XD3_ASSERT (r2end > (r1end + LEAST_MATCH_INCR));
2984
2985 /* If r3 is available... */
2986 if (! xd3_rlist_end (& stream->iopt_used, r3 = xd3_rlist_next (r2)))
2987 {
2988 /* If r3 starts before r1 finishes or just about, r2 is irrelevant */
2989 if (r3->pos <= r1end + 1)
2990 {
2991 xd3_iopt_free (stream, r2);
2992 continue;
2993 }
2994 }
2995 else if (! force)
2996 {
2997 /* Unless force, end the loop when r3 is not available. */
2998 break;
2999 }
3000
3001 r2off = r2->pos - r1->pos;
3002 r2moff = r2end - r1end;
3003 gap = r2end - r1->pos;
3004
3005 /* If the two matches overlap almost entirely, choose the better match
3006 * and discard the other. The else branch can still create inefficient
3007 * copies, e.g., a 4-byte copy that takes 4 bytes to encode, which
3008 * xd3_smatch() wouldn't allow by its crude efficiency check. However,
3009 * in this case there are adjacent copies which mean the add would cost
3010 * one extra byte. Allow the inefficiency here. */
3011 if (gap < 2*MIN_MATCH || r2moff <= 2 || r2off <= 2)
3012 {
3013 /* Only one match should be used, choose the longer one. */
3014 if (r1->size < r2->size)
3015 {
3016 xd3_iopt_free (stream, r1);
3017 r1 = r2;
3018 }
3019 else
3020 {
3021 /* We are guaranteed that r1 does not overlap now, so advance past r2 */
3022 r1 = xd3_iopt_free (stream, r2);
3023 }
3024 continue;
3025 }
3026 else
3027 {
3028 /* Shorten one of the instructions -- could be optimized
3029 * based on the address cache. */
3030 usize_t average;
3031 usize_t newsize;
3032 usize_t adjust1;
3033
3034 XD3_ASSERT (r1end > r2->pos && r2end > r1->pos);
3035
3036 /* Try to balance the length of both instructions, but avoid
3037 * making both longer than MAX_MATCH_SPLIT . */
3038 average = gap / 2;
3039 newsize = min (MAX_MATCH_SPLIT, gap - average);
3040
3041 /* Should be possible to simplify this code. */
3042 if (newsize > r1->size)
3043 {
3044 /* shorten r2 */
3045 adjust1 = r1end - r2->pos;
3046 }
3047 else if (newsize > r2->size)
3048 {
3049 /* shorten r1 */
3050 adjust1 = r1end - r2->pos;
3051
3052 XD3_ASSERT (r1->size > adjust1);
3053
3054 r1->size -= adjust1;
3055
3056 /* don't shorten r2 */
3057 adjust1 = 0;
3058 }
3059 else
3060 {
3061 /* shorten r1 */
3062 adjust1 = r1->size - newsize;
3063
3064 if (r2->pos > r1end - adjust1)
3065 {
3066 adjust1 -= r2->pos - (r1end - adjust1);
3067 }
3068
3069 XD3_ASSERT (r1->size > adjust1);
3070
3071 r1->size -= adjust1;
3072
3073 /* shorten r2 */
3074 XD3_ASSERT (r1->pos + r1->size >= r2->pos);
3075
3076 adjust1 = r1->pos + r1->size - r2->pos;
3077 }
3078
3079 /* Fallthrough above if-else, shorten r2 */
3080 XD3_ASSERT (r2->size > adjust1);
3081
3082 r2->size -= adjust1;
3083 r2->pos += adjust1;
3084 r2->addr += adjust1;
3085
3086 XD3_ASSERT (r1->size >= MIN_MATCH);
3087 XD3_ASSERT (r2->size >= MIN_MATCH);
3088
3089 r1 = r2;
3090 }
3091 }
3092
3093 XD3_ASSERT (xd3_iopt_check (stream));
3094
3095 /* If forcing, pick instructions until the list is empty, otherwise
3096 * this empties 50% of the queue. */
3097 for (flushed = 0; ! xd3_rlist_empty (& stream->iopt_used); )
3098 {
3099 xd3_rinst *renc = xd3_rlist_pop_front (& stream->iopt_used);
3100 if ((ret = xd3_iopt_add_encoding (stream, renc)))
3101 {
3102 return ret;
3103 }
3104
3105 if (! force)
3106 {
3107 if (++flushed > stream->iopt_size / 2)
3108 {
3109 break;
3110 }
3111
3112 /* If there are only two instructions remaining, break,
3113 * because they were not optimized. This means there were
3114 * more than 50% eliminated by the loop above. */
3115 r1 = xd3_rlist_front (& stream->iopt_used);
3116 if (xd3_rlist_end(& stream->iopt_used, r1) ||
3117 xd3_rlist_end(& stream->iopt_used, r2 = xd3_rlist_next (r1)) ||
3118 xd3_rlist_end(& stream->iopt_used, r3 = xd3_rlist_next (r2)))
3119 {
3120 break;
3121 }
3122 }
3123 }
3124
3125 XD3_ASSERT (xd3_iopt_check (stream));
3126
3127 XD3_ASSERT (!force || xd3_rlist_length (& stream->iopt_used) == 0);
3128
3129 return 0;
3130}
3131
3132static int
3133xd3_iopt_get_slot (xd3_stream *stream, xd3_rinst** iptr)
3134{
3135 xd3_rinst *i;
3136 int ret;
3137
3138 if (xd3_rlist_empty (& stream->iopt_free))
3139 {
3140 if (stream->iopt_unlimited)
3141 {
3142 int elts = XD3_ALLOCSIZE / sizeof(xd3_rinst);
3143
3144 if ((ret = xd3_alloc_iopt (stream, elts)))
3145 {
3146 return ret;
3147 }
3148
3149 stream->iopt_size += elts;
3150 }
3151 else
3152 {
3153 if ((ret = xd3_iopt_flush_instructions (stream, 0))) { return ret; }
3154
3155 XD3_ASSERT (! xd3_rlist_empty (& stream->iopt_free));
3156 }
3157 }
3158
3159 i = xd3_rlist_pop_back (& stream->iopt_free);
3160
3161 xd3_rlist_push_back (& stream->iopt_used, i);
3162
3163 (*iptr) = i;
3164
3165 ++stream->i_slots_used;
3166
3167 return 0;
3168}
3169
3170/* A copy is about to be emitted that extends backwards to POS,
3171 * therefore it may completely cover some existing instructions in the
3172 * buffer. If an instruction is completely covered by this new match,
3173 * erase it. If the new instruction is covered by the previous one,
3174 * return 1 to skip it. */
3175static void
3176xd3_iopt_erase (xd3_stream *stream, usize_t pos, usize_t size)
3177{
3178 while (! xd3_rlist_empty (& stream->iopt_used))
3179 {
3180 xd3_rinst *r = xd3_rlist_back (& stream->iopt_used);
3181
3182 /* Verify that greedy is working. The previous instruction
3183 * should end before the new one begins. */
3184 XD3_ASSERT ((stream->flags & XD3_BEGREEDY) == 0 || (r->pos + r->size <= pos));
3185 /* Verify that min_match is working. The previous instruction
3186 * should end before the new one ends. */
3187 XD3_ASSERT ((stream->flags & XD3_BEGREEDY) != 0 || (r->pos + r->size < pos + size));
3188
3189 /* See if the last instruction starts before the new
3190 * instruction. If so, there is nothing to erase. */
3191 if (r->pos < pos)
3192 {
3193 return;
3194 }
3195
3196 /* Otherwise, the new instruction covers the old one, delete it
3197 and repeat. */
3198 xd3_rlist_remove (r);
3199 xd3_rlist_push_back (& stream->iopt_free, r);
3200 --stream->i_slots_used;
3201 }
3202}
3203
3204/* This function tells the last matched input position. */
3205static usize_t
3206xd3_iopt_last_matched (xd3_stream *stream)
3207{
3208 xd3_rinst *r;
3209
3210 if (xd3_rlist_empty (& stream->iopt_used))
3211 {
3212 return 0;
3213 }
3214
3215 r = xd3_rlist_back (& stream->iopt_used);
3216
3217 return r->pos + r->size;
3218}
3219
3220/*********************************************************
3221 Emit routines
3222 ***********************************************************/
3223
3224static int
3225xd3_emit_single (xd3_stream *stream, xd3_rinst *single, usize_t code)
3226{
3227 int has_size = stream->code_table[code].size1 == 0;
3228 int ret;
3229
3230 IF_DEBUG1 (DP(RINT "[emit1] %u %s (%u) code %u\n",
3231 single->pos,
3232 xd3_rtype_to_string ((xd3_rtype) single->type, 0),
3233 single->size,
3234 code));
3235
3236 if ((ret = xd3_emit_byte (stream, & INST_TAIL (stream), code)))
3237 {
3238 return ret;
3239 }
3240
3241 if (has_size)
3242 {
3243 if ((ret = xd3_emit_size (stream, & INST_TAIL (stream), single->size)))
3244 {
3245 return ret;
3246 }
3247 }
3248
3249 return 0;
3250}
3251
3252static int
3253xd3_emit_double (xd3_stream *stream, xd3_rinst *first,
3254 xd3_rinst *second, usize_t code)
3255{
3256 int ret;
3257
3258 /* All double instructions use fixed sizes, so all we need to do is
3259 * output the instruction code, no sizes. */
3260 XD3_ASSERT (stream->code_table[code].size1 != 0 &&
3261 stream->code_table[code].size2 != 0);
3262
3263 if ((ret = xd3_emit_byte (stream, & INST_TAIL (stream), code)))
3264 {
3265 return ret;
3266 }
3267
3268 IF_DEBUG1 (DP(RINT "[emit2]: %u %s (%u) %s (%u) code %u\n",
3269 first->pos,
3270 xd3_rtype_to_string ((xd3_rtype) first->type, 0),
3271 first->size,
3272 xd3_rtype_to_string ((xd3_rtype) second->type, 0),
3273 second->size,
3274 code));
3275
3276 return 0;
3277}
3278
3279/* This enters a potential run instruction into the iopt buffer. The
3280 * position argument is relative to the target window. */
3281static int
3282xd3_emit_run (xd3_stream *stream, usize_t pos, usize_t size, uint8_t run_c)
3283{
3284 xd3_rinst* ri;
3285 int ret;
3286
3287 if ((ret = xd3_iopt_get_slot (stream, & ri))) { return ret; }
3288
3289 ri->type = XD3_RUN;
3290 ri->xtra = run_c;
3291 ri->pos = pos;
3292 ri->size = size;
3293
3294 return 0;
3295}
3296
3297/* This enters a potential copy instruction into the iopt buffer. The
3298 * position argument is relative to the target window.. */
3299int
3300xd3_found_match (xd3_stream *stream, usize_t pos,
3301 usize_t size, xoff_t addr, int is_source)
3302{
3303 xd3_rinst* ri;
3304 int ret;
3305
3306 if ((ret = xd3_iopt_get_slot (stream, & ri))) { return ret; }
3307
3308 ri->type = XD3_CPY;
3309 ri->xtra = is_source;
3310 ri->pos = pos;
3311 ri->size = size;
3312 ri->addr = addr;
3313
3314 return 0;
3315}
3316
3317static int
3318xd3_emit_hdr (xd3_stream *stream)
3319{
3320 int ret;
3321 int use_secondary = stream->sec_type != NULL;
3322 int use_adler32 = stream->flags & (XD3_ADLER32 | XD3_ADLER32_RECODE);
3323 int vcd_source = xd3_encoder_used_source (stream);
3324 usize_t win_ind = 0;
3325 usize_t del_ind = 0;
3326 usize_t enc_len;
3327 usize_t tgt_len;
3328 usize_t data_len;
3329 usize_t inst_len;
3330 usize_t addr_len;
3331
3332 if (stream->current_window == 0)
3333 {
3334 usize_t hdr_ind = 0;
3335 int use_appheader = stream->enc_appheader != NULL;
3336 int use_gencodetbl = GENERIC_ENCODE_TABLES &&
3337 (stream->code_table_desc != & __rfc3284_code_table_desc);
3338
3339 if (use_secondary) { hdr_ind |= VCD_SECONDARY; }
3340 if (use_gencodetbl) { hdr_ind |= VCD_CODETABLE; }
3341 if (use_appheader) { hdr_ind |= VCD_APPHEADER; }
3342
3343 if ((ret = xd3_emit_byte (stream, & HDR_TAIL (stream),
3344 VCDIFF_MAGIC1)) != 0 ||
3345 (ret = xd3_emit_byte (stream, & HDR_TAIL (stream),
3346 VCDIFF_MAGIC2)) != 0 ||
3347 (ret = xd3_emit_byte (stream, & HDR_TAIL (stream),
3348 VCDIFF_MAGIC3)) != 0 ||
3349 (ret = xd3_emit_byte (stream, & HDR_TAIL (stream),
3350 VCDIFF_VERSION)) != 0 ||
3351 (ret = xd3_emit_byte (stream, & HDR_TAIL (stream), hdr_ind)) != 0)
3352 {
3353 return ret;
3354 }
3355
3356 /* Secondary compressor ID */
3357#if SECONDARY_ANY
3358 if (use_secondary &&
3359 (ret = xd3_emit_byte (stream, & HDR_TAIL (stream),
3360 stream->sec_type->id)))
3361 {
3362 return ret;
3363 }
3364#endif
3365
3366 /* Compressed code table */
3367 if (use_gencodetbl)
3368 {
3369 usize_t code_table_size;
3370 const uint8_t *code_table_data;
3371
3372 if ((ret = stream->comp_table_func (stream, & code_table_data,
3373 & code_table_size)))
3374 {
3375 return ret;
3376 }
3377
3378 if ((ret = xd3_emit_size (stream, & HDR_TAIL (stream),
3379 code_table_size + 2)) ||
3380 (ret = xd3_emit_byte (stream, & HDR_TAIL (stream),
3381 stream->code_table_desc->near_modes)) ||
3382 (ret = xd3_emit_byte (stream, & HDR_TAIL (stream),
3383 stream->code_table_desc->same_modes)) ||
3384 (ret = xd3_emit_bytes (stream, & HDR_TAIL (stream),
3385 code_table_data, code_table_size)))
3386 {
3387 return ret;
3388 }
3389 }
3390
3391 /* Application header */
3392 if (use_appheader)
3393 {
3394 if ((ret = xd3_emit_size (stream, & HDR_TAIL (stream),
3395 stream->enc_appheadsz)) ||
3396 (ret = xd3_emit_bytes (stream, & HDR_TAIL (stream),
3397 stream->enc_appheader,
3398 stream->enc_appheadsz)))
3399 {
3400 return ret;
3401 }
3402 }
3403 }
3404
3405 /* try to compress this window */
3406#if SECONDARY_ANY
3407 if (use_secondary)
3408 {
3409 int data_sec = 0;
3410 int inst_sec = 0;
3411 int addr_sec = 0;
3412
3413# define ENCODE_SECONDARY_SECTION(UPPER,LOWER) \
3414 ((stream->flags & XD3_SEC_NO ## UPPER) == 0 && \
3415 (ret = xd3_encode_secondary (stream, \
3416 & UPPER ## _HEAD (stream), \
3417 & UPPER ## _TAIL (stream), \
3418 & xd3_sec_ ## LOWER (stream), \
3419 & stream->sec_ ## LOWER, \
3420 & LOWER ## _sec)))
3421
3422 if (ENCODE_SECONDARY_SECTION (DATA, data) ||
3423 ENCODE_SECONDARY_SECTION (INST, inst) ||
3424 ENCODE_SECONDARY_SECTION (ADDR, addr))
3425 {
3426 return ret;
3427 }
3428
3429 del_ind |= (data_sec ? VCD_DATACOMP : 0);
3430 del_ind |= (inst_sec ? VCD_INSTCOMP : 0);
3431 del_ind |= (addr_sec ? VCD_ADDRCOMP : 0);
3432 }
3433#endif
3434
3435 /* if (vcd_target) { win_ind |= VCD_TARGET; } */
3436 if (vcd_source) { win_ind |= VCD_SOURCE; }
3437 if (use_adler32) { win_ind |= VCD_ADLER32; }
3438
3439 /* window indicator */
3440 if ((ret = xd3_emit_byte (stream, & HDR_TAIL (stream), win_ind)))
3441 {
3442 return ret;
3443 }
3444
3445 /* source window */
3446 if (vcd_source)
3447 {
3448 /* or (vcd_target) { ... } */
3449 if ((ret = xd3_emit_size (stream, & HDR_TAIL (stream),
3450 stream->src->srclen)) ||
3451 (ret = xd3_emit_offset (stream, & HDR_TAIL (stream),
3452 stream->src->srcbase))) { return ret; }
3453 }
3454
3455 tgt_len = stream->avail_in;
3456 data_len = xd3_sizeof_output (DATA_HEAD (stream));
3457 inst_len = xd3_sizeof_output (INST_HEAD (stream));
3458 addr_len = xd3_sizeof_output (ADDR_HEAD (stream));
3459
3460 /* The enc_len field is a redundency for future extensions.*/
3461 enc_len = (1 + (xd3_sizeof_size (tgt_len) +
3462 xd3_sizeof_size (data_len) +
3463 xd3_sizeof_size (inst_len) +
3464 xd3_sizeof_size (addr_len)) +
3465 data_len +
3466 inst_len +
3467 addr_len +
3468 (use_adler32 ? 4 : 0));
3469
3470 if ((ret = xd3_emit_size (stream, & HDR_TAIL (stream), enc_len)) ||
3471 (ret = xd3_emit_size (stream, & HDR_TAIL (stream), tgt_len)) ||
3472 (ret = xd3_emit_byte (stream, & HDR_TAIL (stream), del_ind)) ||
3473 (ret = xd3_emit_size (stream, & HDR_TAIL (stream), data_len)) ||
3474 (ret = xd3_emit_size (stream, & HDR_TAIL (stream), inst_len)) ||
3475 (ret = xd3_emit_size (stream, & HDR_TAIL (stream), addr_len)))
3476 {
3477 return ret;
3478 }
3479
3480 if (use_adler32)
3481 {
3482 uint8_t send[4];
3483 uint32_t a32;
3484
3485 if (stream->flags & XD3_ADLER32)
3486 {
3487 a32 = adler32 (1L, stream->next_in, stream->avail_in);
3488 }
3489 else
3490 {
3491 a32 = stream->recode_adler32;
3492 }
3493
3494 send[0] = (a32 >> 24);
3495 send[1] = (a32 >> 16);
3496 send[2] = (a32 >> 8);
3497 send[3] = (a32 & 0xff);
3498
3499 if ((ret = xd3_emit_bytes (stream, & HDR_TAIL (stream), send, 4)))
3500 {
3501 return ret;
3502 }
3503 }
3504
3505 return 0;
3506}
3507
3508/****************************************************************
3509 Encode routines
3510 ****************************************************************/
3511
3512static int
3513xd3_encode_buffer_leftover (xd3_stream *stream)
3514{
3515 usize_t take;
3516 usize_t room;
3517
3518 /* Allocate the buffer. */
3519 if (stream->buf_in == NULL &&
3520 (stream->buf_in = (uint8_t*) xd3_alloc (stream, stream->winsize, 1)) == NULL)
3521 {
3522 return ENOMEM;
3523 }
3524
3525 /* Take leftover input first. */
3526 if (stream->buf_leftover != NULL)
3527 {
3528 XD3_ASSERT (stream->buf_avail == 0);
3529 XD3_ASSERT (stream->buf_leftavail < stream->winsize);
3530
3531 IF_DEBUG1 (DP(RINT "[leftover] previous %u avail %u\n", stream->buf_leftavail, stream->avail_in));
3532
3533 memcpy (stream->buf_in, stream->buf_leftover, stream->buf_leftavail);
3534
3535 stream->buf_leftover = NULL;
3536 stream->buf_avail = stream->buf_leftavail;
3537 }
3538
3539 /* Copy into the buffer. */
3540 room = stream->winsize - stream->buf_avail;
3541 take = min (room, stream->avail_in);
3542
3543 memcpy (stream->buf_in + stream->buf_avail, stream->next_in, take);
3544
3545 stream->buf_avail += take;
3546
3547 if (take < stream->avail_in)
3548 {
3549 /* Buffer is full */
3550 stream->buf_leftover = stream->next_in + take;
3551 stream->buf_leftavail = stream->avail_in - take;
3552
3553 IF_DEBUG1 (DP(RINT "[leftover] take %u remaining %u\n", take, stream->buf_leftavail));
3554 }
3555 else if ((stream->buf_avail < stream->winsize) && !(stream->flags & XD3_FLUSH))
3556 {
3557 /* Buffer has space */
3558 IF_DEBUG1 (DP(RINT "[leftover] %u emptied\n", take));
3559 return XD3_INPUT;
3560 }
3561
3562 /* Use the buffer: */
3563 stream->next_in = stream->buf_in;
3564 stream->avail_in = stream->buf_avail;
3565 stream->buf_avail = 0;
3566
3567 return 0;
3568}
3569
3570/* Allocates one block of xd3_rlist elements */
3571static int
3572xd3_alloc_iopt (xd3_stream *stream, int elts)
3573{
3574 int i;
3575 xd3_iopt_buflist* last =
3576 (xd3_iopt_buflist*) xd3_alloc (stream, sizeof (xd3_iopt_buflist), 1);
3577
3578 if (last == NULL ||
3579 (last->buffer = (xd3_rinst*) xd3_alloc (stream, sizeof (xd3_rinst), elts)) == NULL)
3580 {
3581 return ENOMEM;
3582 }
3583
3584 last->next = stream->iopt_alloc;
3585 stream->iopt_alloc = last;
3586
3587 for (i = 0; i < elts; i += 1)
3588 {
3589 xd3_rlist_push_back (& stream->iopt_free, & last->buffer[i]);
3590 }
3591
3592 return 0;
3593}
3594
3595/* This function allocates all memory initially used by the encoder. */
3596static int
3597xd3_encode_init (xd3_stream *stream, int full_init)
3598{
3599 int i;
3600
3601 if (full_init)
3602 {
3603 int large_comp = (stream->src != NULL);
3604 int small_comp = ! (stream->flags & XD3_NOCOMPRESS);
3605
3606 /* Memory allocations for checksum tables are delayed until
3607 * xd3_string_match_init in the first call to string_match--that way
3608 * identical or short inputs require no table allocation. */
3609 if (large_comp)
3610 {
3611 usize_t hash_values = (stream->srcwin_maxsz / stream->smatcher.large_step);
3612
3613 xd3_size_hashtable (stream,
3614 hash_values,
3615 & stream->large_hash);
3616 }
3617
3618 if (small_comp)
3619 {
3620 /* TODO: This is under devel: used to have min(sprevsz) here, which sort
3621 * of makes sense, but observed fast performance w/ larger tables, which
3622 * also sort of makes sense. @@@ */
3623 usize_t hash_values = stream->winsize;
3624
3625 xd3_size_hashtable (stream,
3626 hash_values,
3627 & stream->small_hash);
3628 }
3629 }
3630
3631 /* data buffers */
3632 for (i = 0; i < ENC_SECTS; i += 1)
3633 {
3634 if ((stream->enc_heads[i] =
3635 stream->enc_tails[i] =
3636 xd3_alloc_output (stream, NULL)) == NULL)
3637 {
3638 return ENOMEM;
3639 }
3640 }
3641
3642 /* iopt buffer */
3643 xd3_rlist_init (& stream->iopt_used);
3644 xd3_rlist_init (& stream->iopt_free);
3645
3646 if (xd3_alloc_iopt (stream, stream->iopt_size) != 0) { goto fail; }
3647
3648 XD3_ASSERT (xd3_rlist_length (& stream->iopt_free) == stream->iopt_size);
3649 XD3_ASSERT (xd3_rlist_length (& stream->iopt_used) == 0);
3650
3651 /* address cache, code table */
3652 stream->acache.s_near = stream->code_table_desc->near_modes;
3653 stream->acache.s_same = stream->code_table_desc->same_modes;
3654 stream->code_table = stream->code_table_func ();
3655
3656 return xd3_alloc_cache (stream);
3657
3658 fail:
3659
3660 return ENOMEM;
3661}
3662
3663int
3664xd3_encode_init_full (xd3_stream *stream)
3665{
3666 return xd3_encode_init (stream, 1);
3667}
3668
3669int
3670xd3_encode_init_partial (xd3_stream *stream)
3671{
3672 return xd3_encode_init (stream, 0);
3673}
3674
3675/* Called after the ENC_POSTOUT state, this puts the output buffers
3676 * back into separate lists and re-initializes some variables. (The
3677 * output lists were spliced together during the ENC_FLUSH state.) */
3678static void
3679xd3_encode_reset (xd3_stream *stream)
3680{
3681 int i;
3682 xd3_output *olist;
3683
3684 stream->avail_in = 0;
3685 stream->small_reset = 1;
3686 stream->i_slots_used = 0;
3687
3688 if (stream->src != NULL)
3689 {
3690 stream->src->srcbase = 0;
3691 stream->src->srclen = 0;
3692 stream->srcwin_decided = 0;
3693 stream->match_minaddr = 0;
3694 stream->match_maxaddr = 0;
3695 stream->taroff = 0;
3696 }
3697
3698 /* Reset output chains. */
3699 olist = stream->enc_heads[0];
3700
3701 for (i = 0; i < ENC_SECTS; i += 1)
3702 {
3703 XD3_ASSERT (olist != NULL);
3704
3705 stream->enc_heads[i] = olist;
3706 stream->enc_tails[i] = olist;
3707 olist = olist->next_page;
3708
3709 stream->enc_heads[i]->next = 0;
3710 stream->enc_heads[i]->next_page = NULL;
3711
3712 stream->enc_tails[i]->next_page = NULL;
3713 stream->enc_tails[i] = stream->enc_heads[i];
3714 }
3715
3716 xd3_freelist_output (stream, olist);
3717}
3718
3719/* The main encoding routine. */
3720int
3721xd3_encode_input (xd3_stream *stream)
3722{
3723 int ret, i;
3724
3725 if (stream->dec_state != 0)
3726 {
3727 stream->msg = "encoder/decoder transition";
3728 return XD3_INTERNAL;
3729 }
3730
3731 switch (stream->enc_state)
3732 {
3733 case ENC_INIT:
3734 /* Only reached on first time through: memory setup. */
3735 if ((ret = xd3_encode_init_full (stream))) { return ret; }
3736
3737 stream->enc_state = ENC_INPUT;
3738
3739 case ENC_INPUT:
3740
3741 /* If there is no input yet, just return. This checks for
3742 * next_in == NULL, not avail_in == 0 since zero bytes is a
3743 * valid input. There is an assertion in xd3_avail_input() that
3744 * next_in != NULL for this reason. By returning right away we
3745 * avoid creating an input buffer before the caller has supplied
3746 * its first data. It is possible for xd3_avail_input to be
3747 * called both before and after the first call to
3748 * xd3_encode_input(). */
3749 if (stream->next_in == NULL)
3750 {
3751 return XD3_INPUT;
3752 }
3753
3754 enc_flush:
3755 /* See if we should buffer the input: either if there is already
3756 * a leftover buffer, or if the input is short of winsize
3757 * without flush. The label at this point is reached by a goto
3758 * below, when there is leftover input after postout. */
3759 if ((stream->buf_leftover != NULL) ||
3760 (stream->avail_in < stream->winsize && ! (stream->flags & XD3_FLUSH)))
3761 {
3762 if ((ret = xd3_encode_buffer_leftover (stream))) { return ret; }
3763 }
3764
3765 /* Initalize the address cache before each window. */
3766 xd3_init_cache (& stream->acache);
3767
3768 stream->input_position = 0;
3769 stream->min_match = MIN_MATCH;
3770 stream->unencoded_offset = 0;
3771
3772 stream->enc_state = ENC_SEARCH;
3773
3774 IF_DEBUG1 (DP(RINT "[WINSTART:%"Q"u] input bytes %u offset %"Q"u\n",
3775 stream->current_window, stream->avail_in,
3776 stream->total_in));
3777 return XD3_WINSTART;
3778
3779 case ENC_SEARCH:
3780
3781 /* Reentrant matching. */
3782 if (stream->src != NULL)
3783 {
3784 switch (stream->match_state)
3785 {
3786 case MATCH_TARGET:
3787 /* Try matching forward at the start of the target.
3788 * This is entered the first time through, to check for
3789 * a perfect match, and whenever there is a source match
3790 * that extends to the end of the previous window. The
3791 * match_srcpos field is initially zero and later set
3792 * during xd3_source_extend_match. */
3793
3794 if (stream->avail_in > 0)
3795 {
3796 /* This call can't fail because the source window is
3797 unrestricted. */
3798 ret = xd3_source_match_setup (stream, stream->match_srcpos);
3799 XD3_ASSERT (ret == 0);
3800 stream->match_state = MATCH_FORWARD;
3801 }
3802 else
3803 {
3804 stream->match_state = MATCH_SEARCHING;
3805 stream->match_fwd = 0;
3806 }
3807 XD3_ASSERT (stream->match_fwd == 0);
3808
3809 case MATCH_FORWARD:
3810 case MATCH_BACKWARD:
3811 if (stream->avail_in != 0)
3812 {
3813 if ((ret = xd3_source_extend_match (stream)) != 0)
3814 {
3815 return ret;
3816 }
3817
3818 /* The search has to make forward progress here
3819 * or else it can get stuck in a match-backward
3820 * (getsrcblk) then match-forward (getsrcblk),
3821 * find insufficient match length, then repeat
3822 * exactly the same search.
3823 */
3824 stream->input_position += stream->match_fwd;
3825 }
3826
3827 case MATCH_SEARCHING:
3828 /* Continue string matching. (It's possible that the
3829 * initial match continued through the entire input, in
3830 * which case we're still in MATCH_FORWARD and should
3831 * remain so for the next input window.) */
3832 break;
3833 }
3834 }
3835
3836 /* String matching... */
3837 if (stream->avail_in != 0 &&
3838 (ret = stream->smatcher.string_match (stream)))
3839 {
3840 return ret;
3841 }
3842
3843 stream->enc_state = ENC_INSTR;
3844
3845 case ENC_INSTR:
3846 /* Note: Jump here to encode VCDIFF deltas w/o using this
3847 * string-matching code. */
3848
3849 /* Flush the instrution buffer, then possibly add one more
3850 * instruction, then emit the header. */
3851 if ((ret = xd3_iopt_flush_instructions (stream, 1)) ||
3852 (ret = xd3_iopt_add_finalize (stream)))
3853 {
3854 return ret;
3855 }
3856
3857 stream->enc_state = ENC_FLUSH;
3858
3859 case ENC_FLUSH:
3860 /* Note: main_recode_func() bypasses string-matching by setting
3861 * ENC_FLUSH. */
3862 if ((ret = xd3_emit_hdr (stream)))
3863 {
3864 return ret;
3865 }
3866
3867 /* Begin output. */
3868 stream->enc_current = HDR_HEAD (stream);
3869
3870 /* Chain all the outputs together. After doing this, it looks
3871 * as if there is only one section. The other enc_heads are set
3872 * to NULL to avoid freeing them more than once. */
3873 for (i = 1; i < ENC_SECTS; i += 1)
3874 {
3875 stream->enc_tails[i-1]->next_page = stream->enc_heads[i];
3876 stream->enc_heads[i] = NULL;
3877 }
3878
3879 enc_output:
3880
3881 stream->enc_state = ENC_POSTOUT;
3882 stream->next_out = stream->enc_current->base;
3883 stream->avail_out = stream->enc_current->next;
3884 stream->total_out += (xoff_t) stream->avail_out;
3885
3886 /* If there is any output in this buffer, return it, otherwise
3887 * fall through to handle the next buffer or finish the window
3888 * after all buffers have been output. */
3889 if (stream->avail_out > 0)
3890 {
3891 /* This is the only place xd3_encode returns XD3_OUTPUT */
3892 return XD3_OUTPUT;
3893 }
3894
3895 case ENC_POSTOUT:
3896
3897 if (stream->avail_out != 0)
3898 {
3899 stream->msg = "missed call to consume output";
3900 return XD3_INTERNAL;
3901 }
3902
3903 /* Continue outputting one buffer at a time, until the next is NULL. */
3904 if ((stream->enc_current = stream->enc_current->next_page) != NULL)
3905 {
3906 goto enc_output;
3907 }
3908
3909 stream->total_in += (xoff_t) stream->avail_in;
3910 stream->enc_state = ENC_POSTWIN;
3911
3912 IF_DEBUG1 (DP(RINT "[WINFINISH:%"Q"u] in=%"Q"u\n",
3913 stream->current_window,
3914 stream->total_in));
3915 return XD3_WINFINISH;
3916
3917 case ENC_POSTWIN:
3918
3919 xd3_encode_reset (stream);
3920
3921 stream->current_window += 1;
3922 stream->enc_state = ENC_INPUT;
3923
3924 /* If there is leftover input to flush, repeat. */
3925 if ((stream->buf_leftover != NULL) && (stream->flags & XD3_FLUSH))
3926 {
3927 goto enc_flush;
3928 }
3929
3930 /* Ready for more input. */
3931 return XD3_INPUT;
3932
3933 default:
3934 stream->msg = "invalid state";
3935 return XD3_INTERNAL;
3936 }
3937}
3938#endif /* XD3_ENCODER */
3939
3940/*****************************************************************
3941 Client convenience functions
3942 ******************************************************************/
3943
3944static int
3945xd3_process_stream (int is_encode,
3946 xd3_stream *stream,
3947 int (*func) (xd3_stream *),
3948 int close_stream,
3949 const uint8_t *input,
3950 usize_t input_size,
3951 uint8_t *output,
3952 usize_t *output_size,
3953 usize_t output_size_max)
3954{
3955 usize_t ipos = 0;
3956 usize_t n = min(stream->winsize, input_size);
3957
3958 (*output_size) = 0;
3959
3960 stream->flags |= XD3_FLUSH;
3961
3962 xd3_avail_input (stream, input + ipos, n);
3963 ipos += n;
3964
3965 for (;;)
3966 {
3967 int ret;
3968 switch((ret = func (stream)))
3969 {
3970 case XD3_OUTPUT: { /* memcpy below */ break; }
3971 case XD3_INPUT: {
3972 n = min(stream->winsize, input_size - ipos);
3973 if (n == 0) {
3974 goto done;
3975 }
3976 xd3_avail_input (stream, input + ipos, n);
3977 ipos += n;
3978 continue;
3979 }
3980 case XD3_GOTHEADER: { /* ignore */ continue; }
3981 case XD3_WINSTART: { /* ignore */ continue; }
3982 case XD3_WINFINISH: { /* ignore */ continue; }
3983 case XD3_GETSRCBLK:
3984 {
3985 stream->msg = "stream requires source input";
3986 return XD3_INTERNAL;
3987 }
3988 case 0:
3989 {
3990 /* xd3_encode_input/xd3_decode_input never return 0 */
3991 stream->msg = "invalid return: 0";
3992 return XD3_INTERNAL;
3993 }
3994 default:
3995 return ret;
3996 }
3997
3998 if (*output_size + stream->avail_out > output_size_max)
3999 {
4000 stream->msg = "insufficient output space";
4001 return ENOSPC;
4002 }
4003
4004 memcpy (output + *output_size, stream->next_out, stream->avail_out);
4005
4006 *output_size += stream->avail_out;
4007
4008 xd3_consume_output (stream);
4009 }
4010 done:
4011 return (close_stream == 0) ? 0 : xd3_close_stream (stream);
4012}
4013
4014static int
4015xd3_process_memory (int is_encode,
4016 int (*func) (xd3_stream *),
4017 int close_stream,
4018 const uint8_t *input,
4019 usize_t input_size,
4020 const uint8_t *source,
4021 usize_t source_size,
4022 uint8_t *output,
4023 usize_t *output_size,
4024 usize_t output_size_max,
4025 int flags) {
4026 xd3_stream stream;
4027 xd3_config config;
4028 xd3_source src;
4029 int ret;
4030
4031 memset (& stream, 0, sizeof (stream));
4032 memset (& config, 0, sizeof (config));
4033
4034 if (input == NULL || output == NULL) {
4035 stream.msg = "invalid input/output buffer";
4036 ret = XD3_INTERNAL;
4037 goto exit;
4038 }
4039
4040 config.flags = flags;
4041
4042 if (is_encode)
4043 {
4044 config.srcwin_maxsz = source_size;
4045 config.winsize = min(input_size, (usize_t) XD3_DEFAULT_WINSIZE);
4046 config.iopt_size = min(input_size / 32, XD3_DEFAULT_IOPT_SIZE);
4047 config.iopt_size = max(config.iopt_size, 128U);
4048 config.sprevsz = xd3_pow2_roundup (config.winsize);
4049 }
4050
4051 if ((ret = xd3_config_stream (&stream, &config)) != 0)
4052 {
4053 goto exit;
4054 }
4055
4056 if (source != NULL)
4057 {
4058 memset (& src, 0, sizeof (src));
4059 src.size = source_size;
4060
4061 src.blksize = source_size;
4062 src.onblk = source_size;
4063 src.curblk = source;
4064 src.curblkno = 0;
4065
4066 if ((ret = xd3_set_source (&stream, &src)) != 0)
4067 {
4068 goto exit;
4069 }
4070 }
4071
4072 if ((ret = xd3_process_stream (is_encode,
4073 & stream,
4074 func, 1,
4075 input, input_size,
4076 output,
4077 output_size,
4078 output_size_max)) != 0)
4079 {
4080 goto exit;
4081 }
4082
4083 exit:
4084 if (ret != 0) {
4085 IF_DEBUG1 (DP(RINT "process_memory: %d: %s", ret, stream.msg));
4086 }
4087 xd3_free_stream(&stream);
4088 return ret;
4089}
4090
4091int
4092xd3_decode_stream (xd3_stream *stream,
4093 const uint8_t *input,
4094 usize_t input_size,
4095 uint8_t *output,
4096 usize_t *output_size,
4097 usize_t output_size_max)
4098{
4099 return xd3_process_stream (0, stream, & xd3_decode_input, 1,
4100 input, input_size,
4101 output, output_size, output_size_max);
4102}
4103
4104int
4105xd3_decode_memory (const uint8_t *input,
4106 usize_t input_size,
4107 const uint8_t *source,
4108 usize_t source_size,
4109 uint8_t *output,
4110 usize_t *output_size,
4111 usize_t output_size_max,
4112 int flags) {
4113 return xd3_process_memory (0, & xd3_decode_input, 1,
4114 input, input_size,
4115 source, source_size,
4116 output, output_size, output_size_max,
4117 flags);
4118}
4119
4120
4121#if XD3_ENCODER
4122int
4123xd3_encode_stream (xd3_stream *stream,
4124 const uint8_t *input,
4125 usize_t input_size,
4126 uint8_t *output,
4127 usize_t *output_size,
4128 usize_t output_size_max)
4129{
4130 return xd3_process_stream (1, stream, & xd3_encode_input, 1,
4131 input, input_size,
4132 output, output_size, output_size_max);
4133}
4134
4135int
4136xd3_encode_memory (const uint8_t *input,
4137 usize_t input_size,
4138 const uint8_t *source,
4139 usize_t source_size,
4140 uint8_t *output,
4141 usize_t *output_size,
4142 usize_t output_size_max,
4143 int flags) {
4144 return xd3_process_memory (1, & xd3_encode_input, 1,
4145 input, input_size,
4146 source, source_size,
4147 output, output_size, output_size_max,
4148 flags);
4149}
4150#endif
4151
4152
4153/*************************************************************
4154 String matching helpers
4155 *************************************************************/
4156
4157#if XD3_ENCODER
4158/* Do the initial xd3_string_match() checksum table setup.
4159 * Allocations are delayed until first use to avoid allocation
4160 * sometimes (e.g., perfect matches, zero-length inputs). */
4161static int
4162xd3_string_match_init (xd3_stream *stream)
4163{
4164 const int DO_SMALL = ! (stream->flags & XD3_NOCOMPRESS);
4165 const int DO_LARGE = (stream->src != NULL);
4166
4167 if (DO_LARGE && stream->large_table == NULL)
4168 {
4169 if ((stream->large_table =
4170 (usize_t*) xd3_alloc0 (stream, stream->large_hash.size, sizeof (usize_t))) == NULL)
4171 {
4172 return ENOMEM;
4173 }
4174 }
4175
4176 if (DO_SMALL)
4177 {
4178 /* Subsequent calls can return immediately after checking reset. */
4179 if (stream->small_table != NULL)
4180 {
4181 /* The target hash table is reinitialized once per window. */
4182 /* TODO: This would not have to be reinitialized if absolute
4183 * offsets were being stored. */
4184 if (stream->small_reset)
4185 {
4186 stream->small_reset = 0;
4187 memset (stream->small_table, 0,
4188 sizeof (usize_t) * stream->small_hash.size);
4189 }
4190
4191 return 0;
4192 }
4193
4194 if ((stream->small_table =
4195 (usize_t*) xd3_alloc0 (stream,
4196 stream->small_hash.size,
4197 sizeof (usize_t))) == NULL)
4198 {
4199 return ENOMEM;
4200 }
4201
4202 /* If there is a previous table needed. */
4203 if (stream->smatcher.small_lchain > 1 ||
4204 stream->smatcher.small_chain > 1)
4205 {
4206 if ((stream->small_prev =
4207 (xd3_slist*) xd3_alloc (stream,
4208 stream->sprevsz,
4209 sizeof (xd3_slist))) == NULL)
4210 {
4211 return ENOMEM;
4212 }
4213 }
4214 }
4215
4216 return 0;
4217}
4218
4219#if XD3_USE_LARGEFILE64
4220/* This function handles the 32/64bit ambiguity -- file positions are 64bit
4221 * but the hash table for source-offsets is 32bit. */
4222static xoff_t
4223xd3_source_cksum_offset(xd3_stream *stream, usize_t low)
4224{
4225 xoff_t scp = stream->srcwin_cksum_pos;
4226 xoff_t s0 = scp >> 32;
4227
4228 usize_t sr = (usize_t) scp;
4229
4230 if (s0 == 0) {
4231 return low;
4232 }
4233
4234 /* This should not be >= because srcwin_cksum_pos is the next
4235 * position to index. */
4236 if (low > sr) {
4237 return (--s0 << 32) | low;
4238 }
4239
4240 return (s0 << 32) | low;
4241}
4242#else
4243static xoff_t
4244xd3_source_cksum_offset(xd3_stream *stream, usize_t low)
4245{
4246 return (xoff_t) low;
4247}
4248#endif
4249
4250/* This function sets up the stream->src fields srcbase, srclen. The
4251 * call is delayed until these values are needed to encode a copy
4252 * address. At this point the decision has to be made. */
4253static int
4254xd3_srcwin_setup (xd3_stream *stream)
4255{
4256 xd3_source *src = stream->src;
4257 xoff_t length, x;
4258
4259 /* Check the undecided state. */
4260 XD3_ASSERT (src->srclen == 0 && src->srcbase == 0);
4261
4262 /* Avoid repeating this call. */
4263 stream->srcwin_decided = 1;
4264
4265 /* If the stream is flushing, then the iopt buffer was able to
4266 * contain the complete encoding. If no copies were issued no
4267 * source window is actually needed. This prevents the VCDIFF
4268 * header from including source base/len. xd3_emit_hdr checks for
4269 * srclen == 0. */
4270 if (stream->enc_state == ENC_INSTR && stream->match_maxaddr == 0)
4271 {
4272 goto done;
4273 }
4274
4275 /* Check for overflow, srclen is usize_t - this can't happen unless
4276 * XD3_DEFAULT_SRCBACK and related parameters are extreme - should
4277 * use smaller windows. */
4278 length = stream->match_maxaddr - stream->match_minaddr;
4279
4280 x = (xoff_t) USIZE_T_MAX;
4281 if (length > x)
4282 {
4283 stream->msg = "source window length overflow (not 64bit)";
4284 return XD3_INTERNAL;
4285 }
4286
4287 /* If ENC_INSTR, then we know the exact source window to use because
4288 * no more copies can be issued. */
4289 if (stream->enc_state == ENC_INSTR)
4290 {
4291 src->srcbase = stream->match_minaddr;
4292 src->srclen = (usize_t) length;
4293 XD3_ASSERT (src->srclen);
4294 goto done;
4295 }
4296
4297 /* Otherwise, we have to make a guess. More copies may still be
4298 * issued, but we have to decide the source window base and length
4299 * now. */
4300 src->srcbase = stream->match_minaddr;
4301 src->srclen = max ((usize_t) length, stream->avail_in + (stream->avail_in >> 2));
4302 if (src->size < src->srcbase + (xoff_t) src->srclen)
4303 {
4304 /* Could reduce srcbase, as well. */
4305 src->srclen = src->size - src->srcbase;
4306 }
4307
4308 XD3_ASSERT (src->srclen);
4309 done:
4310 /* Set the taroff. This convenience variable is used even when
4311 stream->src == NULL. */
4312 stream->taroff = src->srclen;
4313 return 0;
4314}
4315
4316/* Sets the bounding region for a newly discovered source match, prior
4317 * to calling xd3_source_extend_match(). This sets the match_maxfwd,
4318 * match_maxback variables. Note: srcpos is an absolute position
4319 * (xoff_t) but the match_maxfwd, match_maxback variables are usize_t.
4320 * Returns 0 if the setup succeeds, or 1 if the source position lies
4321 * outside an already-decided srcbase/srclen window. */
4322static int
4323xd3_source_match_setup (xd3_stream *stream, xoff_t srcpos)
4324{
4325 xd3_source *src = stream->src;
4326 usize_t greedy_or_not;
4327
4328 stream->match_maxback = 0;
4329 stream->match_maxfwd = 0;
4330 stream->match_back = 0;
4331 stream->match_fwd = 0;
4332
4333 /* This avoids a non-blocking endless loop caused by scanning
4334 * backwards across a block boundary, only to find not enough
4335 * matching bytes to beat the current min_match due to a better lazy
4336 * target match: the re-entry to xd3_string_match() repeats the same
4337 * long match because the input position hasn't changed. TODO: if
4338 * ever duplicates are added to the source hash table, this logic
4339 * won't suffice to avoid loops. See testing/regtest.cc's
4340 * TestNonBlockingProgress test! */
4341 if (srcpos != 0 && srcpos == stream->match_last_srcpos)
4342 {
4343 goto bad;
4344 }
4345
4346 /* Going backwards, the 1.5-pass algorithm allows some
4347 * already-matched input may be covered by a longer source match.
4348 * The greedy algorithm does not allow this. */
4349 if (stream->flags & XD3_BEGREEDY)
4350 {
4351 /* The greedy algorithm allows backward matching to the last
4352 matched position. */
4353 greedy_or_not = xd3_iopt_last_matched (stream);
4354 }
4355 else
4356 {
4357 /* The 1.5-pass algorithm allows backward matching to go back as
4358 * far as the unencoded offset, which is updated as instructions
4359 * pass out of the iopt buffer. If this (default) is chosen, it
4360 * means xd3_iopt_erase may be called to eliminate instructions
4361 * when a covering source match is found. */
4362 greedy_or_not = stream->unencoded_offset;
4363 }
4364
4365 /* Backward target match limit. */
4366 XD3_ASSERT (stream->input_position >= greedy_or_not);
4367 stream->match_maxback = stream->input_position - greedy_or_not;
4368
4369 /* Forward target match limit. */
4370 XD3_ASSERT (stream->avail_in > stream->input_position);
4371 stream->match_maxfwd = stream->avail_in - stream->input_position;
4372
4373 /* Now we take the source position into account. It depends whether
4374 * the srclen/srcbase have been decided yet. */
4375 if (stream->srcwin_decided == 0)
4376 {
4377 /* Unrestricted case: the match can cover the entire source,
4378 * 0--src->size. We compare the usize_t
4379 * match_maxfwd/match_maxback against the xoff_t
4380 * src->size/srcpos values and take the min. */
4381 xoff_t srcavail;
4382
4383 if (srcpos < (xoff_t) stream->match_maxback)
4384 {
4385 stream->match_maxback = srcpos;
4386 }
4387
4388 srcavail = src->size - srcpos;
4389 if (srcavail < (xoff_t) stream->match_maxfwd)
4390 {
4391 stream->match_maxfwd = srcavail;
4392 }
4393
4394 goto good;
4395 }
4396
4397 /* Decided some source window. */
4398 XD3_ASSERT (src->srclen > 0);
4399
4400 /* Restricted case: fail if the srcpos lies outside the source window */
4401 if ((srcpos < src->srcbase) || (srcpos > (src->srcbase + (xoff_t) src->srclen)))
4402 {
4403 goto bad;
4404 }
4405 else
4406 {
4407 usize_t srcavail;
4408
4409 srcavail = (usize_t) (srcpos - src->srcbase);
4410 if (srcavail < stream->match_maxback)
4411 {
4412 stream->match_maxback = srcavail;
4413 }
4414
4415 srcavail = (usize_t) (src->srcbase + (xoff_t) src->srclen - srcpos);
4416 if (srcavail < stream->match_maxfwd) {
4417 stream->match_maxfwd = srcavail;
4418 }
4419
4420 goto good;
4421 }
4422
4423 good:
4424 stream->match_state = MATCH_BACKWARD;
4425 stream->match_srcpos = srcpos;
4426 stream->match_last_srcpos = srcpos;
4427 return 0;
4428
4429 bad:
4430 stream->match_state = MATCH_SEARCHING;
4431 return 1;
4432}
4433
4434/* This code is experimental, and I'm having trouble benchmarking
4435 * it reliably. */
4436#if 0
4437static inline int
4438xd3_forward_match(const uint8_t *s1c, const uint8_t *s2c, size_t n)
4439{
4440 size_t i = 0;
4441#if UNALIGNED_OK
4442 size_t nint = n / sizeof(int);
4443
4444 if (nint >> 3)
4445 {
4446 size_t j = 0;
4447 const int *s1 = (const int*)s1c;
4448 const int *s2 = (const int*)s2c;
4449 size_t nint_8 = nint - 8;
4450
4451 while (i <= nint_8 &&
4452 s1[i++] == s2[j++] &&
4453 s1[i++] == s2[j++] &&
4454 s1[i++] == s2[j++] &&
4455 s1[i++] == s2[j++] &&
4456 s1[i++] == s2[j++] &&
4457 s1[i++] == s2[j++] &&
4458 s1[i++] == s2[j++] &&
4459 s1[i++] == s2[j++]) { }
4460
4461 i = (i - 1) * sizeof(int);
4462 }
4463#endif
4464
4465 while (i < n && s1c[i] == s2c[i])
4466 {
4467 i++;
4468 }
4469 return i;
4470}
4471#else
4472static inline usize_t
4473xd3_forward_match(const uint8_t *s1c,
4474 const uint8_t *s2c,
4475 usize_t n) {
4476 usize_t i = 0;
4477 while (i < n && s1c[i] == s2c[i])
4478 {
4479 i++;
4480 }
4481 return i;
4482}
4483#endif
4484
4485
4486/* This function expands the source match backward and forward. It is
4487 * reentrant, since xd3_getblk may return XD3_GETSRCBLK, so most
4488 * variables are kept in xd3_stream. There are two callers of this
4489 * function, the string_matching routine when a checksum match is
4490 * discovered, and xd3_encode_input whenever a continuing (or initial)
4491 * match is suspected. The two callers do different things with the
4492 * input_position, thus this function leaves that variable untouched.
4493 * If a match is taken the resulting stream->match_fwd is left
4494 * non-zero. */
4495static int
4496xd3_source_extend_match (xd3_stream *stream)
4497{
4498 int ret;
4499 xd3_source *src = stream->src;
4500 xoff_t matchoff; /* matchoff is the current right/left-boundary of
4501 the source match being tested. */
4502 usize_t streamoff; /* streamoff is the current right/left-boundary
4503 of the input match being tested. */
4504 xoff_t tryblk; /* tryblk, tryoff are the block, offset position
4505 of matchoff */
4506 usize_t tryoff;
4507 usize_t tryrem; /* tryrem is the number of matchable bytes */
4508 usize_t matched;
4509
4510 XD3_ASSERT (src != NULL);
4511
4512 /* Does it make sense to compute backward match AFTER forward match? */
4513 if (stream->match_state == MATCH_BACKWARD)
4514 {
4515 /* Note: this code is practically duplicated below, substituting
4516 * match_fwd/match_back and direction. Consolidate? */
4517 matchoff = stream->match_srcpos - stream->match_back;
4518 streamoff = stream->input_position - stream->match_back;
4519 xd3_blksize_div (matchoff, src, &tryblk, &tryoff);
4520
4521 /* this loops backward over source blocks */
4522 while (stream->match_back < stream->match_maxback)
4523 {
4524 /* see if we're backing across a source block boundary */
4525 if (tryoff == 0)
4526 {
4527 tryoff = src->blksize;
4528 tryblk -= 1;
4529 }
4530
4531 if ((ret = xd3_getblk (stream, tryblk)))
4532 {
4533 /* if search went too far back, continue forward. */
4534 if (ret == XD3_TOOFARBACK)
4535 {
4536 break;
4537 }
4538
4539 /* could be a XD3_GETSRCBLK failure. */
4540 return ret;
4541 }
4542
4543 /* TODO: This code can be optimized similar to xd3_match_forward() */
4544 for (tryrem = min (tryoff, stream->match_maxback -
4545 stream->match_back);
4546 tryrem != 0;
4547 tryrem -= 1, stream->match_back += 1)
4548 {
4549 if (src->curblk[tryoff-1] != stream->next_in[streamoff-1])
4550 {
4551 goto doneback;
4552 }
4553
4554 tryoff -= 1;
4555 streamoff -= 1;
4556 }
4557 }
4558
4559 doneback:
4560 stream->match_state = MATCH_FORWARD;
4561 }
4562
4563 XD3_ASSERT (stream->match_state == MATCH_FORWARD);
4564
4565 matchoff = stream->match_srcpos + stream->match_fwd;
4566 streamoff = stream->input_position + stream->match_fwd;
4567 xd3_blksize_div (matchoff, src, & tryblk, & tryoff);
4568
4569 /* Note: practically the same code as backwards case above: same comments */
4570 while (stream->match_fwd < stream->match_maxfwd)
4571 {
4572 if (tryoff == src->blksize)
4573 {
4574 tryoff = 0;
4575 tryblk += 1;
4576 }
4577
4578 if ((ret = xd3_getblk (stream, tryblk)))
4579 {
4580 /* if search went too far back, continue forward. */
4581 if (ret == XD3_TOOFARBACK)
4582 {
4583 break;
4584 }
4585
4586 /* could be a XD3_GETSRCBLK failure. */
4587 return ret;
4588 }
4589
4590 tryrem = min(stream->match_maxfwd - stream->match_fwd,
4591 src->blksize - tryoff);
4592
4593 matched = xd3_forward_match(src->curblk + tryoff,
4594 stream->next_in + streamoff,
4595 tryrem);
4596 tryoff += matched;
4597 streamoff += matched;
4598 stream->match_fwd += matched;
4599
4600 if (tryrem != matched)
4601 {
4602 break;
4603 }
4604 }
4605
4606 stream->match_state = MATCH_SEARCHING;
4607
4608 /* If the match ends short of the last instruction end, we probably
4609 * don't want it. There is the possibility that a copy ends short
4610 * of the last copy but also goes further back, in which case we
4611 * might want it. This code does not implement such: if so we would
4612 * need more complicated xd3_iopt_erase logic. */
4613 if (stream->match_fwd < stream->min_match)
4614 {
4615 stream->match_fwd = 0;
4616 }
4617 else
4618 {
4619 usize_t total = stream->match_fwd + stream->match_back;
4620
4621 /* Correct the variables to remove match_back from the equation. */
4622 usize_t target_position = stream->input_position - stream->match_back;
4623 usize_t match_length = stream->match_back + stream->match_fwd;
4624 xoff_t match_position = stream->match_srcpos - stream->match_back;
4625 xoff_t match_end = stream->match_srcpos + stream->match_fwd;
4626
4627 /* At this point we may have to erase any iopt-buffer
4628 * instructions that are fully covered by a backward-extending
4629 * copy. */
4630 if (stream->match_back > 0)
4631 {
4632 xd3_iopt_erase (stream, target_position, total);
4633 }
4634
4635 stream->match_back = 0;
4636
4637 /* Update ranges. The first source match occurs with both
4638 values set to 0. */
4639 if (stream->match_maxaddr == 0 ||
4640 match_position < stream->match_minaddr)
4641 {
4642 stream->match_minaddr = match_position;
4643 }
4644
4645 if (match_end > stream->match_maxaddr)
4646 {
4647 /* Note: per-window */
4648 stream->match_maxaddr = match_end;
4649 }
4650
4651 if (match_end > stream->maxsrcaddr)
4652 {
4653 /* Note: across windows */
4654 stream->maxsrcaddr = match_end;
4655 }
4656
4657 IF_DEBUG1 ({
4658 static int x = 0;
4659 DP(RINT "[source match:%d] <inp %"Q"u %"Q"u> <src %"Q"u %"Q"u> (%s) [ %u bytes ]\n",
4660 x++,
4661 stream->total_in + target_position,
4662 stream->total_in + target_position + match_length,
4663 match_position,
4664 match_position + match_length,
4665 (stream->total_in + target_position == match_position) ? "same" : "diff",
4666 match_length);
4667 });
4668
4669 if ((ret = xd3_found_match (stream,
4670 /* decoder position */ target_position,
4671 /* length */ match_length,
4672 /* address */ match_position,
4673 /* is_source */ 1)))
4674 {
4675 return ret;
4676 }
4677
4678 /* If the match ends with the available input: */
4679 if (target_position + match_length == stream->avail_in)
4680 {
4681 /* Setup continuing match for the next window. */
4682 stream->match_state = MATCH_TARGET;
4683 stream->match_srcpos = match_end;
4684 }
4685 }
4686
4687 return 0;
4688}
4689
4690/* Update the small hash. Values in the small_table are offset by
4691 * HASH_CKOFFSET (1) to distinguish empty buckets from real offsets. */
4692static void
4693xd3_scksum_insert (xd3_stream *stream,
4694 usize_t inx,
4695 usize_t scksum,
4696 usize_t pos)
4697{
4698 /* If we are maintaining previous duplicates. */
4699 if (stream->small_prev)
4700 {
4701 usize_t last_pos = stream->small_table[inx];
4702 xd3_slist *pos_list = & stream->small_prev[pos & stream->sprevmask];
4703
4704 /* Note last_pos is offset by HASH_CKOFFSET. */
4705 pos_list->last_pos = last_pos;
4706 }
4707
4708 /* Enter the new position into the hash bucket. */
4709 stream->small_table[inx] = pos + HASH_CKOFFSET;
4710}
4711
4712#if XD3_DEBUG
4713static int
4714xd3_check_smatch (const uint8_t *ref0, const uint8_t *inp0,
4715 const uint8_t *inp_max, usize_t cmp_len)
4716{
4717 usize_t i;
4718
4719 for (i = 0; i < cmp_len; i += 1)
4720 {
4721 XD3_ASSERT (ref0[i] == inp0[i]);
4722 }
4723
4724 if (inp0 + cmp_len < inp_max)
4725 {
4726 XD3_ASSERT (inp0[i] != ref0[i]);
4727 }
4728
4729 return 1;
4730}
4731#endif /* XD3_DEBUG */
4732
4733/* When the hash table indicates a possible small string match, it
4734 * calls this routine to find the best match. The first matching
4735 * position is taken from the small_table, HASH_CKOFFSET is subtracted
4736 * to get the actual position. After checking that match, if previous
4737 * linked lists are in use (because stream->smatcher.small_chain > 1),
4738 * previous matches are tested searching for the longest match. If
4739 * (stream->min_match > MIN_MATCH) then a lazy match is in effect.
4740 */
4741static usize_t
4742xd3_smatch (xd3_stream *stream,
4743 usize_t base,
4744 usize_t scksum,
4745 usize_t *match_offset)
4746{
4747 usize_t cmp_len;
4748 usize_t match_length = 0;
4749 usize_t chain = (stream->min_match == MIN_MATCH ?
4750 stream->smatcher.small_chain :
4751 stream->smatcher.small_lchain);
4752 const uint8_t *inp_max = stream->next_in + stream->avail_in;
4753 const uint8_t *inp;
4754 const uint8_t *ref;
4755
4756 SMALL_HASH_DEBUG1 (stream, stream->next_in + stream->input_position);
4757
4758 XD3_ASSERT (stream->min_match + stream->input_position <= stream->avail_in);
4759
4760 base -= HASH_CKOFFSET;
4761
4762 again:
4763
4764 IF_DEBUG2 (DP(RINT "smatch at base=%u inp=%u cksum=%u\n", base,
4765 stream->input_position, scksum));
4766
4767 /* For small matches, we can always go to the end-of-input because
4768 * the matching position must be less than the input position. */
4769 XD3_ASSERT (base < stream->input_position);
4770
4771 ref = stream->next_in + base;
4772 inp = stream->next_in + stream->input_position;
4773
4774 SMALL_HASH_DEBUG2 (stream, ref);
4775
4776 /* Expand potential match forward. */
4777 while (inp < inp_max && *inp == *ref)
4778 {
4779 ++inp;
4780 ++ref;
4781 }
4782
4783 cmp_len = inp - (stream->next_in + stream->input_position);
4784
4785 /* Verify correctness */
4786 XD3_ASSERT (xd3_check_smatch (stream->next_in + base,
4787 stream->next_in + stream->input_position,
4788 inp_max, cmp_len));
4789
4790 /* Update longest match */
4791 if (cmp_len > match_length)
4792 {
4793 ( match_length) = cmp_len;
4794 (*match_offset) = base;
4795
4796 /* Stop if we match the entire input or have a long_enough match. */
4797 if (inp == inp_max || cmp_len >= stream->smatcher.long_enough)
4798 {
4799 goto done;
4800 }
4801 }
4802
4803 /* If we have not reached the chain limit, see if there is another
4804 previous position. */
4805 while (--chain != 0)
4806 {
4807 /* Calculate the previous offset. */
4808 usize_t prev_pos = stream->small_prev[base & stream->sprevmask].last_pos;
4809 usize_t diff_pos;
4810
4811 if (prev_pos == 0)
4812 {
4813 break;
4814 }
4815
4816 prev_pos -= HASH_CKOFFSET;
4817
4818 if (prev_pos > base)
4819 {
4820 break;
4821 }
4822
4823 base = prev_pos;
4824
4825 XD3_ASSERT (stream->input_position > base);
4826 diff_pos = stream->input_position - base;
4827
4828 /* Stop searching if we go beyond sprevsz, since those entries
4829 * are for unrelated checksum entries. */
4830 if (diff_pos & ~stream->sprevmask)
4831 {
4832 break;
4833 }
4834
4835 goto again;
4836 }
4837
4838 done:
4839 /* Crude efficiency test: if the match is very short and very far back, it's
4840 * unlikely to help, but the exact calculation requires knowing the state of
4841 * the address cache and adjacent instructions, which we can't do here.
4842 * Rather than encode a probably inefficient copy here and check it later
4843 * (which complicates the code a lot), do this:
4844 */
4845 if (match_length == 4 && stream->input_position - (*match_offset) >= 1<<14)
4846 {
4847 /* It probably takes >2 bytes to encode an address >= 2^14 from here */
4848 return 0;
4849 }
4850 if (match_length == 5 && stream->input_position - (*match_offset) >= 1<<21)
4851 {
4852 /* It probably takes >3 bytes to encode an address >= 2^21 from here */
4853 return 0;
4854 }
4855
4856 /* It's unlikely that a window is large enough for the (match_length == 6 &&
4857 * address >= 2^28) check */
4858 return match_length;
4859}
4860
4861#if XD3_DEBUG
4862static void
4863xd3_verify_small_state (xd3_stream *stream,
4864 const uint8_t *inp,
4865 uint32_t x_cksum)
4866{
4867 uint32_t state;
4868 uint32_t cksum = xd3_scksum (&state, inp, stream->smatcher.small_look);
4869
4870 XD3_ASSERT (cksum == x_cksum);
4871}
4872
4873static void
4874xd3_verify_large_state (xd3_stream *stream,
4875 const uint8_t *inp,
4876 uint32_t x_cksum)
4877{
4878 uint32_t cksum = xd3_lcksum (inp, stream->smatcher.large_look);
4879 XD3_ASSERT (cksum == x_cksum);
4880}
4881static void
4882xd3_verify_run_state (xd3_stream *stream,
4883 const uint8_t *inp,
4884 int x_run_l,
4885 uint8_t x_run_c)
4886{
4887 int slook = stream->smatcher.small_look;
4888 uint8_t run_c;
4889 int run_l = xd3_comprun (inp, slook, &run_c);
4890
4891 XD3_ASSERT (run_l == 0 || run_c == x_run_c);
4892 XD3_ASSERT (x_run_l > slook || run_l == x_run_l);
4893}
4894#endif /* XD3_DEBUG */
4895
4896/* This function computes more source checksums to advance the window.
4897 * Called at every entrance to the string-match loop and each time
4898 * stream->input_position reaches the value returned as
4899 * *next_move_point. NB: this is one of the most expensive functions
4900 * in this code and also the most critical for good compression.
4901 *
4902 * TODO: really would like a good test for this logic. how?
4903 * Update: testing/regtest.cc has some basic tests, more would be nice.
4904 * TODO: optimize the inner loop
4905 */
4906static int
4907xd3_srcwin_move_point (xd3_stream *stream, usize_t *next_move_point)
4908{
4909 xoff_t logical_input_cksum_pos;
4910
4911 XD3_ASSERT(stream->srcwin_cksum_pos <= stream->src->size);
4912 if (stream->srcwin_cksum_pos == stream->src->size)
4913 {
4914 *next_move_point = USIZE_T_MAX;
4915 return 0;
4916 }
4917
4918 /* Begin by advancing at twice the input rate, up to half the
4919 * maximum window size. */
4920 logical_input_cksum_pos = min((stream->total_in + stream->input_position) * 2,
4921 (stream->total_in + stream->input_position) +
4922 (stream->srcwin_maxsz / 2));
4923
4924 /* If srcwin_cksum_pos is already greater, wait until the difference
4925 * is met. */
4926 if (stream->srcwin_cksum_pos > logical_input_cksum_pos)
4927 {
4928 *next_move_point = stream->input_position +
4929 (usize_t)(stream->srcwin_cksum_pos - logical_input_cksum_pos);
4930 return 0;
4931 }
4932
4933 /* A long match may have extended past srcwin_cksum_pos. Don't
4934 * start checksumming already-matched source data. */
4935 if (stream->maxsrcaddr > stream->srcwin_cksum_pos)
4936 {
4937 stream->srcwin_cksum_pos = stream->maxsrcaddr;
4938 }
4939
4940 if (logical_input_cksum_pos < stream->srcwin_cksum_pos)
4941 {
4942 logical_input_cksum_pos = stream->srcwin_cksum_pos;
4943 }
4944
4945 /* Advance at least one source block. With the command-line
4946 * defaults this means:
4947 *
4948 * if (src->size <= srcwin_maxsz), index the entire source at once
4949 * using the position of the first non-match. This is good for
4950 * small inputs, especially when the content may have moved anywhere
4951 * in the file (e.g., tar files).
4952 *
4953 * if (src->size > srcwin_maxsz), index at least one block (which
4954 * the command-line sets to 1/32 of srcwin_maxsz) ahead of the
4955 * logical position. This is good for different reasons: when a
4956 * long match spanning several source blocks is encountered, this
4957 * avoids computing checksums for those blocks. If the data can
4958 * move anywhere, this is bad.
4959 */
4960 logical_input_cksum_pos += stream->src->blksize;
4961
4962 IF_DEBUG1 (DP(RINT "[srcwin_move_point] T=%"Q"u S=%"Q"u/%"Q"u\n",
4963 stream->total_in + stream->input_position,
4964 stream->srcwin_cksum_pos,
4965 logical_input_cksum_pos));
4966
4967 while (stream->srcwin_cksum_pos < logical_input_cksum_pos &&
4968 stream->srcwin_cksum_pos < stream->src->size)
4969 {
4970 xoff_t blkno;
4971 xoff_t blkbaseoffset;
4972 usize_t blkrem;
4973 ssize_t oldpos;
4974 ssize_t blkpos;
4975 int ret;
4976 xd3_blksize_div (stream->srcwin_cksum_pos,
4977 stream->src, &blkno, &blkrem);
4978 oldpos = blkrem;
4979 blkpos = xd3_bytes_on_srcblk_fast (stream->src, blkno);
4980
4981 if (oldpos + stream->smatcher.large_look > (usize_t) blkpos)
4982 {
4983 stream->srcwin_cksum_pos = (blkno + 1) * stream->src->blksize;
4984 continue;
4985 }
4986
4987 if ((ret = xd3_getblk (stream, blkno)))
4988 {
4989 /* TOOFARBACK should never occur here, since we read forward. */
4990 if (ret == XD3_TOOFARBACK)
4991 {
4992 ret = XD3_INTERNAL;
4993 }
4994 return ret;
4995 }
4996
4997 /* This inserts checksums for the entire block, in reverse,
4998 * starting from the end of the block. This logic does not test
4999 * stream->srcwin_cksum_pos because it always advances it to the
5000 * start of the next block.
5001 *
5002 * oldpos is the srcwin_cksum_pos within this block. blkpos is
5003 * the number of bytes available. Each iteration inspects
5004 * large_look bytes then steps back large_step bytes. The
5005 * if-stmt above ensures at least one large_look of data. */
5006 blkpos -= stream->smatcher.large_look;
5007 blkbaseoffset = stream->src->blksize * blkno;
5008
5009 do
5010 {
5011 uint32_t cksum = xd3_lcksum (stream->src->curblk + blkpos,
5012 stream->smatcher.large_look);
5013 usize_t hval = xd3_checksum_hash (& stream->large_hash, cksum);
5014
5015 stream->large_table[hval] =
5016 (usize_t) (blkbaseoffset +
5017 (xoff_t)(blkpos + HASH_CKOFFSET));
5018
5019 IF_DEBUG (stream->large_ckcnt += 1);
5020
5021 blkpos -= stream->smatcher.large_step;
5022 }
5023 while (blkpos >= oldpos);
5024
5025 stream->srcwin_cksum_pos = (blkno + 1) * stream->src->blksize;
5026 }
5027
5028 if (stream->srcwin_cksum_pos >= stream->src->size)
5029 {
5030 /* This invariant is needed for xd3_source_cksum_offset() */
5031 stream->srcwin_cksum_pos = stream->src->size;
5032 *next_move_point = USIZE_T_MAX;
5033 return 0;
5034 }
5035
5036 /* How long until this function should be called again. */
5037 XD3_ASSERT(stream->srcwin_cksum_pos >= logical_input_cksum_pos);
5038 *next_move_point = stream->input_position + 1 +
5039 (usize_t)(stream->srcwin_cksum_pos - logical_input_cksum_pos);
5040 return 0;
5041}
5042
5043#endif /* XD3_ENCODER */
5044
5045/********************************************************************
5046 TEMPLATE pass
5047 *********************************************************************/
5048
5049#endif /* __XDELTA3_C_INLINE_PASS__ */
5050#ifdef __XDELTA3_C_TEMPLATE_PASS__
5051
5052#if XD3_ENCODER
5053
5054/********************************************************************
5055 Templates
5056 *******************************************************************/
5057
5058/* Template macros */
5059#define XD3_TEMPLATE(x) XD3_TEMPLATE2(x,TEMPLATE)
5060#define XD3_TEMPLATE2(x,n) XD3_TEMPLATE3(x,n)
5061#define XD3_TEMPLATE3(x,n) x ## n
5062#define XD3_STRINGIFY(x) XD3_STRINGIFY2(x)
5063#define XD3_STRINGIFY2(x) #x
5064
5065static int XD3_TEMPLATE(xd3_string_match_) (xd3_stream *stream);
5066
5067static const xd3_smatcher XD3_TEMPLATE(__smatcher_) =
5068{
5069 XD3_STRINGIFY(TEMPLATE),
5070 XD3_TEMPLATE(xd3_string_match_),
5071#if SOFTCFG == 1
5072 0, 0, 0, 0, 0, 0, 0
5073#else
5074 LLOOK, LSTEP, SLOOK, SCHAIN, SLCHAIN, MAXLAZY, LONGENOUGH
5075#endif
5076};
5077
5078static int
5079XD3_TEMPLATE(xd3_string_match_) (xd3_stream *stream)
5080{
5081 const int DO_SMALL = ! (stream->flags & XD3_NOCOMPRESS);
5082 const int DO_LARGE = (stream->src != NULL);
5083 const int DO_RUN = (1);
5084
5085 const uint8_t *inp;
5086 uint32_t scksum = 0;
5087 uint32_t scksum_state;
5088 uint32_t lcksum = 0;
5089 usize_t sinx;
5090 usize_t linx;
5091 uint8_t run_c;
5092 size_t run_l;
5093 int ret;
5094 usize_t match_length;
5095 usize_t match_offset = 0;
5096 usize_t next_move_point;
5097
5098 /* If there will be no compression due to settings or short input,
5099 * skip it entirely. */
5100 if (! (DO_SMALL || DO_LARGE || DO_RUN) ||
5101 stream->input_position + SLOOK > stream->avail_in) { goto loopnomore; }
5102
5103 if ((ret = xd3_string_match_init (stream))) { return ret; }
5104
5105 /* The restartloop label is reached when the incremental loop state
5106 * needs to be reset. */
5107 restartloop:
5108
5109 /* If there is not enough input remaining for any kind of match,
5110 skip it. */
5111 if (stream->input_position + SLOOK > stream->avail_in) { goto loopnomore; }
5112
5113 /* Now reset the incremental loop state: */
5114
5115 /* The min_match variable is updated to avoid matching the same lazy
5116 * match over and over again. For example, if you find a (small)
5117 * match of length 9 at one position, you will likely find a match
5118 * of length 8 at the next position. */
5119 if (xd3_iopt_last_matched (stream) > stream->input_position)
5120 {
5121 stream->min_match = max(MIN_MATCH,
5122 1 + xd3_iopt_last_matched(stream) -
5123 stream->input_position);
5124 }
5125 else
5126 {
5127 stream->min_match = MIN_MATCH;
5128 }
5129
5130 /* The current input byte. */
5131 inp = stream->next_in + stream->input_position;
5132
5133 /* Small match state. */
5134 if (DO_SMALL)
5135 {
5136 scksum = xd3_scksum (&scksum_state, inp, SLOOK);
5137 }
5138
5139 /* Run state. */
5140 if (DO_RUN)
5141 {
5142 run_l = xd3_comprun (inp, SLOOK, & run_c);
5143 }
5144
5145 /* Large match state. We continue the loop even after not enough
5146 * bytes for LLOOK remain, so always check stream->input_position in
5147 * DO_LARGE code. */
5148 if (DO_LARGE && (stream->input_position + LLOOK <= stream->avail_in))
5149 {
5150 /* Source window: next_move_point is the point that
5151 * stream->input_position must reach before computing more
5152 * source checksum. */
5153 if ((ret = xd3_srcwin_move_point (stream, & next_move_point)))
5154 {
5155 return ret;
5156 }
5157
5158 lcksum = xd3_lcksum (inp, LLOOK);
5159 }
5160
5161 /* TRYLAZYLEN: True if a certain length match should be followed by
5162 * lazy search. This checks that LEN is shorter than MAXLAZY and
5163 * that there is enough leftover data to consider lazy matching.
5164 * "Enough" is set to 2 since the next match will start at the next
5165 * offset, it must match two extra characters. */
5166#define TRYLAZYLEN(LEN,POS,MAX) ((MAXLAZY) > 0 && (LEN) < (MAXLAZY) \
5167 && (POS) + (LEN) <= (MAX) - 2)
5168
5169 /* HANDLELAZY: This statement is called each time an instruciton is
5170 * emitted (three cases). If the instruction is large enough, the
5171 * loop is restarted, otherwise lazy matching may ensue. */
5172#define HANDLELAZY(mlen) \
5173 if (TRYLAZYLEN ((mlen), (stream->input_position), (stream->avail_in))) \
5174 { stream->min_match = (mlen) + LEAST_MATCH_INCR; goto updateone; } \
5175 else \
5176 { stream->input_position += (mlen); goto restartloop; }
5177
5178 /* Now loop over one input byte at a time until a match is found... */
5179 for (;; inp += 1, stream->input_position += 1)
5180 {
5181 /* Now we try three kinds of string match in order of expense:
5182 * run, large match, small match. */
5183
5184 /* Expand the start of a RUN. The test for (run_l == SLOOK)
5185 * avoids repeating this check when we pass through a run area
5186 * performing lazy matching. The run is only expanded once when
5187 * the min_match is first reached. If lazy matching is
5188 * performed, the run_l variable will remain inconsistent until
5189 * the first non-running input character is reached, at which
5190 * time the run_l may then again grow to SLOOK. */
5191 if (DO_RUN && run_l == SLOOK)
5192 {
5193 usize_t max_len = stream->avail_in - stream->input_position;
5194
5195 IF_DEBUG (xd3_verify_run_state (stream, inp, run_l, run_c));
5196
5197 while (run_l < max_len && inp[run_l] == run_c) { run_l += 1; }
5198
5199 /* Output a RUN instruction. */
5200 if (run_l >= stream->min_match && run_l >= MIN_RUN)
5201 {
5202 if ((ret = xd3_emit_run (stream, stream->input_position,
5203 run_l, run_c))) { return ret; }
5204
5205 HANDLELAZY (run_l);
5206 }
5207 }
5208
5209 /* If there is enough input remaining. */
5210 if (DO_LARGE && (stream->input_position + LLOOK <= stream->avail_in))
5211 {
5212 if ((stream->input_position >= next_move_point) &&
5213 (ret = xd3_srcwin_move_point (stream, & next_move_point)))
5214 {
5215 return ret;
5216 }
5217
5218 linx = xd3_checksum_hash (& stream->large_hash, lcksum);
5219
5220 IF_DEBUG (xd3_verify_large_state (stream, inp, lcksum));
5221
5222 if (stream->large_table[linx] != 0)
5223 {
5224 /* the match_setup will fail if the source window has
5225 * been decided and the match lies outside it. You
5226 * could consider forcing a window at this point to
5227 * permit a new source window. */
5228 xoff_t adj_offset =
5229 xd3_source_cksum_offset(stream,
5230 stream->large_table[linx] -
5231 HASH_CKOFFSET);
5232 if (xd3_source_match_setup (stream, adj_offset) == 0)
5233 {
5234 if ((ret = xd3_source_extend_match (stream)))
5235 {
5236 return ret;
5237 }
5238
5239 /* Update stream position. match_fwd is zero if no
5240 * match. */
5241 if (stream->match_fwd > 0)
5242 {
5243 HANDLELAZY (stream->match_fwd);
5244 }
5245 }
5246 }
5247 }
5248
5249 /* Small matches. */
5250 if (DO_SMALL)
5251 {
5252 sinx = xd3_checksum_hash (& stream->small_hash, scksum);
5253
5254 /* Verify incremental state in debugging mode. */
5255 IF_DEBUG (xd3_verify_small_state (stream, inp, scksum));
5256
5257 /* Search for the longest match */
5258 if (stream->small_table[sinx] != 0)
5259 {
5260 match_length = xd3_smatch (stream,
5261 stream->small_table[sinx],
5262 scksum,
5263 & match_offset);
5264 }
5265 else
5266 {
5267 match_length = 0;
5268 }
5269
5270 /* Insert a hash for this string. */
5271 xd3_scksum_insert (stream, sinx, scksum, stream->input_position);
5272
5273 /* Maybe output a COPY instruction */
5274 if (match_length >= stream->min_match)
5275 {
5276 IF_DEBUG1 ({
5277 static int x = 0;
5278 DP(RINT "[target match:%d] <inp %u %u> <cpy %u %u> "
5279 "(-%d) [ %u bytes ]\n",
5280 x++,
5281 stream->input_position,
5282 stream->input_position + match_length,
5283 match_offset,
5284 match_offset + match_length,
5285 stream->input_position - match_offset,
5286 match_length);
5287 });
5288
5289 if ((ret = xd3_found_match (stream,
5290 /* decoder position */
5291 stream->input_position,
5292 /* length */ match_length,
5293 /* address */ match_offset,
5294 /* is_source */ 0)))
5295 {
5296 return ret;
5297 }
5298
5299 /* Copy instruction. */
5300 HANDLELAZY (match_length);
5301 }
5302 }
5303
5304 /* The logic above prevents excess work during lazy matching by
5305 * increasing min_match to avoid smaller matches. Each time we
5306 * advance stream->input_position by one, the minimum match
5307 * shortens as well. */
5308 if (stream->min_match > MIN_MATCH)
5309 {
5310 stream->min_match -= 1;
5311 }
5312
5313 updateone:
5314
5315 /* See if there are no more incremental cksums to compute. */
5316 if (stream->input_position + SLOOK == stream->avail_in)
5317 {
5318 goto loopnomore;
5319 }
5320
5321 /* Compute next RUN, CKSUM */
5322 if (DO_RUN) { NEXTRUN (inp[SLOOK]); }
5323 if (DO_SMALL)
5324 {
5325 scksum = xd3_small_cksum_update (&scksum_state, inp, SLOOK);
5326 }
5327 if (DO_LARGE && (stream->input_position + LLOOK < stream->avail_in))
5328 {
5329 lcksum = xd3_large_cksum_update (lcksum, inp, LLOOK);
5330 }
5331 }
5332
5333 loopnomore:
5334 return 0;
5335}
5336#endif /* XD3_ENCODER */
5337#endif /* __XDELTA3_C_TEMPLATE_PASS__ */