#pragma once
// this is the main include file of the cc4group API
// it contains the API struct as well as all types needed by users of the API
// all functions, types and constants available in the API are described here
// before heading on to the descriptions of the single elements needed, first take a look at the general concepts used throughout the API:
// - the API is designed around CC4Group objects
// CC4Group is an opaque type and thus only pointers to it may be used by applications
// a CC4Group instance can be obtained by calling cc4group.new()
// just keep in mind that it may return NULL in the rare case that memory allocation fails for whatever reason
// - after creating a fresh CC4Group object its intended use needs to be decided on, by either calling cc4group.create on it...
// ...to create an empty group in-memory where contents may be added later
// or calling cc4group.openExisting together with the path to a physical group file on disk to load its contents into memory
// (for more sophisticated cases, one of the other open functions may be used)
// it is important that exactly one of these functions is called, and only on freshly created groups
// otherwise, in the lucky case, some assertion may trigger; in the unlucky case the behavior is undefined
// this two step process is needed to ensure informative error reporting when something goes wrong while opening existing groups
// - after calling cc4group.create or some cc4group.open-function, the group can be inspected and modified to your heart's content through the rest of the available API
// once all modifications are done, they can be persisted by calling cc4group.save or cc4group.saveOverwrite (but better be sure about overwriting)...
// ...to save everything to the compressed C4Group format on disk
// after saving, the group can still be modified in the same way and saved again as often as desired
// just keep in mind that changes are lost if they are not saved afterwards
// - almost all functions need a path or name as argument
// cc4group.openExisting, cc4group.save and cc4group.saveOverwrite are the only exceptions that take a real full path to a file on disk
// all other functions take only so called "entry paths", which are paths inside the group
// imagine that the contents of the group are your "/" (or C:) and the current working directory is also "/" (or C:)
// thus any entry path is the absolute path of an entry inside the group
// the directory separator used for all entry paths in cc4group is "/" _regardless of the platform_. yes, you better watch out Windows users!
// in contrast to paths, names are only the name of the entry itself, not the whole path
// - to free all resources used by the group when it is not needed anymore, call cc4group.delete with the group pointer and discard it afterwards (i.e. don't use it anymore)
// - all functions, except cc4group.new, cc4group.delete and cc4group.setTmpMemoryStrategy (where the latter can't fail) follow the same scheme
// all of them can fail, either caused by wrong arguments or by things outside of the application's control
// they all return _true_ if everything went well and _false_ if any error occurred
// information about the error (incredibly useful for debugging or asking for help with problems) can be obtained by using one of the cc4group.getError* functions
// for starters, cc4group.getErrorMessage should contain more than enough information
// just make sure that the returned string is consumed before calling it again
// only errors in the initialization functions (create and open) are seriously critical and make it impossible to continue using the group object
// simple opening errors like file not found just leave the group object in its fresh "new" state, but existing but malformed groups may leave it in an unclean state
// thus the group object should be deleted and a new one created if one of these functions fail
// any other errors happening after successfully opening the group file are recoverable and the group object can still be used as usual, ...
// ...although it may be impossible to make progress if the error was caused by something severe like being out of memory
// in general you better check _every_ single API call for errors as it leads to much more consistent user experience in rare error cases and most importantly...
// ...it will be a great help in debugging mistakes that are made by the programmer
// for lazy writers or just some testing, the nature of the API actually makes it possible to just chain a bunch of API calls with && and...
// ...thanks to short-circuiting it will stop exactly at the error and the whole expression will return false and the error information can be retrieved as usual
// sadly, this trick didn't seem to make it into any example, as they are written to output additional, more user-friendly, information together with the error message
// - finally (and i still forgot some other general aspect used for sure), because the return value of the functions is already used for the success-bool, ...
// ...functions that should actually return some real data have one (or more) pointer-arguments where they store the returned information
// - if you are such a patient reader and you made it here, you may now look at some of the examples (in the examples folder)
// my recommendations for starters are c4info, c4ls, unc4group and c4copy as they are really simple and don't have much logic for other things
// some of the examples may also have practical use cases, for example unc4group as a much faster c4group extractor (compared to the official c4group command line tool)...
// ...or maybe c4cat which should be familiar, or maybe c4copy, which can freshly repack some really old groups (like version < 1.2 or maybe with worse compression)...
// ...without modifying the original group, and also keeping modification dates and authors, because currently cc4group doesn't update any of these automatically...
// ...(but it is planned to implement automatic updating of this stuff that can be turned off for such purposes)
// the examples are not really commented, so it is your exercise to actually find out what they do and how they do it exactly
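// example: a minimal sketch of the workflow described above (it is not taken from the examples folder)
// "example.c4g" and "Script.c" are placeholder names, and writing the entry path without a leading "/" is an assumption
//
//   #include <stdio.h>
//   #include <stdlib.h>
//   #include "cc4group.h"
//
//   int main(void)
//   {
//       CC4Group* group = cc4group.new();
//       if(group == NULL)
//       {
//           perror("cc4group.new");
//           return EXIT_FAILURE;
//       }
//
//       const void* data = NULL;
//       size_t size = 0;
//       // thanks to short-circuiting, evaluation stops at the first failing call
//       bool success = cc4group.openExisting(group, "example.c4g")
//           && cc4group.getEntryData(group, "Script.c", &data, &size);
//
//       if(success)
//       {
//           fwrite(data, 1, size, stdout);
//       }
//       else
//       {
//           // getErrorMessage may return NULL if it runs out of memory itself
//           const char* message = cc4group.getErrorMessage(group);
//           fprintf(stderr, "ERROR: %s\n", message != NULL ? message : "out of memory");
//       }
//
//       cc4group.delete(group);
//       return success ? EXIT_SUCCESS : EXIT_FAILURE;
//   }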
#ifdef __cplusplus
// this is needed for the C++-wrapper cppc4group
// if you intend to use C++ you probably want to use cppc4group instead of cc4group
// although, still take a look at the API-struct below, as this is the main place of documentation about the API, because it is mostly the same for cppc4group and cc4group
extern "C" {
#define this this_
#define new new_
#define delete delete_
#endif
#include <stddef.h>
#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>
// this struct is used by the getEntryInfo and getEntryInfos methods
// contains all information and metadata possibly known about the group's contents
// sizes always denote the uncompressed size of the group's contents
typedef struct {
// the base name of the entry without parent folders and stuff
const char* fileName;
// a unix timestamp when the entry was modified the last time
int32_t modified;
// the author of this entry
// C4Group only stores the author in directories, thus this will be the parent directory's author for file entries
const char* author;
// the size of the entry
// for files this is simply the filesize
// for directories this is the size of the directory's group header plus the size of the directory's entry dictionary. it represents the overhead of the directory that is used in addition to the stored contents
size_t size;
// for files this is equal to size
// for directories this is the size as specified above, plus the size of all contained entries, or simply the total of the directory
size_t totalSize;
// this only has a meaning for files
// it resembles the execute permission known from unix
bool executable;
// true for directories, false for files
bool directory;
// so called "official" groups were distributed by the inventor of C4Group in former times
// in the original editor this was used to protect the original groups from being changed, although it's only a piece of metadata and not a real protection
// cc4group can modify official groups in the usual ways and can change groups' "official" status through the setOfficial method
// users of cc4group should honor and change the "official" status how they think is appropriate (or simply ignore it)
bool official;
} CC4Group_EntryInfo;
// CC4Group is the opaque type of the cc4group objects
// only pointers to it can be created and they are passed to the available methods found below
typedef struct CC4Group_t CC4Group;
// callback types for tmp memory strategies
// a custom cleanup function may be used to clean up arbitrary resources
// it receives a single custom argument, whose contents can be defined by the custom tmp memory strategy when allocating the resources
typedef void(*CC4Group_CleanupFunc)(void* data);
// this struct holds the custom cleanup function together with the argument that it will receive when it is called
// the custom tmp memory strategy must at least set func to a valid function. data may be uninitialized if it's not used by this function
typedef struct {
CC4Group_CleanupFunc func;
void* data;
} CC4Group_CleanupJob;
// a tmp memory strategy is a function that needs to return a pointer to read- and writable memory at least size big
// in case of an error, the function must return NULL
// in case it returns not NULL, at least the func-member of cleanupJob must be set to a valid function
// additionally its data-pointer member may be set to arbitrary data that is needed for successful cleanup. it will be passed when the cleanup function is called
// the strategy also receives the group object for which the memory is needed. this is mainly passed for use internally for accurate error-reporting when using the predefined strategies
typedef void* (*CC4Group_TmpMemoryStrategy)(CC4Group* const this, const size_t size, CC4Group_CleanupJob* cleanupJob);
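// example: a sketch of a custom strategy backed by malloc and free (needs <stdlib.h>)
// myCleanup and myStrategy are hypothetical names that are not part of the API
//
//   static void myCleanup(void* data)
//   {
//       free(data);
//   }
//
//   static void* myStrategy(CC4Group* const this, const size_t size, CC4Group_CleanupJob* cleanupJob)
//   {
//       (void)this; // unused in this sketch
//       void* memory = malloc(size);
//       if(memory == NULL)
//       {
//           return NULL; // signals the error to cc4group
//       }
//       cleanupJob->func = myCleanup;
//       cleanupJob->data = memory; // passed to myCleanup later
//       return memory;
//   }
//
//   // somewhere before the first open*-call:
//   // cc4group.setTmpMemoryStrategy(myStrategy);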
// callback types for openWithReadCallback
// the callback has to store a pointer to the newly read data in data, as well as the amount of read data in size
// the callback must return true if the end of data is reached or any read error happens and false otherwise
// the pointer passed in will be handled as specified with the corresponding MemoryManagement
typedef bool (*CC4Group_ReadCallback)(const void** const data, size_t* const size, void* const arg);
// can be used as initialization before and deinitialization after all necessary calls to a read callback are made
// for instance for buffer allocation and deletion
// the callback should return true on success and false on failure
// a deinitialization failure will only trigger a warning, as it doesn't really affect the operation of the group
typedef bool (*CC4Group_ReadSetupCallback)(void* const arg);
// this is the main API struct of cc4group
// it contains all available methods and constants
typedef struct {
struct {
int Take; // cc4group will free the data when it's not needed anymore; e.g. in the destructor or when setting the file's data again
int Copy; // cc4group will copy the data to use it; the original data is untouched and needs to be freed by the caller whenever desired
int Reference; // cc4group will use the data as is (i.e. store the pointer and use it); the caller must guarantee its validity throughout the group's lifetime (or until the file's data is set to a new pointer) and needs to take care of freeing it afterwards
} const MemoryManagement;
struct {
// all temporarily uncompressed data will be held in-memory (RAM)
// this is the preferred strategy because it is very fast and relies only on RAM
CC4Group_TmpMemoryStrategy Memory;
// all temporarily uncompressed data will be stored in a memory-mapped temporary file that is being created in the current working directory
CC4Group_TmpMemoryStrategy File;
// if the uncompressed data is smaller than 500 MB in-memory is tried first
// if in-memory fails (e.g. because there is not enough RAM) or the data's size is greater than or equal to 500 MB, it will fall back to the file strategy
// this is the default strategy and should be appropriate for almost any case
CC4Group_TmpMemoryStrategy Auto;
} const TmpMemoryStrategies;
// sets the global temporary memory strategy to be used for storing the uncompressed data of a group
// NOTE: this is a static method (i.e. it is used without any object)
// this will affect all open*-calls issued after calling this function
// either one of the pre-defined strategies from above or any custom strategy can be specified
// a custom strategy can be used by passing an appropriate function pointer
// for details, look at the description of CC4Group_TmpMemoryStrategy
void (*setTmpMemoryStrategy)(const CC4Group_TmpMemoryStrategy strategy);
// allocates and initializes a new group object, like the operator new
// NULL may be returned if the memory allocation (malloc) fails; in this case errno contains additional error information
CC4Group* (*new)(void);
// destructs the group object and frees all memory used by the group, like the operator delete
void (*delete)(CC4Group* const this);
// after creating a group with new, exactly one of create, openExisting, openMemory, openFd, openFilePointer or openWithReadCallback must be called before any other action on the group object may be executed (except delete)
// initializes the group to be a fresh, empty group
bool (*create)(CC4Group* const this);
// opens a group on the filesystem; path may point to a directory inside a group; path "-" can be used to read the group from stdin
bool (*openExisting)(CC4Group* const this, const char* const path);
// opens a group that is stored entirely in memory
// see the description of MemoryManagement to know if and when data has to be freed by the caller
// if the lazy mode is not used, the data can be freed immediately after this function returns
// HINT: the lazy mode is actually not implemented yet, but at least I can't forget to mention it here anymore when it's actually done
bool (*openMemory)(CC4Group* const this, const void* const groupData, size_t const size, int const memoryManagement);
// opens a group through a file descriptor
// the file descriptor must have been opened with read access; also be aware that the file must be opened with binary mode on windows
bool (*openFd)(CC4Group* const this, int fd);
// opens a group through a FILE*
// the file must have been opened with read access; also be aware that the file must be opened with binary mode on windows
bool (*openFilePointer)(CC4Group* const this, FILE* fd);
// opens a group and calls the callback to get the group data
// initCallback is called before readCallback is called and deinitCallback is called after all read operations are done
// initCallback and deinitCallback may be NULL if they should not be used
bool (*openWithReadCallback)(CC4Group* const this, CC4Group_ReadCallback const readCallback, void* const callbackArg, int const memoryManagement, CC4Group_ReadSetupCallback const initCallback, CC4Group_ReadSetupCallback const deinitCallback);
// saves the current in-memory state of the group as a compressed c4group to disk
// fails if the given path already exists
bool (*save)(CC4Group* const this, const char* const path);
// same as save, except that an already existing group will be overwritten
// be careful, any existing file will be overwritten in-place. if any error occurs after opening the target file (e.g. out of memory, a program or system crash), the previous contents will be lost
bool (*saveOverwrite)(CC4Group* const this, const char* const path);
// extraction to disk
// extracts the complete group contents recursively into a newly created directory named by targetPath
// the directory itself must not exist, but the containing directory must exist. otherwise an error will be generated
bool (*extractAll)(CC4Group* const this, const char* const targetPath);
// extracts only a single file or a sub directory of the group denoted by the entryPath to the targetPath
// the containing directory of the targetPath must exist, but the final targetPath must not exist. otherwise an error will be generated
bool (*extractSingle)(CC4Group* const this, const char* const entryPath, const char* const targetPath);
// retrieval of metadata about the stored files and directories
// stores all metadata known about the entry denoted by path into the CC4Group_EntryInfo struct pointed to by info, similar to stat
// an empty path "" or NULL will retrieve information about the root directory
bool (*getEntryInfo)(CC4Group* const this, const char* const path, CC4Group_EntryInfo* const info);
// retrieves all metadata like getEntryInfo of all files and directories inside the directory denoted by path, similar to ls
// a pointer to a dynamically allocated array of all CC4Group_EntryInfo structs will be stored in infos
// the amount of infos (and thus the amount of files in the directory) is stored in size
// the caller must free the pointer stored in infos when it is not needed anymore
// an empty path "" or NULL will retrieve information about the root directory
bool (*getEntryInfos)(CC4Group* const this, const char* const path, CC4Group_EntryInfo** const infos, size_t* const size);
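// example: a sketch that lists the root directory (needs <stdio.h> and <stdlib.h>)
// it assumes that group was opened successfully and that the array is released with free()
//
//   CC4Group_EntryInfo* infos;
//   size_t count;
//   if(cc4group.getEntryInfos(group, "", &infos, &count))
//   {
//       for(size_t i = 0; i < count; ++i)
//       {
//           printf("%s%s\t%zu\n", infos[i].fileName, infos[i].directory ? "/" : "", infos[i].totalSize);
//       }
//       free(infos);
//   }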
// data retrieval and manipulation
// stores a pointer to the read-only data of the file denoted by entryPath in data, and the size of the data in size
// the group owns the data pointed to. the pointer is valid until the group destructor is called or the data is changed through setEntryData
bool (*getEntryData)(CC4Group* const this, const char* const entryPath, const void** const data, size_t* const size);
// overwrites the data of the file denoted by entryPath with data indicated by data and size
// see the description of MemoryManagement to know if and when data has to be freed by the caller
bool (*setEntryData)(CC4Group* const this, const char* const entryPath, const void* const data, size_t const size, int const memoryManagementMode);
// group metadata handling
// the following set* functions set specific metadata of files or directories denoted by path
// in case of directories, the metadata will be set recursively to all contents as applicable if recursive is true
// the root directory can be addressed by passing NULL or "" as path
// files and directories
bool (*setCreation)(CC4Group* const this, int32_t const creation, const char* const path, bool const recursive);
// directories only
bool (*setMaker)(CC4Group* const this, const char* const maker, const char* const path, bool const recursive);
bool (*setOfficial)(CC4Group* const this, bool const official, const char* const path, bool const recursive);
// files only
bool (*setExecutable)(CC4Group* const this, bool const executable, const char* const path);
// modifying the group
// creates an empty directory inside the group at the place with the name as denoted by path
// the parent directory (if any) must exist already
bool (*createDirectory)(CC4Group* const this, const char* const path);
// creates an empty file inside the group at the place with the name as denoted by path
// the parent directory (if any) must exist already
// the file contents can afterwards be set with setEntryData
bool (*createFile)(CC4Group* const this, const char* const path);
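// example: a sketch that builds a new group from scratch and saves it
// "Hello.txt" and "hello.c4g" are placeholder names; group is assumed to be fresh from cc4group.new()
//
//   static const char text[] = "Hello";
//   bool success = cc4group.create(group)
//       && cc4group.createFile(group, "Hello.txt")
//       && cc4group.setEntryData(group, "Hello.txt", text, sizeof(text) - 1, cc4group.MemoryManagement.Copy)
//       && cc4group.save(group, "hello.c4g");
//   // if success is false, the cc4group.getError* functions describe the failing call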
// renames, and possibly moves, the entry (file or directory) denoted by oldPath to newPath
// the parent directory of newPath (if any) must exist already
// if only strictly renaming is desired, oldPath and newPath must contain the same full path to the parent directory, with only the final name being different
bool (*renameEntry)(CC4Group* const this, const char* const oldPath, const char* const newPath);
// deletes the entry (file or directory) denoted by path
// directories are only deleted when recursive is true, which has the effect of recursively deleting all contents (if any) and then deleting the directory
// if recursive is false but the path ends up being a directory, an error is generated
bool (*deleteEntry)(CC4Group* const this, const char* const path, bool const recursive);
// error information
// all error information is only meaningfully defined after an error occurred (indicated by any method returning false) and always describes the last error that occurred
// returns a human readable error message, including the "error causer", an interpretation of the error code, the internal method in which the error occurred ("error method") and possibly also the error code
// the returned error message pointer is valid until the next call to getErrorMessage is issued or the group is destructed
// this method may return NULL if memory allocation for the formatted message fails; so NULL should be considered as out of memory
const char* (*getErrorMessage)(CC4Group* const this);
// returns the error code of the last error; interpretation of this code depends on the internally used function that caused the error
int32_t (*getErrorCode)(const CC4Group* const this);
// returns the internal method name in which the error occurred
const char* (*getErrorMethod)(const CC4Group* const this);
// returns human readable information about the operation during which the error occurred
const char* (*getErrorCauser)(const CC4Group* const this);
} const CC4Group_API;
// in the case that the cc4group API shall be loaded dynamically at runtime, the same global instance of the struct can instead be accessed by loading the symbol named "cc4group" and casting it to CC4Group_API*
// it contains all necessary and available data of the API and is the only symbol that needs to be loaded
// also you have to #define CC4GROUP_DYNAMIC_LOAD before #include-ing this header file in that case
// for drop-in support for code that is written for the API linked at compile-time (or just for convenience) store the loaded pointer in a global variable, say "cc4group_dyn" and then #define cc4group as (*cc4group_dyn)
// c4cat_dyn is a working example of this concept
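// example: a sketch of the steps above for POSIX systems using dlopen and dlsym
// the library file name "libcc4group.so" and the function name loadCC4Group are assumptions
//
//   #define CC4GROUP_DYNAMIC_LOAD
//   #include "cc4group.h"
//   #include <dlfcn.h>
//
//   static CC4Group_API* cc4group_dyn;
//   #define cc4group (*cc4group_dyn)
//
//   static bool loadCC4Group(void)
//   {
//       void* handle = dlopen("libcc4group.so", RTLD_NOW);
//       if(handle == NULL)
//       {
//           return false;
//       }
//       cc4group_dyn = (CC4Group_API*)dlsym(handle, "cc4group");
//       return cc4group_dyn != NULL;
//   }
//
//   // after loadCC4Group() returned true, code like cc4group.new() works unchanged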
#ifndef CC4GROUP_DYNAMIC_LOAD
// access to all cc4group-methods and constants must be made through this global instance of the API struct
extern CC4Group_API cc4group;
#endif
#ifdef __cplusplus
}
#undef this
#undef new
#undef delete
#endif