abstraction layer for saving binary data
I frequently run into the problem that I cannot hold all the data (usually lists of structs) I need for a calculation in main memory and have to swap it out to a file on my hard disk.
Since the process of reading and writing the data is always the same, one could write a piece of code that does it for every possible type.
One idea is to overload the [] operator for random access, or to construct STL-like iterators. This will only work for non-nested data structures, but maybe there is a way to do it for arbitrary data too.
I am working on a solution for "list <struct whatever>" and will submit it when it works :-)
If you have suggestions on how to design the more general case, tell me!
robert
24 Sep 05, 5:24PM
Hi,
That's a great idea. If I understand you correctly, this shouldn't be too hard. Using the sizeof operator, you know exactly how big a binary variable is, and therefore where in the file any particular item is located.
Something like (sorry I'm used to the C style for files!)
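#include <cstdio>

template <class T> class binaryio {
    FILE* stream;
public:
    binaryio(const char* filename) { stream = fopen(filename, "rb"); }  // "rb" = read, binary
    ~binaryio() { if (stream) fclose(stream); }
    T operator [](int i) {
        // jump to the start of item i, counted from the beginning of the file
        fseek(stream, static_cast<long>(i * sizeof(T)), SEEK_SET);
        T data;
        fread(&data, sizeof(T), 1, stream);  // read exactly one item
        return data;
    }
};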
And something smarter to write back to random locations..
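A minimal sketch of what writing back to random locations could look like, staying with the C file functions. The class name RandomAccessFile and its read/write members are hypothetical, not part of any library, and the sketch assumes T is a plain fixed-size type with no pointers or dynamically allocated members.

```cpp
#include <cstddef>
#include <cstdio>
#include <stdexcept>

// Hypothetical helper: treats a file as an array of fixed-size items of type T.
// Assumes T is a plain fixed-size type (no pointers, no dynamic members).
template <class T>
class RandomAccessFile {
    std::FILE* stream_;

    // Position the file at the start of item i.
    // Note: fseek takes a long, which limits the file size on some platforms.
    void seek(std::size_t i) {
        if (std::fseek(stream_, static_cast<long>(i * sizeof(T)), SEEK_SET) != 0)
            throw std::runtime_error("seek failed");
    }

public:
    explicit RandomAccessFile(const char* filename) {
        // "r+b" opens an existing file for reading and writing;
        // if the file does not exist yet, "w+b" creates it.
        stream_ = std::fopen(filename, "r+b");
        if (!stream_) stream_ = std::fopen(filename, "w+b");
        if (!stream_) throw std::runtime_error("cannot open file");
    }
    ~RandomAccessFile() { std::fclose(stream_); }

    // The object owns the FILE*, so copying is disabled.
    RandomAccessFile(const RandomAccessFile&) = delete;
    RandomAccessFile& operator=(const RandomAccessFile&) = delete;

    // Overwrite (or append) item i.
    void write(std::size_t i, const T& value) {
        seek(i);
        if (std::fwrite(&value, sizeof(T), 1, stream_) != 1)
            throw std::runtime_error("write failed");
    }

    // Read item i back from the file.
    T read(std::size_t i) {
        seek(i);
        T value;
        if (std::fread(&value, sizeof(T), 1, stream_) != 1)
            throw std::runtime_error("read failed");
        return value;
    }
};
```

Every read and write seeks first, which also satisfies the C rule that a positioning call must come between output and input on the same stream.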
Is that what you're thinking?
24 Sep 05, 5:38PM
Yes, that's pretty much what I had in mind. But this will only work if the data in <class T> is stored sequentially. For simple types (int, double, char) this is definitely true, but what about complex classes, or classes with complex-type members?
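One way to let the compiler answer part of this question, assuming a C++11 or later compiler, is the standard type trait std::is_trivially_copyable. The two structs and the helper can_store_raw below are made up for illustration.

```cpp
#include <string>
#include <type_traits>

// Hypothetical example types.
struct Plain  { int id; double value; char tag[8]; };  // all data lives inside the struct
struct Owning { int id; std::string name; };           // std::string keeps its characters elsewhere

// True if copying the raw bytes of a T yields a valid T.
template <class T>
constexpr bool can_store_raw() {
    return std::is_trivially_copyable<T>::value;
}

static_assert(can_store_raw<Plain>(), "Plain can be written byte by byte");
static_assert(!can_store_raw<Owning>(), "Owning needs its own save/load code");
```

The check is necessary but not sufficient: a struct with a raw pointer member still passes it, yet only the pointer value would be saved, not the data it points to.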
24 Sep 05, 5:59PM
This approach will work the same as any C++ container, such as vector or list etc.
Your concern shouldn't be about how the data inside a class is stored. In truth that's none of your business. What you're interested in is saving the contents of the class and retrieving it with (I suspect) a simple cast that puts it back into the same form as it was originally. What I think you're getting at, though, is the use of the new operator within that class (say). Two variables of class X could then use substantially different amounts of dynamic memory, because variable 1 allocated 100k of memory while variable 2 allocated 200k.
This is a much more complex problem, and one you can't solve unless you know exactly how that class is laid out. The best you can do is insist that these 'complex' data types have a common method that ensures they can be forced to save their data and reload it on cue.
If you go down that route, then you probably want to structure your binary file in the following form:
------
[data for each data type] * N
------
[key to additional dynamic data]
------
[additional dynamic data] * N
Hope that makes sense. Here is how I would imagine it working, given a request for some data:
1) You extract the contents of the data type at the given index, using the overloaded [] operator we discussed above.
2) You call the 'load' method of that newly reloaded class/struct etc., which reloads any additional dynamic data, given a file offset you provide. You get the offset from the key section (or, if you prefer, you store it immediately after each individual data type). This simply means that on saving you need to know how many data items you have, so you know how much space to leave for each section, with the additional dynamic bit at the end, since you have no idea how big it is until you get to saving.
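The three-section layout and the two steps above could be sketched roughly as follows. All names (Fixed, Item, KeyEntry, save_items, load_item) are hypothetical, and the sketch assumes each item has one fixed-size part plus a variable-length list of doubles as its dynamic data.

```cpp
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical item: a fixed-size part plus dynamically sized data.
struct Fixed    { std::int32_t id; double weight; };
struct Item     { Fixed fixed; std::vector<double> samples; };
struct KeyEntry { std::uint64_t offset; std::uint64_t count; };  // where the dynamic data sits

void save_items(const std::string& path, const std::vector<Item>& items) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    const std::uint64_t n = items.size();

    // Section 1: [data for each data type] * N
    for (const Item& it : items)
        out.write(reinterpret_cast<const char*>(&it.fixed), sizeof(Fixed));

    // Section 2: [key to additional dynamic data]
    // The dynamic data starts right after the two fixed-size sections.
    std::uint64_t offset = n * sizeof(Fixed) + n * sizeof(KeyEntry);
    for (const Item& it : items) {
        KeyEntry key{offset, static_cast<std::uint64_t>(it.samples.size())};
        out.write(reinterpret_cast<const char*>(&key), sizeof(KeyEntry));
        offset += key.count * sizeof(double);
    }

    // Section 3: [additional dynamic data] * N
    for (const Item& it : items)
        out.write(reinterpret_cast<const char*>(it.samples.data()),
                  static_cast<std::streamsize>(it.samples.size() * sizeof(double)));
}

// Load item `index` from a file that holds `n` items.
Item load_item(const std::string& path, std::uint64_t index, std::uint64_t n) {
    std::ifstream in(path, std::ios::binary);
    Item it{};

    // Step 1: fetch the fixed-size part by index.
    in.seekg(static_cast<std::streamoff>(index * sizeof(Fixed)));
    in.read(reinterpret_cast<char*>(&it.fixed), sizeof(Fixed));

    // Step 2: look up the offset in the key section, then reload the dynamic data.
    KeyEntry key{};
    in.seekg(static_cast<std::streamoff>(n * sizeof(Fixed) + index * sizeof(KeyEntry)));
    in.read(reinterpret_cast<char*>(&key), sizeof(KeyEntry));

    it.samples.resize(key.count);
    in.seekg(static_cast<std::streamoff>(key.offset));
    in.read(reinterpret_cast<char*>(it.samples.data()),
            static_cast<std::streamsize>(key.count * sizeof(double)));
    return it;
}
```

The item count has to be known before the key section is written, which matches the point made above about leaving space for each section. Error handling and portability across machines with different byte order or struct padding are left out of this sketch.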
That's the only way I can see this working..
Hope that helps..
25 Sep 05, 1:56PM
hi,
I thought about what you wrote and concluded that an efficient implementation must define at least two basic classes: one class that is associated with the file and defines a simple interface for reading and writing, and another class that puts some constraints on the element type. Furthermore, it might be possible to derive the first class from "vector" and thus provide a much more powerful tool. Have a look at the pieces of code below.
robert
#include <vector>

// that's the save-binary-file interface
template <class T>
class SaveBinary : public std::vector<T> {
public:
    SaveBinary* read(int index);
    SaveBinary* write(int index);
    // lots of member functions have to be redefined
};

class SaveableType {
protected:
    // force the user to define this function
    virtual int load(int) = 0;
};
// that's what a user is going to do with the interface
class MyClass : public SaveableType {
    // define the "load" function
    int load(int position) {
        // code
    }
    // more code
};

int main() {
    MyClass mc1, mc2;
    SaveBinary<MyClass> sb;
    mc1.do_something();
    mc2.do_other();
    sb.push_back(mc1);
    sb.push_back(mc2);
    // probably you will NOT do this, but some other stl-algorithm
    // might perform something really elegant on "sb"
    // (std::sort needs <algorithm> and an operator< for MyClass)
    std::sort(sb.begin(), sb.end());
    return 0;
}
27 Sep 05, 11:45AM
That seems the way to go..
You might want to add a small abstraction on top of your class. Not sure if there is anything really neat you can do with templates to solve this. But you essentially want to be able to apply this approach to constant-sized variables (classes included) without being dependent on SaveableType, and also to those that dynamically allocate memory..
I also wondered if you could do something dangerously clever by redefining the new operator, so you could catch memory being allocated by a class and allocate disk space in your file accordingly. Given that every good class should have a copy constructor, which makes an exact copy of anything it finds, I wonder if it might be possible to hijack something in there.. Mega long shot. And probably overly complex..
Good luck
Will.