
www.Usenet.com
| <-- __Chronological__ --> | <-- __Thread__ --> |
Nice alliteration, eh? :) I'm contemplating a pretty complex database design for an application I'm developing. The database system needs to replicate across several sites, needs to be fast, support transactions, be fast, etc., and will consist of several databases in a similar format. It will be used from a (well, several) single-process, multithreaded application(s), probably using DB_PRIVATE where applicable. I want to do the whole replication bit, with a log-only server and a server for backups, etc. The structure of each database consists of a series of records, probably indexed by number. Each record is itself a hash table that allows duplicate keys. Value types are variable. There could be hundreds of thousands of records, even millions. One possible use for this DB actually would require hundreds of millions of records. Each record could consist of 10 to 100 key-value pairs... perhaps more as time goes on. Each database may be indexed by one or more values of the record's hash table. Whenever there are multiple keys in an indexed field, all combinations of the multiple-key values are added into the index. I have come up with three possible design approaches, all of which have significant drawbacks: 1) Use a single recno database with the entire hash table serialized as the value, and build all indexes off of this. Since the size and type of the hash table values are variable, the entire table will probably have to be read and written at the same time. Pros: - Locking is less of a concern, because the whole record is being locked for reads and writes... if I'm understanding correctly, it's one lock per page. Each record will probably span many pages unless I make them pretty large. This simplifies the user interface because you don't have to worry about, for instance, accessing the keys in the sub-hash-table in the same order every time to avoid deadlocks, and things like that. Cons: - Can't read or write single elements in the sub-hash-tables without pulling out the whole record and deserializing it. - Everything I've read says large keys or data items are to be avoided for various performance reasons. I believe it. - Indexing is a bit of a drag. Since DB->associate doesn't let the callback generate multiple keys for a given record (hint, hint! ;), I would have to come up with a routine that did the indexing in a safe and appropriate manner. 2) Use a big hash table whose keys consist of the record number and the sub-hash-table key. Pros: - Fast access to individual elements. - Can use all existing Berkeley DB cursort stuff to browse around the sub-hash-tables, as well as browsing between records. - Small key and data values, which is good for performance by all accounts. Cons: - Deadlocks, deadlocks, deadlocks. Sometimes records are dealt with piecemeal, that is, single records or small groups of records are read or written at a time; sometimes the whole record is dumped or deleted or inserted. This means that anytime I *do* anything, I need to make sure that I access everything in the same order every time... what a pain. If there was a way to make the locking scheme less granular without interfering with replication and the transaction subsystem, I would make the system lock by logical record (which in this case would consist of all records of a sub-hash-table). But I don't think that's really possible. Another option to get around this would be to acquire read locks or write locks on the whole record by scanning over it with a cursor or something. But I don't think I can believe that would be fast... unless someone can convince me otherwise? - Indexing is a pain again. This time there is no chance of using DB->associate... I'd need to be able to make one index row correspond to many physical database rows. So this means I'm rolling my own indexing, which presents many of its own interesting problems, not the least of which that modifying a value in a sub-hash-table might trigger a read of many other values in that same logical record, complicating the lock thing even more. 3) Use hundreds of thousands of separate DBs, in one file per logical database, one DB for each logical record. Pros: - I don't really know, it was just a random idea. I don't think it would actually help. Cons: - Opening and closing all those DB handles can't be good, although it would probably make an interesting stress test for the multi-db code. - Dare I say: indexing? ....so, I'm leaning towards (2) right now. But the locking thing really bothers me... is there a way to set up a user-defined locking strategy and yet still take advantage of transactions and replication? To further complicate things, I plan on implementing some type of partitioning by date, so that we can purge old records on a routine basis by simply dropping databases. Anyway, if anyone has any input or ideas, please let me know. - D
| <-- __Chronological__ --> | <-- __Thread__ --> |