Keeping track of on/offs without index-only tables?

1 timspalding
May 4, 2011, 9:50 pm

I just posted a question on Stack Overflow.

Help LibraryThing on Stack Overflow: Keeping track of on/offs without index-only tables?
http://stackoverflow.com/questions/5891789/keeping-track-of-on-offs-without-inde...

The case is whether or not a book has been indexed. The indexed byte is currently stored in the main book table. The UPDATEs are slow, which makes indexing across a group likely to send us into replication arrears.

We are helped by overall timestamps for the last time a user was indexed and the last time any change was made. But if the changed timestamp is later than the indexed timestamp, the indexer must descend to the individual books and then update them.

2 bvs
Sep 11, 2011, 10:23 pm

Not enough information, so I am just guessing, but maybe you can use a Bloom filter? It is a probabilistic structure. Basically, you use multiple hash functions on an item to create N different indices into the Bloom filter bitmap. If the item is *not* a member of the set, at least one of the N bits will be clear. If all N are set, this could be a false positive and you must look up the definitive version (usually slower to get to). So in your case you can still keep the indexed byte in the main book table, but you only need to hit it when the filter has a positive hit. Check out the Wikipedia article.
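
For anyone who wants to see the shape of this, here is a minimal sketch in Python. It assumes the filter holds the ids of books that still need indexing; `BloomFilter`, `needs_indexing`, and `lookup_indexed_flag` are made-up names for illustration, not anything from LibraryThing's code, and the sizes are arbitrary.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter; num_bits and num_hashes are illustrative defaults."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive N bit positions by hashing the item with N different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # False means "definitely not in the set"; True may be a false positive.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def needs_indexing(book_id, unindexed_filter, lookup_indexed_flag):
    """Consult the slow, definitive flag only when the filter reports a hit.

    lookup_indexed_flag is a hypothetical callable standing in for the
    query against the main book table; it returns True if already indexed.
    """
    if not unindexed_filter.might_contain(book_id):
        return False  # at least one bit is clear: not in the unindexed set
    return not lookup_indexed_flag(book_id)
```

One caveat with this sketch: a plain Bloom filter cannot remove items, so the filter would have to be rebuilt from time to time as books get indexed.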

3 PaulFoley
Sep 12, 2011, 1:15 am

You index all the books belonging to a particular user at the same time, right? Store a number for each book, and for the user: when the former is greater than the latter, the book needs indexing -- you only need one update on the user table to flag them as indexed.
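
A rough sketch of that scheme in Python, using an in-memory stand-in for the two tables. The names (`UserLibrary`, `touch`, `indexed_through`) are invented for illustration, and the "number" is assumed to be a per-user change counter, though a timestamp would serve the same purpose.

```python
from dataclasses import dataclass, field


@dataclass
class UserLibrary:
    """Stand-in for one user's row plus that user's book rows."""

    indexed_through: int = 0  # the single number stored on the user
    next_change: int = 1      # source of increasing change numbers
    book_changed: dict = field(default_factory=dict)  # book_id -> number

    def touch(self, book_id):
        # A book edit stamps the book with the next change number.
        self.book_changed[book_id] = self.next_change
        self.next_change += 1

    def books_needing_index(self):
        # A book needs indexing when its number exceeds the user's number.
        return [
            book_id
            for book_id, changed in self.book_changed.items()
            if changed > self.indexed_through
        ]

    def mark_all_indexed(self):
        # One write on the user marks every current book as indexed;
        # no per-book rows are updated.
        self.indexed_through = max(
            self.book_changed.values(), default=self.indexed_through
        )
```

The point of the design is that finishing an indexing run costs a single update on the user table, however many books were indexed.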