Saturday, February 16, 2008

Still no CTP...

So... We've failed to deliver the DataObjects.Net 4.0 CTP yesterday. This was actually predictable from mid-January: I planned to spend the 1.5 months after New Year on completing the Xtensive.Storage assembly, but almost all of this time went into refactoring Xtensive.Core. There is some info on our current achievements below. Concerning the delivery date: we're taking 3 more weeks, so the next checkpoint is March 7. I'm not sure it will be enough, but as I see it, a tightly limited schedule helps to increase development speed. In any case, I'll publish some info here each week on what we're working on or have finished.

Other news:
- We're launching a new web site at the end of next week (most likely on the weekend). Hopefully you'll like it ;)
- Today or tomorrow I'll publish a nice tool, FiXml - an XML documentation post-processor (MSBuild task). It handles XML comment inheritance (<inheritdoc> tag processing), supports overriding inherited documentation parts and allows reusing parts of other XML comments (the <see cref="..." copy="..." /> tag). It's quite helpful if your project has a large set of classes but a relatively small set of common base classes and interfaces (exactly our case ;) ) - basically you describe just their methods, rather than the methods of all the descendants and implementors (probably 70% less work). Reusable comment parts are also a nice addition - e.g. normally any constructor is described as "Initializes a new instance of CurrentType class". Although short, the description is always slightly different (type names differ from class to class). We now use <see cref="ClassDocTemplate" copy="true" /> for this. P.S. We've just started to use it internally, so it probably isn't rock solid yet.
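Here is a tiny example of what this looks like in source (a simplified sketch; the class names are made up, and the exact place FiXml expects the copy tag may differ):

    using System.IO;

    /// <summary>A base contract; its member documentation is written only once, here.</summary>
    public interface IWidget
    {
        /// <summary>Renders the widget to the specified output.</summary>
        /// <param name="output">The writer to render to.</param>
        void Render(TextWriter output);
    }

    /// <summary>Shared comment template, e.g. "Initializes a new instance of this type."</summary>
    public class ClassDocTemplate { }

    public class ButtonWidget : IWidget
    {
        // Reuses the shared constructor description instead of repeating it:
        /// <summary><see cref="ClassDocTemplate" copy="true" /></summary>
        public ButtonWidget() { }

        // FiXml copies the documentation of IWidget.Render here:
        /// <inheritdoc/>
        public void Render(TextWriter output) { }
    }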

Now some info on our current achievements: as I've mentioned, we were refactoring Xtensive.Core (now we're doing the same with Xtensive.Indexing). In particular, we updated:

1. Associate providers framework. We can associate late-bound "matching" handlers with any data type (or pair of types) and access them quite efficiently. The background:
- We need our own comparers, since the built-in ones don't work for all data types - funny, but .NET doesn't implement IEquatable<T> e.g. for enums. Moreover, we need configurable comparers utilizing provided comparison rules (roughly, a set of directions and cultures), but IComparable<T> doesn't offer any way to configure this.
- Converters - they're used e.g. to automatically convert item values to measure values.
- Arithmetic - instances of these classes handle arithmetic operations for data types. They're used to calculate measures as well.
- Hashers - long hash calculators for Bloom filters. A single built-in Int32 hash only allows maintaining a rather inefficient Bloom filter of at most 512 MB, working with 4G keys at max, which is not acceptable. Currently we can generate an arbitrary number of long hashes, which allows us to maintain Bloom filters of any desirable (i.e. much higher) efficiency (see the sizing sketch after this list).
- Serializers - strongly typed serializers handling serialization much better. Although random access to a tree doesn't require fast serialization (desktop HDD seek time is ~8..12 ms), index merging (which implies sequential reads and writes) and SSDs (solid state drives) make us think about this.
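To illustrate why a single Int32 hash is a hard limit, here is a back-of-the-envelope calculation using the standard Bloom filter sizing formulas (just an illustration, not our API):

    using System;

    class BloomFilterSizing
    {
        static void Main()
        {
            long n = 4L * 1024 * 1024 * 1024;  // ~4G keys
            double p = 0.01;                   // target false positive rate: 1%

            // Standard formulas: m = -n*ln(p)/(ln 2)^2 bits, k = (m/n)*ln 2 hash functions.
            double m = -n * Math.Log(p) / (Math.Log(2) * Math.Log(2));
            double k = m / n * Math.Log(2);

            // A single Int32 hash can address at most 2^32 bits = 512 MB and gives
            // only one hash function, so a filter like the one below is unreachable;
            // several independent 64-bit hashes remove both limits.
            Console.WriteLine("{0:F0} MB, {1:F1} hash functions", m / 8 / 1024 / 1024, k);
        }
    }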

So basically, we've provided a way to locate a related handler for any data type (or combination of types). Examples:

Getting comparers:
- ComparerProvider.Default.GetComparer<int>(); // Returns AdvancedComparer<T>
- ComparerProvider.Default.GetComparer<int?>();
- ComparerProvider.Default.GetComparer<Direction>(); // comparer for enum
- ComparerProvider.Default.GetComparer<int[]>(); // works ;)
- ComparerProvider.Default.GetComparer<List<int>>(); // works ;)
- ComparerProvider.Default.GetComparer<IEnumerable<int>>(); // works ;)
- ComparerProvider.Default.GetComparer<IEnumerable<Pair<int[],int[]>>>(); // even this ;)

Getting an asymmetric comparer comparing IEntire and Tuple:
- Func<IEntire<Tuple>, Tuple, int> asymmetricCompare = ComparerProvider.Default.GetComparer<IEntire<Tuple>>().GetAsymmetric<Tuple>();

Getting system comparers (based on IEquatable<T> & IComparable<T> impl.):
- ComparerProvider.System.GetComparer<int?>();

Getting comparers with applied rules:
- ComparerProvider.Default.GetComparer<Pair<int,string>>().ApplyRules(
    new ComparisonRules(Direction.Negative, // default (common) rule
      Direction.Positive, // rule for int
      new ComparisonRule(Direction.Positive, new CultureInfo("ru-RU")))); // rule for string
This gets a comparer for pairs of (int, string) with the common direction inverted to negative; int values are compared as usual; strings are compared in the "ru-RU" culture.

Hashers:
- HasherProvider.Default.GetHasher<int>();
- HasherProvider.Default.GetHasher<int?[]>(); // ~ the same kinds of type sequences as for comparers work here as well

Converters:
- ConverterProvider.Default.GetConverter<int, double>();
- ConverterProvider.Default.GetConverter<int?, string>();
// our AdvancedConverter has an IsRough property (meaning a loss of precision may occur on conversion)
- ConverterProvider.Default.GetConverter<double?, int?>().IsRough == true

Notes:
- You always get a strongly typed handler - i.e. no boxing or casts occur when it is used. So this is the fastest handler you may expect.
- Subsequent calls to GetXxx<T, ...> methods are resolved from a cache. So getting the first instance of some typed handler can be relatively slow (reflection is involved), but subsequent requests for it are extremely fast (we use our own TypeBasedDictionary for this - it resolves its items by just one virtual method call reading a static field, i.e. it is faster than a Dictionary, and is thread-safe).
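The core idea behind such a cache is easy to sketch (simplified, not the real TypeBasedDictionary code): the CLR keeps a separate copy of a static field for every closed generic type, so "lookup by T" is just a static field read.

    using System;
    using System.Collections.Generic;

    static class CachedComparer<T>
    {
        // The slow resolution (reflection, provider search) runs once per T in the
        // type initializer; the runtime guarantees this happens exactly once.
        public static readonly IComparer<T> Instance = Resolve();

        static IComparer<T> Resolve()
        {
            return Comparer<T>.Default; // placeholder for the real search logic
        }
    }

    class Demo
    {
        static void Main()
        {
            var comparer = CachedComparer<int>.Instance; // no dictionary lookup here
            Console.WriteLine(comparer.Compare(1, 2));
        }
    }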

As you may expect, everything is quite flexible - i.e. you can write your own data types with your own handlers. Moreover, you can use your own XxxProvider instances (i.e. not the default ones) with overridden search logic. Why is this important? Well, it means we don't limit ourselves to supporting just certain data types. We want to be able to support generally any type in the future; moreover, we want to work with any type natively - i.e. without boxing.

Performance we have here now:
- Simple comparers (e.g. for int) handle up to 100M comparisons per second. The Tuple \ IEntire comparer does ~2M to 4M comparisons per second (2M when tuples are equal; 4M when they are mostly different) for 5-field tuples on a single CPU thread of a 2 GHz Intel Core 2 Duo. It relies on other comparers (field comparers are provided for it by the provider it is bound to), so this result means the other comparers are also fast enough (actually they are normally as fast as the built-in .NET comparers, but understand comparison rules). This allows us to do up to ~300K index seeks per second on a dual-core CPU for a 1M-item index, and ~200K for a ~1G-item index (assuming the data is in the RAM cache), which seems quite good.
- Hashers, serializers and arithmetics behave similarly (50M...100M operations per second for simple data types; ~10-20 times less for Tuples).

Why we care so much about this: as you can see, we use a somewhat different approach compared to other databases - we store everything natively in RAM, i.e. not as serialized byte[] sequences or strings. This allows us to:
- Pass the data to the higher layers "as is" - i.e. as Tuples, without any transforms or intermediate objects on the heap. Less RAM and fewer intermediates = better speed.
- Avoid conversion to binary \ string form on seeks as well: if you keep pages in binary form, you normally must convert the value you're looking for into the same form and use byte-by-byte comparison. In our case we don't need this conversion; the comparison is more complex to implement, but also more flexible (rules can be handled with ease).

2. Ray & Range refactoring. We've added a new data type: IEntire<T>. It describes a typed n-dimensional vector whose dimensions may hold infinities (positive or negative), or values with a positive \ negative infinitesimal shift. Its implementors (Entire<T> for struct data types and TupleEntire<T> for tuples) are provided in nearly the same fashion as e.g. comparers (via Entire<T>.Create(...) methods). They allow pointing at an index item range much more precisely - earlier we didn't handle infinitesimal shifts (i.e. it was possible to say ">=", but not ">" - at least for the value of the desired dimension).
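Roughly, the effect of an infinitesimal shift can be illustrated like this (a simplified sketch, not the real Entire<T> code): the shift acts as a tie-breaker only when the values themselves are equal, which is exactly what turns ">=" into ">".

    using System;

    enum Shift { NegativeInfinitesimal = -1, None = 0, PositiveInfinitesimal = 1 }

    struct EntirePoint
    {
        public int Value;
        public Shift Shift;

        // Compare a search point against a stored key: the shift is used
        // as a tie-breaker only when the plain values are equal.
        public int CompareTo(int key)
        {
            int result = Value.CompareTo(key);
            return result != 0 ? result : (int) Shift;
        }
    }

    class Demo
    {
        static void Main()
        {
            var inclusive = new EntirePoint { Value = 10, Shift = Shift.None };
            var exclusive = new EntirePoint { Value = 10, Shift = Shift.PositiveInfinitesimal };
            Console.WriteLine(inclusive.CompareTo(10)); //  0 -> a ">= 10" seek lands on 10 itself
            Console.WriteLine(exclusive.CompareTo(10)); //  1 -> a "> 10" seek skips past 10
        }
    }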

3. We've changed the indexing interfaces & their simple implementations in Xtensive.Core (SortedIndex, etc.). Earlier the unique and non-unique index interfaces differed more. Now it's implied that there will be just one non-unique index implementation - NonUniqueIndex, acting as a wrapper over an internal unique index. The big mistake was that our BPlusTree implementation previously supported both interfaces natively - such an implementation doesn't allow handling serialization \ merging of big non-unique trees well (i.e. without a large amount of RAM) when the number of duplicate values is high.
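The wrapping idea itself is simple (a simplified sketch, not the real NonUniqueIndex code): append the primary key to the secondary key, and duplicates become unique, so the inner structure stays a plain unique index and never has to accumulate per-key value lists.

    using System;

    // (SecondaryKey, PrimaryKey) is unique even when SecondaryKey is not;
    // a lookup by SecondaryKey alone becomes a range scan over the inner
    // unique index from (key, long.MinValue) to (key, long.MaxValue).
    struct CompositeKey : IComparable<CompositeKey>
    {
        public string SecondaryKey;
        public long PrimaryKey; // the "uniquifier"

        public int CompareTo(CompositeKey other)
        {
            int result = string.CompareOrdinal(SecondaryKey, other.SecondaryKey);
            return result != 0 ? result : PrimaryKey.CompareTo(other.PrimaryKey);
        }
    }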

4. We've completely rewritten Tuples. The current implementation is much better in terms of its interface, and simply better in terms of memory usage \ speed.
- No more field names in Tuples. We noticed we never actually use them and access fields by index rather than by name (names & mappings are available anyway on higher layers of abstraction, e.g. in RecordSets \ Records). But names were forcing us to provide different classes for tuples with the same structure but different names, which isn't good, and now that's gone.
- Earlier we were able to compress (i.e. transparently store a set of smaller fields in a single "compressing" field of the underlying Tuple class) only boolean fields and null \ available flags. Now we compress all fields smaller than an int (i.e. bool, byte, sbyte, short, etc.).
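A simplified sketch of the packing idea (not the generated Tuple code): two 16-bit fields share one int-sized backing field, and availability \ null flags for several fields share another.

    struct PackedPair
    {
        int data;  // low 16 bits: field 0, high 16 bits: field 1
        int flags; // e.g. 2 bits per field: "is available" + "is null"

        public short Field0
        {
            get { return unchecked((short) data); }
            set { data = (data & unchecked((int) 0xFFFF0000)) | (ushort) value; }
        }

        public short Field1
        {
            get { return unchecked((short) (data >> 16)); }
            set { data = (data & 0x0000FFFF) | (value << 16); }
        }
    }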

Some results (DummyTuple<T> emulates the old DataObjects.Net behavior - array-based storage):

Creation (operations/s here means creations/s) for 10-field Tuples:
- Tuple memory usage (1000000): Time: 449,0441ms, Memory: 91999,408kb, Operations: 2,227 M/s.
- DummyTuple memory usage (1000000): Time: 1686,0642ms, Memory: 136001,304kb, Operations: 593,097 K/s.

So Tuple wins ~30% on memory even on 10-field objects (and creates ~3.7 times faster). Most secondary indexes need just 2-field Tuples; in this case they take 60% less memory - e.g. now we need 20 bytes for each old collection-to-collection relationship item in an index (8 bytes for 2 integers, 4 bytes for flags, plus 8 more bytes for the standard .NET heap object header); using an array for the same purpose would take 12 + 16*2 = 44 bytes; moreover, it would work slower:

Accessing integer field (typed, i.e. w\o boxing):
- Tuple.SetValue<T>: Operations: 58,104 M/s.
- Tuple.GetValue<T>: Operations: 34,321 M/s.
- DummyTuple.SetValue<T>: Operations: 8,523 M/s.
- DummyTuple.GetValue<T>: Operations: 22,109 M/s.
// Fastest way to deal with Tuples:
- ITupleFieldAccessor<T>.SetValue: Operations: 138,634 M/s.
- ITupleFieldAccessor<T>.GetValueOrDefault: Operations: 166,350 M/s.

Note: a non-virtual empty method call runs at 230 M/s and a virtual one at 180 M/s, so the performance we have here seems very good as well.

5. We've made a set of interface changes related to performance.

We discovered that calling a method through a delegate is ~2 times faster than calling an interface method. So in the vital parts of the framework (comparers, hashers, serializers, arithmetics) we migrated to a "delegate representation of the interface": e.g. when you ask for a comparer, you get an AdvancedComparer<T>, which provides a set of read-only delegate fields mapped to the methods of the underlying interface (IAdvancedComparer<T>). Funny, but this increases the speed of comparison for simple types by ~20%. There is also AdvancedComparerStruct<T> - if some object needs to store an AdvancedComparer, it's the best data type to use for this. It's nearly the same as AdvancedComparer<T>, but a struct (there is an implicit conversion to it from AdvancedComparer<T>), so it requires no additional reference resolution.
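Roughly, the shape of this "delegate representation" is the following (simplified, not the exact Xtensive.Core types):

    using System;

    interface ISimpleComparer<T>
    {
        int Compare(T x, T y);
    }

    // The interface method is captured into a delegate once, at construction time;
    // every later call is a delegate invocation instead of interface dispatch.
    sealed class DelegateComparer<T>
    {
        public readonly Func<T, T, int> Compare;

        public DelegateComparer(ISimpleComparer<T> implementation)
        {
            Compare = implementation.Compare;
        }
    }

    // Hot loops then call comparer.Compare(x, y) through the delegate field.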

Switching from IAdvancedComparer<T> \ IComparer<T> \ IEqualityComparer<T> to AdvancedComparer<T> is actually a noticeable internal interface change. The same applies to hashers (i.e. there are IHasher<T>, Hasher<T> and HasherStruct<T>), etc.

We also use delegates to call generic methods faster (e.g. generic methods for comparing Tuples - like TupleComparer.CompareStep<TFieldType>(...)). A virtual generic method call is the slowest one - it is nearly 3 times slower than an interface method call, since it needs type-to-address resolution (~ an internal dictionary lookup). But when the method is bound to a delegate, it is already resolved, and thus is called roughly as fast as a regular method.
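A simplified sketch of this trick (not the real TupleComparer internals): the generic method is closed over the concrete field type once via reflection and bound to a delegate, so the per-call generic dispatch disappears.

    using System;
    using System.Collections.Generic;
    using System.Reflection;

    static class CompareSteps
    {
        // The generic step we want to call once per tuple field.
        public static int CompareStep<TField>(TField x, TField y)
        {
            return Comparer<TField>.Default.Compare(x, y);
        }

        // Close the method over a concrete field type and bind it to a delegate.
        public static Delegate CreateStep(Type fieldType)
        {
            MethodInfo method = typeof(CompareSteps)
                .GetMethod("CompareStep")
                .MakeGenericMethod(fieldType);
            Type delegateType = typeof(Func<,,>).MakeGenericType(fieldType, fieldType, typeof(int));
            return Delegate.CreateDelegate(delegateType, method);
        }
    }

    // Usage: cast once, call many times.
    //   var compareInt = (Func<int, int, int>) CompareSteps.CreateStep(typeof(int));
    //   int r = compareInt(1, 2);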


Ok. It seems that's enough for now (it takes Russians more time to write in English ;) ). Next week I'm going to briefly describe why we think the RDBMS engine we're building will be better than the famous existing ones. A transparent DataObjects.Net-based interface for it is one benefit, but probably not the best one ;) Or at least, I'll point out the design mistakes most existing databases have now, the effect of their presence, and how we're going to handle them.

Finally, FiXml will be published shortly.

2 comments:

  1. Hello Alex,

    it's so easy to put some info in this block. Now we customers know what's going on. Continue this information policy.

    Best regards Tom
    MSE-iT

  2. Quick status update:
    - Website: we will be working on it for one more week. I want to have a few more features there, plus, as it turned out, the materials need further proofing.
    - FiXml: it will be published this week. My intention to publish it earlier was based on the developers' reports, but after looking at the results myself I asked for a few more improvements; plus there were some bugs related to complex case handling (renaming a generic type parameter in a descendant, etc.).
    - Everything else: I'm writing separate posts now.
