CrossLoop looks cool
As a follow-up to my last rant, I just installed CrossLoop's new remote support client. This is pretty cool because it lets you be a gun-for-hire in the remote assistance space. Updates to follow, but you can ask for help below.
All the (developer) news that's unfit to read...
Posted by IDisposable at 5/08/2008 11:11:00 AM (1 comment)
Labels: rants, remote support
In Joel Spolsky's latest tirade, we learn that:
Simply put, to Joel everyone else does everything wrong. He, meanwhile, is milking a not-even-ASP bug-tracking application (he probably couldn't figure out how to write a ColdFusion version). Or wrapping a proprietary DynamicDNS-like wrapper around an invented-elsewhere VNC client (and somehow avoiding the far more evolved UltraVNC variant). Or writing the worst little CMS monstrosity ever created. Joel is bitter about the cost of CS graduates and blames Google and Microsoft (curiously leaving Yahoo out of the mix) for driving it up.
The thing is, he's not entirely wrong... when a company can't figure out or invest in a stronger product, like @Task or Team Foundation, FogBugz isn't a bad compromise. When your dad needs IT support behind his random cable-modem IP, CoPilot isn't a bad tool and far cheaper to use than WebEx or GoToMyPc. And well, the best I can say about CityDesk is that my technophobe wife can figure it out and Joel isn't pushing it.
Seriously, though... what have you done lately that really is worth bragging about, Joel? It's been how many years since FogBugz came out and it still can't track my billable time or hold my requirements docs? It's been years since CityDesk saw a new release, and CoPilot was written by a few interns... over a summer. What has your (self) vaunted managerial skill produced lately? You can only rest on your Excel project manager laurels for so long.
Posted by IDisposable at 4/30/2008 11:42:00 PM (2 comments)
Labels: rants
I just noticed a really cool article on using Microsoft's new SQL Server Data Services which explains how to use cURL at the command line to talk to the SSDS RESTful interface.
If you've never heard of cURL, it is similar to wget in that it allows you to make HTTP requests of any web service. It can handle all the standard verbs (GET,PUT,POST,DELETE) and also supports all those lovely redirections, security and all that other nonsense. It's great for crufting up any batch/command file that could do all sorts of things as well as to ping-test REST services.
If you've never heard of SSDS, it is similar to Amazon's SimpleDB. It offers the ability to push a database out to the public cloud and allow access from web applications, thick clients or whatever. What differentiates it from Google's APE-based access to BigTable or Amazon's S3/SimpleDB setup is that both of those systems are tuple-based (or name-value-based) non-relational databases. The SSDS stuff, on the other hand, is Linq-based. This makes querying MUCH simpler to do.
The killer feature to me is that SSDS doesn't make you (the developer) worry about consistency. With SimpleDB or BigTable, the provider only guarantees "eventual consistency". This means that the changes you make will eventually be propagated through the Amazon/Google cloud. Between the time a change is posted and the time it has propagated, your clients may see stale data, which makes these services usable mostly for rarely-changed data.
SSDS doesn't have this restriction. Once your call is complete, any access will result in the committed data being returned. This is a much simpler model to program against and it puts the replication issues squarely on the database server/service where they belong. What remains to be seen is if Microsoft will really be able to scale this out reasonably.
Posted by IDisposable at 4/15/2008 02:30:00 AM (0 comments)
Labels: cURL SQL Server Data Services, REST, SQL Server, SQLServer, SSDS
A while back, we were looking for an easy way to count "hits" against content in a CMS-like system. For the sake of discussion, pretend we have a table called ContentEntry that represents the content. We decided we wanted to track the hits by-hour against a particular content entry, so that's the ContentEntryPlusPlus table on the right. The foreign-key is from ContentEntryPlusPlus.ContentEntryID to ContentEntry.ID.
Now the trick is to insert the row if needed for a particular entry and time-slot, then increment the Hits column. The simplest thing to do is to check whether the row exists, insert it if not, then do the update. Something like this to find the row's ID:
SELECT TOP 1 ID FROM dbo.ContentEntryPlusPlus WHERE ContentEntryID = @ContentEntryID AND TimeSlot = DateAdd(hh, DateDiff(hh, 0, GetUtcDate()), 0)
Then we have to insert the row if missing:
INSERT INTO dbo.ContentEntryPlusPlus(ContentEntryID, TimeSlot) VALUES (@ContentEntryID, DateAdd(hh, DateDiff(hh, 0, GetUtcDate()), 0));
SELECT Scope_Identity() AS ID;
Then we do the update like this:
UPDATE dbo.ContentEntryPlusPlus SET Hits = Hits + 1 WHERE ID = @ID -- from above SELECT or INSERT's Scope_Identity()
Obviously we have to do this inside a transaction or we could have issues. I also hate multiple round-trips, so we crafted this cute statement pair to insert the row if needed and then update. Note the use of INSERT ... SELECT coupled with a fake table whose row count is controlled by a NOT EXISTS clause checking for the desired row. This gets executed as a single SQL command.
INSERT INTO dbo.ContentEntryPlusPlus(ContentEntryID, TimeSlot)
SELECT TOP 1
    @ContentEntryID AS ContentEntryID
    ,DateAdd(hh, DateDiff(hh, 0, GetUtcDate()), 0) AS TimeSlot
FROM (SELECT 1 AS FakeColumn) AS FakeTable
WHERE NOT EXISTS (SELECT *
                  FROM dbo.ContentEntryPlusPlus
                  WHERE ContentEntryID = @ContentEntryID
                    AND TimeSlot = DateAdd(hh, DateDiff(hh, 0, GetUtcDate()), 0));
UPDATE dbo.ContentEntryPlusPlus
SET Hits = Hits + 1
WHERE ContentEntryID = @ContentEntryID
  AND TimeSlot = DateAdd(hh, DateDiff(hh, 0, GetUtcDate()), 0);
This got tested and deployed, working as expected. The only problem is that every once in a while, for some particularly popular content, we would get a violation of the clustered key's uniqueness check on the ContentEntryPlusPlus table. This was quite surprising, honestly, as the code obviously worked when we tested it.
The only thing that could cause this is two calls executing the inner existence check simultaneously and both deciding an INSERT was warranted. I had assumed that locks would be acquired, and they are, for the inner SELECT, but since there are no rows to lock when this is executed, no rows are locked, so both statements plow on through. So, I just had to add a quick WITH (HOLDLOCK) hint to the inner SELECT and poof, it works. HOLDLOCK applies serializable semantics to that table reference, so the key range is locked even when no row exists yet.
So, the moral of the story? You can't hold onto nothing...
The final version is:
INSERT INTO dbo.ContentEntryPlusPlus(ContentEntryID, TimeSlot)
SELECT TOP 1
    @ContentEntryID AS ContentEntryID
    ,DateAdd(hh, DateDiff(hh, 0, GetUtcDate()), 0) AS TimeSlot
FROM (SELECT 1 AS FakeColumn) AS FakeTable
WHERE NOT EXISTS (SELECT *
                  FROM dbo.ContentEntryPlusPlus WITH (HOLDLOCK)
                  WHERE ContentEntryID = @ContentEntryID
                    AND TimeSlot = DateAdd(hh, DateDiff(hh, 0, GetUtcDate()), 0));
UPDATE dbo.ContentEntryPlusPlus
SET Hits = Hits + 1
WHERE ContentEntryID = @ContentEntryID
  AND TimeSlot = DateAdd(hh, DateDiff(hh, 0, GetUtcDate()), 0);
Posted by IDisposable at 4/10/2008 05:20:00 PM (0 comments)
So, hash has been on my mind lately. No, not that kind of hash, or that kind either. First, there was last week, when I installed Internet Explorer 8 beta 1. I was reading the release notes and was amazed to find that # (you know, octothorpe, pound sign) was not considered part of the URL by this version. Thus you can't link directly to a named element on a page. Eeew!
Then today, Hugh Brown dropped a comment on my diatribe post about value-types, reference-types, Equals and GetHashCode. The post has been live for many months now, and has quite a bit of Google juice. Until now, nobody has ever quibbled with the stuff I wrote, but Hugh had some interesting observations.
In a minor part of his comment, he was surprised by the many overloads of GetHashCode that I suggest, wondering why I didn't just always expect callers to use the params int[] version. Quite simply, this is because by providing several overloads for a small number of arguments (5 in my example), I avoid paying the cost of allocating the array of integers and copying the values for each call to the CombineHashCodes. While this may seem like a trivial savings, remember that GetHashCode is called many times when dealing with HashTable collections and thus it is worth it to provide expedited code paths for the more common usages. Additional savings inside the CombineHashCodes method are garnered by avoiding the loop setup/iteration overhead. Finally, in optimized builds, these simpler method calls will be inlined by the compiler and/or JIT, where methods having loops in the body are never inlined (in CLR releases thus far). It is worth noting that the .Net runtime implementation does the same thing for System.Web.Util.HashCodeCombiner and System.String.Format.
The main body of his comment was that my code actually didn't return useful values. That concerned me a lot. Given his use of Python and inlined implementation, I had to write my own test jig. Unfortunately it confirmed his complaint. On the one hand, the values he was using to test were not normal values you would expect from GetHashCode. Normally GetHashCode values are well-distributed across the entire 32-bit range of an Int32. He was using sequential, smallish, numbers which was skewing the result oddly. That said, the values SHOULD have been different for very-similar inputs. I delved a little into the code I originally wrote and found that what's on the web page does NOT match what is now in use in the BCL's internal code to combine hash codes (which is where I got the idea of left-shifting by 5 bits before XORing). I think that my code was originally based on the 1.1 BCL but I'm not really sure.
In the .Net 2.0 version, there's a class called System.Web.Util.HashCodeCombiner that actually reflects essentially the same technique as my code, with one huge and very significant difference. Where I simply left-shift the running hash code by 5 bits and then XOR in the next value, they are doing the left-shift and also adding in the running hash, then doing the XOR.
You might be wondering why we do the left shift in the first place. The simple answer is that by doing a left-shift by some number of bits, we preserve the low order bits of the running hash somewhat. This prevents the incoming value from XORing away all the significance of the bits thus far, and also ensures that low-byte-only intermediate hash codes don't simply cancel each other out. By shifting left 5 bits, we're simply multiplying by 32 (and thus preserving the lowest 5 bits). Then the original running hash value is added in one more time, making the effective multiplier 33. This isn't far off from Hugh's suggestion of multiplying by 37, while being significantly faster in the binary world of computers. Once the shift and add (i.e. multiplication by 33) is completed, the XOR of the new value results in much better distribution of the final value.
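To see why the extra add matters, here is a minimal sketch comparing the two mixing steps. The class and method names (HashMixDemo, CombineShiftOnly, CombineShiftAdd) are made up for this illustration and are not the library code. It checks that shift-and-add is the same as multiplying by 33 with 32-bit wraparound, and that shift-only mixing pushes the first value completely out of the hash once eight values have been combined.

```csharp
using System;

// Illustrative names only; this is not the Utilities library code.
public static class HashMixDemo
{
    // Shift-only mixing: shift the running hash left 5 bits, then XOR.
    public static int CombineShiftOnly(params int[] hashes)
    {
        int hash = 0;
        foreach (int h in hashes)
        {
            hash = (hash << 5) ^ h;
        }
        return hash;
    }

    // Shift-and-add mixing: (hash << 5) + hash is hash * 33 with wraparound.
    public static int CombineShiftAdd(params int[] hashes)
    {
        int hash = 0;
        foreach (int h in hashes)
        {
            hash = unchecked((hash << 5) + hash) ^ h;
        }
        return hash;
    }

    public static void Main()
    {
        int sample = 123456789;
        // Shift-and-add equals multiplication by 33 in 32-bit arithmetic.
        Console.WriteLine(unchecked((sample << 5) + sample) == unchecked(sample * 33)); // True

        int[] a = { 1, 0, 0, 0, 0, 0, 0, 0 };
        int[] b = { 2, 0, 0, 0, 0, 0, 0, 0 };
        // After seven more 5-bit shifts the first value has left the 32-bit hash.
        Console.WriteLine(CombineShiftOnly(a) == CombineShiftOnly(b)); // True (both are 0)
        // Multiplying by the odd number 33 never discards the first value entirely.
        Console.WriteLine(CombineShiftAdd(a) == CombineShiftAdd(b));   // False
    }
}
```

The inputs here are deliberately tiny and artificial; real GetHashCode values are spread across the whole Int32 range, so the effect is less obvious but still there.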
I've updated my code in the Utilities library, and I'm going back to the original post to point to this post and the new code. So, I owe you one, Hugh...and maybe Microsoft does too because while I was reviewing their code in the newly released BCL source code, I found a very unexpected implementation. This is the snippet in question:
internal static int CombineHashCodes(int h1, int h2) {
return ((h1 << 5) + h1) ^ h2;
}
internal static int CombineHashCodes(int h1, int h2, int h3) {
return CombineHashCodes(CombineHashCodes(h1, h2), h3);
}
internal static int CombineHashCodes(int h1, int h2, int h3, int h4) {
return CombineHashCodes(CombineHashCodes(h1, h2), CombineHashCodes(h3, h4));
}
internal static int CombineHashCodes(int h1, int h2, int h3, int h4, int h5) {
return CombineHashCodes(CombineHashCodes(h1, h2, h3, h4), h5);
}
Did you see the oddity? The implementation taking 4 values does its work by calling the two-value one three times: once to combine the first pair (h1 and h2) of arguments, once to combine the second pair (h3 and h4), then finally to combine the two intermediate values. That's a bit different from what the 3-value and 5-value overloads do. I personally think it should have called the 2-value overload against the output of the 3-value overload to fold in the 4th value (h4). That would be more like what the 3-value and 5-value overloads do. In other words, the method should be:
internal static int CombineHashCodes(int h1, int h2, int h3, int h4) {
return CombineHashCodes(CombineHashCodes(h1, h2, h3), h4);
}
Perhaps they don't care that the values are inconsistent, especially since they don't provide a combiner that takes a params int[] overload, but imagine if I had blindly copied that code and you got two different values from this:
Console.WriteLine("Testing gotcha:");
Console.WriteLine(String.Format("1,2: {0:x}", Utilities.CombineHashCodes(1, 2)));
Console.WriteLine(String.Format("1,2,3: {0:x}", Utilities.CombineHashCodes(1, 2, 3)));
Console.WriteLine(String.Format("1,2,3,4: {0:x}", Utilities.CombineHashCodes(1, 2, 3, 4)));
Console.WriteLine(String.Format("1,2,3,4,5: {0:x}", Utilities.CombineHashCodes(1, 2, 3, 4, 5)));
Console.WriteLine(String.Format("[1,2]: {0:x}", Utilities.CombineHashCodes(new int[] { 1, 2 })));
Console.WriteLine(String.Format("[1,2,3]: {0:x}", Utilities.CombineHashCodes(new int[] { 1, 2, 3 })));
Console.WriteLine(String.Format("[1,2,3,4]: {0:x}", Utilities.CombineHashCodes(new int[] { 1, 2, 3, 4 })));
Console.WriteLine(String.Format("[1,2,3,4,5]: {0:x}", Utilities.CombineHashCodes(new int[] { 1, 2, 3, 4, 5 })));
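To make the inconsistency concrete, here is a small stand-alone sketch. The class and method names (FourValueGotcha, CombineSequential, CombinePairwise) are invented for this example; only the two-value formula is taken from the BCL snippet quoted above. It folds the same four inputs both ways and prints the two results.

```csharp
using System;

// Illustrative names only; the two-value formula matches the snippet quoted above.
public static class FourValueGotcha
{
    public static int Combine(int h1, int h2)
    {
        return unchecked((h1 << 5) + h1) ^ h2;
    }

    // Sequential folding, the way the 3-value and 5-value overloads chain.
    public static int CombineSequential(int h1, int h2, int h3, int h4)
    {
        return Combine(Combine(Combine(h1, h2), h3), h4);
    }

    // Pairwise folding, as in the quoted 4-value overload.
    public static int CombinePairwise(int h1, int h2, int h3, int h4)
    {
        return Combine(Combine(h1, h2), Combine(h3, h4));
    }

    public static void Main()
    {
        Console.WriteLine("sequential: {0:x}", CombineSequential(1, 2, 3, 4)); // 9484
        Console.WriteLine("pairwise:   {0:x}", CombinePairwise(1, 2, 3, 4));   // 4e4
    }
}
```

A params int[] overload that loops over its input naturally produces the sequential answer, which is why the pairwise 4-value overload would disagree with it.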
Here is the revised version of the CombineHashCodes methods from my Utilities library:
public static partial class Utilities
{
    public static int CombineHashCodes(params int[] hashes)
    {
        int hash = 0;
        for (int index = 0; index < hashes.Length; index++)
        {
            hash = (hash << 5) + hash;
            hash ^= hashes[index];
        }
        return hash;
    }

    private static int GetEntryHash(object entry)
    {
        int entryHash = 0x61E04917; // slurped from .Net runtime internals...
        if (entry != null)
        {
            object[] subObjects = entry as object[];
            if (subObjects != null)
            {
                entryHash = Utilities.CombineHashCodes(subObjects);
            }
            else
            {
                entryHash = entry.GetHashCode();
            }
        }
        return entryHash;
    }

    public static int CombineHashCodes(params object[] objects)
    {
        int hash = 0;
        for (int index = 0; index < objects.Length; index++)
        {
            hash = (hash << 5) + hash;
            hash ^= GetEntryHash(objects[index]);
        }
        return hash;
    }

    public static int CombineHashCodes(int hash1, int hash2)
    {
        return ((hash1 << 5) + hash1) ^ hash2;
    }

    public static int CombineHashCodes(int hash1, int hash2, int hash3)
    {
        int hash = CombineHashCodes(hash1, hash2);
        return ((hash << 5) + hash) ^ hash3;
    }

    public static int CombineHashCodes(int hash1, int hash2, int hash3, int hash4)
    {
        int hash = CombineHashCodes(hash1, hash2, hash3);
        return ((hash << 5) + hash) ^ hash4;
    }

    public static int CombineHashCodes(int hash1, int hash2, int hash3, int hash4, int hash5)
    {
        int hash = CombineHashCodes(hash1, hash2, hash3, hash4);
        return ((hash << 5) + hash) ^ hash5;
    }

    public static int CombineHashCodes(object obj1, object obj2)
    {
        return CombineHashCodes(obj1.GetHashCode()
            , obj2.GetHashCode());
    }

    public static int CombineHashCodes(object obj1, object obj2, object obj3)
    {
        return CombineHashCodes(obj1.GetHashCode()
            , obj2.GetHashCode()
            , obj3.GetHashCode());
    }

    public static int CombineHashCodes(object obj1, object obj2, object obj3, object obj4)
    {
        return CombineHashCodes(obj1.GetHashCode()
            , obj2.GetHashCode()
            , obj3.GetHashCode()
            , obj4.GetHashCode());
    }

    public static int CombineHashCodes(object obj1, object obj2, object obj3, object obj4, object obj5)
    {
        return CombineHashCodes(obj1.GetHashCode()
            , obj2.GetHashCode()
            , obj3.GetHashCode()
            , obj4.GetHashCode()
            , obj5.GetHashCode());
    }
}
Posted by IDisposable at 3/09/2008 03:23:00 AM (5 comments)
Labels: .Net, C#, GetHashCode, Microsoft, Utilities
Little bunny FoFo, hopping through the forest, scooping up the field mice and bopping them on the heads.
Down came the good fairy, "Little bunny FoFo, I don't want to see you scooping up the field mice and bopping them on the heads. I'll give you three chances and then I'll turn you into a goon".
Posted by IDisposable at 1/02/2008 01:01:00 AM (0 comments)
It seems there are several not-very-overlapping audiences for this blog. There are people reading for the SQL stuff, especially the datetime related stuff. There are people reading for the Lightweight Code Generation stuff, especially the DynamicMethod/DynamicSorter library. Then there are the people hunting down information about the RSSToolkit library. Finally, there's the people following the recent URITemplate library.
Since many of you visitors seem to have specific interests, I've added the ability to subscribe to individual labels applied to the posts via the excellent tip given by Daniel Cazzulino in his instructional posting.
Just check out the labels listing on the right-side navigation. Oh, if you only read via a feed, this might be worth a read of the actual page.
Posted by IDisposable at 11/18/2007 11:35:00 PM (0 comments)
Labels: CodePlex, DateTime, Dynamic, DynamicMethod, Emit, IL, LCG, lightweight code generation, RSS, RssToolkit, SQL, URI, UriPattern, UriTemplate
I'm in...
(because you are)
Posted by IDisposable at 11/13/2007 03:54:00 PM (0 comments)
Labels: personal
I can't tell you how happy the last few days have made my inner geek. Last week the Chumby started shipping and today the by-far-coolest idea ever is available for order.
Do you have a digital camera? Snap a lot of shots? Forget to get around to uploading them to your PC and your online site of choice? Have we got a solution for you: just get an Eye-Fi SD memory card, configure it from your PC/Mac and then install it in your camera. It'll store 2GB of pictures and every time it gets near a wi-fi network that you have configured it to use, poof, instant uploads to your online site. This baby supports all the players (except WinkFlash, what's up with THAT?).
For those of you with CF cards instead... PFFFTT!
Posted by IDisposable at 10/30/2007 04:23:00 PM (1 comment)
Labels: fun
Today I released a new version of the UriPattern and UriTemplate library on CodePlex (previously announced here). There are two changes in this release:
Pick up Release 1.1 on CodePlex
Posted by IDisposable at 10/26/2007 07:51:00 PM (0 comments)
Labels: CodePlex, Source, URI, UriPattern, UriTemplate
With a new baby around, you can imagine that our family's sleep patterns are changing. To say that we are tired misses the point entirely... we're all a "bit slow" round the house. Arianna doesn't want to get up for the Montessori school that she dearly loves to go to, Beth is stressed and struggling with emotion... and mellow me is actually not catching those "snaps of testosterone". That's just the emotional impact... the cognitive impact is much worse. I've found it difficult to grok code-review changes that occurred in the last 5 days at work... I couldn't even recognize a bad web.config connection-string issue (something that would have jumped out before the problem description was finished a mere week ago). It's getting better, though... today is better than yesterday by far... and the biggest difference is in how much sleep we've gotten. I can easily see the pattern in myself--I even might generalize to Beth--but did I extend this to a general behavior pattern for Arianna, or kids in general? I am not that smart (today?).
Today, I read an article by Po Bronson, who authored an article a while back that really resonated with me. I wrote about it here back in March. This new article shows astonishing evidence for the direct link between how much sleep a child gets and their cognitive ability the next day (and on the following days). In one study of 77 kids (half asked to stay up a little later and half asked to go to bed a little earlier) the resulting difference of merely one hour in the amount of sleep showed the same cognitive difference after three days as that between an average 4th and 6th grader. In other words, three hours of sleep difference cost two years worth of cognitive ability.
So let, no MAKE, your kids (and you) get that extra sleep. Read more at: Can a Lack of Sleep Set Back Your Child's Cognitive Abilities?
Posted by IDisposable at 10/09/2007 05:22:00 PM (1 comment)
I am happy to announce the birth of Xavier Eli Brooks at 1322 of October 4th.
After faking us out by turning himself around the night before the inversion, he resumed his (dad mirroring) ways and refused to turn the crown fully upside down. After 12 hours of Cervidil and 18 hours of contractions standing on his ear, he wasn't coming any closer to finding the stage door so we opened a new one just to his right.
He emerged warping space-time at a mass of 7 pounds 6 ounces, and a length of 19 3/4 inches, not that those numbers actually tell you anything about him.
Beth and baby are both fine, thanks for asking.
Posted by IDisposable at 10/04/2007 10:41:00 PM (1 comment)
I will always remember the feeling of wonder that overtook me as I read "A Wrinkle in Time" for the first time in 1971... a book born of a fertile mind the same year I was born has shaped me ever since. We've lost a wonderful person today.
Madeleine L’Engle, Children’s Writer, Is Dead - New York Times
Posted by IDisposable at 9/07/2007 04:02:00 PM (0 comments)
UPDATED: On 9 March, 2008, I fixed some issues with this posting due to comments from Hugh Brown; make sure you read the follow-up post Sometimes you make a hash of things.
Checking out a new blog today [Davy Brion's Blog] I stumbled across a very nice entry about Implementing A Value Object. Go read that now if you don't know what a value object is, what immutable means or why it's good.
What I want to talk about is GetHashCode() as used with value-type objects (e.g. struct in C#) but to do that, I really need to talk about the difference between reference-type objects (RTOs from here out) vs. value-type objects (VTOs from here out). Feel free to skip down if this is old hat to you.
What's important to realize is that if you are a reference-type object, your identity revolves around "where you are". This is expressed, in terms of .Net, by the fact that you have the same reference handle/memory address. The problem with this is that you might have a Person object that currently represents me and thus has the FirstName property == "Marc" and LastName property == "Brooks". If I give you a reference to that Person object and you change the FirstName property to "Charles", you're suddenly talking about my father. What's dangerous about this is that you have changed the underlying object to which I gave you a reference, thus my reference also now seems to be my father.
On the other hand, if I gave you a copy of the original Person object (perhaps via a Clone() operation), then you can change any property you wish and I will never know. This is good, if that's what you intend. Your personal copy of the object is not my copy of the object, they have different physical identities, even though they might initially share the same logical identity. To me, it's much like the difference between giving you a money order, or simply a copy of a money order. In the former case you are free to set the payee name to be whatever you want and cash/spend that money order. In the latter, you can do whatever you want to your copy, but it doesn't affect mine.
VTOs automatically enforce the making of copies; you simply cannot change the original, no matter what. Though you might change the property values on your copy, this does nothing to my original properties. What this means is that you cannot meaningfully compare the physical identity (e.g. the reference handle/memory address) of two value-type objects, because they will always be different.
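The difference between handing out a reference and handing out a copy can be sketched in a few lines. PersonRef and PersonVal are hypothetical types invented for this illustration (the Person object discussed above is never shown in code), so treat this as a sketch of the idea rather than anyone's real class.

```csharp
using System;

// Hypothetical types for illustration only; they are not from the original post.
public class PersonRef { public string FirstName; public string LastName; }
public struct PersonVal { public string FirstName; public string LastName; }

public static class IdentityDemo
{
    // Hands out a second reference to the same object, then renames through it.
    public static string RenameThroughReference()
    {
        PersonRef mine = new PersonRef { FirstName = "Marc", LastName = "Brooks" };
        PersonRef yours = mine;          // same object, two references
        yours.FirstName = "Charles";
        return mine.FirstName;           // my reference sees the change
    }

    // Assigning a struct copies its fields, so the rename only touches the copy.
    public static string RenameCopy()
    {
        PersonVal mine = new PersonVal { FirstName = "Marc", LastName = "Brooks" };
        PersonVal yours = mine;          // independent copy
        yours.FirstName = "Charles";
        return mine.FirstName;           // my original is untouched
    }

    public static void Main()
    {
        Console.WriteLine(RenameThroughReference()); // Charles
        Console.WriteLine(RenameCopy());             // Marc
    }
}
```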
So, how do you meaningfully compare VTOs? By their logical identity. In the example of a money order, the logical identity is actually the money order number, not the physical piece of paper. Some less-sophisticated verifications of the money order's validity might hinge on the appearance of the piece of paper, but a much better authoritative verification comes from calling in the money order number to the issuer and seeing if that number is still valid and for what amount. Even modern sporting event venues operate similarly, checking not the physical appearance of a ticket; rather they scan the barcode and match that against a database to ensure the ticket is valid and hasn't been used yet.
Thus, a VTO's identity must be defined in terms of one or more of the property values. To check logical equality of two VTOs, you compare the equality of the identifying properties. In the case of a money order, the money order number.
When you drop an object in a collection, you expect to be able to retrieve that object (or, in the case of a VTO, a copy of the object) later. The simplest way is enumeration, but that's not very quick. More commonly, you stick the object in some sort of dictionary keyed by some value. In the case of an Array, the key is simply the integer index of where you stuck the item, but for large numbers of potential objects you really need an identifying property on the object itself. In the event ticket example, it's the barcode of the ticket. That barcode value is the key used to store and retrieve the ticket information in a collection (perhaps a Dictionary<TicketNumber, TicketStatus> collection). To make the storage and lookup quick, the collection internally stores the key values in "buckets" that are based in some way on the key's value. Each "bucket" contains a list of objects whose keys map to the same bucket number. Once you find the right bucket, you scan through all the objects in that bucket by doing an identity comparison. This means that:
The standard method used in the .Net Framework Class Library for identity comparison is the Equals() method. The standard method in the FCL to map object key values into buckets is the GetHashCode() method. In practical terms, this means that the Equals() method and the GetHashCode() method work together for any object you might want to place in a collection. They must agree on what properties of an object are identifying. When you design a value-type object, you really have to get it right because they have no meaningful physical identity.
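Here is a minimal sketch of the ticket example, assuming a key type whose only identifying property is a non-null barcode string. The TicketNumber class body, the sample barcodes and the plain string used for the status are all assumptions made for this illustration. Because Equals() and GetHashCode() both look only at the barcode, a freshly constructed key finds the entry stored under a different instance.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key type: the (non-null) barcode is the only identifying property.
public sealed class TicketNumber
{
    private readonly string barcode;

    public TicketNumber(string barcode)
    {
        this.barcode = barcode;
    }

    public override bool Equals(object obj)
    {
        TicketNumber other = obj as TicketNumber;
        return other != null && this.barcode == other.barcode;
    }

    // Uses the same property as Equals(), so equal keys land in the same bucket.
    public override int GetHashCode()
    {
        return this.barcode.GetHashCode();
    }
}

public static class TicketLookupDemo
{
    public static string Lookup(string scannedBarcode)
    {
        Dictionary<TicketNumber, string> tickets = new Dictionary<TicketNumber, string>();
        tickets.Add(new TicketNumber("A-1001"), "valid");
        tickets.Add(new TicketNumber("A-1002"), "used");

        string status;
        // The lookup key is a different instance with the same logical identity.
        return tickets.TryGetValue(new TicketNumber(scannedBarcode), out status)
            ? status
            : "unknown";
    }

    public static void Main()
    {
        Console.WriteLine(Lookup("A-1002")); // used
        Console.WriteLine(Lookup("Z-9999")); // unknown
    }
}
```

If GetHashCode() were left at its default while Equals() compared barcodes, the second instance would almost always be sent to a different bucket and the lookup would miss.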
Equals() and GetHashCode() are free!
In the .Net runtime, all objects automatically inherit an implementation of both the Equals() method and the GetHashCode() method. But as with many things in life, not all free things are really worth much. The default implementation of the Equals() method for reference-type objects is simply to compare the reference handle for equality. If we're pointing at the same object, we're talking about equal objects. The default implementation of the GetHashCode() method similarly bases its answer on the reference handle value. For value-type objects, the .Net runtime treats them as if they inherit from ValueType, so the Equals() method on ValueType is what is called. This method compares field-by-field the individual elements of the object and returns true if each is equal. Likewise, the GetHashCode() method of ValueType is the default and it merely computes the field-by-field hash-codes and combines them in an unspecified way to generate an overall object hash-code.
In summary, this means that the default treatment of VTOs is to treat all fields as identifying. The default treatment of RTOs is to treat none of the fields as identifying. Rarely would this be the right thing to do, but that's what you get for the low-low-price of free.
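The free defaults are easy to observe. In this sketch PointClass and PointStruct are hypothetical types made up for the illustration, with neither Equals() nor GetHashCode() overridden: two class instances with identical fields are not equal, while two struct values with identical fields are.

```csharp
using System;

// Hypothetical types for illustration; neither overrides Equals() or GetHashCode().
public class PointClass { public int X; public int Y; }
public struct PointStruct { public int X; public int Y; }

public static class DefaultEqualityDemo
{
    // Reference types: the default Equals() compares references, not fields.
    public static bool ClassesWithSameFieldsAreEqual()
    {
        PointClass a = new PointClass { X = 1, Y = 2 };
        PointClass b = new PointClass { X = 1, Y = 2 };
        return a.Equals(b);
    }

    // Value types: ValueType.Equals() compares field by field, and equal
    // values must report equal hash codes.
    public static bool StructsWithSameFieldsAreEqual()
    {
        PointStruct a = new PointStruct { X = 1, Y = 2 };
        PointStruct b = new PointStruct { X = 1, Y = 2 };
        return a.Equals(b) && a.GetHashCode() == b.GetHashCode();
    }

    public static void Main()
    {
        Console.WriteLine(ClassesWithSameFieldsAreEqual()); // False
        Console.WriteLine(StructsWithSameFieldsAreEqual()); // True
    }
}
```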
If you have a logical identity for a VTO, or an RTO, then you need to supply your own implementation of Equals() and GetHashCode(). As detailed above, you need to make sure that they are coupled in their understanding of what fields and/or properties are the identifying ones.
Once you've identified what fields or properties to use when comparing two objects for logical identity, you need to implement an Equals() method and a GetHashCode() method the right way. For the Equals() method, there are only a few rules:
System.Object)
operator ==, you must have a corresponding Equals() method.
IComparable interface, you should override Equals()
So, a classic implementation of a value-type object would be something like this (borrowed from Davy's post):
public override bool Equals(object obj)
{
    Address address = obj as Address;
    if (address != null)
    {
        return this.Equals(address);
    }
    // obj is null or not an Address; fall back to the inherited comparison
    return base.Equals(obj);
}

public bool Equals(Address address)
{
    if (address != null)
    {
        // compare only the identifying properties
        return this.Street.Equals(address.Street)
            && this.City.Equals(address.City)
            && this.Region.Equals(address.Region)
            && this.PostalCode.Equals(address.PostalCode)
            && this.Country.Equals(address.Country);
    }
    return false;
}
Note that it's perfectly fine for the Address object to have many other properties that are not considered identifying and thus not included in the implementation of the Equals() method. That's really the whole point of implementing the Equals() method on an object. For VTOs you are trying to ignore some fields that the default ValueType implementation would have included. For RTOs, you are trying to establish some properties that give logical equivalence.
Once you've established the body of the Equals() method, you absolutely must define the GetHashCode() method. This is where Davy gets it 99% right. He correctly states that every field/property value you call Equals() against should also be included in the GetHashCode() return value. Most people get that right, and Davy avoids the common mistake of adding the GetHashCode() sub-values together (which would skew the distribution pattern toward larger absolute values) and does an XOR of the sub-values. This is excellent, but we can get it a tiny bit better by following the pattern of many Microsoft-provided classes and shifting the accumulated value before the XOR of the next sub-value. This leads to the low-order bits of the sub-value hash-codes being "distributed" into the final value instead of canceling each other out. Thus, my version of Davy's method is:
public override int GetHashCode()
{
return (((((((this.Street.GetHashCode() << 5)
^ this.City.GetHashCode()) << 5)
^ this.Region.GetHashCode()) << 5)
^ this.PostalCode.GetHashCode()) << 5)
^ this.Country.GetHashCode();
}
Unfortunately, that's kind of ugly and error prone due to all the operator precedence issues. Can we make it better?
So, a much better approach would be to have a little helper method set that knows how to do the combining according to this rule. For simplicity and ultimate flexibility, we'll have a version that takes a params array of objects and calls GetHashCode() on each of them in turn. For better performance (to avoid boxing and unboxing) we'll add a version that takes a params array of precomputed hash codes (actually System.Int32 values). Finally, for ultimate performance, we'll have a few overloads that take a specific number of objects or hash-code values to avoid the allocation of the params array. You can add more as needed, but you really ought to rethink your class if you get more than five identifying fields/properties.
public static partial class Utilities
{
    public static int CombineHashCodes(params int[] hashes)
    {
        int hash = 0;
        for (int index = 0; index < hashes.Length; index++)
        {
            hash <<= 5;
            hash ^= hashes[index];
        }
        return hash;
    }

    public static int CombineHashCodes(params object[] objects)
    {
        int hash = 0;
        for (int index = 0; index < objects.Length; index++)
        {
            int entryHash = 0x61E04917; // slurped from .Net runtime internals...
            object entry = objects[index];
            if (entry != null)
            {
                object[] subObjects = entry as object[];
                if (subObjects != null)
                {
                    entryHash = Utilities.CombineHashCodes(subObjects);
                }
                else
                {
                    entryHash = entry.GetHashCode();
                }
            }
            hash <<= 5;
            hash ^= entryHash;
        }
        return hash;
    }

    public static int CombineHashCodes(int hash1, int hash2)
    {
        return (hash1 << 5)
            ^ hash2;
    }

    public static int CombineHashCodes(int hash1, int hash2, int hash3)
    {
        return (((hash1 << 5)
            ^ hash2) << 5)
            ^ hash3;
    }

    public static int CombineHashCodes(int hash1, int hash2, int hash3, int hash4)
    {
        return (((((hash1 << 5)
            ^ hash2) << 5)
            ^ hash3) << 5)
            ^ hash4;
    }

    public static int CombineHashCodes(int hash1, int hash2, int hash3, int hash4, int hash5)
    {
        return (((((((hash1 << 5)
            ^ hash2) << 5)
            ^ hash3) << 5)
            ^ hash4) << 5)
            ^ hash5;
    }

    public static int CombineHashCodes(object object1, object object2)
    {
        return CombineHashCodes(object1.GetHashCode()
            , object2.GetHashCode());
    }

    public static int CombineHashCodes(object object1, object object2, object object3)
    {
        return CombineHashCodes(object1.GetHashCode()
            , object2.GetHashCode()
            , object3.GetHashCode());
    }

    public static int CombineHashCodes(object object1, object object2, object object3, object object4)
    {
        return CombineHashCodes(object1.GetHashCode()
            , object2.GetHashCode()
            , object3.GetHashCode()
            , object4.GetHashCode());
    }
}
This leaves us with the final version of Davy's GetHashCode() method looking like this:
public override int GetHashCode()
{
    // Qualified with the helper class name, since this method lives on the Address class.
    return Utilities.CombineHashCodes(this.Street, this.City, this.Region, this.PostalCode, this.Country);
}
That's pretty clean and easy to understand, right?
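To see why the shift matters, here is a minimal sketch comparing a plain XOR against the shift-then-XOR rule. The class name HashCombineDemo and the sample values 17 and 42 are made up for illustration; they stand in for the hash codes of two fields, since real string hash codes vary between runtimes.

```csharp
using System;

public static class HashCombineDemo
{
    // Plain XOR of the sub-values: the order is ignored and equal values cancel out.
    public static int XorOnly(params int[] hashes)
    {
        int hash = 0;
        foreach (int h in hashes)
        {
            hash ^= h;
        }
        return hash;
    }

    // Shift-then-XOR, the same rule as CombineHashCodes(params int[]).
    public static int ShiftXor(params int[] hashes)
    {
        int hash = 0;
        foreach (int h in hashes)
        {
            hash <<= 5;
            hash ^= h;
        }
        return hash;
    }

    public static void Main()
    {
        int a = 17, b = 42; // stand-ins for two field hash codes

        Console.WriteLine(XorOnly(a, b) == XorOnly(b, a));   // True: swapped fields collide
        Console.WriteLine(XorOnly(a, a));                     // 0: equal fields cancel out
        Console.WriteLine(ShiftXor(a, b) == ShiftXor(b, a)); // False: order now matters
        Console.WriteLine(ShiftXor(a, a));                    // 561: equal fields no longer cancel
    }
}
```

With the plain XOR, two addresses whose Street and City values were swapped would hash identically, and any pair of equal fields would contribute nothing at all. The shift avoids both of those for small numbers of fields.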
UPDATED: On 9 March 2008, I fixed some issues with this posting in response to comments from Hugh Brown. Make sure you read the follow-up post, Sometimes you make a hash of things.
Posted by IDisposable at 8/17/2007 05:21:00 PM
5 comments
This is not right:
internal static bool DoesDbExist(SqlConnection conn, string database)
{
    using (SqlCommand cmd = conn.CreateCommand())
    {
        // prefer this to a where clause as this is not prone to injection attacks
        cmd.CommandText = "SELECT name FROM sys.databases";
        cmd.CommandType = CommandType.Text;
        using (SqlDataReader reader = cmd.ExecuteReader())
        {
            while (reader.Read())
            {
                string dbName = reader.GetString(0);
                if (string.Compare(dbName, database, true, CultureInfo.CurrentCulture) == 0)
                {
                    // the database already exists - return
                    return true;
                }
            }
        }
    }
    return false;
}
This is right:
internal static bool DoesDbExist(SqlConnection conn, string database)
{
    using (SqlCommand cmd = conn.CreateCommand())
    {
        cmd.CommandText = "SELECT name FROM sys.databases WHERE name=@name";
        cmd.CommandType = CommandType.Text;
        cmd.Parameters.Add(new SqlParameter("@name", database));
        using (SqlDataReader reader = cmd.ExecuteReader())
        {
            return reader.Read();
        }
    }
}
Someone please assure me that this is not how everyone else handles avoiding SQL injection.
Posted by IDisposable at 8/10/2007 01:47:00 PM
1 comments
Labels: best practice, bug, injection, Microsoft, SQL
Today, in blinding science, it turns out that kids who "smoke" candy cigarettes are more likely to try the real thing later. Shocking, huh?
Can I please have my tax dollars back on this one?
Posted by IDisposable at 6/20/2007 06:29:00 PM
0 comments
Labels: blinding science
I've just created a new project on CodePlex, and it's got the first (and hopefully only) release available. Enjoy UriTemplate.
I admit that sometimes I get a little jealous of other developers, who are not as limited in the things they can adopt. In some cases it's a cool new idea like doing RESTful applications. In other cases it's a bit of nice functionality living in another platform like Java. In still more cases I'm lusting after the cool new stuff in various Microsoft .Net CTPs, betas and such.
The real-life world I find myself in, though, often has me coding against legacy systems running on ASP.Net 1.1, on servers I cannot control or upgrade, with inherited systems that barely grok the idea that WebControls can have properties. Woe is me, and probably many others of you out there.
Today, however, I'm taking back the future for slobs like me, and I'm doing it one class at a time.
A while back, I was reading Steve Maine's excellent blog and found the interesting post UriTemplate 101, which talks all about a new class available in an upcoming release of .Net. The basic idea of this class is to let you specify a pattern of replaceable tokens to use when constructing or parsing URIs. The class looks to be quite nice, but since it was part of a future release, I just filed it away for later cogitation.
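To make the "replaceable tokens" idea concrete, here is a minimal sketch of the construction half only. SimpleUriTemplate and its Expand method are hypothetical names invented for this illustration; this is neither the .Net UriTemplate class Steve describes nor the API of the CodePlex project, and the {name} token syntax and the example.org URL are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not the real UriTemplate API.
public static class SimpleUriTemplate
{
    // Matches tokens of the form {name}.
    private static readonly Regex Token = new Regex(@"\{(\w+)\}");

    // Replaces each {name} token with the URI-escaped value from the dictionary.
    // Tokens with no matching value are replaced with an empty string.
    public static string Expand(string template, IDictionary<string, string> values)
    {
        return Token.Replace(template, delegate(Match m)
        {
            string value;
            return values.TryGetValue(m.Groups[1].Value, out value)
                ? Uri.EscapeDataString(value)
                : string.Empty;
        });
    }

    public static void Main()
    {
        Dictionary<string, string> values = new Dictionary<string, string>();
        values["user"] = "idisposable";
        values["tag"] = "remote support";

        Console.WriteLine(Expand("http://example.org/{user}/posts/{tag}", values));
        // prints "http://example.org/idisposable/posts/remote%20support"
    }
}
```

Parsing goes the other way: given a URI and the same template, you recover the token values. That half is left out of this sketch.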
Dare and everyone else have been talking about this wonderful RESTful world forever, where everything is about URIs that mean something and state transitions occur by following those meaningful paths. Couple that with the long-standing best practice of building Web systems with "hackable URLs", and the whole thing resonates with me. I start thinking about how to apply UriTemplate. Of course, I don't have any new stuff I'm building that would let me play that way... until last week.
Suddenly, a new project appears on the near horizon... a chance to retrofit a cool new set of functionality to an existing ASP.Net 1.1 site. This new stuff would really benefit from hackable URLs and thus needs a good URL Rewriter and Virtual path handler. Sure, those exist, but almost all force me to map URLs to pages via some lovely RegEx matching. This project, however, is all content-driven and just cries out REST. I want a more general solution and UriTemplate sounds like a match.
So, off I go, looking for the DLL for UriTemplate that Steve's talked about for "investigation". I spin and whirl and Google and Live Search (not a verbable word!) till I'm blue in the face, but I can't figure out where this wonderful class has even been sneaked out for a peek. In fact, none of the searches turn up much more than Joe Gregorio's original posting of the idea, a follow-up on applying templated URIs to RESTful development.
Eventually, I stumble across Jeff Newsom's curiously titled posting about some upcoming WCF features that shows using a UriTemplate in a WebInvokeAttribute. He also mentions in another post that some functionality was "folded into the BizTalk Services SDK", which brings me right back to the start with Steve Maine's blog. So, I now know where to look. A quick download of the BizTalk Services SDK and I've got some code to look at. Fire up Reflector and ugh... way too complex for me to use, since I can't deploy the SDK to my production environment. I guess I'm going to have to write my own, but I'm sure not going to back-port the one in the SDK.
So, last Friday, I finally decided it was time to just write the code myself. I did another search based on the links that I had previously found and stumbled across James Snell's posting about the draft specification for URI Templates, which included a Java implementation. This code is simple and clean... very likeable. Too bad it's in the wrong language and built for the wrong platform. But I know Java, I know C# and I know how to make one look like the other. A short time later and I've got a fully functional .Net 2.0 version of the package that James wrote.
This week, my coworker Ryan Stephenson did a quick back-port to the .Net 1.1 framework (I told you we had deployment restrictions!) and today I bundled it all up, created a project on CodePlex and made a quick home page for it.
So, like I said way back up there... there's now a perfectly serviceable UriTemplate implementation available for schmucks like me. If you are interested, the goods are here.
Posted by IDisposable at 6/20/2007 06:11:00 PM
0 comments
Thanks to some amazing work by Piyush Shah of Microsoft, the ASP.Net RssToolkit originally authored by Dmitry Robsman has grown up big and strong!
The new release adds some awesome features that many users have been asking for. Jon Gallant and I did some considerable tightening of the code base, and I got off my lazy butt and updated the Wiki using some documentation that Piyush wrote as a basis.
This release adds support for some huge features that I'll summarize here, but you should really head to the project home page to read the Wiki documentation.
New features: