Mar
12.
2007

This particular rant was caused by a post that doesn’t really go in the direction I’m about to, but is an interesting read nevertheless (check out the comments!): Agile and Test-Driven: A Marriage Made In Hell?

I want to touch on a topic concerning a single acronym in this post: YAGNI, or You Ain’t Gonna Need It.

In general, I agree that YAGNI is a solid concept. Don’t build too much, especially if you’re going to throw it away later. Makes sense. Less work is better. Less code means fewer bugs and less time spent for no benefit. It leaves you more time to work on things that a customer wants now. Goodness, right?

Not always. If some of us are crazy for building code even though a customer did not request it (so we’re doing exactly the opposite of what’s recommended) there is a method in our madness. Why exactly are developers building more than necessary?

Sometimes it’s because the customer’s behavior is actually not so unpredictable. Customers are not technical people (most of the time), and they rarely think in abstract terms. So what they tell you and what they really mean are most often variations of one and the same thing – a “hardcoded” version they give you, and a more flexible, slightly more generalized version they are vaguely aware of but can’t express in abstract terms that you could translate literally into code.

So you detect this situation. You don’t say anything, but you code a more generalized version of what the customer wants. A few weeks or months later, the customer realizes that the initial idea was too narrow in scope and wants something better, more complicated or more general.

Now, if you followed YAGNI, chances are that the architecture of the app as it stands at that moment simply does not support that scenario. If you want to incorporate the changes the customer is insisting on, you need to refactor large portions of your code. All your unit tests go down the drain, as you’re not testing the same thing any more. You end up writing lots of code with a risk of breaking big chunks of previously written code, plus you have to write a ton of unit tests again.

Contrast this with the “developer premonition” approach above. You have already incorporated the general idea and your tests reflect that, so all you need to do now is expose a GUI item or two and call the right method. You’re done. Besides not jeopardizing the existing code, you look cool in the eyes of the customer because you adapt to change so fast.

This is an oversimplification of course, but once you have observed the above scenario in the wild often enough, you’ll start adapting to it and ignoring YAGNI. The real issue here, just like with YAGNI, is knowing when to stop.

I think it’s precisely because of this – developers’ inability to stop themselves from generalizing too much – that YAGNI was introduced. When in doubt, YAGNI.

But it’s hard to fight the real-world experience. Use your own judgment and most of all, adapt to the customer.

Feb
22.
2007

Scott Hanselman has recently discussed the ways that an application can auto-update itself.

He’s only discussing the nice side of the auto-update, assuming that the app will behave correctly and predictably. Here’s my take: if you can’t implement this functionality right, don’t even try, you’ll only annoy your users.

Examples of auto-update done right:

  • Reflector, besides being an all-around cool tool, updates seamlessly and with the least fuss
  • Eclipse, even though it consists of a million plug-ins and sub-platforms and whatnot
  • Paint.net, including auto-update to and from a beta version
  • Windows Live Messenger, notifies you, downloads and installs, simple and effective
  • Firefox, downloads the update in the background, just restart and that’s it

Examples of auto-update done badly:

  • Samsung PC Studio, where the update consists of individual downloads of each and every DLL and EXE (I kid you not), uncompressed judging by the look of the progress bar; it takes a lot of time on my 15Mb connection, but at least it works
  • Nero 7 Premium, confusing UI, mindless questions, does not work and it’s not obvious why; judging by Process Explorer it does establish the HTTP connection(s) to the French mirrors (OK, at least it got right that I live in France) but then nothing happens and it declares “download problems” – “very” informative

If using the auto-update takes more trouble and nerves than a manual download, please don’t bother. Just link to the Web page with the info on the latest version and I’ll download it myself, thank you very much.

Jun
28.
2006

By “code packaging” I mean separating your code (on a high level) into manageable chunks (“components”) ready for (re)use. On Windows, this would generally be a DLL.

Windows was originally written in C (and to some extent now probably C++) so its packaging mechanisms are C based too. Unfortunately, this means that support for C++ had to be “bolted on” in some way, which resulted in a mess. Just think about it – should I export a class, or some of its methods? How should I export static members? How do I link with a component of this particular library? Is this library static or is it an import library (you can’t tell by the extension)? If I link dynamically, can I/should I link with another “component” of the same library statically? Where the hell are the boundaries?

I was reminded of all these questions while trying to build and link a 3rd party C++ library. One of the symbols just wasn’t exported properly. Even though I knew what was wrong, I was unable to fix the problem because the macros that control __declspec(dllimport) and/or __declspec(dllexport) were impossible to mentally unwrap.
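For readers who have not met these macros, here is a minimal sketch of the usual single-switch pattern, reduced to its simplest form. The names MYLIB_BUILD, MYLIB_API, mylib_add and Counter are made up for illustration and are not taken from the library in question; real libraries typically layer several more conditions (static builds, per-component switches) on top, which is where the unwrapping gets hard.

```cpp
// Sketch of an export header plus a tiny implementation in one file.
// MYLIB_BUILD, MYLIB_API, mylib_add and Counter are hypothetical names.

// Normally defined by the DLL project's build settings, not in source;
// it is defined here only so this single-file sketch compiles everywhere.
#define MYLIB_BUILD 1

#if defined(_WIN32)
#  if defined(MYLIB_BUILD)
#    define MYLIB_API __declspec(dllexport)  // compiling the DLL itself
#  else
#    define MYLIB_API __declspec(dllimport)  // compiling a client of the DLL
#  endif
#else
#  define MYLIB_API                          // no decoration needed elsewhere
#endif

// Exporting a single free function.
MYLIB_API int mylib_add(int a, int b);

// Exporting a whole class: all of its members cross the DLL boundary.
class MYLIB_API Counter {
public:
    void increment();
    int value() const;
private:
    int count_ = 0;
};

// Definitions; in a real library these live in a .cpp built with MYLIB_BUILD.
int mylib_add(int a, int b) { return a + b; }
void Counter::increment() { ++count_; }
int Counter::value() const { return count_; }
```

A client would include the same header without MYLIB_BUILD defined and link against the import library, so the very same declarations flip from export to import.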

Contrast this with Java/.NET packaging – from the very beginning, the packaging is well defined. The linking is always dynamic (at least for .NET, not sure for Java) which on the surface might seem too crude but in reality simplifies things – less choice is sometimes better (you can still use ILMerge if you wish). The visibility of symbols is explicit both on the VM level and via keywords in each of the languages that compile into IL. “Linking” is always directly to the final DLL (there are no separate artifacts to use because all the metadata you need is directly in the final image).

While this is better from the standpoint of less hassle when designing libraries, there is one much bigger advantage here that is often overlooked – when packaging is complicated, we tend to avoid the issue altogether. Sometimes this leads to monolithic designs because separating/refactoring into “components” is difficult, if for no other reason than packaging. When packaging is easy, this is a non-issue, which leads to better designs for everyone: library designer and library consumer.

This has been one of the top few features of .NET that I prefer over native/C++ development. The garbage collector is nice and some .NET/C# features like (asynchronous) delegates are cool, but packaging is the feature that saved me a lot of boring work while building my libraries, which left me more time to deal with real issues.

 

May
21.
2006

I completely missed this post from Martin Fowler that apparently caused quite a stir in the development community. But that is to be expected - for every person out there that criticizes language X there will be dozens to defend it.

In this case, Martin admits how he likes the fact that with some of the Ruby classes, most if not all of the methods one could frequently need are included. This is contrasted with Java, where most of the classes have a very minimal interface. The classic example: where Ruby gives you the method last for the Array class, Java requires you to use something like list.get(list.size() - 1).

Java is not the only OO language whose library follows this kind of approach – most of the highly regarded C++ libraries will look the same as will many C# libraries too. I must confess that I’ve been a subscriber to this philosophy too, but the blog post and the comments that followed made me think.

From whose point of view is the minimalistic interface better than the rich one? The class designer’s or the consumer’s? Who benefits more if the class has a richer interface? To me, the answer is obvious – unless there’s a lot of ambiguity about whether method A is better for the purpose than method B (in other words, as long as there is no unnecessary functional overlap, just more convenience methods), the richer interface is better for the consumer but harder for the implementer. My approach would be: implement the convenience methods as long as they add value, that is, as long as they are really frequently used, like the last method above.

Actually, if you have read Framework Design Guidelines… by Brad Abrams and Krzysztof Cwalina (and if you haven’t, I definitely recommend you buy it), you will notice something really interesting – even though the default “rule” for C# classes is generally to have relatively thin interfaces, class designers are encouraged to add extra convenience methods so that the most frequent case is supported with less code from the consumer. This is great practical advice even if it looks like the wrong thing to do, as it adds more methods to classes that would work perfectly fine without them. It is also roughly in favor of Ruby’s approach of having extra convenience functions.

Another interesting perspective is Herb Sutter’s chapter on std::string refactoring in his Exceptional C++ Style… book. He argues that std::string has too many members and that most of these can be pulled out. He then successfully refactors the class into a version with far fewer methods while keeping the functionality. At first, this might look like a point against Ruby’s approach, until you realize that the functionality is kept only by externalizing the methods into free functions. If anything, I would argue that C++ std::string does not have enough methods, externalized or not, but that’s just me rambling.
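To make the "externalized convenience" idea concrete, here is a minimal sketch of two convenience functions written as free functions on top of the public interface of standard containers. The function names last and ends_with are chosen for illustration; this is not code from Sutter's book, only the general shape of the technique.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Convenience written as a free function: it needs nothing but the
// public interface of the container, so the class itself stays small.
template <typename T>
const T& last(const std::vector<T>& items) {
    assert(!items.empty());          // an empty list has no last element
    return items[items.size() - 1];  // the "get(size - 1)" idiom, written once
}

// Same idea for strings: true if `text` ends with `suffix`.
bool ends_with(const std::string& text, const std::string& suffix) {
    if (suffix.size() > text.size()) {
        return false;
    }
    return text.compare(text.size() - suffix.size(), suffix.size(), suffix) == 0;
}
```

The consumer gets to write last(items) or ends_with(name, ".xml") either way; the only difference from the Ruby style is whether the convenience lives inside the class or next to it.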

I don’t think there’s anything wrong with having convenience methods in your classes, that is, methods that can be expressed in terms of other methods of your class. You just have to be careful not to go overboard and implement every conceivable combination. If you follow the reasoning that Abrams and Cwalina present in their book you can’t go wrong.

We are developing for humans too, not just for the computer/compiler.

Apr
27.
2006

While developing my (BitTorrent based) P2P product, I stumbled upon a technique that should help shape the network traffic so that the application does not slow down overall Internet access (more than it does by taking its share of throughput) or jeopardize other network services. It is a well known fact that P2P traffic can be quite heavy and sometimes even bring network equipment to its knees. Thus any effort that could help shape the traffic should be welcomed.

Most of the other BitTorrent based P2P applications decided to use the Type of Service (TOS) field of the IP header. That seems to work well according to reports from individuals who were able to use it to regulate their networks, so I decided to implement this in my app too. It’s only a single call to set this, once for incoming sockets and once for outgoing.

But nothing is as it seems – by default, the TOS cannot be set on any post-Windows NT4 machine. Before anyone starts talking about Microsoft’s secret evil plans, it’s nothing of that sort – it turns out Microsoft is following the newer “spec” (I put it in quotes because we are talking about RFCs here) that obsoletes the usage of TOS bits. More details can be found in support article Q248611.
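For illustration, here is a minimal sketch of that single call using the POSIX sockets API; it is not the code from my app, and the helper name set_type_of_service is made up. The Winsock call has the same shape but takes the option value as a const char pointer. Which TOS value to use is an assumption here; IPTOS_THROUGHPUT is just one of the classic constants.

```cpp
#include <netinet/in.h>   // IPPROTO_IP, IP_TOS
#include <netinet/ip.h>   // IPTOS_THROUGHPUT and the other classic TOS values
#include <sys/socket.h>   // setsockopt

// Marks an already-created IPv4 socket with the given TOS byte.
// Returns true on success; on failure errno holds the reason.
// (Hypothetical helper name, POSIX sketch only.)
bool set_type_of_service(int socket_fd, int tos) {
    return setsockopt(socket_fd, IPPROTO_IP, IP_TOS, &tos, sizeof(tos)) == 0;
}
```

It would be called once on each outgoing socket before connecting and once on each accepted socket, for example set_type_of_service(fd, IPTOS_THROUGHPUT).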

That said, we all know that specs are one thing and reality another. Setting TOS does in fact work on today’s equipment (again, I don’t have hard data but people with no reason to lie are reporting success on several P2P forums) and it would be nice if one could set it.

Not surprisingly, Microsoft does provide a workaround – there is a registry setting that does the trick and it is detailed in the support article I linked above. Note that you must restart your machine before the setting will be effective. Unfortunately, this also means that users of my app will have to do the same.

Feb
21.
2006

While trying to track the delivery of my new router I noticed I could install a local application provided by the French post (called e-Como) that should notify me in real time when my package moves toward me.

e-Como did not work too well in the end (I have my package and it still thinks that it's about 200 km away from here) but that's not what I wanted to point out.

The authors of e-Como made a classic mistake - they assumed I was running French Windows, and while trying to add the app to the Startup group they hard-coded the French name of the Startup group into the path (see the picture above). There are more places like this in Windows where the name of a system folder is localized.

Don't do this. There is a documented way to get this information - all you need to do is call the SHGetSpecialFolderPath API. This function lives in shell32.dll version 4.71 and higher - in other words, everyone with Internet Explorer 4.0 and higher should have it. Pass CSIDL_STARTUP to it and you'll get the desired folder location.
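A minimal sketch of that call follows. The wrapper name get_startup_folder is made up for illustration, the ANSI variant of the API is used to keep the example short, and the non-Windows branch exists only so the sketch compiles elsewhere (it simply reports failure).

```cpp
#include <string>

#ifdef _WIN32
#include <windows.h>
#include <shlobj.h>   // SHGetSpecialFolderPathA, CSIDL_STARTUP (link with shell32.lib)
#endif

// Fills `path` with the current user's Startup folder, whatever its
// localized name is. Returns false if the lookup fails.
// (Hypothetical wrapper; only the API call itself is the documented part.)
bool get_startup_folder(std::string& path) {
#ifdef _WIN32
    char buffer[MAX_PATH] = {};
    // hwnd = NULL (no owner window), fCreate = FALSE (do not create the folder)
    if (!SHGetSpecialFolderPathA(NULL, buffer, CSIDL_STARTUP, FALSE)) {
        return false;
    }
    path = buffer;
    return true;
#else
    (void)path;       // the API is Windows-only; nothing to query elsewhere
    return false;
#endif
}
```

An installer would then append its shortcut name to the returned path instead of a hard-coded folder name.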

Jan
10.
2006

Continuing the topic of my earlier post, Ben responds at the bottom of the post:

Drazen wrote a good response to me in Quick, which is better suited for XML transformation: C++ or XSLT? It is nicely presented, but I've got to say it ultimately only makes the point here stronger. It is significant that he uses "EXSLT" to get the power function he needed. Though he tried to preemptively defend this decision, it is an example of how you are at the mercy of your XSLT component to implement an extension, and it stops my MXSML 4.0 test from working. His new XSL is half the lines and if I remove the power function (and ignore the incorrect line numbers) it runs as fast as my C++ example. But it looks like the line numbering in his solution still depends on a certain number of elements as I gather from the 6 in math:power(6, count($recurse)), whereas my C++ example can have any number of col elements in any number of column elements. Sure he can fix that too, but the fact that someone skilled with XSLT can improve this particular stylesheet does not prove the value of XSLT for this problem.

Ben is right on almost all counts - I did make an oversight and implemented the solution assuming that the number of child elements of column is the same for all column elements (the original post has been modified to mention this and provide the fix). I don't agree that using EXSLT is unfair - it's like complaining that if you used hash_map in your C++ app you are relying on an extension (even though all vendors provide this).

Nevertheless, let's see what the problem really is here. The stylesheet is more complicated and required the power function only to work around an inherent design issue of XSLT as a language without side effects on variables - once set, you can't change a variable's value, but that's exactly what I need to implement a counter. If the problem did not require the result rows to be numbered, things would have worked out perfectly.

But let's see what I can do to fix the stylesheet as is, using something other than power and not using EXSLT. Due to the wrong assumption about the number of child nodes of the column element, I can't use power anyway. What I need to do is compute the product of the numbers of child elements of the nodes following the current node. In order to do so, I'll change the line that computes the increment to this:

<xsl:variable name="increment">
  <xsl:call-template name="product">
    <xsl:with-param name="nodes" select="$recurse"/>
  </xsl:call-template>
</xsl:variable>

But where is the mysterious product template? It's in a separate XSLT file - it's just a utility function that computes the product of the values of some nodes, where “value” is defined in terms of a template you provide. Think of templates here as functions that call each other to compute something. Here's the definition of product:

<xsl:template name="product">
  <xsl:param name="nodes"/>
  <xsl:param name="result" select="1"/>
  <xsl:choose>
    <xsl:when test="not($nodes)">
      <xsl:value-of select="$result"/>
    </xsl:when>
    <xsl:otherwise>
      <xsl:variable name="value">
        <xsl:call-template name="get-node-value">
          <xsl:with-param name="node" select="$nodes[1]"/>
        </xsl:call-template>
      </xsl:variable>
      <xsl:call-template name="product">
        <xsl:with-param name="nodes" select="$nodes[position() != 1]"/>
        <xsl:with-param name="result" select="$result * $value"/>
      </xsl:call-template>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

Aha, it's getting complicated, you might say. True, but again think of this as a general purpose utility function that is reusable for other purposes as well and can be included in any stylesheet. Otherwise I could dismiss Ben's C++ solution for using a custom string implementation instead of CString :)  Note the get-node-value template it depends on - that's the “value” of a node I just talked about. In this case the template boils down to this (it should be replaced with a different implementation for each distinct purpose):

<xsl:template name="get-node-value">
  <xsl:param name="node"/>
  <xsl:value-of select="count($node/*)"/>
</xsl:template>
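For readers more at home in C++, the two templates above amount to a recursion with an accumulator. Here is a rough equivalent as a sketch; the vector of child counts stands in for the node set plus get-node-value, and the parameter names from and result are chosen for illustration.

```cpp
#include <cstddef>
#include <vector>

// child_counts[i] plays the role of get-node-value for the i-th node in $nodes.
// `from` is the index of the first node not yet consumed, `result` the accumulator.
std::size_t product(const std::vector<std::size_t>& child_counts,
                    std::size_t from = 0, std::size_t result = 1) {
    if (from == child_counts.size()) {
        return result;   // corresponds to xsl:when test="not($nodes)"
    }
    // corresponds to xsl:otherwise: multiply in the first node, recurse on the rest
    return product(child_counts, from + 1, result * child_counts[from]);
}
```

With three following columns of 6 children each this yields 216, which is what math:power(6, 3) produced in the original stylesheet; with unequal counts it still gives the right increment.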

The resulting complete template is only 6-7 lines longer than the original (not counting the utility template product). But it's still complicated for the beginner, and Ben is right - simple problems like this should have a simple solution. I wouldn't be telling you all this if I didn't have one, would I? :)) Let's look again at the real problem here - all I need is a simple counter. The only reason I bothered with power and product was to avoid counting. But there's another way to avoid counting too: a two-pass solution. Don't worry - it is as performant as the original solution and a few lines shorter (complete source presented for completeness):

<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:ms="urn:schemas-microsoft-com:xslt">
  <xsl:output method="text" omit-xml-declaration="yes" encoding="iso-8859-1"/>
  <xsl:template match="columns">
    <xsl:variable name="lines">
      <xsl:apply-templates select="column[1]"/> 
    </xsl:variable>
    <xsl:for-each select="ms:node-set($lines)/line">
      <xsl:value-of select="concat(position(), .)"/>
    </xsl:for-each>
  </xsl:template>
  <xsl:template match="column">
    <xsl:param name="running" select="''"/>
    <xsl:variable name="recurse" select="following-sibling::*"/>
    <xsl:for-each select="col">
      <xsl:variable name="current_running" select="concat($running, ' ', .)"/>
      <xsl:if test="$recurse">
        <xsl:apply-templates select="$recurse[1]">
          <xsl:with-param name="running" select="$current_running"/>
        </xsl:apply-templates>
      </xsl:if>
      <xsl:if test="not($recurse)">
        <xsl:element name="line">
          <xsl:value-of select="concat($current_running, '&#13;&#10;')"/>
        </xsl:element>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>

First I create a temporary node set, and then output it using the position of the nodes as a counter. I know - another extension (node-set). Hey, it works with MSXSL 4.0 so at least Ben shouldn't complain ;) Seriously, the only reason I am using the extension is that it is practically a mandatory extension for every decent XSLT processor, as it rectifies what I consider a flaw in the XSLT 1.0 spec. The proof is in the XSLT 2.0 spec - if you remove the extension the stylesheet will work as-is with any 2.0 conformant XSLT processor (tried with Saxon).

I don't think this example proves that one should avoid XSLT. What is obvious though is that the mental shift required for an average developer when going from C++ to XSLT might indeed be too big. If you can't figure XSLT out, by all means use CMarkup or a similar library. But don't do it just because XSLT is different - there's value there too. If we all skipped technologies and languages we are not familiar with, we'd code most of the Web apps in assembler instead of PHP, Ruby or Python :)

Jan
9.
2006

I'm sure you've seen this one before somewhere - it came to me while reading some Java book, of all possible places. One of the cooler uses of macros in C++ is to define a “forever” loop, like this:

#define ever ;;
// ...
for(ever) { /* code */ }

It helps the readability of code - we should all agree by now that reading code is at least as important as writing it.

Dec
29.
2005

I thought that this debate was over some time ago, even if only by each camp staying entrenched in its own beliefs. Ben Bryant does not think so and writes the following in his Transformation Example: Apples to Oranges:

It is truly impressive that some developers have mastered a challenging technology such as XSLT which makes any moderately complex task nearly impossible. But is it really worth it? It is funny to me when someone being helpful on Microsoft XSL newsgroup produces a big long stylesheet solution that appears absolutely cryptic.

So according to Ben, XSLT is ill suited for a task like the one described in his post. I suggest you first go and check it out - it's not a long read and boils down to this - if you want to solve a certain class of problems, CMarkup (Ben's product) is the way to go.

Naturally I had to try to prove him wrong ;) I can't blame Ben for advertising his own product, but this is a poor choice of a problem - it did not look like something an XSLT interpreter/compiler would waste 20 seconds on (using his test case - 6 elements each with 6 child elements, resulting file size about 1.7MB). I first tried the XSLT he described as cryptic (and to a certain degree I agree that it is not very readable). It did not take longer than two seconds on a reasonably fast (or slow) Pentium-M (600MHz when idle, briefly spiking to 1.7GHz while transforming).

Then I went on to produce my own XSLT, at first thinking I might need XSLT 2.0, but it turned out there's nothing here that can't be solved with XSLT 1.0. You could argue that I cheated because I used EXSLT, but that's not really fair - for some reason the basic set of math functions in XSLT does not include power; at the same time, EXSLT is an industry standard way of extending XSLT that is present on practically every platform and implementation. Besides, I only used EXSLT for the power function and nothing else. My console test runner was nxslt, which itself uses the .NET Framework 2.0 compiled XSLT transformer. But first, the stylesheet:

<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:math="http://exslt.org/math">
  <xsl:output method="text" omit-xml-declaration="yes" encoding="iso-8859-1"/>
  <xsl:template match="columns">
    <xsl:apply-templates select="column[1]"/>
  </xsl:template>
  <xsl:template match="column">
    <xsl:param name="running" select="''"/>
    <xsl:param name="index" select="1"/>
    <xsl:variable name="recurse" select="following-sibling::*"/>
    <xsl:variable name="increment" select="math:power(6, count($recurse))"/>
    <xsl:for-each select="col">
      <xsl:variable name="current_running" select="concat($running, ' ', .)"/>
      <xsl:variable name="current_index" select="$index + (position() - 1) * $increment"/>
      <xsl:if test="$recurse">
        <xsl:apply-templates select="$recurse[1]">
          <xsl:with-param name="running" select="$current_running"/>
          <xsl:with-param name="index" select="$current_index"/>
        </xsl:apply-templates>
      </xsl:if>
      <xsl:if test="not($recurse)">
        <xsl:value-of select="concat($current_index, $current_running, '&#13;&#10;')"/>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>

[Edit: as Ben points out in his response, I hard-coded the number of elements to 6; this is easily fixed by changing 6 to count(*) but the fact remains - I assumed the same number of child elements for every column element] Of course, if you don't know XSLT this will still look cryptic to you. But if you do understand basic XSLT it should make perfect sense; it is very short and runs in under a second, including the compilation of the stylesheet. It generates 6^6 = 46,656 rows (a list of all combinations [no repeats] of nodes) - about 1.7MB of text, exactly in line with his own test.

Should I conclude from this that XSLT is better than (or at least equal to) a C++ based solution (CMarkup)? Definitely not. If the transformation involved more processing that was highly iterative and best suited for a language that allows changes to variables (like C#, C++ or Java), I am sure that using purely XSLT would yield a really ugly solution whereas a CMarkup (C++) based one would still look nice and clean. In fact if I wanted I'd probably come up with a better use case for CMarkup where C++ would really shine ;)
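To show what "a language that allows changes to variables" buys for this particular problem, here is a minimal sketch of the same enumeration in plain C++ with a mutable counter. It is not Ben's CMarkup code: the input is assumed to be already parsed into a vector of columns, and the names Column, emit_rows and numbered_combinations are made up for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

using Column = std::vector<std::string>;

// Appends one numbered line per combination, taking one value from each column.
static void emit_rows(const std::vector<Column>& columns, std::size_t depth,
                      const std::string& running, std::size_t& counter,
                      std::vector<std::string>& lines) {
    if (depth == columns.size()) {
        ++counter;  // the mutable counter that XSLT variables do not allow
        lines.push_back(std::to_string(counter) + running);
        return;
    }
    for (const std::string& value : columns[depth]) {
        emit_rows(columns, depth + 1, running + " " + value, counter, lines);
    }
}

// Returns lines such as "1 a x", "2 a y", ... for all combinations.
std::vector<std::string> numbered_combinations(const std::vector<Column>& columns) {
    std::vector<std::string> lines;
    std::size_t counter = 0;
    if (!columns.empty()) {
        emit_rows(columns, 0, "", counter, lines);
    }
    return lines;
}
```

Because the counter is simply incremented at each leaf, the numbering needs no power or product arithmetic and works for columns with differing numbers of children.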

Moral of the story is (again) use the right tool for the right job. If all you have is a hammer, don't look at every problem as if it is a nail :)

Dec
29.
2005

In a thought provoking post I See Markup, Part 5 - Not Chasing Standards Ben Bryant complains about the extra work that is needed for developers to support various XML standards.

On the other hand, he should probably be thankful because it's precisely because of this that he has a sustainable business :) Ben is the author of CMarkup, a lightweight C++ parser for XML that satisfies a niche. Those that want a fast solution that is trivial to integrate into their C++ code base probably should be happy with his product. But things are not as simple as they look - for example, Ben claims that:

But I never saw the business case for XSLT 1.0 let alone 2.0. To me, solutions formed around XML transformation by style sheet and even validation by DTD/Schema are "artificial". I call them artificial, because they create work rather than solving a problem. They are unnecessary detours because programmers would otherwise transform XML for viewing in more direct ways and validate their data in more natural and reliable ways, using tools they are already familiar with.

The truth is as usual in the eye of the beholder. For some of Ben's users, his product is God-given; they don't have to learn anything new. But this is precisely the reason why it might be dangerous to look into the world of XML through the CMarkup colored glasses - you might miss that there's a lot more out there than you think.

Here's an example - one of the tasks I had to implement once was to transform an XML representation of a formula-like structure into text. The developer who worked on the task before me had already put in place a C++ SAX based parsing solution, but at the time he did it the XML was quite simple and this worked fine. In the meantime the requirements changed and the new structure was quite a bit more complicated - note that there wasn't much data, but its structure was involved. After struggling with C++ for a while, I noticed that the problem could be solved almost trivially with XSLT, so I did. I am sure that CMarkup would make the solution simpler than the C++ SAX solution I had to start with, but just because XSLT is not well suited for certain transformation cases does not mean it is useless for others.

Yet another example is reading XML documents without a schema from C++. I am sure that you can get through even malformed XML with CMarkup very fast, but what would you do with it in case many elements contained optional attributes and sub-elements with default values? Litter the code with a zillion ifs to check if something is there or not and then populate defaults? How would you check for deeper structural dependencies between elements and attributes? Ah, but you wouldn't - your XML documents would be trivially simple (in structure!) anyway.

CMarkup is a great tool for a specific purpose, but might be horribly inadequate for another. While it helps those who have simply structured XML documents to deal with and just want to get the job done, it also supports the lazy among us developers who cannot be bothered with specs. This can sometimes result in a developer thinking that “this whole XML thing” is really simple and starting to produce ill-formed and invalid XML. Here's an example of that - what is the hardest part of building an RSS/Atom reader? It's not following the standards of the RSS/Atom markup, it's struggling with all kinds of ill-formed markup that content producers spew out. If we all paid just a bit more attention to “boring” specs, interoperability (which everyone seems to agree is a good thing) would be so much easier.

In conclusion, I don't think that the fact that CMarkup exists (and is quite useful in a special niche) invalidates the specs for any XML technology. Right tool for the right job - sometimes this will mean (a bit) more work for the developer.
