Every once in a while, I run into a problem with collections. I have some operation that I need to do with all of the elements, but I need to treat the first or the last case differently.
Typically when I see see people attempting to do this, they handle the special case:
if(words.length > 0) {
result += words[0];
}
and then loop over the remaining elements:
for(int n = 1; n < words.length; n++) {
result += ", " + words[n];
}
That's one way of doing it but it's a pain, and it does make you wish for blocks and open classes so that you can write this sort of code once, add it to your collection classes and then never have to deal with it again.. but I do most of my programming in old school languages so sometimes I do something like the following:
for(int n = 0; n < words.length; n++) {
result += (n == 0 ? "" : ", ") + words[n];
}
You may not want to do this if you're working on something that is performance critical, but it is a handy little idiom.
In python it's:
",".join(words)
It only puts the join in between items, not at the beginning or end.
Otherwise:
char *prefix = "";
for(int n=0; nwords.length; n++) {
result = result + prefix + words[n];
prefix = ", ";
}
is better provided the assignment to prefix is cheaper than the comparison and assignment. Quite often, the trick is to eliminate decisions, rather than to make them smaller inline.
And of course, one can avoid a lot of reallocating by not appending to a string, but by reserving space in a stream or preallocating size. Sometimes the brk() that allocates memory is the thing that slows you down, and reallocations often involve extra copying that makes it worse.
so if you can reserve a string buffer, and then use the prefix trick, you often have a much better result. I think that's what the string.join() example is doing in python, too.
Posted by: Tim | July 19, 2007 at 11:02 AM
Tim,
That's a cool trick. It looks like it trades some clarity for speed. And, if speed doesn't matter in a particular case, maybe being explicit about the condition (check for 0 if 0 is special or check for n-1 if the last element is special) could be a good tradeoff.
That's something I fret about a bit. We know that optimization only matters when it matters, and that it is better to do it when we discover that it matters. But, it seems that there's this line where code (like my example) looks like it is willfully inefficient, and it cuts across the grain a bit. It seems that we (as an industry) are willing to buy Michael Jackson's mantra about optimization, but only up to a point.
All said, I do wish collections could have special methods for this, like your join example.
Michael Feathers
Posted by: Michael Feathers | July 22, 2007 at 04:27 AM
Try this:
http://diditwith.net/PermaLink,guid,a1a76478-03d2-428f-9db6-9cf4e300ea0f.aspx
Posted by: Joseph Gutierrez | July 23, 2007 at 06:58 PM
In Java, I'm often working with iterators, which don't have a numeric index or 'isFirst' test. So I often just check the length of the 'result' object, and append a comma when needed:
final Collection words = Arrays.asList("one", "two", "three");
final StringBuilder result = new StringBuilder();
for (final String word : words) {
if (result.length() > 0)
result.append(", ");
result.append(word);
}
assertEquals("one, two, three", result.toString());
Now, for *extra credit* ;-> one can get tricky:
assertEquals("one, two, three", toStringSansBrackets(words));
private static CharSequence toStringSansBrackets(final Collection values) {
final String nativeListString = values.toString();
return nativeListString.subSequence(1, nativeListString.length() - 1);
}
Posted by: Jeff Grigg | September 22, 2007 at 03:23 PM
In Haskell, to efficiently intersperse a string ", " between elements of an array, one can use:
commas :: Array Int String -> String
commas words = foldr1 with (elems words)
where with a b = a ++ (pre b)
pre = foldr (\c f -> (c:).f) id ", "
The idea is that pre is evaluated only once into a function that prepends ", ".
In a less clear pointless style, which just composes the different functions:
commas = foldr1 with . elems
where with = (flip (.)) pre . (++)
pre = foldr ((.) . (:)) id ", "
Posted by: Chris K | October 09, 2007 at 12:27 PM