
March 18, 2013



Should not 2.reduce(:+) return a partial application?

Michael Feathers

Malk: Can you say more about that?


Interesting idea that reminds me of a dilemma I was facing a little while ago in Ruby, to do with designing an API. I wanted to retrieve stored objects with zero or more named properties and zero or more conditions:

store.retrieve('Account', :name) #retrieve objects of type Account populating just the name field
store.retrieve('Account', :name, limit: 1) #retrieve first object of type Account populating just the name field
store.retrieve('Account', [:name, :last_contact_date], {tier: 'Gold', country: 'US'}) #retrieve US, Gold-tier objects of type Account populating name and last_contact_date fields

This feels very Rubyish: a single method with three parameters, the last two of which are optional (no field names means all fields, and no conditions means an unconstrained retrieval). Each optional parameter may be a single thing or a list of things.

Basically this doesn't work without some nasty code (if is_a? or if kind_of?) to distinguish whether an optional argument is a single value or a collection (and, in the case of a Hash, it's actually both). Asking around for opinions, the responses I got were mixed:

* Have different names for the methods to distinguish between a single field and multiple fields. Nicely intention-revealing, but a complex API rather than a single method.
* Always pass in a collection (e.g. [:name]). Simple code, but the intention is less clear (do I mean :name or a collection with :name in it?) and the onus is on the caller of the method to do extra work to make the method provider's life easier.
* Write a little language for the query so there is always one parameter, e.g. retrieve('Account', "Name, LastContactDate where Tier = 'Gold' and Country = 'US'"). Nice, but the point of the API is to make retrieval more declarative, and writing a robust parser would require more, and more complex, code than writing the API as envisaged with a bit of type checking.

Someone pointed out to me that Ruby's splat (*) operator always produces a collection, so the if is_a? Array code can be replaced with *property_names, which will always be a collection even if property_names is passed in as a single value. But it seems to me that a language where a thing is also a collection of itself (I'd go for a single value rather than an infinite stream, but suspect there are philosophical and practical problems with both) would make life a lot simpler. jQuery has some aspect of this.
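The splat trick mentioned above can be sketched as follows; normalize is an invented helper name, not part of the API under discussion:

```ruby
# [*x] always yields an Array, whether x is a scalar, an Array, or nil.
def normalize(property_names)
  [*property_names]
end

normalize(:name)            # => [:name]
normalize([:name, :date])   # => [:name, :date]
normalize(nil)              # => []
```

Note the Hash wrinkle mentioned above survives the trick: [*{tier: 'Gold'}] splats into an array of key/value pairs ([[:tier, "Gold"]]), so conditions still need separate handling.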


How about some kind of implicit conversion feature? You might have seen something like this in Scala, and type theorists have studied more powerful systems with dependent types under the name of "coercive subtyping". Here's the type theory view.

Basically, we add an inference rule:

    f : B -> C        x : A        c : A sub B
    -------------------------------------------
                 f(x) = f(c x) : C

And declare certain functions as coercions A sub B (from A to B).

So, if we want values to act like collections, we can declare (\x -> [x]) as a coercion and hence (sum 2) now works.

What about (map sum 2)? Do we want the coercion to be applied twice? And what about this coercion's interaction with other coercions? (We have answers to some of these questions.)

Does this translate to a dynamically typed OO setting? Maybe - it might boil down to "if we can't find method M in class/proto C or its ancestors then look in different class D"... That might be fun to try.
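The dynamic OO translation suggested above ("if we can't find method M on the scalar, look on the one-element collection") might be sketched in Ruby like this. This is a hedged illustration, not a recommended technique: reopening Numeric and relying on method_missing is fragile in real code.

```ruby
# If a number doesn't respond to a method, retry it on [self],
# i.e. treat the scalar as a one-element collection.
class Numeric
  def method_missing(name, *args, &block)
    wrapped = [self]
    if wrapped.respond_to?(name)
      wrapped.public_send(name, *args, &block)
    else
      super
    end
  end

  def respond_to_missing?(name, include_private = false)
    [self].respond_to?(name, include_private) || super
  end
end

2.reduce(:+)          # => 2 (delegated to [2].reduce(:+))
3.map { |x| x + 1 }   # => [4]
```

Note that under this dispatch 2.reduce(:+) returns 2 rather than a partial application, because it inherits Array's reduce semantics unchanged.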

Jens Schauder

It does remove an edge case, but it introduces another one.

The new edge case becomes obvious when you apply the trick to an object that is already a collection.

You have two rules:

x[0] returns x

x[0] returns the first element of x, which is != x

So you have just replaced one edge case with another.

If you disallow null and use Option, you actually remove an edge case from your code, or at least turn its handling into an explicit step.

James Manning

Two places I think of as having done this (at least to an extent):

- jQuery (and similar) selectors/objects, where $('#foo') will return a collection even though the underlying JS method would be getElementById. It's a good example that "finders" don't have to return a scalar that then necessitates a null check.
- PowerShell pipelines, although they have the same issue Jens mentions when you feed in a scalar that is enumerable, and on assignment a result will collapse to $null, a scalar, or a collection (necessitating the workaround of wrapping with @() when you want to ensure the result is a collection).

You could certainly construct programs that kept everything as enumerators/generators/etc., but how many places would go from "missing null check" to "missing empty check"? For LINQ usage, you could certainly replace your First(OrDefault)/Single(OrDefault) usages with Take(1) and see how things went from there. :)

My knee-jerk reaction is that you'd end up with code that's more difficult to reason about, even if it gained brevity from the removed null checks - however, given how much working code there is leveraging jQuery selectors, maybe that reaction is either wrong, or the cost is small enough to be worth the trade-off. :)


Re. the comment from Jens - the ambiguity only arises if your implicit rule is allowed to apply everywhere. That's why approaches like coercive subtyping only trigger when the value's type does not match the expected type - i.e. they use both the source type and the target type, not just the source type alone.
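The point above can be shown with a minimal sketch (coerce_to_array is an invented name): coerce only when the value's type does not already match the target type, so an Array passes through unchanged and the two conflicting rules never both apply.

```ruby
# Coerce on type mismatch only: the target type (Array) gates the rule.
def coerce_to_array(x)
  x.is_a?(Array) ? x : [x]
end

coerce_to_array(2)     # => [2]   (coercion fires)
coerce_to_array([2])   # => [2]   (already the target type; unchanged)
```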

Andrew Gibson

Perhaps there are two concerns here?
1. A caller-side concern: scalar vs. collection (regarding which your suggestions seem very sensible), and the disposition of the caller to the resulting value (is it an error for there to be no returned value?)
2. A callee-side concern: whether calling a method means "invoke function X on Y" or the more adventurous "send message X to Y" (a la Smalltalk). If the former, a function must exist for the operation to be valid; if the latter, a message receiver need only handle the message when appropriate.


Trying to say more about that:

Let's just consider, for the sake of generality, that '+' is a function, so I shall write + 1, 2, 3 instead of 1+2+3. My argument would be the same if '+' were an operator; it's just that I think the example is even more useful with functions.

so +(1) is just a partial application
let +1 = +(1)

if I understood you correctly

Collections should have the current behaviour of most languages
map (+1) [1,2,3] → [+1(1), +1(2), +1(3)] → [2, 3, 4]
map (+1) [3] → [+1(3)] → [4]
map (+1) [] → []

A scalar should behave the same way as a collection with a single element, except that the result should remain scalar
map (+1) 3 → +1(3) → 4

A null should behave like the empty collection
map (+1) ∅ → ∅

If that understanding is correct, I do wholeheartedly agree with you. In my eyes this would make the Maybe monad the default behaviour with normal language syntax. I think it is a sensible default; we could have specific syntax for when we want to get nitpicky.

Well if I follow that same logic for reduce, in my humble eyes:

[].reduce(:+) → []
∅.reduce(:+) → ∅

[1, 2, 3].reduce(:+) → +(+(1,2),3) → 6


[2].reduce(:+) → +(2)

Which is just a single argument passed to the + function, whose arity is greater than 1, so it is a partial application (like our +1 up there)

and hence

2.reduce(:+) → +(2)

which is the same thing
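These semantics could be prototyped in Ruby roughly as follows; map_like and reduce_like are invented names, nil stands in for ∅, and plain helpers are used rather than reopening core classes:

```ruby
def map_like(f, x)
  case x
  when nil   then nil                     # null behaves like the empty collection
  when Array then x.map { |e| f.call(e) }
  else            f.call(x)               # a scalar maps to a scalar
  end
end

def reduce_like(f, x)
  case x
  when nil then nil                       # nil.reduce(:+) -> nil
  when Array
    case x.size
    when 0 then []                        # [].reduce(:+) -> []
    when 1 then f.curry[x.first]          # [2].reduce(:+) -> +(2), a partial application
    else        x.reduce { |a, b| f.call(a, b) }
    end
  else f.curry[x]                         # 2.reduce(:+) -> +(2)
  end
end

plus = ->(a, b) { a + b }
map_like(->(v) { v + 1 }, [1, 2, 3])  # => [2, 3, 4]
map_like(->(v) { v + 1 }, 3)          # => 4
reduce_like(plus, [1, 2, 3])          # => 6
reduce_like(plus, 2).call(3)          # => 5 (the partial application, applied)
```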

What is your opinion on this?


My post was based on the "what" and the "how", and now I realize that maybe you just wanted the "why"!

So, to add a why:

For me, that choice respects the semantics of the applied function.

we could use a reduce with a starting element say

[1, 2].reduce(0, +) → +(+(0,1),2) → 3

Such a reduce is made to accept lists of a single element

We have chosen a reduce whose behaviour is to start with the first two elements of a collection and reduce from there, applying a function that needs at least two arguments but whose partial application is useful (as shown by your use of it in map)

So it makes sense, with that choice, to say "we only have part of our elements; let's use what we have, and the result will be a function that accepts the others"

when I think of it like that, one could also argue that

2.reduce(:+) becomes another way of saying

reduce :+ 2

and hence a partial application of reduce itself

the returned function would reduce anything passed to it as an argument, using the + function and 2 as its initial element.
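That alternative reading can be sketched with curried lambdas; reduce_from is an invented name for illustration:

```ruby
# 2.reduce(:+) read as a partial application of reduce itself:
# fix the operator and the initial element, return a function
# that awaits the remaining collection.
reduce_from = lambda do |op, seed|
  ->(xs) { xs.reduce(seed) { |a, b| op.call(a, b) } }
end

add_from_2 = reduce_from.call(->(a, b) { a + b }, 2)
add_from_2.call([3, 4])   # => 9  (2 + 3 + 4)
```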

I have mixed feelings about which choice would be the sanest default


Actually, if you want to, you can make Ruby do this pretty easily:

class Object
  include Enumerable

  def each
    yield self
  end

  def [](_)
    self
  end
end
However, I can't remember when I last wrote a method that could either take a singular object or an array of them (other than via varargs), so this would not benefit me much.


At first this sounded like a great idea - an isomorphism between 2 and the list [unit, unit]. You could use a foreach loop over a number and get a while loop! Then you kept talking, and it sounded worse and worse and worse.

This kind of "it's a type error to do such-and-such now, but we could make up a rule for what happens instead, and people would only have to remember the rule!" (e.g. "a number behaves like a singleton list of that number, except when it behaves like an infinite stream of that number" - how would you even distinguish 1 from 2 in this scenario?) violates the continuity and symmetries of a language.

There was a long, elaborate discussion in the logic community of "tonk", a connective with an introduction rule like Or's and an elimination rule like And's. The point is that there are symmetries and continuities in logic which are violated by adding tonk, and the general impression that, when you define a new thing, you can give it whatever rules you want is wrong.


J. B. Rainsberger

Programming is hard enough without complicating reasoning about types for the sake of minor convenience.

I would sooner see functions like map and reduce accept scalars and treat them as collections-of-one than see the entire system try to coerce scalars to collections-of-one. Look at the nonsense JavaScript does to coerce strings to numbers and the confusion that causes. I'd rather not encourage more of that.

gabriele renzi

From what I understand, R does this (scalars are not really scalars) without any macroscopic issue:

> a <- 1
> b <- c(1, 2, 3)
> class(a)
[1] "numeric"
> class(b)
[1] "numeric"
> a+b
[1] 2 3 4
> sum(a)
[1] 1
> sum(b)
[1] 6
> for(e in a) print(e)
[1] 1
> for(e in b) print(e)
[1] 1
[1] 2
[1] 3

Jason Stangroome

PowerShell v3 allows scalars to be treated as collections:

$foo = 7
$foo.Length # 1 element
$foo[0] # returns 7

And the reverse

$bar = @( [DateTime]::Now , [DateTime]::Today )
$bar.AddDays(1) # normally a method on the scalar, applies to all items in the array.

Peter Williams

Seems to work in Python

import numpy as np
a, b, c = 1, np.arange(2), np.arange(4).reshape((2, 2))
for x in (a*2, b*2, c*2): print(x)

2
[0 2]
[[0 2]
 [4 6]]


Old mathematicians' trick: "Singleton sets are identified with their sole element", or something like it, appears in the preface of many mathematics books...
