Wednesday, March 9, 2011

Passing Second-Class Arguments

Before its current incarnation, I was frustrated by my own, soon-to-be-renamed, Proxy object. [Mind you, I don't mean the upcoming ECMAScript Proxy object.] As a function manager, Proxy offers the benefit of passing secondary execution information to its managed routines. (These "extra" arguments permit comprehensive forking.) As with many function managers, this well-intentioned practice wound up restricting the arguments allowed in managed routines, resulting in a polluted signature.


Determining when a signature is polluted can be difficult, since such judgement depends on the code's intended use - not its explicit structure. For example, within an event-dispatch context, an argument signature is expected (if not required) by the handler routines, in order to properly process data from the dispatched event. On the other hand, I discovered that Proxy had a polluted signature only after using it with a non-conforming routine. Thus, it's the problem-domain which reveals whether the solution suffers from this type of symptom.

The Subtle Condition
To clarify the problem, let's invent a silly function manager. (For argument's sake, I'm reducing the concept of function "manager" to any routine that calls another using the apply or call methods.) Below is an example of such a routine - I've named it "Silly" - which sends secondary information (albeit trifling) to the function(s) it calls (or "manages").
function Silly(fnc) {
  return function () {
    var num = 5,
      rand = Math.random();
    return fnc.call(this, arguments[0], num, rand);
  }
}
You can see that Silly will invoke fnc, passing one given argument while adding two of its own values (num and rand). The example demonstrates an organic principle of JavaScript functions: argument order connotes primacy. Many function managers take advantage of this principle in order to distinguish "extra" from "core" arguments in their calls to external functions. For instance, below is line 870 in attribute.js, of YUI 3's Attribute Object (version 3.3.0).
retVal = setter.call(host, newVal, name);
The line above calls an external function which can (according to the YUI documentation) "manipulate the value which is stored for the attribute." In this case, newVal is the core argument - to be tested and manipulated. However, name is passed after the core argument. Again, this data is passed to support robust functionality, but is not central to what the setter does.

In and of itself, there is nothing inherently wrong or dangerous in either example. Since it is the problem-domain which reveals the issue, let's now use Silly to create two functions (or rather, augment two closures). One will reveal when our mandated signature becomes a polluted signature.
var stringify = Silly(
  function (x) {
    return x + '';
  }
);
stringify(5); // -> '5'
var sum = Silly(
  function sum(x, y) {
    return isNaN(x + y) ? 0 : x + y;
  }
);
sum(5, 20); // -> 10 (5 + num), not the expected 25
While the stringify function is fine, the sum function is now broken, or - at least - not working as designed. Because Silly only passes the first given argument, and follows it with its own values, it greatly impacts the output of its managed routines. Sure, that's not good, but how can we tell which function needs fixing? More to the point: How can a function pass non-critical arguments without impacting the arguments expected by the routines it invokes?
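To see exactly what a managed routine receives, here is a minimal sketch using the same Silly as above. The probe function is my own hypothetical helper, added only for illustration; it returns a copy of whatever arguments it was handed.

```javascript
// Silly as defined above: forwards only the first given argument,
// followed by its own two values.
function Silly(fnc) {
  return function () {
    var num = 5,
      rand = Math.random();
    return fnc.call(this, arguments[0], num, rand);
  };
}

// Hypothetical probe: returns a copy of the arguments it was given.
var probe = Silly(function () {
  return Array.prototype.slice.call(arguments);
});

var received = probe(5, 20);
// received[0] is the caller's 5, received[1] is Silly's num (5) and
// received[2] is Silly's rand; the caller's 20 never arrives.
```

The second given argument is silently dropped, which is why sum ends up adding 5 to num rather than to 20.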

Regarding the first question: were our example intended to work with any routine, then we could safely state that Silly has failed, and suffers from a polluted signature. However, were there documentation stating the limitations our routine placed on called functions, then our sum function would be the culprit... The line is blurry, because the structure of our code doesn't hint at either intention. Documentation (or worse, comments) is the last place we want to draw that line. We'd like it to be self-evident in the code's structure.

Clarifying Code Structure
The problem is that most managing functions attempt to communicate an argument's relevance with its order - an abuse of the order-equates-primacy principle. That is, the greater the argument's index, the less a called routine should consider it within its logic. This is how our Silly function works, along with earlier versions of Proxy, and (perhaps even) YUI's Attribute Object.

The truth is that there are no secondary arguments in a single arguments object. You might think of it as an overriding principle: all arguments should be useful. After all, arguments exist to impact the logic of a function. Otherwise, what are they arguing for (excuse the poor pun)? Order still communicates primacy, but within one arguments object, relevance is given by inclusion alone.

While trying to solve this same problem in Proxy - using order alone - I ran through a number of options. (Mind you, my final option was to document its proper usage, and move on.) I'll sum them up with our example function, below. None of these variations net a less polluted signature in Silly, because all continue to require that called functions use a specific signature. Nonetheless, I've listed them below.
// pass all given arguments as the first argument
function Silly(fnc) {
  return function () {
    var num = 5,
      rand = Math.random();
    return fnc.apply(this, [arguments, num, rand]);
  }
}
// pass the given arguments first, followed by the extras
function Silly(fnc) {
  return function () {
    var args = [].slice.call(arguments),
      num = 5,
      rand = Math.random();
    return fnc.apply(this, args.concat(num, rand));
  }
}
// pass the extras first, followed by the given arguments
function Silly(fnc) {
  return function () {
    var args = [].slice.call(arguments),
      num = 5,
      rand = Math.random();
    return fnc.apply(this, [num,rand].concat(args));
  }
}
In all cases, the code structure does not communicate the relevance of the "extra" arguments. In fact, once passed to a called function, the distinction between "core" and "extra" arguments (using order alone) is blurred. Thankfully, there's more to an arguments object than its values...
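As a rough illustration of that blurring, the sketch below reuses the variation that passes the given arguments first and the extras last, together with the earlier sum routine. The variable names full and short are mine, chosen only for this example.

```javascript
// Variation from above: the given arguments first, then the extras.
function Silly(fnc) {
  return function () {
    var args = [].slice.call(arguments),
      num = 5,
      rand = Math.random();
    return fnc.apply(this, args.concat(num, rand));
  };
}

var sum = Silly(function (x, y) {
  return isNaN(x + y) ? 0 : x + y;
});

// x = 5, y = 20: the extras land harmlessly at indexes 2 and 3.
var full = sum(5, 20);

// x = 5, y = num: an "extra" slides into a "core" position, so the
// NaN guard never fires.
var short = sum(5);
```

Called directly, the unmanaged routine would return 0 for a single argument. Once managed, it has no way of knowing that its second parameter came from Silly rather than from the caller.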

Using the Call-Stack
The call-stack is a temporal structure, accessible (oddly enough) via the arguments object itself. Accessed by a jungle of callee and caller properties, any function can reference its calling function (and that function's calling function, etc.) - all via the arguments object.

Each "step" through the call-stack goes backwards, to a function which is waiting to complete. Each stacked function (like any executing function) has its own arguments object (containing the values it was initially passed). The farther back one goes, the less relevance that stacked function would have on the logic of the one executing. Thus, the call-stack is a structure for identifying relevance.

By passing the non-critical data to an intermediate closure, and having that closure call the managed function, our example function can now work with any routine.
function Silly(fnc) {
  return function () {
    var that = this,
      args = arguments,
      num = 5,
      rand = Math.random();
    return (
      function () {
        return fnc.apply(that, args)
      }
    )(num, rand)
  }
}
Silly now passes all given arguments to the managed routine, which can access the tiered or second-class arguments as it sees fit. Of course, this insertion of secondary (or second-class) arguments in the call-stack will require extracting. Though our old sum routine would now work just fine, it could be recoded as follows, to include the second-class data in its logic:
var sum = Silly(
  function sum(x, y) {
    // one step back: the closure that called sum holds num and rand
    var sArgs = arguments.callee.caller.arguments;
    return isNaN(x + y) ? (sArgs[1] || sArgs[0]) : x + y;
  }
);
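As a quick check that this version of Silly no longer interferes with the signature, the earlier, unmodified stringify and sum routines can be run through it. This is only a sketch; the text and total variables are mine.

```javascript
// The closure-based Silly from above: num and rand go to an
// intermediate closure, while fnc receives exactly what was given.
function Silly(fnc) {
  return function () {
    var that = this,
      args = arguments,
      num = 5,
      rand = Math.random();
    return (function () {
      return fnc.apply(that, args);
    })(num, rand);
  };
}

var stringify = Silly(function (x) {
  return x + '';
});

var sum = Silly(function (x, y) {
  return isNaN(x + y) ? 0 : x + y;
});

var text = stringify(5);  // '5'
var total = sum(5, 20);   // 25, both given arguments arrive intact
```

Neither routine knows or cares that second-class data is sitting one frame back.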
Conclusions
By using the call-stack, non-critical, execution-only data can be passed to a function indirectly. The routines managed by functions that employ this technique will need to change their logic in order to access this second tier of data - or, second-class arguments. However, eliminating polluted signatures, and reducing an API's dependence on them, makes for flexible, clearly-structured code with a longer shelf-life.

I've done no performance research on the impact from unnecessarily increasing the call-stack in this manner. I am sure this technique does impact memory footprints and processing power, but suspect it's negligible, compared to the aforementioned gains.

Epilogue

Naturally, second-class arguments have vastly improved the usefulness and expanded the application of my own Proxy code. However, I quickly realized that the process of extracting this data was taxing and bothersome.

Thus, I defined a static method for Proxy, called getContext. The method takes a function's arguments object and handles all the long syntax needed to build an object literal containing the second-class data. So instead of Proxy users having to target index 1 from "arguments.callee...." in order to get the "alias" value, they can simply use this line within the functions given to Proxy.
var alias = Proxy.getContext(arguments).alias;
If you're a library author who uses managing functions, I highly recommend a similar convenience method if you employ this technique!
