= Finding Functions =

When creating a workflow, users need a way to find functions that are available
on their machine.  This is managed by two main classes, FunctionLibrary and
FunctionSearch.  FunctionLibrary is in charge of tracking a set of
modules/packages that a user would like to search. FunctionSearch allows the
user to search for functions that match a particular string.  The typical usage
is as follows:

{{{
    from blockcanvas.function_tools.function_library import FunctionLibrary
    from blockcanvas.function_tools.function_search import FunctionSearch

    # Find the functions in the os module and the xml package.
    library = FunctionLibrary(modules=['os','xml'])

    # Search all the functions found by the library for the pattern 'esc'
    search = FunctionSearch(all_functions=library.functions)
    search.search_term = 'esc'

    # Print any results found.
    print(search.search_results)
    #[(xml.sax.saxutils, escape), (xml.sax.saxutils, unescape)]
}}}

When FunctionLibrary searches a module, it returns only the functions found
directly in that module.  If it searches a package, it does a recursive search
for functions in all the sub-packages and modules of the package.  In most
cases, the FunctionLibrary is careful not to import any packages or modules.
Instead it searches the code in modules for function definitions.  This allows
the user to specify a very large number of functions to search (potentially the
entire python path) without having imported all the modules into memory.  The
exception to this rule is when an extension module is specified directly in
the library modules list.  The only way to find the functions within the
extension module is to import it.  Extension modules inside a package
specified in the modules list are *not* loaded.
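
As a rough illustration of searching code without importing it, the sketch
below parses module source text with the standard `ast` module and collects
the top-level function definitions.  The helper name
`find_functions_in_source` is made up for this example, and the real
search_package.py may work quite differently.

```python
import ast


def find_functions_in_source(source):
    """Return the names of the top-level functions defined in source text.

    The source is only parsed, never executed, so nothing gets imported.
    """
    tree = ast.parse(source)
    return [node.name for node in tree.body
            if isinstance(node, ast.FunctionDef)]


example_source = '''
import os

def escape(data):
    return data

def unescape(data):
    return data

class Helper(object):
    def method(self):
        pass
'''

# Methods inside classes are skipped; only module-level functions are listed.
names = find_functions_in_source(example_source)
```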

FunctionLibrary and FunctionSearch use very little extra infrastructure, relying
only on search_package.py in the function_tools package.  This is intentional.
The functionality offered by these classes is fairly general and can be used by
other applications.  Their small footprint improves the chances of this happening.

FunctionLibrary and FunctionSearch are de-coupled intentionally.  One produces a
list of functions, the other sifts through a list of functions.  This simplifies
their design and also allows them to potentially deal with lists of functions
generated by other objects.  If it is important to couple them together for a
specific application, it can be done by synchronizing traits or potentially
using a sub-class.

= FunctionLibrary =


FunctionLibrary has the following external interface:

{{{
class FunctionLibrary(HasTraits):

    # List of strings such as foo.bar that specifies which modules to search.
    modules = List

    # List of the functions found in the specified modules/packages.
    # Each function is represented by an object with 'name' and 'module'
    # attributes.
    functions = List

    # A factory function for generating items in the function lists.  The
    # only requirement is that it takes 'module' and 'name' as keyword args.
    function_factory = Callable(BasicFunction)

    def refresh_module(self, module=None):
        "Reparse the module, specified as a string, looking for functions."
}}}

In most cases, users only care about the 'modules' and 'functions' traits.
When modules or any of its items are changed, the 'functions' list
automatically updates.  For each function found, a BasicFunction object
is created by default.  Its interface is extremely simple, holding the
name of the function and the name of the module it is in.

{{{
class BasicFunction(HasTraits):

    # The name of the module/package where the function lives.
    module = Str

    # The name of the function.
    name = Str
}}}

This is all the information needed for uniquely identifying a python function.
Keeping this specification small minimizes dependencies.  In the case where
someone wants a fancier representation of a function (perhaps something that can
retrieve its doc_string, code, etc.), they can set the 'function_factory' trait
to this class, and it will be used instead of 'BasicFunction'.  This is only
needed if you want the FunctionLibrary to manage these objects for you.  Otherwise,
you can adapt the simple function representation to your needs externally.  I
lean toward the 2nd use.
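
For the first use, a fancier factory might look something like the sketch
below.  `DocumentedFunction` is a hypothetical class, written here in plain
Python rather than with traits so that it runs on its own.  The only thing it
relies on from the text above is that the factory is called with 'module' and
'name' keyword arguments.

```python
import importlib
import inspect


class DocumentedFunction(object):
    """Hypothetical richer replacement for BasicFunction."""

    def __init__(self, module='', name=''):
        self.module = module
        self.name = name

    @property
    def doc_string(self):
        # Import lazily, only when the documentation is actually requested.
        function = getattr(importlib.import_module(self.module), self.name)
        return inspect.getdoc(function) or ''


# Assumed usage with the library (not run here):
#     library = FunctionLibrary(modules=['json'],
#                               function_factory=DocumentedFunction)
item = DocumentedFunction(module='json', name='dumps')
```

Note that this version imports the module when the doc string is requested,
which gives up the "no imports" property for that one module.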

== Caching in FunctionLibrary ==

Searching modules and (particularly) packages can be time consuming.  Something
like the xml library takes a few tenths of a second to search.  While this isn't
huge, if you try and search something like the entire python library, it would
take quite a while.  Once the work is done for a module, you don't want to have
to redo it unless absolutely necessary.  FunctionLibrary keeps a simple internal
cache of the functions that were found for each entry in the modules list.
When the modules list changes, it uses this cache for rebuilding the functions
list.  The 'refresh_module' method can be used to force a refresh of the
cache for a particular module or for the entire set of modules.
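
The caching behavior described above can be sketched with a plain dictionary
keyed by module name.  `CachedFinder` and its `search` argument are
illustrative names only, not the actual FunctionLibrary internals.

```python
class CachedFinder(object):
    """Toy model of a per-module cache of search results."""

    def __init__(self, search):
        # 'search' is any callable mapping a module name to a list of functions.
        self._search = search
        self._cache = {}

    def functions_for(self, modules):
        """Build the full function list, searching only uncached modules."""
        found = []
        for module in modules:
            if module not in self._cache:
                self._cache[module] = self._search(module)
            found.extend(self._cache[module])
        return found

    def refresh_module(self, module=None):
        """Force a new search of one module, or of all cached modules."""
        targets = list(self._cache) if module is None else [module]
        for name in targets:
            self._cache[name] = self._search(name)


calls = []


def fake_search(module):
    # Stand-in for the expensive search; records how often it is called.
    calls.append(module)
    return [module + '.f']


finder = CachedFinder(fake_search)
first = finder.functions_for(['os', 'xml'])
second = finder.functions_for(['xml', 'os'])   # served from the cache
finder.refresh_module('xml')                   # forces one extra search
```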

== Caching between application runs ==

[Future work design discussion]

As mentioned above, it can take a lot of time to search a list of modules
and packages for functions.  This is typically done at start-up time for the
application.  Unless you use threads to do the update in the background, the
application can take an unpleasant amount of time to load.
One way around this is to save the FunctionLibrary, including its cache,
to disk when it changes so that the next time an application starts, it
can just read in the functions instead of having to parse them.

This hasn't been done yet.  While the basics aren't hard, making sure you
don't have any stale information could be.  To help prevent this, the
cache could be modified to store each file found in the module search
in the cache along with the functions in it and its modified date.  Now,
after reading in the cache, you can then do a stat on each file and only
refresh the ones that are out of date.
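
Since none of this exists yet, the following is only a sketch of the stat
check.  It assumes the cache stores a mapping of file path to the
modification time recorded when the file was parsed.  The name `stale_files`
is hypothetical.

```python
import os
import tempfile


def stale_files(cached_mtimes):
    """Return the paths whose on-disk modification time no longer matches.

    cached_mtimes maps a file path to the mtime recorded when it was parsed.
    Files that have disappeared are reported as stale as well.
    """
    stale = []
    for path, cached_mtime in cached_mtimes.items():
        try:
            current = os.stat(path).st_mtime
        except OSError:
            stale.append(path)
            continue
        if current != cached_mtime:
            stale.append(path)
    return stale


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'module.py')
    with open(path, 'w') as handle:
        handle.write('def foo():\n    pass\n')
    cache = {path: os.stat(path).st_mtime}
    fresh_result = stale_files(cache)        # nothing has changed yet
    os.utime(path, (0, 0))                   # simulate a modified file
    stale_detected = stale_files(cache) == [path]
```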

Currently, our start-up costs are not so bad for cp.rockphysics
(1 second or so).  For all of cp, it is about 40 seconds the first time, and
6 seconds or so for subsequent restarts (disk caching is helping on this, I
guess).  For now it isn't a critical problem (cp.rockphysics has most of
what we need to show, and 1 second is manageable).  If we end up needing
a larger set of functions in the app, caching will be important.

== Issues with FunctionLibrary ==

* I am nervous about search_package and probably more about pkgutil.  It seems
  to return different results depending on what has been imported.  Sometimes
  xml returns "simple paths", sometimes it returns 'eggified' path names.  I
  don't get it.  This seems to cause some problems, but the cp stuff does just
  fine...

* We don't have a mechanism in place to store signatures for extension functions
  or alternative signatures for python functions.  While this isn't a major
  short-term issue, it is something we will want to fix in the future.

= FunctionSearch =

FunctionSearch is used to search for functions that match a comma-delimited
search term. It provides options for filtering out functions whose name or
module (package) matches a particular pattern.  This helps limit the number of
unwanted functions that appear in the list from places such as test directories.
It also provides options on whether to search for the term in the function name,
the module name, or both.

The search term searches for matches anywhere within the name or module string.
Search terms that match at the beginning of the name are returned first in the
search_results.  The patterns specified for filters are more strict, matching
exactly.  They do allow wildcard matching using '*' to match any set of
characters.

The interface for FunctionSearch is below.  The search_results update any
time that the search_term, filters, etc. are changed.

{{{
class FunctionSearch(HasTraits):
    # The search term.  This pattern will match *anywhere* in the name or
    # module of a function (depending on how the search flags are set).
    search_term = Str

    # The search results in a form suitable for display.
    # List(CallableObject)
    search_results = List

    # The filters which are applied against the functions' names. Unlike
    # search_term, these filters match exactly -- not anywhere in the
    # searched text.
    name_filters = Str("_*, *test*")

    # The filters which are applied against the functions' package. Unlike
    # search_term, these filters match exactly -- not anywhere in the
    # searched text.
    module_filters = Str("*tests*, *retired*, *.setup")

    # Should we search for pattern matches in the function name?
    search_name = Bool(True)

    # Should we search for pattern matches in the function module?
    search_module = Bool(True)

    # List of objects that have module and name as string attributes.
    all_functions = List
}}}
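
The matching rules can be sketched in plain Python as follows.  This is not
the FunctionSearch implementation: the function names are made up, functions
are modeled as (module, name) tuples, and treating comma-separated search
terms as alternatives is an assumption.  The filter defaults are copied from
the interface above, and the '*' wildcard is handled by the standard
`fnmatch` module.

```python
from fnmatch import fnmatchcase


def split_terms(text):
    """Split a comma-delimited string into stripped, non-empty terms."""
    return [term.strip() for term in text.split(',') if term.strip()]


def is_filtered(value, filters):
    """Filters must match the whole string; '*' matches any characters."""
    return any(fnmatchcase(value, pattern) for pattern in split_terms(filters))


def search(functions, search_term, name_filters="_*, *test*",
           module_filters="*tests*, *retired*, *.setup"):
    """Return matching (module, name) pairs, name-prefix matches first."""
    terms = split_terms(search_term)
    starts, contains = [], []
    for module, name in functions:
        if is_filtered(name, name_filters) or is_filtered(module, module_filters):
            continue
        if any(name.startswith(term) for term in terms):
            starts.append((module, name))
        elif any(term in name or term in module for term in terms):
            contains.append((module, name))
    return starts + contains


functions = [
    ('xml.sax.saxutils', 'unescape'),
    ('xml.sax.saxutils', 'escape'),
    ('xml.sax.saxutils', '_private_escape'),   # dropped by the "_*" filter
    ('xml.tests.test_sax', 'escape_check'),    # dropped by the "*tests*" filter
]
results = search(functions, 'esc')
```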

= Search UI =

TBD


= Canvas Interactions =


== "Active" function box ==

When adding a new function to the canvas either through the function library
or through a right-click action, that function should become the "Active"
function box denoted by the fact that it is selected.

If a function is added to the canvas through additions to the code, it should
not change the active function box.



= The Execution Model =

The "Block Context" model divides a computational problem into two parts, the
execution instructions (block) and the data (context).  The "block" is
better described as an "execution model", so we'll use that term here.

In our architecture, the Python execution engine is used to run our execution
models.  This means that our execution models must be representable as a python
script.  This raises the question, "Why not use python code, python byte code,
or the abstract syntax tree of the code as the execution model?" These
representations turn out to be too low level.  In the following sections, we
discuss why and also lay out the architecture for our representation.


= Various Representation Options =

== Abstract Syntax Tree ==

The following Python code is the executable form of an execution model.

{{{
    from blah import bar
    from flop import foo as foo1

    # Here is a python function
    def foo(a,b):
        return a,b


    c,d = foo(1,2)
    e,f = bar(c,d)
    e,f = bar(e,f)
}}}

In our current architecture, there is a Block object that is one
representation of the execution model.  It is, however, a problem-specific
representation that doesn't contain all the information we need
when presenting a coherent execution model to the end user.  Its main
issues are that (1) it separates import statements from function call
statements so that they are not represented as a single unit and (2)
it doesn't hold any representation of the actual function objects that
are being called.  This is needed so that an application can present
the potential call options to the end users.


When building a script, you need (1) a way to represent a call to a function
with specific input arguments and output arguments, and (2) a way to specify
which functions the calls are actually calling. To do this, we have two separate
objects.  In this architecture, (1) is handled by FunctionCall objects and
(2) is handled by CallableObject.  This architecture can grow with a Group,
Mask, Loop, and other structures.


= Abstract Syntax Tree =

# Why don't we just use the abstract syntax tree?


= CallableObject =

Python has the notion of a "callable" which is anything that can be "called"
as if it was a function.  Any callable object should be usable as a function
in our execution model.  However, if we are to display information about
the inputs, outputs, documentation, etc. for a callable object, we need a
little more information than a raw python object provides.  For example, a
python function may return multiple items that we want to treat as separate
outputs from the function.  So, we need to define a structure that includes
all this information.


There are several classes of CallableObjects including standard python functions
loaded from a module, python functions locally defined in an execution model,
extension functions, classes, and callable objects.  While all of these
are important, our current efforts have focused on the first two.  They
represent the vast majority of the use cases we are considering, and are the
easiest to deal with since they are queryable for all the input/output information
that we need.



A FunctionCall would have a reference to a CallableObject in it.  There would be
multiple subclasses of this that would represent the different types of callable
objects we see.  The most common is a pure python function that exists in a
module on disk. The real driver for this is to handle a function that is local
to the execution model (block).  Here, we want the code to really be contained
within the object so that it is easily moved from one block to another.
ExtensionFunctions would be useful because they provide a place for us to
specify the input and output arguments of the function that can be bound to by a
FunctionCall.

Note: Do we really care to have the sub-classes, or should we have one
      uber object that is the callable??

CallableObject

    PythonFunction
    LocalPythonFunction
    ExtensionFunction

Here is the basic signature:

{{{
class CallableObject(HasTraits):

    # The package and module that this callable comes from.  The canonical
    # import statement that would import this object is:
    #     "from %(module)s import %(name)s"
    # If the function does not have a module, the string is empty.
    module = Str

    # The name of the callable object.
    name = Str

    # The full python path module+name for the object.  This is read-only.
    full_name = Property(Str)

    # List of the input arguments to the function
    inputs = List(InputArgument)

    # List of the output arguments of the function
    outputs = List(OutputArgument)

    # Documentation for the function if available
    doc_string = Str

    # The actual code for this callable.  If not available returns None.
    code = Str

    # Whether we are pointing at a valid callable
    is_valid = Bool

    # Is the code for this definition editable?
    editable = Bool(False)
}}}

I've added the editable trait so that a FunctionCall can determine if it
is wrapped around editable code or not.  This is "sort of" UI related, but
not really.  Keep an eye on whether it should really be here.


= FunctionCall =

Note: Some of the notions in this section are out of sync with the actual
specification right now because CallableObject has taken over some of its
duties.

BlockCanvas needs to inspect the name, inputs, and outputs of a python function
so that it can inform the user about how to call the function.  Also, it needs
to keep track of variables that the user has "bound" to the inputs and outputs
of the function.  This information is used in generating the "function call" in a
python script that is executed to actually carry out the block computations. Input
variables that have default values need to be treated differently than inputs
that don't when generating code.


{{{
    from blah import foo

#    # Here is a python function
#    def foo(a,b):
#        return a,b


    # Each of these would be represented as a FunctionCall, and they
    # both would refer to the same python function "foo".
    c,d = foo(1,2)
    e,f = foo(c,d)
}}}

So there is a single python function representing "foo" and there are multiple
FunctionCall objects that describe a call to "foo." All of these FunctionCall
objects refer to the same function.


The signature of FunctionCall looks like this:

{{{
class FunctionCall(HasTraits):

    # Name of the function
    name = Str

    # Name displayed for the function in a UI as well as in the call signature.
    # If this is changed, then the import statement will look like:
    # from python.path import name as label_name
    label_name = Str

    # List of the input variable names, bindings, and default values.
    inputs = List(InputVariable)

    # List of the output variable names and bindings
    outputs = List(OutputVariable)

    # Read-only string of python code that calls the function.
    call_signature = Property

    # FIXME: I BELIEVE THESE MAY ALL GO AWAY AND BE REPLACED
    #        BY A CallableObject
    # A dotted string that specifies the location in the python path where
    # this function can be imported from.
    python_path = Str

    # The absolute path to the file storing the function.
    file_path = Property

    # The code for this function including its signature line.
    code = Property

    # Indicates that something about the FunctionCall has changed.
    # Not sure if this is a boolean or event.
    # Not sure if it is needed either...
    dirty = Event
}}}

NOTE: I'm recently thinking that we should really factor the python_path
     and file_path information out into a separate object called a FunctionObject.
     We can have local FunctionObjects that reference code that is local to
     the canvas as well as FunctionObjects that reference a location on disk
     where you can find code.  I believe, with this separation, FunctionCall
     and FunctionObject classes could represent the execution model for our
     app. [fixme: More thought needed, but this seems like a good path].

We only store information in the FunctionCall object that is needed for the following:

    a) Specify where to import the function from if needed.
    b) Generate code to call the function with the given input and output values.
    c) Access the function for displaying/editing its code.
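
Items (a) and (b) can be sketched as two small string-building helpers.  The
helper names and the tuple layout used for inputs are assumptions made for
this example, as is the policy of omitting unbound default arguments and
falling back to the argument's own name for unbound required inputs.

```python
def import_statement(python_path, name, label_name=''):
    """Build the import line for a call; local functions need no import."""
    if not python_path:
        return ''
    statement = 'from %s import %s' % (python_path, name)
    if label_name and label_name != name:
        statement += ' as %s' % label_name
    return statement


def call_signature(label, inputs, outputs):
    """Build the line of code that calls the function.

    inputs is a list of (name, binding, has_default) tuples, where a binding
    of None means "unbound".  outputs is a list of bound variable names.
    """
    args = []
    for name, binding, has_default in inputs:
        if has_default:
            # Inputs with defaults are only written out when they are bound.
            if binding is not None:
                args.append('%s=%s' % (name, binding))
        else:
            # Required inputs always appear, falling back to their own name.
            args.append(binding if binding is not None else name)
    call = '%s(%s)' % (label, ', '.join(args))
    if outputs:
        call = '%s = %s' % (', '.join(outputs), call)
    return call


import_line = import_statement('somewhere', 'bar', 'foo')
call_line = call_signature(
    'foo',
    [('x', 'x', False), ('y', '3', True), ('z', None, True)],
    ['a', 'b'])
```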

Note that FunctionCall doesn't keep an actual reference to the python
function object, its code, or its AST in its current form.  This information isn't
needed for the FunctionCall to generate its code or allow the UI to be viewed.
Code is always persisted somewhere -- either in the block code or in a file on disk.
With references (python path, name, file path) to where the function lives, we can
always go retrieve it when a UI needs to retrieve/edit the code.

== FunctionCall Construction ==

There are three different ways we generally want to create FunctionCall
objects.  They are:


    1. From an existing python function.
    2. From an ast that contains a function call.
    3. From "scratch", i.e. creating an entirely new python function.

These are discussed in the following sections.  There are potentially other
needs for creating FunctionCall objects.  If they become critical to the application,
they should be added to the list.


=== Creating a FunctionCall object from a Python function ===

If a person drags a function from the function library onto the canvas, then
we'll be creating a FunctionCall object from an actual function object.  The
nice thing about this approach is that we can inspect a live python function
and learn everything about it -- all its arguments, its return values as
best we can tell, and most importantly its function location (file, etc.).  This
makes it very easy to find the code for editing, create the correct import
statements, etc.
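
A minimal sketch of inspecting a live function with the standard `inspect`
module is shown below.  `describe_function` and the dictionary it returns
are illustrative only, and `inspect.signature` requires Python 3.  Finding
the file location (e.g. with `inspect.getsourcefile`) and guessing at return
values are left out.

```python
import inspect


def describe_function(function):
    """Collect the name, module, inputs, and doc string of a live function."""
    inputs = []
    for parameter in inspect.signature(function).parameters.values():
        has_default = parameter.default is not inspect.Parameter.empty
        default = parameter.default if has_default else None
        inputs.append((parameter.name, default, has_default))
    return {
        'name': function.__name__,
        'module': function.__module__,
        'inputs': inputs,
        'doc_string': inspect.getdoc(function) or '',
    }


def foo(a, b=2):
    """Return both arguments."""
    return a, b


info = describe_function(foo)
```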

=== Creating a FunctionCall object from a function call string ===

Imagine we see a string that calls a function and returns outputs in a script:

{{{
    a,b = foo(x,y=3)
}}}

This string or its equivalent ast has a lot of information about a specific
function call of 'foo'.  From this, it is reasonable to expect that we can create
a FunctionCall object in the following way:

{{{
    function_call = FunctionCall.from_call_ast(call_ast)
}}}
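
To make concrete what the call string alone can tell us, here is a sketch
that pulls the function name, arguments, and outputs out of a single
statement using the standard `ast` module.  `parse_call` is a hypothetical
helper, not part of FunctionCall.  It assumes simple variable names as
outputs and a plain function name as the callee, and it uses `ast.unparse`,
which needs Python 3.9 or later.

```python
import ast


def parse_call(line):
    """Return (function name, positional args, keyword args, outputs)."""
    statement = ast.parse(line).body[0]
    if isinstance(statement, ast.Assign):
        target, call = statement.targets[0], statement.value
        if isinstance(target, ast.Tuple):
            outputs = [element.id for element in target.elts]
        else:
            outputs = [target.id]
    else:
        # A bare call such as "foo(x)" has no outputs.
        outputs, call = [], statement.value
    positional = [ast.unparse(arg) for arg in call.args]
    keywords = {kw.arg: ast.unparse(kw.value) for kw in call.keywords}
    return call.func.id, positional, keywords, outputs


parsed = parse_call('a,b = foo(x,y=3)')
```

Nothing in the result says where 'foo' lives or which default arguments were
left out of the call.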

However, to get a fully specified FunctionCall object, we need some more
information such as where 'foo' is located and what the actual input argument
list for 'foo' looks like.  It may actually have 4 default arguments that
weren't specified in this call.  If we have the location information about the
function, however, this second problem is solved because we can go find the
function and use our standard methods for finding its inputs. So, to get a more
fully specified FunctionCall object, we need something like:

{{{
    function_call = FunctionCall.from_call_ast(call_ast, function_location_info)
}}}

Now the question is, what should the function_location_info look like?  In the
case where we have the ast for the full script that had the foo call in it, we
can find the location from the import location specified in the script:

{{{
    from somewhere import foo
    ...
    a,b = foo(x,y=3)
}}}

or even the following where we have the function getting a new label:

{{{
    from somewhere import bar as foo
    ...
    a,b = foo(x,y=3)
}}}

Passing in the entire ast doesn't seem that useful.  It seems that it would
be helpful to have a pre-processing helper function that maps all function
imports specified in the script to locations and labels.  To make it handle
more cases that we care about, it would also need to find functions that
are defined locally in the script.

{{{
    def find_imports_and_functions(ast):
        """ Return a dictionary of function_name -> (import_location, label)
            mappings that can be used to look up the location and potential
            label remapping ('' if not used) for all the functions in a
            script.
        """
}}}

The result of this is passed in as the 'function_location_info' to the
FunctionCall object.  It will be important to differentiate between the
cases where a function isn't defined at all and where it is locally defined.
I think we can mark the location using the same 'Undefined' object that
we do in InputVariable to denote the lack of binding to identify this
situation.  An alternative that might be nicer is just using 'None' to
identify this case [I think this is better].  An empty string for
location means that it is local.  [Revisit this when we have more code
written.]
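
One possible shape for the helper, following the docstring above, is
sketched here with the standard `ast` module.  It only looks at top-level
`from ... import ...` statements and top-level function definitions.  Local
functions get an empty location, and a function that isn't defined anywhere
is simply missing from the dictionary, so `dict.get` returns None for it.
This is an illustration of the idea, not the final design.

```python
import ast


def find_imports_and_functions(tree):
    """Return a dictionary of function_name -> (import_location, label).

    The label is '' when the import has no "as" clause.  Functions defined
    locally in the script get an import_location of ''.
    """
    mapping = {}
    for node in tree.body:
        if isinstance(node, ast.ImportFrom):
            # Keep leading dots so relative imports are not mistaken for local.
            location = '.' * node.level + (node.module or '')
            for alias in node.names:
                mapping[alias.name] = (location, alias.asname or '')
        elif isinstance(node, ast.FunctionDef):
            mapping[node.name] = ('', '')
    return mapping


script = '''
from somewhere import bar as foo
from blah import baz

def local_func(a, b):
    return a, b

a, b = foo(1, y=3)
'''
locations = find_imports_and_functions(ast.parse(script))
```

Because the keys are the original function names, looking up a call such as
`foo(...)` also means checking the labels.  Keying the dictionary by label
instead is an equally plausible choice.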


This works well if the location info is specified and accurate, but how do
we handle the cases where this isn't true?  In case 1, no location
information is specified.  In case 2, the location information is
specified, but is inaccurate.  I believe the solution for each of these
cases is an *application* level issue, and it shouldn't be handled inside
the FunctionCall class. The following two sections discuss how it should
be handled.

==== Non-existent location information for a function ====

This should be handled at the application level.

If there isn't any information to specify the location of a function, the app
should try and find the function for the person.  It should first look in the
user's directory for a function with the same name.  It should then look in the
rockphysics library for a function with the same name.  If that fails, it should
have some kind of icon that marks the block (or line of code) with an
"undefined" flag.  If the user clicks to edit the code, we should ask if they
want to find the function or create a new local function.

==== Inaccurate location information for a function ====

This should be handled at the application level.

We should check that there is a module on the path (using pkgutil?) that
will find the function.  If we don't find it, again, we flag the function
with an icon somehow.  If the user clicks on the icon, they can either
search the file system for the function or decide to create a 'local'
function to specify it.

On the searching the file system side of things, we should do everything
we can to facilitate this process.  Google desktop integration to find
functions in .py files with that name would be very nice.  Also, if they
choose one of these items that isn't currently on the search path, we can
offer to add the directory to the search path.

Note: We may want to add a "valid_path" flag to FunctionCall.  I think
      this could be a cached property that just looks up whether the
      package path is found by pkgutil.
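
A sketch of such a check is below.  It uses `importlib.util.find_spec` from
the standard library rather than pkgutil, which is a substitution made for
this example, and `is_importable` is a hypothetical name.  Be aware that for
a dotted path, `find_spec` imports the parent packages in order to look up
the submodule.

```python
import importlib.util


def is_importable(python_path):
    """Return True if a module with this dotted path can be located."""
    try:
        return importlib.util.find_spec(python_path) is not None
    except (ImportError, ValueError):
        # Raised when a parent package is missing or the name is malformed.
        return False


found_json = is_importable('json')
found_missing = is_importable('no_such_package_xyz')
```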




=== Creating a FunctionCall object from scratch ===
TBD


== Dealing with Extension Functions ==

Extension functions are written in C or Fortran, and we don't have access
to the Python function object to view/edit their code or to inspect
their input/output variables.  This makes it difficult to specify a
FunctionCall (or FunctionObject) instance automatically from them.
Some functions, such as those generated by f2py, may have enough information
in their doc-string to be able to determine their input/output argument
types.  Most others will not.

In these cases, it would be nice to have a way for the end user to
specify the input/output information for an extension function and have
it stored as a description that can be used for interfacing with the
function.  One way of storing this would simply be a pure python function
that is a wrapper around the extension function.
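
As an example of the wrapper idea, the sketch below wraps `math.hypot`, which
is implemented in C, in a pure python function with named inputs and a named
output.  The wrapper can then be inspected like any other python function.
Using `math.hypot` as the stand-in extension function is just a convenient
choice for illustration.

```python
import inspect
import math


def hypot(x, y):
    """Return the Euclidean distance sqrt(x*x + y*y).

    Pure python wrapper that gives the C implementation of math.hypot an
    inspectable signature and a named output variable.
    """
    distance = math.hypot(x, y)
    return distance


# The wrapper's inputs are now available through normal introspection.
input_names = list(inspect.signature(hypot).parameters)
```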


== Dealing with Callable objects that aren't functions ==
TBD.
This case hasn't come up much yet in our geophysics apps, so it is lower
priority.  However, I don't think it should be very hard to deal with.
We can use most of the same tools we have now to parse the __call__
method if it is a class object.  If it is something else (?), the
approaches to extension functions should work.



Where might functions "live" that a FunctionCall is referring to?

    1. Local to a block itself. file_path and python_path should be empty in this case...
    2. The standard python library.
    3. A library on the python path.
    4. [low] A python file not on the python path. [Should the UI inform the user that
       this won't execute right and inform them how to change their python
       path so that it should work?]

What can a user edit on a FunctionCall?

    1. Bindings on inputs and outputs.
    2. Label name.  If a user edits this, it does not change the underlying function,
       only the label in the UI and the import statement in the block.
    3. Function name.  This could mean two things.
        a) We want to use a different function.
        b) We want to rename this function to that name and make it a
           local function or a "user function."   This is more likely.
           May need an explicit rename() method.

    4. code.  If the code is changed on a "library" function, we need to
       either copy it to a local function to the canvas or to our user function
       directory and change our file/python path to point to the new function.
       If the function is already in an "editable" location, then we edit away.
       If other objects refer to the function, we need to ask whether this is
       a new version and change its name or if we want to affect all calls
       referring to this function.
    5. python path.  I'm not sure this could be edited through a "normal"
       means.  If it changes, I don't know that we do anything special.

The first two primarily affect how code is generated in the call signature. The
second two often result in the code being edited or moved and saved onto disk.

Authoring New Functions
-----------------------

Users will create new functions from scratch.  These should, by default,
be local functions.

Editing Code and code location.
------------------------------

When a function from a library module is edited, it becomes a local function
to the block.  We need to scan the code of the function and import any
undefined symbols from the library that it came from.  These will be
injected into the code right after the doc string.  This code is
copied into the top of the block code (after imports).
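
Scanning for undefined symbols could be sketched with the standard `ast`
module as shown below.  This is a deliberately simple version with made-up
helper names: it treats every assigned name and positional argument as
defined, ignores builtins, and does not handle nested scopes, keyword-only
arguments, or imports inside the function.

```python
import ast
import builtins


def undefined_names(function_source):
    """Return sorted names a function uses but does not define itself."""
    function = ast.parse(function_source).body[0]
    defined = {arg.arg for arg in function.args.args}
    used = set()
    for node in ast.walk(function):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                defined.add(node.id)
            else:
                used.add(node.id)
    return sorted(name for name in used
                  if name not in defined and not hasattr(builtins, name))


def import_line(module, names):
    """Build the import that would be injected after the doc string."""
    if not names:
        return ''
    return 'from %s import %s' % (module, ', '.join(names))


source = '''
def scaled_norm(values, scale=2.0):
    total = sqrt(sum(v * v for v in values))
    return total * scale * FACTOR
'''
missing = undefined_names(source)
injected = import_line('library', missing)   # 'library' is a placeholder name
```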

Users can "convert" either a library or a local function to a "user" function
that is a user editable version of the function that is also available for
use on other canvases.  This could be a right-click operation on a FunctionBox...
In this case, we have to manage creating a uniquely named file with this function
and modifying the FunctionCall to point at this.  We will also have to update
the code block to reflect the new import location [this need is common to any
change in python_path, name or label_name in the FunctionCall].

If we "convert" a function to a user function, we need to ask whether the
user wants all FunctionCalls on the canvas to "point" at this new function
or just the one that was converted, i.e.:

{{{
    from foo import bar

    a,b = bar(1,2)  # bar1
    c,d = bar(a,b)  # bar2
}}}

So if bar1 is requested to be converted to a user function, the user is prompted
about changing bar2 as well.  They can choose to "re-factor" all of the calls to
use the converted function, or convert only bar1.

There are a couple of ways we could handle this.  The first is:

{{{
    from user.bar import bar as bar1
    from foo import bar

    a,b = bar1(1,2)  # bar1
    c,d = bar(a,b)  # bar2
}}}

or:

{{{
    from user.bar import bar

    a,b = bar(1,2)  # bar1
    from foo import bar
    c,d = bar(a,b)  # bar2
}}}

The first has the benefit of "prettier" code, but it modifies names automatically.
The 2nd one keeps all the names the same, but it is uglier.  Katrina and I like the
first.  So, if there were multiple modules named the same thing, we generate
a unique label_name for the new user function.  This deals with it on the
canvas, but not on the disk.
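
Generating a unique label_name could be as simple as appending a number until
the name is free, which reproduces the 'bar1' label used in the example
above.  `unique_label` is a hypothetical helper and the numbering scheme is
an assumption.

```python
def unique_label(name, labels_in_use):
    """Return name, or name plus the first numeric suffix not yet in use."""
    if name not in labels_in_use:
        return name
    suffix = 1
    while '%s%d' % (name, suffix) in labels_in_use:
        suffix += 1
    return '%s%d' % (name, suffix)


label = unique_label('bar', {'bar'})
```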

[todo] Think through the logic on renaming on the canvas and on the disk when we
have naming conflicts.  Document what we do when a function is dropped on the
canvas and they are different functions but with the same name.

{{{
    from foo import bar
    from goo import bar

    a,b = bar(1,2)
    c,d = bar(a,b)
}}}

We need to force a label_name change on one of them here.  Think through
the same conflicts for writing functions to the user directory.

"Lost" Function definitions
---------------------------

What if we go looking for the code of a FunctionCall, and we end up with
errors trying to import/find its code representation?  In this case, we should
allow the user to search around and find the appropriate file.


Finding Functions based on path
-------------------------------

In the original FunctionCall code, we didn't have any code for "finding"
a function and loading it.  That was done by the FunctionObject code.  It
used import to find the functions and inspect to find the code for a function.
I believe this was problematic and failed in some cases so the FunctionDefinition
code used pkgutil which is supposed to be more robust (?) at doing this sort of
thing.

I don't believe FunctionCall needs the "loading" capabilities in it, but
it obviously needs to be done somewhere.

Where are FunctionCall objects created?
---------------------------------------

We have a function library that allows people to search for functions.
Currently, the search only occurs on function name and the package it is in.
Later we'd like to allow searching on the documentation, but that hasn't
been added yet.

None of this information needs a full FunctionCall definition, and also
the library is representing a "function" instead of a call to a function.
So, while we don't want a FunctionCall object in the function library, when
someone selects a function and puts it on the canvas, we need to create a
FunctionCall object.  Also, if a person creates a new function on the
canvas, then we'll need to have a FunctionCall object.

The other place where we have to add FunctionCalls is on changes to the
code block (loading a new block or people typing in the code view of the
block).  If we load in a new block, we will need to generate a FunctionCall
for each function call line in the code, i.e. for each of the functions
that are called in the block.  Also, whenever we edit a line in the code,
we will have to update a FunctionCall based on the changes to the code.

So, the 4 places to potentially create FunctionCall objects are:

    1) Adding a function to a canvas from the function library.
    2) Creating a new function on the canvas.
    3) Loading a new block and generating function calls for the
       lines in the code.
    4) Editing code in the block.
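
Item 3 can be sketched roughly as follows.  This is only an illustration under stated assumptions: CallRecord is a hypothetical stand-in for FunctionCall (the real class has more to it), and calls_from_block only looks at top-level lines of the form "outputs = func(inputs)".  It relies on ast.unparse, which needs Python 3.9 or later.

```python
import ast
from dataclasses import dataclass, field


@dataclass
class CallRecord:
    """Hypothetical stand-in for a FunctionCall; not the real class."""
    name: str
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)


def calls_from_block(code):
    """Build one CallRecord per top-level function call line in the code."""
    records = []
    for statement in ast.parse(code).body:
        if isinstance(statement, ast.Assign):
            call, targets = statement.value, statement.targets
        elif isinstance(statement, ast.Expr):
            call, targets = statement.value, []
        else:
            continue
        if not isinstance(call, ast.Call):
            continue  # e.g. "y = 3" is not a function call line
        outputs = []
        for target in targets:
            elements = target.elts if isinstance(target, ast.Tuple) else [target]
            outputs.extend(ast.unparse(element) for element in elements)
        inputs = [ast.unparse(arg) for arg in call.args]
        for keyword in call.keywords:
            value = ast.unparse(keyword.value)
            # keyword.arg is None for a "**mapping" argument.
            inputs.append(f"{keyword.arg}={value}" if keyword.arg else f"**{value}")
        records.append(CallRecord(ast.unparse(call.func), inputs, outputs))
    return records
```

Code that does not parse raises SyntaxError here, which is exactly the "unparsable state" that editing in the code view has to cope with.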

One other point to note.  We are talking about the canvas and the code
editor in these discussions, but it'd be better to actually talk about
editing the execution model from an arbitrary type of view.  For the moment,
we aren't that far along.

Editing existing blocks
-----------------------

Editing code (item 4) for an existing code block is actually the most
difficult problem to deal with.  Editing the code can change the names
of FunctionCall objects, change their bindings, or leave them in an
unparsable state.  If something as simple as the name changes, we have to
think about whether we now also replace the python_path and file_path with
something new (I think we do) and how we do this.


Dealing with this is the hardest part of the application.  This may inform
where we spend our work effort over the next few weeks.  We may want to
allow people to see the code, but potentially not edit it -- or at least
not sink tons of time into handling this robustly.  This assumes
we can get the canvas in good enough shape to do real world problems.

Changing a function name by editing the block
=============================================

If the name of a function is changed in the code editor, we should first
look and see if that name is available locally already in the block.  If
it is, we simply change the python_path and file_path to refer to that
location.  If it isn't in the local scope, but there is a unique version
in the "function library", then we use that location, assuming that things
in the function library are likely to be what the user wants.  If there are
multiple items, we could ask the library the most recently used, or we could
provide the list for the end user to choose from.  [This shouldn't be done
in a dialog immediately - the function should be flagged, and the user should
be asked about where it lives when they click on it.]  If the name isn't
available locally or in the function library, then we again flag it and let
the user either find its location for us (searching the file system) or tell
us that it is a new local function.
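
The lookup order described above can be sketched as a small function.  The name resolve_function, the status strings, and the shapes of the two arguments are all assumptions made for this illustration; none of them come from the existing code.

```python
def resolve_function(name, local_functions, library_functions):
    """Decide where a renamed function lives.

    local_functions:   dict mapping a function name to its location in the block.
    library_functions: list of (module, function_name) pairs from the library.

    Returns (status, candidates).  "ambiguous" and "unknown" mean the function
    should be flagged so the user can be asked later, not prompted immediately.
    """
    # 1. A name already available locally in the block wins.
    if name in local_functions:
        return "local", [local_functions[name]]
    # 2. Otherwise look for the name in the function library.
    matches = [module for module, function in library_functions if function == name]
    if len(matches) == 1:
        return "library", matches
    if matches:
        return "ambiguous", matches  # several candidates: let the user choose
    # 3. Not found anywhere: user locates it or declares a new local function.
    return "unknown", []
```

For the "ambiguous" case, the candidate list could be ordered by most recent use before it is shown to the user.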

= Function Search =

Users need to be able to search a list of available python functions
and place them on the canvas.  Visio's search mechanism for shapes that
you can place on a Visio canvas is a good analogy for this.

There are really two parts to the searching of functions.  The first is
to find/specify a "library" of functions to search.  The second is to
search these functions.

== Architecture Overview ==

{{{
    library = FunctionLibrary(modules)
    search = FunctionSearch()
    search.library = [PythonFunction(name=name, module=module) for (name, module) in library.functions]
}}}


FunctionLibrary:
    Manages a set of packages/modules and finds all functions that are defined
    within them.  Searching modules for the functions within them is time
    consuming for a large set of modules, so it is this class's job to handle
    caching of this information within an application session and also between
    application sessions.


FunctionSearch:
    Searches a list of functions using a specified search term.

search_package.py:
    Utility functions for traversing python modules and packages looking for
    functions within them.  Package searching is recursive, looking at all
    functions in any module underneath it.  Modules can be searched one of
    two ways.  The first is to import the module into python and search its
    namespace for all the functions found.  The 2nd way is to parse the
    code in the module looking for all the FuncDef nodes. [more to do]
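
The second, parse-based way of searching can be sketched as below.  This is not the contents of search_package.py; the function names are hypothetical, and the sketch uses the standard ast module, where the node type is called FunctionDef rather than FuncDef.  For brevity it treats every sub-directory as a sub-package without checking for an __init__.py.

```python
import ast
import os


def functions_in_source(source):
    """Names of the functions defined at the top level of a module's source."""
    return [node.name for node in ast.parse(source).body
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))]


def functions_in_package(directory, package_name):
    """Recursively collect (module_name, function_name) pairs under a package.

    Nothing is imported; each .py file is read and parsed as text.
    """
    found = []
    for root, dirs, files in os.walk(directory):
        dirs.sort()  # make the traversal order deterministic
        relative = os.path.relpath(root, directory)
        parts = [package_name]
        if relative != ".":
            parts += relative.split(os.sep)
        for filename in sorted(files):
            if not filename.endswith(".py"):
                continue
            stem = filename[:-3]
            # Functions in __init__.py belong to the package itself.
            module = ".".join(parts if stem == "__init__" else parts + [stem])
            with open(os.path.join(root, filename), encoding="utf-8") as handle:
                try:
                    names = functions_in_source(handle.read())
                except SyntaxError:
                    continue  # skip files that do not parse
            found.extend((module, name) for name in names)
    return found
```

Methods inside classes and nested functions are deliberately ignored here, since only top-level nodes are inspected.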

== Specifying the function library ==

We are designing a system that allows *any* python function (or callable really)
to be used on our canvas.  Thus, the library can conceivably contain any python
function.  So which functions should be included? We could conceivably have all
the functions on the PYTHONPATH included in the search.  The drawbacks to this
approach are that (1) it is currently slow for us to process that number of
modules and (2) we end up with a ton of functions available to the user that
they will not likely use. A third issue currently is that not *all* functions
may behave well in our system and it would be better not to expose those to the
casual user for now. Extension functions, for example, are not fully specified
with their inputs/outputs.

Our approach is to allow the user to specify a set of python modules and
python packages that they would like to include in the search.  These
modules and packages need to be on the PYTHONPATH to ensure the application
can find them.  Alternatively, a user can specify a file system directory that
will be searched.  This file system path should be on the PYTHONPATH.

Note that we need to manage the PYTHONPATH here if we are going to give the
user a nice experience setting up their library paths.  However, the PYTHONPATH
shouldn't be a local trait of the library.  Instead, the path should
be something that is managed at the application level for the entire application.
We should provide a unified UI for managing the library modules/packages and the
PYTHONPATH so that it is straightforward for the user to set up.  Note that this
doesn't necessarily mean that the PYTHONPATH is displayed in the Library
preferences dialog -- only that the dialog is aware of and helpful in managing the
path for the user.


== Finding Functions in a Module/Package ==

extension vs standard functions.
caching issues.

== Persistence of Search/Library ==


== Comments on old code ==

Persistence of the library and search preferences *should be done at the
application level*.  They should not be hard coded into the constructors
of our search/library classes.  These classes may have methods that
read/write data for persistence, but higher level objects should choose
what location and file to persist the data to.  Also, things like a
function library should not be a global singleton in a module.  This leads
to code that is hard to figure out and that lacks flexibility.

Generally, if you think you need a singleton, rethink your design.
*Sometimes* singletons are the correct solution, but it is very very rare.
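
To make the persistence point concrete, here is a minimal sketch.  LibrarySketch is a hypothetical class, not the real FunctionLibrary, and JSON is just an assumed format.  The class knows how to read and write itself, but the caller supplies the stream, so the location and file are chosen at the application level.

```python
import io
import json


class LibrarySketch:
    """Hypothetical library class; it holds no file paths and no global state."""

    def __init__(self, modules=None):
        self.modules = list(modules or [])

    def save(self, stream):
        """Write the library settings to a stream chosen by the caller."""
        json.dump({"modules": self.modules}, stream)

    @classmethod
    def load(cls, stream):
        """Create a library from a stream chosen by the caller."""
        return cls(json.load(stream)["modules"])


# The application decides where the data lives; here it is an in-memory buffer.
buffer = io.StringIO()
LibrarySketch(["os", "xml"]).save(buffer)
buffer.seek(0)
restored = LibrarySketch.load(buffer)
```

Because each instance is created explicitly, tests and different parts of the application can hold separate libraries instead of sharing one module-level singleton.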


Questions about the overall app function model
----------------------------------------------

How is a FunctionCall used in the rest of the application object model?
Is it fundamental to the underlying execution model?  Is it linked to a
block uuid?  Or, is it only the model for the canvas FunctionBoxes?  If
so, how are the code and block and FunctionCall object linked together?
What are the rules about how they interact and update one another?



= General =


The "block canvas", in its simplest form, is an application that allows
users to describe a set of computations that will be carried out on a set
of data.

Block canvas provides a way for end users to connect the inputs
and outputs of functions together so that data generated by one function
passes to the next.  We do this by allowing users to specify variables
that are the inputs to and outputs from functions.  We then use this
information to generate an "execution block" that expresses the
particular set of computations.  This execution block is converted to
a python script in which the functions are called with
the given inputs and outputs.  This script is executed in a "context"
or namespace so that all the calculated variables are available for
inspection after the execution has taken place.  The context is also
special in that it fires events any time that data changes from a
calculation or even an external source.  Plots and other displays
can listen to this context and update as the data changes.  Widgets
can also write to this context to change an input variable to the
calculations.  The python script then executes to update output
variables based on these new inputs.  In this respect, the script
is like any other plot or widget in that it simply listens/responds to,
and writes to variables in the context.
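
The idea of a context that fires events can be sketched with a small dictionary subclass.  NotifyingContext and its add_listener method are hypothetical names for this illustration, not the application's actual context class, and the "script" here is reduced to a single hand-written listener.

```python
class NotifyingContext(dict):
    """A namespace that tells its listeners whenever a variable changes."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._listeners = []

    def add_listener(self, callback):
        """Register callback(name, value) to be called on each change."""
        self._listeners.append(callback)

    def __setitem__(self, name, value):
        # Only fire when the value is new or different.  A plain != test is
        # an assumption that suits scalars; arrays would need more care.
        changed = name not in self or self[name] != value
        super().__setitem__(name, value)
        if changed:
            for callback in list(self._listeners):
                callback(name, value)


context = NotifyingContext(a=2, b=3)


def run_script(name, value):
    """Stands in for the generated script: recompute outputs from inputs."""
    if name in ("a", "b"):
        context["total"] = context["a"] + context["b"]


seen = []  # stands in for a plot or display listening to the context
context.add_listener(run_script)
context.add_listener(lambda name, value: seen.append((name, value)))

context["a"] = 10  # a widget writes a new input value
```

Here the script and the display are registered in exactly the same way, which is the sense in which the script is just another listener on the context.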

FunctionCall Tests
------------------

*. Ensure we handle functions in __main__ etc. correctly.
*. How do we allow a person to create a FunctionCall description
   (inputs, outputs) for an extension function?


Features
--------

    1. Creating new functions.
    2. Migrating functions from the standard library to being
       local on a canvas.
    3. Editing Functions to change their code.
    4. Function call signatures synchronized with editing of
       functions.
    5. Handling default values in functions correctly.
