* Exercises (AP, VT)
** Convert more tests to Python (easy)
Choose tests/tafkit/foo.chk, and turn it into tests/python/foo.py.  Take
inspiration from tests/python/derivation.py.  Try to be "dumb" in the
conversion: it should be easy to see that the original *.chk and the
final *.py are really alike.  Don't over-engineer something that hides the
tests.

** sum, star, etc. (harder)
We have

  star(automaton) -> automaton
  left_mult(weight, automaton) -> automaton
  right_mult(automaton, weight) -> automaton

that exist in the various APIs.  Do the same for ratexps:

  star(ratexp) -> ratexp
  left_mult(weight, ratexp) -> ratexp
  right_mult(ratexp, weight) -> ratexp

in dyn::, and in Python.  Write the corresponding tests in Python only: take
inspiration from tests/python/derivation.py to write a single *.py file for
this task.

** subsequence (or subword?), prefix, suffix, factor
From an automaton generate another that corresponds to the subsequences,
prefixes, suffixes and factors.

Do that in the static API (templated), then the dyn:: API, then the Python
API, then tests/.  Don't do TAF-Kit.

Eventually (when the ratexp ASTs have been upgraded; do not do that before,
as it would be a waste of time), provide the same routines for ratexps.
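
A possible shape for these constructions, sketched in Python on a toy NFA
representation (initial states, final states, transitions as triples; 'eps'
marks a spontaneous transition), not the vcsn API.  On a trim automaton
every state lies on an accepting path, so prefix = make every state final,
suffix = make every state initial, factor = both, subsequence = allow
skipping any letter:

```python
def prefix(aut):
    """All states of a trim automaton become final."""
    init, finals, trans, states = aut
    return (init, set(states), trans, states)

def suffix(aut):
    """All states of a trim automaton become initial."""
    init, finals, trans, states = aut
    return (set(states), finals, trans, states)

def factor(aut):
    return suffix(prefix(aut))

def subsequence(aut):
    """Add a spontaneous copy of each transition: letters may be skipped."""
    init, finals, trans, states = aut
    return (init, finals,
            trans | {(s, 'eps', d) for (s, l, d) in trans}, states)

def accepts(aut, word):
    """Naive NFA simulation, with closure over 'eps' transitions."""
    init, finals, trans, states = aut
    def closure(ss):
        todo, seen = list(ss), set(ss)
        while todo:
            s = todo.pop()
            for (src, l, dst) in trans:
                if src == s and l == 'eps' and dst not in seen:
                    seen.add(dst)
                    todo.append(dst)
        return seen
    cur = closure(init)
    for a in word:
        cur = closure({d for (s, l, d) in trans if s in cur and l == a})
    return bool(cur & finals)

# Trim automaton for the single word "ab".
ab = ({0}, {2}, {(0, 'a', 1), (1, 'b', 2)}, {0, 1, 2})
assert accepts(prefix(ab), "a") and not accepts(prefix(ab), "b")
assert accepts(suffix(ab), "b") and not accepts(suffix(ab), "a")
assert accepts(factor(ab), "b") and not accepts(factor(ab), "ba")
assert accepts(subsequence(ab), "b") and not accepts(subsequence(ab), "ba")
```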

* Build
** --enable-compilation-mode
Do not set GLIBCXX_DEBUG in speed mode.  In any case, we get segfaults when
both NDEBUG and GLIBCXX_DEBUG are defined.  It would be nice to find where,
and to send a bug report...

** Coverage
We need to add coverage tests on the BF.  From ADL:

  First, compile (and link) Spot with coverage enabled.

   % ./configure CXX='g++ --coverage'
   % make

  Then run the test suite (or any program you want to study).

   % make check

  Executing programs using Spot will generate a lot of *.gc* files
  everywhere.  Collect these using lcov:

   % lcov --capture --directory src --output spot.info

  Finally generate a coverage report in HTML:

   % genhtml --legend --demangle-cpp --output-directory html spot.info

  This should create the directory html/.

** -fsanitize=address
Try that on the BF.  For some reason, Libtool strips this flag at link time.
Use -Wc,-fsanitize=address.

AD can't get this to work on OS X (with Python).  Any help would be
appreciated.

* Tests
** operations on automata
Use ratexps as weights to check multiplication order.  As in checks for
standard(exp).

** proper
Check the case where we do not want to eliminate states.  Actually check
that "a.proper(False).accessible() == a.proper()".

* Bugs
** Parse error
We need to provide more accurate locations in error messages.  For instance,
create a dot file with an incorrect context definition, and see the failure.

Note that when printing an automaton with an empty alphabet, we write
'lal_char_b' which fails to parse correctly for lack of () (should be
'lal_char()_b').

** Open coded file names
We must use genuine temporary files for vcsn.py conversions.  When possible,
avoid files and use streams.

** shortest
Loops forever:

  $ VCSN_ORIGINS=1 vcsn standard -C 'lal_char(a)_z' -Ee 'a*+<-1>a*)' \
     | vcsn shortest -f-  1

We can check that we reach the same state, but can we have periodic
behaviors?

** expand
Beware of the missing parens when displaying weights in IPython:

  br = vcsn.context('lal_char(abc)_ratexpset<lal_char(xy)_b>')
  br.ratexp("(a+b+b)*").expand()
  => (a+\e+\eb)∗

which is actually

  => (a+(\e+\e)b)∗

** infiltration does not work because of the "new_transition" optimization
The "NEWS.txt" entry is false.

  $ vcsn derived-term -C 'lal_char(ab)_b' -e 'a' -o a.gv
  $ vcsn derived-term -C 'lal_char(ab)_ratexpset<lal_char(uv)_b>' \
                      -e '(<u>a+<v>b)*' -o ab.gv
  $ vcsn infiltration -f ab.gv a.gv | vcsn shortest -f - 4
  Assertion failed: (!has_transition(src, dst, l)), function new_transition, file ../../vcsn/core/mutable_automaton.hh, line 442.
  vcsn shortest: 1.1: syntax error, unexpected $end, expecting digraph
  zsh: abort      vcsn infiltration -f ab.gv a.gv |
  zsh: exit 1     vcsn shortest -f - 4

Fix this bug, but check that we don't ruin the performance of infiltration.
Uncomment the tests in infiltration.chk (there are two places).

** sum, union: relax the constraints on weightset
Now that our products support heterogeneous, but compatible, types, we
should propagate this to other n-ary algorithms.

** products: laziness
We need a lazy implementation of product.

** is-valid(automaton)
In the case of RW, check that the ratexp is valid.

** Generators and alphabets
ladybird "works" properly on "lal_char(ab)_b", although it needs abc.
Of course, output then fails.  Should ladybird enrich the alphabet?
Or use what's in there instead of always abc?

** debug compilation mode
crange should not feature size and empty if !VCSN_DEBUG.

** star-normal-form
It does not work for weighted ratexps.

** intersection
Check star-normal-form, standard, thompson.  Check the trivial identities.
What should be the definition of the B (breaking/splitting) operator?  I
have taken a definition modeled after that of concatenation (in order "not
to break too much") rather than something like expand.  What are the
expected features of the right definition?

* Benchmarks
** determinization
Experiment with determinization on (a^p)*+(a^q)*+(a^r)*, with p, q, r
primes.  Possible names: "Chrobak" (reference to Marek Chrobak), "cycles",
"gears".
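
A quick sanity check of the expected blow-up, as a Python sketch: over a
one-letter alphabet each of the three cycles advances deterministically, so
the subset construction reaches exactly lcm(p, q, r) = p*q*r subsets when
p, q, r are distinct primes.  This only counts states; it is not an actual
determinization:

```python
def det_states(p, q, r):
    """Reachable subsets for the unary NFA of (a^p)*+(a^q)*+(a^r)*.

    The subset reached after reading a^n is determined by the triple
    (n mod p, n mod q, n mod r), so iterate it until it repeats."""
    state, seen = (0, 0, 0), set()
    while state not in seen:
        seen.add(state)
        state = ((state[0] + 1) % p,
                 (state[1] + 1) % q,
                 (state[2] + 1) % r)
    return len(seen)

assert det_states(2, 3, 5) == 30    # = 2*3*5
assert det_states(3, 5, 7) == 105   # = 3*5*7
```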

** determinization
Introduce aaaaa+aaaab+...+bbbbb for length n.

** eps-removal
(1+a)^n

* Transducers
** tupleset::one/is_one
Currently tupleset does not define a one, even if all the members
do.  Adjust is-eps-acyclic.

* Refactoring
** Use Boost.Format to support multiple formats
We have loads of code that open-codes the output format.  This is
wrong.  We should rather have format strings such as

        "%s/%s" vs. "\\frac{%s}{%s}"

and use these format strings.  Instead of passing a format name, we would
actually pass a "format" object whose members are Boost.Formats.  It would
then be much easier to customize the pretty-printers.

Alternatively, we could also use a collection of functions/lambdas.  More
generic, and "more portable".  Note, however, that Boost.Format is
header-only.  A mixture of both might be the best: where they suffice,
formats _are_ superior.
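
The idea, sketched in Python with plain printf-style strings standing in
for Boost.Formats (the names and members are illustrative):

```python
class Format:
    """One output syntax = one bundle of format strings."""
    def __init__(self, fraction, weight):
        self.fraction = fraction
        self.weight = weight

text  = Format(fraction='%s/%s',          weight='<%s>')
latex = Format(fraction='\\frac{%s}{%s}', weight='\\langle %s\\rangle')

def print_fraction(fmt, num, den):
    # The printer receives a format object, not a format name.
    return fmt.fraction % (num, den)

assert print_fraction(text, 1, 2) == '1/2'
assert print_fraction(latex, 1, 2) == '\\frac{1}{2}'
```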

** Eliminate rat::exp (Antoine)
It does not seem to be useful, except for one thing: dyn::ratexpset uses a
shared_ptr to exp (which is what rat::exp_t is) as a means to handle
ratexps.  It would make much more sense for dyn::ratexpset to use
dyn::ratexp as an abstraction to typed ratexps, but this should be
thoroughly benchmarked to make sure it is not (much) costlier.

Getting rid of this will allow rat::node to use single inheritance.

** Make nullableset a labelset wrapper
One would say 'nullableset<letterset>'.  This would make it possible to
introduce 'nullableset<tupleset<lal, lal>>', which is currently impossible.

** Remove references to ratexp/ratexpset from context
On second thought, while it is definitely handy, it is not normal that
contexts depend on ratexpset.

** More documentation
The Doxygen-style documentation is way too poor.  Aim for completeness.

** We need to go from a labelset to its wordset
Or something like that.  I'm not sure "wordset" is the appropriate name;
however, polynomials, for instance, really need to go from label_t to
word_t, and to be able to deal with words.

** polynomialset
It seems that monomials would be a useful abstraction.

Why "weight" in add_weight?  In-place operations would be useful.

We need to hide the map, and expose a polynomial_t type:

    /// Construction from a list of monomials.
    ///
    /// Even without this constructor we can write:
    ///
    /// polynomialset{{l1, w1}, {l2, w2}}
    ///
    /// However, it is the laws of a map that apply, not those of a
    /// polynomial.  In particular (i) if a weight is zero, then the
    /// polynomial will contain such a monomial (which is forbidden),
    /// and (ii) if l1 and l2 are equal, the resulting weight will be
    /// w2 instead of w1+w2.
    polynomial(std::initializer_list<monomial_t> init)
    {
      for (const auto& m: init)
        add_weight(m);
    }

Likewise with assignments.  The problem came from derivation with
intersection:

      virtual void
      visit(const hadam_t& e)
      {
        e.head()->accept(*this);
        auto res = ratexp(res_);
        for (auto v: e.tail())
          {
            v->accept(*this);
            res = rs_.hadam(res, ratexp(res_));
          }
        res_ = {{res, ws_.one()}};
        apply_weights(e);
      }

here, res _can_ be \z, in which case we build a polynomial (\z -> 1), which
is forbidden (this should be the empty polynomial).

** Metadata
We need to keep metadata about automata, especially state names.  Then
update all the "VCSN_ORIGINS" uses.

** Dyn weights
Now that we have dynamic weights, they could be used when parsing, instead
of a simple string that is converted by the weightset.

* Optimizations
** accessible etc.
The extraction of the accessible subautomaton currently requires several
traversals of the automaton, although one would suffice.  This is because it
is heavily factored, using std::copy for instance.

Keep it factored, but for instance, introduce a std::copy that walks only
the accessible parts.
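
For instance (Python sketch on a toy representation, transitions as a dict
src -> [(label, dst), ...], none of it the vcsn API): one BFS that copies
as it discovers, instead of "compute accessible states, then copy":

```python
def accessible_copy(init, trans):
    """Copy only the accessible part of `trans`, in a single traversal."""
    copy, todo, seen = {}, list(init), set(init)
    while todo:
        s = todo.pop()
        copy[s] = trans.get(s, [])
        for (_, dst) in copy[s]:
            if dst not in seen:
                seen.add(dst)
                todo.append(dst)
    return copy

# State 2 is inaccessible from 0, so it is never even visited.
trans = {0: [('a', 1)], 1: [('b', 0)], 2: [('a', 0)]}
assert set(accessible_copy({0}, trans)) == {0, 1}
```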

** constant-term
To compute the constant-term of a ratexp, we compute that of its subratexps,
which will then be recomputed uselessly by derived-term.

** difference
We don't need the rhs to be complete: it suffices to adjust product to
generate a pseudo sink state each time we exit the rhs, and of course to
change the accepting states to be the non-accepting states of the rhs.

This would avoid the completion of the rhs, which might add many many
transitions.
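
A sketch of the intended behavior in Python, on toy deterministic automata
(triples (initial, finals, delta) with delta[(state, letter)] = state);
names are illustrative, not the vcsn API:

```python
SINK = 'sink'

def difference_accepts(lhs, rhs, word):
    """Does `word` belong to L(lhs) \\ L(rhs)?

    Instead of completing the rhs, fall into an implicit sink when it
    has no transition; the sink is accepting in the complement."""
    (li, lf, ld), (ri, rf, rd) = lhs, rhs
    l, r = li, ri
    for a in word:
        if (l, a) not in ld:
            return False
        l = ld[(l, a)]
        r = rd.get((r, a), SINK) if r != SINK else SINK
    return l in lf and (r == SINK or r not in rf)

lhs = (0, {0}, {(0, 'a'): 0})   # accepts a*
rhs = (0, {1}, {(0, 'a'): 1})   # accepts just "a"; incomplete
assert difference_accepts(lhs, rhs, "")
assert not difference_accepts(lhs, rhs, "a")
assert difference_accepts(lhs, rhs, "aa")  # the rhs fell into the sink
```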

** evaluate
Vectors of weights indexed by states are bad structures to iterate upon.  We
don't need to work on states that are not part of the computation.  This
shows in the "if (!ws_.is_zero(v1[s]))" in the code.
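
A sketch of the map-based alternative in Python (toy weighted automaton
over Z; delta maps (state, letter) to a list of (destination, weight)):

```python
def evaluate(init, final, delta, word):
    """Weight of `word`, keeping only nonzero entries in a dict keyed
    by state, so the inner loop never scans dead states."""
    v = dict(init)                       # state -> nonzero weight
    for a in word:
        v2 = {}
        for s, w in v.items():           # only live states
            for dst, k in delta.get((s, a), []):
                v2[dst] = v2.get(dst, 0) + w * k
        v = v2
    return sum(w * final.get(s, 0) for s, w in v.items())

# Automaton for <2>a: a single transition of weight 2.
delta = {(0, 'a'): [(1, 2)]}
assert evaluate({0: 1}, {1: 1}, delta, "a") == 2
assert evaluate({0: 1}, {1: 1}, delta, "aa") == 0
```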

** power
There are better means to compute the power in some cases, see
http://en.wikipedia.org/wiki/Addition-chain_exponentiation
Also, because the algo is written recursively, we are calling
accessible too often (on products, which are accessible, of course).
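
The simplest addition chain is the binary one (square-and-multiply):
O(log n) products instead of n - 1.  Python sketch, with plain integer
multiplication standing in for the automaton product:

```python
def power(x, n, mul, one):
    """x^n using square-and-multiply; `mul` is the binary product and
    `one` its neutral element."""
    res = one
    while n:
        if n & 1:
            res = mul(res, x)
        x = mul(x, x)
        n >>= 1
    return res

assert power(3, 5, lambda a, b: a * b, 1) == 243
assert power(2, 0, lambda a, b: a * b, 1) == 1
```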

** shortest, enumerate
Check that the data structures are really the best possible.  std::map
guarantees that, if there are no deletions, references and iterators remain
valid.  So we should store references or iterators in the working queue,
instead of duplicating the monomials.

* aut-to-exp
** More heuristics
See what V1 did.

** Incremental
Transform the current implementation of the "naive" heuristics into an
incremental one.  See what TAF-Kit.pdf B.1.4.1 says about it.

** Rename to a better name.
Drop "aut", because we have more input types than that.  Drop "to", as the
direction is clear enough.  Use "ratexp" for consistency with the name used
everywhere else.

* dyn::
** Implement implicit conversions
So that, for instance, we can run is-deterministic on a proper lan.

** n-ary product, shuffle and infiltration
Currently it works at TAF-Kit level.  It should be easy to do at dyn::
level, and we should accept 0-ary.

It would be nice to think about n-ary static too (variadic).
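
The dyn:: part reduces to folding the binary operation, with the 0-ary
case returning the neutral element.  Python sketch, with set intersection
standing in for product and a "universe" standing in for its unit:

```python
from functools import reduce

def nary_product(universe, *operands):
    """Fold the binary product; the 0-ary product is the unit."""
    return reduce(lambda a, b: a & b, operands, universe)

universe = {1, 2, 3, 4}
assert nary_product(universe) == universe               # 0-ary
assert nary_product(universe, {1, 2}) == {1, 2}         # 1-ary
assert nary_product(universe, {1, 2}, {2, 3}) == {2}    # binary
```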

* trivial identities
As a goal, we want every rational expression to yield the same result
in lal and law.  This is a failure:

  $ vcsn-cat -C 'lal_char_z' -e 'ab<3>' -E
  (a.b)<3>
  $ vcsn-cat -C 'law_char_z' -e 'ab<3>' -E
  <3>ab

Well, this will not change (and the note is kept only as a reminder that it
must be put in the doc at some point).  The trivial identities are really
_trivial_: they deal with constants (\z and \e), and weights.  They don't
deal with products for instance, which is the case here: it would require
c<k> -> <k>c where c is a concatenation of labels.

Note that TAF-Kit's documentation (Section 2.2.1) reports:

  Caveat: The definition of the identity Cat corresponds to what is actually
  implemented in Vaucanson 1.4 and is somehow a mistake. A more natural
  definition would be m<k> ⇒ <k>m with m any element of the monoid. This may
  be corrected in forthcoming revisions of Vaucanson 1.4 but should anyway
  be reevaluated in connection with the definition of the function
  derived-term for the weighthed automata.

but here 'm' just denotes 'monoid element', i.e., a label in V2 parlance.
In other words, for V2, x<k> ⇒ <k>x and m<k> ⇒ <k>m are exactly the same
thing: x is a label of LAL, and m is a label of LAW.

* I/O
** fado
I/O with words.  See the way they also _name_ states, for instance, read the
dl4.fado as generated below.

  $ vcsn ladybird -O fado 4 | \
    python -c "from FAdo import fa
  nfa = fa.readFromFile('/dev/stdin')[0]
  dfa = nfa.toDFA()
  fa.saveToFile('dl4.fado', dfa)"

** Grail
We should also be able to read Grail+ files.

** Forlan
See http://alleystoughton.us/forlan/.  In
http://alleystoughton.us/forlan/book.pdf, page 121:

  {states}
  A, B, C
  {start state}
  A
  {accepting states}
  A, C
  {transitions}
  A, 1 -> A; B, 11 -> B; C, 111 -> C;
  A, 0 -> B; A, 2 -> B;
  A, 0 -> C; A, 2 -> C;
  B, 0 -> C; B, 2 -> C

  Transitions that only differ in their right-hand states can be merged into
  single transition families. E.g., we can merge A, 0 -> B and A, 0 -> C into
  the transition family A, 0 -> B | C.

Note that "\e" is denoted "%".

** Vaucanson 1
We can easily read simple V1 automata thanks to its dot format.  However, we
must pay attention to more complex cases (e.g., rich weightsets).

However, we cannot easily feed V1 with automata from V2, which is
troublesome for benchmarks.  We can probably work out something simple by
using the "edit-automaton" input of V1: we generate a script that builds
the automaton.

** XML
At some point, someone should really work on the XML formalism.

* More algorithms
** are-isomorphic
Implement it.  Sources of inspiration: Vaucanson 1 (I have been told there
are two implementations there), Forlan.

** determinization
Look for other implementations (cf. "Five determinization algorithms").  And
pay attention to the case of large alphabets.

** minimization (LS)
Implement more minimization algorithms (Hopcroft, Revuz, Brzozowski...).
Make them work for trim automata.

We need generalizations of minimization for weighted automata.  See TAF-Kit
1 and quotient.

** variadic products
Our products (product, infiltration, shuffle) should provide variadic
versions.  dyn:: for a start, but a variadic template for products would be
very nice too.

And then convert "n-ary" tests in tests/python/product and infiltration.

** "check" algorithm
There should be a means to check that the invariants are satisfied.  A
separate algorithm would do.  In particular, check the alphabet, and that
the special letter labels the initial and final transitions, etc.

** Levenshtein automata
http://en.wikipedia.org/wiki/Levenshtein_automata
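
For reference, a small Python simulation of the classical Levenshtein NFA
for a word w and distance k: states are (position in w, errors used),
deletions are the spontaneous transitions.  A sketch only, not a proposed
vcsn interface:

```python
def lev_accepts(w, k, query):
    """Is `query` within edit distance k of w?"""
    def close(states):                          # deletions: skip a char of w
        todo, seen = list(states), set(states)
        while todo:
            i, e = todo.pop()
            if e < k and i < len(w) and (i + 1, e + 1) not in seen:
                seen.add((i + 1, e + 1))
                todo.append((i + 1, e + 1))
        return seen
    def step(states, c):
        nxt = set()
        for (i, e) in states:
            if i < len(w) and w[i] == c:
                nxt.add((i + 1, e))             # match
            if e < k:
                nxt.add((i, e + 1))             # insertion into w
                if i < len(w):
                    nxt.add((i + 1, e + 1))     # substitution
        return close(nxt)
    states = close({(0, 0)})
    for c in query:
        states = step(states, c)
    return any(i == len(w) for (i, e) in states)

assert lev_accepts("food", 1, "good")       # one substitution
assert lev_accepts("food", 1, "foods")      # one insertion
assert lev_accepts("food", 1, "foo")        # one deletion
assert not lev_accepts("food", 1, "goods")  # distance 2
```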

* edit-automaton
Currently it converts the \e in initial/final labels to the special-letter.
Is this what we want?

* vcsn/alphabets/char.cc
  char_letters::special_letter(...) is protected, but
  set_alpha<T>::add_letter(...) (in file vcsn/alphabets/setalpha.hh)
  needs it.

* mutable_automaton::set_transition
We should find a means to forbid transitions from pre to post.  This
was the case initially, but it is a useless constraint in aut-to-exp.
Maybe it should be enforced only in the non labels_are_unit case.

* automata: handle with shared_ptr
One would really like to have a transpose_automaton that is able to
build its underlying automaton.  This means that using a const& to
keep the original automaton is not the best model: pointers would be
better.  But then there are issues with memory tracking, issues that
we already know how to handle thanks to shared_ptr.

* compilation jit
Well, you know what I mean.

* move files around
The hierarchy and the namespaces do not match.

* Readings
About dyn/static bridge:

http://www.lrde.epita.fr/dload/papers/gcse00-yrw/olena.html
ls ~theo/pub/*ouil*

Local Variables:
coding: utf-8
fill-column: 76
ispell-dictionary: "american"
mode: outline
End:
