=head1 NAME

iPE::SequenceReader::NoLoad::Align - NoLoad version of the align format.

=head1 DESCRIPTION

This SequenceReader expects a file that begins with a FASTA-style header line: the '>' character followed by space-delimited sequence names.  The names are listed left to right in the same order as the sequences appear from top to bottom in the alignment.  Each sequence must occupy exactly one line in the file, and all sequences must be the same length (unaligned regions should be '.' and gapped regions should be '_').
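
As a minimal sketch of this layout, the following writes a small alignment file and checks that every row has the same length.  The sequence names, residues, and temporary file are made up for illustration and are not part of iPE.

```perl

    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    # Hypothetical alignment: names and residues are illustrative only.
    my @names = qw(human mouse rat);
    my @rows  = ('ACGT_ACGTA', 'ACGTTAC...', '..GT_ACGTA');

    my ($fh, $path) = tempfile(UNLINK => 1);
    binmode $fh;
    print {$fh} '>', join(' ', @names), "\n";   # header: names, top to bottom
    print {$fh} "$_\n" for @rows;               # one sequence per line
    close $fh or die "close failed: $!";

    # Every row must have the same length for fixed-offset reads to work.
    my %lengths    = map { length($_) => 1 } @rows;
    my $sameLength = (scalar(keys %lengths) == 1) ? 1 : 0;

    print "rows share one length: $sameLength\n";
    print "file size in bytes: ", -s $path, "\n";

```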

=head1 FUNCTIONS

=over 8

=cut
package iPE::SequenceReader::NoLoad::Align;

use Carp;
use iPE::Util::DNATools;
use base ("iPE::SequenceReader::NoLoad");
use strict;

=item new (memberHash)

Constructs a new alignment sequence reader.  This does not read the sequences into memory; it only computes the number of sequences, the line length, and the sequence length, which later allow portions of a sequence to be read directly from the file.
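
The arithmetic can be sketched on its own, assuming that the header length counts every byte of the header line (the '>' and the trailing newline included) and that every sequence line ends in a single newline.  The values below are made up for illustration.

```perl

    use strict;
    use warnings;

    # Assumed example: a header line followed by three 10-column
    # sequences, each terminated by "\n".
    my $header       = "human mouse rat";
    my $headerLength = length(">$header\n");
    my $fileSize     = $headerLength + 3 * (10 + 1);

    my @seqNames   = split ' ', $header;
    my $numSeqs    = scalar @seqNames;
    my $lineLength = ($fileSize - $headerLength) / $numSeqs;  # bytes per line
    my $length     = $lineLength - 1;                         # drop the newline

    print "numSeqs=$numSeqs lineLength=$lineLength length=$length\n";

```

If the file lacks a final newline or the lines differ in length, the division no longer yields the true line length, which is why the format requires equal-length lines.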

=cut
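sub new {
    my $class = shift;
    my $this = $class->SUPER::new(@_);

    # Sequence names are the space-delimited fields of the header.
    $this->{seqNames_} = [ split (' ', $this->header) ];
    $this->{numSeqs_} = scalar(@{$this->{seqNames_}});
    croak "No sequence names found in alignment header"
        if $this->{numSeqs_} == 0;

    # Every line has the same size, so the bytes after the header divide
    # evenly among the sequences.
    $this->{lineLength_} = ($this->fileSize - $this->headerLength)/
        $this->{numSeqs_};
    # The sequence itself excludes the trailing newline.
    $this->{length_} = $this->{lineLength_} - 1;

    return $this;
}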

=item length ()

Returns the length shared by all sequences in the alignment.  This assumes that all sequences are the same length.  If they are not, the sequence reader will not function correctly.

=cut
sub length      { shift->{length_}          }
# internal: length of each line (equivalent to length+1, for the newline)
sub lineLength  { shift->{lineLength_}      }

=item numSeqs (), seqNames ()

numSeqs () returns the number of sequences in this alignment.  This is equal to the size of the array reference returned by seqNames ().  seqNames () returns the names parsed from the space-delimited list on the first line of the file, after the FASTA header character '>'.

=cut
sub numSeqs     { shift->{numSeqs_}         }
sub seqNames    { shift->{seqNames_}        }

=item getSeq (start, length, seqNum), getRevSeq (start, length, seqNum)

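These read sequence directly from the file.  They work similarly to substr: start is a zero-based offset into the sequence and length is the number of characters to retrieve.  seqNum is the zero-based index of the sequence within the alignment; getSeq defaults it to 0.  getRevSeq interprets start relative to the reverse strand and returns the reverse complement of the requested region.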
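
The offset arithmetic can be sketched outside the class as follows.  The subs fetch and fetchRev are hypothetical stand-ins for getSeq and getRevSeq, the file contents are made up, and the tr-based complement is an assumption that may differ from what iPE::Util::DNATools does with characters such as '.' and '_'.

```perl

    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    # Illustrative two-sequence alignment.
    my $header = ">human mouse\n";
    my @rows   = ('AACCGGTTAC', 'AAC_GGTT..');

    my ($fh, $path) = tempfile(UNLINK => 1);
    binmode $fh;
    print {$fh} $header, map { "$_\n" } @rows;
    close $fh or die "close failed: $!";

    my $headerLength = length $header;
    my $lineLength   = length($rows[0]) + 1;   # sequence plus newline
    my $seqLength    = $lineLength - 1;

    open(my $in, '<', $path) or die "open failed: $!";
    binmode $in;

    # Stand-in for getSeq: seek to the row, then read $len bytes.
    sub fetch {
        my ($start, $len, $seqNum) = @_;
        my $ofs = $headerLength + $seqNum * $lineLength + $start;
        my $seq;
        sysseek($in, $ofs, 0) or die "sysseek failed: $!";
        defined(sysread($in, $seq, $len)) or die "sysread failed: $!";
        return $seq;
    }

    # Stand-in for getRevSeq: map the reverse-strand start onto the
    # forward strand, read, then reverse-complement.
    sub fetchRev {
        my ($start, $len, $seqNum) = @_;
        my $seq = scalar reverse fetch($seqLength - $start - $len, $len, $seqNum);
        $seq =~ tr/ACGTacgt/TGCAtgca/;
        return $seq;
    }

    my $fwd = fetch(2, 4, 0);      # columns 2..5 of the first sequence
    my $rev = fetchRev(0, 3, 0);   # first 3 bases of its reverse strand
    print "fwd=$fwd rev=$rev\n";

```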

=cut
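sub getSeq {
    my ($this, $start, $length, $seqNum) = @_;
    $seqNum = 0 if !defined $seqNum;

    # Byte offset: skip the header, then $seqNum whole lines, then $start.
    my $startOfs = $this->{headerLength_}+$seqNum*$this->{lineLength_}+$start;

    my $fh = $this->fh;
    my $seq;
    sysseek($fh, $startOfs, 0)
        or croak "Could not seek to offset $startOfs: $!";
    defined(sysread($fh, $seq, $length))
        or croak "Could not read $length bytes at offset $startOfs: $!";

    return $seq;
}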

sub getRevSeq {
    my ($this, $start, $length, $seqNum) = @_;

    # Convert the reverse-strand start into a forward-strand start.
    $start = $this->length-$start-$length;
    my $forwSeq = $this->getSeq($start, $length, $seqNum);
    my $revSeqRef = reverseComplement(\$forwSeq);
    return $$revSeqRef;
}

=back

=head1 SEE ALSO

L<iPE::SequenceReader::NoLoad>

=head1 AUTHOR

Bob Zimmermann (rpz@cse.wustl.edu).
(With much acknowledgement to Sam Gross's code).

=cut
1;
