% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/read.quitte.R
\name{read.quitte}
\alias{read.quitte}
\title{Read IAMC-style .csv or .xlsx files}
\usage{
read.quitte(
  file,
  sep = NULL,
  quote = "",
  na.strings = c("UNDF", "NA", "N/A", "n_a"),
  convert.periods = FALSE,
  check.duplicates = TRUE,
  factors = TRUE,
  drop.na = FALSE,
  comment = "#",
  filter.function = NULL,
  chunk_size = 200000L
)
}
\arguments{
\item{file}{Path to an IAMC-style .csv or .xlsx file, or a vector of such paths.}

\item{sep}{Column separator.  The default \code{NULL} falls back to \code{";"} in
\code{read_mif_header()}.}

\item{quote}{Quote characters, empty by default.}

\item{na.strings}{Entries to interpret as \code{NA}; defaults to
\code{c("UNDF", "NA", "N/A", "n_a")}.}

\item{convert.periods}{If \code{TRUE}, periods are converted to
\code{\link[base:DateTimeClasses]{POSIXct}}.  If \code{FALSE} (the default), periods
are numerical.}

\item{check.duplicates}{If \code{TRUE} (the default), the data is checked for
duplicates.  For time- and memory-critical applications, this check can be
switched off.}

\item{factors}{Return columns as factors (\code{TRUE}, the default) or not.}

\item{drop.na}{Should \code{NA} values be dropped from the \code{quitte}?}

\item{comment}{A character that, at the start of a line, marks the optional
comment header with metadata at the head of \code{file}.  The comment header, if
present, is returned as a \code{comment_header} attribute.  If multiple files
are read, the \code{comment_header} attribute is a list of comment headers with
file paths as names.}

\item{filter.function}{A function used to filter data during read.  See
Details.}

\item{chunk_size}{Number of lines to read at a time.  Defaults to 200000.
(REMIND .mif files have between 55000 and 105000 lines for H12 and EU21
regional settings, respectively.)}
}
\value{
A quitte data frame.
}
\description{
Reads IAMC-style .csv or .xlsx files into a quitte data frame.
}
\details{
In order to process large data sets, like IIASA database snapshots,
\code{read.quitte()} reads provided files (other than Excel files) in chunks of
\code{chunk_size} lines, and applies \code{filter.function()} to each chunk.  This
allows for filtering data piece-by-piece, without exceeding available memory.
\code{filter.function} is a function taking one argument, a quitte data frame of
the read chunk, and is expected to return a data frame.  Typically, it simply
contains all the filters that would otherwise be applied after all the data is
read in.
Suppose there is a file \code{big_IIASA_snapshot.csv}, from which only data for
the REMIND and MESSAGE models for the years 2020 to 2060 is of interest.
Normally, this data would be processed as

\if{html}{\out{<div class="sourceCode">}}\preformatted{read.quitte(file = 'big_IIASA_snapshot.csv') \%>\%
    filter(grepl('^(REMIND|MESSAGE)', .data$model),
           between(.data$period, 2020, 2060))
}\if{html}{\out{</div>}}

If, however, \code{big_IIASA_snapshot.csv} is too large to be read in completely,
it can be read using

\if{html}{\out{<div class="sourceCode">}}\preformatted{read.quitte(file = 'big_IIASA_snapshot.csv',
            filter.function = function(x) \{
                x \%>\%
                    filter(grepl('^(REMIND|MESSAGE)', .data$model),
                           between(.data$period, 2020, 2060))
            \})
}\if{html}{\out{</div>}}
}
\examples{
\dontrun{
read.quitte(c("some/data/file.mif", "some/other/data/file.mif"))
read.quitte("some/data/file.csv", sep = ",", quote = '"')
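
# A hedged sketch of chunked reading with filter.function, following the
# Details section.  The file path and the helper name keep_models are
# hypothetical, and dplyr is assumed to be installed.
keep_models <- function(x) {
  # x is the quitte data frame of one chunk; return the rows to keep
  dplyr::filter(x,
                grepl("^(REMIND|MESSAGE)", .data$model),
                dplyr::between(.data$period, 2020, 2060))
}
x <- read.quitte("some/data/big_IIASA_snapshot.csv",
                 filter.function = keep_models)

# The optional comment header, if present in the file, is returned as an
# attribute of the result (see the comment argument).
attr(x, "comment_header")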
}

}
\author{
Michaja Pehl
}
