Table Dialect (previously called CSV dialect) is a simple format to describe the dialect of a tabular data file, including its delimiter, header rows, escape characters, etc.
In this document we use the terms “package” for Data Package, “resource” for Data Resource, “dialect” for Table Dialect, and “schema” for Table Schema.
Frictionless supports most dialect properties when reading Tabular Data Resources. Dialect manipulation is limited to setting a delimiter. When writing resources, it (mainly) makes use of default dialect properties, removing the need to define them.
read_resource() uses the dialect property of a resource to parse a tabular data file. Only properties that deviate from the default need to be specified. E.g. a tab-delimited file without header rows must have the following dialect:
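As a minimal sketch, such a dialect can be written as an R list. The JSON form in the comment is how the same two properties would be expected to look in datapackage.json; the variable name dialect is only illustrative.

```r
# Dialect for a tab-delimited file without a header row.
# Only the two properties that deviate from the defaults are listed.
# In datapackage.json this corresponds to:
#   "dialect": {"delimiter": "\t", "header": false}
dialect <- list(
  delimiter = "\t",
  header = FALSE
)
str(dialect)
```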
Frictionless does not support direct manipulation of the dialect. add_resource() lets you set one property (dialect$delimiter) when data are provided as a file. All other properties are assumed to be the default.
write_package() writes a package to disk as a datapackage.json file. This file includes the metadata of all the resources, including the dialect (if defined). write_package() writes resources created from a data frame to CSV files, but no dialect property is set for those, since only defaults are used.
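The behaviour for data frame resources can be checked with a short sketch. It assumes the frictionless package is installed; the data frame, the resource name "observations" and the temporary directory are illustrative choices.

```r
library(frictionless)

# Build a package with one resource created from a data frame.
df <- data.frame(id = 1:2, label = c("a", "b"))
package <- add_resource(create_package(), "observations", df)

# Write datapackage.json and the CSV file to a temporary directory.
directory <- file.path(tempdir(), "dialect-example")
write_package(package, directory)

# Read the written metadata back and check whether a dialect was set.
written <- read_package(file.path(directory, "datapackage.json"))
is.null(written$resources[[1]]$dialect)
```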
delimiter is used by read_resource() and defaults to ",". It is passed to delim in readr::read_delim(). add_resource() does not set delimiter, unless provided in delim and different from the default ",":
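A minimal sketch of this, assuming the frictionless package is installed. The file name, its contents and the resource name are made up for illustration.

```r
library(frictionless)

# Write a small tab-delimited file to use as data.
path <- file.path(tempdir(), "observations.tsv")
writeLines(c("id\tlabel", "1\ta", "2\tb"), path)

# delim differs from the default ",", so a dialect with a delimiter is set.
package <- add_resource(
  create_package(),
  "observations",
  data = path,
  delim = "\t"
)
package$resources[[1]]$dialect
```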
lineTerminator is ignored by read_resource(). It relies on readr::read_delim() instead, which interprets the line terminators LF and CRLF automatically and does not support CR (used by Classic Mac OS, final release 2001).
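The automatic handling of LF and CRLF can be illustrated directly with readr (version 2.0 or later is assumed, for literal data wrapped in I()). The table content is invented for the example.

```r
library(readr)

# The same two-row table, once with LF and once with CRLF line terminators.
lf <- read_delim(I("a,b\n1,2\n3,4\n"), delim = ",", show_col_types = FALSE)
crlf <- read_delim(I("a,b\r\n1,2\r\n3,4\r\n"), delim = ",", show_col_types = FALSE)

# Compare the parsed values; no line terminator was specified for either call.
identical(lf$a, crlf$a)
identical(lf$b, crlf$b)
```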
quoteChar is used by read_resource() and defaults to ". It is passed to quote in readr::read_delim().
doubleQuote is used by read_resource() and defaults to true, but can be overruled by escapeChar. It is passed to escape_double in readr::read_delim().
escapeChar is ignored by read_resource() unless it is "\\". In that case it is passed as escape_backslash = TRUE and escape_double = FALSE in readr::read_delim().
escapeChar and doubleQuote are mutually exclusive, so you cannot escape with \" and "" in the same file.
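The interplay between escapeChar and doubleQuote can be sketched as a small function. escape_args() is a hypothetical helper written for this illustration, not a frictionless function, and it only mirrors the rules described above.

```r
# Hypothetical helper (not part of frictionless): derive the two readr
# escape arguments from a dialect given as a list.
escape_args <- function(dialect) {
  if (identical(dialect$escapeChar, "\\")) {
    # A backslash escapeChar overrules doubleQuote.
    return(list(escape_backslash = TRUE, escape_double = FALSE))
  }
  # Any other (or missing) escapeChar is ignored; doubleQuote defaults to true.
  double_quote <- if (is.null(dialect$doubleQuote)) TRUE else dialect$doubleQuote
  list(escape_backslash = FALSE, escape_double = double_quote)
}

escape_args(list())
escape_args(list(escapeChar = "\\"))
```

Returning both arguments together keeps them consistent, so a caller never ends up with backslash and doubled-quote escaping enabled at once.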
nullSequence is ignored by read_resource(). Provide as missingValues in the schema instead (see vignette("table-schema")).
skipInitialSpace is used by read_resource() and defaults to false. It is passed to trim_ws in readr::read_delim().
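The effect of trim_ws can be seen with readr alone (version 2.0 or later is assumed, for literal data wrapped in I()). The sample data are invented.

```r
library(readr)

# A value preceded by a space after the delimiter.
csv <- I("a,b\nx, y\n")

kept <- read_delim(csv, delim = ",", trim_ws = FALSE, show_col_types = FALSE)
trimmed <- read_delim(csv, delim = ",", trim_ws = TRUE, show_col_types = FALSE)

kept$b     # leading space preserved
trimmed$b  # leading space removed
```

Note that trim_ws in readr trims both leading and trailing whitespace, so it is somewhat broader than skipping only the initial space.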
header is used by read_resource() and defaults to true. It is passed as skip = 1 (or 0) in readr::read_delim().
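A minimal readr sketch of handling a header row by skipping lines while supplying column names explicitly. The column names here are hypothetical stand-ins for names that would come from a schema, and readr 2.0 or later is assumed.

```r
library(readr)

# Hypothetical column names, standing in for names taken from a schema.
col_names <- c("id", "value")

# File with a header row: skip it and apply the supplied names.
with_header <- read_delim(
  I("x,y\n1,2\n"),
  delim = ",", col_names = col_names, skip = 1, show_col_types = FALSE
)

# File without a header row: skip nothing, every line is data.
without_header <- read_delim(
  I("1,2\n3,4\n"),
  delim = ",", col_names = col_names, skip = 0, show_col_types = FALSE
)

nrow(with_header)
nrow(without_header)
```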
commentChar is used by read_resource() and defaults to undefined. It is passed to comment in readr::read_delim().
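As a small illustration of the comment argument in readr (version 2.0 or later is assumed), the following sketch reads data containing a line that starts with "#". The data are made up.

```r
library(readr)

# The second line is a comment and should not be parsed as data.
csv <- I("a,b\n# a remark\n1,2\n")

parsed <- read_delim(csv, delim = ",", comment = "#", show_col_types = FALSE)
nrow(parsed)
```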
caseSensitiveHeader is ignored by read_resource().
csvddfVersion is ignored by read_resource().