What Is the CSV Format? Delimiters, Quoting & Pitfalls

By Justin Le · 6 min read · Updated June 27, 2026

CSV is the lingua franca of tabular data: nearly every spreadsheet and data tool can read it. It looks almost too simple to need explaining, but the details (especially quoting) are where data gets silently corrupted. Here's what you actually need to know.

What is CSV?

CSV stands for Comma-Separated Values. It's a plain-text format where each line is a row and fields within a row are separated by a delimiter, usually a comma. The first row is typically a header naming each column. That's the whole idea — its simplicity is why it's so universal.

name,role,active
Alice,admin,true
Bob,user,false
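As a minimal sketch of reading the sample above, Python's standard csv module can map each row onto the header names. The names SAMPLE and read_rows are illustrative only and are not part of any library.

```python
import csv
import io

# The three-line sample from above, held in a string for illustration.
SAMPLE = "name,role,active\nAlice,admin,true\nBob,user,false\n"

def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    # newline="" leaves line endings untouched so the csv module can handle them.
    return list(csv.DictReader(io.StringIO(text, newline="")))

rows = read_rows(SAMPLE)
for row in rows:
    # Every value arrives as a string, including "true" and "false".
    print(row["name"], row["role"], row["active"])
```

Note that active comes back as the text "true", not a boolean: the parser does not infer types.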

The quoting rules (where it gets tricky)

The simplicity breaks down as soon as a value contains the delimiter. What if a name is Doe, John? The comma would be misread as a field separator. The convention, documented in RFC 4180, handles this with quoting:

A field containing a comma, double quote or line break is wrapped in double quotes.
A double quote inside a quoted field is escaped by doubling it ("").

So Doe, John becomes "Doe, John", and a value containing a quote like say "hi" becomes "say ""hi""". A naive "split on comma" parser gets these wrong — which is why you should use a proper CSV parser.
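To make the difference concrete, here is a small sketch comparing a naive split with Python's standard csv module on the two example values. The variable names are illustrative. The writer relies on the module's default minimal quoting, which quotes only fields that need it.

```python
import csv
import io

# One CSV line holding the two example values from the text.
line = '"Doe, John","say ""hi"""'

# Naive approach: splits inside the quoted name and leaves the quotes in place.
naive = line.split(",")

# Proper parser: honours the quotes and un-doubles the escaped quote.
parsed = next(csv.reader(io.StringIO(line, newline="")))

# Writing applies the same rules in reverse.
buf = io.StringIO(newline="")
csv.writer(buf, lineterminator="\n").writerow(["Doe, John", 'say "hi"'])
written = buf.getvalue()

print(naive)    # three broken pieces
print(parsed)   # two clean fields
print(written)  # the original line, re-quoted
```

The naive split yields three fragments where there are only two fields, while the parser recovers Doe, John and say "hi" intact.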

Values can even contain newlines

Because of quoting, a single field can span multiple lines if it's wrapped in quotes. That means you can't reliably parse CSV by splitting on line breaks either — a quoted newline is part of the value, not a new row. This trips up many home-grown parsers.
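The following sketch shows the same problem with line breaks, again using Python's standard csv module. The sample data is made up for illustration.

```python
import csv
import io

# Three records: a header, a row whose note spans two lines, and a plain row.
text = 'id,note\n1,"first line\nsecond line"\n2,plain\n'

# Splitting on line breaks sees four "rows" and cuts the quoted field in half.
naive_rows = text.splitlines()

# The csv reader keeps the quoted newline inside the value.
records = list(csv.reader(io.StringIO(text, newline="")))

print(len(naive_rows), len(records))
print(repr(records[1][1]))
```

When reading from a real file in Python, the csv documentation recommends opening it with newline="" for the same reason: it stops newline translation from altering line breaks inside quoted fields.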

Common pitfalls

Delimiter confusion. Some locales use a semicolon (;) instead of a comma, because the comma is their decimal separator. "CSV" doesn't always mean comma.
Lost leading zeros. Spreadsheets may read 00123 or a phone number as a number and drop the zeros. CSV itself has no types — everything is text.
Encoding issues. The writer and the reader must agree on one encoding (UTF-8 is the safe default); decoding with the wrong one turns accented characters into garbage.
No nested data. CSV is flat. Nested objects or arrays have no native representation and must be flattened or serialised.
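Several of these pitfalls can be seen in one short sketch. It assumes a semicolon-delimited, UTF-8 encoded export; the sample bytes and variable names are invented for illustration.

```python
import csv
import io

# A semicolon-delimited export with a leading-zero ID, a decimal comma
# and a non-ASCII city name, as raw UTF-8 bytes.
raw = "id;price;city\n00123;3,50;Zürich\n".encode("utf-8")

# Decode with the encoding the file was actually written in.
text = raw.decode("utf-8")

# Tell the parser which delimiter this file uses.
rows = list(csv.DictReader(io.StringIO(text, newline=""), delimiter=";"))
first = rows[0]

# Fields stay as text, so the leading zeros and the decimal comma survive.
print(first["id"], first["price"], first["city"])

# Decoding the same bytes with the wrong encoding garbles the accent.
mojibake = "Zürich".encode("utf-8").decode("latin-1")
print(mojibake)
```

Because the parser never converts fields to numbers, 00123 is preserved. Any loss of leading zeros happens later, when a tool chooses to interpret the text as a number.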

CSV vs JSON

CSV is compact and perfect for flat, tabular data that goes into spreadsheets. JSON handles nested structures and carries (some) type information, making it better for APIs and complex data. Converting between them is a common task — just remember that going to CSV flattens and stringifies everything. See our JSON vs YAML guide for the broader format picture.
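As a rough sketch of what "flattens and stringifies" means in practice, the example below writes a small JSON array to CSV with Python's standard json and csv modules. The helper to_cell and the choice to store nested values as JSON text are assumptions made for illustration; other flattening strategies are equally valid.

```python
import csv
import io
import json

# A small JSON array with a boolean and a nested list per object.
records = json.loads(
    '[{"name": "Alice", "active": true, "tags": ["a", "b"]},'
    ' {"name": "Bob", "active": false, "tags": []}]'
)

def to_cell(value):
    """Serialise nested values as JSON text; pass scalars through unchanged."""
    if isinstance(value, (dict, list)):
        return json.dumps(value)
    return value

buf = io.StringIO(newline="")
writer = csv.DictWriter(buf, fieldnames=["name", "active", "tags"], lineterminator="\n")
writer.writeheader()
for rec in records:
    writer.writerow({key: to_cell(val) for key, val in rec.items()})
csv_text = buf.getvalue()
print(csv_text)

# Reading it back yields strings only: the boolean is now the text "True".
back = list(csv.DictReader(io.StringIO(csv_text, newline="")))
print(type(back[0]["active"]), back[0]["tags"])
```

The nested list ends up as a quoted JSON string inside a single cell, and the boolean comes back as text, so a round trip needs an explicit conversion step on the way back to JSON.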

Try it

Convert a JSON array of objects to clean, correctly-quoted CSV — and back — with our JSON ↔ CSV converter, which handles the RFC 4180 quoting for you. Tidy the JSON side first with the JSON formatter.

Frequently asked questions

How does CSV handle a comma inside a value?

The field is wrapped in double quotes, so 'Doe, John' becomes "Doe, John". A double quote inside a quoted field is escaped by doubling it. This is the RFC 4180 convention.

Can a CSV value contain a line break?

Yes, if the field is wrapped in double quotes. That's why you can't reliably parse CSV by splitting on newlines — a quoted newline is part of the value, not a row boundary.

Why did my leading zeros or long numbers change in CSV?

CSV has no types — every field is text. Spreadsheets often interpret values like 00123 or long IDs as numbers and reformat them. Keep such fields as text, or use a format that preserves types.