It is a rite of passage for every developer: you write a script to export your database to a CSV file. You open the file in a text editor, and it looks perfect. You email it to the marketing team, they open it in Microsoft Excel, and suddenly "José" becomes "JosÃ©". What happened?
The Root Cause: Encoding Assumptions
Almost every modern web application, database, and programming language defaults to UTF-8 character encoding. UTF-8 can represent almost any character from any language, plus emojis.
However, Microsoft Excel, due to decades of legacy enterprise support, assumes that a plain `.csv` file is encoded in your computer's regional code page (often Windows-1252, which Windows labels "ANSI", on English-language machines). UTF-8 stores each accented character as two or more bytes, while a single-byte decoder such as Windows-1252 turns every byte into its own character. The "é" in "José" is the two bytes C3 A9, so Excel displays it as "Ã©". This kind of garbling is known as mojibake.
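The mismatch is easy to reproduce outside Excel. The following is a minimal sketch, not Excel's actual decoding logic: `misreadUtf8AsLatin1` is a hypothetical helper name, and it assumes Node's built-in `latin1` encoding as a stand-in for Windows-1252 (the two agree for the bytes 0xA0 to 0xFF, which covers the characters used here).

```javascript
// Hypothetical helper: encode text as UTF-8, then decode the bytes
// with a single-byte decoder, the way a legacy CSV reader would.
// Assumption: 'latin1' approximates Windows-1252 for these characters.
function misreadUtf8AsLatin1(text) {
  const utf8Bytes = Buffer.from(text, 'utf8'); // 'é' becomes the two bytes C3 A9
  return utf8Bytes.toString('latin1');         // each byte is decoded as its own character
}

console.log(misreadUtf8AsLatin1('José'));      // JosÃ©
console.log(misreadUtf8AsLatin1('São Paulo')); // SÃ£o Paulo
```

Plain ASCII letters survive because UTF-8 and the single-byte code pages encode them identically, which is why only the accented characters break.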
The Developer Fix: Add a BOM
If you are generating the CSV programmatically, the easiest way to force Excel to open the file correctly is to prepend a Byte Order Mark (BOM) to the very beginning of the file.
The UTF-8 BOM is a specific sequence of three bytes: EF BB BF. When Excel sees these three bytes at the start of a CSV, it says, "Ah, this is a UTF-8 file!" and decodes it perfectly.
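To make those three bytes concrete, here is a small sketch that inspects them and checks whether a buffer already starts with a BOM. The names `UTF8_BOM`, `hasUtf8Bom`, and `withUtf8Bom` are hypothetical helpers introduced for illustration; only the standard Node.js `Buffer` API is assumed.

```javascript
// The UTF-8 BOM as raw bytes.
const UTF8_BOM = Buffer.from([0xef, 0xbb, 0xbf]);

// Hypothetical helper: true if the buffer begins with EF BB BF.
function hasUtf8Bom(buf) {
  return buf.length >= 3 && buf.subarray(0, 3).equals(UTF8_BOM);
}

// Hypothetical helper: prepend the BOM only when it is missing,
// so running an export twice never produces a doubled BOM.
function withUtf8Bom(buf) {
  return hasUtf8Bom(buf) ? buf : Buffer.concat([UTF8_BOM, buf]);
}

// The code point U+FEFF encodes to exactly these three bytes in UTF-8.
console.log(Buffer.from('\uFEFF', 'utf8').toString('hex')); // efbbbf

const plain = Buffer.from('Name,City\nJosé,São Paulo', 'utf8');
console.log(hasUtf8Bom(plain));              // false
console.log(hasUtf8Bom(withUtf8Bom(plain))); // true
```

Checking before prepending matters mostly when a file may pass through the export step more than once, since a second BOM would show up as a stray character at the start of the first cell.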
```javascript
// Example in Node.js
const fs = require('fs');

// U+FEFF is written as the bytes EF BB BF when encoded as UTF-8
const BOM = '\uFEFF';
const csvContent = "Name,City\nJosé,São Paulo";

// Prepend the BOM before writing
fs.writeFileSync('export.csv', BOM + csvContent, 'utf8');
```
The User Fix: Import, Don't Double-Click
If you are a non-technical user who received a broken CSV, do not double-click it to open it. Instead:
- Open a blank Excel workbook.
- Go to the Data tab and click From Text/CSV.
- Select your file.
- In the import wizard window, change the File Origin dropdown to 65001: Unicode (UTF-8).
This manually overrides Excel's bad assumption and parses your data correctly.