Python Script to Merge Multiple CSV Files into One Excel Workbook
To merge multiple CSV files into a single Excel workbook using Python, the most efficient and robust approach is using the Pandas library alongside openpyxl. This guide provides two production-ready scripts: one to merge CSVs into separate sheets, and another to combine them into a single consolidated sheet.
Prerequisites
You need to install pandas and openpyxl (the engine used to write Excel files). Run the following command in your terminal:
pip install pandas openpyxl
Option 1: Merge CSVs into Separate Sheets (One Workbook)
Use this script if you want each CSV file to represent a distinct tab (sheet) inside a single Excel file. The sheet names will automatically match the CSV filenames.
import os
import pandas as pd
# Define the directory containing your CSV files and the output path
csv_directory = "./csv_folder"
output_excel = "merged_output_sheets.xlsx"
# Initialize the Excel writer
with pd.ExcelWriter(output_excel, engine='openpyxl') as writer:
# Loop through all files in the directory
for file_name in os.listdir(csv_directory):
if file_name.endswith('.csv'):
file_path = os.path.join(csv_directory, file_name)
# Read CSV (adjust encoding if you encounter UnicodeDecodeError)
df = pd.read_csv(file_path, encoding='utf-8')
# Use the filename (minus .csv) as the sheet name
# Truncated to 31 characters to comply with Excel sheet name limits
sheet_name = os.path.splitext(file_name)[0][:31]
# Write DataFrame to a specific sheet
df.to_excel(writer, sheet_name=sheet_name, index=False)
print(f"Successfully merged CSVs into separate sheets in: {output_excel}")
Option 2: Concatenate CSVs into a Single Sheet
Use this script if all your CSV files share the same column structure (schema) and you want to stack them vertically into one continuous dataset on a single Excel sheet.
import os
import pandas as pd
csv_directory = "./csv_folder"
output_excel = "merged_output_single_sheet.xlsx"
# List to hold individual DataFrames
dataframes = []
# Loop through and read all CSV files
for file_name in os.listdir(csv_directory):
if file_name.endswith('.csv'):
file_path = os.path.join(csv_directory, file_name)
df = pd.read_csv(file_path, encoding='utf-8')
dataframes.append(df)
# Concatenate all DataFrames vertically
if dataframes:
combined_df = pd.concat(dataframes, ignore_index=True)
# Write the combined DataFrame to Excel
combined_df.to_excel(output_excel, index=False, engine='openpyxl')
print(f"Successfully concatenated CSVs into a single sheet in: {output_excel}")
else:
print("No CSV files found in the specified directory.")
Technical Considerations & Troubleshooting
- Excel Row Limits: Excel sheets have a strict limit of 1,048,576 rows. If your concatenated CSVs exceed this limit, the write operation in Option 2 will fail.
- Character Encoding: If your CSVs contain special characters, emojis, or non-English text, you might need to change
encoding='utf-8'toencoding='utf-8-sig'orencoding='latin1'insidepd.read_csv(). - Memory Constraints: Pandas loads datasets entirely into RAM. If you are merging gigabytes of CSV data, ensure your system has sufficient memory, or process the files in chunks.
Need this done fast? order it on Kwork.
I take on freelance fixes and builds in this area.