How to add new worksheets to Excel workbooks with Pandas

José Fernando Costa
5 min readNov 7, 2019

--

In this article I am looking to explain how to add a new worksheet to an already existing Excel workbook, using the famous pandas library as well as the openpyxl library.

Even if you have never touched pandas nor openpyxl, I will explain all the code so you don’t have to worry about that. Though, I will use some intermediate Python concepts, so I’m assuming you are familar with Python. But, at any rate, I explain everything that comes up in the code and point you in the direction of resources where you can learn more about the code shown, so I don’t think there’s really barrier of entry except knowing basic Python! Also, there is a code gist at the end with complete script in case you want to jump straight to the solution.

For the script to work, make sure you have both libraries installed. If you don’t, then you can open the command line and enter pip install pandas and pip install openpyxl to install the libraries. If you’re on Windows, make sure to use the command line with Admin privileges.

For the sake of demonstration, the script will have two parts: the first one where we create a new Excel workbook with a single worksheet and a second workbook where we load that file, create a new worksheet for it and then append it and save the file. Here we go.

import pandas as pd
import openpyxl as pxl
firstMockData = { 'a': [1,2], 'b': [3,4] }
firstMockDF = pd.DataFrame(firstMockData)
firstMockDF.to_excel('test_wb.xlsx', 'sheetA', index=False)

We start by importing pandas and openpyxl to create a new DataFrame. In other words, firstMockDF is a two-column table, with columns a and b, where each column has two rows ([1,2] and [3,4], respectively). The keys in the firstMockData dictionary represent the column names and the values represent the data contained by the columns. Then, we simply call the DataFrame’s to_excel() method to create a new Excel workbook called test_wb (don’t forget to include the file extension!), writing the data to the sheetA worksheet (since neither the workbook nor the worksheet exist, the method creates them for us). The keyword argument index being set to false is just so that the DataFrame indexes are not written to the workbook, just the actual values (column names and rows’ data).

Please note that if you try to use firstMockDF.to_excel('test_wb.xlsx', ‘sheetB', index=false) to write data to sheetB in the hopes that the method will create the worksheet in our file, well, it doesn’t. Instead, it will overwrite the existing file, meaning that you lose sheetA and test_wb still has a single worksheet: sheetB. But fear not, the purpose of this article is to explain how to add a new worksheet to a workbook after all.

excel_book = pxl.load_workbook('test_wb.xlsx')with pd.ExcelWriter('test_wb.xlsx', engine='openpyxl') as writer:
writer.book = excel_book
writer.sheets = {
worksheet.title: worksheet
for worksheet in excel_book.worksheets
}
secondMockData = { 'c': [10,20], 'd': [30,40] }
secondMockDF = pd.DataFrame(secondMockData)
secondMockDF.to_excel(writer, 'sheetB', index=False)
writer.save()

This second chunk includes the code that “actually matters”: loads an existing workbook, creates the logic for adding a new worksheet, writes the new data to that worksheet and finally saves the file in the computer.

But let us go one step at a time. The first line is responsible for loading our target workbook, the one we created before. Then, we use something called a context manager (a.k.a. the code block inside the with block) to wrap our file operations. This is very useful because we are reading from and writing to a file, that is, it means at some point we need to open the file and later close it. While pd.ExcelWriter() is not a file on its own, it “represents” a workbook and, thus, it still needs to be closed to release the memory allocated to it.

Hence the use-case for a context manager. We just need to initally create the ExcelWriter to alocate resources for the “representation” of a file/work in progress workbook in memory, which is then known as the variable writer (because of theas writer part). That WIP workbook only exists until the end of this code block, outside of the with block the WIP is no more, period. As you can see, context managers are something quite handy in Python, and so I highly recommend you to watch this short tutorial Corey Schafer created on the matter.

And please due note that we are not reading the file when we say pd.ExcelWriter('test_wb.xlsx, engine='openpyxl`). With that we are just creating a “representation” of a workbook called test_wb in our Python code, a workbook that’s currently being worked in Python. We could call it any name we want, what binds this writer to the workbook we created previously comes in the next line of code.

writer.book = excel_book then binds the workbook we had loaded before as our “base of work” for the writer. Now our WIP workbook called test_wb knows that it uses the loaded workbook, a workbook that actually exists on our computer, as its base of work. Then, to set the worksheets for our WIP workbook, we use the same ones as that of the loaded file. That strange syntax is called a dictionary comprehension, which is pretty much equivalent to looping through the worksheets of excel_book and then mapping the names to each respective worksheet. Dictionary comprehension simply allows us to do that looping with a more elegant and concise syntax. For more on the subject please read the official Python documentation on list comprehension (same principle, only difference is one is used for lists, the other for dictionaries).

Now that we have set the workbook and the worksheets for our WIP file, we create a new DataFrame (exactly like before) and once again we call the to_excel() method. The big difference here is that we write the DataFrame to the ExcelWriter (writer), not to the workbook that already exists on the computer. Plus, we specify a worksheet called sheetB and since it doesn’t exist yet, pandas creates it for us. Now the work in progress has two worksheets: sheetA and sheetB, both containing their own data.

Now all that’s left is to save the work in progress workbook as an actual file in the computer. For that, just call the save() method on the writer and it will create a new Excel workbook called test_wb in the same directory as of the script. Since a workbook with that name already exists, the practical result is that the one created before with a single worksheet is overwritten by a new one with two worksheets, just like we wanted.

And that’s it on how to create new worksheets for already existing Excel workbooks with pandas. To summarize, you use pandas’ ExcelWriter to create a work in progress workbook, while making use of the data of an existing workbook. Then, write some other data to a new worksheet, save the file and voilà, you have just added a new worksheet to your workbook.

Finally, here’s the promised code gist with the complete demonstration script.

I hope this article has been useful for your work with Excel and Python!

Edit (Nov. 28th 2021): Updated the script to avoid “Alert: We found a problem with some content in ‘test_wb.xlsx’” issues when opening the output workbook. Thank you for Ted Pelas for the feedback!

--

--

José Fernando Costa
José Fernando Costa

Written by José Fernando Costa

Documenting my life in text form for various audiences

Responses (4)