In this tutorial, you’ll learn how to select all the different ways you can select columns in Pandas, either by name or index. You’ll learn how to use the loc, iloc accessors and how to select columns directly. You’ll also learn how to select columns conditionally, such as those containing a specific substring.
By the end of this tutorial, you’ll have learned:
How to select columns by name or by indexHow to select all columns except for named columnsHow to select columns of a specific datatypeHow the .loc and .iloc accessors work to select data in PandasHow to select columns conditionally, such as those containing a stringLet’s get started!
Table of Contents
Loading a Sample Pandas DataFrameTo follow along with this tutorial, let’s load a sample Pandas DataFrame. By copying the code below, you’ll load a dataset that’s hosted on my Github page.
The DataFrame contains a number of columns of different data types, but few rows. This allows us to print out the entire DataFrame, ensuring us to follow along with exactly what’s going on.
import pandas as pddf = pd.read_csv('https://raw.githubusercontent.com/datagy/data/main/SelectingColumns.csv')print(df)# Returns:#Name Age Height Score Random_A Random_B Random_C Random_D Random_E# 0 Joe285'9 307359 5 431# 1 Melissa265'5 323085383280# 2 Nik315'11 348071597153# 3Andrea335'6 383863868142# 4 Jane325'8 29194848 568Let’s break down what we’ve done here:
We imported pandas using the alias pdWe then loaded a DataFrame using the pd.read_csv() function.What Makes Up a Pandas DataFrameBefore diving into how to select columns in a Pandas DataFrame, let’s take a look at what makes up a DataFrame.
A DataFrame has both rows and columns. Each of the columns has a name and an index. For example, the column with the name 'Age' has the index position of 1.
As with other indexed objects in Python, we can also access columns using their negative index. For example, the column with the name 'Random_C' has the index position of -1.
Why Select Columns in Python?The data you work with in lots of tutorials has very clean data with a limited number of columns. But this isn’t true all the time.
In many cases, you’ll run into datasets that have many columns – most of which are not needed for your analysis. In this case, you’ll want to select out a number of columns.
This often has the added benefit of using less memory on your computer (when removing columns you don’t need), as well as reducing the amount of columns you need to keep track of mentally.
How to Select a Single Column in PandasPandas makes it easy to select a single column, using its name. We can do this in two different ways:
Using dot notation to access the columnUsing square-brackets to access the columnLet’s see how we can do this by accessing the 'Name' column:
# Accessing a Single Column by Its Nameprint(df['Name'])# print(df.Name) (This returns the same value)# Returns:# 0Joe# 1Melissa# 2Nik# 3 Andrea# 4Jane# Name: Name, dtype: objectLet’s take a quick look at why using the dot operator is often not recommended (while it’s easier to type). This is because you can’t:
Select columns with spaces in the name,Use columns that have the same names as dataframe methods (such as ‘type’),Pick columns that aren’t strings, andSelect multiple columns (as you’ll see later)Now let’s take a look at what this actually returns. We can do this by using the type() function:
# Checking the Type of Our Columnprint(type(df['Name']))# Returns:#We can see that selecting a single column returns a Pandas Series.
If we wanted to return a Pandas DataFrame instead, we could use double square-brackets to make our selection. Let’s see what this looks like:
# Selecting a Single Column as a Pandas DataFrameprint(type(df[['Name']]))# Returns:#What we’re actually doing here is passing in a list of columns to select. In this case, we’re passing in a list with a single item. When a list is passed in to the selector, a DataFrame is returned.
In the following section, you’ll learn how to select multiple columns in a Pandas DataFrame.
How to Select Multiple Columns in PandasSelecting multiple columns works in a very similar way to selecting a single column. We can pass a list of column names into our selection in order to select multiple columns. Because we need to pass in a list of items, the . notation is not possible when selecting multiple columns.
Let’s take a look at how we can select the the Name, Age, and Height columns:
# Selecting Multiple Columns in a Pandas DataFrameselection = df[['Name', 'Age', 'Height']]print(selection)# Returns:#Name Age Height# 0 Joe285'9# 1 Melissa265'5# 2 Nik315'11# 3Andrea335'6# 4 Jane325'8What’s great about this method, is that you can return columns in whatever order you want. If you wanted to switch the order around, you could just change it in your list:
# Selecting Columns in a Different Orderselection = df[['Name', 'Height', 'Age']]print(selection)# Returns:#Name Height Age# 0 Joe5'928# 1 Melissa5'526# 2 Nik5'1131# 3Andrea5'633# 4 Jane5'832In the next section, you’ll learn how to select columns by data type in Pandas.
How to Select Columns by Data Type in PandasIn this section, you’ll learn how to select Pandas columns by specifying a data type in Pandas. This method allows you to, for example, select all numeric columns. This can be done by using the, aptly-named, .select_dtypes() method.
Let’s take a look at how we can select only text columns, which are stored as the 'object' data type:
# Selecting a Data Type with Pandasselection = df.select_dtypes('object')print(selection)# Returns:#Name Height# 0 Joe5'9# 1 Melissa5'5# 2 Nik5'11# 3Andrea5'6# 4 Jane5'8The data types that are available can be a bit convoluted. The list below breaks down some of the common ones you may encounter:
Numeric data types: np.number or 'number'Strings: 'object'Datetimes: np.datetime64, 'datetime' or 'datetime64'Timedeltas: np.timedelta64, 'timedelta' or 'timedelta64'Categorical: 'category'Datetimes with Timezones: 'datetimetz' or 'datetime64[ns, tz]'Using loc to Select ColumnsThe .loc accessor is a great way to select a single column or multiple columns in a dataframe if you know the column name(s).
This method is great for:
Selecting columns by column name,Selecting columns using a single label, a list of labels, or a sliceThe loc method looks like this:

In the image above, you can see that you need to provide some list of rows to select. In many cases, you’ll want to select all rows (if your intention is to select columns). Because of this, you can pass in simply a colon (:), which selects all rows.
Let’s see how we can select all rows belonging to the name column, using the .loc accessor:
# Using .loc to Select Columnsprint(df.loc[:, 'Name'])# Returns:# 0Joe# 1Melissa# 2Nik# 3 Andrea# 4Jane# Name: Name, dtype: objectNow, if you wanted to select only the name column and the first three rows, you could write:
# Using .loc to Select Rows and Columnsselection = df.loc[:2,'Name']print(selection)# Returns:# 0Joe# 1Melissa# 2NikSelecting Multiple Columns with .loc in PandasSimilarly, Pandas makes it easy to select multiple columns using the .loc accessor. We can include a list of columns to select. For example, if we wanted to select the 'Name' and 'Height' columns, we could pass in the list ['Name', 'Height'] as shown below:
# Selecting Multiple Columns Using .locprint(df.loc[:, ['Name', 'Height']])# Returns:#Name Height# 0 Joe5'9# 1 Melissa5'5# 2 Nik5'11# 3Andrea5'6# 4 Jane5'8We can also select a slice of columns using the .loc accessor. This works a little differently, as we don’t need to pass in a list, but rather a slice of column names. Say we wanted to select all columns from the 'Name' to 'Score' columns, we could write:
# Selecting a Slice of Columns in Pandasprint(df.loc[:, 'Name':'Score'])# Returns:#Name Age Height Score# 0 Joe285'9 30# 1 Melissa265'5 32# 2 Nik315'11 34# 3Andrea335'6 38# 4 Jane325'8 29As a quick recap, the .loc accessor is great for selecting columns and rows by their names. In the following section, you’ll learn about the .iloc accessor, which lets you access rows and columns by their index position.
Using iloc to Select ColumnsThe iloc function is one of the primary way of selecting data in Pandas. The method “iloc” stands for integer location indexing, where rows and columns are selected using their integer positions.
This method is great for:
Selecting columns by column position (index),Selecting columns using a single position, a list of positions, or a slice of positionsThe standard format of the iloc method looks like this:

Now, for example, if we wanted to select the first two columns of our dataframe, we could write:
# Selecting a Range of Columns with .ilocprint(df.iloc[ : , :2])# Returns:#Name Age# 0 Joe28# 1 Melissa26# 2 Nik31# 3Andrea33# 4 Jane32Note that we didn’t write df.iloc[:,0:2], but that would have yielded the same result.
If we wanted to select all columns and only two rows with .iloc, we could do that by writing:
selection = df.iloc[:2,]print(selection)This returns:
Name Age Height Score Random_A Random_B Random_C Random_D Random_E0 Joe285'9 307359 5 4311 Melissa265'5 323085383280How to Select Column Names Containing a String in PandasThere may be times when you want to select columns that contain a certain string. This can, for example, be helpful if you’re looking for columns containing a particular unit. In our example below, we’re selecting columns that contain the string 'Random'.
Since the .loc accessor can accept a list of columns, we can write a list comprehension in the accessor to filter out column names meeting our condition. Let’s see what this looks like:
# Selecting Columns Meeting a Conditionprint(df.loc[ : , [col for col in df.columns if 'Random' in col]])# Returns:#Random_A Random_B Random_C Random_D Random_E# 07359 5 431# 13085383280# 28071597153# 31663868142# 4194048 568How to Select Columns Meeting a ConditionSimilarly, we can select columns where the values meet a condition. We can also do this by using a list comprehension. In the comprehension, we’ll write a condition to evaluate against.
Say we wanted to filter down to only columns where any value is equal to 30. In this case, we could write the following:
# Selecting Columns Where Values Meet a Conditionprint(df.loc[ : , [(df[col] == 30).any() for col in df.columns]])# Returns:#Score Random_A# 0 3073# 1 3230# 2 3480# 3 3816# 4 2919Let’s break down what we did here:
We used to .iloc accessor to access all rowsWe then used a list comprehension to select column names meeting a conditionOur condition returns any column where any of the values are equal to 30 in that columnCopying Columns vs. Selecting ColumnsSomething important to note for all the methods covered above, it might looks like fresh dataframes were created for each. However, that’s not the case!
In Python, the equal sign (“=”), creates a reference to that object.
Because of this, you’ll run into issues when trying to modify a copied dataframe.
In order to avoid this, you’ll want to use the .copy() method to create a brand new object, that isn’t just a reference to the original.
To accomplish this, simply append .copy() to the end of your assignment to create the new dataframe.
For example, if we wanted to create a filtered dataframe of our original that only includes the first four columns, we could write:
new_df = df.iloc[:,:4].copy()print(new_df)This results in this code below:
Name Age Height Score0 Joe285'9 301 Melissa265'5 322 Nik315'11 343Andrea335'6 384 Jane325'8 29This is incredibly helpful if you want to work the only a smaller subset of a dataframe.
Conclusion: Using Pandas to Select ColumnsIn this tutorial, you learned how to use Pandas to select columns. You learned how to use many different methods to select columns, including using square brackets to select a single or multiple columns. You then learned many different ways to use the .loc and .iloc accessors to select columns.
You learned some unique ways of selecting columns, such as when column names contain a string and when a column contains a particular value.
Additional ResourcesTo learn more about related topics, check out the tutorials below:
How to Use Pandas to Read Excel Files in PythonCombine Data in Pandas with merge, join, and concatPandas: How to Drop a Dataframe Index ColumnPandas GroupBy: Group, Summarize, and Aggregate Data in PythonOfficial Documentation for Select Data in Pandas