Load Only Columns You Are Interested In With Pandas

Pavol Kutaj
Jul 31, 2021

--

The aim of this explainer 💡 is to recommend using the usecols= parameter of the read_csv function, which is the first step in loading CSV files in Pandas.

1. notes

  • the docs wisely say about usecols:

Using this parameter results in much faster parsing time and lower memory usage.
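To get a feel for the memory side of that claim, here is a minimal sketch. It does not use the dm.csv file from this post; the in-memory CSV and its column names (id, name, city, score) are made up for illustration, and the exact byte counts will differ by pandas version.

```python
import io
import pandas as pd

# Hypothetical stand-in for a real CSV file: 1,000 rows, 4 columns.
rows = ["id,name,city,score"]
rows += [f"{i},name{i},city{i},{i * 0.5}" for i in range(1000)]
csv_text = "\n".join(rows)

# Load everything, then load only the two numeric columns.
full = pd.read_csv(io.StringIO(csv_text))
subset = pd.read_csv(io.StringIO(csv_text), usecols=["id", "score"])

# deep=True also counts the memory held by string values.
full_bytes = int(full.memory_usage(deep=True).sum())
subset_bytes = int(subset.memory_usage(deep=True).sum())
print(f"all columns: {full_bytes} bytes, selected columns: {subset_bytes} bytes")
```

The text columns dominate the footprint here, so skipping them at load time is where most of the saving comes from.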

2. steps

  • first, check which columns the file has by loading it without the parameter
import pandas as pd
pd.set_option('display.max_rows', None)
df = pd.read_csv(".\\dm.csv")
df.head()
  • this returns the first 5 rows of all columns
  • then load only the columns you need, selecting them either by name or by zero-based index with usecols=
import pandas as pd
pd.set_option('display.max_rows', None)
df = pd.read_csv(".\\dm.csv", usecols=[0,3,4,5])
df
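The step above uses indices; selecting by name works the same way. The following is only a sketch with a made-up in-memory CSV (the columns id, name, city, score, age are assumptions, not the contents of dm.csv), showing that both forms give the same frame.

```python
import io
import pandas as pd

# Hypothetical sample data standing in for a real file.
csv_text = (
    "id,name,city,score,age\n"
    "1,Ann,Oslo,9.5,31\n"
    "2,Bob,Rome,7.0,45\n"
    "3,Cy,Lima,8.2,28\n"
)

# Select columns by header name.
by_name = pd.read_csv(io.StringIO(csv_text), usecols=["id", "score", "age"])

# Select the same columns by zero-based position.
by_index = pd.read_csv(io.StringIO(csv_text), usecols=[0, 3, 4])

print(by_name)
```

Names tend to be the safer choice when the column order of the file might change, while indices are handy for a quick look at a file whose headers are long or messy.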
