Pandas basis

 

Question  answer explain
 how to get how big in memory a DataFrame object is?  df.info()
 what is the best representative of null value in Pandas object?  np.nan import numpy as np
 what is the best way to slice a DataFrame by index?  df.iloc[-5:, 2:] use iloc method
 how to convert a DataFrame (excluding indexes) to a numpy ndarray?  df.values it is a attribute, can’t be called
 what is the most basic way to create a DataFrame?  pd.DataFrame(dict) pass dictionary to; keys are column names
 what is broadcasting?  pd[‘new’]=7 all the values of the new column will be 7
 how to change df’s column names, index names?  pd.columns = [‘a’,’b’,…]

pd.index = [‘c’,’d’,…]

assign value directly
 when read csv, how to specify names of the column  pd.read_csv(path, names=[‘a’,’b’,…..])  instead, pass header=None will prevent pandas using data as column names, but use 0,1,2,3 ….
when read csv, how to let pandas to turn some specific values into NaN? pd.read_csv(path, na_values = ‘-1’)

pdf.read_csv(path, na_values = {‘column3’:[‘ -2’, ‘wtf’,…]})

all the values which is character ‘-1’ will be rendered to NaN
how to parse data in reading csv pd.read_csv(path, parse_dates = [[0,1,2]]) pandas will parse column 1, 2, 3 into one datetype column
does index of df have a name? pd.index.name = ‘xxx’ assign a name to the index of df
how to save df to a csv file with other delimiters  rather than ‘,’ pd.to_csv(path, sep=’\t’) save to a csv file which separates data by tab

how to batch convert string to Date type

df[‘datestring’]=pd.to_datetime(df[‘datestring’])

how to get 2 DataFrame together,  & append one df to another?