1 year ago
#290308
KidSudi
Pandas apply function to 100k+ row data frame
I am trying to use Pandas apply row-wise on a data frame with 100k+ rows, as follows:
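import numpy as np

def get_fruit_vege(x, all_df, f_or_v):
    # All rows of all_df that share this basket row's key
    y_df = all_df[all_df['key'] == x['key']]
    if f_or_v == "F":
        # Fruit group is the part of the key before the hyphen
        f_row = y_df[y_df['Group'] == int(y_df.iloc[0]['key'].split('-')[0])]
        return f_row['Number'].iloc[0]
    elif f_or_v == "V":
        # Vegetable group is the part of the key after the hyphen
        v_row = y_df[y_df['Group'] == int(y_df.iloc[0]['key'].split('-')[1])]
        return v_row['Number'].iloc[0]
    else:
        return np.nan

# Runs once per basket row, and each call scans all of all_df
basket_df['Number'] = basket_df.apply(lambda x: get_fruit_vege(x, all_df, "F"), axis=1)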
However, the process appears to hang when I run it in the Python console (I'm using PyCharm Community Edition). I am using Pandas apply because I need to cross-reference another data frame: each row of basket_df has a key that matches rows in all_df. I'm not sure what I am doing wrong here, or whether I should avoid Pandas apply altogether. Thanks for your help!
Update: The function does work, but it is slow: it takes approximately 20 minutes to finish. Is there a better way to go about this?
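One direction that might avoid the per-row filtering is to compute the target group for every basket row at once and then do a single merge against all_df. The following is only a rough sketch under assumptions: it takes key to be a string of the form "fruitGroup-vegeGroup" and Group to be an integer column, the function name lookup_numbers is hypothetical, and the sample data is made up purely for illustration.

```python
import numpy as np
import pandas as pd


def lookup_numbers(basket_df, all_df, f_or_v):
    """Vectorised stand-in for the row-wise apply (a sketch, not a drop-in)."""
    if f_or_v not in ("F", "V"):
        # Mirrors the original else-branch: NaN for every row.
        return pd.Series(np.nan, index=basket_df.index)
    part = 0 if f_or_v == "F" else 1
    # Target group per basket row: the chosen half of "a-b", as an int.
    target = basket_df["key"].str.split("-").str[part].astype(int)
    # One row per (key, Group), keeping the first, to mirror .iloc[0].
    lookup = all_df.drop_duplicates(["key", "Group"])[["key", "Group", "Number"]]
    left = pd.DataFrame({"key": basket_df["key"].to_numpy(),
                         "Group": target.to_numpy()})
    merged = left.merge(lookup, on=["key", "Group"], how="left")
    # A left merge keeps the row order of `left`, so positions line up.
    return pd.Series(merged["Number"].to_numpy(), index=basket_df.index)


# Made-up sample data (assumption), just to exercise the function.
all_df = pd.DataFrame({
    "key": ["1-2", "1-2", "3-4", "3-4"],
    "Group": [1, 2, 3, 4],
    "Number": [10, 20, 30, 40],
})
basket_df = pd.DataFrame({"key": ["3-4", "1-2"]})

fruit = lookup_numbers(basket_df, all_df, "F")
vege = lookup_numbers(basket_df, all_df, "V")
basket_df["Number"] = fruit
```

One behavioural difference to keep in mind: where no matching row exists, the merge leaves NaN, whereas the original function would raise an IndexError on .iloc[0].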
python
pandas
pycharm
apply
rowwise