1 year ago

#290308

KidSudi

Pandas apply function to 100k+ row data frame

I am trying to use Pandas apply row-wise on a data frame with 100k+ rows, as follows:

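    import numpy as np

    def get_fruit_vege(x, all_df, f_or_v):
        # Rows of all_df that share this basket row's key (keys look like 'a-b')
        y_df = all_df[all_df['key'] == x['key']]
        if f_or_v == "F":
            # "F": match the Group given by the part of the key before the '-'
            f_row = y_df[y_df['Group'] == int(y_df.iloc[0]['key'].split('-')[0])]
            return f_row['Number'].iloc[0]
        elif f_or_v == "V":
            # "V": match the Group given by the part of the key after the '-'
            v_row = y_df[y_df['Group'] == int(y_df.iloc[0]['key'].split('-')[1])]
            return v_row['Number'].iloc[0]
        else:
            return np.nan

    # Called once per row of basket_df; each call scans all of all_df
    basket_df['Number'] = basket_df.apply(lambda x: get_fruit_vege(x, all_df, "F"), axis=1)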

However, the process appears to hang when I run it in the Python console (I'm using PyCharm Community Edition). I'm using Pandas apply because I need to cross-reference another data frame through a key that matches row-wise between the two data frames (basket_df and all_df). I'm not sure what I'm doing wrong here, or whether I should be using Pandas apply at all. Thanks for your help!

Update: The function does work, but it takes approximately 20 minutes to run. Is there a better way to go about this?
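One direction that might be worth trying is to replace the per-row filtering with a single merge. The sketch below is only a rough idea, not a tested drop-in: it assumes every key has the form 'a-b' with integer parts, that the Group column in all_df holds integers, and that taking the first matching row per (key, Group) pair is acceptable, as .iloc[0] does in the function above. The helper name lookup_numbers and the sample frames are made up for illustration. One behavioural difference to be aware of: rows with no match come back as NaN here, whereas the apply version would raise an IndexError.

```python
import numpy as np
import pandas as pd


def lookup_numbers(basket_df, all_df, f_or_v):
    """Vectorized sketch of get_fruit_vege applied to every row of basket_df."""
    if f_or_v not in ("F", "V"):
        # Mirror the original fallback branch.
        return pd.Series(np.nan, index=basket_df.index)

    part = 0 if f_or_v == "F" else 1
    # For each basket row, the Group to look up is one half of the 'a-b' key.
    wanted = pd.DataFrame({
        "key": basket_df["key"],
        "Group": basket_df["key"].str.split("-").str[part].astype(int),
    })
    # Keep the first row per (key, Group), mirroring .iloc[0] in the original.
    lookup = all_df.drop_duplicates(subset=["key", "Group"])[["key", "Group", "Number"]]
    # A left merge keeps one output row per basket row, in the original order.
    merged = wanted.merge(lookup, on=["key", "Group"], how="left")
    # merge resets the index, so re-attach basket_df's index by position.
    return pd.Series(merged["Number"].to_numpy(), index=basket_df.index)


# Hypothetical sample data, only to show the call.
all_df = pd.DataFrame({
    "key": ["1-2", "1-2", "3-4", "3-4"],
    "Group": [1, 2, 3, 4],
    "Number": [10, 20, 30, 40],
})
basket_df = pd.DataFrame({"key": ["3-4", "1-2", "3-4"]})
basket_df["Number"] = lookup_numbers(basket_df, all_df, "F")
```

The reason this may help is that the apply version scans all of all_df once per basket row, while the merge matches all rows in one pass. How much faster it is will depend on the actual data.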

python

pandas

pycharm

apply

rowwise

0 Answers
