Multiprocessing DataFrame objects

class flockmp.dataframe.DataFrameAsync

classmethod apply(dataframe, function, style='row-like', chunksize=100, poolSize=5)

The original DataFrame is first segmented into chunks; executeAsync() then parallelizes the function's operations over the segmented DataFrames. There are two modes of operation: row-like and block-like.

Parameters:
    dataframe (DataFrame) – Input DataFrame.
    function (func) – Function to be applied to the DataFrame.
    chunksize (int) – Number of chunks the original DataFrame will be split into.
    poolSize (int) – Number of processes in the pool.
    style (str) – If "row-like", `function()` is applied row by row; otherwise it is applied to each `DataFrame` chunk as a whole.
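The chunk-then-parallelize flow described above can be sketched roughly as follows. This is an illustration only, not flockmp's implementation: `split_into_chunks`, `apply_in_chunks`, `n_chunks` and `pool_size` are hypothetical names, and a thread pool from the standard library stands in for the process pool so that the sketch also accepts lambdas. It assumes pandas is installed and that the chunk parameter is a chunk count, as the parameter description suggests.

```python
from multiprocessing.pool import ThreadPool

import pandas as pd


def split_into_chunks(df, n_chunks):
    """Split df into at most n_chunks contiguous pieces (hypothetical helper)."""
    if df.empty:
        return [df]
    n_chunks = max(1, min(n_chunks, len(df)))
    size = -(-len(df) // n_chunks)  # ceiling division: rows per piece
    return [df.iloc[i:i + size] for i in range(0, len(df), size)]


def apply_in_chunks(df, func, style="row-like", n_chunks=4, pool_size=2):
    """Apply func to each chunk concurrently and stitch the results together."""
    chunks = split_into_chunks(df, n_chunks)
    if style == "row-like":
        # func receives one row (a Series) at a time
        def worker(chunk):
            return chunk.apply(func, axis=1)
    else:
        # func receives the whole chunk (a DataFrame)
        worker = func
    # Assumption: a thread pool is swapped in here for a process pool
    with ThreadPool(pool_size) as pool:
        results = pool.map(worker, chunks)  # map keeps the chunk order
    return pd.concat(results)


df = pd.DataFrame({"a": range(10), "b": range(10, 20)})
block = apply_in_chunks(df, lambda x: x ** 2, style="block-like", n_chunks=3)
rows = apply_in_chunks(df, lambda row: row["a"] + row["b"], style="row-like", n_chunks=3)
```

In this sketch, the block-like call hands each chunk to the function as a DataFrame, while the row-like call invokes the function once per row and returns a Series of per-row results.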

Example

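from pandas import DataFrame
from flockmp.dataframe import DataFrameAsync

df = DataFrame({"a": list(range(1000)),
                "b": list(range(1000, 2000))})
# block-like: the lambda receives each DataFrame chunk as a whole
res = DataFrameAsync.apply(df, lambda x: x ** 2, style="block-like")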