networkcommons.utils.handle_missing_values
- networkcommons.utils.handle_missing_values(df, threshold=0.1, fill=<function mean>)
Handles missing values in a DataFrame by filling them with a specified function or value, or dropping the rows.
Parameters:
- df (pandas.DataFrame): The DataFrame containing the data.
- threshold (float): The threshold for the share (0 < n < 1) of missing values in a row. Rows with a share of missing values greater than or equal to the threshold will be dropped.
- fill (callable, int, float, or None): If callable, the function is applied to each row to fill missing values. If an integer or float, it is used to fill missing values. If None, no filling is done.
Returns:
- df (pandas.DataFrame): The DataFrame with missing values handled.
Raises:
- ValueError: If more than one non-numeric column is found in the DataFrame.
Example:
>>> df = pd.DataFrame({'A': [1, 2, np.nan], 'B': [3, 2, np.nan], 'C': [np.nan, 7, 8]})
>>> handle_missing_values(df, 0.5, fill=np.mean)
Number of genes filled: 1
Number of genes removed: 1
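To make the row-wise semantics concrete, the documented behaviour can be approximated with plain pandas. This is a minimal sketch, not the networkcommons implementation: the name `handle_missing_sketch` is hypothetical, it assumes every column is numeric (so the non-numeric column check and the printed counts are left out), and it assumes a callable `fill` receives the non-missing values of each row.

```python
import numpy as np
import pandas as pd


def handle_missing_sketch(df, threshold=0.1, fill=np.mean):
    """Hypothetical approximation of the documented behaviour (all-numeric input assumed)."""
    # Share of missing values in each row (0.0 = complete, 1.0 = entirely missing).
    missing_share = df.isna().mean(axis=1)

    # Drop rows whose share is greater than or equal to the threshold.
    kept = df.loc[missing_share < threshold].copy()

    if fill is None:
        # No filling requested: return the surviving rows unchanged.
        return kept

    if callable(fill):
        # Assumption: the callable is evaluated on each row's non-missing values,
        # and the result fills that row's gaps.
        return kept.apply(lambda row: row.fillna(fill(row.dropna())), axis=1)

    # A scalar (int or float) fills every remaining gap.
    return kept.fillna(fill)


df = pd.DataFrame({"A": [1, 2, np.nan], "B": [3, 2, np.nan], "C": [np.nan, 7, 8]})
result = handle_missing_sketch(df, 0.5, fill=np.mean)
```

With the DataFrame from the example above, the third row (two of three values missing) is dropped, and the gap in the first row is filled with the mean of its other two values, which agrees with the one row filled and one row removed reported there.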