rle_array.array module

class RLEArray(data: numpy.ndarray, positions: numpy.ndarray)

Bases: pandas.core.arrays.base.ExtensionArray

Run-length encoded array.

Parameters

data – Data for each run. Must be one-dimensional. All Pandas-supported dtypes are supported.
positions – End-positions for each run. Must be one-dimensional and must have same length as data. dtype must be POSITIONS_DTYPE.
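
To make the relationship between data and positions concrete, here is a minimal pure-Python sketch of run-length encoding. It is not the library's implementation: rle_encode and rle_decode are hypothetical helpers, and the sketch assumes that each end-position is the exclusive end index of its run, so that the last position equals the length of the decoded array.

```python
from itertools import groupby
from typing import Any, List, Sequence, Tuple


def rle_encode(values: Sequence[Any]) -> Tuple[List[Any], List[int]]:
    """Return (data, positions) for a plain sequence (hypothetical helper)."""
    data: List[Any] = []
    positions: List[int] = []
    end = 0
    for value, run in groupby(values):
        end += sum(1 for _ in run)  # add the length of this run
        data.append(value)
        positions.append(end)  # assumed: exclusive end index of the run
    return data, positions


def rle_decode(data: Sequence[Any], positions: Sequence[int]) -> List[Any]:
    """Expand (data, positions) back into the full sequence."""
    if len(data) != len(positions):
        raise ValueError("data and positions must have the same length")
    out: List[Any] = []
    start = 0
    for value, end in zip(data, positions):
        out.extend([value] * (end - start))
        start = end
    return out


data, positions = rle_encode([7, 7, 7, 2, 2, 9])
decoded = rle_decode(data, positions)
```

In this sketch the six input values collapse into three runs, which is where the memory savings of a run-length encoded array come from when values repeat.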

all(axis: Optional[int] = 0, out: Optional[Any] = None) → bool

any(axis: Optional[int] = 0, out: Optional[Any] = None) → bool

astype(dtype: Any, copy: bool = True, casting: str = 'unsafe') → Any

Cast to a NumPy array with ‘dtype’.

Parameters

dtype (str or dtype) – Typecode or data-type to which the array is cast.
copy (bool, default True) – Whether to copy the data, even if not necessary. If False, a copy is made only if the old dtype does not match the new dtype.

Returns

array – NumPy ndarray with ‘dtype’ for its dtype.

Return type

ndarray

copy() → rle_array.array.RLEArray

Return a copy of the array.

Return type: ExtensionArray

dropna() → rle_array.array.RLEArray

Return ExtensionArray without NA values.

Returns: valid
Return type: ExtensionArray

property dtype: An instance of ‘ExtensionDtype’.

factorize(na_sentinel: int = -1) → Tuple[numpy.ndarray, rle_array.array.RLEArray]

Encode the extension array as an enumerated type.

Parameters

na_sentinel (int, default -1) – Value to use in the codes array to indicate missing values.

Returns

codes (ndarray) – An integer NumPy array that’s an indexer into the original ExtensionArray.
uniques (ExtensionArray) – An ExtensionArray containing the unique values of self.

Note

uniques will not contain an entry for the NA value of the ExtensionArray if there are any missing values present in self.

See also

factorize: Top-level factorize method that dispatches here.

Notes

pandas.factorize() offers a sort keyword as well.
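
The relationship between codes, uniques, and na_sentinel can be illustrated with a small pure-Python sketch. This is only an illustration of the contract described above, not the RLEArray implementation: factorize_sketch is a hypothetical helper, None stands in for the NA value, and uniques are assumed to be reported in order of first appearance.

```python
from typing import Any, Dict, Hashable, List, Optional, Sequence, Tuple


def factorize_sketch(
    values: Sequence[Optional[Hashable]], na_sentinel: int = -1
) -> Tuple[List[int], List[Any]]:
    """Return (codes, uniques); None plays the role of the NA value."""
    index_of: Dict[Hashable, int] = {}
    codes: List[int] = []
    uniques: List[Any] = []
    for value in values:
        if value is None:
            # Missing values get the sentinel and never enter uniques.
            codes.append(na_sentinel)
            continue
        if value not in index_of:
            index_of[value] = len(uniques)
            uniques.append(value)
        codes.append(index_of[value])
    return codes, uniques


codes, uniques = factorize_sketch(["b", "b", None, "a", "b"])
# Every non-sentinel code indexes back into uniques.
restored = [uniques[c] if c != -1 else None for c in codes]
```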

fillna(value: Optional[Any] = None, method: Optional[str] = None, limit: Optional[int] = None) → rle_array.array.RLEArray

Fill NA/NaN values using the specified method.

Parameters

value (scalar, array-like) – If a scalar value is passed it is used to fill all missing values. Alternatively, an array-like ‘value’ can be given. It’s expected that the array-like have the same length as ‘self’.
method ({'backfill', 'bfill', 'pad', 'ffill', None}, default None) –
Method to use for filling holes in a reindexed Series.
- pad / ffill: propagate the last valid observation forward to the next valid one.
- backfill / bfill: use the next valid observation to fill the gap.
limit (int, default None) – If method is specified, this is the maximum number of consecutive NaN values to forward/backward fill. In other words, if there is a gap with more than this number of consecutive NaNs, it will only be partially filled. If method is not specified, this is the maximum number of entries along the entire axis where NaNs will be filled.

Returns

With NA/NaN filled.

Return type

ExtensionArray
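
How method and limit interact is easiest to see on a small example. The following pure-Python sketch illustrates the documented semantics only and is not how RLEArray implements fillna: ffill_sketch and bfill_sketch are hypothetical helpers, and None stands in for NA.

```python
from typing import Any, List, Optional, Sequence


def ffill_sketch(values: Sequence[Any], limit: Optional[int] = None) -> List[Any]:
    """Forward-fill None entries, filling at most `limit` per gap."""
    out: List[Any] = []
    last_valid: Any = None
    have_valid = False
    filled_in_gap = 0
    for value in values:
        if value is not None:
            last_valid, have_valid, filled_in_gap = value, True, 0
            out.append(value)
        elif have_valid and (limit is None or filled_in_gap < limit):
            filled_in_gap += 1
            out.append(last_valid)
        else:
            # Leading gap, or the limit for this gap is exhausted.
            out.append(None)
    return out


def bfill_sketch(values: Sequence[Any], limit: Optional[int] = None) -> List[Any]:
    """Backward fill is a forward fill over the reversed sequence."""
    return ffill_sketch(list(values)[::-1], limit)[::-1]


forward = ffill_sketch([1, None, None, None, 4, None], limit=2)
backward = bfill_sketch([None, None, None, 4], limit=1)
```

With limit=2, the three-element gap after 1 is only partially filled, which matches the "partially filled" wording in the limit description.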

isna() → rle_array.array.RLEArray

A 1-D array indicating if each value is missing.

Returns: na_values – In most cases, this should return a NumPy ndarray. For exceptional cases like SparseArray, where returning an ndarray would be expensive, an ExtensionArray may be returned.
Return type: Union[np.ndarray, ExtensionArray]

Notes

If returning an ExtensionArray, then

na_values._is_boolean should be True
na_values should implement ExtensionArray._reduce()
na_values.any and na_values.all should be implemented

kurt(skipna: bool = True) → Any

max(skipna: bool = True, axis: Optional[int] = 0, out: Optional[Any] = None) → Any

mean(skipna: bool = True, dtype: Optional[Any] = None, axis: Optional[int] = 0, out: Optional[Any] = None) → Any

median(skipna: bool = True, axis: Optional[int] = 0, out: Optional[Any] = None) → Any

min(skipna: bool = True, axis: Optional[int] = 0, out: Optional[Any] = None) → Any

property nbytes: The number of bytes needed to store this object in memory.

prod(skipna: bool = True, axis: Optional[int] = 0, out: Optional[Any] = None) → Any

round(decimals: int = 0) → rle_array.array.RLEArray

shift(periods: int = 1, fill_value: Optional[object] = None) → rle_array.array.RLEArray

Shift values by desired number.

Newly introduced missing values are filled with self.dtype.na_value.

New in version 0.24.0.

Parameters

periods (int, default 1) – The number of periods to shift. Negative values are allowed for shifting backwards.
fill_value (object, optional) –
The scalar value to use for newly introduced missing values. The default is self.dtype.na_value.

New in version 0.24.0.

Returns

Shifted.

Return type

ExtensionArray

Notes

If self is empty or periods is 0, a copy of self is returned.

If periods > len(self), then an array of size len(self) is returned, with all values filled with self.dtype.na_value.
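
The edge cases in these notes can be checked against a short pure-Python sketch. It mirrors the documented behaviour on a plain list and is not the RLEArray implementation: shift_sketch is a hypothetical helper, and None stands in for self.dtype.na_value.

```python
from typing import Any, List, Sequence


def shift_sketch(
    values: Sequence[Any], periods: int = 1, fill_value: Any = None
) -> List[Any]:
    """Shift values by `periods`, padding the vacated slots with fill_value."""
    n = len(values)
    if n == 0 or periods == 0:
        return list(values)  # a copy, as the notes describe
    k = min(abs(periods), n)  # never pad more than the array length
    pad = [fill_value] * k
    if periods > 0:
        return pad + list(values[: n - k])
    return list(values[k:]) + pad


forward = shift_sketch([1, 2, 3, 4], 1)
backward = shift_sketch([1, 2, 3, 4], -2, fill_value=0)
overshoot = shift_sketch([1, 2], 5)
```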

skew(skipna: bool = True) → Any

std(skipna: bool = True, ddof: int = 1, dtype: Optional[Any] = None, axis: Optional[int] = 0, out: Optional[Any] = None) → Any

sum(skipna: bool = True, axis: Optional[int] = 0, out: Optional[Any] = None) → Any

take(indices: Sequence[int], allow_fill: bool = False, fill_value: Optional[Any] = None) → rle_array.array.RLEArray

Take elements from an array.

Parameters

indices (sequence of int) – Indices to be taken.
allow_fill (bool, default False) –
How to handle negative values in indices.
- False: negative values in indices indicate positional indices from the right (the default). This is similar to numpy.take().
- True: -1 in indices indicates a missing value, which is set to fill_value. Any other negative value raises a ValueError.
fill_value (any, optional) –
Fill value to use for NA-indices when allow_fill is True. This may be None, in which case the default NA value for the type, self.dtype.na_value, is used.

For many ExtensionArrays, there will be two representations of fill_value: a user-facing “boxed” scalar, and a low-level physical NA value. fill_value should be the user-facing version, and the implementation should handle translating that to the physical version for processing the take if necessary.

Return type

ExtensionArray

Raises

IndexError – When the indices are out of bounds for the array.
ValueError – When indices contains negative values other than -1 and allow_fill is True.

See also

numpy.take, api.extensions.take

Notes

ExtensionArray.take is called by Series.__getitem__, .loc, and .iloc when indices is a sequence of values. Additionally, it’s called by Series.reindex(), or any other method that causes realignment, with a fill_value.

Examples

Here’s an example implementation, which relies on casting the extension array to object dtype. This uses the helper method pandas.api.extensions.take().

def take(self, indices, allow_fill=False, fill_value=None):
    from pandas.api.extensions import take

    # If the ExtensionArray is backed by an ndarray, then
    # just pass that here instead of coercing to object.
    data = self.astype(object)

    if allow_fill and fill_value is None:
        fill_value = self.dtype.na_value

    # fill value should always be translated from the scalar
    # type for the array, to the physical storage type for
    # the data, before passing to take.

    result = take(data, indices, fill_value=fill_value,
                  allow_fill=allow_fill)
    return self._from_sequence(result, dtype=self.dtype)
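
The two allow_fill modes and the documented exceptions can also be illustrated without pandas. The sketch below works on a plain list and is only meant to show the index rules: take_sketch is a hypothetical helper, not part of rle_array or pandas.

```python
from typing import Any, List, Sequence


def take_sketch(
    values: Sequence[Any],
    indices: Sequence[int],
    allow_fill: bool = False,
    fill_value: Any = None,
) -> List[Any]:
    """Pick values by position, following the allow_fill rules above."""
    n = len(values)
    out: List[Any] = []
    for i in indices:
        if allow_fill:
            if i < -1:
                raise ValueError("only -1 may mark a missing value")
            if i == -1:
                out.append(fill_value)  # -1 means "missing", not "last"
                continue
        if not -n <= i < n:
            raise IndexError("index out of bounds")
        out.append(values[i])  # negative i counts from the right here
    return out


from_right = take_sketch([10, 20, 30], [0, -1])
with_fill = take_sketch([10, 20, 30], [0, -1], allow_fill=True, fill_value=0)
```

The same index list yields different results in the two modes: -1 selects the last element when allow_fill is False and produces fill_value when it is True.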

unique() → rle_array.array.RLEArray

Compute the ExtensionArray of unique values.

Returns: uniques
Return type: ExtensionArray

value_counts(dropna: bool = True) → pandas.core.series.Series

var(skipna: bool = True, ddof: int = 1, dtype: Optional[Any] = None, axis: Optional[int] = 0, out: Optional[Any] = None) → Any

view(dtype: Optional[Any] = None) → Any

Return a view on the array.

Parameters: dtype (str, np.dtype, or ExtensionDtype, optional) – Default None.
Returns: A view on the ExtensionArray’s data.
Return type: ExtensionArray or np.ndarray
