rle_array.array module
-
class RLEArray(data: numpy.ndarray, positions: numpy.ndarray)
Bases: pandas.core.arrays.base.ExtensionArray
Run-length encoded array.
- Parameters
data – Data for each run. Must be one-dimensional. All pandas-supported dtypes are supported.
positions – End positions for each run. Must be one-dimensional and have the same length as data. Its dtype must be POSITIONS_DTYPE.
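The relationship between data and positions can be sketched with plain NumPy. This is an illustration only, not the library's implementation: the helper names rle_encode and rle_decode are hypothetical, and it assumes that an end position is the exclusive index at which a run stops (the cumulative run length) and that int64 is a suitable positions dtype.

```python
import numpy as np


def rle_encode(dense):
    """Hypothetical helper: split a 1-D array into run values and end positions."""
    dense = np.asarray(dense)
    if dense.size == 0:
        return dense.copy(), np.array([], dtype=np.int64)
    # A run ends wherever the next element differs from the current one.
    change = dense[1:] != dense[:-1]
    ends = np.flatnonzero(change) + 1
    # The last element always closes the final run.
    positions = np.append(ends, dense.size).astype(np.int64)
    # The value of each run is the element just before its end position.
    data = dense[positions - 1]
    return data, positions


def rle_decode(data, positions):
    """Hypothetical helper: expand run values and end positions to a dense array."""
    # Run lengths are the differences between consecutive end positions.
    lengths = np.diff(positions, prepend=0)
    return np.repeat(data, lengths)


data, positions = rle_encode([1, 1, 2, 2, 2, 3])
dense = rle_decode(data, positions)
```

Under these assumptions both arrays have one entry per run, which is why the constructor requires them to have the same length.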
-
astype(dtype: Any, copy: bool = True, casting: str = 'unsafe') → Any
Cast to a NumPy array with ‘dtype’.
- Parameters
dtype (str or dtype) – Typecode or data-type to which the array is cast.
copy (bool, default True) – Whether to copy the data, even if not necessary. If False, a copy is made only if the old dtype does not match the new dtype.
- Returns
array – NumPy ndarray with ‘dtype’ for its dtype.
- Return type
-
copy() → rle_array.array.RLEArray
Return a copy of the array.
- Returns
- Return type
-
dropna() → rle_array.array.RLEArray
Return ExtensionArray without NA values.
- Returns
valid
- Return type
-
property dtype
An instance of ‘ExtensionDtype’.
-
factorize(na_sentinel: int = -1) → Tuple[numpy.ndarray, rle_array.array.RLEArray]
Encode the extension array as an enumerated type.
- Parameters
na_sentinel (int, default -1) – Value to use in the codes array to indicate missing values.
- Returns
codes (ndarray) – An integer NumPy array that’s an indexer into the original ExtensionArray.
uniques (ExtensionArray) – An ExtensionArray containing the unique values of self.
Note
uniques will not contain an entry for the NA value of the ExtensionArray if there are any missing values present in self.
See also
factorize
Top-level factorize method that dispatches here.
Notes
pandas.factorize()
offers a sort keyword as well.
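The codes/uniques contract can be illustrated with the top-level pandas.factorize() on a plain object array. This is a sketch on a NumPy array rather than an RLEArray, and it relies on the default sentinel of -1 for missing values.

```python
import numpy as np
import pandas as pd

values = np.array(["b", "a", None, "b"], dtype=object)

# codes index into uniques; the missing value gets the sentinel -1
# and does not appear in uniques.
codes, uniques = pd.factorize(values)

# Non-missing entries can be rebuilt by indexing uniques with the valid codes.
restored = uniques[codes[codes >= 0]]
```

Here uniques keeps the order of first appearance, since no sort is requested.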
-
fillna(value: Optional[Any] = None, method: Optional[str] = None, limit: Optional[int] = None) → rle_array.array.RLEArray
Fill NA/NaN values using the specified method.
- Parameters
value (scalar, array-like) – If a scalar value is passed it is used to fill all missing values. Alternatively, an array-like ‘value’ can be given. It’s expected that the array-like have the same length as ‘self’.
method ({'backfill', 'bfill', 'pad', 'ffill', None}, default None) – Method to use for filling holes in reindexed Series. pad / ffill: propagate the last valid observation forward to the next valid one. backfill / bfill: use the next valid observation to fill the gap.
limit (int, default None) – If method is specified, this is the maximum number of consecutive NaN values to forward/backward fill. In other words, if there is a gap with more than this number of consecutive NaNs, it will only be partially filled. If method is not specified, this is the maximum number of entries along the entire axis where NaNs will be filled.
- Returns
With NA/NaN filled.
- Return type
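The two meanings of limit can be illustrated on an ordinary pandas Series. This is a sketch of the documented semantics, not of RLEArray itself, and it assumes the extension array follows the same rules as Series.ffill and Series.fillna.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, np.nan, np.nan, 5.0])

# With a fill method, limit caps the number of consecutive NaNs filled,
# so a gap of three is only partially filled.
forward = s.ffill(limit=1)

# With a scalar value and no method, limit caps the total number of
# NaNs filled along the axis.
by_value = s.fillna(0.0, limit=2)
```

In both results the remaining gap entries stay NaN.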
-
isna() → rle_array.array.RLEArray
A 1-D array indicating if each value is missing.
- Returns
na_values – In most cases, this should return a NumPy ndarray. For exceptional cases like SparseArray, where returning an ndarray would be expensive, an ExtensionArray may be returned.
- Return type
Notes
If returning an ExtensionArray, then:
na_values._is_boolean should be True
na_values should implement ExtensionArray._reduce()
na_values.any and na_values.all should be implemented
-
mean(skipna: bool = True, dtype: Optional[Any] = None, axis: Optional[int] = 0, out: Optional[Any] = None) → Any
-
property nbytes
The number of bytes needed to store this object in memory.
-
round(decimals: int = 0) → rle_array.array.RLEArray
-
shift(periods: int = 1, fill_value: Optional[object] = None) → rle_array.array.RLEArray
Shift values by desired number.
Newly introduced missing values are filled with self.dtype.na_value.
New in version 0.24.0.
- Parameters
- Returns
Shifted.
- Return type
Notes
If self is empty or periods is 0, a copy of self is returned.
If periods > len(self), then an array of size len(self) is returned, with all values filled with self.dtype.na_value.
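These edge cases can be observed on a built-in pandas extension array. The sketch below uses a nullable "Int64" array as a stand-in, on the assumption that RLEArray follows the same ExtensionArray.shift contract.

```python
import pandas as pd

arr = pd.array([1, 2, 3], dtype="Int64")

# Shifting by one introduces a single missing value at the front.
shifted = arr.shift(1)

# periods == 0 returns a copy rather than the same object.
unchanged = arr.shift(0)

# periods > len(arr) yields an all-NA array of the original length.
all_na = arr.shift(5)
```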
-
std(skipna: bool = True, ddof: int = 1, dtype: Optional[Any] = None, axis: Optional[int] = 0, out: Optional[Any] = None) → Any
-
take(indices: Sequence[int], allow_fill: bool = False, fill_value: Optional[Any] = None) → rle_array.array.RLEArray
Take elements from an array.
- Parameters
indices (sequence of int) – Indices to be taken.
allow_fill (bool, default False) – How to handle negative values in indices.
False: negative values in indices indicate positional indices from the right (the default). This is similar to numpy.take().
True: -1 in indices indicates a missing value, which is set to fill_value. Any other negative value raises a ValueError.
fill_value (any, optional) – Fill value to use for NA-indices when allow_fill is True. This may be None, in which case the default NA value for the type, self.dtype.na_value, is used.
For many ExtensionArrays, there will be two representations of fill_value: a user-facing “boxed” scalar, and a low-level physical NA value. fill_value should be the user-facing version, and the implementation should handle translating that to the physical version for processing the take if necessary.
- Returns
- Return type
- Raises
IndexError – When the indices are out of bounds for the array.
ValueError – When indices contains negative values other than -1 and allow_fill is True.
See also
Notes
ExtensionArray.take is called by Series.__getitem__, .loc, and .iloc when indices is a sequence of values. Additionally, it’s called by Series.reindex(), or any other method that causes realignment, with a fill_value.
Examples
Here’s an example implementation, which relies on casting the extension array to object dtype. This uses the helper method pandas.api.extensions.take().

    def take(self, indices, allow_fill=False, fill_value=None):
        from pandas.api.extensions import take

        # If the ExtensionArray is backed by an ndarray, then
        # just pass that here instead of coercing to object.
        data = self.astype(object)

        if allow_fill and fill_value is None:
            fill_value = self.dtype.na_value

        # fill value should always be translated from the scalar
        # type for the array, to the physical storage type for
        # the data, before passing to take.
        result = take(data, indices, fill_value=fill_value,
                      allow_fill=allow_fill)
        return self._from_sequence(result, dtype=self.dtype)
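From the caller's side, the effect of allow_fill can be sketched with the same pandas.api.extensions.take() helper applied to a plain NumPy array. This illustrates the documented index semantics only and does not exercise RLEArray.

```python
import numpy as np
from pandas.api.extensions import take

arr = np.array([10, 20, 30])

# allow_fill=False (default): -1 counts from the right, as in numpy.take().
from_right = take(arr, [0, -1])

# allow_fill=True: -1 marks a missing value; with no fill_value the
# result is upcast so that it can hold NaN.
with_na = take(arr, [0, -1], allow_fill=True)

# An explicit fill_value replaces the missing entries instead.
with_fill = take(arr, [0, -1], allow_fill=True, fill_value=-10)

# Any negative index other than -1 is rejected when allow_fill=True.
try:
    take(arr, [0, -2], allow_fill=True)
    raised = False
except ValueError:
    raised = True
```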
-
unique() → rle_array.array.RLEArray
Compute the ExtensionArray of unique values.
- Returns
uniques
- Return type
-
value_counts(dropna: bool = True) → pandas.core.series.Series
-
var(skipna: bool = True, ddof: int = 1, dtype: Optional[Any] = None, axis: Optional[int] = 0, out: Optional[Any] = None) → Any
-
view(dtype: Optional[Any] = None) → Any
Return a view on the array.
- Parameters
dtype (str, np.dtype, or ExtensionDtype, optional) – Default None.
- Returns
A view on the ExtensionArray’s data.
- Return type