Metadata-Version: 2.1
Name: aliases
Version: 0.5.5
Summary: Keeping track of aliases
Project-URL: Homepage, https://github.com/KoenBaak/aliases
Author-email: Koen Baak <koenbaak@gmail.com>
License-File: LICENSE
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.10
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Software Development :: Libraries :: Application Frameworks
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.10
Provides-Extra: doc
Provides-Extra: test
Description-Content-Type: text/x-rst

Aliases
=======
A very small Python package for keeping track of aliases.

Installation
------------

    $ pip install aliases

Getting Started
---------------
Keeping track of aliases in your data can be annoying. This small packages
provides three small classes than can help you in the bookkeeping associated
with the occurrences of aliases in your data.

The ``AliasSpace`` objects keeps track of existing aliases. As input is accepts
a dictionary where a string (the "preferred" form) points to a list of all its
aliases. Using the ``str`` method on the space, we can transform regular strings
into ``AliasAwareString`` objects. 

.. code-block:: python

    >>> s = AliasSpace(
    >>>     {
    >>>         "The Netherlands": ["NL", "Netherlands", "Holland"],
    >>>         "The Hague": ["Den Haag", "'s-Gravenhage"],
    >>>         "Amsterdam": ["Adam"],
    >>>     },
    >>>     case_sensitive=False,
    >>> )
    >>>
    >>> s.str("nl")
    <'nl' in AliasSpace>


The preferred form of an ``AliasAwareString`` is called its *representative*
(because it represents the equivalance class of the string under the equivalance
relation of being aliases).

.. code-block:: python

    >>> s.str("nl").representative
    'The Netherlands'

``AliasAwareString`` objects with the same representative are considered equal
and have the same hash.

.. code-block:: python

    >>> s.str("holland") == s.str("NL")
    True
    >>>
    >>> data = {s.str("holland"): 12345}
    >>> data[s.str("nl")]
    12345

The example above already shows how alias aware strings can be used to store
data without worrying too much about the different aliases around. However, it
is still annoying to cast to an ``AliasAwareString`` every time manually. To
solve this you can use the ``AliasAwareDict``. This object can be created using
the ``dict`` method on the space.

.. code-block:: python

    >>> data = s.dict(holland=12345)
    >>> data['nl']
    12345

Finally, when you have ``pandas`` installed, the ``aliases`` package will
register accessors for series and dataframes. This allows you to easily enforce
aliases in your pandas DataFrame. The following example was the original
motivation for building this package:

.. code-block:: python

    >>> import pandas as pd
    >>> df = pd.DataFrame(
    >>>     {
    >>>         "Country": ["NL", "Netherlands", "Belgium"],
    >>>         "City": ["Den Haag", "amsterdam", "Brussel"],
    >>>         "SomeData": [10, 11, 12],
    >>>     }
    >>> )
    >>> df
           Country       City  SomeData
    0           NL   Den Haag        10
    1  Netherlands  amsterdam        11
    2      Belgium    Brussel        12
    >>>
    >>> df.Country.aliases.representative(space=s)
    0    The Netherlands
    1    The Netherlands
    2            Belgium
    Name: Country, dtype: object
    >>>
    >>> df.aliases.representative(space=s, missing=pd.NA)
               Country       City  SomeData
    0  The Netherlands  The Hague        10
    1  The Netherlands  Amsterdam        11
    2             <NA>       <NA>        12

Documentation
-------------
Coming soon...