lists.arthurdejong.org

python-stdnum branch master updated. 1.8.1-12-g647dfea




This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "python-stdnum".

The branch, master has been updated
       via  647dfeab91847d4d8d9e3bba486082821b887dd3 (commit)
      from  6e30cf59a225459485f380d5c03f27b9b18fca98 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
https://arthurdejong.org/git/python-stdnum/commit/?id=647dfeab91847d4d8d9e3bba486082821b887dd3

commit 647dfeab91847d4d8d9e3bba486082821b887dd3
Author: Arthur de Jong <arthur@arthurdejong.org>
Date:   Sun Mar 11 15:01:08 2018 +0100

    Add German Steuernummer
    
    Based on the implementation provided by Mohammed Salman of Holvi.
    
    This is the old tax number that is being replaced by the Steuerliche
    Identifikationsnummer. The number has a regional form (which is used
    most often) and a national form.
    
    Closes https://github.com/arthurdejong/python-stdnum/pull/49

diff --git a/stdnum/de/stnr.py b/stdnum/de/stnr.py
new file mode 100644
index 0000000..a9b2026
--- /dev/null
+++ b/stdnum/de/stnr.py
@@ -0,0 +1,199 @@
+# stnr.py - functions for handling German tax numbers
+# coding: utf-8
+#
+# Copyright (C) 2017 Holvi Payment Services
+# Copyright (C) 2018 Arthur de Jong
+#
+# This library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2.1 of the License, or (at your option) any later version.
+#
+# This library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with this library; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+# 02110-1301 USA
+
+"""St.-Nr. (Steuernummer, German tax number).
+
+The Steuernummer (St.-Nr.) is a tax number assigned by regional tax offices
+to taxable individuals and organisations. The number is being replaced by the
+Steuerliche Identifikationsnummer (IdNr).
+
+The number has 10 or 11 digits for the regional form (per Bundesland) and 13
+digits for the number that is unique within Germany. The number consists of
+(part of) the Bundesfinanzamtsnummer (BUFA-Nr.), a district number, a serial
+number and a check digit.
+
+More information:
+
+* https://de.wikipedia.org/wiki/Steuernummer
+
+>>> validate(' 181/815/0815 5')
+'18181508155'
+>>> validate('201/123/12340', 'Sachsen')
+'20112312340'
+>>> validate('4151081508156', 'Thuringen')
+'4151081508156'
+>>> validate('4151181508156', 'Thuringen')
+Traceback (most recent call last):
+    ...
+InvalidFormat: ...
+>>> validate('136695978')
+Traceback (most recent call last):
+    ...
+InvalidLength: ...
+"""
+
+import re
+
+from stdnum.exceptions import *
+from stdnum.util import clean
+
+
+# The number formats per region (regional and country-wide format)
+_number_formats_per_region = {
+    'Baden-Württemberg': ['FFBBBUUUUP', '28FF0BBBUUUUP'],
+    'Bayern': ['FFFBBBUUUUP', '9FFF0BBBUUUUP'],
+    'Berlin': ['FFBBBUUUUP', '11FF0BBBUUUUP'],
+    'Brandenburg': ['0FFBBBUUUUP', '30FF0BBBUUUUP'],
+    'Bremen': ['FFBBBUUUUP', '24FF0BBBUUUUP'],
+    'Hamburg': ['FFBBBUUUUP', '22FF0BBBUUUUP'],
+    'Hessen': ['0FFBBBUUUUP', '26FF0BBBUUUUP'],
+    'Mecklenburg-Vorpommern': ['0FFBBBUUUUP', '40FF0BBBUUUUP'],
+    'Niedersachsen': ['FFBBBUUUUP', '23FF0BBBUUUUP'],
+    'Nordrhein-Westfalen': ['FFFBBBBUUUP', '5FFF0BBBBUUUP'],
+    'Rheinland-Pfalz': ['FFBBBUUUUP', '27FF0BBBUUUUP'],
+    'Saarland': ['0FFBBBUUUUP', '10FF0BBBUUUUP'],
+    'Sachsen': ['2FFBBBUUUUP', '32FF0BBBUUUUP'],
+    'Sachsen-Anhalt': ['1FFBBBUUUUP', '31FF0BBBUUUUP'],
+    'Schleswig-Holstein': ['FFBBBUUUUP', '21FF0BBBUUUUP'],
+    'Thüringen': ['1FFBBBUUUUP', '41FF0BBBUUUUP'],
+}
+
+REGIONS = sorted(_number_formats_per_region.keys())
+"""Valid regions recognised by this module."""
+
+
+def _clean_region(region):
+    """Convert the region name to a key for comparison that is immune to
+    encoding issues; "u" is skipped so that "u" and "ü" variants match."""
+    return ''.join(
+        x for x in region.lower()
+        if x in 'abcdefghijklmnopqrstvwxyz')
+
+
+class _Format(object):
+
+    def __init__(self, fmt):
+        self._fmt = fmt
+        self._re = re.compile('^%s$' % re.sub(
+            r'([FBUP])\1*',
+            lambda x: r'(\d{%d})' % len(x.group(0)), fmt))
+
+    def match(self, number):
+        return self._re.match(number)
+
+    def replace(self, f, b, u, p):
+        items = iter([f, b, u, p])
+        return re.sub(r'([FBUP])\1*', lambda x: next(items), self._fmt)
+
+
+# Convert the structure to something that we can easily use
+_number_formats_per_region = dict(
+    (_clean_region(region), [
+        region, _Format(formats[0]), _Format(formats[1])])
+    for region, formats in _number_formats_per_region.items())
+
+
+def _get_formats(region=None):
+    """Return the formats for the region."""
+    if region:
+        region = _clean_region(region)
+        if region not in _number_formats_per_region:
+            raise InvalidComponent()
+        return [_number_formats_per_region[region]]
+    return _number_formats_per_region.values()
+
+
+def compact(number):
+    """Convert the number to the minimal representation. This strips the
+    number of any valid separators and removes surrounding whitespace."""
+    return clean(number, ' -./,').strip()
+
+
+def validate(number, region=None):
+    """Check if the number is a valid tax number. This checks the length and
+    formatting. The region can be supplied to verify that the number is
+    assigned in that region."""
+    number = compact(number)
+    if not number.isdigit():
+        raise InvalidFormat()
+    if len(number) not in (10, 11, 13):
+        raise InvalidLength()
+    if not any(region_fmt.match(number) or country_fmt.match(number)
+               for region, region_fmt, country_fmt in _get_formats(region)):
+        raise InvalidFormat()
+    return number
+
+
+def is_valid(number, region=None):
+    """Check if the number is a valid tax number. This checks the length and
+    formatting. The region can be supplied to verify that the number is
+    assigned in that region."""
+    try:
+        return bool(validate(number, region))
+    except ValidationError:
+        return False
+
+
+def guess_regions(number):
+    """Return a list of regions this number is valid for."""
+    number = compact(number)
+    return sorted(
+        region for region, region_fmt, country_fmt in _get_formats()
+        if region_fmt.match(number) or country_fmt.match(number))
+
+
+def to_regional_number(number):
+    """Convert the number to a regional (10 or 11 digit) number."""
+    number = compact(number)
+    for region, region_fmt, country_fmt in _get_formats():
+        m = country_fmt.match(number)
+        if m:
+            return region_fmt.replace(*m.groups())
+    raise InvalidFormat()
+
+
+def to_country_number(number, region=None):
+    """Convert the number to the nationally unique number. The region is
+    needed if the number is valid in more than one region."""
+    number = compact(number)
+    formats = (
+        (region_fmt.match(number), country_fmt)
+        for region, region_fmt, country_fmt in _get_formats(region))
+    formats = [
+        (region_match, country_fmt)
+        for region_match, country_fmt in formats
+        if region_match]
+    if not formats:
+        raise InvalidFormat()
+    if len(formats) != 1:
+        raise InvalidComponent()
+    return formats[0][1].replace(*formats[0][0].groups())
+
+
+def format(number, region=None):
+    """Reformat the passed number to the standard format."""
+    number = compact(number)
+    for region, region_fmt, country_fmt in _get_formats(region):
+        m = region_fmt.match(number)
+        if m:
+            f, b, u, p = m.groups()
+            return region_fmt.replace(f + '/', b + '/', u, p)
+    return number
diff --git a/tests/test_de_stnr.doctest b/tests/test_de_stnr.doctest
new file mode 100644
index 0000000..7996c34
--- /dev/null
+++ b/tests/test_de_stnr.doctest
@@ -0,0 +1,193 @@
+test_de_stnr.doctest - more detailed doctests for the stdnum.de.stnr module
+
+Copyright (C) 2017 Holvi Payment Services
+Copyright (C) 2018 Arthur de Jong
+
+This library is free software; you can redistribute it and/or
+modify it under the terms of the GNU Lesser General Public
+License as published by the Free Software Foundation; either
+version 2.1 of the License, or (at your option) any later version.
+
+This library is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+Lesser General Public License for more details.
+
+You should have received a copy of the GNU Lesser General Public
+License along with this library; if not, write to the Free Software
+Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+02110-1301 USA
+
+
+This file contains more detailed doctests for the stdnum.de.stnr module. It
+validates a collection of numbers that have been found online.
+
+>>> from stdnum.de import stnr
+
+
+Some simple tests.
+
+>>> stnr.validate('1123456789')
+'1123456789'
+>>> stnr.validate('1123456789', 'Berlin')
+'1123456789'
+>>> stnr.validate('12234567890', 'Berlin')
+Traceback (most recent call last):
+    ...
+InvalidFormat: ...
+>>> stnr.validate('1123456789', 'Unknown region')
+Traceback (most recent call last):
+    ...
+InvalidComponent: ...
+>>> stnr.validate('1234567890')  # 10-digit number
+'1234567890'
+>>> stnr.validate('12345678901')  # 11-digit number
+'12345678901'
+>>> stnr.validate('1123045678901')  # 13-digit number
+'1123045678901'
+>>> stnr.validate('123456789')  # short number
+Traceback (most recent call last):
+    ...
+InvalidLength: ...
+
+
+The module should handle various encodings of region names properly.
+
+>>> stnr.validate('9381508152', u'Baden-W\xfcrttemberg')  # Python unicode
+'9381508152'
+>>> stnr.validate('9381508152', 'Baden-W\xc3\xbcrttemberg')  # UTF-8
+'9381508152'
+>>> stnr.validate('9381508152', 'Baden-W\xfcrttemberg')  # ISO-8859-15
+'9381508152'
+>>> stnr.validate('9381508152', 'Baden Wurttemberg')  # ASCII with space
+'9381508152'
+
+
+Given a number we are able to find the regions it is valid for.
+
+>>> stnr.guess_regions('1123045678901')  # 13-digit number
+['Berlin']
+>>> stnr.guess_regions('98765432101')  # 11-digit number
+['Bayern', 'Nordrhein-Westfalen']
+>>> stnr.guess_regions('123')  # invalid number
+[]
+
+
+We can convert the 13-digit country number to a regional number without
+issues. We can also convert it back if we know the region.
+
+>>> stnr.guess_regions('2475081508152')
+['Bremen']
+>>> stnr.to_regional_number('2475081508152')
+'7581508152'
+>>> stnr.validate('7581508152', 'Bremen')
+'7581508152'
+>>> stnr.to_regional_number('123')
+Traceback (most recent call last):
+    ...
+InvalidFormat: ...
+>>> stnr.to_country_number('7581508152', 'Bremen')
+'2475081508152'
+>>> stnr.to_country_number('7581508152')  # not unique, need region
+Traceback (most recent call last):
+    ...
+InvalidComponent: ...
+>>> stnr.to_country_number('123')
+Traceback (most recent call last):
+    ...
+InvalidFormat: ...
+
+
+We can also format numbers by separating the groups with slashes. This is
+most often seen for regional numbers and the 13-digit numbers don't get the
+slashes.
+
+>>> stnr.format('18181508155', 'Bayern')
+'181/815/08155'
+>>> stnr.format('18181508155', 'Nordrhein-Westfalen')
+'181/8150/8155'
+>>> stnr.format('2181508150')
+'21/815/08150'
+>>> stnr.format('156 / 141 / 14808', 'Thuringen')
+'156/141/14808'
+>>> stnr.format('2893081508152')  # 13-digit number
+'2893081508152'
+>>> stnr.format('123')  # unknown format
+'123'
+
+
+These have been found online and should all be valid numbers.
+
+>>> numbers = '''
+...
+... 010/815/08182
+... 013 815 08153
+... 02/815/08156
+... 04 522 658 002
+... 042/213/02423
+... 048/815/08155
+... 079/815/08151
+... 101/5761/1744
+... 101/815/08154
+... 1010081508182
+... 1121081508150
+... 116/5701/1448
+... 123/456/7890
+... 133/5909/3295
+... 133/8150/8159
+... 14044/00050
+... 143/317/22090
+... 147/276/80579
+... 151/815/08156
+... 156 / 141 / 14808
+... 162/107/03482
+... 181/815/08155
+... 1929008636
+... 201/123/12340
+... 201/5902/3626
+... 201/5906/3686
+... 202/ 106/ 08312
+... 203/100/04333
+... 20418290688
+... 208/140/04075
+... 21/815/08150
+... 212/5730/0455
+... 2129081508158
+... 22/815/08154
+... 220/5769/0078
+... 2202081508156
+... 2324081508151
+... 24/815/08151
+... 2475081508152
+... 249/115/90057
+... 249/133/90020
+... 26 242 02421
+... 2613081508153
+... 27 173 00028
+... 27/673/50365
+... 2722081508154
+... 2893081508152
+... 29/815/08158
+... 3048081508155
+... 307/5904/0270
+... 3101081508154
+... 312/5120/1726
+... 313/5753/1315
+... 3201012312340
+... 332/5751/2 653
+... 332/5776/0076
+... 339/5822/0944
+... 342/5938/0307
+... 4079081508151
+... 4151081508156
+... 5133081508159
+... 75 815 08152
+... 76 001/12 885
+... 9181081508155
+... 93815/08152
+... 99015/28445
+... 99019132055
+...
+... '''
+>>> [x for x in numbers.splitlines() if x and not stnr.is_valid(x)]
+[]

-----------------------------------------------------------------------
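
The format templates in stdnum/de/stnr.py (for example 'FFBBBUUUUP' and
'24FF0BBBUUUUP' for Bremen) drive both matching and conversion. The
following is a minimal standalone sketch of that idea, not the module's
actual code: the names TEMPLATES, to_regex, fill and convert are
hypothetical, and only two regions are copied from the table in the diff.

```python
import re

# Hypothetical, trimmed-down copy of two entries from the table in the diff:
# region -> (regional template, national template).
TEMPLATES = {
    'Bremen': ('FFBBBUUUUP', '24FF0BBBUUUUP'),
    'Bayern': ('FFFBBBUUUUP', '9FFF0BBBUUUUP'),
}

# A run of one repeated placeholder letter, e.g. 'FF' or 'UUUU'.
_RUN = re.compile(r'([FBUP])\1*')


def to_regex(template):
    """Turn each run of F, B, U or P into a digit group of the same length.

    Characters that are not placeholders (fixed digits) are kept literally.
    """
    pattern = _RUN.sub(lambda m: r'(\d{%d})' % len(m.group(0)), template)
    return re.compile('^%s$' % pattern)


def fill(template, groups):
    """Substitute captured digit groups back into a template, in order."""
    items = iter(groups)
    return _RUN.sub(lambda m: next(items), template)


def convert(number, source, target):
    """Re-express a number that fits the source template in the target one."""
    match = to_regex(source).match(number)
    if match is None:
        raise ValueError('number does not fit template %r' % source)
    return fill(target, match.groups())


regional, national = TEMPLATES['Bremen']
print(convert('2475081508152', national, regional))  # 7581508152
print(convert('7581508152', regional, national))     # 2475081508152
```

The two printed values mirror the to_regional_number and to_country_number
doctests for Bremen in tests/test_de_stnr.doctest. Because several regions
share the same regional template, going from regional to national form only
works once the region is known, which is why the sketch takes the templates
as explicit arguments.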

Summary of changes:
 stdnum/de/stnr.py          | 199 +++++++++++++++++++++++++++++++++++++++++++++
 tests/test_de_stnr.doctest | 193 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 392 insertions(+)
 create mode 100644 stdnum/de/stnr.py
 create mode 100644 tests/test_de_stnr.doctest
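
The region-name doctests in tests/test_de_stnr.doctest pass because region
names are reduced to a comparison key before lookup. A minimal sketch of
that reduction follows; clean_region is a hypothetical standalone copy of
the private helper in the diff, and the list of spellings is illustrative.

```python
def clean_region(region):
    """Keep only lower-cased ASCII letters other than 'u'.

    Dropping 'u' along with every non-ASCII character means that a name
    written with u-umlaut and its plain 'u' transliteration give one key.
    """
    return ''.join(
        ch for ch in region.lower()
        if ch in 'abcdefghijklmnopqrstvwxyz')


spellings = [
    u'Baden-W\xfcrttemberg',  # text string containing u-umlaut
    'Baden Wurttemberg',      # ASCII transliteration with a space
    'baden-wurttemberg',      # lower case with a hyphen
]
keys = set(clean_region(s) for s in spellings)
print(keys)  # a single key: 'badenwrttemberg'
```

The trade-off is that two names differing only in "u", punctuation or
non-ASCII characters would collide. For the sixteen region names in the
table this does not appear to be a problem.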


hooks/post-receive
-- 
python-stdnum