lists.arthurdejong.org

python-stdnum branch master updated. 1.17-37-ga261a93




This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "python-stdnum".

The branch, master has been updated
       via  a261a931cb00854fc92b7278b7eb9086116e4c10 (commit)
      from  eff3f526e3c6d19bed1ab3e17910667ae31d64cc (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
https://arthurdejong.org/git/python-stdnum/commit/?id=a261a931cb00854fc92b7278b7eb9086116e4c10

commit a261a931cb00854fc92b7278b7eb9086116e4c10
Author: Leandro Regueiro <leandro.regueiro@gmail.com>
Date:   Sat Sep 17 21:44:04 2022 +0200

    Add North Macedonian ЕДБ
    
    Note that this implementation is mostly based on unofficial sources
    describing the format, which match the hundreds of examples found
    online.
    
https://forum.it.mk/threads/modularna-kontrola-na-embg-edb-dbs-itn.15663/?__cf_chl_tk=Op2PaEIauip6Z.ZjvhP897O8gRVAwe5CDAVTpjx1sEo-1663498930-0-gaNycGzNCRE#post-187048
    
    Also note that the algorithm for the check digit was tested on all found
    examples, and it doesn't work for all of them, although the failing
    examples don't seem to be valid according to the official online search.
    
    Closes https://github.com/arthurdejong/python-stdnum/pull/330
    Closes https://github.com/arthurdejong/python-stdnum/issues/222

diff --git a/stdnum/mk/__init__.py b/stdnum/mk/__init__.py
new file mode 100644
index 0000000..fac842e
--- /dev/null
+++ b/stdnum/mk/__init__.py
@@ -0,0 +1,24 @@
+# __init__.py - collection of North Macedonia numbers
+# coding: utf-8
+#
+# Copyright (C) 2022 Leandro Regueiro
+#
+# This library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2.1 of the License, or (at your option) any later version.
+#
+# This library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with this library; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+# 02110-1301 USA
+
+"""Collection of North Macedonia numbers."""
+
+# provide aliases
+from stdnum.mk import edb as vat  # noqa: F401
diff --git a/stdnum/mk/edb.py b/stdnum/mk/edb.py
new file mode 100644
index 0000000..6d84f58
--- /dev/null
+++ b/stdnum/mk/edb.py
@@ -0,0 +1,96 @@
+# edb.py - functions for handling North Macedonia EDB numbers
+# coding: utf-8
+#
+# Copyright (C) 2022 Leandro Regueiro
+#
+# This library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2.1 of the License, or (at your option) any later version.
+#
+# This library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with this library; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+# 02110-1301 USA
+
+"""ЕДБ (Единствен Даночен Број, North Macedonia tax number).
+
+This number consists of 13 digits, sometimes with an additional "MK" prefix.
+
+More information:
+
+* http://www.ujp.gov.mk/en
+
+>>> validate('4030000375897')
+'4030000375897'
+>>> validate('МК 4020990116747')  # Cyrillic letters
+'4020990116747'
+>>> validate('MK4057009501106')  # ASCII letters
+'4057009501106'
+>>> validate('4030000375890')
+Traceback (most recent call last):
+    ...
+InvalidChecksum: ...
+>>> format('МК 4020990116747')  # Cyrillic letters
+'4020990116747'
+>>> format('MK4057009501106')  # ASCII letters
+'4057009501106'
+"""
+
+from stdnum.exceptions import *
+from stdnum.util import clean, isdigits
+
+
+def compact(number):
+    """Convert the number to the minimal representation.
+
+    This strips the number of any valid separators and removes surrounding
+    whitespace.
+    """
+    number = clean(number, ' -').upper().strip()
+    # The first two prefixes are ASCII, the last two Cyrillic; only strip
+    # matching types to avoid implicit unicode conversion in Python 2.7
+    for prefix in ('MK', u'MK', 'МК', u'МК'):
+        if isinstance(number, type(prefix)) and number.startswith(prefix):
+            number = number[len(prefix):]
+    return number
+
+
+def calc_check_digit(number):
+    """Calculate the check digit."""
+    weights = (7, 6, 5, 4, 3, 2, 7, 6, 5, 4, 3, 2)
+    total = sum(int(n) * w for n, w in zip(number, weights))
+    return str((-total % 11) % 10)
+
+
+def validate(number):
+    """Check if the number is a valid North Macedonia ЕДБ number.
+
+    This checks the length, formatting and check digit.
+    """
+    number = compact(number)
+    if len(number) != 13:
+        raise InvalidLength()
+    if not isdigits(number):
+        raise InvalidFormat()
+    if number[-1] != calc_check_digit(number):
+        raise InvalidChecksum()
+    return number
+
+
+def is_valid(number):
+    """Check if the number is a valid North Macedonia ЕДБ number."""
+    try:
+        return bool(validate(number))
+    except ValidationError:
+        return False
+
+
+def format(number):
+    """Reformat the number to the standard presentation format."""
+    return compact(number)
diff --git a/tests/test_mk_edb.doctest b/tests/test_mk_edb.doctest
new file mode 100644
index 0000000..07b8a37
--- /dev/null
+++ b/tests/test_mk_edb.doctest
@@ -0,0 +1,178 @@
+test_mk_edb.doctest - more detailed doctests for stdnum.mk.edb module
+
+Copyright (C) 2022 Leandro Regueiro
+
+This library is free software; you can redistribute it and/or
+modify it under the terms of the GNU Lesser General Public
+License as published by the Free Software Foundation; either
+version 2.1 of the License, or (at your option) any later version.
+
+This library is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+Lesser General Public License for more details.
+
+You should have received a copy of the GNU Lesser General Public
+License along with this library; if not, write to the Free Software
+Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+02110-1301 USA
+
+
+This file contains more detailed doctests for the stdnum.mk.edb module. It
+tries to test more corner cases and detailed functionality that is not really
+useful as module documentation.
+
+>>> from stdnum.mk import edb
+
+
+Tests for some corner cases.
+
+>>> edb.validate('4030000375897')
+'4030000375897'
+>>> str(edb.validate(u'МК 4020990116747'))  # Cyrillic letters
+'4020990116747'
+>>> edb.validate('MK4057009501106')  # ASCII letters
+'4057009501106'
+>>> edb.validate('МК4030006603425')  # Cyrillic letters
+'4030006603425'
+>>> str(edb.validate(u'МК4030006603425'))  # Cyrillic letters
+'4030006603425'
+>>> edb.validate('12345')
+Traceback (most recent call last):
+    ...
+InvalidLength: ...
+>>> edb.validate('1234567890XYZ')
+Traceback (most recent call last):
+    ...
+InvalidFormat: ...
+>>> edb.validate('4030000375890')
+Traceback (most recent call last):
+    ...
+InvalidChecksum: ...
+>>> edb.format('4030000375897')
+'4030000375897'
+>>> str(edb.format(u'МК 4020990116747'))  # Cyrillic letters
+'4020990116747'
+>>> edb.format('MK4057009501106')  # ASCII letters
+'4057009501106'
+>>> str(edb.format(u'МК4030006603425'))  # Cyrillic letters
+'4030006603425'
+
+
+These have been found online and should all be valid numbers.
+
+>>> numbers = '''
+...
+... 4001015504286
+... 4002012527974
+... 4002015539612
+... 4002015541625
+... 4002017550907
+... 4002979132007
+... 4002991125091
+... 4002995103351
+... 4002999142769
+... 4004010506301
+... 4004019517659
+... 4004996102912
+... 4006012508266
+... 4006999109376
+... 4007013515009
+... 4012995101109
+... 4017004148023
+... 4017012520357
+... 4017015527930
+... 4017996129718
+... 4017999130385
+... 4020010511130
+... 4020015529216
+... 4020017533580
+... 4020019538376
+... 4020991100666
+... 4021004145439
+... 4021018535892
+... 4023003112815
+... 4023003113056
+... 4023009503069
+... 4026012514780
+... 4027008504520
+... 4027015522240
+... 4027017526693
+... 4027991103694
+... 4028006151708
+... 4028008502435
+... 4028008506791
+... 4028008507070
+... 4028016528567
+... 4028017532851
+... 4028017533823
+... 4028018535986
+... 4028019537796
+... 4028999127025
+... 4028999139961
+... 4029007136199
+... 4029009505450
+... 4029999133765
+... 4030000407411
+... 4030001404580
+... 4030002438667
+... 4030004518463
+... 4030005553815
+... 4030005565759
+... 4030006601066
+... 4030007641649
+... 4030974163226
+... 4030984349182
+... 4030992158418
+... 4030993125769
+... 4030996123112
+... 4030996244823
+... 4030999431535
+... 4032017535882
+... 4032018538060
+... 4044012506696
+... 4044013508269
+... 4044017513879
+... 4054009500097
+... 4054016503417
+... 4057012518508
+... 4057013523246
+... 4057017536733
+... 4057018541765
+... 4057019547848
+... 4061018502090
+... 4064012500591
+... 4065015500653
+... 4069011500670
+... 4071016500495
+... 4075013502816
+... 4080011522414
+... 4080012525921
+... 4080015554205
+... 4080016560551
+... 4080019578567
+... 4080019584974
+... 4080019585377
+... 4082014512960
+... 4082014513702
+... 4082018521297
+... 4082019523064
+... MK 4030005534810
+... MK4030996254241
+... MK4032012519218
+... MK4057009501106
+... MK4057009502714
+... MK4080012530747
+... MK4080014548341
+... МК 4004991103546
+... МК 4020990116747
+... МК4011014511586
+... МК4027002133725
+... МК4027992109874
+... МК4030000398099
+... МК4030002456746
+... МК4030993244482
+...
+... '''
+>>> [x for x in numbers.splitlines() if x and not edb.is_valid(x)]
+[]

-----------------------------------------------------------------------

Summary of changes:
 stdnum/{ke => mk}/__init__.py   |   6 +-
 stdnum/{py/ruc.py => mk/edb.py} |  69 ++++++++--------
 tests/test_mk_edb.doctest       | 178 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 213 insertions(+), 40 deletions(-)
 copy stdnum/{ke => mk}/__init__.py (84%)
 copy stdnum/{py/ruc.py => mk/edb.py} (52%)
 create mode 100644 tests/test_mk_edb.doctest
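
For readers who want to follow the check digit arithmetic without installing
the library, here is a minimal standalone sketch of the weighted mod-11 scheme
used in stdnum/mk/edb.py above. The names strip_prefix and check_digit are
hypothetical and are not part of python-stdnum, and the sketch assumes Python 3
only. It also simplifies the prefix handling: it strips at most one prefix and
only removes spaces and hyphens. For 4030000375897 the weighted sum of the
first twelve digits is 158, -158 % 11 is 7, and 7 % 10 is 7, which matches the
final digit.

```python
# Standalone sketch, assuming Python 3; these helpers are hypothetical
# illustrations and not part of the python-stdnum API.
WEIGHTS = (7, 6, 5, 4, 3, 2, 7, 6, 5, 4, 3, 2)
PREFIXES = ('MK', '\u041c\u041a')  # Latin "MK" and Cyrillic "МК"


def strip_prefix(number):
    """Drop spaces, hyphens and one optional Latin or Cyrillic MK prefix."""
    number = number.replace(' ', '').replace('-', '').upper()
    for prefix in PREFIXES:
        if number.startswith(prefix):
            return number[len(prefix):]
    return number


def check_digit(first_twelve):
    """Weighted sum of the first 12 digits, negated mod 11, then mod 10."""
    total = sum(int(d) * w for d, w in zip(first_twelve, WEIGHTS))
    # The final "% 10" maps a remainder of 10 to the digit 0.
    return str((-total % 11) % 10)


number = strip_prefix('MK4057009501106')
print(number, check_digit(number[:12]) == number[-1])
```

The same two helpers can be run over the sample numbers in
tests/test_mk_edb.doctest to see which check digits the scheme produces.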


hooks/post-receive
-- 
python-stdnum