python-stdnum branch master updated. 1.8.1-18-gceb3c62

This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "python-stdnum".

The branch, master has been updated
       via  ceb3c628531f17b5a91596295f27fc8ebe0e3d39 (commit)
      from  fec1685fc83534e0b5f060a9897417b3dd47dc20 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
https://arthurdejong.org/git/python-stdnum/commit/?id=ceb3c628531f17b5a91596295f27fc8ebe0e3d39

commit ceb3c628531f17b5a91596295f27fc8ebe0e3d39
Author: Arthur de Jong <arthur@arthurdejong.org>
Date:   Fri Mar 23 13:55:50 2018 +0100

    Add German company registry numbers
    
    Based on the implementation provided by Markus Törnqvist and Lari
    Haataja of Holvi Payment Services.

diff --git a/stdnum/de/__init__.py b/stdnum/de/__init__.py
index 15a5734..1449961 100644
--- a/stdnum/de/__init__.py
+++ b/stdnum/de/__init__.py
@@ -19,3 +19,6 @@
 # 02110-1301 USA
 
 """Collection of German numbers."""
+
+# provide businessid as an alias
+from stdnum.de import handelsregisternummer as businessid  # noqa: F401
diff --git a/stdnum/de/handelsregisternummer.py b/stdnum/de/handelsregisternummer.py
new file mode 100644
index 0000000..c5b9041
--- /dev/null
+++ b/stdnum/de/handelsregisternummer.py
@@ -0,0 +1,307 @@
+# handelsregisternummer.py - functions for handling German company registry id
+# coding: utf-8
+#
+# Copyright (C) 2015 Holvi Payment Services Oy
+# Copyright (C) 2018 Arthur de Jong
+#
+# This library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2.1 of the License, or (at your option) any later version.
+#
+# This library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with this library; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+# 02110-1301 USA
+
+"""Handelsregisternummer (German company register number).
+
+The number consists of the court where the company has registered, the type
+of register and the registration number.
+
+The type of the register is usually HRA or HRB (PR, GnR and VR are also
+accepted). Section HRB holds limited liability companies and corporations
+(GmbHs and AGs), while section HRA holds business partnerships (OHGs, KGs
+etc.). In other words: businesses in section HRB are limited liability
+companies, while businesses in HRA have personally liable partners.
+
+More information:
+
+* https://www.handelsregister.de/
+* https://en.wikipedia.org/wiki/German_Trade_Register
+
+>>> validate('Aachen HRA 11223')
+'Aachen HRA 11223'
+>>> validate('Frankfurt/Oder GnR 11223', company_form='e.G.')
+'Frankfurt/Oder GnR 11223'
+>>> validate('Aachen HRC 44123')
+Traceback (most recent call last):
+  ...
+InvalidFormat: ...
+>>> validate('Aachen HRA 44123', company_form='GmbH')
+Traceback (most recent call last):
+  ...
+InvalidComponent: ...
+"""
+
+import re
+
+from stdnum.exceptions import *
+from stdnum.util import clean
+
+
+# The known courts that have a Handelsregister
+GERMAN_COURTS = (
+    'Aachen',
+    'Altenburg',
+    'Amberg',
+    'Ansbach',
+    'Apolda',
+    'Arnsberg',
+    'Arnstadt Zweigstelle Ilmenau',
+    'Arnstadt',
+    'Aschaffenburg',
+    'Augsburg',
+    'Aurich',
+    'Bad Hersfeld',
+    'Bad Homburg v.d.H.',
+    'Bad Kreuznach',
+    'Bad Oeynhausen',
+    'Bad Salzungen',
+    'Bamberg',
+    'Bayreuth',
+    'Berlin (Charlottenburg)',
+    'Bielefeld',
+    'Bochum',
+    'Bonn',
+    'Braunschweig',
+    'Bremen',
+    'Chemnitz',
+    'Coburg',
+    'Coesfeld',
+    'Cottbus',
+    'Darmstadt',
+    'Deggendorf',
+    'Dortmund',
+    'Dresden',
+    'Duisburg',
+    'Düren',
+    'Düsseldorf',
+    'Eisenach',
+    'Erfurt',
+    'Eschwege',
+    'Essen',
+    'Flensburg',
+    'Frankfurt am Main',
+    'Frankfurt/Oder',
+    'Freiburg',
+    'Friedberg',
+    'Fritzlar',
+    'Fulda',
+    'Fürth',
+    'Gelsenkirchen',
+    'Gera',
+    'Gießen',
+    'Gotha',
+    'Greiz',
+    'Göttingen',
+    'Gütersloh',
+    'Hagen',
+    'Hamburg',
+    'Hamm',
+    'Hanau',
+    'Hannover',
+    'Heilbad Heiligenstadt',
+    'Hildburghausen',
+    'Hildesheim',
+    'Hof',
+    'Homburg',
+    'Ingolstadt',
+    'Iserlohn',
+    'Jena',
+    'Kaiserslautern',
+    'Kassel',
+    'Kempten (Allgäu)',
+    'Kiel',
+    'Kleve',
+    'Koblenz',
+    'Korbach',
+    'Krefeld',
+    'Köln',
+    'Königstein',
+    'Landau',
+    'Landshut',
+    'Langenfeld',
+    'Lebach',
+    'Leipzig',
+    'Lemgo',
+    'Limburg',
+    'Ludwigshafen a.Rhein (Ludwigshafen)',
+    'Lübeck',
+    'Lüneburg',
+    'Mainz',
+    'Mannheim',
+    'Marburg',
+    'Meiningen',
+    'Memmingen',
+    'Merzig',
+    'Montabaur',
+    'Mönchengladbach',
+    'Mühlhausen',
+    'München',
+    'Münster',
+    'Neubrandenburg',
+    'Neunkirchen',
+    'Neuruppin',
+    'Neuss',
+    'Nordhausen',
+    'Nürnberg',
+    'Offenbach am Main',
+    'Oldenburg (Oldenburg)',
+    'Osnabrück',
+    'Ottweiler',
+    'Paderborn',
+    'Passau',
+    'Pinneberg',
+    'Potsdam',
+    'Pößneck Zweigstelle Bad Lobenstein',
+    'Pößneck',
+    'Recklinghausen',
+    'Regensburg',
+    'Rostock',
+    'Rudolstadt Zweigstelle Saalfeld',
+    'Rudolstadt',
+    'Saarbrücken',
+    'Saarlouis',
+    'Schweinfurt',
+    'Schwerin',
+    'Siegburg',
+    'Siegen',
+    'Sondershausen',
+    'Sonneberg',
+    'St. Ingbert (St Ingbert)',
+    'St. Wendel (St Wendel)',
+    'Stadthagen',
+    'Stadtroda',
+    'Steinfurt',
+    'Stendal',
+    'Stralsund',
+    'Straubing',
+    'Stuttgart',
+    'Suhl',
+    'Sömmerda',
+    'Tostedt',
+    'Traunstein',
+    'Ulm',
+    'Völklingen',
+    'Walsrode',
+    'Weiden i. d. OPf.',
+    'Weimar',
+    'Wetzlar',
+    'Wiesbaden',
+    'Wittlich',
+    'Wuppertal',
+    'Würzburg',
+    'Zweibrücken',
+)
+
+
+def _to_min(court):
+    """Convert the court name for quick comparison without encoding issues."""
+    return ''.join(
+        x for x in court.lower()
+        if x in 'bcdefghijklmnpqrstvwxyz')
+
+
+# Build a dictionary for looking up courts
+_courts = dict(
+    (_to_min(court), court) for court in GERMAN_COURTS)
+_courts.update(
+    (_to_min(alias), court) for alias, court in (
+        ('Bad Homburg', 'Bad Homburg v.d.H.'),
+        ('Berlin', 'Berlin (Charlottenburg)'),
+        ('Charlottenburg', 'Berlin (Charlottenburg)'),
+        ('Oldenburg', 'Oldenburg (Oldenburg)'),
+    ))
+
+
+# The known registry types
+REGISTRY_TYPES = (
+    'HRA',
+    'HRB',
+    'PR',
+    'GnR',
+    'VR',
+)
+
+COMPANY_FORM_REGISTRY_TYPES = {
+    'e.K.': 'HRA',
+    'e.V.': 'VR',
+    'Verein': 'VR',
+    'OHG': 'HRA',
+    'KG': 'HRA',
+    'KGaA': 'HRB',
+    'Vor-GmbH': 'HRB',
+    'GmbH': 'HRB',
+    'UG': 'HRB',
+    'UG i.G.': 'HRB',
+    'AG': 'HRB',
+    'e.G.': 'GnR',
+    'PartG': 'PR',
+}
+
+
+# possible formats the number can be specified in
+_court_re = r'(?P<court>.*)'
+_registry_re = r'(?P<registry>%s)' % '|'.join(REGISTRY_TYPES)
+_number_re = r'(?P<nr>[0-9]{3,6})(\s*(?P<x>[A-Z]{1,3}))?'
+_formats = [
+    _registry_re + r'\s+' + _number_re + r',?\s+' + _court_re,
+    _court_re + r',?\s+' + _registry_re + r'\s+' + _number_re,
+]
+
+
+def _split(number):
+    """Split the number into a court, registry, register number and
+    optionally qualifier."""
+    number = clean(number).strip()
+    for fmt in _formats:
+        m = re.match(fmt, number, flags=re.I | re.U)
+        if m:
+            return m.group('court'), m.group('registry'), m.group('nr'), m.group('x')
+    raise InvalidFormat()
+
+
+def compact(number):
+    """Convert the number to the minimal representation. This strips the
+    number of any valid separators and removes surrounding whitespace."""
+    court, registry, number, qualifier = _split(number)
+    return ' '.join(x for x in [court, registry, number, qualifier] if x)
+
+
+def validate(number, company_form=None):
+    """Check if the number is a valid company registry number. If a
+    company_form (e.g. GmbH or PartG) is given, the number is validated to
+    have the correct registry type."""
+    court, registry, number, qualifier = _split(number)
+    court = _courts.get(_to_min(court))
+    if not court:
+        raise InvalidComponent()
+    if type(court) != type(number):  # pragma: no cover (Python 2 code)
+        court = court.decode('utf-8')
+    if company_form and COMPANY_FORM_REGISTRY_TYPES.get(company_form) != registry:
+        raise InvalidComponent()
+    return ' '.join(x for x in [court, registry, number, qualifier] if x)
+
+
+def is_valid(number):
+    """Check if the number is a valid company registry number."""
+    try:
+        return bool(validate(number))
+    except ValidationError:
+        return False
diff --git a/tests/test_de_handelsregisternummer.doctest b/tests/test_de_handelsregisternummer.doctest
new file mode 100644
index 0000000..e6143ea
--- /dev/null
+++ b/tests/test_de_handelsregisternummer.doctest
@@ -0,0 +1,207 @@
+test_de_handelsregisternummer.doctest - tests for German register number
+
+Copyright (C) 2018 Arthur de Jong
+
+This library is free software; you can redistribute it and/or
+modify it under the terms of the GNU Lesser General Public
+License as published by the Free Software Foundation; either
+version 2.1 of the License, or (at your option) any later version.
+
+This library is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+Lesser General Public License for more details.
+
+You should have received a copy of the GNU Lesser General Public
+License along with this library; if not, write to the Free Software
+Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+02110-1301 USA
+
+
+>>> from stdnum.de import handelsregisternummer
+>>> import stdnum.exceptions
+
+
+Some basic tests for valid numbers.
+
+>>> handelsregisternummer.validate('Bad Homburg v.d.H. PR 11223')
+'Bad Homburg v.d.H. PR 11223'
+>>> handelsregisternummer.validate('Ludwigshafen a.Rhein (Ludwigshafen) VR 11223')
+'Ludwigshafen a.Rhein (Ludwigshafen) VR 11223'
+>>> handelsregisternummer.validate('Aachen HRA 11223', company_form='KG')
+'Aachen HRA 11223'
+>>> handelsregisternummer.validate('Frankfurt/Oder GnR 11223', company_form='e.G.')
+'Frankfurt/Oder GnR 11223'
+>>> handelsregisternummer.validate('Bad Homburg v.d.H. PR 11223', company_form='PartG')
+'Bad Homburg v.d.H. PR 11223'
+>>> handelsregisternummer.validate('Ludwigshafen a.Rhein (Ludwigshafen) VR 11223', company_form='e.V.')
+'Ludwigshafen a.Rhein (Ludwigshafen) VR 11223'
+>>> handelsregisternummer.validate('Berlin (Charlottenburg) HRA 11223 B')
+'Berlin (Charlottenburg) HRA 11223 B'
+>>> handelsregisternummer.validate('Berlin (Charlottenburg) HRB 11223B')
+'Berlin (Charlottenburg) HRB 11223 B'
+>>> handelsregisternummer.validate('Berlin (Charlottenburg) HRA 11223 B')
+'Berlin (Charlottenburg) HRA 11223 B'
+>>> handelsregisternummer.validate('Berlin (Charlottenburg) HRB 11223B')
+'Berlin (Charlottenburg) HRB 11223 B'
+
+
+The court name can also be shortened and various encodings are accepted but
+we only return either Unicode or UTF-8 (Python 2 only). The tests are a bit
+funky so they work both in Python 2 and Python 3.
+
+>>> handelsregisternummer.validate('Berlin HRB 11223 B')  # Charlottenburg missing
+'Berlin (Charlottenburg) HRB 11223 B'
+>>> number = u'K\xf6ln HRB 49263'  # Unicode
+>>> handelsregisternummer.validate(number) == number
+True
+>>> utf8 = 'K\xc3\xb6ln HRB 49263'  # UTF-8
+>>> handelsregisternummer.validate(utf8) == 'Köln HRB 49263'
+True
+>>> iso885915 = 'K\xf6ln HRB 49263'  # ISO-8859-15
+>>> handelsregisternummer.validate(iso885915) == 'Köln HRB 49263'
+True
+>>> ascii = 'Koln HRB 49263'  # ASCII replaced
+>>> handelsregisternummer.validate(ascii) == 'Köln HRB 49263'
+True
+>>> handelsregisternummer.validate('KXln HRB 49263')  # too wrong
+Traceback (most recent call last):
+  ...
+InvalidComponent: ...
+
+
+The compact function does minimal validation.
+>>> handelsregisternummer.compact('KXln HRB 49263')
+'KXln HRB 49263'
+
+
+These have been found online and should all be valid numbers.
+
+>>> numbers = """
+... Aachen   HRB   11214
+... Aachen   HRB   5360
+... Aachen   HRB   987
+... Bad Oeynhausen   HRA   5980
+... Bad Oeynhausen   HRB   14572
+... Bad Oeynhausen   HRB   5087
+... Bad Oeynhausen   HRB   8753
+... Berlin (Charlottenburg) HRB 178881
+... Berlin HRB 87447 B
+... Bochum   HRA   5582
+... Bochum   HRA   5828
+... Braunschweig   HRB   8057
+... Chemnitz   HRB   14011
+... Coesfeld   HRA   7092
+... Coesfeld   HRB   13681
+... Coesfeld   HRB   6930
+... Dortmund   HRA   18285
+... Dortmund   HRB   13762
+... Dortmund   HRB   25525
+... Dresden   HRB   29828
+... Düren   HRA   1971
+... Düren   HRA   3014
+... Düren   HRB   3138
+... Düsseldorf   HRB   16894
+... Düsseldorf   HRB   42518
+... Düsseldorf   HRB   45892
+... Düsseldorf   HRB   67311
+... Eschwege   HRA   2115
+... Essen   HRA   8158
+... Flensburg   HRB   4057  FL
+... Flensburg HRA 4057 FL
+... Friedberg   HRB   5519
+... Fulda   HRA   653
+... Fürth, HRB 7754
+... Gelsenkirchen   HRA   1838
+... Gelsenkirchen   HRB   3694
+... Gelsenkirchen   HRB   7246
+... Gießen   HRB   7519
+... Göttingen   HRA   130944
+... Göttingen   HRB   201633
+... Gütersloh   HRB   4290
+... HRA 350654, Mannheim
+... HRB 151080 B, Charlottenburg
+... HRB 178881 B, Charlottenburg
+... Hagen   HRB   4101
+... Hagen   HRB   8315
+... Hamm   HRB   5488
+... Hamm   HRB   942
+... Hanau   HRB   5015
+... Hannover   HRA   200593
+... Hannover   HRA   203664
+... Hannover   HRB   100146
+... Hannover   HRB   110948
+... Hildesheim   HRA   100692
+... Hildesheim   HRB   203244
+... Hildesheim   HRB   3587
+... Iserlohn   HRB   8669
+... Jena   HRA   102336
+... Jena   HRA   202638
+... Jena   HRA   301593
+... Jena   HRB   106960
+... Jena   HRB   112624
+... Jena   HRB   202400
+... Jena   HRB   207705
+... Jena   HRB   305494
+... Jena   HRB   405517
+... Koblenz   HRA   12710
+... Koblenz   HRB   3000
+... Korbach   HRA   659
+... Köln   HRA   22861
+... Köln   HRB   21508
+... Köln   HRB   33876
+... Köln   HRB   48349
+... Köln   HRB   52006
+... Landau   HRB   1668
+... Leipzig   HRA   15866
+... Leipzig   HRB   17256
+... Leipzig   HRB   24591
+... Ludwigshafen a.Rhein (Ludwigshafen)   HRB   65041
+... Lübeck   HRB   12065  HL
+... Lübeck   HRB   12067  HL
+... Lübeck   HRB   12068  HL
+... Lübeck   HRB   12085  HL
+... Lübeck   HRB   5873  HL
+... Mönchengladbach   HRA   3644
+... Mönchengladbach   HRB   5867
+... Mönchengladbach   HRB   6639
+... Mönchengladbach   HRB   7785
+... München HRB 178881
+... Münster   HRA   8289
+... Neubrandenburg   HRB   4956
+... Neuss   HRB   9817
+... Oldenburg (Oldenburg)   HRA   110612
+... Oldenburg (Oldenburg)   HRA   120361
+... Oldenburg (Oldenburg)   HRB   111147
+... Oldenburg (Oldenburg)   HRB   120757
+... Oldenburg (Oldenburg)   HRB   151060
+... Oldenburg (Oldenburg)   HRB   201016
+... Osnabrück   HRB   1090
+... Paderborn   HRA   1076
+... Paderborn   HRA   1364
+... Paderborn   HRA   3549
+... Paderborn   HRB   361
+... Paderborn   HRB   3659
+... Paderborn   HRB   653
+... Paderborn   HRB   6774
+... Pinneberg   HRB   12700  PI
+... Recklinghausen   HRB   4702
+... Rostock   HRA   887
+... Saarbrücken   HRB   102069
+... Siegen   HRA   7881
+... Siegen   HRB   10955
+... Siegen   HRB   5398
+... Siegen   HRB   7426
+... Stuttgart   HRB   460675
+... Tostedt   HRB   100870
+... Walsrode   HRB   202134
+... Wiesbaden   HRB   11946
+... Wittlich   HRB   42489
+... Wuppertal   HRA   22088
+... Wuppertal   HRB   13986
+... Wuppertal   HRB   14596
+... Wuppertal   HRB   16127
+... Zweibrücken   HRB   22575
+... """
+>>> [x for x in numbers.splitlines() if x and not handelsregisternummer.is_valid(x)]
+[]

-----------------------------------------------------------------------

Summary of changes:
 stdnum/de/__init__.py                       |   3 +
 stdnum/de/handelsregisternummer.py          | 307 ++++++++++++++++++++++++++++
 tests/test_de_handelsregisternummer.doctest | 207 +++++++++++++++++++
 3 files changed, 517 insertions(+)
 create mode 100644 stdnum/de/handelsregisternummer.py
 create mode 100644 tests/test_de_handelsregisternummer.doctest
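
For readers skimming the diff, the two-format parsing done by _split() can
be tried in isolation. The following is only a sketch that copies the
regular expressions from the commit; it leaves out the stdnum.util.clean()
call and raises ValueError as a stand-in for InvalidFormat, so it is not the
module itself.

```python
import re

REGISTRY_TYPES = ('HRA', 'HRB', 'PR', 'GnR', 'VR')

# Building blocks copied from the diff.
COURT = r'(?P<court>.*)'
REGISTRY = r'(?P<registry>%s)' % '|'.join(REGISTRY_TYPES)
NUMBER = r'(?P<nr>[0-9]{3,6})(\s*(?P<x>[A-Z]{1,3}))?'

FORMATS = (
    # registry first, e.g. 'HRB 151080 B, Charlottenburg'
    REGISTRY + r'\s+' + NUMBER + r',?\s+' + COURT,
    # court first, e.g. 'Aachen HRA 11223'
    COURT + r',?\s+' + REGISTRY + r'\s+' + NUMBER,
)


def split(number):
    """Return (court, registry, nr, qualifier); qualifier may be None."""
    number = number.strip()
    for fmt in FORMATS:
        m = re.match(fmt, number, flags=re.I | re.U)
        if m:
            return (m.group('court'), m.group('registry'),
                    m.group('nr'), m.group('x'))
    # Stand-in for stdnum's InvalidFormat (an assumption of this sketch).
    raise ValueError('unrecognised format')
```

Both orderings end up as the same four-part tuple, which is what allows
validate() to return a single canonical "court registry number qualifier"
string.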

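
The court lookup relies on _to_min(), which keeps only ASCII letters other
than a, o and u. Presumably this is so that umlauts, their ASCII
replacements and mis-encoded bytes all collapse to the same key. The sketch
below reproduces that idea with a two-entry table; the names KEPT, COURTS
and lookup are made up for illustration and do not exist in the module.

```python
# ASCII letters except a, o and u (same character set as in the diff).
KEPT = set('bcdefghijklmnpqrstvwxyz')


def to_min(court):
    """Reduce a court name to a lowercase ASCII key for lookup."""
    return ''.join(ch for ch in court.lower() if ch in KEPT)


# Tiny stand-in for the table built from GERMAN_COURTS in the commit.
COURTS = {to_min(name): name for name in ('Köln', 'Berlin (Charlottenburg)')}
COURTS[to_min('Berlin')] = 'Berlin (Charlottenburg)'  # alias, as in the diff


def lookup(court):
    """Return the canonical court name, or None if the key is unknown."""
    return COURTS.get(to_min(court))
```

With this table, lookup('Koln') and lookup('Köln') both return 'Köln',
while lookup('KXln') returns None because the extra consonant changes the
key. This mirrors the doctest cases in the commit.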

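
The optional company_form check in validate() is a plain dictionary lookup
compared against the parsed registry type. A minimal sketch, using a subset
of the mapping from the commit and a hypothetical helper name
registry_matches:

```python
# Subset of COMPANY_FORM_REGISTRY_TYPES from the diff, for illustration.
COMPANY_FORM_REGISTRY_TYPES = {
    'KG': 'HRA',
    'GmbH': 'HRB',
    'e.G.': 'GnR',
    'PartG': 'PR',
    'e.V.': 'VR',
}


def registry_matches(registry, company_form=None):
    """Return True if the registry type fits the given company form."""
    if not company_form:
        return True  # nothing to check, as in validate()
    # An unknown form maps to None and therefore never matches.
    return COMPANY_FORM_REGISTRY_TYPES.get(company_form) == registry
```

Note that the comparison is case-sensitive and that a company form missing
from the mapping is rejected rather than ignored, which follows from the
`.get(company_form) != registry` test in the commit.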
hooks/post-receive
-- 
python-stdnum