Bug #13749

CSV connector: better behaviour on non-UTF-8 data

Added by Frédéric Péters over 8 years ago. Updated over 8 years ago.

Status:
New
Priority:
Normal
Assignee:
-
Target version:
-
Start date:
26 October 2016
Due date:
% Done:

0%

Estimated time:
Patch proposed:
No
Planning:

Description

Currently this results in a 500 error.

Internal Server Error: /csvdatasource/liste-des-rues/data
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/django/core/handlers/base.py", line 111, in get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/usr/lib/python2.7/dist-packages/passerelle/urls_utils.py", line 54, in f
    return func(request, *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/django/views/generic/base.py", line 69, in view
    return self.dispatch(request, *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/django/views/generic/base.py", line 87, in dispatch
    return handler(request, *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/passerelle/utils/__init__.py", line 104, in _wrapped_view
    return view_func(instance, request, *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/passerelle/utils/jsonresponse.py", line 293, in wrapper
    return self.method(f, *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/passerelle/utils/jsonresponse.py", line 359, in api_method
    return self.api(f, args[1], *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/passerelle/utils/jsonresponse.py", line 401, in api
    return self.render_data(req, data, status, data['err'])
  File "/usr/lib/python2.7/dist-packages/passerelle/utils/jsonresponse.py", line 345, in render_data
    plain = json.dumps(data, **kwargs)
  File "/usr/lib/python2.7/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/usr/lib/python2.7/json/encoder.py", line 200, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib/python2.7/json/encoder.py", line 263, in iterencode
    return _iterencode(o, 0)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe9 in position 22: unexpected end of data

History

#1

Updated by Benjamin Dauvergne over 8 years ago

I'd say the best option would be to reject such files at upload time and, to cover files already uploaded, to also run a validation before processing each request. Both /data and /query need handling, but /query uses a UnicodeDictReader that could be improved to do this detection and then be used in /data as well.
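A minimal sketch of what such a check might look like, usable both at upload time and before serving a request. The function name `find_invalid_utf8` is hypothetical and not part of passerelle; the code is written to run on Python 3 and only relies on the standard library.

```python
import codecs


def find_invalid_utf8(content):
    """Return the offset of the first undecodable byte in `content`
    (a byte string), or None if the content is valid UTF-8.

    Hypothetical helper for illustration, not existing passerelle code.
    """
    offset = 0
    # Skip an optional UTF-8 byte order mark, as 'utf-8-sig' would.
    if content.startswith(codecs.BOM_UTF8):
        offset = len(codecs.BOM_UTF8)
    try:
        content[offset:].decode('utf-8')
    except UnicodeDecodeError as exc:
        # exc.start is relative to the slice that was decoded.
        return offset + exc.start
    return None


# A Latin-1 encoded "é" (byte 0xe9) is not valid UTF-8.
bad = b'rue de la Libert\xe9'
good = u'rue de la Libert\u00e9'.encode('utf-8')
print(find_invalid_utf8(bad))
print(find_invalid_utf8(good))
```

An upload form could reject the file when the result is not None, and the /data and /query endpoints could return an explicit error instead of a 500.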

#2

Updated by Benjamin Dauvergne over 8 years ago

We could also use 'replace' instead of 'ignore' in

    content = content.decode('utf-8-sig', 'ignore').encode('utf-8')

though admittedly that hides the errors.
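To illustrate the difference between the two error handlers, here is a small Python 3 sketch. The sample street name is made up for the example; counting U+FFFD characters is only one possible way to keep the problem visible and is an assumption, not something the connector does.

```python
# A Latin-1 encoded value: 0xe9 is "é" in Latin-1 but invalid here as UTF-8.
raw = b'all\xe9e des Tilleuls'

# 'ignore' silently drops the offending byte.
ignored = raw.decode('utf-8-sig', 'ignore')

# 'replace' substitutes U+FFFD (the replacement character) for it.
replaced = raw.decode('utf-8-sig', 'replace')

# The replacement characters can be counted to report damaged data.
damaged = replaced.count(u'\ufffd')

print(repr(ignored))
print(repr(replaced))
print(damaged)
```

With 'replace' the output stays encodable as UTF-8, so the JSON serialization no longer fails, and the marker at least shows where data was lost.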
