This repository has been archived by the owner on Aug 3, 2024. It is now read-only.

haddock crashes building documentation for text #573

Open
eamsden opened this issue Jan 29, 2017 · 10 comments

@eamsden

eamsden commented Jan 29, 2017

Version

$ haddock --version
Haddock version 2.17.3, (c) Simon Marlow 2006
Ported to use the GHC API by David Waern 2006-2008

text version:

text 1.2.2.1

Error message:

    haddock: internal error: Data/Text/Internal.hs: hGetContents: invalid argument (invalid byte sequence)

Architecture
armel (emulated by qemu on x86_64 in chroot)

@Fuuzetsu
Member

Could it be that the locale is not UTF-8 but the source file contains UTF-8 characters, for example in a comment?

@domenkozar

I'm seeing the same error with Deque-0.2 using stack hoogle --nix.

$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

@Fuuzetsu
Member

Fuuzetsu commented Dec 4, 2017

@domenkozar What about stack hoogle --nix --no-nix-pure? Perhaps the locale is only set outside of the Nix shell.

@kirelagin

As far as I understand it, this issue is similar to others, e.g. sol/markdown-unlit#8, and the right way to resolve it is to set UTF-8 as the encoding on all file handles before reading from or writing to them (good thing there is already Documentation.Haddock.Utf8, which is a perfect place for such a helper 😉). Does that sound right to you?
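
A minimal sketch of what such a helper could look like, assuming plain System.IO handles. The names readFileUtf8 and writeFileUtf8 and the scratch file name are hypothetical and not existing Haddock functions; this is an illustration, not the actual fix:

```haskell
import System.IO

-- | Read a whole file as UTF-8, ignoring the locale's default encoding.
-- Hypothetical helper name, used here for illustration only.
readFileUtf8 :: FilePath -> IO String
readFileUtf8 path = do
  h <- openFile path ReadMode
  hSetEncoding h utf8   -- override the locale-derived default before reading
  hGetContents h        -- lazy read; the handle is closed once fully consumed

-- | Write a file as UTF-8, ignoring the locale's default encoding.
-- Hypothetical helper name, used here for illustration only.
writeFileUtf8 :: FilePath -> String -> IO ()
writeFileUtf8 path contents =
  withFile path WriteMode $ \h -> do
    hSetEncoding h utf8
    hPutStr h contents

main :: IO ()
main = do
  let path = "utf8-demo.txt"              -- scratch file, assumed writable
  writeFileUtf8 path "\955 \8594 \8704"   -- three non-ASCII characters and two spaces
  s <- readFileUtf8 path
  print (length s)                        -- prints only ASCII digits, so it is safe under LANG=C
```

Because the encoding is set on the handle itself, the round trip should not depend on LANG or LC_ALL, unlike a bare readFile.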

@alexbiehl
Member

alexbiehl commented Dec 19, 2017

@domenkozar Which version of haddock did you use? I just tried building the documentation using haddock-2.18.1 and it worked with cd deque-0.2 && cabal new-haddock (with LANG=et_EE.iso88591).

@domenkozar

What @Fuuzetsu suggests fixes the issue; haddock fails to build with LANG=C.

@domenkozar

domenkozar commented Dec 19, 2017

What @kirelagin suggests is sensible for the future. Using shell env to determine filesystem encoding is one of the craziest ideas in CS history :)

@kirelagin

Using shell env to determine filesystem encoding

Not like that. According to the Language Report, Haskell programs use Unicode, so it is a reasonable assumption that all Haskell files are always in UTF-8. The only place where the environment is involved is in determining the encoding of the console when we print something to it (e.g. a Haskell identifier in an error message).
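
To make the distinction concrete, here is a small sketch that only inspects the encodings GHC's runtime derives from the environment. It assumes nothing beyond base; running it under different LANG values is a suggestion for experimentation, not something reported in this thread:

```haskell
import GHC.IO.Encoding (getLocaleEncoding)
import System.IO

main :: IO ()
main = do
  -- Encoding the runtime picked from the locale (LANG / LC_ALL).
  locEnc <- getLocaleEncoding
  -- Encoding currently attached to the console handle; Nothing means binary mode.
  outEnc <- hGetEncoding stdout
  putStrLn ("locale encoding: " ++ show locEnc)
  putStrLn ("stdout encoding: " ++ maybe "binary" show outEnc)
```

The locale-derived encoding is a sensible default for stdout and stderr, while handles opened on Haskell source files can be set to utf8 explicitly.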

@Fuuzetsu
Member

According to the Language Report, Haskell programs use Unicode, so it is a reasonable assumption that all Haskell files are always in UTF-8

Didn't know this. I'm not even sure GHC itself respects this. But if it's in there, then we should definitely just use UTF-8 everywhere and call it a day.

@kirelagin

@Fuuzetsu Some of it was fixed in GHC a couple of years ago:
https://phabricator.haskell.org/D1151
https://phabricator.haskell.org/D1153

I believe most other things related to Unicode have been working fine for a while.
