Who am I?
Hi, my name is Fernando PN. I'm working toward my master's degree at Unicamp in Brazil. This website was made to publish some of my accomplishments in academia and real life, in areas like computer graphics, computer vision and parallel programming. Thanks for coming.
Encoding, Decoding, Weird characters??
Have you ever had a problem with charset encoding?
I have, a lot. Why? We in Brazil use a lot of ‘é’, ‘ç’, ‘ã’, ‘í’ and many other characters that are represented differently in different encodings (ISO 8859-1, UTF-8 and some others).
So it often happens that you write a full document in LaTeX (for example) and save it as UTF-8 in your text editor. The next time you open it, if the editor guesses a different encoding, all your special characters show up as %% and other weird characters.
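To see where the weird characters come from, here is a small sketch in Python 3 (an assumption, since the converter in this post targets Python 2). It encodes a single ‘é’ both ways and then reads each result with the wrong encoding. The variable names are only for illustration.

```python
# "\u00e9" is the letter e with an acute accent, written as an escape.
text = "\u00e9"

utf8_bytes = text.encode("utf-8")        # two bytes: 0xC3 0xA9
latin1_bytes = text.encode("iso8859-1")  # one byte: 0xE9

# UTF-8 bytes read as ISO 8859-1: each byte becomes its own character,
# so one accented letter turns into two unrelated ones.
misread_as_latin1 = utf8_bytes.decode("iso8859-1")

# ISO 8859-1 bytes read as UTF-8: 0xE9 alone is not valid UTF-8, so with
# errors="replace" it becomes U+FFFD, the replacement character.
misread_as_utf8 = latin1_bytes.decode("utf-8", errors="replace")

print(repr(misread_as_latin1))
print(repr(misread_as_utf8))
```

The bytes on disk never change in either case. Only the interpretation does, which is why telling the editor the right encoding is enough to get the text back.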
To prevent this, always put an explicit declaration of the encoding you are using in the header of your source file.
Most text editors will read this header and understand that you want to use UTF-8.
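As a minimal sketch of such a header, assuming the file is a Python source file: the comment on the first line follows the PEP 263 convention, which Python reads itself and which also matches the file-variable syntax that Emacs understands. Other editors and other languages have their own conventions, so treat this as one example and not a universal rule. The sample string is made up for illustration.

```python
# -*- coding: utf-8 -*-
# The first line is an encoding declaration (PEP 263). It must sit on the
# first or second line of the file to take effect.

greeting = "Olá, ação!"  # hypothetical sample text with accented characters

# The accented characters take more than one byte each in UTF-8,
# so the byte count is larger than the character count.
print(len(greeting), len(greeting.encode("utf-8")))
```

Python 2 refuses to load a source file containing non-ASCII characters without a declaration like this one. Python 3 already assumes UTF-8 by default, so there the line mostly serves as a hint for editors and readers.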
If you really need to convert between charsets, here is the source of an ISO 8859-1 to UTF-8 converter. Swapping the two encoding names gives the reverse conversion, and it can easily be changed to other encodings.
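```python
#!/usr/bin/python
# Python 2: convert ISO 8859-1 text to UTF-8.
import sys

def my_decode(s):
    # Decode the ISO 8859-1 bytes to unicode, then encode them as UTF-8.
    raw = s.decode("iso8859-1")
    return raw.encode("utf-8")

try:
    # Read from the file given as the first argument.
    txt = open(sys.argv[1])
    print my_decode(txt.read())
    txt.close()
except IndexError:
    # No argument given: read from standard input instead.
    print my_decode(sys.stdin.read())
    sys.stdin.close()
```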
How hard is that?
I'm sure that this code runs on Python 2.x. It may need some changes to work on Python 3, such as using the print function.
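For reference, here is one way the same idea might look on Python 3. This is a sketch, not the updated version from the repository. The main change besides print is that Python 3 separates bytes from text, so the conversion works on binary streams. The `source` and `target` parameters and the `convert_stream` helper are additions of mine to make the encodings easy to swap.

```python
import io

def my_decode(data, source="iso8859-1", target="utf-8"):
    # bytes in the source encoding -> str -> bytes in the target encoding
    return data.decode(source).encode(target)

def convert_stream(src, dst, source="iso8859-1", target="utf-8"):
    # src and dst are binary file objects (opened with "rb" / "wb").
    dst.write(my_decode(src.read(), source, target))

# Demo with in-memory streams: b"a\xe7\xe3o" is "ação" in ISO 8859-1.
latin1_input = io.BytesIO(b"a\xe7\xe3o")
utf8_output = io.BytesIO()
convert_stream(latin1_input, utf8_output)
print(utf8_output.getvalue())
```

To use it as a command-line filter like the original, pass `sys.stdin.buffer` and `sys.stdout.buffer` (or files opened in binary mode) to `convert_stream`. Calling `my_decode(data, "utf-8", "iso8859-1")` converts in the other direction.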
An updated version is available in our Git repository.
