+4 votes
in Programming Languages by (14.7k points)

I am reading a file using the following code. It works fine on an Ubuntu system, but when I run it on a Windows machine, it raises the error: "UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 1800: character maps to <undefined>". How can I fix this error?

with open(ifile) as fi:
    for rec in fi:
        lcount += 1
        vals = rec.strip('\n').split(';')
        id = vals[0].strip('"').strip(' ')
        title = vals[1].strip('"').strip(' ')
        title = re.sub(r'[^a-zA-Z0-9, ]', '', title)
        vals = re.sub(r'quot', '', title).split()
        title = ' '.join(vals)

1 Answer

0 votes
by (25.6k points)

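On Windows, open() without an encoding argument decodes the file with the system's locale code page (typically cp1252), in which byte 0x81 is undefined. On Ubuntu the default is usually UTF-8, which is most likely why the same code works there. If the file is UTF-8, open it with encoding='utf-8' (Option 1). Alternatively, errors='ignore' (Option 2) suppresses the exception, but it silently drops any bytes that cannot be decoded, so some characters may be lost or garbled.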

Option 1

with open(ifile, 'r', encoding='utf-8') as fi:
    for rec in fi:
        lcount += 1
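
To see why Option 1 helps, here is a minimal sketch that reproduces the error. It assumes the input file is UTF-8 and that the Windows default is cp1252; the sample text and the temporary file are made up for illustration and are not from the original question.

```python
import os
import tempfile

# Hypothetical sample line: "Á" is stored in UTF-8 as the two bytes 0xC3 0x81.
sample = '1;"Águas"\n'

# Write the sample as raw UTF-8 bytes to a temporary file.
with tempfile.NamedTemporaryFile(delete=False, suffix=".csv") as tmp:
    tmp.write(sample.encode("utf-8"))
    path = tmp.name

try:
    # Simulate the assumed Windows default (cp1252): byte 0x81 has no mapping.
    try:
        with open(path, "r", encoding="cp1252") as fi:
            fi.read()
        cp1252_failed = False
    except UnicodeDecodeError:
        cp1252_failed = True

    # Decoding with the file's real encoding succeeds.
    with open(path, "r", encoding="utf-8") as fi:
        decoded_utf8 = fi.read()
finally:
    os.remove(path)

print(cp1252_failed)
print(decoded_utf8 == sample)
```

If your file is not actually UTF-8, pass whatever encoding it was saved with instead.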

Option 2

with open(ifile, 'r', errors='ignore') as fi:
    for rec in fi:
        lcount += 1
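
Before relying on Option 2, it may help to see what errors='ignore' does to the data. The sketch below again assumes a UTF-8 file read with cp1252 (the likely Windows default); the sample text is hypothetical. errors='replace' is shown only for comparison.

```python
import os
import tempfile

# Hypothetical sample line: "Á" is stored in UTF-8 as the two bytes 0xC3 0x81.
sample = '1;"Águas"\n'

with tempfile.NamedTemporaryFile(delete=False, suffix=".csv") as tmp:
    tmp.write(sample.encode("utf-8"))
    path = tmp.name

try:
    # errors='ignore' drops the undecodable byte 0x81 without raising;
    # the remaining byte 0xC3 is decoded as "Ã".
    with open(path, "r", encoding="cp1252", errors="ignore") as fi:
        decoded_ignore = fi.read()

    # errors='replace' keeps a visible marker (U+FFFD) where a byte was lost.
    with open(path, "r", encoding="cp1252", errors="replace") as fi:
        decoded_replace = fi.read()
finally:
    os.remove(path)

print(repr(decoded_ignore))
print(repr(decoded_replace))
```

In both cases the original character is gone, so this is only reasonable when the affected characters do not matter for your processing.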

...