Stack Overflow: "ValueError: embedded null character" when using open()
[+20] [10] Erica
[2015-11-28 23:10:20]
[ python ]
[ https://stackoverflow.com/questions/33977519/valueerror-embedded-null-character-when-using-open ]

I am taking Python at my college and I am stuck on my current assignment. We are supposed to take two files and compare them. I am simply trying to open the files so I can use them, but I keep getting the error "ValueError: embedded null character".

file1 = input("Enter the name of the first file: ")
file1_open = open(file1)
file1_content = file1_open.read()

What does this error mean?

Where are the files coming from? - Padraic Cunningham
My teacher added tester files to be used when running the program. The first file in one of the testers that gives me the error is "Tests/4-test.txt" - Erica
You have a null byte embedded in the string, which Python will not accept; you need to remove the null byte(s). What OS are you using? - Padraic Cunningham
If you are using Linux, try tr -d '\000' < Tests/4-test.txt > Tests/4-test_cleaned.txt and use Tests/4-test_cleaned.txt - Padraic Cunningham
I'm using Windows 10. I'm also using Python 3.5 - Erica
Then you will need to find a way to remove the null bytes using some Windows software. I am not familiar with Windows, but this blog post should help: security102.blogspot.ru/2010/04/… Also, are you sure the data did not get corrupted somehow? - Padraic Cunningham
@Erica try: file1_open = open(file1, 'rb') and let us know - pippo1980
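
For readers on Windows, the tr command from the comments can be approximated in Python itself. This is only a sketch: strip_null_bytes is a hypothetical helper name, and it needs no imports. Note that open() raises this ValueError when the file name contains a null character, so cleaning the contents only helps if the file data is the actual concern.

```python
def strip_null_bytes(src_path, dst_path):
    """Copy src_path to dst_path with every NUL byte (0x00) removed.

    Returns the number of NUL bytes that were dropped.
    """
    # Binary mode so that no decoding happens before the bytes are filtered.
    with open(src_path, "rb") as src:
        data = src.read()
    cleaned = data.replace(b"\x00", b"")
    with open(dst_path, "wb") as dst:
        dst.write(cleaned)
    return len(data) - len(cleaned)
```

With the file names from the comments, the call would be strip_null_bytes("Tests/4-test.txt", "Tests/4-test_cleaned.txt").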
[+22] [2019-09-13 17:14:11] Алексей Семенихин

It seems that you have a problem with the characters "\" and "/". If you use them in the input, try replacing one with the other.


(1) This answer was downvoted for the wrong reasons as this possibly resolves the error. This answer worked for me. - RobH
1
[+8] [2015-11-29 10:05:16] stonebig

Default encoding of files for Python 3.5 is 'utf-8'.

Default encoding of files for Windows tends to be something else.

If you intend to open two text files, you may try this:

import locale
locale.getdefaultlocale()
file1 = input("Enter the name of the first file: ")
file1_open = open(file1, encoding=locale.getdefaultlocale()[1])
file1_content = file1_open.read()

There should be some automatic detection in the standard library.

Otherwise you may create your own:

import locale

def guess_encoding(csv_file):
    """Guess the encoding of the given file."""
    # Read the first bytes in binary mode to look for a byte order mark (BOM).
    with open(csv_file, "rb") as f:
        data = f.read(5)
    if data.startswith(b"\xEF\xBB\xBF"):  # UTF-8 with a "BOM"
        return "utf-8-sig"
    elif data.startswith(b"\xFF\xFE") or data.startswith(b"\xFE\xFF"):  # UTF-16 BOM
        return "utf-16"
    else:  # in Windows, guessing utf-8 doesn't work, so we have to try
        try:
            with open(csv_file, encoding="utf-8") as f:
                f.read(222222)  # decoding a preview raises if it is not UTF-8
            return "utf-8"
        except UnicodeDecodeError:
            return locale.getdefaultlocale()[1]

and then

file1 = input("Enter the name of the first file: ")
file1_open = open(file1, encoding=guess_encoding(file1))
file1_content = file1_open.read()

(3) I am not allowed to import anything into my program - Erica
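
If imports are off the table, the same idea can be sketched with built-ins only. The name guess_encoding_no_imports and the "cp1252" fallback are assumptions made for illustration (cp1252 is common on Western European Windows installs, but your machine may differ); the BOM checks mirror the answer above.

```python
def guess_encoding_no_imports(path, fallback="cp1252"):
    """Guess a text encoding using only built-ins.

    fallback is an assumption, not something Python detects for you.
    """
    with open(path, "rb") as f:
        data = f.read()
    if data.startswith(b"\xef\xbb\xbf"):  # UTF-8 with a BOM
        return "utf-8-sig"
    if data.startswith(b"\xff\xfe") or data.startswith(b"\xfe\xff"):  # UTF-16 BOM
        return "utf-16"
    try:
        data.decode("utf-8")  # raises if the bytes are not valid UTF-8
    except UnicodeDecodeError:
        return fallback
    return "utf-8"
```

It can be used the same way as above: open(file1, encoding=guess_encoding_no_imports(file1)).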
2
[+7] [2022-01-24 11:11:01] Rahul Pandey

Try prefixing the path with r (a raw string literal).

r'D:\python_projects\templates\0.html'
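
To see why the r prefix matters, here is a small sketch comparing a plain and a raw literal. The path "templates\0.html" is a shortened, hypothetical version of the one above, and open_error is a helper invented for this example.

```python
plain = "templates\0.html"   # "\0" is an escape sequence for the NUL character
raw = r"templates\0.html"    # raw string: the backslash and "0" stay as typed


def open_error(path):
    """Return the ValueError message open() raises for path, or None."""
    try:
        open(path).close()
    except ValueError as exc:
        return str(exc)
    except OSError:
        return None  # the path is well-formed; the file is just missing or unreadable
    return None


print(repr(plain), len(plain))
print(repr(raw), len(raw))
print(open_error(plain))  # the exact wording depends on Python version and platform
```

The plain literal is one character shorter than the raw one, because backslash plus zero collapsed into a single NUL character.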


3
[+5] [2022-02-04 06:04:19] srinivasan dasarathi

On Windows, when specifying the full path of a file, use a double backslash as the separator rather than a single backslash. For instance, C:\\FileName.txt instead of C:\FileName.txt


4
[+2] [2021-01-01 20:25:36] Francisco de Larrañaga

The problem is due to bytes data that needs to be decoded.

When you enter a variable in the interpreter, it displays its repr, whereas print() uses its str (the two are the same in this scenario) and ignores unprintable characters such as \x00 and \x01, replacing them with something else.

A solution is to "decode" file1_content by dropping the unprintable characters:

file1_content = ''.join(x for x in file1_content if x.isprintable())
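
One caveat with this filter: str.isprintable() is also False for newlines and tabs, so the line above joins all lines together. A possible variant that keeps common whitespace is sketched below; drop_control_chars and its keep parameter are hypothetical names, not part of the answer.

```python
def drop_control_chars(text, keep="\n\r\t"):
    """Remove non-printable characters but keep the whitespace listed in keep."""
    return "".join(ch for ch in text if ch.isprintable() or ch in keep)


sample = "line one\x00\nline\x01 two\n"
print(repr(drop_control_chars(sample)))
```

This matters for a file-comparison assignment, where line boundaries usually need to survive the cleanup.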

5
[+2] [2021-09-22 19:07:02] yunusemredemirbas

I got this error when copying a file to a folder whose name starts with a number. If you write the folder path with a double backslash (\\) before the number, the problem will be solved.
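
The reason digits are special is that a backslash followed by up to three octal digits is read as a single escaped character in a normal string literal. The sketch below lists the code points of a few made-up literals; code_points is a hypothetical helper.

```python
def code_points(text):
    """Return the Unicode code point of each character in text."""
    return [ord(ch) for ch in text]


nul_then_dot = "\0.txt"    # NUL, then ".txt"
nul_then_eight = "\08"     # NUL, then "8" (8 is not an octal digit)
octal_one = "\01"          # a single character with code point 1
doubled = "\\0.txt"        # a literal backslash, then "0.txt"

for literal in (nul_then_dot, nul_then_eight, octal_one, doubled):
    print(code_points(literal))
```

Only the doubled form keeps the backslash (code point 92) in the resulting string.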


6
[+2] [2021-10-23 01:22:35] abc

The first backslash in the file path causes the error.

Use a raw string (prefix r):

FileHandle = open(r'..', encoding='utf8')

FilePath='C://FileName.txt'
FilePath=r'C:/FileName.txt'


7
[+1] [2022-09-28 06:10:19] kamakshi singh

I was also getting the same error with the following code:

with zipfile.ZipFile("C:\local_files\REPORT.zip",mode='w') as z:
    z.writestr(data)

It was happening because I was passing the bytestring (data) to the writestr() method without specifying the file name, i.e. Report.zip, under which it should be saved. So I changed my code and it worked.

import zipfile

with zipfile.ZipFile(r"C:\local_files\REPORT.zip", mode='w') as z:
    z.writestr('Report.zip', data)
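
The argument order of writestr() (member name first, then the data) can be checked without touching the disk. This is a minimal sketch using an in-memory buffer; payload and the member name "report.txt" are hypothetical stand-ins for the data and name in the answer.

```python
import io
import zipfile

payload = b"example bytes"  # hypothetical stand-in for data
buffer = io.BytesIO()

# First argument: the name of the member inside the archive; second: its contents.
with zipfile.ZipFile(buffer, mode="w") as z:
    z.writestr("report.txt", payload)

# Reopen the same buffer for reading to confirm what was stored.
with zipfile.ZipFile(buffer) as z:
    names = z.namelist()
    restored = z.read("report.txt")

print(names, restored)
```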

8
[+1] [2023-05-16 10:01:07] Viraj Patel

Instead of D:\path\0.html, try D:/path/0.html. The error occurs because Python interprets \0 as a null character rather than as part of the path string.


9
[-1] [2020-09-16 23:54:41] Sabito

If you are trying to open a file then you should use the path generated by os, like so:

import os
os.path.join("path","to","the","file")
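
As a sketch of how this applies to the file from the question, the components "Tests" and "4-test.txt" can be joined without writing any backslash in a literal. build_path is a hypothetical wrapper added only for illustration.

```python
import os


def build_path(*parts):
    """Join path components with the separator of the current OS."""
    return os.path.join(*parts)


test_file = build_path("Tests", "4-test.txt")
print(repr(test_file))
```

Because each component is a separate string, there is no backslash in the source for Python to treat as the start of an escape sequence. This does not repair a literal that already contains "\0".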

10