Published 2020-04-03 10:46
Running the program failed with an encoding error.
from sys import argv
from os.path import exists
script, from_file, to_file = argv
print(f"Copying from {from_file} to {to_file}")
# We could do these two on one line, how?
# Open from_file and assign the file object to in_file
in_file = open(from_file)
# Read the contents of in_file into indata
indata = in_file.read()
# Print the length of indata (a character count, since the file is read in text mode)
print(f"The input file is {len(indata)} bytes long")
# Check whether to_file already exists
print(f"Does the output file exist? {exists(to_file)}")
print("Ready, hit RETURN to continue, hit CTRL-C to abort.")
input()
# Open to_file for writing and assign the file object to out_file
out_file = open(to_file, 'w')
# Write the contents of indata to out_file
out_file.write(indata)
print("Alright, all done.")
# Close out_file and in_file
out_file.close()
in_file.close()
PS D:\pythonp> # first make a sample file.
PS D:\pythonp> echo "This is a test file." > test17.txt
PS D:\pythonp> #then look at it.
PS D:\pythonp> cat test17.txt
This is a test file.
PS D:\pythonp> # now run our script on it.
PS D:\pythonp> python ex17.py test17.txt new_file17.txt
Copying from test17.txt to new_file17.txt
Traceback (most recent call last):
File "ex17.py", line 10, in <module>
indata = in_file.read()
UnicodeDecodeError: 'gbk' codec can't decode byte 0xff in position 0: illegal multibyte sequence
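The error is raised at the read() call.
- A search suggested that this is probably a file-encoding problem.
Debugging along the lines of the following article did not work; the specific errors are shown in the code below.
UnicodeDecodeError: ‘gbk’ codec can’t decode byte 0xab in position 11126: illegal multibyte sequence
Change line 9 to
# Set the encoding to gbk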
in_file = open(from_file, encoding = 'gbk')
PS D:\pythonp> python ex17.py test17.txt new_file17.txt
Copying from test17.txt to new_file17.txt
Traceback (most recent call last):
File "ex17.py", line 10, in <module>
indata = in_file.read()
UnicodeDecodeError: 'gbk' codec can't decode byte 0xff in position 0: illegal multibyte sequence
The error still occurs at the same place.
Change line 9 to
# Set the encoding to gb18030
in_file = open(from_file, encoding = 'gb18030')
PS D:\pythonp> python ex17.py test17.txt new_file.txt
Copying from test17.txt to new_file.txt
Traceback (most recent call last):
File "ex17.py", line 10, in <module>
indata = in_file.read()
UnicodeDecodeError: 'gb18030' codec can't decode byte 0xff in position 0: illegal multibyte sequence
The error occurs at the same place again.
Change line 9 to
# Set the encoding to gb18030 and ignore decoding errors
in_file = open(from_file, encoding = 'gb18030', errors = 'ignore')
PS D:\pythonp> python ex17.py test17.txt new_file17.txt
Copying from test17.txt to new_file17.txt
The input file is 44 bytes long
Does the output file exist? False
Ready, hit RETURN to continue, hit CTRL-C to abort.
Traceback (most recent call last):
File "ex17.py", line 19, in <module>
out_file.write(indata)
UnicodeEncodeError: 'gbk' codec can't encode character '\u2e84' in position 0: illegal multibyte sequence
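The original spot now runs, but the write() call raises a new error, so I went back to the references to find the root cause.
- It turns out that when PowerShell redirects output to a file it defaults to UTF-16 (LE), which Microsoft calls "Unicode", whereas the file needs to be written as UTF-8.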
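To see why each attempt above behaved as it did, here is a minimal sketch that builds the same kind of bytes in Python rather than through the shell. It assumes the redirect wrote UTF-16 LE with a byte order mark (BOM), which is consistent with the 0xff at position 0 in the tracebacks; the temporary path and the variable names are made up for illustration.

```python
import os
import tempfile

# Assumption: the shell redirect wrote UTF-16 LE with a BOM (bytes FF FE first).
text = "This is a test file.\r\n"
path = os.path.join(tempfile.mkdtemp(), "test17.txt")
with open(path, "wb") as f:
    f.write(b"\xff\xfe" + text.encode("utf-16-le"))

with open(path, "rb") as f:
    raw = f.read()
print(raw[:2])  # the BOM: b'\xff\xfe'

# gbk cannot decode the leading 0xff byte, as in the tracebacks.
try:
    raw.decode("gbk")
    gbk_failed = False
except UnicodeDecodeError:
    gbk_failed = True
print("gbk failed:", gbk_failed)

# errors="ignore" silences the error but produces the wrong text.
garbled = raw.decode("gb18030", errors="ignore")
print("garbled equals original:", garbled == text)

# Naming the real encoding decodes the file correctly and drops the BOM.
with open(path, encoding="utf-16") as f:
    decoded = f.read()
print(repr(decoded))
```

If the assumption holds, passing encoding="utf-16" to open() in the script would be another way to read such a file without changing any shell settings.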
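Trying the solutions from the reference articles:
Output file encoding problems in Windows PowerShell
Changing the default encoding in PowerShell
Changing PowerShell's default output encoding to UTF-8
I would rather not change the default output encoding,
so I tried the following.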
PS D:\pythonp> chcp 65001
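chcp : The term 'chcp' is not recognized as the name of a cmdlet, function, script file, or operable program. Check the spelling of the name, or if a path was included, verify that the path is correct and try again.
At line:1 char:1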
+ chcp 65001
+ ~~~~
+ CategoryInfo : ObjectNotFound: (chcp:String) [], CommandNotFoundException
+ FullyQualifiedErrorId : CommandNotFoundException
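That failed: running chcp 65001 to switch the current console's working code page to UTF-8 does not work here.
- Since I did not want to change the default output encoding, the only option left was to work on the output side and try different ways of writing the file.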
PS D:\pythonp> echo "TEST one">test17.1.txt
PS D:\pythonp> cat test17.1.txt
TEST one
The file's encoding is "Unicode" (UTF-16 LE).
PS D:\pythonp> echo "test two" > test17.2.txt -encoding utf-8
PS D:\pythonp> cat test17.2.txt
test two
-encoding
utf-8
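The encoding is still "Unicode", and the -encoding utf-8 suffix was evidently written into the file as content. Running the script gives the same error as the first attempt: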
PS D:\pythonp> python ex17.py test17.2.txt new_file.txt
Copying from test17.2.txt to new_file.txt
Traceback (most recent call last):
File "ex17.py", line 10, in <module>
indata = in_file.read()
UnicodeDecodeError: 'gbk' codec can't decode byte 0xff in position 0: illegal multibyte sequence
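Since the shell redirect keeps producing UTF-16, one alternative is to create the sample file from Python with an explicit encoding. This is only a sketch of that idea, not one of the attempts in the original post; the directory and the file name test17_utf8.txt are placeholders.

```python
import tempfile
from pathlib import Path

# Hypothetical location for the sample file; any writable directory works.
sample = Path(tempfile.mkdtemp()) / "test17_utf8.txt"

# Write the text as UTF-8 explicitly instead of relying on shell defaults.
sample.write_text("test two\n", encoding="utf-8")

# UTF-8 stores ASCII text one byte per character and writes no BOM.
raw = sample.read_bytes()
print(raw)

# Reading it back with the same explicit encoding round-trips the text.
round_trip = sample.read_text(encoding="utf-8")
print(repr(round_trip))
```

Because the content is plain ASCII, the resulting bytes are also valid gbk, so a script that opens the file with the default encoding on a Chinese-locale Windows system should be able to read it.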
The file's encoding is ANSI; see the encoding-format references for background.
PS D:\pythonp> python ex17.py test17.3.txt new_file.txt
Copying from test17.3.txt to new_file.txt
The input file is 10 bytes long
Does the output file exist? True
Ready, hit RETURN to continue, hit CTRL-C to abort.
teturn
Alright, all done.
The file's encoding is ANSI.
PS D:\pythonp> python ex17.py test17.4.txt new_file.txt
Copying from test17.4.txt to new_file.txt
The input file is 9 bytes long
Does the output file exist? True
Ready, hit RETURN to continue, hit CTRL-C to abort.
Alright, all done.
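How exactly the encodings work, and the details and differences between them, will be added once I understand them better.
References:
An introduction to encodings (ANSI, GBK, GB2312, UTF-8, GB18030, and Unicode)
Encodings (UTF-8 vs. ANSI) and encoding/decoding (encode, decode)
Overview of the ASCII, ANSI, and Unicode encodings
- While reading, I stumbled on the fact that opening the files in binary mode with open avoids the errors caused by mismatched encodings.
Change lines 9 and 18 to, respectively,
# Open from_file in read-only binary mode and assign the file object to in_file
in_file = open(from_file, 'rb')
# Open to_file in write-only binary mode and assign the file object to out_file
out_file = open(to_file, 'wb')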
PS D:\pythonp> python ex17.py test17.2.txt new_test17.1.txt
Copying from test17.2.txt to new_test17.1.txt
The input file is 38 bytes long
Does the output file exist? True
Ready, hit RETURN to continue, hit CTRL-C to abort.
Alright, all done.
PS D:\pythonp> cat test17.2.txt
There is a test.
PS D:\pythonp> cat new_test17.1.txt
There is a test.
Verified: the run completed without any errors.
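The binary-mode change can be sketched as a small helper. The function name copy_bytes and the temporary file names are hypothetical, and the sample bytes assume the source file was UTF-16 LE with a BOM and a CRLF line ending.

```python
import os
import tempfile

def copy_bytes(src, dst):
    """Copy src to dst byte for byte, without decoding anything."""
    with open(src, "rb") as in_file, open(dst, "wb") as out_file:
        data = in_file.read()
        out_file.write(data)
    return len(data)

# Demonstration on bytes that are not valid gbk (a UTF-16 LE BOM plus text).
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "source.txt")
dst = os.path.join(workdir, "copy.txt")
payload = b"\xff\xfe" + "There is a test.\r\n".encode("utf-16-le")
with open(src, "wb") as f:
    f.write(payload)

copied = copy_bytes(src, dst)
print(copied)  # 38
```

Under those assumptions the payload is 38 bytes, the same figure the script reported in the run above. In binary mode len() really does count bytes, and no codec is involved, so no UnicodeDecodeError or UnicodeEncodeError can occur.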
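Reading
Reading a binary file in the default (text) mode can return incomplete data. In the Windows C runtime's text mode, which Python 2 relied on, the byte 0x1A is mistaken for end-of-file (EOF); in Python 3, text mode is more likely to fail with a decoding error such as the ones above. Reading binary files in binary mode avoids both problems.
Writing
For the string x = 'abc\ndef', len(x) is 7. The \n is the newline character, which is the byte 0x0A. When the string is written in 'w' (text) mode on Windows, each 0x0A is automatically turned into the two bytes 0x0D 0x0A, so the file is actually 8 bytes long. When the file is read back in 'r' (text) mode, the pair is converted back into the original newline.
Written in 'wb' (binary) mode instead, the newline stays a single byte and is read back unchanged. So if a file is written in text mode and read in binary mode, the extra byte has to be accounted for. 0x0D is also known as the carriage return.
On Linux nothing changes, because Linux uses 0x0A alone to mark a newline.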
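The newline translation described above can be reproduced on any platform. In this sketch, passing newline="\r\n" to open() stands in for the Windows default so that the result does not depend on the operating system; the file name is a placeholder.

```python
import os
import tempfile

x = "abc\ndef"
path = os.path.join(tempfile.mkdtemp(), "newline_demo.txt")

# newline="\r\n" forces the Windows-style translation on any platform.
with open(path, "w", newline="\r\n") as f:
    f.write(x)

# Binary mode shows the bytes that actually reached the file.
with open(path, "rb") as f:
    raw = f.read()
print(len(x), len(raw))  # 7 8
print(raw)               # b'abc\r\ndef'

# Text mode with default newline handling turns "\r\n" back into "\n".
with open(path, "r") as f:
    restored = f.read()
print(restored == x)     # True
```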
References:
Learn Python 3 the Hard Way, Exercise 17
Fixing "'gbk' codec can't decode byte 0xbf in position 2: illegal multibyte sequence" when reading a txt file in Python
from sys import argv
script, from_file, to_file = argv
# Open from_file, read its contents into indata, and close the file when the with block ends
with open(from_file) as in_file:
    indata = in_file.read()
print(f"The input file is {len(indata)} bytes long")
# Open to_file for writing and write indata to it
# (indata is a str and write() returns an int, so neither has a close() method)
with open(to_file, 'w') as out_file:
    out_file.write(indata)
from sys import argv
script, from_file, to_file = argv
# Open to_file for writing, open from_file and read its contents, and write them to to_file
open(to_file, "w").write(open(from_file).read())
PS D:\pythonp> python ex17.2.py test17.4.txt test17.5.txt
PS D:\pythonp>
There is no output; I would like to see the file's contents.
from sys import argv
script, from_file, to_file = argv
open(to_file, "w").write(open(from_file).read())
# Prints the file name held in to_file, not the file's contents
print(to_file)
PS D:\pythonp> python ex17.2.py test17.4.txt test17.5.txt
test17.5.txt
PS D:\pythonp> cat test17.5.txt
TEST four ——new
Printing the variable prints the file name it holds, not the file's contents.
In the shell, cat displays the whole file.
from sys import argv
script, from_file, to_file = argv
open(to_file, "w").write(open(from_file).read())
# Open to_file, read its contents, and print them
print(open(to_file).read())
PS D:\pythonp> python ex17.2.py test17.4.txt test17.5.txt
TEST four ——new
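To print the file's contents from within the .py file, do it as shown above.
In "What You Should See" I used something called cat. This old command exists to concatenate files together, but in practice its main use is printing a file's contents to the screen. You can learn more with the man cat command.
Find out why you need to write out_file.close() in the code.
close() flushes any buffered data to the file and then releases it. Without close(), what you wrote may still be sitting in a buffer rather than in the file, so it can be lost.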
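A minimal sketch of the buffering that close() deals with, using a throwaway file name. Whether the first size printed is 0 depends on the buffer size and the platform, so no expected value is given for it.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "buffer_demo.txt")

out_file = open(path, "w")
out_file.write("hello")
# A small write usually sits in an in-memory buffer at this point,
# so the size on disk is often still 0 here.
print(os.path.getsize(path))

out_file.close()  # flushes the buffer and releases the file
size_after_close = os.path.getsize(path)
print(size_after_close)  # 5

# A with block calls close() automatically, even if an error occurs inside it.
with open(path, "w") as out_file:
    out_file.write("hello again")
print(out_file.closed)  # True
```

Using with is generally the safer habit, because the file is closed even when the code inside the block raises an exception.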
# this one is like your script with argv
# Define a function whose arguments have to be unpacked inside it
# Not the simplest way
def print_two(*args):
    arg1, arg2 = args
    print(f"arg1: {arg1}, arg2: {arg2}")
# ok, that *args is actually pointless, we can just do this
# When defining a function in Python you can skip the unpacking and use the names inside () directly as variables
def print_two_again(arg1, arg2):
    print(f"arg1: {arg1}, arg2: {arg2}")
# this just takes one argument
# How a function takes a single argument
def print_one(arg1):
    print(f"arg1: {arg1}")
# this one takes no arguments
# A function can take no arguments at all
def print_none():
    print("I got nothin'.")
print_two("Zed","Shaw")
print_two_again("Zed","Shaw")
print_one("Fisrt!")
print_none()
PS D:\pythonp> python .\ex18.py
arg1: Zed, arg2: Shaw
arg1: Zed, arg2: Shaw
arg1: Fisrt!
I got nothin'.
PS D:\pythonp>
Running a function, calling a function, and using a function all mean the same thing.
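One practical difference between the *args version and the named-parameter version shows up when a function is called with the wrong number of arguments. The sketch below is a variation on the exercise, not the book's code: the functions return their strings instead of printing them so the results can be compared.

```python
def print_two(*args):
    # Unpacking fails unless exactly two values were passed.
    arg1, arg2 = args
    return f"arg1: {arg1}, arg2: {arg2}"

def print_two_again(arg1, arg2):
    return f"arg1: {arg1}, arg2: {arg2}"

# With the right number of arguments both behave the same.
same = print_two("Zed", "Shaw") == print_two_again("Zed", "Shaw")
print(same)  # True

# With three arguments the failure happens in different places.
errors = {}
for func in (print_two, print_two_again):
    try:
        func("Zed", "A.", "Shaw")
    except (ValueError, TypeError) as exc:
        errors[func.__name__] = type(exc).__name__
print(errors)  # {'print_two': 'ValueError', 'print_two_again': 'TypeError'}
```

With *args the call itself succeeds and the unpacking inside the body raises ValueError; with named parameters Python rejects the call with TypeError before the body runs.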
def cheese_and_crackers(cheese_count, boxes_of_crackers):
    print(f"You have {cheese_count} cheeses!")
    print(f"You have {boxes_of_crackers} boxes of crackers!")
    print("Man that's enough for a party!")
    print("Get a blanket!\n")
print("We can just give the function number directly:")
cheese_and_crackers(20, 30)
print("Oh! We can use the variables from our script:")
amount_of_cheese = 10
amount_of_crackers = 50
cheese_and_crackers(amount_of_cheese, amount_of_crackers)
print("We can even do math inside too:")
cheese_and_crackers(10 + 20, 5 + 6)
print("And we can combine the two, math and variables:")
cheese_and_crackers(amount_of_cheese + 100, amount_of_crackers + 1000)
PS D:\pythonp> python .\ex19.py
We can just give the function number directly:
You have 20 cheeses!
You have 30 boxes of crackers!
Man that's enough for a party!
Get a blanket!
Oh! We can use the variables from our script:
You have 10 cheeses!
You have 50 boxes of crackers!
Man that's enough for a party!
Get a blanket!
We can even do math inside too:
You have 30 cheeses!
You have 11 boxes of crackers!
Man that's enough for a party!
Get a blanket!
And we can combine the two, math and variables:
You have 110 cheeses!
You have 1050 boxes of crackers!
Man that's enough for a party!
Get a blanket!
# Define the cheese_and_crackers function
## Name the function cheese_and_crackers and give it two parameters: cheese_count and boxes_of_crackers
def cheese_and_crackers(cheese_count, boxes_of_crackers):
    ## Print a formatted string containing the value of cheese_count
    print(f"You have {cheese_count} cheeses!")
    ## Print a formatted string containing the value of boxes_of_crackers
    print(f"You have {boxes_of_crackers} boxes of crackers!")
    ## Print a plain string
    print("Man that's enough for a party!")
    ## Print a plain string followed by an extra blank line
    print("Get a blanket!\n")
# Print a heading for the first call
print("We can just give the function number directly:")
# Call cheese_and_crackers with literal numbers
cheese_and_crackers(20, 30)
# Print a heading for the second call
print("Oh! We can use the variables from our script:")
# Define the variables
amount_of_cheese = 10
amount_of_crackers = 50
# Call cheese_and_crackers with the variables
cheese_and_crackers(amount_of_cheese, amount_of_crackers)
# Print a heading for the third call
print("We can even do math inside too:")
# Call cheese_and_crackers with arithmetic expressions
cheese_and_crackers(10 + 20, 5 + 6)
# Print a heading for the fourth call
print("And we can combine the two, math and variables:")
# Call cheese_and_crackers with expressions that combine the variables and arithmetic
cheese_and_crackers(amount_of_cheese + 100, amount_of_crackers + 1000)
# def: defines a function
def cheese_and_crackers(cheese_count, boxes_of_crackers):
    # f: marks the string as a formatted string
    print(f"You have {cheese_count} cheeses!")
    # f: marks the string as a formatted string
    print(f"You have {boxes_of_crackers} boxes of crackers!")
    print("Man that's enough for a party!")
    # \n: newline character
    print("Get a blanket!\n")
print("We can just give the function number directly:")
# 20, 30: literal values passed directly to the function
cheese_and_crackers(20, 30)
print("Oh! We can use the variables from our script:")
# Define the variables and assign their values
amount_of_cheese = 10
amount_of_crackers = 50
cheese_and_crackers(amount_of_cheese, amount_of_crackers)
print("We can even do math inside too:")
# 10, 20, 5, 6: literal numbers used in the arithmetic
cheese_and_crackers(10 + 20, 5 + 6)
print("And we can combine the two, math and variables:")
# 100, 1000: literal numbers used in the arithmetic
cheese_and_crackers(amount_of_cheese + 100, amount_of_crackers + 1000)
# Define the potatoes_and_tomatoes function
def potatoes_and_tomatoes(potatoes_count, tomatoes_count):
    print(f"You have {potatoes_count} potatoes and {tomatoes_count} tomatoes.")
    print("Let's cook!")
# No variables, no user input
# Method 1: pass literal values
potatoes_and_tomatoes(15, 20)
# Method 2: pass the results of arithmetic
potatoes_and_tomatoes(15 + 60, 25 * 3)
# Variables, no user input
# Method 3: pass variables directly
amount_of_potatoes = 70
amount_of_tomatoes = 45
potatoes_and_tomatoes(amount_of_potatoes, amount_of_tomatoes)
# Method 4: pass variables combined with arithmetic
potatoes_and_tomatoes(amount_of_potatoes - 25, amount_of_tomatoes / 3)
# Variables, user input via input()
# Method 5: store the user's input in variables, then pass them to the function
amount_of_potatoes1 = int(input("input the number of your potatoes:"))
amount_of_tomatoes1 = int(input("input the number of your tomatoes:"))
potatoes_and_tomatoes(amount_of_potatoes1, amount_of_tomatoes1)
# Method 6: pass the user's input to the function after arithmetic
potatoes_and_tomatoes(amount_of_potatoes1 - 25, amount_of_tomatoes1 + 5)
# No variables, user input via input()
# Method 7: pass the user's input straight to the function without creating variables
potatoes_and_tomatoes(int(input("input the number of your potatoes:")), int(input("input the number of your tomatoes:")))
# Method 8: pass the user's input after arithmetic, without creating new variables
potatoes_and_tomatoes(int(input("input the number of your potatoes:")) + 15, int(input("input the number of your tomatoes:")) - 10)
# User input via argv
# Method 9: pass the values directly
from sys import argv
script, potatoes, tomatoes = argv
potatoes_and_tomatoes(int(potatoes), int(tomatoes))
# Define the potatoes_and_tomatoes function
def potatoes_and_tomatoes(potatoes_count, tomatoes_count):
    print(f"You have {potatoes_count} potatoes and {tomatoes_count} tomatoes.")
    print("Let's cook!")
# Method 10: method 9 plus arithmetic
from sys import argv
script, potatoes, tomatoes = argv
potatoes_and_tomatoes(int(potatoes) + 10, int(tomatoes) - 10)
The results of all ten methods are as follows:
PS D:\pythonp> python ex19.3.py 90 95
You have 15 potatoes and 20 tomatoes.
Let's cook!
You have 75 potatoes and 75 tomatoes.
Let's cook!
You have 70 potatoes and 45 tomatoes.
Let's cook!
You have 45 potatoes and 15.0 tomatoes.
Let's cook!
input the number of your potatoes:50
input the number of your tomatoes:55
You have 50 potatoes and 55 tomatoes.
Let's cook!
You have 25 potatoes and 60 tomatoes.
Let's cook!
input the number of your potatoes:70
input the number of your tomatoes:75
You have 70 potatoes and 75 tomatoes.
Let's cook!
input the number of your potatoes:80
input the number of your tomatoes:85
You have 95 potatoes and 75 tomatoes.
Let's cook!
You have 90 potatoes and 95 tomatoes.
Let's cook!
PS D:\pythonp> python ex19.3.2.py 100 110
You have 110 potatoes and 100 tomatoes.
Let's cook!
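In the first run above, method four prints 15.0 tomatoes rather than 15 because the / operator in Python 3 always returns a float, even when the division is exact. A short sketch of the alternatives, reusing the variable name from the script:

```python
amount_of_tomatoes = 45

true_div = amount_of_tomatoes / 3    # true division always returns a float
floor_div = amount_of_tomatoes // 3  # floor division of two ints returns an int

print(true_div)       # 15.0
print(floor_div)      # 15
print(int(true_div))  # 15
```

Floor division rounds down when the division is not exact, so it is only a drop-in replacement when whole numbers are what you want.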
Original article: https://blog.csdn.net/weixin_45938096/article/details/104608448
Author: what
Link: https://www.pythonheidong.com/blog/article/301748/acb1229d151a3f5936cf/
Source: python黑洞网