python - Unable to decode yml file: 'utf8' codec can't decode byte #xa0: invalid start byte


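I am trying to read a YAML file and convert it into a dictionary. I run into a problem when loading the file into a dict variable.

I searched for similar issues. One of the answers on Stack Overflow suggests replacing every '\\xa0' with ' '. I tried line = line.replace('\\xa0',' '). The program does not work on Python 2.7. When I tried Python 3, it works fine.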


import yaml
import sys

yaml_dir = "/root/tools/test_case/"

#file_name = "TC_CFD_SR.yml"
file_name = "TC_QB.yml"
tc_file_name = yaml_dir + file_name

def write(file, content):
    file = open(file, 'a')
    file.write(content)
    file.close()

def verifyYmlFile(yml_file):
    data = {}
    with open(yml_file, 'r') as fin:
        for line in fin:
            line = line.replace('\\xa0', ' ')
            write('anand-yaml.yml', line)

    with open('anand-yaml.yml', 'r') as fin:
        data = yaml.load(fin)
    return data

if __name__ == '__main__':
    data = {}
    print "verifying yaml"
    data = verifyYmlFile(tc_file_name)

Error:


[root@anand-harness test_case]# python verify_yaml.py 
verifying yaml
Traceback (most recent call last):
  File "verify_yaml.py", line 29, in <module>
    data= verifyYmlFile(tc_file_name)
  File "verify_yaml.py", line 23, in verifyYmlFile
    data = yaml.load(fin)
  File "/usr/lib64/python2.6/site-packages/yaml/__init__.py", line 71, in load
    return loader.get_single_data()
  File "/usr/lib64/python2.6/site-packages/yaml/constructor.py", line 37, in get_single_data
    node = self.get_single_node()
  File "/usr/lib64/python2.6/site-packages/yaml/composer.py", line 36, in get_single_node
    document = self.compose_document()
  File "/usr/lib64/python2.6/site-packages/yaml/composer.py", line 55, in compose_document
    node = self.compose_node(None, None)
  File "/usr/lib64/python2.6/site-packages/yaml/composer.py", line 82, in compose_node
    node = self.compose_sequence_node(anchor)
  File "/usr/lib64/python2.6/site-packages/yaml/composer.py", line 111, in compose_sequence_node
    node.value.append(self.compose_node(node, index))
  File "/usr/lib64/python2.6/site-packages/yaml/composer.py", line 84, in compose_node
    node = self.compose_mapping_node(anchor)
  File "/usr/lib64/python2.6/site-packages/yaml/composer.py", line 133, in compose_mapping_node
    item_value = self.compose_node(node, item_key)
  File "/usr/lib64/python2.6/site-packages/yaml/composer.py", line 64, in compose_node
    if self.check_event(AliasEvent):
  File "/usr/lib64/python2.6/site-packages/yaml/parser.py", line 98, in check_event
    self.current_event = self.state()
  File "/usr/lib64/python2.6/site-packages/yaml/parser.py", line 449, in parse_block_mapping_value
    if not self.check_token(KeyToken, ValueToken, BlockEndToken):
  File "/usr/lib64/python2.6/site-packages/yaml/scanner.py", line 116, in check_token
    self.fetch_more_tokens()
  File "/usr/lib64/python2.6/site-packages/yaml/scanner.py", line 244, in fetch_more_tokens
    return self.fetch_single()
  File "/usr/lib64/python2.6/site-packages/yaml/scanner.py", line 653, in fetch_single
    self.fetch_flow_scalar(style='\'')
  File "/usr/lib64/python2.6/site-packages/yaml/scanner.py", line 667, in fetch_flow_scalar
    self.tokens.append(self.scan_flow_scalar(style))
  File "/usr/lib64/python2.6/site-packages/yaml/scanner.py", line 1156, in scan_flow_scalar
    chunks.extend(self.scan_flow_scalar_non_spaces(double, start_mark))
  File "/usr/lib64/python2.6/site-packages/yaml/scanner.py", line 1196, in scan_flow_scalar_non_spaces
    while self.peek(length) not in u'\'\"\\\0 \t\r\n\x85\u2028\u2029':
  File "/usr/lib64/python2.6/site-packages/yaml/reader.py", line 91, in peek
    self.update(index+1)
  File "/usr/lib64/python2.6/site-packages/yaml/reader.py", line 165, in update
    exc.encoding, exc.reason)
yaml.reader.ReaderError: 'utf8' codec can't decode byte #xa0: invalid start byte
  in "anand-yaml.yml", position 3246

What am I doing wrong?


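The character sequence "\\xa0" is not the problem you see in the message. The problem is the sequence "\xa0" (note that the backslash is not escaped): the first is four literal characters (backslash, x, a, 0), the second is the single byte 0xA0.
Your replace line should be:


    line = line.replace('\xa0', ' ')

to work around this problem.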

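If you know the file's actual encoding, you could do the correct conversion instead, but that should not be necessary, and the patch above is not a structural solution. It would be best to generate the YAML files correctly in the first place (YAML defaults to UTF-8, so the files should contain valid UTF-8). The file could also be UTF-16 without the appropriate BOM (which the yaml library interprets, IIRC).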
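As a rough illustration of the "correct conversion" mentioned above, the following Python 3 sketch re-encodes raw file contents as UTF-8 when they are not already valid UTF-8. The function name `to_utf8` and the default source encoding `cp1252` are assumptions made for this example, since the question does not say how the file was produced. In cp1252 (as in Latin-1) the byte 0xA0 is a non-breaking space.

```python
def to_utf8(raw, source_encoding="cp1252"):
    """Return `raw` (bytes) as UTF-8 bytes.

    If `raw` already decodes as UTF-8 it is returned unchanged; otherwise it
    is decoded with `source_encoding` (an assumption, adjust to the real
    encoding of the file) and re-encoded as UTF-8.
    """
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        return raw.decode(source_encoding).encode("utf-8")


# Hypothetical YAML line containing the stray byte 0xA0.
converted = to_utf8(b"name: 'a\xa0b'\n")
print(repr(converted))
```

The converted bytes could then be handed to the YAML loader, for example `yaml.safe_load(to_utf8(raw))`, because PyYAML accepts byte strings as input. This only gives a sensible result if the assumed source encoding is the right one.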


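# Python 2: plain string literals are byte strings.

# Escaped backslash: four literal ASCII characters (\, x, a, 0).
s1 = 'abc\\xa0xyz'
print(repr(s1))
u1 = s1.decode('utf-8')  # this works fine

# Unescaped backslash: the single byte 0xA0, which is not valid UTF-8.
s = 'abc\xa0xyz'
print(repr(s))
u = s.decode('utf-8')  # this throws an error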
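
To see where a file trips the loader, one could locate the first byte that is not valid UTF-8 before passing the data to YAML. The following is a minimal Python 3 sketch. `first_invalid_utf8` is a hypothetical helper, not part of PyYAML, and it relies on the `start` attribute of `UnicodeDecodeError`.

```python
def first_invalid_utf8(raw):
    """Return (offset, byte) of the first invalid UTF-8 byte in `raw`,
    or None if `raw` decodes cleanly."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start is the index of the first byte that could not be decoded.
        return exc.start, raw[exc.start:exc.start + 1]
    return None


print(first_invalid_utf8(b'abc\\xa0xyz'))  # escaped backslash: None
print(first_invalid_utf8(b'abc\xa0xyz'))   # raw byte 0xA0: (3, b'\xa0')
```

Applied to the bytes of a file opened in binary mode, the returned offset can be compared with the position reported in the error message.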
